OpenClaw System Prompt: What Your Agent Actually Sees Before It Replies
Most people debug AI agents from the wrong end. They stare at the answer and ask, “Why did it say that?” The better question is earlier: what did the agent actually see before it replied?
OpenClaw has a documented answer. Every agent run gets a custom OpenClaw-owned system prompt, plus the assembled context for that run. That context includes the system prompt, conversation history, tool calls, tool results, and attachments. It is bounded by the model's context window, which is why prompt hygiene is not cosmetic. It changes what the model can reason about.
This post is the operator version of that map. Not the mythology. Not “the agent remembers everything.” The real pieces: prompt sections, injected workspace files, skills, tools, history, context engines, and the parts that are safety guidance versus hard enforcement.
Context is everything the model receives
The OpenClaw docs define context as everything sent to the model for a run. The beginner model is simple:
- System prompt: OpenClaw-built rules, tools, skills list, time/runtime data, and injected workspace files.
- Conversation history: the recent user and assistant messages included for the session.
- Tool calls and results: command output, file reads, attachments such as images, audio, and other files, and similar run evidence.
That definition matters because context is not the same thing as memory. Memory can live on disk and be reloaded later. Context is the current working set inside the model window. If a fact is in memory/YYYY-MM-DD.md but not retrieved or injected for the current run, the model may not see it. If a huge tool result is still in the working context, the model may spend valuable tokens rereading noise.
This is why I care so much about memory discipline, session pruning, and compaction. They are not abstract internals. They decide what survives into the next useful answer.
The system prompt is OpenClaw-owned
The system prompt docs are explicit: OpenClaw builds a custom prompt for every agent run, and it does not use the default pi-coding-agent prompt. OpenClaw assembles the prompt and injects it into the run.
The exact rendered text can change by version, runtime, channel, enabled tools, and workspace. But the documented structure is intentionally compact and sectioned. The important sections include tooling, safety, skills when available, workspace, documentation, injected workspace files, sandbox state when enabled, date and time, reply tags when supported, heartbeats when enabled, runtime metadata, and reasoning visibility.
In practice, that means your agent is not just answering from the visible chat. It is answering from a layered prompt that says what tools exist, where the workspace is, what local docs should be used, which workspace files were injected, whether sandboxing applies, and what runtime it is operating inside.
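As a mental model, the documented section list implies a layering roughly like this. This is a simplified sketch of the order and purpose of each section, not the literal rendered text, which varies by version, runtime, channel, enabled tools, and workspace:

```
Tooling            what tools exist and how to use them
Safety             advisory guardrails
Skills             compact list: name, description, location
Workspace          where the agent is operating
Documentation      which local docs to consult
Project Context    injected workspace files (AGENTS.md, SOUL.md, ...)
Sandbox            sandbox state, when enabled
Date & Time        current date and time, when known
Runtime            model, host, and channel metadata
Reasoning          reasoning visibility rules
```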
The workspace files are part of the prompt budget
OpenClaw can inject workspace bootstrap files under Project Context. The docs list files such as AGENTS.md, SOUL.md, TOOLS.md, IDENTITY.md, USER.md, HEARTBEAT.md, BOOTSTRAP.md, and long-term memory when present. Sub-agent sessions are smaller: the docs say they only inject AGENTS.md and TOOLS.md.
That is powerful, and it is easy to abuse. If AGENTS.md becomes a giant operating manual, every run pays for it. If TOOLS.md turns into a dumping ground for stale infrastructure notes, the agent sees a larger prompt before it has done any useful work. OpenClaw trims large bootstrap files and exposes config knobs for per-file and total injection limits, but trimming is a last line of defense, not a writing strategy.
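Those limits live in config. The key names below are illustrative assumptions to show the shape of the control, not the documented schema, so check your version's config reference before copying them:

```jsonc
{
  "agents": {
    "defaults": {
      // Hypothetical key names for illustration only:
      "bootstrapMaxCharsPerFile": 20000,
      "bootstrapMaxCharsTotal": 60000
    }
  }
}
```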
A good workspace bootstrap should behave like this (see the sketch after this list):

- AGENTS.md = operating rules and pointers
- SOUL.md = voice, tone, persona
- TOOLS.md = local environment notes
- MEMORY.md = durable long-term index
- memory/*.md = topic and daily notes loaded on demand

The AGENTS template also makes a useful cultural point: runtime-provided startup context should be used first, and manual rereads are for missing or deeper context. That keeps the agent from burning tokens re-reading files it already has.
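To make that concrete, a lean AGENTS.md can stay under a screenful. This is an example of the discipline, not a template copied from the docs:

```markdown
# AGENTS.md (minimal sketch)

## Operating rules
- Use the startup context the runtime injects first; reread files
  only for missing or deeper detail.
- Write durable facts to memory/YYYY-MM-DD.md; promote long-lived
  topics into MEMORY.md.

## Pointers
- Voice and persona: SOUL.md
- Local environment: TOOLS.md
- Long-term index: MEMORY.md
```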
If your agent keeps giving generic answers, the problem may be the operating context, not the model. Get ClawKit and set up the workspace files, memory flow, and prompt hygiene that keep an agent useful.
Skills are advertised, then loaded on demand
OpenClaw does not have to paste every skill's full instructions into every prompt. The context docs describe a compact skills list: name, description, and location. The model is expected to read the matching SKILL.md only when the task calls for it.
That design is the right tradeoff. A browser automation skill, a GitHub issue skill, and a weather skill should not all be paid for in full just because they exist. The base prompt only needs enough metadata for the model to choose the right skill. Then the detailed instruction file enters context when it is actually useful.
This is also why skill descriptions matter. They are routing labels inside the prompt. A vague skill description wastes the mechanism; a precise one tells the model when to load the file and when to ignore it. If you build custom skills, treat the description like an API boundary, not marketing copy.
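For example, compare two descriptions for the same hypothetical GitHub skill, assuming the common SKILL.md frontmatter shape with a name and description. Only the second gives the model a usable routing signal:

```markdown
---
# Vague: the model cannot tell when to load this
description: Helps with GitHub stuff.
---

---
# Precise: clear trigger conditions and explicit non-goals
name: github-issues
description: Create, triage, and comment on GitHub issues in the
  current repo. Load for issue workflows; not for PR review or CI.
---
```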
Tools have visible and invisible context cost
The docs call out two tool costs. First, the tool list text appears in the system prompt so the model knows what it can use. Second, tool schemas are sent as JSON to the model provider so structured calls can be made. Those schemas count toward the context window even though you do not see them as normal text.
That explains a common operator surprise: “I barely wrote anything, why is the prompt already large?” Because the model may have received the tool list, the tool schemas, injected workspace files, and runtime metadata before conversation history is even considered.
Use tools because they make the agent grounded. Just do not pretend they are free. A good OpenClaw setup makes the important tools available, keeps workspace files lean, and relies on inspection commands when you need to understand the total prompt shape.
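To see why schemas add up, here is the rough shape of a single tool schema as sent to an OpenAI-style provider. This is a generic sketch of the function-calling format, not OpenClaw's exact wire format, and the tool name and fields are illustrative:

```json
{
  "name": "read_file",
  "description": "Read a file from the workspace and return its contents.",
  "parameters": {
    "type": "object",
    "properties": {
      "path": { "type": "string", "description": "Workspace-relative path" },
      "max_bytes": { "type": "integer", "description": "Optional read limit" }
    },
    "required": ["path"]
  }
}
```

Every tool ships a block like this, so a dozen tools can cost thousands of tokens before the first user message is considered.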
Inspect the context instead of guessing
OpenClaw gives operators direct inspection surfaces. The context docs list these as the fast path:
- /status
- /context list
- /context detail
- /usage tokens
- /compact

/context list gives a rough breakdown of injected files, tool overhead, and session usage. /context detail goes deeper into per-file, per-tool schema, per-skill entry, and system prompt sizes. /usage tokens can append token usage to normal replies. /compact summarizes older history to free room when a thread has become too heavy.
The practical habit is simple: when an agent is acting stale, bloated, or oddly anchored to old work, inspect context before blaming “AI randomness.” You may find a huge bootstrap file, stale tool output, missing memory retrieval, or a thread that should have been compacted two tasks ago.
Prompt modes change what sub-agents see
OpenClaw can render smaller prompts for sub-agents. The system prompt docs describe three runtime prompt modes: full, minimal, and none. full includes the normal sections. minimal is used for sub-agents and omits sections such as skills, memory recall, self-update, model aliases, user identity, reply tags, messaging, silent replies, and heartbeats. Tooling, safety, workspace, sandbox, current date and time when known, runtime, and injected context stay available. none returns only the base identity line.
That is why a sub-agent can be efficient without being blind. It gets enough context to do the assigned work, but it does not need the entire main-session personality and memory surface. If you delegate coding, research, or browser work, that separation is part of the value.
The context engine decides what gets assembled
A context engine controls how OpenClaw builds model context for each run. The docs define its job as deciding which messages to include, how to summarize older history, and how to manage context across sub-agent boundaries.
Most operators will use the built-in legacy engine, which is the default. Plugin engines exist for teams that want different assembly, compaction, or recall behavior. The lifecycle has four main points: ingest a new message, assemble the context before a model run, compact when the context window is full or when /compact is used, and perform after-turn work.
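In plugin terms, those four lifecycle points map naturally onto an interface like the one below. This is a hypothetical shape for orientation only, not the actual OpenClaw plugin API; the real signatures live in the plugin docs:

```typescript
// Hypothetical shapes for orientation; not the real OpenClaw plugin API.
type Message = { role: "user" | "assistant" | "tool"; content: string };
type ModelContext = { systemPrompt: string; messages: Message[] };
interface Session { id: string; history: Message[] }
interface TurnResult { tokensUsed: number }

interface ContextEngine {
  // Lifecycle point 1: a new message arrives in the session.
  ingest(session: Session, message: Message): Promise<void>;
  // Lifecycle point 2: assemble the context before a model run.
  assemble(session: Session): Promise<ModelContext>;
  // Lifecycle point 3: window is full or /compact ran; summarize history.
  compact(session: Session): Promise<void>;
  // Lifecycle point 4: after-turn work such as persistence and cleanup.
  afterTurn(session: Session, result: TurnResult): Promise<void>;
}
```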
The docs show two useful checks:

```
openclaw doctor
cat ~/.openclaw/openclaw.json | jq '.plugins.slots.contextEngine'
```

If you are experimenting with a plugin engine, the configured slot is the important part:
```json
{
  "plugins": {
    "slots": {
      "contextEngine": "legacy"
    }
  }
}
```

I would not start here unless you are already comfortable with the normal context tools. First make your workspace files lean, verify memory retrieval, and learn what /context detail says. Then consider a custom engine if your operating model truly needs it.
Safety guidance is not the same as enforcement
The system prompt includes safety guardrails. The docs also say those guardrails are advisory. They guide model behavior, but hard enforcement belongs to tool policy, exec approvals, sandboxing, and channel allowlists.
That distinction is not cynicism; it is responsible operations. A prompt can say “do not do dangerous things.” The runtime should still make dangerous actions harder or require approval. If you are running agents with real tools, use both layers: good instructions for judgment, hard controls for boundaries.
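As a shape for that second layer, enforcement lives in config rather than prose. The key names below are illustrative assumptions to show where the documented controls (tool policy, exec approvals, sandboxing, channel allowlists) would sit, not the actual schema:

```jsonc
{
  // Illustrative key names only; check your version's config reference.
  "tools": {
    "exec": { "approval": "always-ask" }  // hard gate, not a polite request
  },
  "sandbox": { "enabled": true },
  "channels": { "allowlist": ["dm:owner"] }
}
```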
The operator takeaway
Your OpenClaw agent's answer is downstream of the context it received. That context is built from an OpenClaw-owned system prompt, workspace bootstrap files, skill metadata, tool descriptions and schemas, conversation history, tool evidence, attachments, and the active context engine's assembly choices.
So when the answer is wrong, do not only edit the next message. Ask better operational questions. Did the agent see the right workspace files? Is memory stored but not loaded? Are stale tool results bloating the run? Did a sub-agent get a minimal prompt when it needed a more explicit task brief? Is the safety boundary only written in prose, or is it enforced by tools and approvals?
Once you see the prompt as infrastructure, the whole system gets easier to improve. Better bootstrap files, cleaner skills, smaller tool noise, deliberate compaction, and hard runtime controls beat hoping the next model will magically infer your operating system.
Want the complete guide? Get ClawKit — $9.99