Most people who are frustrated with OpenClaw are actually frustrated with one of three things: unclear prompts, a SOUL.md that's fighting itself, or an underpowered model. These are all fixable. Here's the full playbook.
Start With Your SOUL.md
Your SOUL.md is applied to every single request. A bad SOUL.md affects every response. Get this right first.
Common SOUL.md mistakes
Too long with contradictions:
# Bad example
- Be brief
- Always provide complete explanations
- Keep responses under 50 words
- Include full context and background
Contradictory rules produce inconsistent behaviour. Pick one approach and stick to it.
Too vague:
# Bad example
- Be helpful
- Respond naturally
- Use good judgment
These do almost nothing. The model defaults to "helpful, natural, good judgment" already. Your SOUL.md should override defaults, not restate them.
Missing format preferences: Without format rules, the model guesses. Specify: bullet points vs prose, length limits, whether you want explanations or just answers, how to handle uncertainty.
A well-structured SOUL.md
# My Assistant
## Response format (always apply)
- Default to bullet points over paragraphs for lists
- Max 3 bullets unless I ask for more
- No preamble ("Great question!", "Of course!")
- No sign-off ("Let me know if you need anything else!")
- When uncertain: say so explicitly rather than guessing
## Tone
- Direct. Skip diplomatic softening.
- Technical where relevant — I'm a developer
- Brief acknowledgements are fine; long ones aren't
## Action rules
- Confirm before any external action (email send, file delete, API call)
- If a task has multiple steps, list them and ask before starting
## My context
- Role: [your work context]
- Timezone: [your timezone]
- Current priority: [active project or focus]
Write Better Prompts
The anatomy of a good OpenClaw prompt
[Action verb] + [specific object] + [format/constraints] + [what to do with it]
Weak: "Tell me about this email"
Strong: "Summarise this email in 3 bullet points. Flag any action required on my side. Draft a reply if one is needed — don't send without confirmation."

Weak: "Help with this code"
Strong: "Review this Python function for bugs and performance issues. Lead with the most critical issue. Show the corrected version. [paste code]"
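The four-part anatomy can be sketched as a tiny helper. This is purely illustrative — `build_prompt` is not an OpenClaw function, just a way to see the parts slot together:

```python
def build_prompt(action: str, obj: str, constraints: str, follow_up: str = "") -> str:
    """Assemble a prompt from the anatomy:
    [action verb] + [specific object] + [format/constraints] + [what to do with it]."""
    parts = [f"{action} {obj}.", constraints]
    if follow_up:
        parts.append(follow_up)
    return " ".join(p.strip() for p in parts if p.strip())

# The "strong" email prompt from above, rebuilt from its parts
prompt = build_prompt(
    "Summarise", "this email",
    "Use 3 bullet points. Flag any action required on my side.",
    "Draft a reply if one is needed, but don't send without confirmation.",
)
```

If a part is missing when you write a prompt by hand, that's usually where the ambiguity is.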
Give format instructions upfront
Don't make the model guess what you want:
Answer as a table with columns: [column names]
Answer as a numbered list, max 5 items
Answer in one sentence only
Answer as a step-by-step process, each step on a new line
Be explicit about constraints
Under 100 words.
No jargon — explain as if to a non-technical person.
Based only on what I've told you — don't add assumptions.
Use [specific terminology / unit / format].
Reference memory explicitly
OpenClaw can use past conversations, but it doesn't always know which ones are relevant. Help it:
Based on what you know about the [project name] project from our conversations last week...
Using the notes I stored on [topic]...
Given the decision we made about [subject]...
Manage Context Quality
Memory hygiene
Stale or contradictory memory degrades response quality over time. Periodically:
Show me everything you have stored about [topic].
Then clean up:
Delete the notes about [topic] from [timeframe] — that's no longer accurate.
Update the record for [project] to replace [old info] with [new info].
Keep project context current
When a project changes significantly, update the memory:
Update my project context for [project name]:
- We've moved from [old approach] to [new approach]
- Current status: [status]
- Open decisions: [list]
This supersedes what you had before.
Don't over-stuff context
Sending very long messages or pasting enormous documents into every query slows responses and can confuse the model. For large reference material:
- Summarise the key facts and store them in memory
- Reference the stored version instead of re-pasting the full document
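A simple size guard makes this rule mechanical. The threshold below is a placeholder, not an OpenClaw limit, and the "summarise and store" step is left as an instruction rather than an API call, since it depends on your setup:

```python
MAX_CHARS = 8_000  # illustrative threshold, not a real model or OpenClaw limit

def prepare_context(document: str) -> str:
    """Pass small documents through unchanged; flag large ones for
    summarise-and-store instead of re-pasting them into every query."""
    if len(document) <= MAX_CHARS:
        return document
    # In practice: ask the model once to summarise the key facts,
    # store that summary in memory, and reference the stored version.
    return (f"[Document too large ({len(document)} chars): "
            f"summarise and store key facts instead]")
```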
Model Selection Matters
The single biggest quality lever after prompting is your model choice.
| Task Type | Best Model |
|---|---|
| Complex reasoning, long docs | Claude Sonnet or GPT-4o |
| Following complex SOUL.md rules | Claude Sonnet |
| High-volume simple tasks | Gemini Flash or GPT-4o mini |
| Coding tasks | Claude Sonnet or GPT-4o |
| Cost-critical daily use | Gemini Flash |
| Offline / privacy-critical | Local (Llama 3.1 8B+) |
If you're getting consistently mediocre responses, try bumping to a stronger model before assuming the prompt is the problem.
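If you script against multiple models, the table above amounts to a routing map. The task labels and model identifiers below are illustrative, not exact API model IDs:

```python
# Routing map derived from the table above; keys and values are
# illustrative labels, not real model-ID strings.
MODEL_BY_TASK = {
    "complex_reasoning": "claude-sonnet",
    "strict_soul_rules": "claude-sonnet",
    "high_volume_simple": "gemini-flash",
    "coding": "claude-sonnet",
    "cost_critical": "gemini-flash",
    "offline_private": "local-llama-3.1",
}

def pick_model(task_type: str) -> str:
    # Unclassified tasks fall back to the strongest general model,
    # matching the advice to bump up before blaming the prompt.
    return MODEL_BY_TASK.get(task_type, "claude-sonnet")
```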
Debugging Poor Responses
Step 1: Isolate SOUL.md vs prompt
Test with SOUL.md temporarily disabled (or a minimal version) to see if the instructions are causing the issue:
# Temporarily in config.yml
soul_md: false
If quality improves, the issue is in your SOUL.md. If not, it's the prompt or model.
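The same isolation test can be scripted if you call the model programmatically. Here `send` is a stand-in for whatever client call you actually use — it is not an OpenClaw API:

```python
def isolate_soul_md(send, prompt: str, soul_md: str) -> dict:
    """Run one prompt twice, with and without the SOUL.md contents as
    the system instructions, so the two responses can be compared
    side by side. `send(prompt, system=...)` is a placeholder for
    your real client call."""
    return {
        "with_soul": send(prompt, system=soul_md),
        "without_soul": send(prompt, system=""),
    }
```

If only the `with_soul` response misbehaves, the problem is in your SOUL.md.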
Step 2: Check for memory conflicts
What do you currently know about [the topic I'm asking about]?
If the response shows stale or contradictory information, clean it up before re-asking.
Step 3: Try a more explicit prompt
If the response is off-target, it's almost always because the prompt was ambiguous. Rewrite it with explicit format, constraints, and goal.
Step 4: Try a different model
If everything else looks right and quality is still low, switch to a stronger model and retest.
Patterns That Work Consistently
The Clarify-Then-Act pattern
Before you start, tell me in one sentence what you're going to do.
If that's right, do it. If not, I'll correct you.
Prevents long, wrong outputs by catching misunderstandings upfront.
The Stepwise pattern
Break this task into steps.
Complete step 1 and show me the output.
Wait for my approval before continuing to step 2.
Good for high-stakes tasks where each stage matters.
The Constrained Draft pattern
Write a first draft. Then review it against these criteria:
1. [criterion 1]
2. [criterion 2]
3. [criterion 3]
Show me both the draft and your self-review.