In the last article I wrote about running OpenClaw on a VPS.
It worked.
But it never really felt like mine.
So I rebuilt the entire stack locally on a Mac Mini.
Not because it was easier.
Because it gave me full control of the environment.
And once you start running agents every day, control becomes the real requirement.
The system quickly stops being “an agent”
Standing up OpenClaw is straightforward.
Run the wizard.
Create an agent.
Connect a channel.
Done.
But once the system becomes part of your workflow, new problems appear.
Memory.
Costs.
Context size.
Where work outputs live.
How agents interact with tools.
You stop building an agent.
You start operating a system.
Architecture overview
The easiest way to understand the setup is as layers.
At the top sit the agents: Kodi, Rook, and Lab, each with a distinct role.
Beneath them, the OpenClaw gateway connects those agents to memory, tooling, automation, and the underlying workspace and provider infrastructure that makes the system work.
flowchart TD
TG[Telegram Interface]
TS[Tailscale Access]
K[Kodi Operator Agent]
R[Rook Revenue Agent]
L[Lab Model Testing Agent]
G[OpenClaw Gateway]
Q[QMD Memory and Session State]
A[Cron Heartbeat and Ollama Jobs]
T[Local Tooling and Providers]
C[Codex Cursor Gemini NotebookLM]
P[Anthropic OpenRouter Ollama]
X[Supabase GitHub Google]
V[Obsidian Vault Kodi Workspace]
W[Rook Revenue Workspace and Vault Mirror]
TG --> K
TG --> R
TG --> L
TS --> G
K --> G
R --> G
L --> G
G --> Q
G --> A
G --> T
T --> C
T --> P
T --> X
Q --> V
Q --> W
A --> V
A --> W
Splitting agents changes everything
The base setup assumes one agent.
That breaks down quickly.
So the system now runs three.
Kodi handles operations and coordination.
Rook scouts ideas and revenue opportunities.
Lab is a safe place to test models and fallback strategies.
Each agent has isolated state, model routing, and its own Telegram bot interface.
That separation removed a surprising amount of noise.
Experiments no longer break production behaviour.
Cheap tasks no longer trigger expensive models.
Debugging becomes much easier.
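A minimal sketch of what that separation could look like in code. The profile fields, model names, and workspace paths here are illustrative assumptions, not the actual OpenClaw configuration; the point is that each agent carries its own state directory, model route, and bot token.

```python
from dataclasses import dataclass

@dataclass
class AgentProfile:
    """Illustrative per-agent config: isolated state, a model route, and a bot token env var."""
    name: str
    workspace: str        # isolated state and memory, never shared between agents
    default_model: str    # primary model for real work
    fallback_model: str   # cheap route for low-stakes tasks
    bot_token_env: str    # each agent gets its own Telegram bot token

# Hypothetical profiles mirroring the three agents described above.
AGENTS = {
    "kodi": AgentProfile("kodi", "~/agents/kodi", "claude-sonnet", "ollama/llama3", "KODI_BOT_TOKEN"),
    "rook": AgentProfile("rook", "~/agents/rook", "claude-sonnet", "ollama/llama3", "ROOK_BOT_TOKEN"),
    "lab":  AgentProfile("lab",  "~/agents/lab",  "ollama/llama3", "ollama/llama3", "LAB_BOT_TOKEN"),
}

def route_model(agent: str, cheap: bool = False) -> str:
    """Cheap tasks never touch the expensive route; experiments stay inside Lab's profile."""
    profile = AGENTS[agent]
    return profile.fallback_model if cheap else profile.default_model
```

Because routing lives in one table, "cheap tasks trigger expensive models" becomes a bug you can see at a glance rather than a surprise on the invoice.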
The vault is the control plane
Instead of the default OpenClaw workspace, the main agent operates inside my Obsidian vault.
The vault runs an IPARAG structure, a digital organisation framework that expands on Tiago Forte’s popular PARA method.
Ideas.
Projects.
Areas.
Resources.
Archive.
Governance.
That structure matters more than it sounds.
Agents read from it.
Write to it.
Organise work inside it.
Humans and agents operate inside the same system.
That shared environment makes coordination far simpler.
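The structure is simple enough to scaffold in a few lines. This is a sketch under the assumption that the vault is an ordinary folder on disk; the folder names come straight from the IPARAG list above.

```python
from pathlib import Path

# The six IPARAG top-level folders both humans and agents work inside.
IPARAG = ["Ideas", "Projects", "Areas", "Resources", "Archive", "Governance"]

def scaffold_vault(root: str) -> list[Path]:
    """Create the IPARAG skeleton inside an Obsidian vault. Idempotent: safe to re-run."""
    base = Path(root).expanduser()
    created = []
    for folder in IPARAG:
        path = base / folder
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Because the control plane is just folders and markdown, any agent that can read and write files can participate without a special integration.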
Memory has to be inspectable
QMD-backed retrieval plus file-based memory keeps the system inspectable.
That gives the agents durable context.
But more importantly, it keeps the memory visible.
You can see what the agent knows.
Correct it.
Remove it.
Opaque memory systems make that almost impossible.
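The file-based half of that memory can be sketched directly. This is not the QMD retrieval layer, just an illustration of the inspectability argument: when memory is plain text on disk, "see it, correct it, remove it" are ordinary file operations.

```python
import json
import time
from pathlib import Path

def remember(memory_dir: str, topic: str, fact: str) -> Path:
    """Append one fact per line as JSON, so memory stays grep-able and hand-editable."""
    path = Path(memory_dir) / f"{topic}.jsonl"
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a") as f:
        f.write(json.dumps({"ts": time.time(), "fact": fact}) + "\n")
    return path

def recall(memory_dir: str, topic: str) -> list[str]:
    """Read facts back; a topic the agent knows nothing about is just a missing file."""
    path = Path(memory_dir) / f"{topic}.jsonl"
    if not path.exists():
        return []
    return [json.loads(line)["fact"] for line in path.read_text().splitlines() if line.strip()]

def forget(memory_dir: str, topic: str, substring: str) -> int:
    """Correcting memory is rewriting a text file: keep every line that does not match."""
    path = Path(memory_dir) / f"{topic}.jsonl"
    lines = path.read_text().splitlines()
    kept = [line for line in lines if substring not in line]
    path.write_text("\n".join(kept) + ("\n" if kept else ""))
    return len(lines) - len(kept)
```

An opaque vector store gives you none of these affordances without extra tooling; a JSONL file gives you all three with `grep` and a text editor.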
The biggest unsolved problem: context
The hardest issue I’m still working through is context size.
Every request carries a surprising amount of instruction and system context.
Agent rules.
Workspace files.
Memory references.
Tool instructions.
That front-loading adds up quickly.
It works.
But it is inefficient.
The real optimisation problem for agent systems is not just model choice.
It is how much context the system sends on every call.
That is where most token spend hides.
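A back-of-the-envelope budget makes the point concrete. The sizes below are invented for illustration (roughly four characters per token is a common rule of thumb, not an exact tokenizer), but the shape of the result is the real lesson: the fixed preamble dwarfs the actual user message.

```python
def context_overhead(parts: dict[str, str], chars_per_token: float = 4.0) -> dict[str, int]:
    """Rough token estimate for the static context shipped with every single request."""
    estimate = {name: round(len(text) / chars_per_token) for name, text in parts.items()}
    estimate["total"] = sum(estimate.values())
    return estimate

# Hypothetical sizes for the four front-loaded pieces named above.
per_call = context_overhead({
    "agent_rules": "r" * 8_000,
    "workspace_files": "w" * 20_000,
    "memory_references": "m" * 4_000,
    "tool_instructions": "t" * 12_000,
})
```

At these (made-up) sizes the system pays roughly 11,000 tokens of overhead before the user says a word, on every call. Trimming any one of those pieces usually saves more money than switching models.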
Coding agents are tools, not co-workers
Another lesson was how to handle implementation work.
Instead of forcing the OpenClaw agents to write and debug code themselves, they delegate to local tools.
Codex.
Gemini.
Cursor.
The OpenClaw agents prepare the brief.
The coding tools do the heavy work.
Then the agents review the results.
This keeps agent threads short and avoids endless debugging loops inside chat.
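The hand-off can be as simple as a brief file in the workspace. This is a hypothetical sketch, not the actual OpenClaw mechanism: the agent writes the brief, a coding tool (Codex, Cursor, Gemini) works from it, and the agent only reads the result.

```python
from pathlib import Path

# Hypothetical brief format; the filename and fields are assumptions for illustration.
BRIEF_TEMPLATE = """# Implementation brief
Goal: {goal}
Constraints: {constraints}
Done when: {done}
"""

def write_brief(workspace: str, goal: str, constraints: str, done: str) -> Path:
    """The agent prepares the brief; the coding tool does the heavy work from it."""
    path = Path(workspace) / "BRIEF.md"
    path.write_text(BRIEF_TEMPLATE.format(goal=goal, constraints=constraints, done=done))
    return path
```

The chat thread stays short because the debugging loop happens inside the coding tool, not inside the agent conversation.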
Codex became the real sidekick
The tool I rely on most now is the Codex app.
OpenClaw itself lives as a project inside Codex.
Whenever something behaves strangely, Codex helps investigate.
It reviews logs.
Checks OpenRouter token usage.
Surfaces configuration mistakes.
It is also useful for optimisation.
I regularly have it scan the system looking for improvements.
Sometimes that comes from log analysis.
Sometimes from ideas pulled in from other builders writing about similar systems.
The setup is constantly evolving.
Cheap automation matters
Some jobs run on powerful models.
Most should not.
Selected cron and maintenance jobs run on smaller models or local Ollama.
Daily briefings.
Maintenance tasks.
Simple housekeeping.
Those tasks do not need reasoning power.
They need reliability.
Running them locally keeps costs predictable.
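The routing rule behind that is tiny. Job names and model identifiers here are assumptions for illustration; the real system's cron configuration will differ, but the split is the same: housekeeping goes local, reasoning goes hosted.

```python
# Jobs that need reliability, not reasoning, run on a local Ollama model.
CHEAP_JOBS = {"daily_briefing", "vault_maintenance", "session_cleanup"}

def model_for_job(job: str) -> str:
    """Hypothetical routing rule: local model for housekeeping, hosted model otherwise."""
    if job in CHEAP_JOBS:
        return "ollama/llama3.1"        # local, flat cost, always available
    return "anthropic/claude-sonnet"    # hosted, reserved for work that needs reasoning
```

Because the cheap set is an explicit allow-list, a new cron job defaults to the expensive route until someone decides it is safe to downgrade, which is the right failure mode.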
Security needed attention early
The moment agents connect to real services, security becomes real too.
Two changes made a big difference.
Secrets moved to environment variables instead of config files.
And the gateway stays local on loopback, with Tailscale Serve layered on top for secure access.
That is a much safer posture than exposing a raw endpoint.
Not perfect.
But materially better.
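Both changes are small in code. A sketch of the posture, with assumed names throughout: secrets come only from the environment and fail loudly when absent, and the gateway binds to loopback so nothing is reachable except through Tailscale.

```python
import os

GATEWAY_HOST = "127.0.0.1"  # loopback only; Tailscale Serve fronts all remote access

def require_secret(name: str) -> str:
    """Read secrets from the environment, never from config files on disk."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"missing secret {name}: set it in the environment, not in config")
    return value
```

Failing at startup when a secret is missing beats silently falling back to a token committed in a config file, which is exactly the leak this change was meant to close.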
It works, but it still has moments
The system works well now.
But it still has its moments.
Model quirks.
Unexpected context issues.
The occasional runaway instruction loop.
Dropped memory.
Failed cron jobs.
That seems to be the nature of agent systems right now.
You do not finish building them.
You operate them.
You tweak them.
The real lesson
Running agents is easy.
Operating them is the real work.
The intelligence is only one piece.
The architecture around it matters more.
Memory.
Security.
Context discipline.
Model routing.
Without those pieces you have a demo.
With them you start to have a system.