
Architecture Study: OpenClaw

Date: 2026-03-22 | Source: github.com/openclaw/openclaw


What OpenClaw Is

OpenClaw is an open-source, self-hosted personal AI assistant platform that enables LLMs to autonomously execute multi-step tasks across messaging platforms, devices, and physical robots. Created by Peter Steinberger (founder of PSPDFKit), it launched as "Clawdbot" in November 2025, was renamed after an Anthropic trademark complaint, and by March 2026 had 247,000 GitHub stars — making it one of the most popular open-source AI projects. Steinberger joined OpenAI in February 2026 and transitioned the project to an open-source foundation.

OpenClaw positions itself as an "operating system for AI agents" rather than a chatbot wrapper. Its robotics integration via the RosClaw bridge is directly relevant to Auraison's user-plane architecture.


OpenClaw Architecture

Hub-and-Spoke with a Central Gateway

OpenClaw uses a Gateway as its single control plane — a WebSocket server (ws://127.0.0.1:18789) that routes messages, assembles context, invokes LLMs, executes tools, and delivers responses.

User (any of 20+ messaging channels)
→ Channel Adapter (normalize to unified message)
→ Access Control (allowlists, DM pairing, group mention filters)
→ Context Assembly (session history + AGENTS.md + SOUL.md + TOOLS.md + vector search)
→ Model Invocation (Claude, GPT, Gemini, Ollama, local models)
→ Tool Execution (ReAct loop: reason → act → observe → repeat)
→ Response Delivery (format → channel adapter → persist turn)

The Agentic Loop (ReAct Pattern)

The core loop implements ReAct (Reason + Act):

  1. Gateway receives a message and assembles context (session history, dynamic system prompt from workspace files, semantically similar prior conversations via vector search)
  2. Context streams to the configured LLM provider
  3. LLM either responds directly or emits a structured tool call
  4. Runtime intercepts the tool call, executes it (bash, browser, ROS2 publish, MCP server call, Docker sandbox), and feeds the result back as a new message
  5. Loop continues until the LLM signals completion or configured limits are hit
  6. A Lane Queue enforces serial execution per session to prevent concurrent conflicts
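
In pseudocode terms, the loop reduces to something like the following sketch. `call_llm` and `run_tool` are illustrative placeholders, not OpenClaw APIs:

```python
# Minimal ReAct loop sketch. `call_llm` and `run_tool` are hypothetical
# stand-ins for OpenClaw's model invocation and tool runtime.

def react_loop(call_llm, run_tool, context, max_steps=8):
    """Reason -> act -> observe until the model stops calling tools."""
    for _ in range(max_steps):
        reply = call_llm(context)          # context streams to the LLM
        if reply.get("tool") is None:      # direct answer: loop terminates
            return reply["text"]
        # Intercept the tool call, execute it, feed the result back
        observation = run_tool(reply["tool"], reply.get("args", {}))
        context = context + [{"role": "tool", "content": observation}]
    return "max steps reached"             # configured limit hit
```

A Lane Queue would wrap calls to `react_loop` so that only one runs per session at a time.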

Key difference from Auraison: the loop is streaming and iterative within a single invocation, not synchronous subprocess calls that block until completion.

Robotics Integration: RosClaw Bridge

For physical robot control, OpenClaw bridges to ROS 2 via RosClaw:

User (messaging app) → OpenClaw Gateway → RosClaw Plugin → rosbridge_server (WebSocket)
→ ROS 2 DDS → Physical Robot

Seven core ROS 2 tools:

| Tool | Function |
| --- | --- |
| ros2_publish | Publish to any ROS 2 topic (/cmd_vel, /gimbal, /lights) |
| ros2_subscribe_once | Read a single message from a topic (sensor data) |
| ros2_service_call | Call a ROS 2 service |
| ros2_action_goal | Send a Nav2/MoveIt2 action goal |
| ros2_param_get/set | Read/write ROS 2 parameters |
| ros2_list_topics | Discover available topics |
| ros2_camera_snapshot | Capture RGB-D frame for VLM analysis |

Critical design principle: "Agents are not real-time controllers, so ROS 2 handles time-sensitive loops." OpenClaw handles the decision layer; ROS 2 handles classical control. This clean separation is identical to Auraison's philosophy but implemented more concretely.

Skill System

Skills are Markdown folders with a SKILL.md file containing natural language instructions. They are NOT injected wholesale into every prompt — only compact references are loaded, and the model actively selects which skills to consult. This is skill-on-demand loading.

Additionally, every skill on ClawHub (3,200+ skills) is an MCP server. The agent discovers and calls MCP tools natively, meaning the skill ecosystem is protocol-native, not prompt-native.

Memory and State

  • SQLite + Markdown as the entire backing store
  • Vector search (semantic similarity) over prior conversations
  • BM25 keyword search for exact matches
  • Workspace files (AGENTS.md, SOUL.md, TOOLS.md, MEMORY.md) configure behavior without touching code
  • Heartbeat mechanism: a 30-minute proactive loop evaluates HEARTBEAT.md, enabling scheduled autonomous behavior rather than purely reactive patterns

Multi-Agent Patterns

  • Session-based isolation: different channels/groups map to isolated agent instances with separate workspaces, models, and behaviors
  • Inter-agent communication: sessions_list (discover), sessions_send (message), sessions_history (fetch transcripts), sessions_spawn (create new sessions)
  • Trust encoded in session IDs: agent:id:main = full access, agent:id:channel:dm:id = sandboxed

Safety and Guardrails

Layered security model:

| Layer | Mechanism |
| --- | --- |
| Network | Gateway binds to loopback by default; remote via SSH/Tailscale |
| Authentication | Token/password for WebSocket; device pairing with crypto challenge-response |
| DM Policies | Pairing (approval workflow), allowlist, open, or disabled |
| Tool Authorization | Dangerous tools denied by default; policy precedence: Tool > Provider > Global > Agent > Group > Sandbox |
| Sandboxing | Untrusted sessions in ephemeral Docker containers |
| Prompt Injection | Explicitly stated as "not solved" — mitigated via model selection and tool lockdown |

Third-party guardrail ecosystem:

  • ClawGuard — cryptographic proof of guardrail enforcement at runtime
  • OpenGuardrails — real-time prompt injection defense (50+ patterns)
  • OpenClaw PRISM — zero-fork runtime security across 10 lifecycle hooks
  • NemoClaw (NVIDIA) — enterprise guardrails layer

OpenClaw-RL: Continuous Learning from Deployment

A fully decoupled asynchronous architecture with four independent loops:

  1. Policy serving (SGLang) — serves the current policy for inference
  2. Rollout collection — gathers interaction trajectories from live usage
  3. PRM judging — evaluative signals (+1/-1/0) on action quality
  4. Policy training (Megatron) — updates the model

Key innovation: Hindsight-Guided On-Policy Distillation (OPD) — converts directive corrections from users/environment into token-level supervision without explicit annotation pipelines. Agents improve through normal usage.

Simulation: ClawBody

MuJoCo simulation with real-time motor command translation — train in simulation, deploy to hardware. Same agent code works in both environments (sim-to-real parity).


Architectural Contrast with Auraison

1. Agentic Loop: Streaming ReAct vs. Synchronous Subprocess

OpenClaw: The agent loop is streaming and iterative within a single long-lived process. The LLM reasons, calls a tool, observes the result, and continues — all within one WebSocket session. Tool results flow back immediately. The Lane Queue serializes per-session but doesn't block the entire system.

Auraison: run_agent() calls subprocess.run() with capture_output=True — fully synchronous and blocking. The agent runs to completion, exits, and returns JSON. There is no streaming, no mid-execution observation by the caller, and no way to compose tool results across agents within a single reasoning chain.

# Auraison's current pattern (inside run_agent() in base.py)
import json
import subprocess

# Blocks until the agent subprocess exits and returns its final output;
# no streaming, no mid-execution observation by the caller.
result = subprocess.run(cmd, capture_output=True, text=True, cwd=str(REPO_ROOT))
return json.loads(result.stdout)  # raises if the agent printed non-JSON output

Impact: Auraison cannot implement the ReAct observe-reason-act loop across agent boundaries. If NotebookAgent submits a job and ClusterAgent needs to verify the cluster first, these are two separate subprocess invocations with no shared reasoning context.

Benefit of adopting OpenClaw's pattern: A persistent agent runtime (even if still using claude -p under the hood) with streaming tool results would enable multi-step workflows within a single reasoning chain. The --resume session_id mechanism exists in Auraison's codebase but is unused — it could implement session continuity.
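
A minimal sketch of what session continuity could look like, assuming only the `claude -p` and `--resume` flags mentioned above; exact CLI behavior should be verified against the installed Claude Code version:

```python
# Sketch: thread a session_id through repeated `claude -p` invocations so
# multi-step workflows share one reasoning chain. Flag handling is
# illustrative and should be checked against the real CLI.
import subprocess

def build_agent_cmd(prompt, session_id=None):
    cmd = ["claude", "-p"]
    if session_id is not None:
        cmd += ["--resume", session_id]   # continue an earlier session
    return cmd + [prompt]

def invoke_agent(prompt, session_id=None):
    result = subprocess.run(build_agent_cmd(prompt, session_id),
                            capture_output=True, text=True)
    return result.stdout
```

The caller keeps the session id between steps ("submit the job", then later "check on that job") instead of cold-starting each subprocess.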

2. Skill Definition: Protocol-Native MCP vs. Prompt + allowedTools

OpenClaw: Skills are MCP servers with typed tool schemas. The agent discovers available tools via the MCP protocol, gets structured JSON schemas, and calls them with validated parameters. 3,200+ skills exist in the ClawHub ecosystem. Skill-on-demand loading means only relevant skill instructions enter the context window.

Auraison: Agent capabilities are defined by a natural-language system prompt and an --allowedTools string like "Bash(kubectl *),Bash(ray *),Read". There's no skill registry, no typed schema, no way to enumerate capabilities programmatically.

Impact: Auraison agents are opaque — you can't inspect what an agent can do without reading its prompt. You can't compose skills from different agents. You can't test individual skills in isolation.

Benefit of adopting OpenClaw's pattern: Auraison already operates in an MCP-rich environment (Claude Code has native MCP support). Converting agent capabilities to MCP tools would give: typed schemas, programmatic discovery, composability across agents, and access to the broader MCP ecosystem. The ros-mcp-server in the turtlebot-maze reference app is already doing this for ROS 2.
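
For illustration, a typed tool descriptor of the kind MCP discovery returns might look like this (the `submit_ray_job` tool and its fields are hypothetical, not an existing Auraison API):

```python
# Illustrative MCP-style tool descriptor: a JSON Schema the agent can
# discover and validate against, unlike an opaque --allowedTools string.
SUBMIT_RAY_JOB_TOOL = {
    "name": "submit_ray_job",            # hypothetical tool name
    "description": "Submit a Ray Job to the KubeRay cluster.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "entrypoint": {"type": "string"},
            "num_gpus": {"type": "integer", "minimum": 0},
        },
        "required": ["entrypoint"],
    },
}

def validate_args(tool, args):
    """Minimal structural check (a sketch, not full JSON Schema validation)."""
    missing = [k for k in tool["inputSchema"].get("required", []) if k not in args]
    return (len(missing) == 0, missing)
```

With descriptors like this, capabilities become enumerable and testable in isolation, which is exactly what the opaque prompt-plus-`--allowedTools` pattern prevents.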

3. Context Assembly: Dynamic Workspace Files vs. Static System Prompts

OpenClaw: Context is assembled dynamically from workspace files (AGENTS.md, SOUL.md, TOOLS.md, MEMORY.md) plus vector search over prior conversations. The system prompt is composed at runtime from these files, not hardcoded. Behavior changes by editing Markdown, not code.

Auraison: System prompts are hardcoded strings in Python files:

# notebook_agent.py
SYSTEM_PROMPT = (
"You are the NotebookAgent. Submit Ray Jobs to the KubeRay cluster, "
"poll status, and trigger copyback when complete. ..."
)

Impact: Changing agent behavior requires code changes, redeployment, and a new subprocess invocation. There's no runtime introspection of what the agent "knows."

Benefit of adopting OpenClaw's pattern: Auraison's .claude/agents/ directory already contains YAML frontmatter + system prompts — this is halfway to OpenClaw's workspace-file pattern. Enriching these with dynamic context assembly (lakehouse query results, recent job history, cluster state) would make agents context-aware without prompt engineering.
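
A minimal sketch of dynamic context assembly, assuming only the workspace file names named above; in practice the composed prompt would also include lakehouse query results and recent job history:

```python
# Sketch: compose a system prompt at runtime from workspace Markdown files,
# so agent behavior changes by editing Markdown rather than Python.
from pathlib import Path

WORKSPACE_FILES = ["AGENTS.md", "SOUL.md", "TOOLS.md", "MEMORY.md"]

def assemble_system_prompt(workspace: Path) -> str:
    sections = []
    for name in WORKSPACE_FILES:
        f = workspace / name
        if f.exists():                      # missing files are simply skipped
            sections.append(f"## {name}\n{f.read_text().strip()}")
    return "\n\n".join(sections)
```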

4. Memory: Vector Search + Workspace Files vs. In-Memory Dict

OpenClaw: SQLite with vector extensions provides semantic search over prior conversations. BM25 provides keyword matching. MEMORY.md captures persistent knowledge. The agent can recall relevant prior interactions and learn from past failures.

Auraison: Job state is an in-memory Python dict (_jobs: dict in api/jobs.py). Agents have no memory across invocations — each subprocess.run() starts fresh. The --resume session_id flag exists but relies on Claude Code's internal session state, which is ephemeral.

Impact: Auraison agents cannot learn from past job failures, recall successful patterns, or build up operational knowledge over time. Every invocation is a cold start.

Benefit of adopting OpenClaw's pattern: Auraison's data plane (DuckDB + DuckLake) is far more capable than OpenClaw's SQLite. The missing piece is wiring agent memory INTO the lakehouse — storing agent traces, tool call results, and operational patterns in DuckLake tables that agents can query in future invocations. The data plane design doc already envisions this ("episodic memory", "procedural memory") but it's not implemented.
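
As a sketch of what wiring agent traces into a queryable store could look like — sqlite3 keeps the example self-contained, while Auraison would target DuckDB/DuckLake with essentially the same SQL, and the table schema here is illustrative:

```python
# Sketch: persist agent traces so future invocations can recall them
# instead of cold-starting. Table and column names are hypothetical.
import sqlite3

def open_trace_store(path=":memory:"):
    con = sqlite3.connect(path)
    con.execute("""CREATE TABLE IF NOT EXISTS agent_traces (
        session_id TEXT, tool TEXT, args TEXT, result TEXT,
        ok INTEGER, ts TEXT DEFAULT CURRENT_TIMESTAMP)""")
    return con

def record(con, session_id, tool, args, result, ok):
    con.execute("INSERT INTO agent_traces (session_id, tool, args, result, ok) "
                "VALUES (?, ?, ?, ?, ?)", (session_id, tool, args, result, int(ok)))

def recent_failures(con, tool, limit=5):
    # What a warm-started agent would consult before retrying a tool.
    return con.execute("SELECT args, result FROM agent_traces "
                       "WHERE tool = ? AND ok = 0 ORDER BY ts DESC LIMIT ?",
                       (tool, limit)).fetchall()
```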

5. ROS 2 Integration: Direct Bridge vs. Layered Abstraction

OpenClaw (RosClaw): Seven typed tools (ros2_publish, ros2_subscribe_once, ros2_action_goal, etc.) that map directly to ROS 2 primitives via rosbridge WebSocket. Clean, minimal, immediately usable.

Auraison: The turtlebot-maze app uses ros-mcp-server (MCP over rosbridge WebSocket :9090), which is architecturally identical to RosClaw. But this lives in a separate repo and isn't integrated into the control plane — it's a user-plane application detail.

Impact: Auraison's control plane has no awareness of ROS 2 capabilities. The ClusterAgent manages Kubernetes but can't inspect what ROS nodes are running or what topics are available.

Benefit of adopting OpenClaw's pattern: Promoting the ROS 2 bridge to a first-class control-plane capability (a RosAgent with ros2_list_topics, ros2_subscribe_once, ros2_camera_snapshot tools) would enable the control plane to observe and reason about user-plane robot state — completing the control loop.
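
The rosbridge wire format such tools reduce to is small: each tool call becomes one JSON message over the WebSocket (:9090). A sketch using the standard rosbridge `publish` and `subscribe` ops:

```python
# Sketch: the rosbridge protocol messages a RosClaw-style tool layer emits.
# Only the message construction is shown; sending them requires an open
# WebSocket connection to rosbridge_server.
import json

def ros2_publish_msg(topic: str, msg: dict) -> str:
    # "op": "publish" is part of the rosbridge v2 protocol.
    return json.dumps({"op": "publish", "topic": topic, "msg": msg})

def ros2_subscribe_msg(topic: str) -> str:
    return json.dumps({"op": "subscribe", "topic": topic})
```

For example, `ros2_publish_msg("/cmd_vel", {"linear": {"x": 0.2}})` yields the payload a `ros2_publish` tool call would send to drive the robot forward.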

6. Multi-Agent Communication: Session Primitives vs. No Inter-Agent Protocol

OpenClaw: Agents communicate via session primitives (sessions_send, sessions_spawn, sessions_history). Agent A can spawn Agent B, send it a task, and read its transcript. Trust boundaries are encoded in session IDs.

Auraison: Agents don't communicate. The FastAPI router orchestrates them procedurally — submit_notebook_job() calls NotebookAgent, the router checks the result, then calls WandBAgent separately. There's no mechanism for one agent to delegate to or consult another.

Impact: Complex workflows (submit job → monitor → evaluate experiment → decide whether to retrain → retrain) require the API layer to implement all orchestration logic in Python. The agents are "dumb workers," not collaborating peers.

Benefit of adopting OpenClaw's pattern: Inter-agent session primitives would enable the planned AgentOps subsystem without building a custom orchestration engine. NotebookAgent could spawn a WandBAgent session to evaluate results, read its transcript, and decide next steps — all within a single reasoning chain.
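
A sketch of what the session primitives could look like as plain functions over an in-memory registry; the registry and function bodies are illustrative, not OpenClaw's implementation:

```python
# Sketch: OpenClaw-style session primitives. In a real runtime,
# sessions_send would invoke the target agent; here it only records
# the message so the transcript can be read back.
_SESSIONS: dict = {}

def sessions_spawn(session_id: str) -> str:
    _SESSIONS.setdefault(session_id, [])
    return session_id

def sessions_send(session_id: str, message: str) -> None:
    _SESSIONS[session_id].append(message)

def sessions_history(session_id: str) -> list:
    return list(_SESSIONS[session_id])

def sessions_list() -> list:
    return sorted(_SESSIONS)
```

With these four calls, a NotebookAgent could spawn `agent:wandb:main`, send it "evaluate run X", and read the transcript back, with trust encoded in the session id as described above.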

7. Proactive Behavior: Heartbeat vs. Reactive-Only

OpenClaw: The Heartbeat mechanism evaluates HEARTBEAT.md every 30 minutes, enabling scheduled autonomous behavior — the agent can proactively check systems, summarize overnight events, or trigger maintenance without human prompting.

Auraison: Purely reactive — agents run only when the API router invokes them. There's no mechanism for proactive monitoring, scheduled health checks, or autonomous maintenance.

Benefit of adopting OpenClaw's pattern: A heartbeat for the ClusterAgent would enable proactive cluster monitoring — checking GPU utilization, detecting failing nodes, alerting on W&B metric regressions — without requiring a human to hit the API endpoint. This is the "self-managing infrastructure" vision from Auraison's README, but OpenClaw has a concrete implementation pattern.
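
A heartbeat can be sketched as a small loop that re-reads HEARTBEAT.md and hands it to an agent callable. Here `invoke` stands in for a `claude -p` invocation, and the 30-minute interval matches OpenClaw's default:

```python
# Sketch: proactive heartbeat loop. `invoke` is a placeholder for whatever
# runs the agent (e.g. a claude -p subprocess); `max_beats` exists so the
# loop is testable rather than infinite.
import time
from pathlib import Path

def heartbeat(workspace: Path, invoke, interval_s=1800, max_beats=None):
    beats = 0
    while max_beats is None or beats < max_beats:
        hb = workspace / "HEARTBEAT.md"
        if hb.exists():
            # e.g. "check GPU utilization, alert on W&B metric regressions"
            invoke(hb.read_text())
        beats += 1
        if max_beats is None or beats < max_beats:
            time.sleep(interval_s)
```

The same effect could be had with cron invoking `claude -p` directly, as the summary table below suggests; a resident loop simply keeps the logic in one place.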

8. Safety: Ecosystem vs. allowedTools

OpenClaw: Layered security with policy precedence (Tool > Provider > Global > Agent > Group > Sandbox), ephemeral Docker sandboxes for untrusted sessions, and a third-party guardrail ecosystem (ClawGuard, OpenGuardrails, PRISM, NemoClaw). Tool authorization is fine-grained per-session.

Auraison: Security is --allowedTools strings per agent. No sandboxing, no per-session policies, no guardrail framework. The blast-radius containment is real (subprocess isolation) but coarse.

Benefit: Auraison's v1.5 Guardrails Engine plan (per-role constraint checks, per-tool-call evaluation) aligns with OpenClaw's model. The policy precedence hierarchy (Tool > Provider > Global > Agent) is a concrete blueprint Auraison could adopt directly.
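
The precedence hierarchy resolves to a short first-opinion-wins lookup. This sketch assumes each layer returns "allow", "deny", or no opinion (None):

```python
# Sketch: resolve a tool-authorization decision through OpenClaw's stated
# precedence. The first layer with an opinion wins; with no opinions,
# dangerous tools are denied by default.
PRECEDENCE = ["tool", "provider", "global", "agent", "group", "sandbox"]

def resolve_policy(policies: dict, default="deny") -> str:
    for layer in PRECEDENCE:
        decision = policies.get(layer)
        if decision is not None:
            return decision
    return default
```

Note the ordering: a tool-level allow overrides a global deny, while an agent-level allow does not.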

9. Continuous Learning: OpenClaw-RL vs. No Learning Loop

OpenClaw-RL: Agents improve through normal usage via Hindsight-Guided On-Policy Distillation — corrections from users/environment become training signals without annotation pipelines. Four decoupled async loops (policy serving, rollout collection, judging, training).

Auraison: No learning loop. Agent behavior is fixed by the system prompt and the underlying Claude model. The data plane has the infrastructure (DuckLake for storing traces, W&B for experiment tracking) but no feedback mechanism from operational outcomes to agent behavior.

Benefit: OpenClaw-RL's architecture maps cleanly onto Auraison's existing infrastructure: SGLang policy serving → vLLM on torch.dev.gpu; rollout collection → agent trace storage in DuckLake; training → TRL jobs via the existing training pipeline. The pieces exist; the feedback loop doesn't.


What Auraison Has That OpenClaw Doesn't

| Auraison advantage | OpenClaw gap |
| --- | --- |
| Data plane / lakehouse | SQLite + Markdown; no analytical query layer, no time travel, no schema evolution |
| GPU orchestration | No compute scheduling; relies on external infrastructure |
| Training pipeline | No SFT/DPO/GRPO integration; OpenClaw-RL is a separate project |
| World models | No Cosmos integration; no sim-to-real pipeline (ClawBody is MuJoCo-only, no photorealistic transfer) |
| Digital twins | No persistent asset state tracking |
| Plane separation | Single-process monolith; Gateway is the entire system |
| Enterprise readiness | Personal assistant; no multi-tenancy, no billing, no audit compliance |

Summary: What Auraison Should Adopt

| OpenClaw Pattern | Auraison Benefit | Implementation Effort |
| --- | --- | --- |
| Streaming ReAct loop | Multi-step agent workflows without procedural orchestration | Medium — leverage --resume and streaming output from claude -p |
| MCP-native skills | Typed, discoverable, composable agent capabilities | Low — MCP infrastructure already exists (ros-mcp-server, Claude Code MCP) |
| Dynamic context assembly | Context-aware agents that adapt to current system state | Low — extend .claude/agents/ YAML with DuckLake queries |
| Agent memory via lakehouse | Agents learn from past operations; no cold starts | Medium — wire agent traces into DuckLake tables |
| Inter-agent sessions | Agent-to-agent delegation without API-layer orchestration | Medium — implement session primitives in run_agent() |
| Heartbeat / proactive behavior | Self-managing infrastructure (the stated vision) | Low — cron + claude -p with cluster health prompt |
| Policy-precedence security | Fine-grained, layered guardrails replacing coarse --allowedTools | Medium — aligns with planned v1.5 Guardrails Engine |
| ROS 2 bridge as control-plane tool | Control plane observes and reasons about robot state | Low — promote ros-mcp-server to a control-plane agent tool |

The core insight: OpenClaw proves that a claude -p-style agent runtime (LLM + tools + ReAct loop) can scale to 247K-star adoption when the composition primitives are right — MCP for tools, workspace files for behavior, session primitives for multi-agent, and heartbeat for proactive behavior. Auraison has better infrastructure (lakehouse, GPU orchestration, world models) but worse agent plumbing. The highest-leverage move is adopting OpenClaw's agent runtime patterns on top of Auraison's superior data and compute layers.