
Architecture Study: OpenClaw

Date: 2026-03-22 | Source: github.com/openclaw/openclaw


What OpenClaw Is

OpenClaw is an open-source, self-hosted personal AI assistant platform that enables LLMs to autonomously execute multi-step tasks across messaging platforms, devices, and physical robots. Created by Peter Steinberger (founder of PSPDFKit), it launched as "Clawdbot" in November 2025, was renamed after an Anthropic trademark complaint, and by March 2026 had 247,000 GitHub stars — making it one of the most popular open-source AI projects. Steinberger joined OpenAI in February 2026 and transitioned the project to an open-source foundation.

OpenClaw positions itself as an "operating system for AI agents" rather than a chatbot wrapper. Its robotics integration via the RosClaw bridge is directly relevant to Auraison's user-plane architecture.


OpenClaw Architecture

Hub-and-Spoke with a Central Gateway

OpenClaw uses a Gateway as its single control plane — a WebSocket server (ws://127.0.0.1:18789) that routes messages, assembles context, invokes LLMs, executes tools, and delivers responses.

User (any of 20+ messaging channels)
→ Channel Adapter (normalize to unified message)
→ Access Control (allowlists, DM pairing, group mention filters)
→ Context Assembly (session history + AGENTS.md + SOUL.md + TOOLS.md + vector search)
→ Model Invocation (Claude, GPT, Gemini, Ollama, local models)
→ Tool Execution (ReAct loop: reason → act → observe → repeat)
→ Response Delivery (format → channel adapter → persist turn)

The Agentic Loop (ReAct Pattern)

The core loop implements ReAct (Reason + Act):

  1. Gateway receives a message and assembles context (session history, dynamic system prompt from workspace files, semantically similar prior conversations via vector search)
  2. Context streams to the configured LLM provider
  3. LLM either responds directly or emits a structured tool call
  4. Runtime intercepts the tool call, executes it (bash, browser, ROS2 publish, MCP server call, Docker sandbox), and feeds the result back as a new message
  5. Loop continues until the LLM signals completion or configured limits are hit
  6. A Lane Queue enforces serial execution per session to prevent concurrent conflicts
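
In pseudocode terms, the loop reduces to something like the following sketch. `call_llm` and `run_tool` are illustrative placeholders, not OpenClaw APIs:

```python
# Minimal ReAct loop sketch. `call_llm` and `run_tool` are hypothetical
# stand-ins for OpenClaw's model invocation and tool runtime.

def react_loop(call_llm, run_tool, context, max_steps=8):
    """Reason -> act -> observe until the model stops calling tools."""
    for _ in range(max_steps):
        reply = call_llm(context)          # context streams to the LLM
        if reply.get("tool") is None:      # direct answer: loop terminates
            return reply["text"]
        # Intercept the tool call, execute it, feed the result back
        observation = run_tool(reply["tool"], reply.get("args", {}))
        context = context + [{"role": "tool", "content": observation}]
    return "max steps reached"             # configured limit hit
```

A Lane Queue would wrap calls to `react_loop` so that only one runs per session at a time.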

Key difference from Auraison: the loop is streaming and iterative within a single invocation, not synchronous subprocess calls that block until completion.

Robotics Integration: RosClaw Bridge

For physical robot control, OpenClaw bridges to ROS 2 via RosClaw:

User (messaging app) → OpenClaw Gateway → RosClaw Plugin → rosbridge_server (WebSocket)
→ ROS 2 DDS → Physical Robot

Seven core ROS 2 tools:

| Tool | Function |
| --- | --- |
| ros2_publish | Publish to any ROS 2 topic (/cmd_vel, /gimbal, /lights) |
| ros2_subscribe_once | Read a single message from a topic (sensor data) |
| ros2_service_call | Call a ROS 2 service |
| ros2_action_goal | Send a Nav2/MoveIt2 action goal |
| ros2_param_get/set | Read/write ROS 2 parameters |
| ros2_list_topics | Discover available topics |
| ros2_camera_snapshot | Capture RGB-D frame for VLM analysis |

Critical design principle: "Agents are not real-time controllers, so ROS 2 handles time-sensitive loops." OpenClaw handles the decision layer; ROS 2 handles classical control. This clean separation is identical to Auraison's philosophy but implemented more concretely.

Skill System

Skills are Markdown folders with a SKILL.md file containing natural language instructions. They are NOT injected wholesale into every prompt — only compact references are loaded, and the model actively selects which skills to consult. This is skill-on-demand loading.

Additionally, every skill on ClawHub (3,200+ skills) is an MCP server. The agent discovers and calls MCP tools natively, meaning the skill ecosystem is protocol-native, not prompt-native.

Memory and State

  • SQLite + Markdown as the entire backing store
  • Vector search (semantic similarity) over prior conversations
  • BM25 keyword search for exact matches
  • Workspace files (AGENTS.md, SOUL.md, TOOLS.md, MEMORY.md) configure behavior without touching code
  • Heartbeat mechanism: a 30-minute proactive loop evaluates HEARTBEAT.md, enabling scheduled autonomous behavior rather than purely reactive patterns

Multi-Agent Patterns

  • Session-based isolation: different channels/groups map to isolated agent instances with separate workspaces, models, and behaviors
  • Inter-agent communication: sessions_list (discover), sessions_send (message), sessions_history (fetch transcripts), sessions_spawn (create new sessions)
  • Trust encoded in session IDs: agent:id:main = full access, agent:id:channel:dm:id = sandboxed

Safety and Guardrails

Layered security model:

| Layer | Mechanism |
| --- | --- |
| Network | Gateway binds to loopback by default; remote via SSH/Tailscale |
| Authentication | Token/password for WebSocket; device pairing with crypto challenge-response |
| DM Policies | Pairing (approval workflow), allowlist, open, or disabled |
| Tool Authorization | Dangerous tools denied by default; policy precedence: Tool > Provider > Global > Agent > Group > Sandbox |
| Sandboxing | Untrusted sessions in ephemeral Docker containers |
| Prompt Injection | Explicitly stated as "not solved" — mitigated via model selection and tool lockdown |

Third-party guardrail ecosystem:

  • ClawGuard — cryptographic proof of guardrail enforcement at runtime
  • OpenGuardrails — real-time prompt injection defense (50+ patterns)
  • OpenClaw PRISM — zero-fork runtime security across 10 lifecycle hooks
  • NemoClaw (NVIDIA) — enterprise guardrails layer

OpenClaw-RL: Continuous Learning from Deployment

A fully decoupled asynchronous architecture with four independent loops:

  1. Policy serving (SGLang) — serves the current policy for inference
  2. Rollout collection — gathers interaction trajectories from live usage
  3. PRM judging — evaluative signals (+1/-1/0) on action quality
  4. Policy training (Megatron) — updates the model

Key innovation: Hindsight-Guided On-Policy Distillation (OPD) — converts directive corrections from users/environment into token-level supervision without explicit annotation pipelines. Agents improve through normal usage.

Simulation: ClawBody

MuJoCo simulation with real-time motor command translation — train in simulation, deploy to hardware. Same agent code works in both environments (sim-to-real parity).


Architectural Contrast with Auraison

1. Agentic Loop: Streaming ReAct vs. Synchronous Subprocess

OpenClaw: The agent loop is streaming and iterative within a single long-lived process. The LLM reasons, calls a tool, observes the result, and continues — all within one WebSocket session. Tool results flow back immediately. The Lane Queue serializes per-session but doesn't block the entire system.

Auraison: run_agent() calls subprocess.run() with capture_output=True — fully synchronous and blocking. The agent runs to completion, exits, and returns JSON. There is no streaming, no mid-execution observation by the caller, and no way to compose tool results across agents within a single reasoning chain.

# Auraison's current pattern (inside run_agent() in base.py)
import json
import subprocess

# Blocks until the agent subprocess exits and returns its final output;
# no streaming, no mid-execution observation by the caller.
result = subprocess.run(cmd, capture_output=True, text=True, cwd=str(REPO_ROOT))
return json.loads(result.stdout)  # raises if the agent printed non-JSON output

Impact: Auraison cannot implement the ReAct observe-reason-act loop across agent boundaries. If NotebookAgent submits a job and ClusterAgent needs to verify the cluster first, these are two separate subprocess invocations with no shared reasoning context.

Benefit of adopting OpenClaw's pattern: A persistent agent runtime (even if still using claude -p under the hood) with streaming tool results would enable multi-step workflows within a single reasoning chain. The --resume session_id mechanism exists in Auraison's codebase but is unused — it could implement session continuity.
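
A minimal sketch of what session continuity could look like, assuming only the `claude -p` and `--resume` flags mentioned above; exact CLI behavior should be verified against the installed Claude Code version:

```python
# Sketch: thread a session_id through repeated `claude -p` invocations so
# multi-step workflows share one reasoning chain. Flag handling is
# illustrative and should be checked against the real CLI.
import subprocess

def build_agent_cmd(prompt, session_id=None):
    cmd = ["claude", "-p"]
    if session_id is not None:
        cmd += ["--resume", session_id]   # continue an earlier session
    return cmd + [prompt]

def invoke_agent(prompt, session_id=None):
    result = subprocess.run(build_agent_cmd(prompt, session_id),
                            capture_output=True, text=True)
    return result.stdout
```

The caller keeps the session id between steps ("submit the job", then later "check on that job") instead of cold-starting each subprocess.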

2. Skill Definition: Protocol-Native MCP vs. Prompt + allowedTools

OpenClaw: Skills are MCP servers with typed tool schemas. The agent discovers available tools via the MCP protocol, gets structured JSON schemas, and calls them with validated parameters. 3,200+ skills exist in the ClawHub ecosystem. Skill-on-demand loading means only relevant skill instructions enter the context window.

Auraison: Agent capabilities are defined by a natural-language system prompt and an --allowedTools string like "Bash(kubectl *),Bash(ray *),Read". There's no skill registry, no typed schema, no way to enumerate capabilities programmatically.

Impact: Auraison agents are opaque — you can't inspect what an agent can do without reading its prompt. You can't compose skills from different agents. You can't test individual skills in isolation.

Benefit of adopting OpenClaw's pattern: Auraison already operates in an MCP-rich environment (Claude Code has native MCP support). Converting agent capabilities to MCP tools would give: typed schemas, programmatic discovery, composability across agents, and access to the broader MCP ecosystem. The ros-mcp-server in the turtlebot-maze reference app is already doing this for ROS 2.
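
For illustration, a typed tool descriptor of the kind MCP discovery returns might look like this (the `submit_ray_job` tool and its fields are hypothetical, not an existing Auraison API):

```python
# Illustrative MCP-style tool descriptor: a JSON Schema the agent can
# discover and validate against, unlike an opaque --allowedTools string.
SUBMIT_RAY_JOB_TOOL = {
    "name": "submit_ray_job",            # hypothetical tool name
    "description": "Submit a Ray Job to the KubeRay cluster.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "entrypoint": {"type": "string"},
            "num_gpus": {"type": "integer", "minimum": 0},
        },
        "required": ["entrypoint"],
    },
}

def validate_args(tool, args):
    """Minimal structural check (a sketch, not full JSON Schema validation)."""
    missing = [k for k in tool["inputSchema"].get("required", []) if k not in args]
    return (len(missing) == 0, missing)
```

With descriptors like this, capabilities become enumerable and testable in isolation, which is exactly what the opaque prompt-plus-`--allowedTools` pattern prevents.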

3. Context Assembly: Dynamic Workspace Files vs. Static System Prompts

OpenClaw: Context is assembled dynamically from workspace files (AGENTS.md, SOUL.md, TOOLS.md, MEMORY.md) plus vector search over prior conversations. The system prompt is composed at runtime from these files, not hardcoded. Behavior changes by editing Markdown, not code.

Auraison: System prompts are hardcoded strings in Python files:

# notebook_agent.py
SYSTEM_PROMPT = (
"You are the NotebookAgent. Submit Ray Jobs to the KubeRay cluster, "
"poll status, and trigger copyback when complete. ..."
)

Impact: Changing agent behavior requires code changes, redeployment, and a new subprocess invocation. There's no runtime introspection of what the agent "knows."

Benefit of adopting OpenClaw's pattern: Auraison's .claude/agents/ directory already contains YAML frontmatter + system prompts — this is halfway to OpenClaw's workspace-file pattern. Enriching these with dynamic context assembly (lakehouse query results, recent job history, cluster state) would make agents context-aware without prompt engineering.
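
A minimal sketch of dynamic context assembly, assuming only the workspace file names named above; in practice the composed prompt would also include lakehouse query results and recent job history:

```python
# Sketch: compose a system prompt at runtime from workspace Markdown files,
# so agent behavior changes by editing Markdown rather than Python.
from pathlib import Path

WORKSPACE_FILES = ["AGENTS.md", "SOUL.md", "TOOLS.md", "MEMORY.md"]

def assemble_system_prompt(workspace: Path) -> str:
    sections = []
    for name in WORKSPACE_FILES:
        f = workspace / name
        if f.exists():                      # missing files are simply skipped
            sections.append(f"## {name}\n{f.read_text().strip()}")
    return "\n\n".join(sections)
```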

4. Memory: Vector Search + Workspace Files vs. In-Memory Dict

OpenClaw: SQLite with vector extensions provides semantic search over prior conversations. BM25 provides keyword matching. MEMORY.md captures persistent knowledge. The agent can recall relevant prior interactions and learn from past failures.

Auraison: Job state is an in-memory Python dict (_jobs: dict in api/jobs.py). Agents have no memory across invocations — each subprocess.run() starts fresh. The --resume session_id flag exists but relies on Claude Code's internal session state, which is ephemeral.

Impact: Auraison agents cannot learn from past job failures, recall successful patterns, or build up operational knowledge over time. Every invocation is a cold start.

Benefit of adopting OpenClaw's pattern: Auraison's data plane (DuckDB + DuckLake) is far more capable than OpenClaw's SQLite. The missing piece is wiring agent memory INTO the lakehouse — storing agent traces, tool call results, and operational patterns in DuckLake tables that agents can query in future invocations. The data plane design doc already envisions this ("episodic memory", "procedural memory") but it's not implemented.
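
As a sketch of what wiring agent traces into a queryable store could look like — sqlite3 keeps the example self-contained, while Auraison would target DuckDB/DuckLake with essentially the same SQL, and the table schema here is illustrative:

```python
# Sketch: persist agent traces so future invocations can recall them
# instead of cold-starting. Table and column names are hypothetical.
import sqlite3

def open_trace_store(path=":memory:"):
    con = sqlite3.connect(path)
    con.execute("""CREATE TABLE IF NOT EXISTS agent_traces (
        session_id TEXT, tool TEXT, args TEXT, result TEXT,
        ok INTEGER, ts TEXT DEFAULT CURRENT_TIMESTAMP)""")
    return con

def record(con, session_id, tool, args, result, ok):
    con.execute("INSERT INTO agent_traces (session_id, tool, args, result, ok) "
                "VALUES (?, ?, ?, ?, ?)", (session_id, tool, args, result, int(ok)))

def recent_failures(con, tool, limit=5):
    # What a warm-started agent would consult before retrying a tool.
    return con.execute("SELECT args, result FROM agent_traces "
                       "WHERE tool = ? AND ok = 0 ORDER BY ts DESC LIMIT ?",
                       (tool, limit)).fetchall()
```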

5. ROS 2 Integration: Direct Bridge vs. Layered Abstraction

OpenClaw (RosClaw): Seven typed tools (ros2_publish, ros2_subscribe_once, ros2_action_goal, etc.) that map directly to ROS 2 primitives via rosbridge WebSocket. Clean, minimal, immediately usable.

Auraison: The turtlebot-maze app uses ros-mcp-server (MCP over rosbridge WebSocket :9090), which is architecturally identical to RosClaw. But this lives in a separate repo and isn't integrated into the control plane — it's a user-plane application detail.

Impact: Auraison's control plane has no awareness of ROS 2 capabilities. The ClusterAgent manages Kubernetes but can't inspect what ROS nodes are running or what topics are available.

Benefit of adopting OpenClaw's pattern: Promoting the ROS 2 bridge to a first-class control-plane capability (a RosAgent with ros2_list_topics, ros2_subscribe_once, ros2_camera_snapshot tools) would enable the control plane to observe and reason about user-plane robot state — completing the control loop.
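
The rosbridge wire format such tools reduce to is small: each tool call becomes one JSON message over the WebSocket (:9090). A sketch using the standard rosbridge `publish` and `subscribe` ops:

```python
# Sketch: the rosbridge protocol messages a RosClaw-style tool layer emits.
# Only the message construction is shown; sending them requires an open
# WebSocket connection to rosbridge_server.
import json

def ros2_publish_msg(topic: str, msg: dict) -> str:
    # "op": "publish" is part of the rosbridge v2 protocol.
    return json.dumps({"op": "publish", "topic": topic, "msg": msg})

def ros2_subscribe_msg(topic: str) -> str:
    return json.dumps({"op": "subscribe", "topic": topic})
```

For example, `ros2_publish_msg("/cmd_vel", {"linear": {"x": 0.2}})` yields the payload a `ros2_publish` tool call would send to drive the robot forward.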

6. Multi-Agent Communication: Session Primitives vs. No Inter-Agent Protocol

OpenClaw: Agents communicate via session primitives (sessions_send, sessions_spawn, sessions_history). Agent A can spawn Agent B, send it a task, and read its transcript. Trust boundaries are encoded in session IDs.

Auraison: Agents don't communicate. The FastAPI router orchestrates them procedurally — submit_notebook_job() calls NotebookAgent, the router checks the result, then calls WandBAgent separately. There's no mechanism for one agent to delegate to or consult another.

Impact: Complex workflows (submit job → monitor → evaluate experiment → decide whether to retrain → retrain) require the API layer to implement all orchestration logic in Python. The agents are "dumb workers," not collaborating peers.

Benefit of adopting OpenClaw's pattern: Inter-agent session primitives would enable the planned AgentOps subsystem without building a custom orchestration engine. NotebookAgent could spawn a WandBAgent session to evaluate results, read its transcript, and decide next steps — all within a single reasoning chain.
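
A sketch of what the session primitives could look like as plain functions over an in-memory registry; the registry and function bodies are illustrative, not OpenClaw's implementation:

```python
# Sketch: OpenClaw-style session primitives. In a real runtime,
# sessions_send would invoke the target agent; here it only records
# the message so the transcript can be read back.
_SESSIONS: dict = {}

def sessions_spawn(session_id: str) -> str:
    _SESSIONS.setdefault(session_id, [])
    return session_id

def sessions_send(session_id: str, message: str) -> None:
    _SESSIONS[session_id].append(message)

def sessions_history(session_id: str) -> list:
    return list(_SESSIONS[session_id])

def sessions_list() -> list:
    return sorted(_SESSIONS)
```

With these four calls, a NotebookAgent could spawn `agent:wandb:main`, send it "evaluate run X", and read the transcript back, with trust encoded in the session id as described above.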

7. Proactive Behavior: Heartbeat vs. Reactive-Only

OpenClaw: The Heartbeat mechanism evaluates HEARTBEAT.md every 30 minutes, enabling scheduled autonomous behavior — the agent can proactively check systems, summarize overnight events, or trigger maintenance without human prompting.

Auraison: Purely reactive — agents run only when the API router invokes them. There's no mechanism for proactive monitoring, scheduled health checks, or autonomous maintenance.

Benefit of adopting OpenClaw's pattern: A heartbeat for the ClusterAgent would enable proactive cluster monitoring — checking GPU utilization, detecting failing nodes, alerting on W&B metric regressions — without requiring a human to hit the API endpoint. This is the "self-managing infrastructure" vision from Auraison's README, but OpenClaw has a concrete implementation pattern.
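
A heartbeat can be sketched as a small loop that re-reads HEARTBEAT.md and hands it to an agent callable. Here `invoke` stands in for a `claude -p` invocation, and the 30-minute interval matches OpenClaw's default:

```python
# Sketch: proactive heartbeat loop. `invoke` is a placeholder for whatever
# runs the agent (e.g. a claude -p subprocess); `max_beats` exists so the
# loop is testable rather than infinite.
import time
from pathlib import Path

def heartbeat(workspace: Path, invoke, interval_s=1800, max_beats=None):
    beats = 0
    while max_beats is None or beats < max_beats:
        hb = workspace / "HEARTBEAT.md"
        if hb.exists():
            # e.g. "check GPU utilization, alert on W&B metric regressions"
            invoke(hb.read_text())
        beats += 1
        if max_beats is None or beats < max_beats:
            time.sleep(interval_s)
```

The same effect could be had with cron invoking `claude -p` directly, as the summary table below suggests; a resident loop simply keeps the logic in one place.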

8. Safety: Ecosystem vs. allowedTools

OpenClaw: Layered security with policy precedence (Tool > Provider > Global > Agent > Group > Sandbox), ephemeral Docker sandboxes for untrusted sessions, and a third-party guardrail ecosystem (ClawGuard, OpenGuardrails, PRISM, NemoClaw). Tool authorization is fine-grained per-session.

Auraison: Security is --allowedTools strings per agent. No sandboxing, no per-session policies, no guardrail framework. The blast-radius containment is real (subprocess isolation) but coarse.

Benefit: Auraison's v1.5 Guardrails Engine plan (per-role constraint checks, per-tool-call evaluation) aligns with OpenClaw's model. The policy precedence hierarchy (Tool > Provider > Global > Agent) is a concrete blueprint Auraison could adopt directly.
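
The precedence hierarchy resolves to a short first-opinion-wins lookup. This sketch assumes each layer returns "allow", "deny", or no opinion (None):

```python
# Sketch: resolve a tool-authorization decision through OpenClaw's stated
# precedence. The first layer with an opinion wins; with no opinions,
# dangerous tools are denied by default.
PRECEDENCE = ["tool", "provider", "global", "agent", "group", "sandbox"]

def resolve_policy(policies: dict, default="deny") -> str:
    for layer in PRECEDENCE:
        decision = policies.get(layer)
        if decision is not None:
            return decision
    return default
```

Note the ordering: a tool-level allow overrides a global deny, while an agent-level allow does not.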

9. Continuous Learning: OpenClaw-RL vs. No Learning Loop

OpenClaw-RL: Agents improve through normal usage via Hindsight-Guided On-Policy Distillation — corrections from users/environment become training signals without annotation pipelines. Four decoupled async loops (policy serving, rollout collection, judging, training).

Auraison: No learning loop. Agent behavior is fixed by the system prompt and the underlying Claude model. The data plane has the infrastructure (DuckLake for storing traces, W&B for experiment tracking) but no feedback mechanism from operational outcomes to agent behavior.

Benefit: OpenClaw-RL's architecture maps cleanly onto Auraison's existing infrastructure: SGLang policy serving → vLLM on torch.dev.gpu; rollout collection → agent trace storage in DuckLake; training → TRL jobs via the existing training pipeline. The pieces exist; the feedback loop doesn't.


What Auraison Has That OpenClaw Doesn't

| Auraison advantage | OpenClaw gap |
| --- | --- |
| Data plane / lakehouse | SQLite + Markdown; no analytical query layer, no time travel, no schema evolution |
| GPU orchestration | No compute scheduling; relies on external infrastructure |
| Training pipeline | No SFT/DPO/GRPO integration; OpenClaw-RL is a separate project |
| World models | No Cosmos integration; no sim-to-real pipeline (ClawBody is MuJoCo-only, no photorealistic transfer) |
| Digital twins | No persistent asset state tracking |
| Plane separation | Single-process monolith; Gateway is the entire system |
| Enterprise readiness | Personal assistant; no multi-tenancy, no billing, no audit compliance |

Summary: What Auraison Should Adopt

| OpenClaw Pattern | Auraison Benefit | Implementation Effort |
| --- | --- | --- |
| Streaming ReAct loop | Multi-step agent workflows without procedural orchestration | Medium — leverage --resume and streaming output from claude -p |
| MCP-native skills | Typed, discoverable, composable agent capabilities | Low — MCP infrastructure already exists (ros-mcp-server, Claude Code MCP) |
| Dynamic context assembly | Context-aware agents that adapt to current system state | Low — extend .claude/agents/ YAML with DuckLake queries |
| Agent memory via lakehouse | Agents learn from past operations; no cold starts | Medium — wire agent traces into DuckLake tables |
| Inter-agent sessions | Agent-to-agent delegation without API-layer orchestration | Medium — implement session primitives in run_agent() |
| Heartbeat / proactive behavior | Self-managing infrastructure (the stated vision) | Low — cron + claude -p with cluster health prompt |
| Policy-precedence security | Fine-grained, layered guardrails replacing coarse --allowedTools | Medium — aligns with planned v1.5 Guardrails Engine |
| ROS 2 bridge as control-plane tool | Control plane observes and reasons about robot state | Low — promote ros-mcp-server to a control-plane agent tool |

The core insight: OpenClaw proves that a claude -p-style agent runtime (LLM + tools + ReAct loop) can scale to 247K-star adoption when the composition primitives are right — MCP for tools, workspace files for behavior, session primitives for multi-agent, and heartbeat for proactive behavior. Auraison has better infrastructure (lakehouse, GPU orchestration, world models) but worse agent plumbing. The highest-leverage move is adopting OpenClaw's agent runtime patterns on top of Auraison's superior data and compute layers.