
AR4-MK3 Digital Twin — Design Document

Date: 2026-03-02
Status: Approved (v1)
Epic: auraison-5z3 (Digital Twins)
Related: auraison-eh1 (Cosmos-Reason2), auraison-oys (Cosmos-Predict2), auraison-i6l (Cosmos-Transfer2.5), auraison-2a5 (Pydantic AI decoupling)


Problem

The AR4-MK3 is a 6-DOF open-source robotic arm (Annin Robotics) with a Teensy 4.1 controller, optional auxiliary Arduino boards, multiple end-effector variants (pneumatic/servo grippers), and an optional 7th axis. It is an ideal second reference asset for the Auraison digital twin framework alongside TurtleBot.

Two external architecture proposals were evaluated:

  1. A C4-style monolithic twin platform (ChatGPT) — vertically integrated, standalone backend
  2. An Open Physical-AI Stack — layered, decoupled, NVIDIA-as-plugin philosophy

Neither maps cleanly to the Auraison four-plane architecture. This design document:

  • Critiques both proposals
  • Maps the AR4 digital twin to the Auraison planes
  • Introduces a layered decomposition within each plane — reconciling the Open Stack's layer model with our plane separation
  • Extends the existing twins/ schema for industrial arm concerns

Critique of external proposals

ChatGPT C4 design — structural problems

The C4 design proposes a monolithic "Digital Twin Backend" containing State Sync, Model Services, Programs, Calibration, and a "Simulation Runtime". When mapped to Auraison:

  1. Conflates plane concerns. State sync is user-plane (real-time, Ray worker writes), model services are data-plane (persistent schema), orchestration is control-plane (TwinAgent). The monolith puts components with fundamentally different latency and consistency requirements in one container.

  2. No persistent world model. The "Event Log" is mentioned but not designed. Our lakehouse twins accumulate state across jobs. The C4 design treats telemetry as logging, not memory.

  3. No learned world model. The "Shadow Executor" replays programs deterministically. Our Cosmos stack (Predict2 → Transfer2.5 → Reason2) generates visual predictions from learned models — fundamentally different from program replay.

  4. No agent architecture. Who orchestrates the twin lifecycle? No equivalent of TwinAgent. Operations are implicit.

  5. Missing data plane entirely. No lakehouse, no persistent schema, no query layer.

  6. "Twin UI" as separate container doesn't scale. Our Next.js dashboard is a control-plane surface that renders any asset type.

What it gets right: variant handling (gripper types, extra axis) as a first-class concern; version coupling (firmware/software/sketch); ROS 2 as a first-class integration path.

Open Physical-AI Stack — better foundation, incomplete mapping

The Open Stack's core principle — "simulation and control must be independent from intelligence" — is correct and maps to our plane separation. Its five layers (World, Control, AI Runtime, World Model, Memory) are a useful decomposition.

Gaps when mapped to Auraison:

  1. No control plane. The Open Stack has no orchestration layer. Who dispatches VLA inference jobs? Who manages the twin lifecycle? Who handles experiment tracking?

  2. No management plane. No billing, no quotas, no access control.

  3. "Memory" is underspecified. "Store trajectories, failures, sensor traces" is correct but needs a concrete schema (our twins/ Parquet tables).

  4. Policy Server placement ambiguous. The Open Stack shows it as a peer of VLA and World Model, but doesn't specify compute placement. In Auraison, the Policy Server runs on torch.dev.gpu as a Ray Serve endpoint — user plane, not control plane.

  5. MoveIt2 as safety layer is correctly identified but not placed in any plane. It belongs in the user plane alongside ros2_control.


Architecture: layered planes

The key insight from this design: each plane contains multiple layers. The Open Stack's layers map into the planes as rows. The planes remain the primary separation (different latency, consistency, failure domains). The layers provide internal structure within each plane.

Diagram 1: Planes (columns) × Layers (rows)

Diagram 2: Layers mapped to KubeRay clusters

Layer mapping table

Open Stack Layer        | Auraison Plane   | Cluster       | Components
A — World               | User plane       | ros.dev.gpu   | Gazebo Harmonic, AR4 Teensy serial bridge, /joint_states, camera topics
B — Control             | User plane       | ros.dev.gpu   | ros2_control (trajectory, PID), MoveIt2 (IK, collision, constraints)
C — AI Runtime          | User plane       | torch.dev.gpu | Policy Server (Ray Serve), VLA model (OpenVLA / GR00T), swappable
D — World Model         | User plane       | split         | Cosmos-Predict2 + Transfer2.5 (torch.dev.gpu), Cosmos-Reason2 (ros.dev.gpu)
E — Memory              | Data plane       | —             | twins/ Parquet tables, DuckDB, MinIO
(none) — Orchestration  | Control plane    | —             | TwinAgent, PolicyAgent, FastAPI, AgentOps
(none) — Governance     | Management plane | —             | Billing, quotas, observability (v2)

Key principle preserved: NVIDIA Cosmos models are plugins in the user plane, not infrastructure. The Policy Server abstraction means VLA backends are swappable (OpenVLA → GR00T → custom) without touching ROS 2 or the control plane.
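The swappable-backend claim can be made concrete with a small sketch. Nothing below is the shipped Policy Server; the `VLABackend` protocol, class names, and the action dict shape are illustrative assumptions.

```python
from typing import Any, Protocol

class VLABackend(Protocol):
    """Hypothetical interface every VLA backend implements."""
    def predict_action(self, obs: dict[str, Any]) -> dict[str, Any]: ...

class OpenVLABackend:
    def predict_action(self, obs: dict[str, Any]) -> dict[str, Any]:
        # Placeholder: a real backend would run model inference here.
        return {"delta_xyz": [0.0, 0.0, 0.03], "gripper": "close"}

class PolicyServer:
    """Sketch of the Ray Serve endpoint's core: the backend is injected,
    so swapping OpenVLA for GR00T touches configuration, not ROS 2."""
    def __init__(self, backend: VLABackend):
        self.backend = backend

    def infer(self, obs: dict[str, Any]) -> dict[str, Any]:
        return self.backend.predict_action(obs)

server = PolicyServer(OpenVLABackend())
action = server.infer({"image": None, "joints": [0.0] * 6, "task": "pick up the cube"})
```

Swapping backends is then a one-line change at construction time, which is the whole point of keeping Cosmos and VLA models as plugins rather than infrastructure.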


AR4 as second reference asset

The AR4-MK3 is registered in the existing twins/assets table alongside TurtleBot. It does not need a new architecture — it needs AR4-specific capability metadata and schema extensions.

Asset registration

TwinAgent.create_twin(
    asset_id="ar4-mk3-01",
    asset_type="robot",
    urdf_path="user-plane/ar4/urdf/ar4_mk3.urdf",
    metadata={
        "manufacturer": "Annin Robotics",
        "model": "AR4-MK3",
        "dof": 6,
        "controller": "teensy_4.1",
        "gripper_type": "servo",  # or "pneumatic"
        "extra_axis": False,
        "firmware_version": "4.2.0",
        "software_version": "6.3",
        "aux_sketch_version": None,
    },
)

Capability model

The ChatGPT design correctly identifies variant handling as critical. The AR4's variants (gripper type, extra axis) are encoded in twins/assets.metadata as a capability model:

{
  "capabilities": {
    "gripper":    {"type": "servo", "io_pins": [12, 13], "state_machine": "open_close"},
    "extra_axis": {"enabled": false, "range_deg": null, "steps_per_deg": null},
    "controller": {"type": "teensy_4.1", "protocol": "serial", "baud": 115200},
    "aux_board":  {"type": null, "sketch_version": null}
  }
}

This is configuration-driven behavior, not code branching. The TwinAgent reads capabilities to determine which sensors to expect, which state machine governs the gripper, and whether a 7th axis exists.
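A hedged sketch of that capability-driven dispatch: the helper names below are illustrative assumptions, not the shipped TwinAgent API, but the dict mirrors the capability model above.

```python
# Capability model as stored in twins/assets.metadata (from the JSON above).
capabilities = {
    "gripper": {"type": "servo", "io_pins": [12, 13], "state_machine": "open_close"},
    "extra_axis": {"enabled": False, "range_deg": None, "steps_per_deg": None},
    "controller": {"type": "teensy_4.1", "protocol": "serial", "baud": 115200},
    "aux_board": {"type": None, "sketch_version": None},
}

def expected_joint_count(base_dof: int, caps: dict) -> int:
    # An enabled 7th axis adds one entry to the expected /joint_states vector.
    return base_dof + (1 if caps["extra_axis"]["enabled"] else 0)

def expects_aux_sensors(caps: dict) -> bool:
    # Only poll auxiliary-board topics when an aux board is configured.
    return caps["aux_board"]["type"] is not None
```

Behavior follows the data, so adding a new AR4 variant means registering new metadata, not forking code paths.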


Schema extensions

New table: twins/firmware_versions

Tracks firmware/software/sketch version history per asset. Current version lives in assets.metadata; this table provides the audit trail.

Column              | Type       | Description
version_id          | VARCHAR PK | UUID
asset_id            | VARCHAR    | FK → assets
firmware_version    | VARCHAR    | Teensy sketch version
software_version    | VARCHAR    | AR4 desktop control software version
aux_sketch_version  | VARCHAR    | NULL if no aux board
ros2_driver_version | VARCHAR    | NULL if not using ROS 2
validated           | BOOLEAN    | True if versions are known-compatible
recorded_at         | TIMESTAMP  | When the version set was recorded
recorded_by         | VARCHAR    | Agent or operator

Extended twins/state_snapshots for 6-DOF arm

The existing state_snapshots schema uses position_x/y/z + quaternion for mobile robots. For a 6-DOF arm, we additionally need the joint vector:

Column            | Type     | Description
joint_positions   | DOUBLE[] | Array of joint angles (radians), length = DOF
joint_velocities  | DOUBLE[] | Array of joint velocities (rad/s)
joint_torques     | DOUBLE[] | Array of estimated torques (Nm), NULL if not available
gripper_state     | VARCHAR  | One of: open, closed, moving, unknown
gripper_position  | DOUBLE   | 0.0 (closed) to 1.0 (open) for servo; NULL for pneumatic
end_effector_pose | JSON     | {x, y, z, qx, qy, qz, qw} in world frame (FK-derived)
moveit_plan_id    | VARCHAR  | MoveIt2 trajectory ID that produced this motion; NULL if manual

These columns are added to the existing table. For TurtleBot (mobile base), joint_positions is NULL. For AR4 (arm), position_x/y/z is NULL (the base doesn't move). The schema accommodates both via nullable columns — no separate tables needed.
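Two illustrative rows make the nullable-column scheme concrete. The field names follow the columns above; treating dicts as table rows is just for the sketch.

```python
# TurtleBot (mobile base): pose columns set, joint columns NULL.
turtlebot_snapshot = {
    "asset_id": "turtlebot-01",
    "position_x": 1.2, "position_y": 0.4, "position_z": 0.0,
    "joint_positions": None,
    "gripper_state": None,
}

# AR4 (fixed-base arm): joint vector set, base-pose columns NULL.
ar4_snapshot = {
    "asset_id": "ar4-mk3-01",
    "position_x": None, "position_y": None, "position_z": None,
    "joint_positions": [0.0, -0.5, 1.1, 0.0, 0.7, 0.0],  # 6-DOF, radians
    "gripper_state": "closed",
}

def snapshot_kind(row: dict) -> str:
    # One schema, two asset shapes: discriminate on which columns are populated.
    return "arm" if row["joint_positions"] is not None else "mobile_base"
```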

Extended twins/events event types

AR4-specific event types:

arm.homed               — startup homing procedure completed
arm.calibrated          — calibration offsets recorded
arm.estop               — emergency stop triggered
arm.limit_reached       — joint limit hit (payload: {joint, limit_type, value})
gripper.opened          — gripper opened
gripper.closed          — gripper closed
program.loaded          — motion program loaded (payload: {program_id, version})
program.executed        — motion program execution completed
program.diverged        — predicted vs actual trajectory divergence flagged
firmware.updated        — firmware version changed
moveit.plan_generated   — MoveIt2 generated a trajectory plan
moveit.collision_check  — collision check result (payload: {passed, obstacles})
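As a sketch, an `arm.limit_reached` event row might be assembled like this. The payload shape follows the list above; the other field names (`recorded_at`, the serialized-payload convention) are assumptions about the twins/events schema.

```python
import json
from datetime import datetime, timezone

def make_event(asset_id: str, event_type: str, payload: dict) -> dict:
    # Payload is serialized so heterogeneous event types share one column.
    return {
        "asset_id": asset_id,
        "event_type": event_type,
        "payload": json.dumps(payload),
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }

event = make_event(
    "ar4-mk3-01",
    "arm.limit_reached",
    {"joint": "J3", "limit_type": "soft_max", "value": 2.97},  # radians, illustrative
)
```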

Runtime reasoning loop (AR4 on Auraison)

The Open Stack's 6-step loop mapped to Auraison planes:

Step 1 — Perception (User plane, ros.dev.gpu)
ROS 2 topics: /joint_states, /camera/rgb, /camera/depth
Ray worker on ros.dev.gpu subscribes via Zenoh bridge
In-job writes: state_snapshots + sensor_readings → MinIO (data plane)

Step 2 — Observation formatting (User plane, torch.dev.gpu)
Policy Server (Ray Serve) receives observation:
obs = {image: rgb, joints: q, task: instruction}

Step 3 — VLA proposes action (User plane, torch.dev.gpu)
VLA model (OpenVLA / GR00T) outputs: "move end-effector +3cm, close gripper"

Step 4 — World model evaluates (User plane, split)
Cosmos-Predict2 (torch.dev.gpu): current frame + action → predicted trajectory video
Cosmos-Transfer2.5 (torch.dev.gpu): synthetic → photorealistic
Cosmos-Reason2 (ros.dev.gpu): feasibility evaluation
→ Predicted snapshots written to data plane (source=predicted)

Step 5 — MoveIt validates (User plane, ros.dev.gpu)
MoveIt2: collision check, IK, trajectory generation
→ moveit.plan_generated event written to data plane

Step 6 — Execute (User plane, ros.dev.gpu)
ros2_control: trajectory execution via Teensy serial bridge
→ Observed state_snapshots written to data plane (source=ros_job)

Post-job — Reconciliation (Control plane)
TwinAgent.sync_twin("ar4-mk3-01", job_id):
Compare predicted vs observed snapshots
Flag divergences as program.diverged events
Update firmware_versions if changed
Set reconciled=True on validated snapshots
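The predicted-vs-observed comparison at the heart of reconciliation can be sketched as follows. The divergence metric (max per-joint angular error) and the threshold are illustrative assumptions, not the shipped TwinAgent logic.

```python
def joint_divergence(predicted: list[float], observed: list[float]) -> float:
    # Worst-case per-joint angular error between two joint vectors (radians).
    return max(abs(p - o) for p, o in zip(predicted, observed))

def flag_divergences(pred_snaps: list[list[float]],
                     obs_snaps: list[list[float]],
                     threshold_rad: float = 0.05) -> list[dict]:
    # Compare snapshots step by step; emit a program.diverged event per breach.
    events = []
    for step, (pred, obs) in enumerate(zip(pred_snaps, obs_snaps)):
        d = joint_divergence(pred, obs)
        if d > threshold_rad:
            events.append({"event_type": "program.diverged",
                           "payload": {"step": step, "max_error_rad": round(d, 4)}})
    return events

pred = [[0.0, 0.1, 0.2], [0.0, 0.2, 0.4]]
obs = [[0.0, 0.1, 0.2], [0.0, 0.2, 0.55]]
events = flag_divergences(pred, obs)
```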

Data flow diagram


AR4-specific concerns

Teensy serial bridge

The AR4's Teensy 4.1 communicates via serial USB. In our architecture, this is a ROS 2 hardware interface plugin in ros2_control — same pattern as any ROS 2 robot. The Teensy bridge runs on ros.dev.gpu as part of the ROS 2 stack, not as a separate container.

MoveIt2 as safety layer

The Open Stack correctly identifies MoveIt2 as the critical safety layer between VLA intent and physical execution. VLA outputs high-level actions ("move gripper here"); MoveIt2 translates these into safe trajectories with collision checking and joint limit enforcement. This is Layer B in the user plane — it never leaves ros.dev.gpu.

Version coupling validation

On every job start, the TwinAgent validates that the firmware/software/sketch versions recorded in twins/assets.metadata match the versions reported by the Teensy controller. Mismatches are flagged as firmware.updated events and require re-validation before the job proceeds.
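A minimal sketch of that pre-job check, assuming the metadata field names from the asset registration above; the `reported` dict stands in for whatever the Teensy controller answers over serial, and the function name is hypothetical.

```python
def validate_versions(metadata: dict, reported: dict) -> list[str]:
    """Return the mismatched version fields (empty list = validated)."""
    mismatches = []
    for field in ("firmware_version", "software_version", "aux_sketch_version"):
        if metadata.get(field) != reported.get(field):
            mismatches.append(field)
    return mismatches

metadata = {"firmware_version": "4.2.0", "software_version": "6.3",
            "aux_sketch_version": None}
reported = {"firmware_version": "4.2.1", "software_version": "6.3",
            "aux_sketch_version": None}

mismatches = validate_versions(metadata, reported)
# Any non-empty result would be flagged as a firmware.updated event and
# block the job until the new version set is re-validated.
```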

Gripper state machine

Pneumatic and servo grippers have different state machines:

  • Pneumatic: binary (open/closed), controlled by digital IO pins
  • Servo: continuous (0.0–1.0 position), controlled by PWM

The capabilities.gripper.state_machine field in assets.metadata determines which state machine governs gripper_state and gripper_position in state_snapshots.
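A sketch of the two state machines; the servo open/close thresholds are illustrative assumptions, and the function names are not the shipped API.

```python
def servo_gripper_state(position: float) -> dict:
    # Servo: continuous 0.0 (closed) .. 1.0 (open), driven by PWM.
    # Thresholds for "settled" states are illustrative.
    if position >= 0.95:
        state = "open"
    elif position <= 0.05:
        state = "closed"
    else:
        state = "moving"
    return {"gripper_state": state, "gripper_position": position}

def pneumatic_gripper_state(io_high: bool) -> dict:
    # Pneumatic: binary digital IO; gripper_position stays NULL.
    return {"gripper_state": "open" if io_high else "closed",
            "gripper_position": None}
```

Either function yields the `gripper_state` / `gripper_position` pair written to state_snapshots, so downstream consumers never branch on gripper type.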


Evolution path

v1   — AR4 registered as second reference asset; URDF in Gazebo; in-job writes + post-job
       reconciliation; firmware_versions table; variant handling via capabilities metadata
v1.5 — Policy Server on torch.dev.gpu (Ray Serve); VLA inference (OpenVLA); MoveIt2 safety
       layer; Cosmos Predict → Transfer → Reason → Execute loop for AR4;
       Redis hot-cache for real-time joint state (6-DOF at 100 Hz+)
v2   — GR00T as VLA backend (swappable via Policy Server abstraction);
       Cosmos post-trained on AR4 manipulation datasets;
       MoveIt2 collision checks feed Reason2 feasibility scoring;
       program repository: versioned motion programs with provenance;
       Pydantic AI TwinAgent + PolicyAgent (control plane migration)

Files to create / modify

user-plane/ar4/
  urdf/ar4_mk3.urdf                  AR4-MK3 URDF model
  config/ar4_controllers.yaml        ros2_control configuration
  config/ar4_moveit.yaml             MoveIt2 configuration

control-plane/backend/
  agents/twin_agent.py               Extend for firmware_versions + capabilities
  agents/policy_agent.py             New: dispatches VLA inference jobs to Policy Server
  api/twins.py                       Extend for /predict endpoint + firmware validation
  models/twin.py                     Extend for AR4 capability model

data-plane/
  schema/twins/firmware_versions/    Schema definition for new table