Marine World Model for USV Navigation

Date: 2026-05-20
Status: Draft (v1)
Reference application: USV (VRX)
Related: User Plane Design, Digital Twins

Problem

The Virtual RobotX (VRX) USV reference application requires a navigation stack that operates under strong, time-varying environmental disturbances (wind, waves, currents) and thruster non-linearities, a combination that no classical controller alone handles robustly. The existing my_usv_components controllers (inverse-kinematics thruster allocation, station-keeping PID, wayfinding goal tracker) follow the standard SINGABOAT-VRX approach: a hand-tuned PID stack on top of Fossen's 6-DOF rigid-body model. This works for the basic VRX tasks but is fragile under environmental forcing and does not transfer across hulls, sea states, or task families.

The literature gap (surveyed 2026-05-20): no public artifact combines (1) a Fossen-grounded physics prior, (2) a learned world model in the Dreamer / LRWM family, and (3) a neural-operator wave-field surrogate, trained inside a VRX-class simulator. The closest precedents are:

LRWM/SPRL (IEEE 2025) — Latent Robust World Model + State Prediction RL for USV navigation under disturbances. World model only; no neural-operator wave field; not integrated with VRX.
DreamerNav (2025) — DreamerV3 with hybrid global/local planning. Indoor robots only.
PI Neural Operators for Wavefield Reconstruction (arxiv 2508.03315) — Wave-field surrogate. Reconstruction task only; not coupled to a navigation policy.

Auraison already has the Cosmos stack used for TurtleBot Maze in place (see User Plane Design §turtlebot-maze). This document specifies the marine counterpart of that stack's Predict → Transfer → Reason → Execute loop: a Fossen-grounded, Dreamer-class world model with an FNO wave-field surrogate, trained on VRX rollouts using ros.dev.gpu (simulation) and torch.dev.gpu (training and inference).

Goals

Provide a robust USV navigation stack for the canonical VRX tasks: station keeping, wayfinding, perception, channel following, dock-and-deliver
Preserve Fossen 6-DOF dynamics as the physics prior — controllers grounded in marine engineering, not learned from scratch
Learn a latent world model that predicts USV evolution under wind/wave/current disturbances (Dreamer / LRWM class)
Train a neural-operator wave-field surrogate (FNO class) that acts both as fast environment-state input to the policy and as a learned simulator surrogate for off-cluster rollouts
Run end-to-end inside VRX (Gazebo Harmonic + vrx_gz) on ros.dev.gpu, with world model inference on torch.dev.gpu
Persist rollouts and predicted trajectories to the data plane lakehouse for fine-tuning and evaluation

Non-goals (v1)

Real-world deployment on a physical USV — VRX simulation only
Cosmos retraining on marine data — Cosmos stack stays as-is for ground/manipulation
Multi-USV swarming or COLREG-compliant traffic rules — single USV, single task at a time
Wind/wave forecasting beyond simulator-provided ground truth — VRX environmental plugins are the source of truth

Architecture: Predict → Constrain → Reason → Execute

The marine loop differs from the Cosmos turtlebot loop in two material ways:

Physics prior is dominant. Fossen's 6-DOF model (vrx_gz hydrodynamic plugins) gives an analytically grounded simulator. We do not need pixel-space world prediction at the fidelity Cosmos provides for ground robots — we need state-space prediction of $(\eta, \nu)$ under disturbances.
Wave field is a first-class disturbance. Unlike indoor or ground environments, the dominant stochastic forcing comes from the surrounding fluid. A neural operator (FNO) surrogate for the local wave field is the maritime equivalent of a perception model.
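
For reference, the state-space prediction target $(\eta, \nu)$ can be read against Fossen's vectorial model in its usual textbook form. The notation below is the standard one; the split of the environmental forcing into separate wind and wave terms is a common convention assumed here, not something taken from the vrx_gz plugin source.

$$
\dot{\eta} = J(\eta)\,\nu, \qquad
M\dot{\nu} + C(\nu)\,\nu + D(\nu)\,\nu + g(\eta) = \tau + \tau_{\text{wind}} + \tau_{\text{wave}}
$$

Here $\eta = [x, y, z, \phi, \theta, \psi]^\top$ is the pose, $\nu = [u, v, w, p, q, r]^\top$ is the body-frame velocity, $J(\eta)$ is the kinematic transformation, $M$ is the mass matrix including added mass, $C$ and $D$ are the Coriolis and damping terms, $g(\eta)$ collects restoring forces, and $\tau$ is the thruster input. Currents are typically handled by replacing $\nu$ with the relative velocity $\nu_r = \nu - \nu_c$ in the hydrodynamic terms.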

Layered planes (maritime variant)

The AR4 digital twin design introduces layered planes (Layer A — World, B — Control, C — AI Runtime, D — World Model, E — Memory). The USV reference asset reuses this decomposition with maritime-specific components:

Layer	Plane	Cluster	Components
A — World	User plane	`ros.dev.gpu`	Gazebo Harmonic + `vrx_gz` (WAM-V hull, Fossen 6-DOF hydrodynamic plugins, wind/wave/current generators); `vrx_ros` topic bridges
B — Marine Control	User plane	`ros.dev.gpu`	`my_usv_components`: inverse-kinematics thruster allocation, station-keeping PID, wayfinding goal tracker
C — AI Runtime	User plane	`torch.dev.gpu`	RL policy (SPRL-class) via vLLM/Ray Serve + Zenoh queryable; takes latent world-model state, emits high-level setpoints to Layer B
D — World Model	User plane	split	LRWM latent dynamics (`torch.dev.gpu`) · FNO wave-field operator (`torch.dev.gpu`) · feasibility check (`ros.dev.gpu`)
E — Memory	Data plane	—	`twins/` Parquet tables (USV state, sea-state, predicted vs observed trajectories); ROS bag + Rerun `.rrd` recordings in the lakehouse
Orchestration	Control plane	—	TwinAgent (USV asset), NotebookAgent (training jobs), ClusterAgent

Predict → Constrain → Reason → Execute loop

Editable Mermaid source: images/marine-world-model-predict-loop.mermaid.md

Loop steps, mapped to planes:

Step 1 — Perception (User plane, ros.dev.gpu)
  ROS 2 topics: /wamv/sensors/imu, /wamv/sensors/gps, /wamv/sensors/camera,
                /wamv/sensors/lidar, wave probe topic (sea-state proxy)
  Ray worker subscribes via Zenoh bridge; writes raw sensor + ground-truth wave field
  to data plane

Step 2 — Wave-field encoding (User plane, torch.dev.gpu)
  FNO operator: sparse wave probe samples → dense local wave field → latent encoding
  Latent wave state is concatenated with proprioception (eta, nu) as world-model input

Step 3 — Policy proposes high-level action (User plane, torch.dev.gpu)
  SPRL policy outputs setpoint: target pose (station keeping) or next waypoint segment
  (wayfinding). Inference via vLLM + Zenoh queryable.

Step 4 — World model predicts trajectory (User plane, torch.dev.gpu)
  LRWM rolls out N steps in latent space: (eta_t, nu_t, wave_latent_t, action_t)
    -> (eta_{t+1}, nu_{t+1}, wave_latent_{t+1})
  Predicted state snapshots written to data plane (source=predicted)

Step 5 — Feasibility check (User plane, ros.dev.gpu)
  Apply Fossen constraints (thruster limits, capsize check, COLREG zones)
  to predicted trajectory. Reject if predicted state violates safety envelope.

Step 6 — Execute via Fossen-grounded controllers (User plane, ros.dev.gpu)
  Setpoint -> wayfinding / station-keeping node -> inverse kinematics -> /wamv/thrusters/*/cmd
  Observed state_snapshots written to data plane (source=ros_job)

Post-job — Reconciliation (Control plane)
  TwinAgent.sync_twin("usv-wamv-01", job_id):
    Compare predicted vs observed (eta, nu) trajectories
    Flag divergences (used as LRWM training signal)
    Update sea-state regression: predicted vs ground-truth wave field
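
Steps 3 to 5 above can be sketched as a small propose, predict, check routine. This is a minimal sketch, not the planned implementation: `plan_step`, `Setpoint`, the flat `State` vector, and the retry-on-reject behaviour with `max_attempts` are all hypothetical names and assumptions introduced for illustration. The source only says that an infeasible prediction is rejected.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence

State = Sequence[float]  # hypothetical flat (eta, nu, wave_latent) vector


@dataclass(frozen=True)
class Setpoint:
    """Hypothetical high-level setpoint handed to the Layer B controllers."""
    x: float
    y: float
    yaw: float


def plan_step(
    state: State,
    propose: Callable[[State, int], Setpoint],           # Step 3: policy
    predict: Callable[[State, Setpoint], List[State]],   # Step 4: world-model rollout
    feasible: Callable[[List[State]], bool],             # Step 5: feasibility check
    max_attempts: int = 3,                               # assumption: retry on reject
) -> Optional[Setpoint]:
    """Return the first proposed setpoint whose predicted trajectory is feasible."""
    for attempt in range(max_attempts):
        setpoint = propose(state, attempt)
        trajectory = predict(state, setpoint)
        if feasible(trajectory):
            return setpoint  # Step 6 would execute this setpoint
    return None  # every proposal was rejected; the caller decides the fallback
```

Passing the policy, world model, and feasibility check as callables keeps the routine independent of where each one runs (torch.dev.gpu or ros.dev.gpu).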

Cluster mapping

Editable Mermaid source: images/marine-world-model-cluster-mapping.mermaid.md

Why a neural operator for the wave field?

A Fourier Neural Operator (or PINO variant) is a better fit here than a generic CNN or transformer, for three reasons:

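Resolution invariance. FNOs learn their weights in spectral space, so a trained operator can be evaluated on grids of a different resolution than it was trained on. The same operator can therefore serve fine probe spacing near the hull and coarse spacing in the far field, which is useful when the sea state, and with it the wavelengths that must be resolved, changes between VRX tasks.
Physics consistency. Under linear wave theory the surface elevation is a superposition of sinusoidal components, and realistic sea spectra concentrate their energy in a limited frequency band. A spectral parameterization matches the inductive bias of the underlying Navier–Stokes / potential-flow dynamics, so we expect to need less training data than with a generic CNN or transformer.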
Surrogate-mode reuse. Once trained, the FNO can run without VRX as a fast simulator for policy rollouts during RL training — the maritime equivalent of Cosmos-Predict2's video-generation surrogate, but in state space rather than pixel space. This is what makes "off-cluster" RL training tractable on a single GPU.

References: PI Neural Operators for Wavefield Reconstruction, FNO baseline, D-FNO 3D efficiency.
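
The resolution-invariance argument can be illustrated with a single spectral convolution, the core operation of an FNO layer. The sketch below uses NumPy rather than a deep-learning framework, applies one fixed set of complex weights (random stand-ins for learned weights) to the same toy wave sampled at 64 and 128 points, and compares the outputs where the grids coincide. The function names and the toy signal are assumptions for illustration; a full FNO layer adds a pointwise linear path, a nonlinearity, and lifting/projection layers.

```python
import numpy as np


def spectral_conv_1d(u: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Scale the lowest len(weights) Fourier modes of u and drop the rest."""
    n = u.shape[-1]
    k = weights.shape[0]
    assert k <= n // 2, "retained modes must stay below the Nyquist limit"
    # norm="forward" puts the 1/n factor on the forward transform, so the
    # coefficients of a band-limited signal do not depend on the grid size.
    u_hat = np.fft.rfft(u, norm="forward")
    out_hat = np.zeros_like(u_hat)
    out_hat[:k] = u_hat[:k] * weights
    return np.fft.irfft(out_hat, n=n, norm="forward")


def sample_wave(n: int) -> np.ndarray:
    """Toy periodic surface elevation on n evenly spaced points of [0, 1)."""
    x = np.arange(n) / n
    return np.sin(2 * np.pi * x) + 0.5 * np.cos(2 * np.pi * 3 * x)


rng = np.random.default_rng(0)
weights = rng.normal(size=8) + 1j * rng.normal(size=8)  # stand-in for learned weights

coarse = spectral_conv_1d(sample_wave(64), weights)
fine = spectral_conv_1d(sample_wave(128), weights)

# The 128-point grid contains every 64-point grid location at its even indices.
print(np.allclose(coarse, fine[::2]))
```

The agreement holds exactly here because the toy signal is periodic and band-limited below the retained modes; on real probe data the match would only be approximate.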

Why LRWM/SPRL rather than DreamerV3 directly?

DreamerV3 (DreamerNav) is the obvious baseline, and we will start by porting its RSSM (Recurrent State-Space Model) into VRX. But the LRWM paper's key contribution is the double-latent factorization: a vehicle latent ($z^v$) and a disturbance latent ($z^d$). This separation matters for maritime use because:

Vehicle dynamics are Fossen-prior-driven (low-variance, near-deterministic given controls)
Disturbances are stochastic and the dominant source of variance

A single conflated latent (RSSM) bleeds disturbance noise into the vehicle prediction. The factored latent should make the world model both more sample-efficient and easier to interpret when debugging. v1.5 ships an RSSM baseline; v2 swaps in LRWM-style factorization.
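
One way to write the difference between the two designs is shown below. This is an assumed reading of the factorization described above, not the LRWM paper's exact formulation: the vehicle latent is driven by the action $a_t$ and the current disturbance, while the disturbance latent evolves independently of the action.

$$
\text{RSSM:}\quad p(z_{t+1} \mid z_t, a_t)
$$

$$
\text{Factored:}\quad p(z^v_{t+1}, z^d_{t+1} \mid z^v_t, z^d_t, a_t)
= p(z^v_{t+1} \mid z^v_t, z^d_t, a_t)\; p(z^d_{t+1} \mid z^d_t)
$$

Under this reading, the stochasticity of the rollout is concentrated in the second factor, which is what allows the vehicle transition to stay near-deterministic given controls.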

Schema extensions

USV asset registration

TwinAgent.create_twin(
  asset_id="usv-wamv-01",
  asset_type="usv",
  urdf_path="user-plane/usv/urdf/wamv.urdf",
  metadata={
    "hull": "wamv",
    "thruster_config": "twin_lateral",
    "dof": 6,
    "controller": "my_usv_components",
    "fossen_params_version": "vrx_2023",
    "sim_env": "vrx_gz_harmonic",
    "task_set": ["station_keeping", "wayfinding", "perception", "dock"]
  }
)

Extended `twins/state_snapshots` for USV (6-DOF marine)

The mobile-base columns (position_x/y/z + quaternion) already cover USV pose. Marine-specific fields are added as nullable columns:

Column	Type	Description
`linear_velocity`	DOUBLE[3]	Body-frame surge/sway/heave (m/s)
`angular_velocity`	DOUBLE[3]	Body-frame roll/pitch/yaw rate (rad/s)
`thruster_commands`	DOUBLE[]	Per-thruster commanded force (length = thruster count)
`wind_vector`	DOUBLE[3]	World-frame wind at vehicle (m/s) — ground truth
`current_vector`	DOUBLE[3]	World-frame current at vehicle (m/s) — ground truth
`wave_latent`	DOUBLE[]	FNO-encoded local wave-field latent (length = FNO embedding dim)
`sea_state`	INTEGER	Sea-state code 0–9 (Douglas-scale equivalent, binned from wave RMS)
`task_phase`	VARCHAR	e.g. `approach`, `dwell`, `depart` for VRX task scoring

NULL columns let the same state_snapshots table serve TurtleBot, AR4, and USV. No new tables.
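
The table does not give the bin edges for `sea_state`. One plausible reading, assumed here, is the WMO sea-state code (Douglas scale), whose 0 to 9 range is defined on significant wave height, combined with the standard approximation that significant wave height is about four times the RMS surface elevation. The function name and the choice of this scale are assumptions for illustration.

```python
from bisect import bisect_left

# Upper bounds (metres) of significant wave height for WMO sea-state codes 0..8.
# Anything above the last bound maps to code 9.
_HS_UPPER_BOUNDS_M = [0.0, 0.1, 0.5, 1.25, 2.5, 4.0, 6.0, 9.0, 14.0]


def sea_state_from_wave_rms(rms_elevation_m: float) -> int:
    """Bin RMS surface elevation into a 0-9 sea-state code (assumed WMO scale)."""
    if rms_elevation_m < 0:
        raise ValueError("RMS elevation cannot be negative")
    significant_height_m = 4.0 * rms_elevation_m  # Hs ~ 4 * sigma
    # Index of the first upper bound that is >= Hs, i.e. the sea-state code.
    return bisect_left(_HS_UPPER_BOUNDS_M, significant_height_m)
```

For example, an RMS elevation of 0.25 m gives a significant wave height of 1.0 m, which falls in the 0.5 to 1.25 m band (code 3).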

Extended event types

usv.task_started           — VRX task start (payload: {task, scoring_zone})
usv.task_completed         — VRX task complete (payload: {task, score, time_s})
usv.station_held           — station keeping tolerance achieved
usv.waypoint_reached       — wayfinding waypoint hit (payload: {waypoint_index, error_m})
usv.capsize_warning        — roll exceeded threshold
wave.regime_changed        — FNO-detected sea-state class change
fossen.feasibility_reject  — predicted action rejected by Fossen feasibility check
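
A minimal sketch of how the feasibility check could produce a `fossen.feasibility_reject` event is given below. It covers only the thruster-limit and roll-threshold checks; COLREG zones are omitted. The class names, the limit values a caller would supply, and the event payload shape (`step`, `reason`) are hypothetical, since the list above does not define a payload for this event.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Sequence


@dataclass(frozen=True)
class SafetyEnvelope:
    max_thrust: float     # hypothetical per-thruster command magnitude limit
    max_roll_rad: float   # hypothetical roll threshold for the capsize check


@dataclass(frozen=True)
class PredictedStep:
    roll_rad: float
    thruster_commands: Sequence[float]


def check_feasibility(
    trajectory: Iterable[PredictedStep], envelope: SafetyEnvelope
) -> Optional[dict]:
    """Return a reject event for the first violating step, or None if feasible."""
    for step_index, step in enumerate(trajectory):
        if abs(step.roll_rad) > envelope.max_roll_rad:
            reason = "capsize_threshold"
        elif any(abs(c) > envelope.max_thrust for c in step.thruster_commands):
            reason = "thruster_limit"
        else:
            continue  # this step is inside the envelope
        return {
            "type": "fossen.feasibility_reject",
            "payload": {"step": step_index, "reason": reason},  # assumed shape
        }
    return None
```

Reporting the first violating step, instead of a bare boolean, keeps the reject event useful as a debugging and training signal.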

Data plane: rollout dataset

VRX rollouts produce two artifact classes:

Bag recordings (ros2 bag record + Rerun .rrd): full sensor + state + control history, written to landing/usv-vrx/ and indexed by (asset_id, task, sea_state, seed, timestamp).
Aggregated rollout dataset (Parquet, warehouse/usv-vrx/rollouts.parquet): downsampled to 10 Hz state, action, wave-latent, reward — the LRWM training set.
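
The 10 Hz downsampling step could look like the sketch below, which keeps the last sample in each 100 ms bin. The source does not say which resampling rule is used, so "last sample per bin" is an assumption, as are the function name and the use of integer nanosecond timestamps (chosen to avoid floating-point bin edges).

```python
from typing import Iterable, Iterator, Optional, Tuple, TypeVar

T = TypeVar("T")
PERIOD_NS = 100_000_000  # 100 ms bins, i.e. 10 Hz


def downsample_10hz(samples: Iterable[Tuple[int, T]]) -> Iterator[Tuple[int, T]]:
    """Yield the last (stamp_ns, value) of each 100 ms bin.

    Assumes the input is sorted by timestamp.
    """
    current_bin: Optional[int] = None
    last: Optional[Tuple[int, T]] = None
    for stamp_ns, value in samples:
        bin_index = stamp_ns // PERIOD_NS
        if current_bin is not None and bin_index != current_bin:
            yield last  # the previous bin is complete
        current_bin, last = bin_index, (stamp_ns, value)
    if last is not None:
        yield last  # flush the final bin
```

Bins with no samples are simply skipped; a training pipeline that needs a strictly uniform grid would have to interpolate or hold the previous value instead.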

landing/usv-vrx/
  raw/<asset_id>/<task>/<seed>/<timestamp>/
    bag.db3                  ROS 2 bag
    trace.rrd                Rerun recording (R2 public mirror for visualization)
    sea_state.json           wind/wave/current ground truth (VRX environmental plugin dump)
    score.json               VRX task score

warehouse/usv-vrx/
  rollouts.parquet           (state, action, wave_latent, reward, terminal) tuples
  wave_fields.parquet        (probe_samples, dense_field) for FNO training
  predicted_vs_observed/     LRWM evaluation: predicted (eta,nu) vs observed
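
The predicted-vs-observed comparison used in reconciliation can be sketched as a per-step distance check. This is an illustrative sketch only: the function name, the use of a plain Euclidean distance, and the single scalar threshold are assumptions. In practice position and angle components would likely need separate thresholds or scaling, since they have different units.

```python
import math
from typing import List, Sequence


def divergence_steps(
    predicted: Sequence[Sequence[float]],
    observed: Sequence[Sequence[float]],
    threshold: float,
) -> List[int]:
    """Return the step indices where predicted and observed states diverge.

    Each state is a vector of equal length, e.g. the position part of eta.
    """
    if len(predicted) != len(observed):
        raise ValueError("trajectories must have the same number of steps")
    return [
        i
        for i, (p, o) in enumerate(zip(predicted, observed))
        if math.dist(p, o) > threshold
    ]
```

The returned indices are the natural candidates to flag as divergences and to weight more heavily when the world model is retrained.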

Synthetic data generation (SDG)

The marine analogue of the Cosmos SDG pipeline is much simpler: VRX itself is the synthetic data source. The pipeline:

VRX rollouts (ros.dev.gpu)
  Domain randomization: sea state, wind, current, hull mass, drag coefficients
  Per-rollout: collect (state, action, wave_field) tuples
    -> Parquet in warehouse/usv-vrx/
      -> LRWM training jobs on torch.dev.gpu
      -> FNO wave operator training jobs on torch.dev.gpu
      -> SPRL policy training jobs on torch.dev.gpu

This is closer to standard RL data generation than the Cosmos Predict2/Transfer2.5 pipeline used for ground robots. We do not need photorealistic sim2real augmentation in v1 because (a) we are not deploying to a physical USV yet, and (b) Fossen-grounded VRX physics is already accurate enough for sim2sim transfer across hull variants.

Evolution path

v1   — VRX integration on ros.dev.gpu Ray cluster (Gazebo Harmonic + vrx_gz + ROS 2 Jazzy)
       USV registered as third reference asset in twins/ schema
       Baseline: existing my_usv_components controllers (PID station keeping, wayfinding)
       Bag + Rerun recording wired into data-plane lakehouse
       Single GPU rollouts; manual policy tuning

v1.5 — DreamerV3 RSSM world model trained inside VRX on torch.dev.gpu
       FNO wave-field operator v1 (input: sparse probes; output: dense local field)
       SPRL policy on top of RSSM; high-level setpoints fed to my_usv_components
       Predict -> Constrain -> Reason -> Execute loop wired end-to-end
       state_snapshots extended for marine fields

v2   — LRWM double-latent factorization (vehicle + disturbance) replaces RSSM
       PINO variant of wave operator (physics-informed: linear wave equation residual)
       Multi-task curriculum across VRX 2023 task set
       Cross-hull transfer: train on WAM-V, evaluate on alternate hull configs
       Domain randomization expanded to currents, sensor noise, thruster degradation

v3   — Edge deployment evaluation: LRWM + SPRL on Jetson AGX Thor for physical USV trials
       Real-world VRX (sim2real) via Cosmos-Transfer2.5 photorealistic camera augmentation
       Multi-USV coordination + COLREG compliance (deferred — out of scope for this design)

Requirements (UP-xxx)

Extends the user plane requirements in docs/user-plane/design.mdx.

ID	Requirement	Traces to	Version
UP-037	The user plane shall host the USV reference application using VRX (Gazebo Harmonic + vrx_gz) on ros.dev.gpu	UP-007	v1
UP-038	The USV reference application shall preserve Fossen 6-DOF dynamics as the physics prior (no learning of base dynamics)	—	v1
UP-039	The marine control layer shall use my_usv_components (IK thruster allocation, station-keeping PID, wayfinding) as the executable controller	UP-037	v1
UP-040	The USV shall be registered as a third reference asset in twins/ alongside TurtleBot and AR4-MK3	—	v1
UP-041	state_snapshots shall be extended with nullable marine fields: linear_velocity, angular_velocity, thruster_commands, wind_vector, current_vector, wave_latent, sea_state, task_phase	UP-040	v1
UP-042	A DreamerV3 RSSM world model shall be trained inside VRX on torch.dev.gpu	UP-006	v1.5
UP-043	An FNO wave-field operator shall be trained on torch.dev.gpu, mapping sparse wave probes to a dense local wave field	UP-006	v1.5
UP-044	The SPRL policy shall be served via vLLM + Zenoh queryable, consuming (eta, nu, wave_latent) and emitting high-level setpoints	UP-006, UP-013	v1.5
UP-045	The Predict -> Constrain -> Reason -> Execute loop shall be implemented end-to-end for VRX 2023 station keeping and wayfinding tasks	UP-039, UP-042, UP-043, UP-044	v1.5
UP-046	Fossen feasibility check shall reject predicted actions violating thruster limits, capsize threshold, or COLREG zones	UP-045	v1.5
UP-047	VRX rollouts shall be persisted as ROS 2 bag + Rerun .rrd recordings in landing/usv-vrx/	SYS-007	v1
UP-048	Aggregated rollout dataset (state, action, wave_latent, reward) shall be persisted as Parquet in warehouse/usv-vrx/	SYS-007	v1.5
UP-049	LRWM double-latent factorization (vehicle + disturbance) shall replace RSSM	UP-042	v2
UP-050	A physics-informed neural operator (PINO) variant shall enforce a linear-wave-equation residual loss on the wave operator	UP-043	v2

Files to create / modify

user-plane/usv/
  urdf/wamv.urdf                          symlink or vendored from vrx_urdf
  config/usv_controllers.yaml             my_usv_components params (vendored from src/)
  launch/usv_vrx_world.launch.py          VRX world + station-keeping + wayfinding bringup
  launch/usv_rl_rollout.launch.py         policy-driven rollout for SDG

control-plane/backend/
  agents/twin_agent.py                    extend for USV asset_type + marine state schema
  agents/notebook_agent.py                add VRX rollout job template
  agents/lakehouse_agent.py               extend for rollouts.parquet + wave_fields.parquet

control-plane/backend/marine/
  world_model/rssm.py                     v1.5: DreamerV3 RSSM trained on rollouts.parquet
  world_model/lrwm.py                     v2: LRWM double-latent factorization
  wave_operator/fno.py                    v1.5: FNO wave-field surrogate
  wave_operator/pino.py                   v2: PINO variant with physics residual
  policy/sprl.py                          v1.5: SPRL policy on top of world model latent
  feasibility/fossen.py                   Fossen-prior feasibility check (capsize, thrust, COLREG)

data-plane/
  schema/twins/state_snapshots/           add nullable marine columns (alembic-style migration)
  schema/usv_vrx/rollouts/                rollouts.parquet schema
  schema/usv_vrx/wave_fields/             wave_fields.parquet schema

References

Direct prior art

USV Navigation Under Disturbances: World Model Enhanced RL (IEEE, 2025) — LRWM + SPRL
Predictive Obstacle Avoidance for Under-Actuated USV via RL (J. Field Robotics, 2025)
Risk-aware DRL for Mapless USV Navigation (Ocean Engineering, 2025)
DreamerNav (PMC, 2025) — DreamerV3 + RSSM, indoor robots
Aeolus Ocean: COLREG-compliant USV DRL simulator