Marine World Model for USV Navigation
Date: 2026-05-20 · Status: Draft (v1) · Reference application: USV (VRX) · Related: User Plane Design, Digital Twins
Problem
The Virtual RobotX (VRX) USV reference application requires a navigation stack that operates under
strong, time-varying environmental disturbances (wind, waves, currents) and thruster non-linearities
that no classical controller alone handles robustly. The existing my_usv_components controllers
(inverse-kinematics thruster allocation, station-keeping PID, wayfinding goal tracker) follow the
standard SINGABOAT-VRX approach: a hand-tuned PID stack on top of Fossen's 6-DOF rigid-body model.
This works for the basic VRX tasks but is fragile under environmental forcing and does not transfer
across hulls, sea states, or task families.
The literature gap (surveyed 2026-05-20): no public artifact combines (1) a Fossen-grounded physics prior, (2) a learned world model in the Dreamer / LRWM family, and (3) a neural-operator wave-field surrogate, trained inside a VRX-class simulator. The closest precedents are:
- LRWM/SPRL (IEEE 2025) — Latent Robust World Model + State Prediction RL for USV navigation under disturbances. World model only; no neural-operator wave field; not integrated with VRX.
- DreamerNav (2025) — DreamerV3 with hybrid global/local planning. Indoor robots only.
- PI Neural Operators for Wavefield Reconstruction (arxiv 2508.03315) — Wave-field surrogate. Reconstruction task only; not coupled to a navigation policy.
Auraison already has the Cosmos stack in place for TurtleBot Maze
(see User Plane Design §turtlebot-maze).
This document specifies the marine counterpart of the Predict → Transfer → Reason → Execute
loop: a Fossen-grounded, Dreamer-class world model with an FNO wave-field surrogate, trained inside
VRX on ros.dev.gpu + torch.dev.gpu.
Goals
- Provide a robust USV navigation stack for the canonical VRX tasks: station keeping, wayfinding, perception, channel following, dock-and-deliver
- Preserve Fossen 6-DOF dynamics as the physics prior — controllers grounded in marine engineering, not learned from scratch
- Learn a latent world model that predicts USV evolution under wind/wave/current disturbances (Dreamer / LRWM class)
- Train a neural-operator wave-field surrogate (FNO class) that acts both as fast environment-state input to the policy and as a learned simulator surrogate for off-cluster rollouts
- Run end-to-end inside VRX (Gazebo Harmonic + vrx_gz) on ros.dev.gpu, with world model inference on torch.dev.gpu
- Persist rollouts and predicted trajectories to the data plane lakehouse for fine-tuning and evaluation
Non-goals (v1)
- Real-world deployment on a physical USV — VRX simulation only
- Cosmos retraining on marine data — Cosmos stack stays as-is for ground/manipulation
- Multi-USV swarming or COLREG-compliant traffic rules — single USV, single task at a time
- Wind/wave forecasting beyond simulator-provided ground truth — VRX environmental plugins are the source of truth
Architecture: Predict → Constrain → Reason → Execute
The marine loop differs from the Cosmos turtlebot loop in two material ways:
- Physics prior is dominant. Fossen's 6-DOF model (vrx_gz hydrodynamic plugins) gives an analytically grounded simulator. We do not need pixel-space world prediction at the fidelity Cosmos provides for ground robots; we need state-space prediction of the pose and velocity state (η, ν) under disturbances.
- Wave field is a first-class disturbance. Unlike indoor or ground environments, the dominant stochastic forcing comes from the surrounding fluid. A neural operator (FNO) surrogate for the local wave field is the maritime equivalent of a perception model.
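For reference, the state-space form behind the physics prior can be written in the standard notation of Fossen's handbook. This is the textbook form, not a description of how the vrx_gz plugins implement it; in particular, how VRX injects wind, wave, and current forcing is an assumption here.

$$
\dot{\eta} = J(\eta)\,\nu, \qquad
M\dot{\nu} + C(\nu)\,\nu + D(\nu)\,\nu + g(\eta) = \tau + \tau_{\mathrm{wind}} + \tau_{\mathrm{wave}}
$$

Here $\eta = [x, y, z, \phi, \theta, \psi]^\top$ is the world-frame pose, $\nu = [u, v, w, p, q, r]^\top$ is the body-frame velocity, $J(\eta)$ maps body-frame velocity to world-frame rates, $M$ is the inertia matrix (rigid body plus added mass), $C$ collects Coriolis and centripetal terms, $D$ is hydrodynamic damping, $g$ is the restoring force, and $\tau$ is the thruster force vector. Ocean current is commonly handled by replacing $\nu$ with the relative velocity $\nu_r = \nu - \nu_c$ in the hydrodynamic terms. The learned world model then only has to predict how $(\eta, \nu)$ evolves under the disturbance terms on the right-hand side.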
Layered planes (maritime variant)
The AR4 digital twin design introduces layered planes (Layer A — World, B — Control, C — AI Runtime, D — World Model, E — Memory). The USV reference asset reuses this decomposition with maritime-specific components:
| Layer | Plane | Cluster | Components |
|---|---|---|---|
| A — World | User plane | ros.dev.gpu | Gazebo Harmonic + vrx_gz (WAM-V hull, Fossen 6-DOF hydrodynamic plugins, wind/wave/current generators); vrx_ros topic bridges |
| B — Marine Control | User plane | ros.dev.gpu | my_usv_components: inverse-kinematics thruster allocation, station-keeping PID, wayfinding goal tracker |
| C — AI Runtime | User plane | torch.dev.gpu | RL policy (SPRL-class) via vLLM/Ray Serve + Zenoh queryable; takes latent world-model state, emits high-level setpoints to Layer B |
| D — World Model | User plane | split | LRWM latent dynamics (torch.dev.gpu) · FNO wave-field operator (torch.dev.gpu) · feasibility check (ros.dev.gpu) |
| E — Memory | Data plane | — | twins/ Parquet tables (USV state, sea-state, predicted vs observed trajectories); ROS bag + Rerun .rrd recordings in the lakehouse |
| Orchestration | Control plane | — | TwinAgent (USV asset), NotebookAgent (training jobs), ClusterAgent |
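As an illustration of the Layer B thruster allocation step, the sketch below splits a desired surge force and yaw moment across two fixed stern thrusters. It is not the my_usv_components implementation: the function name, the two-thruster differential layout, and the frame convention (x forward, y to port, z up) are assumptions made for the example.

```python
from typing import Tuple


def allocate_differential_thrust(
    surge_force_n: float, yaw_moment_nm: float, thruster_spacing_m: float
) -> Tuple[float, float]:
    """Return (port, starboard) thrust in newtons.

    Assumed layout: two thrusters at y = +/- spacing / 2, both pushing along
    body x, with positive yaw turning the bow to port. Under that assumption:
        surge  = T_port + T_stbd
        moment = (spacing / 2) * (T_stbd - T_port)
    """
    if thruster_spacing_m <= 0.0:
        raise ValueError("thruster_spacing_m must be positive")
    # Invert the 2x2 allocation relation above.
    t_port = 0.5 * surge_force_n - yaw_moment_nm / thruster_spacing_m
    t_stbd = 0.5 * surge_force_n + yaw_moment_nm / thruster_spacing_m
    return t_port, t_stbd


# Example: 100 N forward and 20 N*m of yaw on a 2 m thruster spacing.
port, stbd = allocate_differential_thrust(100.0, 20.0, 2.0)
```

With these inputs the starboard thruster takes the larger share, which turns the bow to port under the assumed convention. Saturation is deliberately left out of the allocation so that a separate feasibility check can reject commands that exceed thruster limits.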
Predict → Constrain → Reason → Execute loop
Editable Mermaid source: images/marine-world-model-predict-loop.mermaid.md
Loop steps, mapped to planes:
Cluster mapping
Editable Mermaid source: images/marine-world-model-cluster-mapping.mermaid.md
Why a neural operator for the wave field?
Three reasons a Fourier Neural Operator (or PINO variant) is the right choice here, as opposed to a generic CNN or transformer:
- Resolution invariance. FNOs operate in spectral space and generalize across discretizations. The same trained operator handles fine probe spacing near the hull and coarse spacing in the far field — useful when VRX sea-state changes between tasks.
- Physics consistency. Wave propagation under linear theory is naturally band-limited. Spectral methods match the inductive bias of the underlying Navier–Stokes / potential-flow dynamics, so we expect to need less training data than with a generic data-driven architecture.
- Surrogate-mode reuse. Once trained, the FNO can run without VRX as a fast simulator for policy rollouts during RL training — the maritime equivalent of Cosmos-Predict2's video-generation surrogate, but in state space rather than pixel space. This is what makes "off-cluster" RL training tractable on a single GPU.
References: PI Neural Operators for Wavefield Reconstruction, FNO baseline, D-FNO 3D efficiency.
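The resolution-invariance argument can be made concrete with a minimal sketch of the spectral part of one FNO layer in 1D, written with NumPy. This is only the Fourier-space multiplication; a full FNO layer also has channel mixing, a pointwise linear path, and a nonlinearity. The function names and the test signal are assumptions for illustration, and the weights are random rather than trained.

```python
import numpy as np


def spectral_conv_1d(x: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Multiply the lowest Fourier modes of x by learned complex weights.

    x: real samples on a uniform periodic grid, shape (n,).
    weights: complex array of shape (modes,), one weight per retained mode.
    """
    n = x.shape[-1]
    x_hat = np.fft.rfft(x)                      # n // 2 + 1 coefficients
    out_hat = np.zeros_like(x_hat)
    k = min(weights.shape[0], x_hat.shape[-1])  # drop modes above the cutoff
    out_hat[:k] = x_hat[:k] * weights[:k]
    return np.fft.irfft(out_hat, n=n)


def sample_wave(n: int) -> np.ndarray:
    """Hypothetical band-limited wave elevation sampled at n probe points."""
    t = np.arange(n) / n
    return np.sin(2 * np.pi * t) + 0.5 * np.cos(2 * np.pi * 3 * t)


rng = np.random.default_rng(0)
weights = rng.normal(size=4) + 1j * rng.normal(size=4)

fine = spectral_conv_1d(sample_wave(128), weights)   # dense probe spacing
coarse = spectral_conv_1d(sample_wave(32), weights)  # sparse probe spacing
```

Because the weights live in mode space rather than on grid points, the same `weights` array applies to both grids, and for a band-limited input the fine output sampled at every fourth point matches the coarse output.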
Why LRWM/SPRL rather than DreamerV3 directly?
DreamerV3 (DreamerNav) is the obvious baseline, and we will start by porting its RSSM (Recurrent State-Space Model) into VRX. But the LRWM paper's key contribution is the double-latent factorization: a vehicle latent and a disturbance latent. This separation matters for maritime use because:
- Vehicle dynamics are Fossen-prior-driven (low-variance, near-deterministic given controls)
- Disturbances are stochastic and the dominant source of variance
A single conflated latent (RSSM) bleeds disturbance noise into the vehicle prediction. The factored latent makes the world model both more sample-efficient and more interpretable for debugging. v1.5 ships an RSSM baseline (UP-042); v2 swaps in LRWM-style factorization (UP-049).
Schema extensions
USV asset registration
Extended twins/state_snapshots for USV (6-DOF marine)
The mobile-base columns (position_x/y/z + quaternion) already cover USV pose. Marine-specific
fields are added as nullable columns:
| Column | Type | Description |
|---|---|---|
| linear_velocity | DOUBLE[3] | Body-frame surge/sway/heave (m/s) |
| angular_velocity | DOUBLE[3] | Body-frame roll/pitch/yaw rate (rad/s) |
| thruster_commands | DOUBLE[] | Per-thruster commanded force (length = thruster count) |
| wind_vector | DOUBLE[3] | World-frame wind at vehicle (m/s), ground truth |
| current_vector | DOUBLE[3] | World-frame current at vehicle (m/s), ground truth |
| wave_latent | DOUBLE[] | FNO-encoded local wave-field latent (length = FNO embedding dim) |
| sea_state | INTEGER | Sea-state code, integer 0–9 (Douglas-style, binned from wave RMS) |
| task_phase | VARCHAR | e.g. approach, dwell, depart for VRX task scoring |
NULL columns let the same state_snapshots table serve TurtleBot, AR4, and USV. No new tables.
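The nullable-column idea can be sketched as a row type whose marine fields default to None, together with one possible binning of wave RMS into the sea_state integer. The class name, the asset ids, the quaternion order, and the binning are all assumptions: the source does not give thresholds, so the example uses the WMO/Douglas significant-wave-height bands and the common approximation that significant wave height is about four times the RMS elevation.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import List, Optional

# Assumed bands: upper bounds (metres) of significant wave height for codes
# 0..8, taken from the WMO/Douglas sea-state table; anything higher maps to 9.
HS_UPPER_BOUNDS_M = [0.0, 0.1, 0.5, 1.25, 2.5, 4.0, 6.0, 9.0, 14.0]


def sea_state_from_rms(wave_rms_m: float) -> int:
    """Bin wave RMS elevation into a 0-9 sea-state code (assumed mapping)."""
    if wave_rms_m < 0.0:
        raise ValueError("wave_rms_m must be non-negative")
    significant_height_m = 4.0 * wave_rms_m  # Hs ~ 4 * RMS elevation
    return bisect_left(HS_UPPER_BOUNDS_M, significant_height_m)


@dataclass
class StateSnapshot:
    """One state_snapshots row; marine fields stay None for non-marine assets."""
    asset_id: str
    position: List[float]                 # position_x/y/z
    orientation: List[float]              # quaternion, assumed x, y, z, w
    linear_velocity: Optional[List[float]] = None
    angular_velocity: Optional[List[float]] = None
    thruster_commands: Optional[List[float]] = None
    wind_vector: Optional[List[float]] = None
    current_vector: Optional[List[float]] = None
    wave_latent: Optional[List[float]] = None
    sea_state: Optional[int] = None
    task_phase: Optional[str] = None


# Hypothetical asset ids, for illustration only.
turtlebot_row = StateSnapshot("turtlebot-maze", [1.0, 2.0, 0.0], [0.0, 0.0, 0.0, 1.0])
usv_row = StateSnapshot(
    "usv-vrx", [10.0, -4.0, 0.0], [0.0, 0.0, 0.0, 1.0],
    linear_velocity=[1.2, 0.1, 0.0],
    sea_state=sea_state_from_rms(0.25),
    task_phase="dwell",
)
```

Both rows share one type, so existing TurtleBot and AR4 writers need no changes; only the USV writer populates the marine fields.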
Extended event types
Data plane: rollout dataset
VRX rollouts produce two artifact classes:
- Bag recordings (ros2 bag record + Rerun .rrd): full sensor + state + control history, written to landing/usv-vrx/ and indexed by (asset_id, task, sea_state, seed, timestamp).
- Aggregated rollout dataset (Parquet, warehouse/usv-vrx/rollouts.parquet): downsampled to 10 Hz state, action, wave-latent, reward; this is the LRWM training set.
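The 10 Hz downsampling step for the aggregated dataset could look like the sketch below, which keeps the most recent sample in each 100 ms bin. The function name and row layout are assumptions, and reading the bag and writing Parquet are left out. Timestamps are integer nanoseconds, as in ROS 2 message stamps, to avoid floating-point bin boundaries.

```python
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, object]          # e.g. state, action, wave_latent, reward
PERIOD_NS = 100_000_000          # 10 Hz


def downsample_latest(
    samples: Iterable[Tuple[int, Row]], period_ns: int = PERIOD_NS
) -> List[Tuple[int, Row]]:
    """Keep the last sample seen in each time bin.

    Assumes samples arrive in time order, so "last seen" means "latest".
    """
    latest: Dict[int, Tuple[int, Row]] = {}
    for stamp_ns, row in samples:
        latest[stamp_ns // period_ns] = (stamp_ns, row)
    return [latest[bin_index] for bin_index in sorted(latest)]


# One second of hypothetical 100 Hz samples reduces to ten rows.
raw = [(i * 10_000_000, {"reward": float(i)}) for i in range(100)]
rows = downsample_latest(raw)
```

Keeping the latest sample rather than averaging preserves the exact state and action pairs that the simulator produced, at the cost of discarding intermediate samples.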
Synthetic data generation (SDG)
The marine analogue of the Cosmos SDG pipeline is much simpler: VRX itself is the synthetic data source. The pipeline:
This is closer to standard RL data generation than the Cosmos Predict2/Transfer2.5 pipeline used for ground robots. We do not need photorealistic sim2real augmentation in v1 because (a) we are not deploying to a physical USV yet, and (b) Fossen-grounded VRX physics is already accurate enough for sim2sim transfer across hull variants.
Evolution path
Requirements (UP-xxx)
Extends the user plane requirements in docs/user-plane/design.mdx.
| ID | Requirement | Traces to | Version |
|---|---|---|---|
| UP-037 | The user plane shall host the USV reference application using VRX (Gazebo Harmonic + vrx_gz) on ros.dev.gpu | UP-007 | v1 |
| UP-038 | The USV reference application shall preserve Fossen 6-DOF dynamics as the physics prior (no learning of base dynamics) | — | v1 |
| UP-039 | The marine control layer shall use my_usv_components (IK thruster allocation, station-keeping PID, wayfinding) as the executable controller | UP-037 | v1 |
| UP-040 | The USV shall be registered as a third reference asset in twins/ alongside TurtleBot and AR4-MK3 | — | v1 |
| UP-041 | state_snapshots shall be extended with nullable marine fields: linear_velocity, angular_velocity, thruster_commands, wind_vector, current_vector, wave_latent, sea_state, task_phase | UP-040 | v1 |
| UP-042 | A DreamerV3 RSSM world model shall be trained inside VRX on torch.dev.gpu | UP-006 | v1.5 |
| UP-043 | An FNO wave-field operator shall be trained on torch.dev.gpu, mapping sparse wave probes to a dense local wave field | UP-006 | v1.5 |
| UP-044 | The SPRL policy shall be served via vLLM + Zenoh queryable, consuming (eta, nu, wave_latent) and emitting high-level setpoints | UP-006, UP-013 | v1.5 |
| UP-045 | The Predict → Constrain → Reason → Execute loop shall be implemented end-to-end for VRX 2023 station keeping and wayfinding tasks | UP-039, UP-042, UP-043, UP-044 | v1.5 |
| UP-046 | Fossen feasibility check shall reject predicted actions violating thruster limits, capsize threshold, or COLREG zones | UP-045 | v1.5 |
| UP-047 | VRX rollouts shall be persisted as ROS 2 bag + Rerun .rrd recordings in landing/usv-vrx/ | SYS-007 | v1 |
| UP-048 | Aggregated rollout dataset (state, action, wave_latent, reward) shall be persisted as Parquet in warehouse/usv-vrx/ | SYS-007 | v1.5 |
| UP-049 | LRWM double-latent factorization (vehicle + disturbance) shall replace RSSM | UP-042 | v2 |
| UP-050 | A physics-informed neural operator (PINO) variant shall enforce a linear-wave-equation residual loss on the wave operator | UP-043 | v2 |
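A minimal sketch of the feasibility check in UP-046 is shown below, covering thruster limits and a capsize threshold expressed as a maximum roll angle. The limit values, names, and the choice of roll angle as the capsize proxy are hypothetical; the COLREG zone test is omitted because the source does not define the zone geometry.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class FeasibilityLimits:
    max_thrust_n: float = 250.0                 # hypothetical per-thruster limit
    max_roll_rad: float = math.radians(30.0)    # hypothetical capsize proxy


def check_feasible(
    thrusts_n: Sequence[float],
    predicted_roll_rad: Sequence[float],
    limits: FeasibilityLimits = FeasibilityLimits(),
) -> Tuple[bool, List[str]]:
    """Return (ok, reasons) for one candidate action and its predicted rollout."""
    reasons: List[str] = []
    if any(abs(t) > limits.max_thrust_n for t in thrusts_n):
        reasons.append("thruster_limit")
    if any(abs(r) > limits.max_roll_rad for r in predicted_roll_rad):
        reasons.append("capsize_threshold")
    return (not reasons, reasons)


ok, reasons = check_feasible([120.0, -80.0], [0.05, 0.10, 0.08])
rejected, why = check_feasible([300.0, 0.0], [0.05, 0.70])
```

Returning the list of violated constraints, not just a boolean, lets a rejected action be logged with its cause for later evaluation.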
Files to create / modify
References
Direct prior art
- USV Navigation Under Disturbances: World Model Enhanced RL (IEEE, 2025) — LRWM + SPRL
- Predictive Obstacle Avoidance for Under-Actuated USV via RL (J. Field Robotics, 2025)
- Risk-aware DRL for Mapless USV Navigation (Ocean Engineering, 2025)
- DreamerNav (PMC, 2025) — DreamerV3 + RSSM, indoor robots
- Aeolus Ocean: COLREG-compliant USV DRL simulator
Marine hydrodynamics foundation
- Fossen — Handbook of Marine Craft Hydrodynamics and Motion Control
- Fossen's Marine Craft Model
- Physics-Informed NN for Vessel Trajectory Prediction (2025)
- Data-Driven 6-DoF Ship Motion ID with Neural Networks
- Robust Adaptive Dynamic Positioning via RL for Unknown Hydrodynamics (JMSE, 2025)
- Feature-Decoupled Gated DRL for Path-Following of Large-Inertia Vessels (Wiley)
Neural-operator wave-field surrogate
- Bridging Ocean Wave Physics and DL: PI Neural Operators for Wavefield Reconstruction (2025)
- Neural Operator reference (Zongyi Li)
- PINNs and Neural Operators for Parametric PDEs (2025)
- DPNO: Dual Path Neural Operator (2025)
- Data-Free Neural Operator for Navier–Stokes (2025)