Object Storage Migration: MinIO → RustFS on TrueNAS SCALE
Date: 2026-06-05 Status: Runbook Applies to: TrueNAS SCALE 25.04.2 (Fangtooth)
Why this migration
The MinIO Docker image is now effectively unmaintained (and AIStor is MinIO's rebrand — same on-disk format), so the lakehouse object-storage layer moves to a two-tier S3 design:
- Local, on-prem (authoritative working set) on TrueNAS ZFS.
- Cloudflare R2 — cloud (exposure tier only). Holds only buckets that must reach the rest of the Cloudflare infrastructure: public/shared artifacts (e.g. Rerun recordings) and buckets backing aegean.ai's websites. Anything that does not need public/Cloudflare/website exposure stays local and is not mirrored.
- Sync: rclone replicates the R2-destined buckets between local and R2 on a schedule (R2 has no native inbound replication; consistency is eventual).
This is the platform standard tracked in Jira AURA-667.
Chosen engine: RustFS (in-place binary replacement)
The local engine is RustFS — an open-source, S3-compatible, Rust object store positioned as a MinIO drop-in. We use its binary-replacement path: RustFS reads MinIO's existing data directory in place, with no data copy.
Why in-place works here (and why a copy would otherwise be needed)
MinIO does not store one object as one file. For every object it writes a
directory bucket/objectname/ containing an xl.meta file (metadata + inlined small
data) and, for larger objects, erasure-coded part.N files, plus a .minio.sys
metadata tree. A generic POSIX-backed S3 gateway (e.g. versitygw — see the appendix)
expects the opposite layout (bucket = directory, object = a single file), so it
cannot read MinIO's data directly and requires an S3-API copy to transcode every
object.
RustFS is different: it implements MinIO's on-disk format well enough to serve the existing data directory in place. Stop MinIO, point RustFS at the same path, start RustFS — bucket metadata, versioning, object-locks, IAM, and lifecycle rules carry over, with downtime typically under five minutes.
Trade-offs you are accepting
- Pre-GA software. RustFS is in beta (latest 1.0.0-beta.6, May 2026; Apache-2.0; GA targeted ~July 2026). The pre-migration ZFS snapshot (below) is what makes running pre-GA code on authoritative data acceptable.
- Opaque on disk. Objects remain in MinIO-style erasure layout, so a ZFS snapshot is a disaster-recovery backup but not browsable individual files (same as MinIO today; unlike the versitygw POSIX alternative).
- Single-node, single/multi-drive only — not distributed clusters.
- Not migrated: event notifications, site replication, LDAP/OIDC.
Source configuration (reference)
Values read from the existing minio app:
| Setting | Value |
|---|---|
| Pool | marathon |
| MinIO data (host path) | /mnt/marathon/minio-dataset (mounted at /export) |
| MinIO API port | 9000 (console 9002) |
| Access key | IJDY6D4LMF3Z34F1BY7S |
| Secret key | (keep out of source control — rotate after migration) |
| MinIO data owner (UID:GID) | 473:473 |
Mount the host path, not MinIO's /export
In the MinIO app, Mount Path /export is just the in-container mount point;
Host Path /mnt/marathon/minio-dataset is the real on-disk location holding
.minio.sys + the bucket dirs. RustFS must mount the host path. Sanity-check it
before launch — ls -la /mnt/marathon/minio-dataset should show .minio.sys and your
buckets at the top level; if they're nested a level down, mount that subdirectory
instead.
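The sanity check above can be scripted as a minimal sketch. The helper name check_minio_layout is hypothetical; it only confirms that .minio.sys sits at the top level of the directory you intend to mount and prints the sibling directories, which should be your buckets.

```shell
# Hypothetical helper: succeeds if DIR looks like the top of a MinIO data directory.
check_minio_layout() {
    dir=$1
    if [ ! -d "$dir/.minio.sys" ]; then
        echo "no .minio.sys at the top level of $dir; check one level down" >&2
        return 1
    fi
    # The glob skips dot-directories, so .minio.sys itself is not listed.
    for entry in "$dir"/*/; do
        [ -d "$entry" ] && basename "$entry"
    done
    return 0
}
```

Run it as check_minio_layout /mnt/marathon/minio-dataset; a non-zero exit means the path is not the directory RustFS should mount.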
We reuse the MinIO root keys for RustFS so clients change nothing (same endpoint, same keys).
The UID is the gotcha, and it rules out the catalog app. MinIO's data is owned by
473:473, and RustFS must run as a UID that can read it. The TrueNAS catalog app
can't do this: it forces UID 568 and points RustFS at its own /rustfs/data0
volume rather than your dataset, so it crash-loops with Permission denied. Deploy a
Custom App instead (Step 3) — it lets you set both the volume path and the UID. Run
the container as 473 (matching the existing owner, no chown — cleanest and
rollback-safe), or as 568 if you deliberately chowned the dataset. Do not use
RustFS's image default of 10001 — it just diverges from the platform apps user and
complicates rollback.
Step 1 — Snapshot the MinIO dataset (your only rollback)
RustFS's docs provide no rollback guidance and it may rewrite metadata in place, so snapshot first (System Settings → Shell):
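A minimal sketch of the snapshot step follows. It assumes the ZFS dataset name mirrors the host path (marathon/minio-dataset), and the snapshot label pre-rustfs is a hypothetical name; confirm the dataset with zfs list before relying on it.

```shell
# Assumption: the dataset name mirrors the host path /mnt/marathon/minio-dataset.
DATASET="marathon/minio-dataset"
# Hypothetical snapshot label; the rollback in Step 5 must use the same name.
SNAPSHOT="pre-rustfs"

snapshot_minio_dataset() {
    zfs snapshot "${DATASET}@${SNAPSHOT}" || return 1
    # Confirm the snapshot exists before stopping or changing anything.
    zfs list -t snapshot -o name,creation "${DATASET}@${SNAPSHOT}"
}
```

Call snapshot_minio_dataset as root and proceed only once the snapshot is listed.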
Step 2 — Stop the old apps
- Apps → minio → Stop. RustFS and MinIO cannot both own the data directory.
- If you trialled versitygw earlier, stop/delete that app and you may delete its empty marathon/objects dataset; it is not part of the RustFS plan.
Step 3 — Deploy RustFS as a Custom App (not the catalog app)
The catalog app does NOT work for binary replacement
The TrueNAS RustFS catalog app runs a fresh, self-managed store — it points RustFS
at its own internal path /rustfs/data0 (often a root-owned ixVolume), not at your
MinIO dataset, and it enforces a minimum UID/GID of 568. Pointed at MinIO's
473-owned data it crash-loops with Io error: Permission denied (os error 13) and
never adopts the data. Use a Custom App instead — it lets you override the volume
path and the UID, which is exactly what in-place replacement requires.
Apps → Discover Apps → Custom App → Install via YAML. The two settings that make
binary replacement work are RUSTFS_VOLUMES=/data (overrides the /rustfs/data0
default so RustFS reads your MinIO data) and user: matching the data owner:
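As a sketch of what such a Custom App definition could look like, the function below prints a compose-style YAML for a given owner. Several details are assumptions, not taken from this runbook: the image reference rustfs/rustfs:latest, the RUSTFS_ACCESS_KEY variable name, and the placeholder key values. RUSTFS_VOLUMES, RUSTFS_SECRET_KEY, the ports, and the host path come from the text.

```shell
# Print a Custom App YAML for in-place replacement.
# $1 = container UID:GID, which must equal the owner of the MinIO data.
render_rustfs_compose() {
    owner=$1
    cat <<EOF
services:
  rustfs:
    image: rustfs/rustfs:latest   # assumption: pin the beta tag you validated
    user: "${owner}"
    restart: unless-stopped
    environment:
      RUSTFS_VOLUMES: /data
      RUSTFS_ACCESS_KEY: REPLACE_WITH_MINIO_ACCESS_KEY   # assumed variable name
      RUSTFS_SECRET_KEY: REPLACE_WITH_MINIO_SECRET_KEY
    ports:
      - "9000:9000"   # S3 API
      - "9001:9001"   # console
    volumes:
      - /mnt/marathon/minio-dataset:/data
EOF
}
```

render_rustfs_compose 473:473 prints YAML you can paste into the Install via YAML dialog after replacing the placeholders.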
Console behind Cloudflare is buggy on the current beta
The S3 API port (9000) and console port (9001) above are both published, so the
console works on the LAN (http://<nas-ip>:9001). But RustFS's console currently
embeds its internal port in API calls, so reached through a Cloudflare hostname
(port 443, no port) those calls fail
(rustfs#966,
#3062). Recommended split:
- S3 API (:9000) → Cloudflare: a plain S3 endpoint proxies fine (and R2 is the real cloud exposure tier anyway).
- Console (:9001) → reach over LAN / Tailscale, not Cloudflare. It's an admin UI; it doesn't need public ingress, and this sidesteps the proxy bug.
If you must expose the console via Cloudflare, keep API + console on the same
hostname (split domains are the worst case in the issues) and set the two
*_CORS_ALLOWED_ORIGINS vars above; expect flakiness until those issues are fixed.
Match user: to the data owner — check it first:
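One way to check, as a minimal sketch: the two helper names are hypothetical, and both rely on GNU stat and find as shipped on a Linux-based system such as TrueNAS SCALE.

```shell
# Print the numeric owner of a path as UID:GID.
data_owner() {
    stat -c '%u:%g' "$1"
}

# Print the first path under DIR ($1) not owned by UID ($2), if any.
# Empty output means ownership is uniform.
first_foreign_path() {
    find "$1" ! -uid "$2" -print -quit
}
```

For example, data_owner /mnt/marathon/minio-dataset shows the owner of the top-level directory, and first_foreign_path /mnt/marathon/minio-dataset 473 reveals whether anything underneath has drifted from that owner.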
If it's still 473, use user: "473:473" (no chown — cleanest, rollback-safe). If you
already chowned it to 568, use user: "568:568". The only rule is container user ==
data owner; otherwise RustFS hits Permission denied on /data and exits. Avoid the
image default 10001 — it just diverges from the platform apps user and complicates
rollback.
Step 4 — Verify
RustFS ships a web console (unlike versitygw):
- Console: http://<nas-ip>:9001 → log in → confirm your buckets and objects list.
- S3 API round-trip: list the buckets through port 9000 with the same keys.
Cross-check the bucket list against data-plane/datasets.yml (the dataset manifest)
to confirm nothing is missing.
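The round-trip and the cross-check can be sketched with the AWS CLI. This assumes the CLI is installed, that AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY hold the reused MinIO keys, and that you have written the expected bucket names from the manifest into a plain text file, one per line (the manifest's own format is not assumed here). The function names are hypothetical.

```shell
# Replace <nas-ip> with the NAS address before use.
S3_ENDPOINT="http://<nas-ip>:9000"

# Print one bucket name per line as seen through the S3 API.
list_buckets() {
    aws --endpoint-url "$S3_ENDPOINT" s3api list-buckets \
        --query 'Buckets[].Name' --output text | tr '\t' '\n'
}

# Print names present in the expected file ($1) but absent from the actual file ($2).
missing_buckets() {
    expected_sorted=$(mktemp) && actual_sorted=$(mktemp) || return 1
    sort -u "$1" > "$expected_sorted"
    sort -u "$2" > "$actual_sorted"
    comm -23 "$expected_sorted" "$actual_sorted"
    rm -f "$expected_sorted" "$actual_sorted"
}
```

Redirect list_buckets into a file and pass it as the second argument; empty output from missing_buckets means every expected bucket is served by RustFS.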
Step 5 — Roll back if needed
If buckets/objects don't list correctly, revert instantly and restart MinIO:
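A minimal sketch of the revert, assuming the dataset is marathon/minio-dataset and the Step 1 snapshot was named pre-rustfs (both are assumptions; substitute the names you actually used). Stop the RustFS app first, since two engines cannot own the data directory at once.

```shell
DATASET="marathon/minio-dataset"   # assumption: mirrors the host path
SNAPSHOT="pre-rustfs"              # hypothetical: the label chosen in Step 1

rollback_to_snapshot() {
    # A plain rollback succeeds only if this is the most recent snapshot.
    # zfs refuses otherwise; adding -r would destroy the newer snapshots.
    zfs rollback "${DATASET}@${SNAPSHOT}"
}
```

After rollback_to_snapshot returns successfully, start the minio app again from Apps.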
Step 6 — Wire the R2 exposure-tier sync
R2 stays the cloud/exposure tier; only the local engine changed. Schedule an
rclone → R2 sync for the public/website buckets via System → Advanced → Cron Jobs
(R2 has no native inbound replication; consistency is eventual).
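The scheduled job could look like the sketch below. The rclone remote names rustfs and r2 are hypothetical and must first be defined with rclone config; the bucket names are passed in by the caller.

```shell
# Hypothetical rclone remote names; define them with `rclone config`.
LOCAL_REMOTE="rustfs"
R2_REMOTE="r2"

sync_bucket_to_r2() {
    # One-way sync: makes the R2 bucket match the local one, deleting extras on R2.
    rclone sync "${LOCAL_REMOTE}:$1" "${R2_REMOTE}:$1" --checksum
}

sync_exposed_buckets() {
    status=0
    for bucket in "$@"; do
        # Keep going after a failure so one bad bucket does not block the rest.
        sync_bucket_to_r2 "$bucket" || status=1
    done
    return "$status"
}
```

Save this as a script that ends with a sync_exposed_buckets call listing only the public/website buckets, and point the cron job at it. Because rclone sync deletes destination objects that are missing locally, try it with rclone's --dry-run flag first.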
Step 7 — Rotate the root key
If the MinIO secret was ever exposed, change RUSTFS_SECRET_KEY in the app and update
clients.
Step 8 — SSO via JumpCloud OIDC (optional)
To let people sign in to the RustFS console (and obtain temporary S3 credentials) with their org identity, federate RustFS to an external IdP.
Why OIDC, and why JumpCloud (not Google directly)
- RustFS supports OIDC only, not LDAP. Its external-auth implementation is OIDC (OidcSys + STS AssumeRoleWithWebIdentity, with JWT claims mapped to IAM policies), so "JumpCloud LDAP" / "Google Secure LDAP" are not options regardless of what the IdPs offer.
- Use JumpCloud, not raw Google. The value of an IdP here is mapping group membership → RustFS IAM policy, which needs group info in the JWT. JumpCloud emits group claims in its OIDC token (and already federates your Google Workspace users). Google's OIDC tokens do not contain Workspace group membership (groups live in the Admin SDK, not the ID token), so Google-direct authenticates users but can't drive group-based authorization. JumpCloud sits in front of Google and fills that gap.
RustFS OIDC config (OidcProviderConfig)
| Key | Value |
|---|---|
| config_url | JumpCloud discovery URL (.../.well-known/openid-configuration) |
| client_id / client_secret | from the JumpCloud OIDC app |
| scopes | openid,profile,email + whatever scope surfaces groups |
| groups_claim / roles_claim | the JWT claim carrying group/role names (e.g. groups) |
| email_claim / username_claim | identity (RustFS resolves preferred_username → email → sub) |
Setup
- JumpCloud → SSO → add a Custom OIDC app:
  - Grant type Authorization Code + PKCE.
  - Redirect URI = the RustFS console OIDC callback (exact path from your console, under /rustfs/console/...); it must match exactly.
  - Add a groups attribute/claim to the app and attach the user groups RustFS should see. Record client_id, client_secret, and the discovery URL.
- RustFS: configure the OIDC provider with the keys above (config_url, client_id, client_secret, groups_claim).
- RustFS IAM: define policies and map claim values → policies (e.g. group data-admins → admin policy, data-readers → read-only).
- Test the console Login with SSO flow; programmatic clients can use AssumeRoleWithWebIdentity with a JumpCloud JWT, or keep static keys.
Caveats (beta)
- RustFS OIDC is beta with active bugs, notably issuer trailing-slash handling that breaks some IdPs (rustfs#2349, #2049). Match JumpCloud's issuer URL exactly (with/without the trailing /).
- Binary replacement dropped MinIO's old OIDC/LDAP config, so you configure this fresh.
- Keep the static root access key as a break-glass admin in case SSO misbehaves.
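To see the issuer string exactly as the IdP publishes it, trailing slash included, a small helper can read it from the discovery document. This is a sketch with hypothetical function names. It uses a simple sed extraction instead of a JSON parser, so treat the output as a convenience and compare it against the raw document if in doubt.

```shell
# Print the "issuer" value from an OIDC discovery document.
# $1 = discovery URL (.../.well-known/openid-configuration)
discovered_issuer() {
    curl -fsS "$1" | tr -d '\n' |
        sed -n 's/.*"issuer"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Succeed only if the published issuer equals the expected string byte for byte.
# $1 = discovery URL, $2 = issuer you intend to rely on
issuer_matches() {
    [ "$(discovered_issuer "$1")" = "$2" ]
}
```

Since the comparison is exact, a value that differs only by a trailing / fails, which is the mismatch the first caveat warns about.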
Running long S3 jobs detached
The TrueNAS web UI Shell has an idle timeout and a fragile websocket; a foreground
command tied to it gets SIGHUP'd on disconnect. For any long-running S3 job (e.g. an
R2 sync), run the container detached so the Docker daemon owns it:
SSH into the NAS is also steadier than the web Shell, but with -d it doesn't matter
if the connection drops.
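A detached run could be sketched as follows, using the public rclone/rclone image, which reads its configuration from /config/rclone. The container name r2-sync and the host config directory are hypothetical; adjust both to your setup.

```shell
# Start an rclone sync in a detached container owned by the Docker daemon.
# $1 = source (remote:bucket), $2 = destination (remote:bucket)
start_detached_sync() {
    docker run -d --name r2-sync \
        -v /mnt/marathon/apps/rclone:/config/rclone \
        rclone/rclone sync "$1" "$2"
}
```

Follow progress with docker logs -f r2-sync (safe to interrupt; the job keeps running) and remove the finished container with docker rm r2-sync.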
Appendix — versitygw (copy-based) alternative
If you ever want browsable objects as plain files on ZFS (so zfs snapshot yields
individually restorable files) and production-grade maturity, the alternative is
versitygw with its posix backend. It cannot
read MinIO's directory in place, so it requires an S3-API copy:
- Create a fresh dataset (marathon/objects), acltype=posix. (xattr=sa is an optional perf tweak; TrueNAS SCALE often reverts it to on, which is fine.)
- Install the versitygw catalog app (posix backend, host path the new dataset, UID/GID 568; the catalog app enforces a 568 minimum), on a spare port.
- rclone sync from MinIO's S3 endpoint to versitygw's, then rclone check.
- Cut clients over and decommission MinIO.
Trade-off vs RustFS: versitygw needs the copy (time + temporary double space) and a plain copy does not preserve versioning/object-locks/IAM, but you gain browsable files, snapshot-as-real-backup, and mature code.