Skip to content

Upgrade Guide

dockmesh follows semantic versioning. Minor and patch releases are drop-in upgrades. Major versions may require manual steps — always check the changelog before a major upgrade.

  1. Read the changelog for your target version
  2. Back up the data directory
  3. Stop the dockmesh server
  4. Replace the binary
  5. Start the dockmesh server
  6. Verify agents reconnect
  7. Verify all stacks are healthy

Typical total downtime: 30-60 seconds for the UI. Your containers keep running the whole time — the dockmesh server being down doesn’t affect Docker.

Terminal window
# Stop-the-world snapshot of the database + CA
systemctl stop dockmesh
tar czf /tmp/dockmesh-backup-$(date +%Y%m%d).tar.gz /var/lib/dockmesh/

Copy the tarball somewhere off the host (S3, another server, your workstation).

For a longer-term DR-grade snapshot, schedule the built-in dockmesh-system backup job (see Backup & Restore) — that pairs with dockmesh restore --from ….

If the upgrade fails, restore by extracting the tarball and using the previous binary.

The simplest path is to re-run the one-line installer — it picks up the latest GitHub release, verifies the SHA-256 against checksums.txt, and swaps the binary in place:

Terminal window
curl -fsSL https://get.dockmesh.dev | sudo bash

For a manual upgrade (air-gap, staged rollouts, pinned versions):

Terminal window
# Check current version
/usr/local/bin/dockmesh --version
# Pick a specific tag (replace v0.2.x with the target) and fetch the
# matching tarball + checksum from GitHub releases.
TAG=v0.2.x
curl -fsSL "https://github.com/dockmesh/dockmesh/releases/download/${TAG}/dockmesh_linux_amd64.tar.gz" -o /tmp/dockmesh.tar.gz
curl -fsSL "https://github.com/dockmesh/dockmesh/releases/download/${TAG}/dockmesh_linux_amd64.tar.gz.sha256" -o /tmp/dockmesh.tar.gz.sha256
( cd /tmp && sha256sum -c dockmesh.tar.gz.sha256 )
# Extract just the binaries we need
tar -xzf /tmp/dockmesh.tar.gz -C /tmp dockmesh dockmesh-agent dmctl
Terminal window
# Move old binaries aside
sudo mv /usr/local/bin/dockmesh /usr/local/bin/dockmesh.old
# Install new
sudo install -m 0755 /tmp/dockmesh /usr/local/bin/dockmesh
# Optional: ship the matching dmctl + agent binary too
sudo install -m 0755 /tmp/dmctl /usr/local/bin/dmctl
sudo install -m 0755 /tmp/dockmesh-agent /usr/local/share/dockmesh/bin/dockmesh-agent-linux-amd64
# Start
sudo systemctl start dockmesh
# Watch logs
journalctl -u dockmesh -f

The server’s startup logs will note when migrations apply and the HTTP / agent listeners come up. The exact lines differ per release — what you’re looking for is the absence of ERROR rows and a final “http listening” entry.

If migrations fail, the server exits. Roll back:

Terminal window
sudo systemctl stop dockmesh
sudo mv /usr/local/bin/dockmesh.old /usr/local/bin/dockmesh
# Restore the data dir if migrations partially ran
sudo tar -xzf /tmp/dockmesh-backup-*.tar.gz -C /
sudo systemctl start dockmesh

When an agent reconnects after the server starts, it reports its version. If the server’s new version ships an updated agent protocol, the server instructs the agent to download and hot-swap its binary.

Agent hot-swap:

  1. Server sends upgrade message with the new agent URL + checksum
  2. Agent downloads, verifies, writes new binary atomically
  3. Agent re-executes itself with the new binary (open connections survive via the systemd socket fd)
  4. Server confirms new version in the Hosts page

Typical agent upgrade: < 10 seconds, no stack restart.

Agents that can’t reach the upgrade URL (true air-gapped, or simply behind a strict egress firewall) won’t auto-upgrade. The Agents page flags them as drifted; copy the matching agent binary across the gap by hand:

Terminal window
# On a machine with internet, grab the right tarball + extract the agent
TAG=v0.2.x # match the central server's version
curl -fsSL "https://github.com/dockmesh/dockmesh/releases/download/${TAG}/dockmesh_linux_amd64.tar.gz" \
| tar -xz dockmesh-agent
# Move the file across to the air-gapped host (USB, scp via bastion, …)
# On the air-gapped agent host
sudo install -m 0755 dockmesh-agent /usr/local/bin/dockmesh-agent
sudo systemctl restart dockmesh-agent

Quick checks:

  • Dashboard loads, shows expected host count
  • All hosts are Online (Hosts page, status column)
  • No unexpected alerts fired during the upgrade window
  • Spot-check one stack — open it, deploy something minor, verify it works
  • Check the audit log for migration events (new version often adds new audit categories)

If you have many agents and want to upgrade them in waves instead of all at once, configure the fleet-wide policy on the Agents page (top panel):

  • Manual (default) — agents stay on their current version until an admin clicks Upgrade on each row. Safest; what existing installs get on first boot after this feature shipped.
  • Auto — every drifted agent is sent a FrameReqAgentUpgrade on the next evaluator tick. Good for homelab / small fleets where blast-radius is small.
  • Staged — the controller upgrades a configurable percentage of drifted agents per tick (default 10%), so you can watch the first wave before the rest follow.

The evaluator runs every 60 seconds. When you change the mode or bump the server version, hit Run now on the policy panel to trigger an immediate evaluation instead of waiting for the next tick.

Every drifted agent also gets a dedicated Upgrade button on its row in the Agents page — useful regardless of mode when a single host fell behind after a network blip.

The evaluator compares each connected agent’s Hello.Version (reported on every WS connect) to pkg/version.Version compiled into the server binary. String equality — no semver comparison. That keeps the logic simple: after a server upgrade, every agent is “drifted” until it reports the new version in its next hello, and normal evaluation takes over.

Every dispatched upgrade is logged with action stack.deploy and a action=agent-upgrade detail field, including the target agent id, the server’s version, and which binary was requested. Policy changes log as agent.upgrade_policy. Manual runs log as agent.upgrade_run.

Downgrades are not supported across database migrations. The server refuses to start if the DB schema is newer than the binary expects.

If you must downgrade:

  1. Stop the server
  2. Restore the pre-upgrade backup
  3. Install the older binary
  4. Start

This loses any changes made after the upgrade.

For v1.x → v2.0-style jumps:

  • Read the migration notes section of the release post
  • Test on a staging dockmesh first (point a throwaway instance at a subset of hosts)
  • Plan for 15-30 min downtime
  • Have the rollback procedure pre-tested