Hardening Guide

This is a checklist for running dockmesh in production or any multi-user environment. Work through it once at setup, then revisit quarterly.

Server binary

Permissions

The dockmesh binary should be owned by root and writable only by root:

chown root:root /usr/local/bin/dockmesh
chmod 755 /usr/local/bin/dockmesh

The data directory (the parent directory of DOCKMESH_DB_PATH) should be root-owned with mode 700, and the database files inside it with mode 600:

chown -R root:root /opt/dockmesh
chmod 700 /opt/dockmesh
chmod 600 /opt/dockmesh/data/*.db

The SQLite database contains the CA private key, session tokens, and encrypted secrets. Do not let it be world-readable.
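The ownership and modes above can be verified with a small audit helper. This is a minimal sketch, not part of dockmesh: check_mode is a hypothetical function name, and it assumes GNU stat (the -c option), which BSD and macOS do not provide.

```shell
#!/bin/sh
# check_mode PATH OWNER MODE
# Prints "ok" when PATH has the expected owner and octal mode, "FAIL" otherwise.
# Returns 0 on match, 1 on mismatch, 2 if stat itself fails.
# check_mode is a hypothetical helper; stat -c is GNU coreutils syntax.
check_mode() {
    path=$1
    want="$2 $3"
    actual=$(stat -c '%U %a' "$path") || return 2
    if [ "$actual" = "$want" ]; then
        echo "ok   $path ($actual)"
    else
        echo "FAIL $path: expected $want, found $actual"
        return 1
    fi
}

# Example calls matching the commands above (run as root):
# check_mode /usr/local/bin/dockmesh root 755
# check_mode /opt/dockmesh root 700
```

Running the helper from a cron job or a CI check makes permission drift visible between the quarterly reviews.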

Systemd hardening

Add these directives to /etc/systemd/system/dockmesh.service:

[Service]
NoNewPrivileges=true
ProtectSystem=strict
ProtectHome=true
PrivateTmp=true
PrivateDevices=true
ProtectKernelTunables=true
ProtectKernelModules=true
ProtectControlGroups=true
ReadWritePaths=/opt/dockmesh /var/run/docker.sock
RestrictNamespaces=true
RestrictRealtime=true
LockPersonality=true
MemoryDenyWriteExecute=true
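
After editing the unit, it helps to confirm that no directive was dropped by accident. The sketch below checks a unit file for the directives listed above; missing_directives is a hypothetical helper, and it only reads the one file it is given, so drop-in overrides are not considered.

```shell
#!/bin/sh
# missing_directives UNIT_FILE
# Prints each hardening directive from the list above that UNIT_FILE lacks.
# Hypothetical helper: it does not evaluate systemd drop-in files.
missing_directives() {
    unit=$1
    for d in NoNewPrivileges=true ProtectSystem=strict ProtectHome=true \
             PrivateTmp=true PrivateDevices=true ProtectKernelTunables=true \
             ProtectKernelModules=true ProtectControlGroups=true \
             RestrictNamespaces=true RestrictRealtime=true \
             LockPersonality=true MemoryDenyWriteExecute=true; do
        # -x matches the whole line, -F treats the directive as a literal string
        grep -qxF "$d" "$unit" || echo "$d"
    done
}

# Example:
# missing_directives /etc/systemd/system/dockmesh.service
# systemctl daemon-reload && systemctl restart dockmesh
```

On a running host, systemd-analyze security dockmesh.service gives a complementary view by scoring the exposure of the loaded unit.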

User accounts

Replace the default admin/admin credentials on first login by setting a strong password. Better: enable SSO and delete the local admin account after verifying that SSO works.

Enforce 2FA for all admin accounts (Settings → Authentication → 2FA policy → Required for admins).

TLS

For the web UI

Enable the embedded Caddy reverse proxy (see Reverse Proxy) and bind dockmesh to 127.0.0.1:8080. Caddy terminates TLS on 443.

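If you use an external reverse proxy or load balancer (Cloudflare, AWS ALB, nginx), set DOCKMESH_HTTP_ADDR=127.0.0.1:8080 when the proxy runs on the same host, so dockmesh never listens on a public IP directly. A proxy on another machine cannot reach a loopback address; in that case bind dockmesh to a private interface that only the proxy can reach.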
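
To confirm that dockmesh is not exposed on a public address, the listening sockets can be inspected. The following is a sketch under assumptions: public_listeners is a hypothetical helper that reads the output of ss -ltnH (iproute2) on stdin and reports any non-loopback listener on the given port.

```shell
#!/bin/sh
# public_listeners PORT
# Reads `ss -ltnH` output on stdin and prints every local address bound to
# PORT that is not loopback (127.0.0.0/8 or [::1]). No output means the
# port is only reachable from the host itself. Hypothetical helper.
public_listeners() {
    awk -v port="$1" '
        {
            addr = $4                            # "Local Address:Port" column
            n = split(addr, parts, ":")
            if (parts[n] != port) next           # a different port
            if (addr ~ /^127\./) next            # IPv4 loopback
            if (addr ~ /^\[::1\]:/) next         # IPv6 loopback
            print addr
        }'
}

# Example:
# ss -ltnH | public_listeners 8080
```

An address such as 0.0.0.0:8080 or [::]:8080 in the output means the bind setting did not take effect.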

For agents

The agent protocol is mTLS by default. Do not disable mTLS. Rotate the CA if you suspect compromise:

# Backup first
cp /opt/dockmesh/data/dockmesh.db /opt/dockmesh/data/dockmesh.db.bak

# Rotate
dockmesh ca rotate --reissue-all-agents

All agents re-enroll on next connect. Old certs are revoked.
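
The backup step above overwrites the same .bak file each time. A timestamped, verified copy is one possible refinement; the sketch below assumes a hypothetical helper named backup_db. Copying a SQLite file while it is being written can capture an inconsistent state, so consider stopping dockmesh before taking the copy.

```shell
#!/bin/sh
# backup_db DB_PATH
# Copies DB_PATH to DB_PATH.<UTC timestamp>.bak, verifies the copy byte for
# byte, restricts it to mode 600, and prints the backup path.
# Hypothetical helper, not a dockmesh command.
backup_db() {
    db=$1
    bak="$db.$(date -u +%Y%m%dT%H%M%SZ).bak"
    cp -p "$db" "$bak" || return 1
    if ! cmp -s "$db" "$bak"; then
        echo "backup differs from source: $bak" >&2
        return 1
    fi
    chmod 600 "$bak"
    echo "$bak"
}

# Example: rotate only if the backup succeeded.
# bak=$(backup_db /opt/dockmesh/data/dockmesh.db) \
#     && dockmesh ca rotate --reissue-all-agents
```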

Network

Firewall

On the server host:

Port	Direction	Source	Purpose
443	in	Public (if UI is internet-facing) or VPN subnet	HTTPS UI
8443	in	Agent IPs only	Agent mTLS
80	in	Public	ACME challenge (only if using auto-TLS)
22	in	Admin IPs only	SSH

Block everything else. Example ufw:

ufw default deny incoming
ufw allow from <VPN-subnet> to any port 443
ufw allow from <agent-subnet> to any port 8443
ufw allow from <admin-ip> to any port 22
ufw enable
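
The example above omits port 80, which the table lists only for auto-TLS. The sketch below prints the full rule set for review before anything is applied; print_ufw_rules is a hypothetical helper, and the addresses in the usage line are placeholders from documentation ranges.

```shell
#!/bin/sh
# print_ufw_rules VPN_SUBNET AGENT_SUBNET ADMIN_IP [acme]
# Prints the ufw commands for the firewall table above without running them.
# Pass "acme" as the fourth argument to open port 80 for the ACME challenge.
# Hypothetical helper; review the output, then pipe it to sh as root.
print_ufw_rules() {
    vpn=$1
    agents=$2
    admin=$3
    acme=${4:-}
    echo "ufw default deny incoming"
    echo "ufw allow from $vpn to any port 443 proto tcp"
    echo "ufw allow from $agents to any port 8443 proto tcp"
    echo "ufw allow from $admin to any port 22 proto tcp"
    if [ "$acme" = acme ]; then
        # Only needed when auto-TLS is in use
        echo "ufw allow 80/tcp"
    fi
    echo "ufw enable"
}

# Example (placeholder addresses):
# print_ufw_rules 10.8.0.0/24 10.20.0.0/24 203.0.113.7 acme
```

Printing first keeps the SSH rule visible before ufw enable runs, which lowers the chance of locking yourself out of a remote host.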

Agent hosts

Agents make outbound-only connections. No inbound ports needed. If you have an inbound firewall rule for them, remove it.

Secrets

Don’t commit to Git

If stacks are Git-backed, don’t commit plaintext passwords. Options:

SOPS with age encryption — files are encrypted in Git, decrypted at deploy time
Docker native secrets with an external store (Vault, AWS Secrets Manager)
dockmesh env var secrets — encrypted at rest in the dockmesh DB, never in Git
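
Whichever option you choose, a rough scan of the stack repository can catch plaintext values before they are committed. This is a heuristic sketch, not a dockmesh feature: scan_plaintext_secrets is a hypothetical helper, the key names it looks for are assumptions, and it skips lines that contain a ${...} variable reference or a SOPS ENC[...] value.

```shell
#!/bin/sh
# scan_plaintext_secrets DIR
# Prints file:line:content for assignments to keys that look like secrets
# and appear to hold a literal value. Returns 1 when anything is found,
# 0 when the tree looks clean. Heuristic only; expect false positives.
scan_plaintext_secrets() {
    hits=$(grep -rnEi \
        '(password|passwd|secret|token)[a-z_]*[[:space:]]*[:=][[:space:]]*[^[:space:]]' "$1" \
        | grep -vE 'ENC\[|\$\{' || true)
    [ -z "$hits" ] && return 0
    printf '%s\n' "$hits"
    return 1
}

# Example, e.g. from a pre-commit hook:
# scan_plaintext_secrets ./stacks || exit 1
```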

Rotation

Rotate these on a schedule:

Secret	Rotation
Agent mTLS certs	Auto, every 30 days
API tokens	Every 90 days (force via UI)
Session tokens	Expire after 15 min (JWT)
SSO client secret	Every 12 months
SMTP password	On staff change
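
For the manually rotated entries, a date check can flag anything past its interval. The sketch below assumes you record creation dates yourself as YYYY-MM-DD; days_between and needs_rotation are hypothetical helpers and rely on GNU date (the -d option).

```shell
#!/bin/sh
# days_between START END
# Prints the number of whole days from START to END (YYYY-MM-DD, UTC).
# Requires GNU date for -d. Hypothetical helper.
days_between() {
    start=$(date -u -d "$1" +%s) || return 2
    end=$(date -u -d "$2" +%s) || return 2
    echo $(( (end - start) / 86400 ))
}

# needs_rotation CREATED MAX_DAYS [TODAY]
# Succeeds (exit 0) when the secret is at least MAX_DAYS old.
# TODAY defaults to the current UTC date.
needs_rotation() {
    today=${3:-$(date -u +%F)}
    [ "$(days_between "$1" "$today")" -ge "$2" ]
}

# Example: an API token created on 2024-01-01, 90-day policy.
# needs_rotation 2024-01-01 90 && echo "rotate this token"
```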

Backups

Encrypt everything

Use age-encrypted backups (see the Backup docs). Store the passphrase in a password manager that is separate from the dockmesh instance.

Test restores

A backup you haven’t restored isn’t a backup. Schedule a quarterly test restore to a scratch host.

Off-site

At least one backup target should be on a different host (and ideally with a different provider). Local backup + S3/B2 is the common pattern.

Agent enrollment

Enrollment tokens are powerful. Treat them like root SSH keys:

One-time use (dockmesh enforces this)
Transmit over encrypted channel (SSH, 1Password shared vault, Signal)
Never commit to Git or pipeline config
Rotate the server’s enrollment-signing key annually

Container runtime

dockmesh doesn’t replace Docker security best practices:

Run containers as non-root where possible (USER in Dockerfile)
Drop capabilities (cap_drop: [ALL], add only what’s needed)
Use read-only root filesystem (read_only: true + explicit write volumes)
Set resource limits (cpus, mem_limit) on every container
Use no-new-privileges: true in compose
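
The same settings can be expressed as docker run flags for containers started outside compose. The sketch below collects them in one place; hardened_run_args is a hypothetical helper, and the UID, CPU, and memory values as well as the image name in the usage line are placeholders to adjust per workload.

```shell
#!/bin/sh
# hardened_run_args
# Prints docker run flags that correspond to the checklist above:
# non-root user, no capabilities, read-only root filesystem with a
# writable tmpfs, resource limits, and no-new-privileges.
# Hypothetical helper; the numeric values are placeholders.
hardened_run_args() {
    printf '%s ' \
        --user 1000:1000 \
        --cap-drop ALL \
        --read-only \
        --tmpfs /tmp \
        --cpus 1 \
        --memory 256m \
        --security-opt no-new-privileges:true
    echo
}

# Example (word splitting of the output is intentional):
# docker run $(hardened_run_args) example/app:1.0
```

Add capabilities back individually with --cap-add only when the workload fails without them.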

The image scanner in dockmesh catches known CVEs but not runtime misconfigurations. Use a separate runtime scanner (Falco, Tracee) for those.

Audit and monitoring

Enable the Audit Log webhook to ship events to your SIEM
Alert on: failed SSO attempts, mTLS handshake failures, audit chain breaks
Review the audit log weekly for unexpected admin actions

Periodic reviews

Quarterly:

Who has Admin? Why?
Which API tokens exist? Who owns each? Still needed?
Expired or unused role assignments
Hosts that haven’t connected in 30 days (stale agents)
Stacks running on EOL’d base images
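
The stale-agent check lends itself to a small script. The sketch below assumes you can produce a list of "hostname last-seen-date" lines by some means; that input format is hypothetical, as is the helper name stale_agents, and GNU date is required for -d.

```shell
#!/bin/sh
# stale_agents MAX_DAYS [TODAY]
# Reads "hostname YYYY-MM-DD" lines on stdin and prints "hostname age-in-days"
# for every host last seen MAX_DAYS or more ago. TODAY defaults to now.
# Hypothetical helper and input format; requires GNU date.
stale_agents() {
    max=$1
    now=$(date -u -d "${2:-now}" +%s) || return 2
    while read -r host seen; do
        [ -n "$host" ] || continue               # skip blank lines
        seen_s=$(date -u -d "$seen" +%s) || continue
        age=$(( (now - seen_s) / 86400 ))
        if [ "$age" -ge "$max" ]; then
            echo "$host $age"
        fi
    done
    return 0
}

# Example:
# printf 'web-1 2024-05-30\ndb-1 2024-03-01\n' | stale_agents 30 2024-06-01
```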