Agent Protocol
Most users never need to understand the wire protocol. But when debugging connectivity, planning network architecture, or evaluating the security model, knowing how it works matters.
The basics
- Transport: WebSocket over TLS (wss://)
- Authentication: Mutual TLS (both sides present certs)
- Direction: Outbound-only from the agent
- Multiplexing: One WebSocket carries many concurrent streams
Certificate authority
Every dockmesh server has its own internal CA. On first boot, it generates:
- A CA keypair (RSA 4096 by default)
- A server leaf cert signed by the CA
The CA private key is stored in the DB, encrypted with a key derived from the DB file path + optional DOCKMESH_CA_PASSPHRASE.
Enrollment
When you add a host:
- Server generates a one-time bootstrap token (random 128-bit, stored in DB)
- You transport the token to the agent host (install script, copy-paste)
- Agent runs:
    dockmesh-agent enroll --server <url> --token <token>
- Agent connects with the token as initial auth
- Server verifies token (one-time, expires in 1 hour)
- Agent generates a keypair
- Agent sends CSR with its hostname + requested SANs
- Server signs a cert valid for 30 days with the CA
- Cert returned to agent, saved to /etc/dockmesh-agent/cert.pem
- Token is invalidated
From then on, the agent uses its cert for every connection.
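The token handling in the first steps can be sketched in Python. This is a minimal illustration under stated assumptions, not dockmesh's implementation: the `TokenStore` class, its in-memory dict, and the method names are hypothetical stand-ins for the DB-backed logic. Only the 128-bit size, the one-time use, and the 1-hour expiry come from the steps above.

```python
import secrets
import time

TOKEN_TTL_SECONDS = 3600  # tokens expire after 1 hour


class TokenStore:
    """Hypothetical in-memory stand-in for the server's token table."""

    def __init__(self):
        self._issued = {}  # token -> issue time in seconds

    def issue(self, now=None):
        now = time.time() if now is None else now
        token = secrets.token_hex(16)  # 16 random bytes = 128 bits
        self._issued[token] = now
        return token

    def redeem(self, token, now=None):
        """Return True once for a valid, unexpired token; False otherwise."""
        now = time.time() if now is None else now
        # pop() removes the token, so a second attempt always fails.
        issued_at = self._issued.pop(token, None)
        if issued_at is None:
            return False
        return now - issued_at <= TOKEN_TTL_SECONDS
```

Removing the token on the first redemption attempt, whether or not it has expired, keeps the one-time guarantee simple: no token can be tried twice.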
Connection lifecycle
The agent dials wss://<server>:8443 as soon as its service starts:
- TLS handshake — agent presents its cert, server verifies against CA + revocation list
- WebSocket upgrade — HTTP/1.1 Upgrade to WebSocket
- Protocol version handshake — both sides exchange supported versions, pick highest common
- Ready — bidirectional messaging begins
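The version handshake step amounts to picking the highest version both sides support. A minimal sketch, assuming versions are plain integers (the actual version format is not specified here) and using a hypothetical `pick_version` helper:

```python
def pick_version(ours, theirs):
    """Return the highest version both sides support, or None if none match."""
    common = set(ours) & set(theirs)
    return max(common) if common else None
```

For example, an agent supporting versions 1 to 3 talking to a server supporting 2 to 4 would settle on 3. What happens when there is no common version is not described above; returning `None` is only a placeholder for that case.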
On any disconnection, the agent reconnects with exponential backoff (1s → 2s → 4s → 8s → 16s → 32s → 60s max).
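The reconnect schedule can be expressed as a small generator. This is a sketch of the delays listed above, not the agent's actual code; the `backoff_delays` name and its parameters are hypothetical.

```python
def backoff_delays(initial=1.0, cap=60.0):
    """Yield reconnect delays in seconds: doubling from `initial`, capped at `cap`."""
    delay = initial
    while True:
        yield min(delay, cap)
        delay = min(delay * 2, cap)
```

A reconnect loop would sleep for the next yielded delay after each failed attempt. Starting a fresh generator after a successful connection, so the next outage begins again at 1s, is an assumption rather than something stated above.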
Message framing
Once connected, both sides send typed messages:
    +------+--------+-------+---------+
    | type | stream | flags | payload |
    +------+--------+-------+---------+
- type (uint16): message kind (heartbeat, stats, log, exec, etc.)
- stream (uint32): ID for multiplexed streams (logs for container A vs container B, etc.)
- flags: bit flags for compression, end-of-stream, etc.
- payload: MessagePack-encoded data
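To make the layout concrete, here is a sketch of packing and unpacking such a frame with Python's `struct` module. Several details are assumptions, because the text above does not specify them: big-endian byte order, a one-byte `flags` field, and the two flag bit values. The payload is treated as opaque bytes that were MessagePack-encoded elsewhere.

```python
import struct

# Assumed header layout: big-endian uint16 type, uint32 stream, uint8 flags.
HEADER = struct.Struct(">HIB")

FLAG_COMPRESSED = 0x01     # hypothetical bit assignment
FLAG_END_OF_STREAM = 0x02  # hypothetical bit assignment


def encode_frame(msg_type, stream_id, flags, payload):
    """Prefix an already-encoded payload with the fixed-size header."""
    return HEADER.pack(msg_type, stream_id, flags) + payload


def decode_frame(frame):
    """Split a frame into (type, stream, flags, payload bytes)."""
    msg_type, stream_id, flags = HEADER.unpack_from(frame)
    return msg_type, stream_id, flags, frame[HEADER.size:]
```

No length field appears in this sketch because WebSocket messages are already delimited by the transport, so the payload is simply everything after the header.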
Multiplexing
A single WebSocket carries many concurrent “virtual streams”:
- Stream 0: control channel (heartbeat, upgrade commands)
- Stream N (allocated per operation):
  - Live log tail for container X
  - Live stats sampling
  - Exec session for a specific shell
  - File transfer (volume backup/restore, stack migration)
Each stream has its own flow control and lifecycle. When you close the browser tab with the log viewer, dockmesh sends a stream close message — the agent stops tailing that container’s log.
This is why a single agent connection handles dozens of simultaneous operations efficiently.
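Routing frames by stream ID can be sketched as a small dispatcher. The `StreamMux` class and its methods are hypothetical, and per-stream flow control is deliberately left out; the sketch only shows how stream 0 and per-operation streams share one connection.

```python
class StreamMux:
    """Routes incoming payloads to per-stream handlers by stream ID."""

    CONTROL_STREAM = 0

    def __init__(self, control_handler):
        # Stream 0 always exists and carries control messages.
        self._handlers = {self.CONTROL_STREAM: control_handler}
        self._next_id = 1

    def open(self, handler):
        """Allocate a new stream ID for one operation and register its handler."""
        stream_id = self._next_id
        self._next_id += 1
        self._handlers[stream_id] = handler
        return stream_id

    def close(self, stream_id):
        """Drop a per-operation stream; the control stream cannot be closed."""
        if stream_id != self.CONTROL_STREAM:
            self._handlers.pop(stream_id, None)

    def dispatch(self, stream_id, payload):
        """Deliver a payload; return False if the stream is closed or unknown."""
        handler = self._handlers.get(stream_id)
        if handler is None:
            return False
        handler(payload)
        return True
```

Closing a stream here just removes its handler, which mirrors the log viewer case: once the stream is closed, late frames for it are dropped instead of reaching anything.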
Message types
A small selection of the 30+ message types:
| Type | Direction | Purpose |
|---|---|---|
| heartbeat | agent → server | Every 15s; server marks host offline after 45s silence |
| container.list | server → agent | “Give me all containers” |
| container.log.stream | server → agent | “Start tailing logs for container X” |
| container.exec.open | server → agent | “Open a shell in container X” |
| stack.deploy | server → agent | “Apply this compose file” |
| volume.transfer.chunk | either | Stream a chunk of a volume during migration |
| agent.upgrade | server → agent | “Download new agent binary from URL Y” |
| cert.renew | agent → server | “Give me a new cert, mine expires soon” |
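The heartbeat row implies a simple liveness rule on the server: a host goes offline after 45 seconds without a heartbeat, which is three missed 15-second beats. A sketch of that rule, with a hypothetical `HostTracker` class and explicit timestamps so it is easy to test:

```python
OFFLINE_AFTER = 45.0  # seconds of silence before a host is marked offline


class HostTracker:
    """Hypothetical tracker for the time each host last sent a heartbeat."""

    def __init__(self):
        self._last_seen = {}  # host name -> timestamp of last heartbeat

    def on_heartbeat(self, host, now):
        self._last_seen[host] = now

    def offline_hosts(self, now):
        """Return hosts whose last heartbeat is at least OFFLINE_AFTER old."""
        return sorted(
            host
            for host, seen in self._last_seen.items()
            if now - seen >= OFFLINE_AFTER
        )
```

Whether the boundary is inclusive (exactly 45s counts as offline) is an assumption of this sketch.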
Cert rotation
When a cert is within 7 days of expiry, the agent sends cert.renew with a fresh CSR. The server issues a new cert and returns it in the response. The agent saves it and uses it on the next reconnect, so there is no downtime.
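The renewal trigger is a date comparison against the cert's expiry. A minimal sketch, assuming the agent already has the expiry as a timezone-aware `datetime`; the `needs_renewal` helper is hypothetical, and parsing the certificate itself is out of scope here.

```python
from datetime import datetime, timedelta, timezone

RENEW_WINDOW = timedelta(days=7)  # renew once expiry is this close


def needs_renewal(not_after, now=None):
    """Return True when the cert expires within RENEW_WINDOW (or already has)."""
    now = now or datetime.now(timezone.utc)
    return not_after - now <= RENEW_WINDOW
```

With 30-day certs and a 7-day window, an agent running this check regularly would request a new cert roughly 23 days after the previous one was issued.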
Revocation
When you remove a host from dockmesh, the server adds the cert serial to a revocation list. On the next handshake attempt, the server rejects the connection. The agent logs the rejection and exits.
Revocation is checked in memory, not via CRL or OCSP, so it takes effect within seconds.
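An in-memory check of this kind is essentially a set lookup on the cert serial during the handshake. The sketch below assumes integer serials and uses a hypothetical `RevocationList` class; how the list is persisted and loaded is not covered.

```python
class RevocationList:
    """Hypothetical in-memory set of revoked certificate serial numbers."""

    def __init__(self, serials=()):
        self._revoked = set(serials)

    def revoke(self, serial):
        self._revoked.add(serial)

    def is_revoked(self, serial):
        # Set membership is O(1) on average, so it is cheap to run per handshake.
        return serial in self._revoked
```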
Agent upgrade
When the server boots with a newer version, it includes the agent binary URL and checksum in the protocol handshake. The agent downloads the binary, verifies the checksum, writes the new binary atomically, and re-executes itself.
Open operations (tailing logs, running exec) are interrupted but reconnect automatically via retry logic in the client.
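The verify-then-replace part of the upgrade can be sketched as follows. The checksum algorithm is not named above, so SHA-256 is an assumption, and the `install_binary` function is hypothetical. Downloading and re-executing are left out; the sketch covers only checksum verification and the atomic write.

```python
import hashlib
import os
import tempfile


def install_binary(data, expected_sha256, dest_path):
    """Verify the checksum, then atomically replace dest_path with data."""
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_sha256.lower():
        raise ValueError("checksum mismatch")

    # Write to a temp file in the same directory so the final rename
    # stays on one filesystem and is therefore atomic.
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.chmod(tmp_path, 0o755)
        os.replace(tmp_path, dest_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Because the old binary is only replaced by the final rename, a failed download or a checksum mismatch leaves the running version untouched.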
Why outbound-only?
Traditional “server-to-agent” designs have the central server connect to each agent and pull data. dockmesh flips this for practical reasons:
- No inbound firewall holes on agent hosts (NAT, corporate firewalls, home routers all work unchanged)
- Works behind CGNAT (common in home labs)
- Works from anywhere the agent has internet access (coffee shop, traveling laptop, edge locations)
- Agent knows when it has Docker running — no need for server to poll
The tradeoff: the server must be reachable by every agent, so protecting it matters. See Hardening.
Why not gRPC?
The original protocol used gRPC. dockmesh switched to raw WebSocket + MessagePack for:
- Simpler deployment (WebSocket survives corporate proxies that strip HTTP/2)
- Easier debugging (wireshark dissects clean frames)
- Smaller agent binary (no full gRPC runtime)
- Same-ish performance for our small-message workloads
Troubleshooting protocol issues
See Troubleshooting → Agent won’t connect.
For deep debugging, enable debug logging:
    # On the agent
    systemctl set-environment DOCKMESH_LOG_LEVEL=debug
    systemctl restart dockmesh-agent
    journalctl -u dockmesh-agent -f
See also
- Agent mTLS: certificate management
- Multi-Host: user-facing multi-host features
- Hardening: server-side protection