
Smart Scaling

dockmesh scales individual services within a stack horizontally — spinning up or tearing down replicas of the same container. Scaling works on a single host; for spreading replicas across a fleet, combine with Migration.

On a stack’s detail page each scalable service has a Scale action that opens a modal with a replicas slider (0–10) plus the safety-check report. Once you pick a target count, the modal renders any warnings (see Safety checks below). Apply then runs compose up --scale <service>=<n> server-side and streams progress back into the UI.
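As a rough illustration of what Apply amounts to, the sketch below builds the argument list for a scale operation. The function name, the "docker" prefix and the -d flag are assumptions made for this example; the text above only names compose up --scale <service>=<n>, and dockmesh's server-side code may differ.

```python
def build_scale_command(service: str, replicas: int, max_replicas: int = 10) -> list[str]:
    """Return the argv for scaling one service (hypothetical helper).

    The 0..max_replicas bound mirrors the 0-10 slider in the Scale modal.
    """
    if not service:
        raise ValueError("service name must not be empty")
    if not 0 <= replicas <= max_replicas:
        raise ValueError(f"replicas must be between 0 and {max_replicas}")
    # Assumption: the backend shells out to the Docker Compose CLI in
    # detached mode. Only "compose up --scale <service>=<n>" is documented.
    return ["docker", "compose", "up", "-d", "--scale", f"{service}={replicas}"]
```

Calling build_scale_command("web", 3) yields the equivalent of docker compose up -d --scale web=3, to be run from the stack's directory.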

The auto-scaling controller exists in the backend (internal/scaling) and reads its rules from each stack’s .dockmesh.meta.json. A rule looks like this on disk:

{
  "scaling": [
    {
      "service": "web",
      "min_replicas": 2,
      "max_replicas": 10,
      "scale_up": { "metric": "cpu", "threshold_percent": 80, "duration_seconds": 300 },
      "scale_down": { "metric": "cpu", "threshold_percent": 30, "duration_seconds": 600 },
      "cooldown_seconds": 180
    }
  ]
}
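If you edit the file by hand, it can help to sanity-check a rule before redeploying. The following is a minimal sketch, not dockmesh's own loader: the class and function names are hypothetical, and the validation checks (min not above max, known metric names) are assumptions about what a sensible rule looks like.

```python
import json
from dataclasses import dataclass

ALLOWED_METRICS = {"cpu", "memory"}


@dataclass(frozen=True)
class Trigger:
    metric: str
    threshold_percent: float
    duration_seconds: int


@dataclass(frozen=True)
class ScalingRule:
    service: str
    min_replicas: int
    max_replicas: int
    scale_up: Trigger
    scale_down: Trigger
    cooldown_seconds: int


def parse_rules(meta_text: str) -> list[ScalingRule]:
    """Parse the "scaling" array from the text of a .dockmesh.meta.json file."""
    rules = []
    for raw in json.loads(meta_text).get("scaling", []):
        rule = ScalingRule(
            service=raw["service"],
            min_replicas=raw["min_replicas"],
            max_replicas=raw["max_replicas"],
            scale_up=Trigger(**raw["scale_up"]),
            scale_down=Trigger(**raw["scale_down"]),
            cooldown_seconds=raw["cooldown_seconds"],
        )
        # Assumed sanity checks; the real controller may accept or reject more.
        if rule.min_replicas > rule.max_replicas:
            raise ValueError(f"{rule.service}: min_replicas exceeds max_replicas")
        for trigger in (rule.scale_up, rule.scale_down):
            if trigger.metric not in ALLOWED_METRICS:
                raise ValueError(f"{rule.service}: unknown metric {trigger.metric!r}")
        rules.append(rule)
    return rules
```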

metric accepts cpu or memory. Each threshold is a percentage that the controller compares against the per-container average for the service, and it evaluates the rules every 30 seconds. Scaling events land in the audit log.
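To make the rule semantics concrete, here is one plausible reading of how a rule could be evaluated, written as a small sketch. It is not the code in internal/scaling. The function names, the one-replica step size, and the interpretation that a threshold must hold for every 30-second sample across duration_seconds are all assumptions.

```python
EVAL_INTERVAL_SECONDS = 30  # evaluation period stated in the text above


def sustained(samples, duration_seconds, predicate):
    """True if the newest samples covering duration_seconds all satisfy predicate."""
    needed = max(1, duration_seconds // EVAL_INTERVAL_SECONDS)
    window = samples[-needed:]
    return len(window) == needed and all(predicate(s) for s in window)


def decide_replicas(rule, current, samples, seconds_since_last_scale):
    """Return the replica count the rule asks for (hypothetical helper).

    samples holds per-container average percentages, oldest first, one per
    evaluation tick. Assumes both triggers watch the same metric.
    """
    if seconds_since_last_scale < rule["cooldown_seconds"]:
        return current  # still cooling down after the previous change
    up, down = rule["scale_up"], rule["scale_down"]
    if sustained(samples, up["duration_seconds"], lambda s: s > up["threshold_percent"]):
        return min(current + 1, rule["max_replicas"])  # assumed step of one replica
    if sustained(samples, down["duration_seconds"], lambda s: s < down["threshold_percent"]):
        return max(current - 1, rule["min_replicas"])
    return current
```

With the example rule, ten consecutive samples above 80% (300 seconds) would move a service from 2 to 3 replicas, provided at least 180 seconds have passed since the last change.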

There is no rule-editor UI yet — to add or change a rule today, hit the REST endpoint directly (PUT /api/v1/stacks/<name>/scaling-rules) or edit the stack’s .dockmesh.meta.json and redeploy. A proper UI is on the roadmap.
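For the REST route, a request could be prepared along these lines. Only the method and path come from the text above. The base URL, the bearer-token header, and the body shape (reusing the on-disk "scaling" array) are assumptions, so check them against your instance before relying on this.

```python
import json
import urllib.request
from urllib.parse import quote


def build_rules_request(base_url, stack, rules, token=None):
    """Prepare a PUT to /api/v1/stacks/<name>/scaling-rules (hypothetical helper)."""
    url = f"{base_url.rstrip('/')}/api/v1/stacks/{quote(stack, safe='')}/scaling-rules"
    # Assumption: the endpoint accepts the same structure as the meta file.
    body = json.dumps({"scaling": rules}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if token:
        # Assumption: bearer-token auth; your deployment may use another scheme.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, data=body, headers=headers, method="PUT")
```

Passing the returned object to urllib.request.urlopen sends the request. Building it separately makes it easy to inspect the URL and payload first.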

Before scaling — manual or auto — dockmesh runs the same pre-flight (compose.CheckScale) and surfaces:

  • has_container_name — Docker refuses to create a second container with the same name. The compose file must drop the container_name: key to scale beyond 1.
  • has_hard_port — fixed host port bindings ("8080:80") can only bind to one container. Switch to a port range, drop the host port, or move the service behind a reverse proxy.
  • is_stateful + has_volumes — when the image matches a known database pattern (postgres, mysql, mariadb, redis, mongo, etc.) and the service has mounted volumes, dockmesh flags scaling as unsafe. The modal shows the matched image and lets you tick “I understand the risk — proceed anyway” to override.

Manual scaling shows each flag inline as a coloured banner; the auto-scaler refuses to fire when has_container_name or has_hard_port is true (a stateful warning alone doesn’t block it).
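The checks above can be approximated against a parsed compose service definition. This sketch is not compose.CheckScale itself: the function names, the substring match on the image name, and the port-parsing rules are assumptions chosen to illustrate the four flags and the auto-scaler's blocking rule.

```python
STATEFUL_IMAGE_PATTERNS = ("postgres", "mysql", "mariadb", "redis", "mongo")


def has_fixed_host_port(port) -> bool:
    """True when a ports entry pins one host port, e.g. "8080:80"."""
    if isinstance(port, dict):  # long syntax: {"target": 80, "published": 8080}
        published = str(port.get("published") or "")
    else:
        spec = str(port).split("/")[0]  # drop a "/tcp" or "/udp" suffix
        parts = spec.split(":")
        published = parts[-2] if len(parts) >= 2 else ""
    # A host port range such as "8000-8010" leaves room for several replicas.
    return published != "" and "-" not in published


def check_scale(service: dict) -> dict:
    """Compute the safety flags for one compose service (assumed logic)."""
    image = str(service.get("image", "")).lower()
    return {
        "has_container_name": bool(service.get("container_name")),
        "has_hard_port": any(has_fixed_host_port(p) for p in service.get("ports", [])),
        "is_stateful": any(pattern in image for pattern in STATEFUL_IMAGE_PATTERNS),
        "has_volumes": bool(service.get("volumes")),
    }


def auto_scaler_blocked(flags: dict) -> bool:
    """A stateful warning alone does not block the auto-scaler."""
    return flags["has_container_name"] or flags["has_hard_port"]
```

For example, a service with image postgres:16 and a mounted volume raises is_stateful and has_volumes but stays eligible for auto-scaling, whereas adding ports: ["5432:5432"] blocks it.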

The container metrics pipeline records CPU + memory at 30-second granularity and rolls up to 1-minute / 1-hour buckets for retention. A dedicated “replica count over time” chart isn’t shipped yet — for now, track auto-scaling events through the audit log filter (action = scale.*) and pair that with the per-container metrics on the container detail page.
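The roll-up step can be pictured as grouping raw samples into fixed-width time buckets. The sketch below averages the values in each bucket. The choice of mean as the aggregate and the (timestamp, value) sample shape are assumptions, since the text does not say how dockmesh aggregates.

```python
from collections import defaultdict
from statistics import fmean


def roll_up(samples, bucket_seconds):
    """Average (unix_timestamp, value) samples into fixed-width buckets.

    Returns (bucket_start, mean_value) pairs ordered by bucket start.
    """
    buckets = defaultdict(list)
    for timestamp, value in samples:
        # Align each sample to the start of its bucket.
        buckets[timestamp - timestamp % bucket_seconds].append(value)
    return [(start, fmean(values)) for start, values in sorted(buckets.items())]
```

Calling roll_up(samples, 60) produces 1-minute buckets from 30-second samples, and roll_up(samples, 3600) produces the hourly ones.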

  • Alerts — alert on replica count stuck at max
  • Stacks — editing the underlying compose.yaml