Setting up Docker Swarm nodes
Docker Swarm is Docker's native orchestration tool that allows you to cluster and manage a group of Docker engines (nodes) as a single virtual system. This is especially useful for deploying applications at scale, handling load balancing, maintaining high availability, and ensuring automatic recovery in the event of container or node failures.
To begin using Docker Swarm, you must first initialise a Swarm cluster and define which nodes serve as managers and which as workers.
Initialising a Docker Swarm
The process starts by creating the initial Swarm and designating a manager node. On a machine that has Docker installed, you can run the following command to initialise a new Swarm:
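```bash
# If the machine has multiple network interfaces, add
# --advertise-addr <manager-ip> so other nodes know which address to use
docker swarm init
```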
This will make the current machine the first manager node in the Swarm. After running this command, Docker outputs the command to join new nodes to the cluster. For example:
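```bash
# The token and address below are placeholders; your own output
# will contain the real values for your cluster
docker swarm join --token SWMTKN-1-<worker-token> <manager-ip>:2377
```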
This token and address can be shared with other machines to join them as worker nodes. Similarly, there's also a manager join token for joining additional manager nodes to form a high-availability control plane.
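If you need either join command again later, you can print it from an existing manager:

```bash
docker swarm join-token worker
docker swarm join-token manager
```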
Manager Nodes
Manager nodes are responsible for the control and orchestration of the entire Swarm cluster. Their responsibilities include:
- Maintaining the desired state of services
- Orchestrating container deployment across nodes
- Handling Swarm-specific operations (e.g., `docker service` commands)
- Managing Swarm's internal distributed key-value store (Raft)
A manager node can also run containers, just like worker nodes, unless explicitly configured not to.
Worker Nodes
Worker nodes receive tasks from manager nodes and execute them. They do not participate in decision-making or orchestration. Their primary job is to run the containers assigned to them and report back status and metrics to managers.
Workers can also be promoted to managers at any time with:
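```bash
# <node-name> is the node's hostname or ID, as shown by docker node ls
docker node promote <node-name>
```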
And demoted back with:
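```bash
docker node demote <node-name>
```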
High Availability and Best Practices for Managers
While a Swarm can operate with a single manager node, this configuration is not fault-tolerant. If that one node fails, the entire Swarm becomes non-functional from a control perspective — even if the worker nodes continue running containers.
For high availability (HA), it is recommended to have an odd number of manager nodes, typically 3 or 5. This is to ensure Raft consensus can always be achieved with a majority vote during network partitions or failures.
Why 3 or 5 Manager Nodes?
Swarm managers use the Raft protocol to maintain a consistent distributed state. Raft requires a majority of nodes (a quorum) to agree on updates. If you have:
- 3 managers: Swarm can tolerate 1 manager failure (2 still form a majority)
- 5 managers: Swarm can tolerate 2 manager failures
- 7 managers: Swarm can tolerate 3 manager failures, but the extra members add Raft communication overhead
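In general, a Swarm with N managers needs a majority of (N / 2) + 1 managers reachable to maintain quorum, so it can tolerate at most (N - 1) / 2 manager failures (rounded down).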
Using an even number (like 2 or 4) adds no extra fault tolerance over the next-lower odd number, and makes it more likely that a network partition leaves no side with a majority, causing the cluster to stall.
Thus, stick with 3 or 5 managers for HA. Avoid excessive manager nodes to minimise Raft overhead.
Container Deployment on Managers vs Workers
By default, both manager and worker nodes can run containers. This simplifies resource utilisation in small clusters. However, for security and stability in production environments, it's common to restrict managers to only orchestration duties by setting the availability of a manager node to “drain”:
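```bash
# Stop scheduling new tasks on the node; existing service tasks
# are rescheduled onto other nodes
docker node update --availability drain <node-name>
```

Setting the availability back to `active` (`docker node update --availability active <node-name>`) lets the node receive tasks again.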
This prevents new tasks from being scheduled on that node, preserving its performance for cluster control duties only.
Monitoring and Managing the Swarm
Once your Swarm is running, you can monitor its status using:
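```bash
# Must be run on a manager node
docker node ls
```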
This shows all nodes in the cluster, their roles (manager/worker), availability, and status.
To see detailed info about a node:
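```bash
# --pretty prints a human-readable summary instead of raw JSON
docker node inspect <node-name> --pretty
```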
You can leave a Swarm at any time (as long as you're not the last manager) with:
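```bash
docker swarm leave
```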
Managers can also force a leave if necessary:
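```bash
# Forces the node to leave even if it is a manager
docker swarm leave --force
```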