Setting up Docker Swarm nodes
Docker Swarm is Docker's native orchestration tool that allows you to cluster and manage a group of Docker engines (nodes) as a single virtual system. This is especially useful for deploying applications at scale, handling load balancing, maintaining high availability, and ensuring automatic recovery in the event of container or node failures.
To begin using Docker Swarm, you must first initialise a Swarm cluster and define which nodes serve as managers and which as workers.
Initialising a Docker Swarm
The process starts by creating the initial Swarm and designating a manager node. On a machine that has Docker installed, you can run the following command to initialise a new Swarm:
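```bash
# If the machine has multiple network interfaces, add
# --advertise-addr <manager-ip> so other nodes know which address to use
docker swarm init
```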
This will make the current machine the first manager node in the Swarm. After running this command, Docker outputs the command to join new nodes to the cluster. For example:
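```bash
# The token and address below are placeholders; your own output
# will contain the real values for your cluster
docker swarm join --token SWMTKN-1-<worker-token> <manager-ip>:2377
```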
This token and address can be shared with other machines to join them as worker nodes. Similarly, there's also a manager join token for joining additional manager nodes to form a high-availability control plane.
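If you need either join command again later, you can print it from an existing manager:

```bash
docker swarm join-token worker
docker swarm join-token manager
```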
Manager Nodes
Manager nodes are responsible for the control and orchestration of the entire Swarm cluster. Their responsibilities include:
- Maintaining the desired state of services
- Orchestrating container deployment across nodes
- Handling Swarm-specific operations (e.g., `docker service` commands)
- Managing Swarm's internal distributed key-value store (Raft)
A manager node can also run containers, just like worker nodes, unless explicitly configured not to.
Worker Nodes
Worker nodes receive tasks from manager nodes and execute them. They do not participate in decision-making or orchestration. Their primary job is to run the containers assigned to them and report back status and metrics to managers.
Workers can also be promoted to managers at any time with:
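```bash
# <node-name> is the node's hostname or ID, as shown by docker node ls
docker node promote <node-name>
```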
And demoted back with:
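```bash
docker node demote <node-name>
```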
High Availability and Best Practices for Managers
While a Swarm can operate with a single manager node, this configuration is not fault-tolerant. If that one node fails, the entire Swarm becomes non-functional from a control perspective — even if the worker nodes continue running containers.
For high availability (HA), it is recommended to have an odd number of manager nodes, typically 3 or 5. This is to ensure Raft consensus can always be achieved with a majority vote during network partitions or failures.
Why 3 or 5 Manager Nodes?
Swarm managers use the Raft protocol to maintain a consistent distributed state. Raft requires a majority of nodes (a quorum) to agree on updates. If you have:
- 3 managers: Swarm can tolerate 1 manager failure (2 still form a majority)
- 5 managers: Swarm can tolerate 2 manager failures
- 7 managers: Swarm can tolerate 3 manager failures, but the extra members add Raft communication overhead
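In general, a Swarm with N managers needs a majority of (N / 2) + 1 managers reachable to maintain quorum, so it can tolerate at most (N - 1) / 2 manager failures (rounded down).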
Using an even number (like 2 or 4) adds no extra fault tolerance over the next-lower odd number, and makes it more likely that a network partition leaves no side with a majority, causing the cluster to stall.
Thus, stick with 3 or 5 managers for HA. Avoid excessive manager nodes to minimise Raft overhead.
Container Deployment on Managers vs Workers
By default, both manager and worker nodes can run containers. This simplifies resource utilisation in small clusters. However, for security and stability in production environments, it's common to restrict managers to only orchestration duties by setting the availability of a manager node to “drain”:
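```bash
# Stop scheduling new tasks on the node; existing service tasks
# are rescheduled onto other nodes
docker node update --availability drain <node-name>
```

Setting the availability back to `active` (`docker node update --availability active <node-name>`) lets the node receive tasks again.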
This prevents new tasks from being scheduled on that node, preserving its performance for cluster control duties only.
Monitoring and Managing the Swarm
Once your Swarm is running, you can monitor its status using:
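```bash
# Must be run on a manager node
docker node ls
```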
This shows all nodes in the cluster, their roles (manager/worker), availability, and status.
To see detailed info about a node:
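```bash
# --pretty prints a human-readable summary instead of raw JSON
docker node inspect <node-name> --pretty
```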
You can leave a Swarm at any time (as long as you're not the last manager) with:
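```bash
docker swarm leave
```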
Managers can also force a leave if necessary:
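```bash
# Forces the node to leave even if it is a manager
docker swarm leave --force
```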