Docker Swarm (aka Swarm Classic) and SwarmKit are tools for container orchestration. Swarm requires an external discovery mechanism to function; commonly used discovery backends include etcd, consul, and Docker Hub discovery tokens (not recommended in production). SwarmKit, on the other hand, has discovery built in and does not require setting up a separate discovery backend.
Some projects might want to use both Swarm and SwarmKit, which means that, in addition to SwarmKit’s built-in discovery, they must still set up an external discovery mechanism (such as etcd), adding complexity and additional points of failure. Moreover, Swarm and SwarmKit might report node states differently, leading to inconsistent views of the cluster.
This issue describes a proposal to make SwarmKit the default discovery backend for Swarm.
High-level UX
Swarm managers are Go processes that read cluster membership information from a discovery backend. Starting a Swarm manager currently looks like this (the command below is an illustrative reconstruction):
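# Illustrative reconstruction; the exact flags and port mapping may differ.
docker run -d -p 2376:2375 swarm manage consul://192.168.99.103:8500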
where consul://192.168.99.103:8500 specifies the address and port of the discovery backend. Swarm agents (each backed by a Docker Engine) can then be run to advertise the corresponding Engine’s address to the discovery backend:
docker run -d swarm join --advertise=172.30.0.69:2375 consul://192.168.99.103:8500
This step may not be required if the Engine itself advertises to the backend; the key requirement is that the discovery backend can provide the list of agent Engines.
Since SwarmKit (and its built-in discovery) is integrated into the Docker Engine, we require that each Swarm manager be backed by a corresponding Docker Engine that is also a SwarmKit manager in the cluster. We call this the manager’s “local” Engine. With that requirement, the Swarm manager would be passed a Docker socket so that it can directly query the Docker API of that Engine. The workflow may look something like this (the docker:// scheme below is hypothetical, sketching the proposed discovery URI):
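# Sketch only: the docker:// discovery URI below is a hypothetical rendering
# of the proposed scheme for pointing Swarm at the local Engine's socket.
docker run -d -p 2376:2375 -v /var/run/docker.sock:/var/run/docker.sock \
    swarm manage docker:///var/run/docker.sock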
This means the Swarm manager is listening for requests at 192.168.99.103:2376 and using the Docker socket of its “local” Engine for discovery. This may become the default way to run Swarm, as token-based discovery is deprecated (#2743) and not considered safe for production.
Design Details
Here are some high-level design details for how this would be implemented.
When the Swarm manager starts up, it would query the Docker API to acquire the list of nodes along with their roles, giving it a view of the other managers and agents in the cluster.
This list should be stored in memory on the manager.
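As a rough sketch of this startup query, assuming the Docker Go client (the listClusterNodes helper is hypothetical):

package discovery // hypothetical package name for this sketch

import (
	"context"
	"fmt"

	"github.com/docker/docker/api/types"
	"github.com/docker/docker/client"
)

// listClusterNodes queries GET /nodes on the local Engine, which returns
// every node in the SwarmKit cluster together with its role.
func listClusterNodes(ctx context.Context, cli *client.Client) error {
	nodes, err := cli.NodeList(ctx, types.NodeListOptions{})
	if err != nil {
		return err
	}
	for _, n := range nodes {
		// n.Spec.Role is "manager" or "worker"; n.Status.State is
		// e.g. "ready" or "down".
		fmt.Printf("%s role=%s state=%s\n", n.ID, n.Spec.Role, n.Status.State)
	}
	return nil
}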
It would then start watching for SwarmKit events. The list of events is described in “API: Events stream” (moby/swarmkit#491 (comment)). In particular, the relevant ones are:
node join/remove
node promote/demote
node status change (Ready, Down)
node availability change (Active, Pause, Drain)
node leadership change
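As a rough sketch, a manager could consume these from its local Engine’s event stream using the Docker Go client, assuming the Engine surfaces SwarmKit node changes as events of type "node" (the watchNodeEvents helper is hypothetical):

package discovery // hypothetical package name for this sketch

import (
	"context"
	"fmt"

	"github.com/docker/docker/api/types"
	"github.com/docker/docker/api/types/filters"
	"github.com/docker/docker/client"
)

// watchNodeEvents streams node-scoped events from the local Engine until
// the context is cancelled.
func watchNodeEvents(ctx context.Context, cli *client.Client) {
	// Restrict the stream to node events; joins/removals, promotions/
	// demotions, and status changes arrive as create/remove/update actions.
	f := filters.NewArgs()
	f.Add("type", "node")
	msgs, errs := cli.Events(ctx, types.EventsOptions{Filters: f})
	for {
		select {
		case m := <-msgs:
			// m.Actor.ID is the node ID; Swarm would update its
			// in-memory view of the cluster here.
			fmt.Printf("node event: %s %s\n", m.Action, m.Actor.ID)
		case err := <-errs:
			fmt.Println("event stream error:", err)
			return
		case <-ctx.Done():
			return
		}
	}
}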
Based on events coming in, Swarm should update its view of the cluster in memory.
The new discovery backend will be based on the following Backend interface (https://github.com/moby/moby/blob/master/pkg/discovery/discovery.go):
// Backend is implemented by discovery backends which manage cluster entries.
type Backend interface {
// Watcher must be provided by every backend.
Watcher
// Initialize the discovery with URIs, a heartbeat, a ttl and optional settings.
Initialize(string, time.Duration, time.Duration, map[string]string) error
// Register to the discovery.
Register(string) error
}
Watcher is defined as
// Watcher provides watching over a cluster for nodes joining and leaving.
type Watcher interface {
// Watch the discovery for entry changes.
// Returns a channel that will receive changes or an error.
// Providing a non-nil stopCh can be used to stop watching.
Watch(stopCh <-chan struct{}) (<-chan Entries, <-chan error)
}
In this case, we need to define the Watcher based on the Docker socket, while Register is not particularly relevant here.
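A minimal sketch of such a Watcher, assuming the Docker Go client and a simple polling loop (a real implementation would consume the Events stream instead); the dockerSocketBackend type and the fixed port are assumptions:

package swarmkitdiscovery // hypothetical package name for this sketch

import (
	"context"
	"time"

	"github.com/docker/docker/api/types"
	"github.com/docker/docker/client"
	"github.com/docker/docker/pkg/discovery"
)

// dockerSocketBackend reads cluster membership from the local Engine
// instead of an external store such as consul or etcd.
type dockerSocketBackend struct {
	cli       *client.Client
	heartbeat time.Duration
}

// Watch polls the Engine's node list and emits a snapshot of entries on
// each tick, satisfying the Watcher interface above.
func (b *dockerSocketBackend) Watch(stopCh <-chan struct{}) (<-chan discovery.Entries, <-chan error) {
	ch := make(chan discovery.Entries)
	errCh := make(chan error)
	go func() {
		defer close(ch)
		defer close(errCh)
		ticker := time.NewTicker(b.heartbeat)
		defer ticker.Stop()
		for {
			nodes, err := b.cli.NodeList(context.Background(), types.NodeListOptions{})
			if err != nil {
				errCh <- err
			} else {
				entries := make(discovery.Entries, 0, len(nodes))
				for _, n := range nodes {
					// Port 2375 is an assumption; the advertised Engine
					// port would come from configuration in practice.
					entries = append(entries, &discovery.Entry{Host: n.Status.Addr, Port: "2375"})
				}
				ch <- entries
			}
			select {
			case <-stopCh:
				return
			case <-ticker.C:
			}
		}
	}()
	return ch, errCh
}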
Leader Election
It is important to ensure that leader election can take place in a way that does not cause inconsistencies or race conditions in the cluster. SwarmKit is the source of truth here, and should also determine which manager node is the leader. The key requirement is that there should be a 1:1 correspondence between Swarm managers and SwarmKit managers, including their roles (Leader or not).
There are two competing ideas for leader election:
1. Swarm managers are subscribed to SwarmKit events, so they update their leadership status based on received events. The upside is that this is low-overhead; the downside is possible inconsistency if events to one or more managers are delayed. With this option, it would still be advisable to refresh membership information periodically (say, every 30 seconds) by querying the Docker API.
2. A Swarm manager that is about to make an API call first checks whether its “local” Engine is the leader. If not, it delegates the request to a different manager (which performs the same check). This ensures that a manager that isn’t a Leader will not make API calls to the Swarm cluster. The obvious downside is an extra API call each time, but as fewer operations depend on Swarm, this may not matter much. A sketch of this check follows the list.
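For the second option, the leader check against the “local” Engine could look roughly like the following sketch, assuming the Docker Go client (isLocalEngineLeader is a hypothetical helper):

package discovery // hypothetical package name for this sketch

import (
	"context"

	"github.com/docker/docker/client"
)

// isLocalEngineLeader reports whether the manager's "local" Engine is
// currently the SwarmKit leader.
func isLocalEngineLeader(ctx context.Context, cli *client.Client) (bool, error) {
	info, err := cli.Info(ctx)
	if err != nil {
		return false, err
	}
	// Inspect our own node; ManagerStatus is only populated for managers
	// and carries the Leader flag.
	node, _, err := cli.NodeInspectWithRaw(ctx, info.Swarm.NodeID)
	if err != nil {
		return false, err
	}
	return node.ManagerStatus != nil && node.ManagerStatus.Leader, nil
}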
TLS
It looks like the discovery backend should be able to support TLS certs (moby/moby#16644). Any application that has access to SwarmKit certs should be able to provide those when starting up Swarm managers.
SwarmKit changes
As of now, it does not seem like any SwarmKit changes are necessary to make this proposal possible. The only requirement from SwarmKit is the ability to report full cluster information, both from the API and via Events, which exists currently.
Impact on Swarm Users
The key implication of these changes for the Swarm project and its users is that the default discovery backend might become SwarmKit, which requires a 17.06 or newer “local” Engine backing each Swarm manager. Users of older Docker Engines can continue to set up a consul or etcd cluster as their discovery backend, so the existing workflow remains available and there is no impact or breakage for those users.
Timeline
This feature is targeted for the next major release: 1.3.0