The TL;DR version: If you use Docker restart policies you probably want to use on-failure.
Restarting Docker Containers
Docker has a couple of documented approaches to support restarting containers automatically “when they exit”.
I felt the documentation wasn’t clear about what “when they exit” means, so I’m sharing a clarification in this short post.
At Pivot Freight, our current approach (part of our V1 Docker infrastructure which we’ll detail in a following set of posts) is to push customizations into the containers themselves and have the least amount of configuration changes within the hosts, apart from hardening changes. We’re also running a single service per container, so our setup is essentially as follows:
- Within the container use supervisor to manage the service (e.g. a JVM-based service) lifecycle
- The image’s CMD command sets up an environment-specific configuration symlink and starts supervisor with the service supervisor configuration pre-populated on the image
- When deploying new images, we have rolling automation in place that stops the running container, pulls the new image and starts a new container
- Use Docker restart policies to control the lifecycle of the container for non-deployment use cases
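As a rough sketch of the rolling step in the list above, the following builds the Docker CLI commands for replacing a single container. The container name, the image tag, and the docker rm step are assumptions made for illustration; this is not our actual automation.

```python
def deploy_commands(name, image, restart_policy="on-failure"):
    """Return the docker CLI invocations for replacing one container.

    `name` and `image` are hypothetical placeholders. The `docker rm` step is
    an assumption, added so that the container name can be reused.
    """
    return [
        ["docker", "stop", name],   # stop the running container
        ["docker", "rm", name],     # assumed: free the name for reuse
        ["docker", "pull", image],  # pull the new image
        # start a new container under the chosen restart policy
        ["docker", "run", "-d", "--name", name,
         "--restart=" + restart_policy, image],
    ]


if __name__ == "__main__":
    # Hypothetical names, printed rather than executed.
    for cmd in deploy_commands("my-service", "example/my-service:latest"):
        print(" ".join(cmd))
```

Returning the commands as argument lists keeps the sketch free of side effects; each list could be handed to subprocess.check_call unchanged.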
In other words, we expect that the service running within the container may have to be automatically restarted due to a failure or other unforeseen conditions. We only expect the container itself to have to be automatically restarted in the event of the Docker daemon being restarted or the host being rebooted.
As described above, we use supervisor for the former use case, and we use Docker restart policies for the latter.
Docker Restart Policies
When I first read the documentation for the Docker restart policies I wasn’t sure they would meet our needs, specifically:
- on-failure[:max-retries]: Restart only if the container exits with a non-zero exit status.
- always: Always restart the container regardless of the exit status. When you specify always, the Docker daemon will try to restart the container indefinitely.
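To make the two policy values concrete, here is a minimal sketch that formats the --restart flag for docker run. The helper name is hypothetical, and it deliberately handles only the two policies discussed in this post.

```python
def restart_flag(policy, max_retries=None):
    """Format the `--restart` flag for `docker run`.

    Hypothetical helper: only `always` and `on-failure` are handled, and
    `max_retries` is only accepted for `on-failure`.
    """
    if policy == "always":
        if max_retries is not None:
            raise ValueError("max-retries only applies to on-failure")
        return "--restart=always"
    if policy == "on-failure":
        if max_retries is None:
            return "--restart=on-failure"
        if max_retries < 0:
            raise ValueError("max-retries must not be negative")
        return "--restart=on-failure:%d" % max_retries
    raise ValueError("unsupported policy: %r" % (policy,))
```

For example, restart_flag("on-failure", 5) yields --restart=on-failure:5, which caps the number of restart attempts at five.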
I wanted to stay away from the --restart=always policy because, despite the ever-increasing back-off pause, I would rather have an eventual hard failure of the container in any case.
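For a sense of what that back-off looks like: the Docker run reference describes the delay as doubling before each restart, starting at 100 milliseconds. The sketch below only illustrates that schedule and assumes no cap or reset, which the daemon may well apply.

```python
def backoff_delays_ms(restarts, initial_ms=100):
    """Delay in milliseconds before each of the first `restarts` restarts.

    Assumes a plain doubling schedule with no upper bound.
    """
    return [initial_ms * 2 ** i for i in range(restarts)]


print(backoff_delays_ms(5))  # [100, 200, 400, 800, 1600]
```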
As it turns out, those policies were not behaving the way I was expecting from reading the documentation.
When running under the always policy, I would expect that docker kill would cause the container to be restarted, but it doesn’t. That was a bit surprising. A docker stop will not cause a restart either, but that makes sense.
When running under the on-failure policy, I wouldn’t expect the container to be restarted when the Docker daemon is restarted nor when the host is rebooted. I also wouldn’t expect a container to be restarted once a docker stop has been issued followed by a Docker daemon restart. However, in all three cases, the container will be restarted.
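The observations above can be summarized as a small lookup table. This only records what I saw, not a specification of Docker’s behavior, and the event labels are my own shorthand.

```python
# (policy, event) -> was the container restarted?
# Only the combinations described in this post are listed.
OBSERVED = {
    ("always", "docker kill"): False,
    ("always", "docker stop"): False,
    ("on-failure", "daemon restart"): True,
    ("on-failure", "host reboot"): True,
    ("on-failure", "docker stop, then daemon restart"): True,
}


def was_restarted(policy, event):
    """Look up an observation; raises KeyError for untested combinations."""
    return OBSERVED[(policy, event)]
```

Raising KeyError for anything not in the table is deliberate: combinations I did not test should not be guessed at.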
While the on-failure behavior was great news for us, I can’t help but think there is a discrepancy here. If a docker kill does not cause a restart under the always policy, this would indicate it’s “regardless of the exit status”, except when the Docker daemon sent the signal, but that’s probably explainable. What is more surprising is that a docker stop followed by a Docker daemon restart is considered a “non-zero exit status” under the on-failure policy.
We’re now running with the on-failure policy, which meets our current needs perfectly. As far as the long-running containers are concerned, we want them restarted when the Docker daemon (re-)starts or the host is rebooted.
I will readily admit that the behavior may square completely with which exit codes are returned by the container (as opposed to the service running within) under various circumstances, but I thought the documentation could use a few additional common use cases.
This applies to the latest released Docker version as of this writing, version 1.8.3.