Once you’ve shipped an application to production, there will come a day when it’s not working as expected. It’s always nice to know ahead of time what to expect when that day comes. It’s also important to have a good understanding of debugging containers before moving on to more complex deployments. Without debugging skills, it will be difficult to see where orchestration systems have gone wrong. So let’s take a look at debugging containers.
In the end, debugging a containerized application is not all that different from debugging a normal process on a system except that the tools are somewhat different. Docker provides some pretty nice tooling to help you out! Some of these map to regular system tools, and some go further.
It is also critical to understand that your application is not running in a separate system from the other Docker processes. They share a kernel, likely a filesystem, and depending on your container configuration, they may share network interfaces. That means you can get a lot of information about what your container is doing.
If you’re used to debugging applications in a virtual machine environment, you might think you would need to enter the container to inspect in detail an application’s memory or CPU use, or to debug its system calls. Not so! Despite feeling in many ways like a virtualization layer, processes in containers are just processes on the Docker host itself. If you want to see a process list across all of the Docker containers on a machine, you can just run ps with your favorite command-line options right on the server, for example. By using the docker top command, you can see the process list as your container understands it. Let’s take a look at some of the things that you can do when debugging a containerized application that don’t require the use of docker exec and nsenter.
One of the first things you’ll want to know when debugging a container is what is running inside it.
Docker has a built-in command for doing that: docker top. This is convenient because it works even from remote hosts, since it is exposed over the Docker Remote API. It's not the only way to see what's going on inside a container, but it is the easiest to use. Let's see how that works:
$ docker ps
CONTAINER ID  IMAGE        COMMAND    ...  NAMES
106ead0d55af  test:latest  /bin/bash  ...  clever_hypatia

$ docker top 106ead0d55af
UID       PID   PPID  C  STIME  TTY  TIME      CMD
root      4548  1033  0  13:29  ?    00:00:00  /bin/sh -c nginx
root      4592  4548  0  13:29  ?    00:00:00  nginx: master process nginx
www-data  4593  4592  0  13:29  ?    00:00:00  nginx: worker process
www-data  4594  4592  0  13:29  ?    00:00:00  nginx: worker process
www-data  4595  4592  0  13:29  ?    00:00:00  nginx: worker process
www-data  4596  4592  0  13:29  ?    00:00:00  nginx: worker process
To run docker top, we need to pass it the ID of our container, so we get that from docker ps. We then pass that to docker top and get a nice listing of what is running inside our container, ordered by PID just as we’d expect from Linux ps output.
There are some oddities here, though. The primary one is namespacing of user IDs and filesystems.
For example, a user might exist in a container's /etc/passwd file but not exist on the host machine at all, or might have an entirely different user ID on the host than inside the container. When that user runs a process in a container, the ps output on the host machine will show a numeric ID rather than a username. In some cases, two containers might have users squatting on the same numeric ID, or an ID that maps to a completely different user on the host system. For example, if haproxy were installed on the host system, it would be possible for nginx in the container to appear to be running as the haproxy user on the host.
Let’s look at a more concrete example. Let’s consider a production Docker server running CentOS 7 and a container running on it that has an Ubuntu distribution inside. If you ran the following commands on the CentOS host, you would see that UID 7 is named halt:
$ id 7
uid=7(halt) gid=0(root) groups=0(root)
There is nothing special about the UID number we are using here. You don’t need to take any particular note of it. It was chosen simply because it is used by default on both platforms but represents a different username.
If you then enter the standard Ubuntu container on that Docker host, you will see that UID 7 is assigned to lp in /etc/passwd. By running the following commands, you can see that the container has a completely different view of who UID 7 is:
$ docker run -i -t ubuntu:latest /bin/bash
root@f86f8e528b92:/# grep x:7: /etc/passwd
lp:x:7:7:lp:/var/spool/lpd:/usr/sbin/nologin
root@f86f8e528b92:/# id lp
uid=7(lp) gid=7(lp) groups=7(lp)
root@409c2a8216b1:/# exit
If we then run ps ua on the Docker host while that container is running as UID 7 (-u 7), we'll see that the host attributes the container's process to halt instead of lp:
$ docker run -d -u 7 ubuntu:latest sleep 1000
5525...06c6

$ ps ua | grep sleep
1185 halt     sleep 1000
1192 root     grep sleep
This could be particularly confusing if a well-known user like nagios or postgres were configured on the host system but not in the container, yet the container ran its process with the same ID. This namespacing can make the ps output look quite strange. It might, for example, look like the nagios user on your Docker host is running the postgresql daemon that was launched inside a container, if you don’t pay close attention.
One solution to this is to dedicate a nonzero UID to your containers. On your Docker hosts, you can create a container user as UID 5000 and then create the same user in your base container images. If you then run all your containers as UID 5000 (-u 5000), not only will you improve the security of your system by not running container processes as UID 0, but you will also make the ps output on the Docker host easier to decipher by displaying the container user for all of your running container processes. Some systems use the nobody or daemon user for the same purpose, but we prefer container for clarity. There is a little more detail about how this works in “Namespaces”.
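Here is a minimal sketch of that idea on the host side. The UID 5000 and the username container are just the convention described above, so adjust both to match your environment, and remember to create the same user in your base images as well:

# On each Docker host, create the matching user (a sketch; the name and UID
# are just the convention suggested above):
$ sudo useradd --uid 5000 --no-create-home --shell /bin/false container

# Then run your containers as that UID:
$ docker run -d -u 5000 ubuntu:latest sleep 1000

# Host-side ps output will now attribute the process to "container":
$ ps ua | grep sleep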
Likewise, because the process has a different view of the filesystem, paths that are shown in the ps output are relative to the container and not the host. In these cases, knowing it is in a container is a big win.
So that’s how you use the Docker tooling to look at what’s running in a container. But that’s not the only way, and in a debugging situation it might not be the best way. If you hop onto a Docker server and run a normal Linux ps to see what’s running, you get a full list of everything, containerized or not, as if they were all equivalent processes. There are ways to format the process output that make things much clearer, though. For example, you can look at the Linux ps output in tree form so that you can see all of the processes descended from Docker. Here’s what that might look like when you use the BSD command-line flags; we’ll chop the output down to just the part we care about:
$ ps axlfww
... /usr/bin/dockerd -H fd://
...  \_ docker-containerd -l unix:///var/run/docker/l...
...  |   \_ docker-containerd-shim b668353c3af5d62350...
...  |   |   \_ /usr/bin/cadvisor -logtostderr
...  |   \_ docker-containerd-shim dd72ecf1c9e4c22bf7...
...  |       \_ /bin/s6-svscan /etc/services
...  |           \_ s6-supervise nginx.svc
...  |               \_ ./nginx
...  \_ /usr/bin/docker-proxy -proto tcp -host-ip 0.0...
...  \_ /usr/bin/docker-proxy -proto tcp -host-ip 0.0...
Many of the ps variants in the preceding examples work only on Linux distributions that ship the full ps command. Some stripped-down Linux distributions run BusyBox, whose ps does not support all of these options and won't show some of this output. We recommend running a full distribution like CoreOS, Ubuntu, or CentOS on your host systems.
Here you can see that we’re running one Docker daemon and two instances of the docker-proxy, which we will discuss in more detail in “Network Inspection”. There is one containerd running, which is the main container runtime inside Docker. Everything else under those processes represents Docker containers and processes inside them. In this example, we have two containers. They show up as docker-containerd-shim followed by the container ID. In this case, we are running one instance of Google’s cadvisor, and one instance of nginx in another container, being supervised by the S6 supervisor. Each container has a related docker-proxy process that is used to map the required network ports between the container and the host Docker server.
Because of the tree output from ps, it’s pretty clear how they are related to each other, and we know they’re running in a container because they are in dockerd’s process tree. If you’re a bigger fan of Unix SysV command-line flags, you can get a similar, but not as nice-looking, tree output with ps -ejH:
$ ps -ejH
40643 ... docker
43689 ...   docker
43697 ...   docker
43702 ...     start
43716 ...       java
46970 ...   docker
46976 ...     supervisord
46990 ...       supervisor_remo
46991 ...       supervisor_stdo
46992 ...       nginx
47030 ...         nginx
47031 ...         nginx
47032 ...         nginx
47033 ...         nginx
46993 ...       ruby
47041 ...         ruby
47044 ...         ruby
You can get a more concise view of the docker process tree by using the pstree command. Here, we’ll use pidof to scope it to the tree belonging to docker:
$ pstree `pidof dockerd`
dockerd...─docker-containe──cadvisor─┬──15*[{cadvisor}]
...        │                         └─9*[{docker-containe}]
...        └─docker-containe─┬─s─nginx───13*[{nginx}...]
...                          └─9*[{docker-containe}]
This doesn’t show us PIDs, so it’s useful only for getting a sense of how things hang together in our containers. But this is pretty nice output when there are a lot of containers running on a host. It’s far more concise and provides a nice high-level map of how things connect. Here we can see the same containers that were shown in the previous ps output, but the tree is collapsed so we get multipliers like 15* when there are 15 duplicate processes.
We can actually get a full tree with PIDs if we run pstree, as shown here:
$ pstree -p `pidof dockerd`
dockerd(4086)... ─┬─dockerd(6529)─┬─{dockerd}(6530)
... │ ├─...
... │ └─{dockerd}(6535)
... ├─...
... ├─mongod(6675)─┬─{mongod}(6737)
... │ ├─...
... │ └─{mongod}(6756)
... ├─redis-server(6537)─┬─{redis-server}(6576)
... │ └─{redis-server}(6577)
... ├─{dockerd}(4089)
... ├─...
... └─{dockerd}(6738)
This output provides us with a very good look at all the processes attached to Docker and what they are running. It is, however, difficult to see the docker-proxy in this output, since it is really just another forked dockerd process.
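If you do need to pick the docker-proxy instances out of that noise, one option (a sketch using only standard ps flags) is to select them by command name; the docker-proxy command line includes the host and container ports it is forwarding:

# List only the docker-proxy processes, with their parent PIDs
$ ps axo pid,ppid,command | grep '[d]ocker-proxy'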
If you’re logged in to the Docker server, you can inspect running processes in many of the same ways that you would on the host. Common debugging tools like strace work as expected. In the following code, we’ll inspect a unicorn process running inside a Ruby web app container:
$ strace -p 31292
Process 31292 attached - interrupt to quit
select(11, [10], NULL, [7 8], {30, 103848}) = 1 (in [10], left {29, 176592})
fcntl(10, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK)
accept4(10, 0x7fff25c17b40, [128], SOCK_CLOEXEC) = -1 EAGAIN (...)
getppid() = 17
select(11, [10], NULL, [7 8], {45, 0}) = 1 (in [10], left {44, 762499})
fcntl(10, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK)
accept4(10, 0x7fff25c17b40, [128], SOCK_CLOEXEC) = -1 EAGAIN (...)
getppid() = 17
You can see that we get the same output that we would from noncontainerized processes on the host. Likewise, running lsof against the process shows that its open files and sockets look just as we'd expect:
$ lsof -p 31292
COMMAND ... NAME
ruby    ... /data/app
ruby    ... /
ruby    ... /usr/local/rbenv/versions/2.1.1/bin/ruby
ruby    ... /usr/.../iso_8859_1.so (stat: No such file or directory)
ruby    ... /usr/.../fiber.so (stat: No such file or directory)
ruby    ... /usr/.../cparse.so (stat: No such file or directory)
ruby    ... /usr/.../libsasl2.so.2.0.23 (path dev=253,0, inode=1443531)
ruby    ... /lib64/libnspr4.so (path dev=253,0, inode=655717)
ruby    ... /lib64/libplc4.so (path dev=253,0, inode=655718)
ruby    ... /lib64/libplds4.so (path dev=253,0, inode=655719)
ruby    ... /usr/lib64/libnssutil3.so (path dev=253,0, inode=1443529)
ruby    ... /usr/lib64/libnss3.so (path dev=253,0, inode=1444999)
ruby    ... /usr/lib64/libsmime3.so (path dev=253,0, inode=1445001)
ruby    ... /usr/lib64/libssl3.so (path dev=253,0, inode=1445002)
ruby    ... /lib64/liblber-2.4.so.2.5.6 (path dev=253,0, inode=655816)
ruby    ... /lib64/libldap_r-2.4.so.2.5.6 (path dev=253,0, inode=655820)
Note that the paths to the files are all relative to the container’s view of the backing filesystem, which is not the same as the host view. Therefore, the version of the file on the host will not match the one the container sees. In this case, it’s probably best to enter the container using docker exec to look at the files with the same view that the processes inside it have.
It’s possible to run the GNU debugger (gdb) and other process inspection tools in the same manner as long as you’re root and have proper permissions to do so.
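For example, here's a hedged sketch of attaching gdb to the same unicorn worker from the host. It assumes gdb is installed there and that you have root (or the CAP_SYS_PTRACE capability):

$ sudo gdb -p 31292
(gdb) info proc      # confirm which process we are attached to
(gdb) bt             # grab a backtrace of where it is right now
(gdb) detach         # let the process continue running
(gdb) quit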
When you have a shell directly on the Docker server, you can, in many ways, treat containerized processes just like any other process running on the system. If you’re remote, you might send signals with docker kill because it’s expedient. But if you’re already logged in to a Docker server for a debugging session or because the Docker daemon is not responding, you can just kill the process like you would any other.
Note that unless you kill the top-level process in the container (PID 1 inside the container), killing a process will not terminate the container itself. That might be desirable if you're killing a runaway process, but it can leave the container in an unexpected state. Developers probably expect that all of a container's processes are running if they can see it in docker ps, and it could also confuse a scheduler like Mesos or Kubernetes, or any other system that is health-checking your application. Keep in mind that containers are supposed to look to the outside world like a single bundle. If you need to kill off something inside the container for any reason other than debugging, it's best to replace the whole container. Containers offer an abstraction that other tools interoperate with, and those tools expect the internals of the container to remain consistent.
Terminating processes is not the only reason to send signals. And since containerized processes are just normal processes in many respects, they can be passed the whole array of Unix signals listed in the manpage for the Linux kill command. Many Unix programs will perform special actions when they receive certain predefined signals. For example, nginx will reopen its logs when receiving a SIGUSR1 signal. Using the Linux kill command, you can send any Unix signal to a container process on the local Docker server.
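As a quick sketch, you could take the host PID of the nginx master process from the earlier docker top output and ask it to reopen its logs:

# 4592 is the nginx master's PID as seen from the host (from docker top above)
$ sudo kill -USR1 4592

# docker kill can also deliver an arbitrary signal by name, but it always
# targets the container's main process (PID 1), which in that example is the
# /bin/sh wrapper rather than nginx itself:
$ docker kill --signal=USR1 106ead0d55af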
Unless you run an orchestrator like Kubernetes, which can handle multiple containers in a larger abstraction like a pod, we consider it a best practice to run some kind of process control in your production containers. Whether it is systemd, upstart, runit, s6, or your own homegrown tools, this allows you to treat containers atomically even when they contain more than one process. You should, however, try very hard not to run more than one thing inside your container, to ensure that your container is scoped to one well-defined task and does not grow into a monolithic container.
But in either case, you want docker ps to reflect the presence of the whole container and don’t want to worry if one of the processes inside it has died. If you can assume that the presence of a container and absence of error logs means that things are working, you can treat docker ps output as the truth about what’s happening on your Docker systems. It also means any orchestration system you use can do the same.
It is also a good idea to ensure that you understand the complete behavior of your preferred process control service, including memory and disk utilization, Unix signal handling, and so on, since this can impact your container's performance and behavior. Generally, the lightest-weight systems are the best.
Because containers work just like any other process, it’s important to understand how they can interact with your application in a less helpful way. There are some special needs in a container for processes that spawn background children—that is, anything that forks and daemonizes so the parent no longer manages the child process lifecycle. Jenkins build containers are one common example where people see this go wrong. When daemons fork into the background, they become children of PID 1 on Unix systems. Process 1 is special and is usually an init process of some kind.
PID 1 is responsible for making sure that children are reaped. In your container, by default your main process will be PID 1. Since you probably won’t be handling reaping of children from your application, you can end up with zombie processes in your container. There are a few solutions to this problem. The first is to run an init system in the container of your own choosing—one that is capable of handling PID 1 responsibilities. s6, runit, and others described in the preceding Note can be easily used inside the container.
But Docker itself provides an even simpler option that solves just this one case without taking on all the capabilities of a full init system. If you provide the --init flag to docker run, Docker will launch a very small init process based on the tini project that will act as PID 1 inside the container on startup. Whatever you specify in your Dockerfile as the CMD is passed to tini and otherwise works in the same way you would expect. It does, however, replace anything you might have in the ENTRYPOINT section of your Dockerfile.
When you launch a Docker container without the --init flag, you get something like this in your process list:
$ docker run -i -t alpine:3.6 sh
/ # ps -ef
PID USER TIME COMMAND
1 root 0:00 sh
5 root 0:00 ps -ef
/ # exit
Notice that in this case, the CMD we launched is PID 1. That means it is responsible for child reaping. If we're launching a container where that matters, we can pass --init to make sure that exited and orphaned child processes get reaped properly:
$ docker run -i -t --init alpine:3.6 sh
/ # ps -ef
PID USER TIME COMMAND
1 root 0:00 /dev/init -- sh
5 root 0:00 sh
6 root 0:00 ps -ef
/ # exit
Here, you can see that the PID 1 process is actually /dev/init. That has in turn launched the shell binary for us as specified on the command line. Because we now have an init system inside the container, the PID 1 responsibilities fall to it rather than the command we used to invoke the container. In most cases this is what you want. You may not need an init system, but it’s small enough that you should consider having at least tini inside your containers in production.
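If you're not sure what an existing container ended up with as PID 1, a quick sketch like the following will tell you without opening an interactive shell. The container name is hypothetical, and it assumes the image ships some form of ps (BusyBox's is fine):

# Print the process table as the container sees it; PID 1 is the first entry
$ docker exec my_container ps -o pid,comm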
Compared to process inspection, debugging containerized applications at the network level can be more complicated. Unlike traditional processes running on the host, Docker containers can be connected to the network in a number of ways. If you are running the default setup, as the vast majority of people are, then your containers are all connected to the network via the default bridge network that Docker creates. This is a virtual network where the host is the gateway to the rest of the world. We can inspect these virtual networks with the tooling that ships with Docker. You can get it to show you which networks exist by calling the docker network ls command:
$ docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
a4ea6aeb7503        bridge              bridge              local
b5b5fa6889b1        host                host                local
08b8b30a20da        none                null                local
Here we can see the default bridge network; the host network, which is for any containers running in host network mode (where containers share the same network namespace as the host); and the none network, which disables network access entirely for the container. If you use docker-compose or other orchestration tools, they may create additional networks here with different names.
But seeing which networks exist doesn’t make it any easier to see what’s on those networks. So, you can see which containers are attached to any particular named network with the docker network inspect command. This produces a fair amount of output. It shows you all of the containers that are attached to the network specified, and a number of details about the network itself. Let’s take a look at the default bridge network:
$ docker network inspect bridge
[{"Name":"bridge",..."Driver":"bridge","EnableIPv6":false,..."Containers":{"6a0f439...9a9c3":{"Name":"inspiring_johnson",..."IPv4Address":"172.17.0.2/16","IPv6Address":""},"8720cc2...e91b5":{"Name":"zealous_keller",..."IPv4Address":"172.17.0.3/16","IPv6Address":""}},"Options":{"com.docker.network.bridge.default_bridge":"true",..."com.docker.network.bridge.host_binding_ipv4":"0.0.0.0","com.docker.network.bridge.name":"docker0",...},"Labels":{}}]
We’ve excluded some of the details here to shrink the output a bit. But what we can see is that there are two containers on the bridge network, and they are attached to the docker0 bridge on the host. We can also see the IP addresses of each container, and the host network address they are bound to (host_binding_ipv4). This is useful when you are trying to understand the internal structure of the bridged network. Note that if you have containers on different networks, they may not have connectivity to each other, depending on how the networks were configured.
In general we recommend leaving your containers on the default bridge network unless you have a good reason not to, or are running docker-compose or a scheduler that constructs networks in its own manner. In addition, naming your containers in some identifiable way really helps here because we can’t see the image information. The name and ID are the only reference we have in this output that can tie us back to the docker ps listing. Some schedulers don’t do a good job of naming containers, which is too bad because it can be really helpful for debugging.
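When the full JSON is too noisy, a Go template can pull out just the pieces you care about. This is a sketch using docker network inspect's --format flag; the template fields match the output shown above:

# List each container's name and IP address on the bridge network, one per line
$ docker network inspect --format \
    '{{range .Containers}}{{.Name}} {{.IPv4Address}}{{println}}{{end}}' bridge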
As we’ve seen, containers will normally have their own network stack and their own IP address, unless they are running in host networking mode, which we will discuss further in “Networking”. But what about when we look at them from the host machine itself? Because containers have their own network and addresses, they won’t show up in all netstat output on the host. But we know that the ports you map to your containers are bound on the host. Running netstat -an on the Docker server works as expected, as shown here:
$ sudo netstat -an
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address            Foreign Address  State
tcp        0      0 10.0.3.1:53              0.0.0.0:*        LISTEN
tcp        0      0 0.0.0.0:22               0.0.0.0:*        LISTEN
tcp6       0      0 :::23235                 :::*             LISTEN
tcp6       0      0 :::2375                  :::*             LISTEN
tcp6       0      0 :::4243                  :::*             LISTEN
tcp6       0      0 fe80::389a:46ff:fe92:53  :::*             LISTEN
tcp6       0      0 :::22                    :::*             LISTEN
udp        0      0 10.0.3.1:53              0.0.0.0:*
udp        0      0 0.0.0.0:67               0.0.0.0:*
udp        0      0 0.0.0.0:68               0.0.0.0:*
udp6       0      0 fe80::389a:46ff:fe92:53  :::*
Here we can see all of the interfaces that we're listening on. Our container's mapped port, 23235, is bound on all addresses, and it shows up here. But what happens when we ask netstat to show us the process name that's bound to the port?
$ netstat -anp
Active Internet connections (servers and established)
Proto ... Local Address            Foreign Address  State   PID/Program name
tcp   ... 10.0.3.1:53              0.0.0.0:*        LISTEN  23861/dnsmasq
tcp   ... 0.0.0.0:22               0.0.0.0:*        LISTEN  902/sshd
tcp6  ... :::23235                 :::*             LISTEN  24053/docker-proxy
tcp6  ... :::2375                  :::*             LISTEN  954/docker
tcp6  ... :::4243                  :::*             LISTEN  954/docker
tcp6  ... fe80::389a:46ff:fe92:53  :::*             LISTEN  23861/dnsmasq
tcp6  ... :::22                    :::*             LISTEN  902/sshd
udp   ... 10.0.3.1:53              0.0.0.0:*                23861/dnsmasq
udp   ... 0.0.0.0:67               0.0.0.0:*                23861/dnsmasq
udp   ... 0.0.0.0:68               0.0.0.0:*                880/dhclient3
udp6  ... fe80::389a:46ff:fe92:53  :::*                     23861/dnsmasq
We see much the same output, but notice what is bound to the port: docker-proxy. That's because, in its default configuration, Docker has a proxy written in Go that sits between all of the containers and the outside world. That means that when we look at this output, we see only docker-proxy, and there is no clue here about which container that docker-proxy is handling. Luckily, docker ps shows us which containers are bound to which ports, so this isn't a big deal. But it's not necessarily expected, and you probably want to be aware of it before you're debugging a production failure. Still, passing the -p flag to netstat is helpful in identifying which ports are tied to containers.
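On recent Docker releases you can also go the other direction and ask docker ps which container has a given host port published. A sketch:

# Show only the container(s) publishing host port 23235
$ docker ps --filter publish=23235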
If you’re using host networking in your container, then this layer is skipped. There is no docker-proxy, and the process in the container can bind to the port directly. It also shows up as a normal process in netstat -anp output.
Other network inspection commands work largely as expected, including tcpdump, but it’s important to remember that docker-proxy is there, in between the host’s network interface and the container, and that the containers have their own network interfaces on a virtual network.
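For example, because the default bridge shows up on the host as the docker0 interface, you can watch container traffic there before it is NATed out to the rest of the world. A sketch, assuming tcpdump is installed on the host:

# Capture HTTP traffic to and from containers on the default bridge network
$ sudo tcpdump -n -i docker0 port 80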
When you’re building and deploying a single container, it’s easy to keep track of where it came from and what images it’s sitting on top of. But this rapidly becomes unmanageable when you’re shipping many containers with images that are built and maintained by different teams. How can you tell what layers are actually underneath the one your container is running on? Your container’s image version hopefully shows you which build you’re running of the application, but that doesn’t reveal anything about the images it’s built on. docker history does just that. You can see each layer that exists in the inspected image, the sizes of each layer, and the commands that were used to build it:
$ docker history redis:latest
IMAGE         CREATED      CREATED BY                                      SIZE ...
33c26d72bd74  3 weeks ago  /bin/sh -c #(nop)  CMD ["redis-server"]         0B
<missing>     3 weeks ago  /bin/sh -c #(nop)  EXPOSE 6379/tcp              0B
<missing>     3 weeks ago  /bin/sh -c #(nop)  ENTRYPOINT ["docker-entry…   0B
<missing>     3 weeks ago  /bin/sh -c #(nop) COPY file:9c29fbe8374a97f9…   344B
<missing>     3 weeks ago  /bin/sh -c #(nop) WORKDIR /data                 0B
<missing>     3 weeks ago  /bin/sh -c #(nop)  VOLUME [/data]               0B
<missing>     3 weeks ago  /bin/sh -c mkdir /data && chown redis:redis …   0B
<missing>     3 weeks ago  /bin/sh -c set -ex;  buildDeps=' wget …         24.1MB
<missing>     3 weeks ago  /bin/sh -c #(nop)  ENV REDIS_DOWNLOAD_SHA=ff…   0B
<missing>     3 weeks ago  /bin/sh -c #(nop)  ENV REDIS_DOWNLOAD_URL=ht…   0B
<missing>     3 weeks ago  /bin/sh -c #(nop)  ENV REDIS_VERSION=4.0.8      0B
<missing>     3 weeks ago  /bin/sh -c set -ex;  fetchDeps='ca-certific…    3.1MB
<missing>     3 weeks ago  /bin/sh -c #(nop)  ENV GOSU_VERSION=1.10        0B
<missing>     3 weeks ago  /bin/sh -c groupadd -r redis && useradd -r -…   330kB
<missing>     3 weeks ago  /bin/sh -c #(nop)  CMD ["bash"]                 0B
<missing>     3 weeks ago  /bin/sh -c #(nop) ADD file:a0f72eb6710fe45af…   79.2MB
Using docker history can be useful, for example, when you are trying to determine why the size of the final image is much larger than expected. The layers are listed in order, with the first one at the bottom of the list and the last one at the top.
Here we can see that the command output has been truncated in a few cases. For long commands, adding the --no-trunc option to the preceding command will let you see the complete command that was used to build each layer.
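For example (output omitted here because it is very wide):

$ docker history --no-trunc redis:latest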
In Chapter 4, we showed you how to read the docker inspect output to see how a container is configured. But underneath that is a directory on the host’s disk that is dedicated to the container. Usually this is /var/lib/docker/containers. If you look at that directory, it contains very long SHA hashes, as shown here:
$ ls /var/lib/docker/containers
106ead0d55af55bd803334090664e4bc821c76dadf231e1aab7798d1baa19121
28970c706db0f69716af43527ed926acbd82581e1cef5e4e6ff152fce1b79972
3c4f916619a5dfc420396d823b42e8bd30a2f94ab5b0f42f052357a68a67309b
589f2ad301381b7704c9cade7da6b34046ef69ebe3d6929b9bc24785d7488287
959db1611d632dc27a86efcb66f1c6268d948d6f22e81e2a22a57610b5070b4d
a1e15f197ea0996d31f69c332f2b14e18b727e53735133a230d54657ac6aa5dd
bad35aac3f503121abf0e543e697fcade78f0d30124778915764d85fb10303a7
bc8c72c965ebca7db9a2b816188773a5864aa381b81c3073b9d3e52e977c55ba
daa75fb108a33793a3f8fcef7ba65589e124af66bc52c4a070f645fffbbc498e
e2ac800b58c4c72e240b90068402b7d4734a7dd03402ee2bce3248cc6f44d676
e8085ebc102b5f51c13cc5c257acb2274e7f8d1645af7baad0cb6fe8eef36e24
f8e46faa3303d93fc424e289d09b4ffba1fc7782b9878456e0fe11f1f6814e4b
That’s a bit daunting. But those are just the container IDs in long form. If you want to look at the configuration for a particular container, you just need to use docker ps to find its short ID, and then find the directory that matches:
$ docker ps
CONTAINER ID  IMAGE       COMMAND             ...
8720cc2f0502  alpine:3.6  "/bin/sh -c nginx"  ...
You can view the short ID from docker ps, then match it to the ls /var/lib/docker/containers output to see that you want the directory beginning with 8720cc2f0502. Command-line tab completion is helpful here. If you need exact matching, you can do a docker inspect 8720cc2f0502 and grab the long ID from the output. This directory contains some pretty interesting files related to the container:
$ cd /var/lib/docker/containers/\
8720cc2f05021643fb1f78710ec5ebc245ca42d3c64dd4e73dd2534226be91b5
$ ls -la
total 36
drwx------  4 root root 4096 Dec 28 04:26 .
drwx------ 35 root root 4096 Dec 28 15:32 ..
-rw-r-----  1 root root    0 Dec 28 04:26 8720cc2f05021643fb1f78710ec5ebc...
drwx------  2 root root 4096 Dec 28 04:26 checkpoints
-rw-------  1 root root 2620 Dec 28 04:26 config.v2.json
-rw-r--r--  1 root root 1153 Dec 28 04:26 hostconfig.json
-rw-r--r--  1 root root   13 Dec 28 04:26 hostname
-rw-r--r--  1 root root  174 Dec 28 04:26 hosts
-rw-r--r--  1 root root  194 Dec 28 04:26 resolv.conf
-rw-r--r--  1 root root   71 Dec 28 04:26 resolv.conf.hash
drwxrwxrwt  2 root root   40 Dec 28 04:26 shm
As we discussed in Chapter 5, this directory contains some files that are bind-mounted directly into your container, like hosts, resolv.conf, and hostname. If you are running the default logging mechanism, then this directory is also where Docker stores the JSON file containing the log that is shown with the docker logs command, the JSON configuration that backs the docker inspect output (config.v2.json), and the networking configuration for the container (hostconfig.json). The resolv.conf.hash file is used by Docker to determine when the container’s file has diverged from the current one on the host so it can be updated.
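The JSON files in this directory are stored as single long lines, so it's worth pretty-printing them when you go spelunking. A sketch, assuming Python 3 is available on the host (jq works just as well):

# config.v2.json is only readable by root, hence the sudo
$ sudo python3 -m json.tool config.v2.json | less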
This directory can also be really helpful in the event of severe failure. Even if we’re not able to enter the container, or if docker is not responding, we can look at how the container was configured. It’s also pretty useful to understand where those files are mounted from inside the container. Keep in mind that it’s not a good idea to modify these files. Docker expects them to contain reality, and if you alter that reality, you’re asking for trouble. But it’s another avenue for information on what’s happening in your container.
Docker, regardless of the backend actually in use, has a layered filesystem that allows it to track the changes in any given container. This is how the images are actually assembled when you do a build, but it is also useful when you're trying to figure out if a Docker container has changed anything and, if so, what. A common problem with Dockerized applications is that they continue to write things into the container's filesystem. Normally you want your containers to avoid that as much as possible, so it can be helpful when debugging to figure out whether they have been writing into the container. Sometimes this also helps turn up stray logfiles that exist in the container. As with most of the core tools, this kind of inspection is built into the docker command-line tooling and is also exposed via the API. Let's take a look at what it shows us. We'll assume that we already have the ID of the container we're concerned with.
$ sudo docker diff 89b8e19707df
C /var/log/redis
A /var/log/redis/redis.log
C /var/run
A /var/run/cron.reboot
A /var/run/crond.pid
C /var/lib/logrotate.status
C /var/lib/redis
A /var/lib/redis/dump.rdb
C /var/spool/cron
A /var/spool/cron/root
Each line begins with either A or C, which are just shorthand for added or changed, respectively. We can see that this container is running redis, that the redis log is being written to, and that someone or something has been changing the crontab for root. Logging to the local filesystem is not a good idea, especially for anything with high-volume logs. Being able to find out what is writing to your Docker filesystem can really help you understand where things are filling up, or give you a preview of what would be added if you were to build an image from it.
Further detailed inspection requires jumping into the container with docker exec or nsenter and the like, in order to see what exactly is in the filesystem. But docker diff gives you a good place to start.
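For instance, once docker diff has flagged a path, a quick hedged check like the following will tell you how big that file has gotten without starting an interactive session:

# Check the size of the logfile that docker diff flagged above
$ docker exec 89b8e19707df ls -lh /var/log/redis/redis.log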
At this point, you should have a good idea how to deploy and debug individual containers in development and production, but how do you start to scale this for larger application ecosystems? In the next chapter we’ll take a look at one of the simpler Docker orchestration tools: Docker Compose. This tool is a nice bridge between a single Docker container and a production orchestration system. It delivers a lot of value in development environments and throughout the DevOps pipeline.