Image testing and debugging
While we can applaud the benefits of containers, troubleshooting and effectively monitoring them still presents some complexity. Since, by design, containers run in isolation, visibility into their environments can be limited. Effective troubleshooting has generally required a shell inside the container itself, coupled with the complication of installing additional Linux tools just to peruse information that is consequently twice as hard to investigate.
Typically, the tools, methods, and approaches available for meaningful troubleshooting of our containers and images have required installing additional packages in every container. This results in the following:
- Requirements for connecting or attaching directly to the container, which is not always a piddling matter
- Limitations on inspection of a single container at a time
Compounding these difficulties, adding unnecessary bloat to our containers with these tools is something we originally attempted to avoid in our planning; minimalism is one of the advantages we looked for in using containers in the first place. Let's take a look, then, at how we can reasonably glean useful information about our container images with some basic commands, as well as investigate emergent applications that allow us to monitor and troubleshoot containers from the outside.
Docker details for troubleshooting
Now that you have your image (regardless of the build method) and have Docker running, let's do some testing to make sure all is copacetic with our build. While these steps may seem routine and mundane, it is good practice to run any or all of the following as a top-down approach to troubleshooting.
The first two commands here are ridiculously simple and seemingly too generic, but will provide the base-level detail with which to begin any downstream troubleshooting efforts: $ docker version and $ docker info.
Docker version
Let's first ensure that we know which versions of Docker, Go, and Git we are running:
$ sudo docker version
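The output will resemble the following (the version numbers and commit hashes here are illustrative only and will differ by installation):
Client version: 1.7.1
Client API version: 1.19
Go version (client): go1.4.2
Git commit (client): 786b29d
OS/Arch (client): linux/amd64
Server version: 1.7.1
Server API version: 1.19
Go version (server): go1.4.2
Git commit (server): 786b29d
OS/Arch (server): linux/amd64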
Docker info
Additionally, we should understand our host operating system and kernel version, as well as storage, execution, and logging drivers. Knowing these things can help us troubleshoot from our top-down perspective:
$ sudo docker info
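A trimmed, illustrative sample of the fields to look for follows (your values, drivers, and versions will differ):
Containers: 3
Images: 25
Storage Driver: aufs
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 3.19.0-26-generic
Operating System: Ubuntu 14.04.3 LTS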
A troubleshooting note for Debian/Ubuntu
From a $ sudo docker info command, you may receive one or both of the following warnings:
WARNING: No memory limit support
WARNING: No swap limit support
You will need to add the following command-line parameters to the kernel in order to enable memory and swap accounting:
cgroup_enable=memory swapaccount=1
For these Debian or Ubuntu systems, if you use the default GRUB bootloader, those parameters can be added by editing /etc/default/grub and extending GRUB_CMDLINE_LINUX. Locate the following line:
GRUB_CMDLINE_LINUX=""
Then, replace it with the following one:
GRUB_CMDLINE_LINUX="cgroup_enable=memory swapaccount=1"
Then, run update-grub and reboot the host machine.
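After the reboot, a quick sanity check (a small sketch of our own, not an official procedure) is to confirm the parameters on the running kernel's command line and then re-run docker info; the warnings should no longer appear:
$ grep -o 'cgroup_enable=memory swapaccount=1' /proc/cmdline
cgroup_enable=memory swapaccount=1
$ sudo docker info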
Listing installed Docker images
We also need to ensure that the container instance has actually installed your image locally. SSH into the docker host and execute the docker images command. You should see your docker image listed, as follows:
$ sudo docker images
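The listing will resemble the following (the repository names, IDs, sizes, and even the column headers shown here are illustrative and vary between Docker versions):
REPOSITORY          TAG       IMAGE ID        CREATED         VIRTUAL SIZE
myorg/myimage       latest    a1b2c3d4e5f6    2 hours ago     186.5 MB
centos              latest    0f73ae75014f    4 weeks ago     172.3 MB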
What if my image does not appear? Check the agent logs and make sure that your container instance is able to contact your docker registry by curling the registry and printing out the available tags:
curl [need to add in path to registry!]
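As a hedged illustration of that check, assuming a Docker Registry v2 endpoint (the registry host and image name below are placeholders, not values from this environment):
$ curl -s https://registry.example.com/v2/myorg/myimage/tags/list
{"name":"myorg/myimage","tags":["latest","1.0"]}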
Note
What $ sudo docker images tells us: Our container image was successfully installed on the host.
Manually crank your Docker image
Now that we know our image is installed on the host, we need to know whether it is accessible to the Docker daemon. An easy way to verify that your image can be run on the container instance is to attempt to run it from the command line. There is an added benefit here: we now have the opportunity to inspect the application logs for further troubleshooting.
Let's take a look at the following example:
$ sudo docker run -it [need to add in path to registry/latest bin!]
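For instance (the registry path, image name, and tag here are placeholders only; substitute your own):
$ sudo docker run -it registry.example.com/myorg/myimage:latest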
Note
What $ sudo docker run <imagename> tells us: Our container image is accessible from the docker daemon and also provides accessible output logs for further troubleshooting.
What if my image does not run? Check for any running containers. If the intended container isn't running on the host, there may be issues preventing it from starting:
$ sudo docker ps
When a container fails to start, it does not log anything. Output from container start processes is located in /var/log/containers on the host. There, you will find files following the naming convention of <service>_start_errors.log. These logs contain any output generated by our RUN command and are a recommended starting point in troubleshooting why your container failed to start.
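A quick way to review those logs is shown here (the service name is purely illustrative):
$ ls /var/log/containers/
myservice_start_errors.log
$ tail -n 50 /var/log/containers/myservice_start_errors.log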
Tip
TIP: Logspout (https://github.com/gliderlabs/logspout) is a log router for Docker containers that runs inside Docker. Logspout attaches to all containers on a host, and then routes their logs wherever you desire.
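A minimal sketch of running Logspout follows; the syslog destination is a placeholder, and the Logspout README should be consulted for the currently supported routing options:
$ sudo docker run -d --name=logspout \
    -v /var/run/docker.sock:/var/run/docker.sock \
    gliderlabs/logspout syslog://logs.example.com:514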
While we can also peruse the /var/log/messages output in our attempts to troubleshoot, there are a few other avenues we can pursue, albeit ones that are a little more labor intensive.
Examining the filesystem state from cache
As we've discussed, after each successful RUN command in our Dockerfile, Docker caches the entire filesystem state. We can exploit this cache to examine the latest state prior to the failed RUN command.
To accomplish the task:
- Access the Dockerfile and comment out the failing RUN command, in addition to any and all subsequent RUN commands
- Re-save the Dockerfile
- Re-execute $ sudo docker build and $ sudo docker run
Image layer IDs as debug containers
Every time Docker successfully executes a RUN command from a Dockerfile, a new layer in the image filesystem is committed. Conveniently, you can use those layer IDs as images to start a new container.
Consider the following Dockerfile as an example:
FROM centos
RUN echo 'trouble' > /tmp/trouble.txt
RUN echo 'shoot' >> /tmp/shoot.txt
If we then build from this Dockerfile:
$ docker build --force-rm -t so26220957 .
We would get output similar to the following:
Sending build context to Docker daemon 3.584 kB
Sending build context to Docker daemon
Step 0 : FROM centos
 ---> b750fe79269d
Step 1 : RUN echo 'trouble' > /tmp/trouble.txt
 ---> Running in d37d756f6e55
 ---> de1d48805de2
Removing intermediate container d37d756f6e55
Step 2 : RUN echo 'shoot' >> /tmp/shoot.txt
 ---> Running in a180fdacd268
 ---> 40fd00ee38e1
Removing intermediate container a180fdacd268
Successfully built 40fd00ee38e1
We can then use the preceding image layer IDs to start new containers from b750fe79269d, de1d48805de2, and 40fd00ee38e1:
$ docker run --rm b750fe79269d cat /tmp/trouble.txt
cat: /tmp/trouble.txt: No such file or directory
$ docker run --rm de1d48805de2 cat /tmp/trouble.txt
trouble
$ docker run --rm 40fd00ee38e1 cat /tmp/trouble.txt /tmp/shoot.txt
trouble
shoot
Note
We employ --rm to remove all of the debug containers, since there is no reason to keep them around after they have run.
What happens if my container build fails? Since no image is created on a failed build, we'd have no image hash with which to identify a container. Instead, we can note the ID of the preceding (last working) layer and run a container with a shell from that ID:
$ sudo docker run --rm -it <id_last_working_layer> bash -il
Once inside the container, execute the failing command in an attempt to reproduce the issue, fix the command and test it, and finally update the Dockerfile with the fixed command.
You may also want to start a shell and explore the filesystem, try out commands, and so on:
$ docker run --rm -it de1d48805de2 bash -il
root@ecd3ab97cad4:/# ls -l /tmp
total 4
-rw-r--r-- 1 root root 4 Jul 3 12:14 trouble.txt
root@ecd3ab97cad4:/# cat /tmp/trouble.txt
trouble
root@ecd3ab97cad4:/#
Additional example
One final example is to comment out everything in the following Dockerfile from the offending line onward (including the offending line). We are then able to run the container, execute the commands manually, and look into the logs in the normal way. Consider this example Dockerfile:
RUN trouble
RUN shoot
RUN debug
If the failure is at shoot, then comment out as follows:
RUN trouble
# RUN shoot
# RUN debug
Then, build and run:
$ docker build -t trouble .
$ docker run -it trouble bash
container# shoot
...grep logs...
Checking failed container processes
Even if your container successfully runs from the command line, it is beneficial to inspect for any failed container processes and containers that are no longer running, and to check our container configuration.
Run the following command to check for failed or no-longer-running containers, and note the CONTAINER ID to inspect a given container's configuration:
$ sudo docker ps -a
Note the STATUS of the containers. Should any of your containers' STATUS show an exit code other than 0, there could be issues with the container's configuration. By way of an example, a bad command would result in an exit code of 127. With this information, you can troubleshoot the task definition's CMD field to debug.
Although somewhat limited, we can further inspect a container for additional troubleshooting details:
$ sudo docker inspect <containerId>
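If you only need a specific field, such as the exit code or the daemon's error string, a Go-template filter keeps the output manageable (a small sketch; the container ID is a placeholder):
$ sudo docker inspect --format '{{.State.ExitCode}} {{.State.Error}}' <containerId>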
Finally, let's also analyze the container's application logs. Error messages for container start failures are output here:
$ sudo docker logs <containerId>
Other potentially useful resources
$ sudo docker top <containerId> gives us a list of the processes running inside a container.
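A hedged example of its use and typical output (the container name and process details are illustrative):
$ sudo docker top zany_torvalds
UID        PID        PPID       C    STIME    TTY    TIME        CMD
root       11567      4369       0    12:14    ?      00:00:00    /bin/bash /run.sh
root       11618      11567      0    12:14    ?      00:00:02    apache2 -DFOREGROUND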
htop, run on the host itself rather than as a Docker subcommand, can be utilized when you need a little more detail than top provides, in a convenient, cursor-controlled interface. htop starts faster than top; you can scroll the list vertically and horizontally to see all processes and complete command lines; and you do not need to type the process number to kill a process or the priority value to renice a process.
By the time this book goes to print, it is likely that the mechanisms for troubleshooting containers and images will have dramatically improved. Much focus is being given by the Docker community toward baked-in reporting and monitoring solutions, in addition to market forces that will certainly bring additional options to bear.
Using sysdig to debug
As with any newer technology, some of the initial complexities inherent with them are debugged in time, and newer tools and applications are developed to enhance their use. As we've discussed, containers certainly fit into this category at this time. While we have witnessed improvements in availability of official, standardized images within the Docker Registry, we are also now seeing emergent tools that help us to effectively manage, monitor, and troubleshoot our containers.
Sysdig provides application monitoring for containers [Image Copyright © 2014 Draios, Inc.]
Sysdig (http://www.sysdig.org/) is one such tool. As an au courant application for system-level exploration and troubleshooting visibility into containerized environments, the beauty of sysdig is that we are able to access container data from the outside (even though sysdig can also be installed inside a container). From a top level, this is what sysdig brings to our container management:
- Ability to access and review processes (inclusive of internal and external PIDs) in each container
- Ability to drill down into specific containers
- Ability to easily filter sets of containers for process review and analysis
Sysdig provides data on CPU usage, I/O, logs, networking, performance, security, and system state. To repeat, this is all accomplishable from the outside, without a need to install anything into our containers.
We will make continued and valuable use of sysdig going forward in this book to monitor and troubleshoot specific processes related to our containers, but for now we will provide just a few examples toward troubleshooting our basic container processes and logs.
Let's dig into sysdig by getting it installed on our host to show off what it can do for us and our containers!
Single step installation
Installation of sysdig can be accomplished in a single step by executing the following command as root or with sudo:
curl -s https://s3.amazonaws.com/download.draios.com/stable/install-sysdig | sudo bash
Note
NOTE: sysdig is currently included natively in the latest Debian and Ubuntu versions; however, it is recommended to update/run the installation for the latest packages.
Advanced installation
According to the sysdig wiki, the advanced installation method may be useful for scripted deployments or containerized environments. It is also easy; the advanced installation method is documented for RHEL and Debian systems.
What are chisels?
To get started with sysdig, we should understand some of its parlance, specifically chisels. In sysdig, chisels are little scripts (written in Lua) that analyze the sysdig event stream to perform useful actions. Events are efficiently brought to user level, enriched with context, and then scripts can be applied to them. Chisels work well on live systems, but can also be used with trace files for offline analysis. You can run as many chisels as you'd like, all at the same time. For example, the topcontainers_error chisel will show us the top containers by number of errors.
For a list of sysdig chisels:
$ sysdig -cl
(use the -i flag to get detailed information about a specific chisel)
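For example, to read the description and arguments for a single chisel (the chisel name here is just for illustration):
$ sysdig -i topprocs_cpu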
Single container processes analysis
Using the example of the topprocs_cpu chisel, we can apply a filter:
$ sudo sysdig -pc -c topprocs_cpu container.name=zany_torvalds
These are the example results:
CPU%           Process        container.name
------------------------------------------
02.49%         bash           zany_torvalds
37.06%         curl           zany_torvalds
0.82%          sleep          zany_torvalds
Unlike using $ sudo docker top (and similar tools), we can determine exactly which containers we want to see processes for; for example, the following shows us processes from only the wordpress containers:
$ sudo sysdig -pc -c topprocs_cpu container.name contains wordpress
CPU%           Process        container.name
--------------------------------------------------
5.38%          apache2        wordpress3
4.37%          apache2        wordpress2
6.89%          apache2        wordpress4
7.96%          apache2        wordpress1
Other useful sysdig chisels and syntax
- topprocs_cpu shows the top processes by CPU usage
- topcontainers_file shows the top containers by R+W disk bytes
- topcontainers_net shows the top containers by network I/O
- lscontainers will list the running containers
- $ sudo sysdig -pc -cspy_logs analyzes all logs per screen
- $ sudo sysdig -pc -cspy_logs container.name=zany_torvalds prints logs for the container zany_torvalds
Troubleshooting - an open community awaits you
In general, most issues you may face have likely been experienced by others, somewhere and sometime before. The Docker and open source communities, IRC channels, and various search engines can provide highly accessible information that is likely to answer the situations and conditions that perplex you. Make good use of the open source community (specifically, the Docker community) in getting the answers you are looking for. As with any emergent technology, in the beginning, we are all somewhat learning together!