Integration Tests Issue 1: Docker

by Patrick Bédat, July 24, 2016

I think there are two types of integration test suites:

  1. Test suites that have become a bloated, arcane, unmaintainable collection of code wizardry, with endless lines of setup code and crazy bash scripts. They test your actually very well crafted applications in a way nobody really understands. They do a good job, because they have saved your asses multiple times. You hate them, because the only way to make the tests pass again is to comment out assertions.
  2. And the test suites written by people who have created a test suite of the first type before.

Requirements for Integration Test Suites

Integration tests are a wonderful instrument to get as much as possible covered when you are not "test driven" and to provide low-level documentation. But what makes a good integration test suite?

  • Performance: It has to be fast. This applies to all kinds of automated tests. Nothing eats up your development time like waiting for slow tests. If you want to motivate developers to write more tests, don't make it a burden.
  • Close to production: Running a service that depends on ghostscript on Ubuntu 12.04 may give you very different results than running it on 14.04.
  • Minimally invasive: You want to test services like on a production environment. "Ok, install those 100 dependencies and make sure they always match the versions on prod. What, you installed mono 4.x? Oops, now you have to downgrade to 2.4…" Thinking about continuous integration, we want the tests running on almost naked installations. Everywhere.
  • Debuggable: Period.
  • Robustness: When you have to fire up a dozen services for testing, robustness is crucial. You don't want to log into the build server and kill your services by hand whenever a build fails.
  • Well crafted: Invest as much love and quality into the test code as into the rest of the application. Maybe even more.

Docker To The Rescue

Even though we don't use Docker for all applications in production yet (soon we will), they are all dockerized in our development and testing environments. Back in the days when we used VMs to get our applications running locally, we were passing around gigabytes of copied VMs, because they got out of sync from time to time. We are not looking back…

The word Docker is still a real "eyebrow raiser". Many have heard of it and know that it does some kind of virtualization, but don't really understand what the point of Docker is. Let me give you an example:

When it comes to testing in general, you always want to test on a fresh environment, because the key to predictable test execution is statelessness (Docker containers are often called ephemeral). Here is the use case of "running apps on a fresh environment", looked at from the VM and from the Docker perspective.

VM
1. Create a VM
2. Install an OS (or restore a snapshot, which you probably want to update)
3. Install dependencies (e.g. zip and ghostscript)
4. Apply the configuration
5. Deploy binaries
6. Run applications

To get a fresh environment, you have to repeat (or automate) those steps. While you can take snapshots, it will be a resource- and time-consuming endeavor. You can try to keep the state of the system in sync with the actual environment, but synchronization can be a nasty thing.

Docker

  1. Create a Dockerfile
    FROM ubuntu:14.04
    MAINTAINER Patrick Bédat
    RUN apt-get update && apt-get install -y zip ghostscript
    
  2. Write configuration into the Dockerfile
    RUN echo "127.0.0.1 my-awesome-host" >> /etc/hosts
  2. Build an image. After the initial download of the base image, further builds will be incredibly fast (seconds)
    docker build -t my-awesome-image .

And here comes the big difference: We won't run all the applications in one Docker container. We run one container per app:

docker run --volume "$(pwd)/bin":/opt/my-awesome-app1 my-awesome-image /opt/my-awesome-app1/app1.exe
docker run --volume "$(pwd)/bin":/opt/my-awesome-app2 my-awesome-image /opt/my-awesome-app2/app2.exe

Note that instead of deploying binaries, we just mapped the bin directory from the host into the container at /opt/my-awesome-appx.
Instead of waiting for a VM to boot up or a snapshot to be restored, we have a container running in an instant! That is because Docker does not virtualize low-level stuff like hardware or a whole guest operating system. Instead it isolates processes on the host's kernel (OS-level virtualization), and containers started from the same image share its layers.

[Figure: containers vs. virtual machines]

Now check that Dockerfile into your repository and have your build server run those apps exactly like you did!

I hope the fact that you can spin up a complete production-like environment in seconds gives you at least a clue about the power of Docker. I think the hardest part to understand about Docker is the ephemeral nature of containers. You don't log into containers; they usually don't have a GUI or a terminal. They just run one application and usually exist only as long as that application is running. I don't want to dig any deeper, because Docker has so much to offer and I want to focus on its utilization for testing.

Test Suite Setup

Back to our test suite. Before you can perform tests against real services, you need to spin them up. There are different workflows for how and when to get them running. I think the most common are:

  • Fresh containers for each test
  • Fresh containers for each fixture
  • The same containers for the entire suite

The list is ordered from the cleanest to the most practical and fastest solution. While starting a Docker container happens in the blink of an eye, running DB setup scripts or waiting for HTTP services to become ready does not. This is our solution:

Workflow

We are using docker-compose to orchestrate the startup of a dozen containers (services can depend on each other). There is one docker-compose.yml file for the entire test suite. It works for us, because almost all of the services are stateless. The database is reset with a dynamically generated list of TRUNCATE TABLE statements. And if you really need a fresh and clean container, you simply restart or recreate it. To ease things up, we created a wrapper for the docker-compose CLI in C#.
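
To give an idea of what such a wrapper can look like, here is a minimal sketch. The class and method names are made up for illustration and it assumes docker-compose is on the PATH, so it is not our actual implementation:

using System.Diagnostics;

public class DockerCompose
{
    readonly string _workingDirectory;

    public DockerCompose(string workingDirectory)
    {
        _workingDirectory = workingDirectory;
    }

    public void Up() => Run("up -d");
    public void Kill() => Run("kill");
    public void Remove() => Run("rm -f");

    // Runs "docker-compose <args>" in the directory containing the
    // docker-compose.yml and returns whatever the command printed.
    public string Run(string args)
    {
        var process = Process.Start(new ProcessStartInfo
        {
            FileName = "docker-compose",
            Arguments = args,
            WorkingDirectory = _workingDirectory,
            RedirectStandardOutput = true,
            UseShellExecute = false
        });

        var output = process.StandardOutput.ReadToEnd();
        process.WaitForExit();
        return output;
    }
}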

This is what a docker-compose.yml looks like:

service_a:
  build: ./dir/to/service_a_dockerfile/
  links:
    - mysql
  ports:
    - "8080:80"
mysql:
  image: mysql:latest
  ports:
    - "33066:3306"

Among many other useful features, docker-compose builds images and starts containers in the correct order for you.
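
In practice, bringing the whole suite up from a clean checkout boils down to something like:

docker-compose build
docker-compose up -d

The -d flag starts everything detached in the background, which is exactly what the test suite needs.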

This is basically what happens when you run the NUnit test suite (a sketch follows the list):

  1. There is a SetupFixture, where existing containers are cleaned up first and the containers are then brought up with "docker-compose up -d".
  2. Then we wait for all services to become usable, e.g. we try to connect to an HTTP service and make a GET request.
  3. We set up the structure of the database.
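
Stripped down to the essentials, such a SetupFixture could look roughly like this. It reuses the hypothetical DockerCompose wrapper from above; WaitForHttp, DatabaseSetup.CreateSchema, the paths and the NUnit 3 attribute names are placeholders, not our actual code:

using System;
using System.Net.Http;
using System.Threading;
using NUnit.Framework;

[SetUpFixture]
public class TestSuiteSetup
{
    public static DockerCompose Compose;

    [OneTimeSetUp]
    public void BringUpContainers()
    {
        // Hypothetical directory containing the docker-compose.yml.
        Compose = new DockerCompose("./docker");

        // 1. Clean up containers left over from a previous (possibly crashed) run.
        Compose.Kill();
        Compose.Remove();

        // 2. Bring up all services in the background ("docker-compose up -d").
        Compose.Up();

        // 3. Wait until the services actually accept requests.
        WaitForHttp("http://localhost:8080/", TimeSpan.FromSeconds(60));

        // 4. Set up the database structure (placeholder for our schema scripts).
        DatabaseSetup.CreateSchema();
    }

    [OneTimeTearDown]
    public void TearDownContainers()
    {
        Compose.Kill();
        Compose.Remove();
    }

    static void WaitForHttp(string url, TimeSpan timeout)
    {
        var client = new HttpClient();
        var deadline = DateTime.UtcNow + timeout;

        while (DateTime.UtcNow < deadline)
        {
            try
            {
                // A successful GET means the service is ready.
                if (client.GetAsync(url).Result.IsSuccessStatusCode)
                    return;
            }
            catch
            {
                // Not reachable yet, keep polling.
            }
            Thread.Sleep(500);
        }

        throw new TimeoutException("Service at " + url + " did not come up in time.");
    }
}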

Then, in each TestFixtureSetup or Setup, which run either before the tests of a fixture or before each test, we truncate the whole database. Depending on how the fixture is structured, we set up a scenario in the setup or in the tests (or in both, actually).
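
A sketch of what the truncation could look like in a shared base fixture. The connection string, the database name and the use of MySQL Connector/NET are assumptions, and generating the statements from information_schema is just one way to do it:

using System.Collections.Generic;
using MySql.Data.MySqlClient;
using NUnit.Framework;

public abstract class DatabaseTestBase
{
    // Hypothetical connection string pointing at the mysql container from the compose file.
    const string ConnectionString =
        "Server=localhost;Port=33066;Database=myapp;Uid=root;Pwd=secret";

    [SetUp]
    public void TruncateAllTables()
    {
        using (var connection = new MySqlConnection(ConnectionString))
        {
            connection.Open();

            // Generate one TRUNCATE statement per table from the schema metadata.
            var truncates = new List<string>();
            var query = new MySqlCommand(
                "SELECT table_name FROM information_schema.tables " +
                "WHERE table_schema = 'myapp'", connection);

            using (var reader = query.ExecuteReader())
                while (reader.Read())
                    truncates.Add("TRUNCATE TABLE `" + reader.GetString(0) + "`");

            // Disable foreign key checks so the truncation order doesn't matter.
            Execute(connection, "SET FOREIGN_KEY_CHECKS = 0");
            truncates.ForEach(statement => Execute(connection, statement));
            Execute(connection, "SET FOREIGN_KEY_CHECKS = 1");
        }
    }

    static void Execute(MySqlConnection connection, string sql)
    {
        new MySqlCommand(sql, connection).ExecuteNonQuery();
    }
}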

Usually you want to run tests while you are developing. And when you are developing, the services are usually already running. But how can an HTTP service listening on port 80 run twice on the same machine, so that your test suite can talk to its own instance? It can't. Luckily, Docker allows you to remap the ports exposed by the containers to different ports on the host. Hence you can run the service with the same configuration in your tests by simply remapping its ports.
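
Using the example from above, and assuming the service inside the container listens on port 80, the test instance can be published on an arbitrary free host port like 8080:

docker run --volume "$(pwd)/bin":/opt/my-awesome-app1 -p 8080:80 my-awesome-image /opt/my-awesome-app1/app1.exe

The test clients then simply talk to localhost:8080 instead of port 80.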

Gotchas

In the first version of the test suite, every exposed port was mapped to a fixed "test suite port":

service_a:
  ports:
    - "8080:80"
mysql:
  ports:
    - "33066:3306"

This was very practical while debugging, because when you halted a test in the debugger, you knew you would be able to connect to the database on port 33066 to check whether the scenario was set up correctly. But it also led to problems when the suite was executed on the build server for different branches: on the one hand, the tests couldn't be executed in parallel (the ports were already occupied), and on the other hand, when a test run on one branch crashed, it left services running and therefore left the ports occupied.

Our solution was to template the docker-compose file (one way to render such a template is sketched after it):

service_a:
  ports:
    - {{ ports.service_a }}
mysql:
  ports:
    - {{ ports.mysql }}
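
How the placeholders get filled is up to you; this template syntax only needs plain string substitution, so even a sed call would do (the file names here are made up):

sed -e 's/{{ ports.service_a }}/"8080:80"/' \
    -e 's/{{ ports.mysql }}/"33066:3306"/' \
    docker-compose.yml.template > docker-compose.yml

On the build server the same placeholders would be filled with just the container ports ("80" and "3306").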

Now it was possible to map the ports to fixed host ports in the development environment ("8080:80") and to simply expose them on the build server ("80"). When you expose a port without specifying a host port, which still makes it accessible from the host, Docker maps it to a random free port on the host.
But how do the clients in the integration tests find out how a port was mapped? Luckily, docker-compose offers a "port" command:

docker-compose port mysql 3306
0.0.0.0:32673

After integrating this command into the docker-compose wrapper and tying the clients to it, the robustness of our tests multiplied.
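
In the wrapper sketched earlier, this could be a method along the following lines (again illustrative, not our actual code):

// Asks docker-compose how a container port was published on the host,
// e.g. "docker-compose port mysql 3306" prints "0.0.0.0:32673".
public int GetMappedPort(string service, int containerPort)
{
    var output = Run("port " + service + " " + containerPort).Trim();
    return int.Parse(output.Substring(output.LastIndexOf(':') + 1));
}

The MySQL client in the tests then connects to localhost on GetMappedPort("mysql", 3306) instead of a hard-coded 33066.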

Conclusion

Having a test suite that tests against the services as if they were running in production is a huge benefit. Making it easily executable and debuggable for developers is priceless. But it wasn't free. It was a long journey that bore many lessons, with a build server often remaining red for days (the team consists of 1.5 people…). But damn, it was absolutely worth it.