Docker has become the most attractive way to containerize applications. It offers fast, lightweight, secure, isolated (yet connected) containers that run wherever you put them.
That’s perfect, isn’t it? You write a piece of code, put it in a container, and ta-da! It is fast and secure now! Right… Is that so?
Well, a lot of things may have been thought out for you, but don’t assume that’s enough. It’s never enough! Let’s go over the basic matters to keep in mind while deploying software with Docker:
- Keep your image simple
- Know what to include and exclude in the image
- Monitor your container
You’ve written your code with well-thought-out optimizations, efficient algorithms, simple data protocols, etc. Good for you! That is what we call Source Code Level Optimization. And then there is Deployment Level Optimization, which is the union of Build Level, Compile Level, Assembly Level and Run Time Level Optimizations; see Program Optimization (Wikipedia) for more. Anyway, let’s get to the point: disasters may occur if you don’t take this subject seriously.
- Prefer using smaller image bases
- Include only the necessary artifacts in the image
- Don’t wrap your image within countless layers
- Don’t install unnecessary packages
- Write your Dockerfile effectively
1.1 Create Lightweight Images
Performance is mostly correlated with being lightweight. And there are dead-simple ways to create lightweight images in Docker.
The Base Image is the base of our container. It differs from application to application, obviously. But there is almost always a lighter image than the one you use. For example, instead of the debian image, you can use the alpine image. Instead of using python:3.8, you can use python:3.8-alpine. You can take a look at this example for lightweight image building.
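To make the comparison concrete, here is a minimal sketch (the app file is hypothetical; the images are the real official ones, and the size figures are rough):

```dockerfile
# Heavier: Debian-based image, roughly 880 MB
# FROM python:3.8

# Lighter: Alpine-based image, roughly 45 MB, same Python version
FROM python:3.8-alpine

COPY ./app /app
CMD ["python", "/app/app.py"]
```

The only change is the tag in the FROM line, yet the resulting image shrinks by an order of magnitude.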
1.2 Artifacts to Include in Image
COPY . /app
Hope you don’t write this line in your Dockerfile unless you know what you’re doing. You might be copying irrelevant artifacts like README.md, and these artifacts might contain application secrets, hints about your application or system, etc. And even if they don’t, they enlarge the image for nothing.
Being explicit with COPY is a better practice. But it’s even better to be explicit and use just enough COPY instructions at the same time, no more (but still might be less). This will prevent creating lots of layers, which would harm performance. Layers will be covered in the next section.
# COPY <src> <dest>
# COPY ["<src 1>", "<src 2>", ..., "<dest>"]
COPY ["./app", ".env", "/app"]
COPY ./sample_data /data
You should use a .dockerignore file to ignore files that you don’t want to include in your image. Example:
# Git
.git
.gitignore

# Docker
docker-compose.yml
.docker

# Other
**/*.md
LICENSE
1.3 Don’t End Up With a Heavy Onion
Layers in a Docker image can be simply defined as changes on an image. Take a look at these instructions:
- FROM, creates a layer from the base image you use
- COPY, adds files from your choice of source to a destination, in a new layer
- ADD, similar to COPY, except this command can also unpack local archives
- RUN, runs commands in a new layer
- CMD, specifies the command to run when the container starts; unlike the others, it only changes the image’s metadata, not its filesystem
When you use these commands, you stack up layers in your image and increase your image size.
You would use FROM and CMD only once, so they won’t stack layers up. But RUN, COPY, and ADD can. Therefore, use these instructions with caution.
Instead of doing this:
RUN apt-get update
RUN apt-get -y install git

do this:

RUN apt-get update && apt-get -y install git
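While you’re at it, you can clean up the package manager’s cache in that same RUN, so the one layer you create stays small (a common pattern, shown here as a sketch):

```dockerfile
RUN apt-get update && \
    apt-get -y install git && \
    rm -rf /var/lib/apt/lists/*
```

If the cleanup happened in a separate RUN, the cache would already be baked into the previous layer and the image wouldn’t get any smaller.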
1.4 Don’t Install Unnecessary Packages
If you’re not trying to reinvent the wheel, you must be using some packages on top of your base image, right? Good, okay. Now, the thing about installing packages is that they sometimes bring extras with them. To avoid installing unused, nice-to-have-but-not-required packages, use the appropriate options of your base image’s package manager and/or leverage multi-stage builds.
RUN apt-get update && \
    apt-get -y install --no-install-recommends git
For detailed information, check out my other blog post about Multi-Stage Builds.
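As a rough sketch of the multi-stage pattern (requirements.txt and the app are hypothetical placeholders, not files from the original post):

```dockerfile
# Build stage: install dependencies with the full toolchain available
FROM python:3.8 AS builder
COPY requirements.txt .
RUN pip install --prefix=/install -r requirements.txt

# Final stage: copy only the installed packages and the app,
# leaving the build toolchain behind
FROM python:3.8-slim
COPY --from=builder /install /usr/local
COPY ./app /app
CMD ["python", "/app/app.py"]
```

Only the final stage ends up in the shipped image; everything installed in the builder stage that isn’t copied over is discarded.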
1.5 A Good Dockerfile
Firstly, you need to learn Docker key concepts to write a good Dockerfile. Take a look at my blog post Docker Key Concepts and Definitions.
If you’re already familiar with the key concepts, you can use Dockerfile linters like hadolint (not an affiliate link, just a FOSS project).
Now again, let’s imagine together. You’ve written your code (seems like you’re very productive). You’ve used linters, security checks, all kinds of tests; you’ve done daily updates of software and packages. Everything looks okay, you’re good to go.
BUT, if you don’t think about your deployment security, you’re still as vulnerable as any other application.
- Minimal Base Images (again)
- You don’t need root privileges, use non-root user
- Use benchmarking tools for security
- Monitor your container
2.1 Prefer Minimal Base Images
Minimal base image means fewer libraries, fewer functions, fewer capabilities. And this results in a smaller attack surface.
Check out my blog on How to Create Minimal Docker Images to see a comprehensive example.
2.2 Use COPY Instead of ADD
COPY and ADD are basically the same, except that ADD can also unpack local .tar archives and fetch remote URLs. Thus, if you don’t need to unpack local archives, don’t use ADD; prefer COPY.
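A quick illustration of the difference (the archive name is hypothetical):

```dockerfile
# ADD unpacks local archives automatically:
ADD app.tar.gz /app/      # /app/ ends up with the extracted files

# COPY copies the file as-is:
COPY app.tar.gz /app/     # /app/app.tar.gz, still packed
```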
2.3 Create a Non-root User
Create a non-root user and group in your container which has just enough permissions and use that user to run your program.
FROM python:3.8-alpine
COPY ./app /app

# Create the `app` group, then create the `testuser` user inside it
# (Alpine uses BusyBox addgroup/adduser, and ships no /bin/bash),
# and give `testuser:app` ownership of the /app directory
RUN addgroup -g 1000 app && \
    adduser -D -G app testuser && \
    chown -R testuser:app /app

USER testuser
CMD ["python", "/app/app.py", "start"]
2.4 Use Benchmarking Tools
Alongside the linting tools, you can use benchmarking tools too, like Docker Bench for Security (not an affiliate link, just a FOSS project). You can easily automate your lints and benchmarks to achieve high-quality Docker images.
2.5 Monitor Your Container
You’ve gone through every step of the best practices; you have the most secure and performant application in the world. It’s a beautiful feeling, isn’t it, feeling safe?
How can you feel safe!? It’s never enough, remember? You have to think about every aspect of security and stability. Therefore, you have to keep an eye on your container after the deployment. Watch metrics like System Resource Usage, Network Bandwidth Usage, Error Logs, System Performance, and any other metric that you could use.
Watching these metrics will give you insights about what to do better and how, where to fix the code (you need to trace back your error logs), what optimizations you should implement, etc. There are plenty of FOSS monitoring tools to choose from for this job.
- Use tags when pulling images
- Pull your own images from your own private Docker Registry
- Add metadata to Dockerfile
3.1 Use Tags When Pulling Images
Tags specify the version of the image. If you do not specify a tag, as in FROM python, the Docker daemon will pull the latest tag by default. So, when this python:latest image gets updated, your application and/or environment may break, some parts of your code may become deprecated, and so on… You have to specify precise tags like python:3.8.7, or even pin the image by its SHA256 digest.
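For example (the digest below is a placeholder, not a real one; you can look up the digests of your local images with docker images --digests):

```dockerfile
# Pin to an exact version tag:
FROM python:3.8.7

# Or pin to an immutable content digest:
# FROM python@sha256:<digest>
```

A tag can be re-pointed to a new image, but a digest always refers to exactly one image content.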
3.2 Your Own Docker Registry
Even if you pull your images from public registries with a SHA digest, that image might get deleted or restricted from public usage. Therefore, the best way to make sure your image remains safe and unchanged is to keep it in your own private registry. To do this, you have to deploy your own registry, make some security adjustments on the registry side, and pull images under security policies like TLS and image signing & verification.
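As a minimal sketch, the official registry image can be run with TLS enabled via a docker-compose file like the following (the certificate paths are placeholders you would provide yourself):

```yaml
version: "3"
services:
  registry:
    image: registry:2        # the official Docker registry image
    ports:
      - "443:443"
    environment:
      REGISTRY_HTTP_ADDR: 0.0.0.0:443
      # TLS certificate and key, mounted from the host below
      REGISTRY_HTTP_TLS_CERTIFICATE: /certs/domain.crt
      REGISTRY_HTTP_TLS_KEY: /certs/domain.key
    volumes:
      - ./certs:/certs
```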
3.3 Add Metadata to Dockerfile with LABEL
You can embed useful information in your Dockerfile to make your images readable (both human-readable and machine-readable), discoverable, and explicit.
LABEL maintainer="firstname.lastname@example.org" \
      version="1.2.0" \
      stage="test" \
      license="MIT" \
      keywords="python, api, crud" \
      multi-word-key1="Some data"
And you can use this information later to filter or organize your images, e.g. with docker images --filter "label=stage=test".
This list of tips is not comprehensive at all, obviously. It’s intended as an introduction anyway. So, there are more things that could be added to this list:
- Only one main process per container
- Specify precise versions of dependencies
- Be explicit in Dockerfile
- Verify image signs
- File & Directory permissions
- Remove unused commands
- Restrict Linux kernel capabilities
- Limit memory usage
- Make your container read-only if possible
- Official Docker Documentation: Lots of in-depth information
- Open Container Initiative (OCI): OCI is a project to define open container standards on container formats and runtime.
- Official Docker Documentation on Security