Docker Multistage Builds: How to Optimize Your Images

Multi-stage Docker image builds allow you to simplify your Dockerfiles and improve build efficiency. They let you reference more than one base image in your Dockerfile and copy only the content you need into your final image.

In this article, we’ll examine Docker’s multi-stage build features in detail. We’ll also show how to create a multi-stage Dockerfile and discuss some key best practices for reducing build times. Let’s begin by learning exactly how multi-stage builds differ from regular builds.

What we’ll cover:

  1. What are multi-stage Docker image builds?
  2. How to create multi-stage Docker images
  3. Debugging multi-stage Docker image builds
  4. Best practices for Docker multi-stage builds

What are multi-stage Docker image builds?

Docker images are filesystem templates that define the initial state of Docker containers. They’re like blueprints that contain the binaries, source code, and runtimes needed by the containerized application, as well as any dependencies and other related files.

Docker images are created from Dockerfiles. A Dockerfile is a list of instructions that assemble an image’s filesystem by copying files and running commands. Dockerfiles usually start with a FROM instruction that references an existing image to use as the build’s starting point. 

The rest of the instructions are then applied on top of this base image:

FROM httpd:alpine
COPY build/ /usr/local/apache2/htdocs

In the example above, the Dockerfile selects the httpd:alpine Docker Hub image as its base image. It then copies the contents of the build/ path in your working directory to /usr/local/apache2/htdocs within the image’s filesystem.

Each FROM instruction in a Dockerfile creates a new build stage. The sample Dockerfile above contains only one stage, but you can create multiple stages by writing several FROM instructions. This allows you to use more than one base image in your build:

FROM first-image:latest AS build
COPY files-to-build/ /build
RUN build-script --output /out

FROM second-image:latest AS final
COPY --from=build /out/build/ /app
COPY extra-files/ /app

The sample Dockerfile above contains two distinct stages: build and final. Each stage uses a different base image. 

The first stage builds some output that’s then copied into the second stage’s environment, but the other files from first-image:latest aren’t included. This helps reduce the file size of the output image. The COPY --from instruction specifies the name of the stage that contains the files you’re copying.

Use cases and benefits for multi-stage Docker builds

Multi-stage builds benefit the Docker build process and Dockerfile development in several ways:

  • Easy access to resources from multiple base images: Multi-stage builds allow you to use resources from several base images in a single Docker build. For example, one image can provide your build system, another your testing tools, and a third the runtime environment for the final image.
  • Run multi-step build processes to produce a final image: Using multi-stage builds allows you to model your full build process in one Dockerfile. For instance, you can fetch dependencies and build your source code, then copy the compiled output into a final image layer that uses a smaller base image.
  • Reduce Dockerfile complexity: Using multiple named stages can help you organize and simplify your Dockerfiles. Whereas complex build processes historically required several Dockerfiles and the use of intermediary build helper images, multi-stage builds enable you to wrap everything into a single Dockerfile.
  • Improve build efficiency: Multi-stage builds can reduce image sizes and increase build efficiency. Your final image can use a lightweight base image, then selectively copy the files it needs from earlier build stages. Docker’s layer caching system means that, within each stage, only the changed layers and the layers that follow them are rebuilt.

Multi-stage builds are a good fit whenever your image build process involves more than one base image, multiple steps, or large build tools that you don’t need to keep in the final image. By using a multi-stage build, you can stick to a single Dockerfile while still optimizing build times and layer cache efficiency.
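
To make these benefits concrete, here is a minimal sketch of a three-stage Dockerfile that builds, tests, and packages an application. It assumes a hypothetical Go module (a go.mod file plus a main package) at the root of the build context. The stage names, image tags, and output path are illustrative choices rather than part of any real project.

```dockerfile
# Stage 1: compile with the full Go toolchain.
# Assumes a hypothetical Go module at the root of the build context.
FROM golang:1.22 AS build
WORKDIR /src
COPY . .
# Disable cgo so the binary is statically linked and portable across base images.
RUN CGO_ENABLED=0 go build -o /usr/local/bin/app .

# Stage 2: run the test suite on top of the build stage.
# Build it explicitly with: docker build --target test .
FROM build AS test
RUN go test ./...

# Stage 3: runtime image that contains only the compiled binary.
FROM alpine:3.20 AS final
COPY --from=build /usr/local/bin/app /usr/local/bin/app
CMD ["/usr/local/bin/app"]
```

Running docker build --target test . executes the tests, while a plain docker build . produces the final image. With BuildKit, Docker's default builder, the test stage is skipped in the second case because the final stage doesn't reference it.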

How to create multi-stage Docker images

Let’s look at how to use a multi-stage Dockerfile in a realistic scenario.

The following Dockerfile is designed to build a Docker image for a simple static website created using Hyde, a PHP static site generator. The site also uses node-sass to compile some custom style sheets.

FROM node:18-alpine AS sass
WORKDIR /app
COPY package.json .
COPY package-lock.json .
RUN npm ci
COPY sass/ sass
RUN node_modules/.bin/node-sass sass/main.scss build/styles.css

FROM php:8.4-cli AS hyde
RUN apt-get update && apt-get install -y git zip unzip
COPY --from=composer:2 /usr/bin/composer /usr/bin/composer
COPY composer.json .
COPY composer.lock .
COPY _docs/ _docs
COPY _media/ _media
COPY _pages/ _pages
COPY _posts/ _posts
COPY app/ app
COPY config/ config
COPY resources/ resources
COPY hyde .
COPY *.js .
RUN composer install
RUN php hyde build

FROM httpd:alpine AS httpd
WORKDIR /usr/local/apache2/htdocs
COPY --from=sass /app/build/styles.css .
COPY --from=hyde /_site .

The three separate FROM instructions are the main points to highlight in this Dockerfile:

FROM node:18-alpine AS sass
...

FROM php:8.4-cli AS hyde
...

FROM httpd:alpine AS httpd
...

Each FROM instruction starts a new build stage. The stages are essentially independent image builds, but only the last stage saves and tags your final image. The other stages create intermediary images that can be referenced using COPY --from instructions. These instructions allow files to be copied between the stages.

Here’s a deeper breakdown of how the three stages work:

  1. sass: This stage uses the node:18-alpine base image. It installs the project’s npm dependencies, then uses node-sass to compile the extra SASS stylesheets found in the repository’s sass directory.
  2. hyde: This stage uses the php:8.4-cli base image. It also references the composer:2 base image in a COPY --from statement. The Composer binary is copied into the build from composer:2, then used to install the project’s PHP dependencies. Afterwards, php hyde build compiles the static site’s content.
  3. httpd: The final stage uses the httpd:alpine base image. This minimal image runs the Apache web server yet is much smaller than the Node.js and PHP images used by the earlier stages. The previous build stages compiled the site’s content to pure HTML and CSS, so this stage simply needs to copy the output from those stages into the Apache server’s document root.

This Dockerfile neatly demonstrates how multi-stage builds allow several distinct tasks to occur within a single Dockerfile. 

Although the SASS and Hyde compilation steps are relatively complex and fetch many dependencies, the final image remains lightweight. It’s just the httpd:alpine image combined with the compiled HTML, CSS, and media assets.

Want to try building this image yourself? Find the complete sample project on GitHub, ready to use.

Debugging multi-stage Docker image builds

Multi-stage Docker image builds can be troublesome to debug. Problems in earlier stages may cause issues that don’t appear until later in the build.

You can troubleshoot multi-stage build issues by instructing Docker to stop at a specific stage during the build.

Enable this behavior by passing the name of the stage you want to stop at to the docker build command’s --target flag:

# Runs the build until the end of the "build" stage
$ docker build --target build -t debug-image:build .

When you use --target, Docker runs your Dockerfile instructions as normal until it reaches the end of the named stage. With BuildKit, Docker’s default builder, stages that the target stage doesn’t depend on are skipped. Docker then saves and tags the image that exists at that point in the build. You can use the tagged image to start a container and inspect the filesystem state at the end of the stage.
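
A related approach is to add a temporary stage that prints the state of an earlier stage into the build log. The sketch below is a self-contained illustration, not part of the sample project: the build stage is a stand-in that writes a placeholder file, and the debug stage name, the /out path, and the image tags are all hypothetical.

```dockerfile
# Stage under investigation (a stand-in for a real build stage).
FROM alpine:3.20 AS build
RUN mkdir -p /out && echo "compiled output" > /out/result.txt

# Temporary inspection stage that starts from the finished "build" stage.
# Build it with:
#   docker build --progress=plain --target debug -t debug-image:debug .
FROM build AS debug
# The listing and file contents appear in the build log (unless the step is cached).
RUN ls -la /out && cat /out/result.txt

# Final stage. Nothing here references "debug", so BuildKit skips that
# stage during a normal build.
FROM alpine:3.20 AS final
COPY --from=build /out/result.txt /result.txt
```

After building the debug target, docker run --rm -it debug-image:debug sh opens a shell in a container created from that stage, so you can explore its filesystem interactively.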

Best practices for Docker multi-stage builds

Now that we’ve seen how to write multi-stage Dockerfiles, let’s quickly cover some best practices that can help you optimize your builds.

1. Choose minimal base images

Minimal base images are lightweight variants optimized for a small file size. They include the bare minimum of operating system packages and dependencies required for their purpose. You can then add just the components you need. 

Minimal images are typically created from scratch or a lightweight operating system image such as Alpine.

Using a minimal base image for your final Dockerfile stage helps reduce the size of your final image. If your build process depends on tools that don’t exist in the minimal image, then you should complete those tasks in earlier build stages using a heavier image. 

You can then use the Dockerfile COPY --from instruction to copy the built output into your final image, as shown above.
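
As a minimal sketch of this pattern, the Dockerfile below compiles a C program in a full toolchain image, then copies only the binary into an Alpine-based final image. The hello.c source file is a hypothetical file in the build context, and the image tags and paths are assumptions chosen for illustration.

```dockerfile
# Heavy build stage: the gcc image ships a complete compiler toolchain.
FROM gcc:13 AS build
WORKDIR /src
# hello.c is a hypothetical source file in the build context.
COPY hello.c .
# Link statically so the binary doesn't depend on the build image's C library.
RUN mkdir -p /out && gcc -static -O2 -o /out/hello hello.c

# Minimal final stage: only the compiled binary is carried over.
FROM alpine:3.20
COPY --from=build /out/hello /usr/local/bin/hello
CMD ["/usr/local/bin/hello"]
```

The compiler, headers, and intermediate files stay behind in the build stage, so the final image contains little more than Alpine and the binary.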

2. Name stages for clarity

Using AS to name each stage in your Dockerfile helps document what’s happening. It makes it obvious what each stage is doing and why it’s required. You can then refer to stages by name in later COPY instructions:

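FROM golang:latest AS build
COPY main.go .
# Disable cgo so the binary is statically linked and can run on scratch
RUN CGO_ENABLED=0 go build -o /bin/output ./main.go

FROM scratch AS final
# The source is a single file, so it has no trailing slash
COPY --from=build /bin/output /bin/output
CMD ["/bin/output"]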

Without named stages, you can only refer to a previous stage by its numeric index. Stages are numbered from 0 in the order their FROM instructions appear, so in the example above, the build stage could also be referenced as COPY --from=0.

However, relying on these indexes increases the risk of errors. If you reorder the stages in your Dockerfile, you must remember to update the numeric stage references.
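
For comparison, here is a sketch of the same kind of build written with unnamed stages and a numeric reference. It assumes a hypothetical main.go in the build context.

```dockerfile
# Unnamed first stage: it can only be referenced by its index, 0.
FROM golang:latest
COPY main.go .
RUN CGO_ENABLED=0 go build -o /bin/output ./main.go

FROM scratch
# --from=0 means "the first FROM in this file". If another FROM is later
# inserted above the Go stage, this reference points at the wrong stage.
COPY --from=0 /bin/output /bin/output
CMD ["/bin/output"]
```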

3. Organize your stages to maximize efficiency

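Maintaining build efficiency requires careful ordering of your Dockerfile’s stages and the instructions within them. Steps that are less likely to change should be positioned as early as possible. This enables Docker to make the most of its layer cache.

Within a stage, a changed layer invalidates the cache for every layer that comes after it, so infrequently changed steps placed after frequently changed ones are rerun even when their own inputs haven’t changed. The same effect carries across dependent stages: when a stage’s output changes, later stages that build on it or copy from it with COPY --from may need to repeat the affected steps. With BuildKit, Docker’s default builder, stages that don’t depend on each other are cached independently and can be built in parallel.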
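
A minimal sketch of cache-friendly ordering inside a single stage is shown below. It assumes a hypothetical Node.js project with a package.json, a package-lock.json, a src/ directory, and a build script defined in package.json; none of these come from the sample project above.

```dockerfile
FROM node:18-alpine AS build
WORKDIR /app

# Dependency manifests change rarely, so copy and install them first.
# These layers are reused from the cache until one of the two files changes.
COPY package.json package-lock.json ./
RUN npm ci

# Source files change often, so they come last. Editing them only
# invalidates the layers from this point onward.
COPY src/ src/
# "build" is a hypothetical script assumed to exist in package.json.
RUN npm run build
```

If the source files were copied before npm ci, every code edit would force a full dependency reinstall.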

4. Use COPY --from to directly copy content from other images

Depending on your requirements, it’s not always necessary to explicitly create a new build stage with a FROM instruction. If you just need to copy a file from an existing Docker image, you can use a COPY --from instruction instead:

FROM php:8.4-apache
COPY composer.json .
COPY composer.lock .
COPY --from=composer:2 /usr/bin/composer /usr/bin/composer
RUN composer install

In the example above, the composer binary doesn’t exist in the php:8.4-apache image that’s used as the build’s base image. COPY --from is used to copy the binary directly from the separate composer:2 image. 

During the build, Docker fetches composer:2 and then copies the specified file path out of that image and into the filesystem of the image being built.

Key points

Multi-stage Docker builds use multiple Dockerfile FROM instructions to reference content from more than one base image. You can selectively copy just the files you need from each stage into your final image. This allows you to implement complex build processes using a single Dockerfile, without making your final image excessively large.

Adopting multi-stage Dockerfiles can improve the speed, simplicity, and ease of maintenance of your Docker builds. However, it’s also important to implement other Dockerfile best practices to ensure your builds run as smoothly as possible. 

Check out our Docker image layers guide to learn more tips for optimizing your Dockerfiles and enhancing cache efficiency.
