Lesson 2.4: Multi-stage Builds

Welcome to Lesson 2.4! You've already built custom images and learned about layering and caching. Now it's time to tackle one of the most powerful techniques for creating small, efficient, and secure Docker images: multi-stage builds. This feature allows you to use multiple FROM statements in a single Dockerfile, selectively copying artifacts from one stage to another, discarding everything else. By the end of this lesson, you'll be able to drastically reduce your image sizes and separate build-time dependencies from runtime environments.

Learning Objectives

TIP

By the end of this lesson, you will be able to:

Explain the problem that multi-stage builds solve (large images, unnecessary dependencies).
Understand the syntax and workflow of a multi-stage Dockerfile.
Use multiple FROM instructions with aliases (AS).
Copy artifacts from previous stages using COPY --from.
Build minimal production images by leveraging multi-stage builds.
Apply best practices for common languages (Go, Node.js, Python, Java).

1. The Problem: Bloated Images

In traditional single-stage Dockerfiles, you often need build tools, compilers, or package managers to create your application. These tools are not needed at runtime, yet they end up in the final image, making it larger and widening its attack surface.

Example (single-stage for a Go app):

dockerfile

FROM golang:1.20
WORKDIR /app
COPY . .
RUN go build -o myapp .
CMD ["./myapp"]

This image contains the entire Go toolchain, source code, and intermediate files – easily 800+ MB.

Example (single-stage for a Node.js app with build step):

dockerfile

FROM node:18
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build
CMD ["npm", "start"]

This image includes dev dependencies, source code, and possibly unnecessary files.

INFO

The solution: Use a multi-stage build to build the application in a "builder" stage, then copy only the compiled artifact (and maybe runtime dependencies) to a clean, minimal final stage.

2. What Are Multi-stage Builds?

A multi-stage build is a Dockerfile with multiple FROM instructions. Each FROM starts a new stage. You can selectively copy files from one stage to another, leaving behind everything you don't need in the final image.

Key points:

Each stage can use a different base image.
Stages are numbered starting from 0, or you can name them with AS name.
Only the last stage's layers end up in the final image; earlier stages contribute only the files you explicitly copy from them.
You can copy files from previous stages using COPY --from=stage_name.

Simple Example (Go)

dockerfile

# Build stage
FROM golang:1.20 AS builder
WORKDIR /app
COPY . .
RUN go build -o myapp .

# Final stage
FROM alpine:latest
RUN apk --no-cache add ca-certificates
WORKDIR /root/
COPY --from=builder /app/myapp .
CMD ["./myapp"]

The first stage (builder) uses the large golang image to compile the app.
The second stage uses a tiny alpine image.
Only the compiled binary is copied; the Go toolchain and source are discarded.
The final image is often 10-20 MB instead of 800+ MB.

Visual: Single-stage vs Multi-stage

Single-stage (bloated):

+----------------------------------+
|   Go binary + source + toolchain |  <- 800+ MB
+----------------------------------+
|       FROM golang:1.20           |
+----------------------------------+

Multi-stage (lean):

Stage 1: builder     Stage 2: final
+---------------+    +----------------+
| Go toolchain  | -> |  Binary only   |  <- ~15 MB
| Source code   |    +----------------+
+---------------+    |  FROM alpine   |
| golang:1.20   |    +----------------+
+---------------+    DISCARDED

3. Syntax and Usage

3.1. Naming Stages

You can name a stage using AS:

dockerfile

FROM node:18 AS build-env

Then refer to it later: COPY --from=build-env ...

If you don't name a stage, you can refer to it by index (0, 1, 2, ...). Naming is strongly recommended for clarity.
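As a minimal sketch of an index-based reference, the two-stage Go build from the simple example could be written without a stage name. The image tags and the /app/myapp path are carried over from that example; the destination path is an illustrative choice.

```dockerfile
# Stage 0 (unnamed): compile the binary.
# CGO_ENABLED=0 keeps the binary static so it does not depend on the builder's libc.
FROM golang:1.20
WORKDIR /app
COPY . .
RUN CGO_ENABLED=0 go build -o myapp .

# Stage 1: refer to the first stage by its index
FROM alpine:latest
COPY --from=0 /app/myapp /usr/local/bin/myapp
CMD ["/usr/local/bin/myapp"]
```

Inserting another stage ahead of stage 0 would shift every index, which is one reason named stages tend to be easier to maintain.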

3.2. Copying from a Stage

dockerfile

COPY --from=<stage_name_or_index> <source_path> <dest_path>

<source_path> is a path inside the filesystem of the referenced stage, not a path in the build context.
You can also copy from a completely different image (not just a stage) using --from=image:tag.

Example copying from an image:

dockerfile

COPY --from=nginx:latest /usr/share/nginx/html/index.html /usr/share/nginx/html/

3.3. Stopping at a Specific Stage

Sometimes you may want to stop the build at a particular stage for debugging. Use the --target flag with docker build:

bash

docker build --target builder -t myapp-builder .

This stops the build at the stage named builder and tags that stage's result as the image; later stages are skipped. Useful for testing.

4. Common Patterns by Language

4.1. Go (Statically Compiled)

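Go can produce a single self-contained binary. By default, programs that use cgo (for example, through the net package) may link dynamically against the builder image's C library, which does not exist on scratch and differs from the musl library on alpine. Building with CGO_ENABLED=0 creates a fully static binary that runs on scratch (an empty image).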

dockerfile

FROM golang:1.20 AS builder
WORKDIR /app
COPY . .
RUN CGO_ENABLED=0 go build -o myapp .

FROM scratch
COPY --from=builder /app/myapp /myapp
CMD ["/myapp"]

The scratch image is truly empty – it contains nothing. Perfect for statically linked binaries.
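Because scratch contains no files at all, a binary that makes HTTPS requests will find no CA certificates, which the alpine example handled with apk. One possible workaround is to copy the certificate bundle from the builder. This is a sketch that assumes the Debian-based golang:1.20 image keeps its bundle at /etc/ssl/certs/ca-certificates.crt.

```dockerfile
FROM golang:1.20 AS builder
WORKDIR /app
COPY . .
RUN CGO_ENABLED=0 go build -o myapp .

FROM scratch
# Assumed bundle location in the Debian-based builder image
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=builder /app/myapp /myapp
CMD ["/myapp"]
```

If the application never opens TLS connections, the extra COPY line can be left out.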

4.2. Node.js (with Build Step)

Node.js apps often have a build step (e.g., Webpack, Babel) and need only the built files and production dependencies.

dockerfile

# Build stage
FROM node:18 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Production stage
FROM node:18-slim
WORKDIR /app
# Copy built assets and production node_modules
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY package*.json ./
CMD ["node", "dist/index.js"]

TIP

Note: This copies node_modules from the builder, which includes dev dependencies because npm ci installs everything the build step needs. To ship only production dependencies, run npm prune --omit=dev in the builder after the build step, or install production dependencies in a separate stage with npm ci --omit=dev.
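One way to keep dev dependencies out of the final image is a dedicated dependency stage. The sketch below assumes a package-lock.json exists (npm ci requires one) and that the build writes to dist, as in the example above; the stage name deps is an illustrative choice.

```dockerfile
# Stage 1: production dependencies only
FROM node:18 AS deps
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev

# Stage 2: full install (including dev dependencies) plus the build step
FROM node:18 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Stage 3: runtime image with built files and production node_modules
FROM node:18-slim
WORKDIR /app
COPY package*.json ./
COPY --from=deps /app/node_modules ./node_modules
COPY --from=builder /app/dist ./dist
CMD ["node", "dist/index.js"]
```

The deps and builder stages do not depend on each other, so a BuildKit-based build can run them in parallel.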

4.3. Python

For Python, you may need to compile native extensions. Use a builder stage with the full Python image, then copy the installed site-packages and your code to a slim runtime stage.

dockerfile

# Full image: includes the compilers needed to build native extensions
FROM python:3.11 AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --user -r requirements.txt

FROM python:3.11-slim
WORKDIR /app
COPY --from=builder /root/.local /root/.local
COPY . .
ENV PATH=/root/.local/bin:$PATH
CMD ["python", "app.py"]

This uses pip install --user to install packages into a user directory, then copies that directory.
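A virtual environment is another commonly used way to move installed packages between stages. The sketch below assumes both tags resolve to the same Python version with the interpreter at the same path, and uses /opt/venv as an illustrative location.

```dockerfile
FROM python:3.11 AS builder
# Create the virtual environment at a fixed path (illustrative choice)
RUN python -m venv /opt/venv
# Put the venv first on PATH so pip installs into it
ENV PATH=/opt/venv/bin:$PATH
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

FROM python:3.11-slim
WORKDIR /app
# The venv must live at the same path in both stages
COPY --from=builder /opt/venv /opt/venv
ENV PATH=/opt/venv/bin:$PATH
COPY . .
CMD ["python", "app.py"]
```

Compared with pip install --user, this keeps the packages outside /root, which can help if the container later runs as a non-root user.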

4.4. Java

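For Java, you might compile with Maven/Gradle in one stage and copy the resulting JAR/WAR to a JRE base image. The example assumes the build writes target/myapp.jar; by default Maven names the JAR after the artifactId and version, so adjust the path or set finalName in pom.xml.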

dockerfile

FROM maven:3.8-openjdk-11 AS builder
WORKDIR /app
COPY pom.xml .
RUN mvn dependency:go-offline
COPY src ./src
RUN mvn package

FROM openjdk:11-jre-slim
COPY --from=builder /app/target/myapp.jar /app.jar
CMD ["java", "-jar", "/app.jar"]

5. Advanced Techniques

5.1. Using External Images as Stages

You can copy files from any existing image, not just stages in the current Dockerfile. This is useful for pulling in binaries or configuration files.

dockerfile

FROM alpine:latest
COPY --from=nginx:alpine /etc/nginx/nginx.conf /nginx.conf

5.2. Build Args Across Stages

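Build arguments are scoped. An ARG declared before the first FROM can be used in FROM lines, e.g., ARG GO_VERSION=1.20 and then FROM golang:${GO_VERSION} AS builder. To use the value inside a stage, redeclare it there with ARG GO_VERSION; an ARG declared inside a stage is visible only in that stage.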
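The scoping rules can be sketched as follows. GO_VERSION is the argument name from the text; the echo line is only there to show the value being read inside the stage.

```dockerfile
# Declared before the first FROM: usable in FROM lines only
ARG GO_VERSION=1.20

FROM golang:${GO_VERSION} AS builder
# Redeclare without a value to bring the global default into this stage
ARG GO_VERSION
WORKDIR /app
COPY . .
RUN echo "Building with Go ${GO_VERSION}" && CGO_ENABLED=0 go build -o myapp .

FROM scratch
COPY --from=builder /app/myapp /myapp
CMD ["/myapp"]
```

The default can be overridden at build time, for example with docker build --build-arg GO_VERSION=1.21 -t myapp .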

5.3. Conditional Stages (with `--target`)

As mentioned, --target allows you to stop at a specific stage, which is great for CI/CD where you might want a testing image separate from production.
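A separate testing image could look like the sketch below. The stage names test and production are illustrative, and the example assumes the project has Go tests to run.

```dockerfile
FROM golang:1.20 AS builder
WORKDIR /app
COPY . .
RUN CGO_ENABLED=0 go build -o myapp .

# Hypothetical test stage, built on request with:
#   docker build --target test -t myapp-test .
FROM builder AS test
RUN go test ./...

# Last stage: the default target, used when --target is omitted
FROM scratch AS production
COPY --from=builder /app/myapp /myapp
CMD ["/myapp"]
```

With BuildKit, a default build skips the test stage because production does not depend on it; the legacy builder runs every stage in order up to the target.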

5.4. Multiple Build Contexts (Docker BuildKit)

With BuildKit, you can also use --from with a named context, but that's beyond this lesson.

6. Benefits of Multi-stage Builds

TIP

Smaller images: Remove build tools, source code, and intermediate files.
Improved security: Fewer components mean fewer vulnerabilities.
Faster deployments: Smaller images transfer faster over the network.
Separation of concerns: Builder stage can have all debugging tools, final stage only runtime essentials.
Reusability: You can copy artifacts from one stage to multiple final images.

Hands-On Tasks

Task 1: Convert a Single-Stage Go App to Multi-stage

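Create a simple Go program main.go, plus a go.mod next to it (for example, by running go mod init myapp), since go build . requires a module: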

package main

import "fmt"

func main() {
    fmt.Println("Hello from Go multi-stage!")
}

Write a single-stage Dockerfile (as above) and build it. Note the size.
Create a multi-stage Dockerfile as shown (using golang:1.20 as builder and alpine or scratch as final). Build and compare sizes.
If using scratch, run the container: docker run --rm myapp. It should print the message.

Task 2: Node.js App with Build

Create a simple Node.js app with a build script:
- package.json with a "build" script (e.g., mkdir -p dist && echo 'console.log("Hello")' > dist/index.js).
- The generated dist/index.js is then a simple console.log("Hello").
Write a multi-stage Dockerfile:
- Stage 1: node:18 to run npm install and npm run build.
- Stage 2: node:18-slim to copy the built files.
Build and verify the image size compared to a single-stage version (if you also try a single-stage for comparison).

Task 3: Python with User Install

Create a Python script app.py that uses a library (e.g., requests).
Create requirements.txt with requests.
Write a multi-stage Dockerfile using the pattern above (pip install --user). Build and run to confirm it works.
Check that the final image doesn't contain compilers or other build tools (pip itself still ships with the python base image).

Task 4: Debug with `--target`

For the Go example, build only the builder stage:
bash

docker build --target builder -t go-builder .
Run a shell in that intermediate image and explore: docker run -it go-builder sh. You'll see source and Go toolchain.
Then build the full image and notice that the intermediate image is separate; you could even push the builder image for caching in CI.

Task 5: Compare Image Sizes

For any of the above, use docker images to see the size difference between a naive single-stage build and your multi-stage build. Document the reduction.

Summary

Key Takeaways

Multi-stage builds let you use multiple FROM statements in one Dockerfile.
Each stage is independent; you can copy artifacts from earlier stages using COPY --from.
The final image only contains what you explicitly copy, drastically reducing size and attack surface.
Common patterns exist for compiled languages (Go, Rust, C), interpreted languages with build steps (Node, Python with native extensions), and Java.
Use --target to stop at a specific stage for debugging or testing.
Always aim to use the smallest possible base image in the final stage (e.g., alpine, slim, scratch).

Check Your Understanding

What problem do multi-stage builds solve?
How do you refer to a previous stage by name when copying files?
If you have three stages and don't copy anything from stage 2, is stage 2 included in the final image?
What base image would you use for a statically compiled Go binary to achieve the smallest possible image?
How can you build only up to a specific stage, and why would you want to do that?
Write a Dockerfile snippet that copies a file named app.jar from a stage named builder into the current stage's /app directory.

Answers

They solve the problem of bloated images by separating build-time dependencies (compilers, build tools) from runtime dependencies, so only the final artifact and necessary runtime files end up in the production image.
Using COPY --from=stage_name where stage_name is the name defined with AS in the FROM instruction.
No. Only the final stage becomes the image. Intermediate stages are used only during the build process.
scratch – an empty base image. It's perfect for statically compiled binaries because they don't need any runtime libraries.
With docker build --target stage_name. You'd use this for debugging, testing intermediate stages, or creating separate images for development vs. production.
COPY --from=builder /app/target/app.jar /app/app.jar (adjust paths as needed).

Next Up

In the next lesson, we'll cover tagging and pushing images to registries, so you can share your optimized images with the world (or your team). See you there!
