Lesson 2.4: Multi-stage Builds
Welcome to Lesson 2.4! You've already built custom images and learned about layering and caching. Now it's time to tackle one of the most powerful techniques for creating small, efficient, and secure Docker images: multi-stage builds. This feature allows you to use multiple FROM statements in a single Dockerfile, selectively copying artifacts from one stage to another, discarding everything else. By the end of this lesson, you'll be able to drastically reduce your image sizes and separate build-time dependencies from runtime environments.
Learning Objectives
TIP
By the end of this lesson, you will be able to:
- Explain the problem that multi-stage builds solve (large images, unnecessary dependencies).
- Understand the syntax and workflow of a multi-stage Dockerfile.
- Use multiple FROM instructions with aliases (AS).
- Copy artifacts from previous stages using COPY --from.
- Build minimal production images by leveraging multi-stage builds.
- Apply best practices for common languages (Go, Node.js, Python, Java).
1. The Problem: Bloated Images
In a traditional single-stage Dockerfile, you often need build tools, compilers, or package managers to create your application. These tools are not needed at runtime, yet they end up in the final image, making it larger and widening its attack surface.
Example (single-stage for a Go app):
FROM golang:1.20
WORKDIR /app
COPY . .
RUN go build -o myapp .
CMD ["./myapp"]
This image contains the entire Go toolchain, source code, and intermediate files: easily 800+ MB.
Example (single-stage for a Node.js app with build step):
FROM node:18
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build
CMD ["npm", "start"]
This image includes dev dependencies, source code, and possibly unnecessary files.
INFO
The solution: Use a multi-stage build to build the application in a "builder" stage, then copy only the compiled artifact (and maybe runtime dependencies) to a clean, minimal final stage.
2. What Are Multi-stage Builds?
A multi-stage build is a Dockerfile with multiple FROM instructions. Each FROM starts a new stage. You can selectively copy files from one stage to another, leaving behind everything you don't need in the final image.
Key points:
- Each stage can use a different base image.
- Stages are numbered starting from 0, or you can name them with AS name.
- Only the last stage's layers are kept in the final image; earlier stages contribute nothing except the files you explicitly copy from them.
- You can copy files from previous stages using COPY --from=stage_name.
Simple Example (Go)
# Build stage
FROM golang:1.20 AS builder
WORKDIR /app
COPY . .
RUN go build -o myapp .
# Final stage
FROM alpine:latest
RUN apk --no-cache add ca-certificates
WORKDIR /root/
COPY --from=builder /app/myapp .
CMD ["./myapp"]
- The first stage (builder) uses the large golang image to compile the app.
- The second stage uses a tiny alpine image.
- Only the compiled binary is copied; the Go toolchain and source are discarded.
- The final image is often 10-20 MB instead of 800+ MB.
Visual: Single-stage vs Multi-stage
Single-stage (bloated):
+----------------------------------+
| Go binary + source + toolchain | <- 800+ MB
+----------------------------------+
| FROM golang:1.20 |
+----------------------------------+
Multi-stage (lean):
Stage 1: builder      Stage 2: final
+---------------+     +----------------+
| Go toolchain  | ->  | Binary only    |  <- ~15 MB
| Source code   |     +----------------+
+---------------+     | FROM alpine    |
| golang:1.20   |     +----------------+
+---------------+
    DISCARDED
3. Syntax and Usage
3.1. Naming Stages
You can name a stage using AS:
FROM node:18 AS build-env
Then refer to it later: COPY --from=build-env ...
If you don't name a stage, you can refer to it by index (0, 1, 2, ...). Naming is strongly recommended for clarity.
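To make the index form concrete, here is a minimal sketch that reuses the Go example with an unnamed first stage. The destination path /usr/local/bin/myapp is an arbitrary choice for illustration, and the sketch assumes the project has a go.mod.

```dockerfile
# Stage 0: unnamed, so it can only be referenced by its index.
FROM golang:1.20
WORKDIR /app
COPY . .
RUN CGO_ENABLED=0 go build -o myapp .

# Stage 1: final image.
FROM alpine:latest
# --from=0 refers to the first FROM in this file.
COPY --from=0 /app/myapp /usr/local/bin/myapp
CMD ["myapp"]
```

If a new stage is later inserted above stage 0, every index shifts by one, which is one practical reason to prefer names.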
3.2. Copying from a Stage
COPY --from=<stage_name_or_index> <source_path> <dest_path>
- <source_path> is a path inside that stage's filesystem, not inside the build context.
- You can also copy from a completely different image (not just a stage) using --from=image:tag.
Example copying from an image:
COPY --from=nginx:latest /usr/share/nginx/html/index.html /usr/share/nginx/html/
3.3. Stopping at a Specific Stage
Sometimes you may want to stop the build at a particular stage for debugging. Use the --target flag with docker build:
docker build --target builder -t myapp-builder .
This builds only up to the stage named builder, which is useful for testing. With BuildKit, stages that the target does not depend on are skipped entirely.
4. Common Patterns by Language
4.1. Go (Statically Compiled)
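As shown above, Go compiles an application into a single binary. Whether that binary is fully static depends on cgo: with cgo enabled, packages such as net can link against the builder image's libc. Building with CGO_ENABLED=0 guarantees a fully static binary that runs on scratch (an empty image).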
FROM golang:1.20 AS builder
WORKDIR /app
COPY . .
RUN CGO_ENABLED=0 go build -o myapp .
FROM scratch
COPY --from=builder /app/myapp /myapp
CMD ["/myapp"]
The scratch image is truly empty: it contains nothing. Perfect for statically linked binaries.
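Because scratch has no files at all, a binary that makes HTTPS requests will not find any CA certificates there. One possible way to handle this, sketched below, is to copy the certificate bundle out of the builder stage. The sketch assumes the project has go.mod and go.sum at its root, and that the bundle lives at /etc/ssl/certs/ca-certificates.crt, which is where the Debian-based golang image keeps it.

```dockerfile
# Sketch only: assumes go.mod and go.sum exist in the build context.
FROM golang:1.20 AS builder
WORKDIR /app
# Copy module files first so the dependency layer stays cached
# until go.mod or go.sum changes.
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -o myapp .

FROM scratch
# Assumed path: CA bundle location in the Debian-based builder image.
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=builder /app/myapp /myapp
CMD ["/myapp"]
```

Copying the module files before the rest of the source is optional, but it ties in with the layer caching covered earlier: editing application code no longer forces a fresh dependency download.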
4.2. Node.js (with Build Step)
Node.js apps often have a build step (e.g., Webpack, Babel) and need only the built files and production dependencies.
# Build stage
FROM node:18 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
# Production stage
FROM node:18-slim
WORKDIR /app
# Copy built assets and production node_modules
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY package*.json ./
CMD ["node", "dist/index.js"]
TIP
Note: This copies node_modules from the builder stage, which still includes dev dependencies because npm ci installed them for the build. To keep only production dependencies, run npm prune --omit=dev in the builder stage after the build step, or install production dependencies in a separate stage with npm ci --omit=dev. See advanced patterns.
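The separate-stage approach could look roughly like the sketch below. The stage name deps is an illustrative choice, and the sketch assumes a package-lock.json is present (npm ci requires one) and that the build output lands in dist, as in the example above.

```dockerfile
# Stage 1: production dependencies only.
FROM node:18 AS deps
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev

# Stage 2: full install (including dev dependencies) and build.
FROM node:18 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Stage 3: runtime image with built files and production modules.
FROM node:18-slim
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY --from=builder /app/dist ./dist
COPY package*.json ./
CMD ["node", "dist/index.js"]
```

The final stage takes node_modules from deps and the compiled output from builder, so dev dependencies never reach the runtime image.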
4.3. Python
For Python, you may need to compile native extensions. Use a builder stage with the full Python image, then copy the installed site-packages and your code to a slim runtime stage.
FROM python:3.11 AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --user -r requirements.txt
FROM python:3.11-slim
WORKDIR /app
COPY --from=builder /root/.local /root/.local
COPY . .
ENV PATH=/root/.local/bin:$PATH
CMD ["python", "app.py"]
This uses pip install --user to install packages into a user directory (/root/.local when the build runs as root), then copies that directory into the final stage.
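A variation on the same idea installs into a virtual environment instead of the user directory, which avoids tying the packages to root's home. This is a sketch of that alternative, not the pattern this lesson prescribes; /opt/venv is an arbitrary path chosen for illustration.

```dockerfile
# Builder: full image, so compilers are available for native extensions.
FROM python:3.11 AS builder
RUN python -m venv /opt/venv
# Put the venv first on PATH so pip installs into it.
ENV PATH="/opt/venv/bin:$PATH"
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Runtime: slim image plus the prepared virtual environment.
FROM python:3.11-slim
COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
WORKDIR /app
COPY . .
CMD ["python", "app.py"]
```

This works because both images share the same Python version and install the interpreter at the same location, so the virtual environment's interpreter link still resolves in the final stage.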
4.4. Java
For Java, you might compile with Maven/Gradle in one stage and copy the resulting JAR/WAR to a JRE base image.
FROM maven:3.8-openjdk-11 AS builder
WORKDIR /app
COPY pom.xml .
RUN mvn dependency:go-offline
COPY src ./src
RUN mvn package
FROM openjdk:11-jre-slim
COPY --from=builder /app/target/myapp.jar /app.jar
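CMD ["java", "-jar", "/app.jar"]
Note: By default Maven names the JAR <artifactId>-<version>.jar, so adjust the path in COPY --from (or set finalName in pom.xml) to match your project.
5. Advanced Techniques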
5.1. Using External Images as Stages
You can copy files from any existing image, not just stages in the current Dockerfile. This is useful for pulling in binaries or configuration files.
FROM alpine:latest
COPY --from=nginx:alpine /etc/nginx/nginx.conf /nginx.conf
5.2. Build Args Across Stages
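Build arguments are scoped, not shared automatically. An ARG declared before the first FROM is global and can be used in any FROM line, e.g., ARG GO_VERSION=1.20 and then FROM golang:${GO_VERSION} AS builder. To use its value inside a stage, declare it again with ARG in that stage. An ARG declared inside a stage is visible only in that stage.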
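The sketch below shows how ARG scoping plays out in practice: an ARG placed before the first FROM can be used in FROM lines, but it must be redeclared inside a stage before instructions in that stage can read it. APP_VERSION is a hypothetical argument added purely for illustration, and the sketch assumes the project has a go.mod.

```dockerfile
# Global ARG: declared before the first FROM, usable in FROM lines.
ARG GO_VERSION=1.20

FROM golang:${GO_VERSION} AS builder
# Redeclare without a value to bring the global default into this stage.
ARG GO_VERSION
# Hypothetical stage-local argument; not visible in other stages.
ARG APP_VERSION=dev
WORKDIR /app
COPY . .
RUN echo "Building ${APP_VERSION} with Go ${GO_VERSION}" \
    && CGO_ENABLED=0 go build -o myapp .

FROM scratch
COPY --from=builder /app/myapp /myapp
CMD ["/myapp"]
```

Either value can be overridden at build time, for example with docker build --build-arg GO_VERSION=1.21 --build-arg APP_VERSION=1.0.0 -t myapp .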
5.3. Conditional Stages (with --target)
As mentioned, --target allows you to stop at a specific stage, which is great for CI/CD where you might want a testing image separate from production.
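One way this might look for the Go example is a test stage that CI builds with --target test, alongside a production stage for the release. The stage names base, test, and production are illustrative choices, and the sketch assumes the project has a go.mod and some Go tests.

```dockerfile
# Shared base: source code plus toolchain.
FROM golang:1.20 AS base
WORKDIR /app
COPY . .

# Test stage: the build fails here if any test fails.
FROM base AS test
RUN go test ./...

# Build stage: produces the static binary.
FROM base AS builder
RUN CGO_ENABLED=0 go build -o myapp .

# Production stage: binary only.
FROM scratch AS production
COPY --from=builder /app/myapp /myapp
CMD ["/myapp"]
```

Running docker build --target test -t myapp-test . executes the tests as part of the build, while docker build --target production -t myapp . produces the release image. With BuildKit, the production build skips the test stage because production does not depend on it.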
5.4. Multiple Build Contexts (Docker BuildKit)
With BuildKit, you can also use --from with a named context, but that's beyond this lesson.
6. Benefits of Multi-stage Builds
TIP
- Smaller images: Remove build tools, source code, and intermediate files.
- Improved security: Fewer components mean fewer vulnerabilities.
- Faster deployments: Smaller images transfer faster over the network.
- Separation of concerns: Builder stage can have all debugging tools, final stage only runtime essentials.
- Reusability: You can copy artifacts from one stage to multiple final images.
Hands-On Tasks
Task 1: Convert a Single-Stage Go App to Multi-stage
- Create a simple Go program main.go:
package main

import "fmt"

func main() {
    fmt.Println("Hello from Go multi-stage!")
}
- Create a go.mod file next to it containing the two lines module hello and go 1.20 (hello is a placeholder module name). Without a go.mod, go build -o myapp . fails.
- Write a single-stage Dockerfile (as above) and build it. Note the size.
- Create a multi-stage Dockerfile as shown (using golang:1.20 as builder and alpine or scratch as final). Build and compare sizes.
- If using scratch, run the container: docker run --rm myapp. It should print the message.
Task 2: Node.js App with Build
- Create a simple Node.js app with a build script:
  - package.json with a "build" script that generates dist/index.js (e.g., mkdir -p dist && echo "console.log('Hello')" > dist/index.js, with the double quotes escaped inside the JSON string).
  - The generated dist/index.js only needs to be a simple console.log('Hello').
- Write a multi-stage Dockerfile:
  - Stage 1: node:18 to run npm install and npm run build.
  - Stage 2: node:18-slim to copy the built files.
- Build and verify the image size compared to a single-stage version (if you also try a single-stage for comparison).
Task 3: Python with User Install
- Create a Python script app.py that uses a library (e.g., requests).
- Create requirements.txt with requests.
- Write a multi-stage Dockerfile using the pattern above (pip install --user). Build and run to confirm it works.
- Check that the final image doesn't contain compilers or other build tools. (The python:3.11-slim base still ships pip itself.)
Task 4: Debug with --target
- For the Go example, build only the builder stage:
docker build --target builder -t go-builder .
- Run a shell in that intermediate image and explore: docker run -it go-builder sh. You'll see the source and the Go toolchain.
- Then build the full image and notice that the intermediate image is separate; you could even push the builder image for caching in CI.
Task 5: Compare Image Sizes
For any of the above, use docker images to see the size difference between a naive single-stage build and your multi-stage build. Document the reduction.
Summary
Key Takeaways
- Multi-stage builds let you use multiple FROM statements in one Dockerfile.
- Each stage is independent; you can copy artifacts from earlier stages using COPY --from.
- The final image contains only the last stage's base image plus what you explicitly copy or add there, drastically reducing size and attack surface.
- Common patterns exist for compiled languages (Go, Rust, C), interpreted languages with build steps (Node, Python with native extensions), and Java.
- Use --target to stop at a specific stage for debugging or testing.
- Always aim to use the smallest possible base image in the final stage (e.g., alpine, slim, scratch).
Check Your Understanding
- What problem do multi-stage builds solve?
- How do you refer to a previous stage by name when copying files?
- If you have three stages and don't copy anything from stage 2, is stage 2 included in the final image?
- What base image would you use for a statically compiled Go binary to achieve the smallest possible image?
- How can you build only up to a specific stage, and why would you want to do that?
- Write a Dockerfile snippet that copies a file named app.jar from a stage named builder into the current stage's /app directory.
Answers
- They solve the problem of bloated images by separating build-time dependencies (compilers, build tools) from runtime dependencies, so only the final artifact and necessary runtime files end up in the production image.
- Using COPY --from=stage_name, where stage_name is the name defined with AS in the FROM instruction.
- No. Only the final stage becomes the image. Intermediate stages are used only during the build process.
- scratch, an empty base image. It's perfect for statically compiled binaries because they don't need any runtime libraries.
- With docker build --target stage_name. You'd use this for debugging, testing intermediate stages, or creating separate images for development vs. production.
- COPY --from=builder /app/target/app.jar /app/app.jar (adjust paths as needed).
Additional Resources
- Docker documentation: Multi-stage builds
- Multi-stage builds: patterns and examples
- More language-specific examples (official)
- BuildKit and multi-stage
Next Up
In the next lesson, we'll cover tagging and pushing images to registries, so you can share your optimized images with the world (or your team). See you there!