Lesson 2.1: Dockerfile Basics

Welcome to Phase 2! Now that you can run pre-built images, it's time to create your own. Dockerfiles are the recipe files that define how to build a Docker image. In this lesson, you'll learn the fundamental instructions, how to build an image, and best practices to get started.


Learning Objectives

TIP

By the end of this lesson, you will be able to:

  • Explain what a Dockerfile is and why it's used.
  • Write a simple Dockerfile using FROM, RUN, CMD, and ENTRYPOINT.
  • Understand the concept of build context.
  • Build an image from a Dockerfile using docker build.
  • Run a container from your custom image.
  • Follow basic best practices for writing Dockerfiles.

1. What is a Dockerfile?

A Dockerfile is a text document that contains all the commands a user could call on the command line to assemble an image. Using docker build, you can create an automated build that executes several command-line instructions in succession.

Think of a Dockerfile as a recipe:

  • It starts with a base image (like a foundation).
  • Each instruction adds a layer (like ingredients).
  • The final image is the finished dish.

Why Dockerfiles?

  • Reproducibility: Anyone with the Dockerfile and build context can rebuild the image the same way.
  • Version control: Dockerfiles can be stored in Git.
  • Automation: Integrate with CI/CD pipelines.
  • Transparency: See exactly how an image was built.

2. Essential Dockerfile Instructions

Here are the most common instructions you'll use.

2.1. FROM

Every Dockerfile must begin with a FROM instruction; only comments, parser directives, and ARG instructions may come before it. FROM sets the base image for subsequent instructions.

Syntax:

dockerfile
FROM <image>[:<tag>] [AS <name>]
  • <image>: The base image (e.g., ubuntu, alpine, python).
  • <tag>: Optional tag (defaults to latest).
  • AS <name>: Optional name for multi-stage builds (covered later).

Example:

dockerfile
FROM ubuntu:22.04

2.2. RUN

The RUN instruction executes commands in a new layer on top of the current image and commits the results. It's typically used to install packages, create directories, or run build steps.

Two forms:

  • Shell form: RUN <command> (runs in /bin/sh -c).
    dockerfile
    RUN apt-get update && apt-get install -y curl
  • Exec form: RUN ["executable", "param1", "param2"] (runs the executable directly without a shell, so there is no shell string munging, but also no &&, pipes, or variable expansion).
    dockerfile
    RUN ["apt-get", "update"]
    RUN ["apt-get", "install", "-y", "curl"]

TIP

Each RUN creates a new layer. To reduce layers and image size, combine commands using && (as shown above).
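
As a rough illustration of the tip above, the sketch below contrasts separate and combined RUN instructions. The base image and the curl package come from the earlier examples; the cleanup path is the apt package-list cache on Debian and Ubuntu images, and the --no-install-recommends flag is an optional extra that is not part of the original example.

```dockerfile
FROM ubuntu:22.04

# Separate instructions (shown commented out): three layers, and the package
# lists stay in the first layer even though the last step deletes them.
# RUN apt-get update
# RUN apt-get install -y curl
# RUN rm -rf /var/lib/apt/lists/*

# Combined: one layer, and the package lists are removed before it is committed.
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*
```

Deleting files in a later RUN does not shrink the image, because the earlier layer still contains them. That is why the cleanup belongs in the same instruction.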

2.3. CMD

The CMD instruction provides the default command (or default arguments) for a running container. It can be overridden when running the container. Only one CMD takes effect per Dockerfile; if you list several, only the last is used.

Three forms:

  • Exec form (preferred): CMD ["executable", "param1", "param2"]
    dockerfile
    CMD ["nginx", "-g", "daemon off;"]
  • Shell form: CMD command param1 param2 (runs in /bin/sh -c).
  • Parameter form: CMD ["param1", "param2"] (used as default arguments to ENTRYPOINT).
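
To make the difference between the shell and exec forms concrete, here is a minimal sketch. The GREETING variable and the alpine:3.20 base image are assumptions chosen only for illustration.

```dockerfile
FROM alpine:3.20

# GREETING is a hypothetical variable used only for this illustration.
ENV GREETING="Hello"

# Shell form: the command runs through /bin/sh -c, so the shell expands $GREETING.
CMD echo "$GREETING from CMD"

# Exec form does no shell processing; this variant would print the literal
# text $GREETING instead of its value:
# CMD ["echo", "$GREETING from CMD"]

# To keep the exec form and still get expansion, invoke a shell explicitly:
# CMD ["sh", "-c", "echo \"$GREETING from CMD\""]
```

Only one CMD is left uncommented because only the last CMD in a Dockerfile takes effect.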

2.4. ENTRYPOINT

ENTRYPOINT configures a container to run as an executable. Unlike CMD, it is not replaced by arguments passed to docker run: those arguments replace any CMD defaults and are appended to the entrypoint. To change the entrypoint itself, use the --entrypoint flag.

Exec form:

dockerfile
ENTRYPOINT ["python"]
CMD ["app.py"]

Now docker run myimage runs python app.py. If you run docker run myimage script.py, it runs python script.py (overriding CMD but preserving ENTRYPOINT).

TIP

When to use?

  • Use ENTRYPOINT when you want the container to act like the binary itself.
  • Use CMD for default arguments or simple commands that users might override.

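Shell form: Avoid it. The command is started under /bin/sh -c, which does not pass signals on to it, so docker stop may not shut the process down cleanly. CMD and docker run arguments are also ignored.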
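
The two ENTRYPOINT forms can be sketched side by side. This assumes the app.py script used elsewhere in this lesson; how signals behave in the shell form can vary with the shell in the base image, so treat the comments as a general guide rather than a guarantee.

```dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY . .

# Exec form: python is started directly as the container's main process,
# and any docker run arguments are appended after app.py.
ENTRYPOINT ["python", "app.py"]

# Shell form (avoid): the command is wrapped in /bin/sh -c, CMD and docker run
# arguments are ignored, and signals may not reach python.
# ENTRYPOINT python app.py
```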

2.5. LABEL

The LABEL instruction adds metadata to an image.

dockerfile
LABEL version="1.0"
LABEL maintainer="yourname@example.com"

2.6. WORKDIR

Sets the working directory for any RUN, CMD, ENTRYPOINT, COPY, and ADD instructions that follow. If the directory doesn't exist, it's created.

dockerfile
WORKDIR /app
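
A short sketch of how WORKDIR affects later instructions. The data directory and where.txt file are hypothetical names, and alpine:3.20 is an assumed base image.

```dockerfile
FROM alpine:3.20

# Created automatically if it does not exist.
WORKDIR /app

# A relative path is resolved against the previous WORKDIR, giving /app/data.
WORKDIR data

# Runs in /app/data and writes its output there.
RUN pwd > where.txt

# CMD also starts in the last WORKDIR, so this prints /app/data.
CMD ["cat", "where.txt"]
```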

2.7. COPY and ADD

  • COPY <src> <dest> copies files/directories from the build context into the image.
  • ADD does the same but can also fetch from URLs and extract local tar archives. Prefer COPY for local files (more transparent).
dockerfile
COPY . /app
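
The following sketch shows COPY and ADD next to each other. The static/ directory and vendor.tar.gz archive are hypothetical files assumed to exist in the build context, and alpine:3.20 is an assumed base image.

```dockerfile
FROM alpine:3.20
WORKDIR /app

# COPY: a plain copy from the build context; the relative destination
# resolves against WORKDIR, so the file lands at /app/app.py.
COPY app.py ./

# Copy a directory's contents into /app/static/.
COPY static/ ./static/

# ADD: a local tar archive is unpacked into the destination directory.
ADD vendor.tar.gz /opt/vendor/
```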

3. Build Context

When you run docker build, you specify a build context – usually a directory path. Docker sends the entire context (recursively) to the Docker daemon. The context is where your source files and Dockerfile live.

Syntax:

bash
docker build [OPTIONS] PATH
  • PATH is the build context (often . for current directory).
  • Docker looks for a Dockerfile at the root of the context by default. Use -f to specify a different file.
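
As an illustration of the -f option, the sketch below assumes a Dockerfile stored at the hypothetical path docker/Dockerfile.dev and an image name myapp:dev chosen for the example. The build command appears in the leading comments.

```dockerfile
# Hypothetical file: docker/Dockerfile.dev
# Build from the project root so that the context is still the project directory:
#   docker build -f docker/Dockerfile.dev -t myapp:dev .
FROM python:3.11-slim
WORKDIR /app

# Source paths in COPY are relative to the build context, not to this file.
COPY . .

CMD ["python", "app.py"]
```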

WARNING

Do not include unnecessary files in the context (like node_modules, .git). Use a .dockerignore file to exclude them.

Example .dockerignore:

node_modules
.git
*.log

4. Building an Image

4.1. Simple Dockerfile Example (Python App)

Create a new directory for your project:

bash
mkdir mypythonapp
cd mypythonapp

Create a simple Python script app.py:

python
print("Hello from Docker!")

Now create a file named Dockerfile (no extension) with the following content:

dockerfile
# Use an official Python runtime as base image
FROM python:3.11-slim

# Set the working directory in the container
WORKDIR /app

# Copy the current directory contents into the container at /app
COPY . /app

# Run the script when the container launches
CMD ["python", "app.py"]

4.2. Build the Image

Run the build command:

bash
docker build -t mypythonapp .
  • -t mypythonapp tags the image with a name.
  • The final . is the build context (current directory).

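Output: You'll see each step (instruction) executed. The exact format depends on the builder: the legacy builder prints a hash for each layer and ends with a message like Successfully tagged mypythonapp:latest, while BuildKit (the default in recent Docker releases) prints numbered steps and ends with a line naming the image, such as naming to docker.io/library/mypythonapp:latest.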

4.3. Run the Container

bash
docker run mypythonapp

Output: Hello from Docker!

4.4. Override the Command

Because we used CMD, we can override it:

bash
docker run mypythonapp python -c "print('Overridden')"

Output: Overridden


5. More Examples

Example: Node.js App

app.js:

javascript
console.log("Hello from Node!");

Dockerfile:

dockerfile
FROM node:18-alpine
WORKDIR /app
COPY . .
CMD ["node", "app.js"]

Build and run similarly.

Example: Using ENTRYPOINT

Suppose you want a container that always runs ping to a specified host:

dockerfile
FROM alpine:latest
ENTRYPOINT ["ping"]
CMD ["localhost"]

After building the image as myping (docker build -t myping .):

  • docker run myping pings localhost.
  • docker run myping google.com pings google.com.

6. Basic Best Practices

Key Takeaways

  • Use specific tags for base images (e.g., python:3.11-slim not python:latest) to avoid unexpected changes.
  • Combine RUN commands to reduce layers (e.g., RUN apt-get update && apt-get install -y package).
  • Clean up temporary files in the same RUN to keep layers small (e.g., && rm -rf /var/lib/apt/lists/*).
  • Order instructions from least to most frequently changing to leverage caching.
  • Use .dockerignore to exclude unnecessary files from the build context.
  • Prefer COPY over ADD unless you need URL fetching or tar extraction.
  • Run as non-root (we'll cover later) for security.
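
Several of these practices can be combined in one file. The sketch below is one possible arrangement, assuming a Python app with a requirements.txt and app.py as in the earlier examples; the curl package is only a placeholder for whatever system packages an app might need.

```dockerfile
# Pinned tag rather than python:latest.
FROM python:3.11-slim

WORKDIR /app

# System packages: install and clean up in a single layer.
# curl is only a placeholder for whatever the app needs.
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*

# Dependencies change less often than source code, so copy and install them first.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Source code changes most often, so it comes last.
COPY . .

CMD ["python", "app.py"]
```

With this ordering, editing app.py invalidates only the final COPY layer, while the package and dependency layers are reused from the cache.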

Hands-On Tasks

Task 1: Write a Dockerfile for a Simple Script

  1. Create a directory hello-bash.
  2. Inside, create a file hello.sh:
    bash
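    #!/usr/bin/env bash
    echo "Hello from Bash!"
    Make it executable: chmod +x hello.sh. The shebang uses env because the official bash image installs Bash at /usr/local/bin/bash rather than /bin/bash.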
  3. Create a Dockerfile that:
    • Uses bash:latest as base.
    • Copies hello.sh into the container.
    • Runs ./hello.sh when the container starts.
  4. Build the image with tag hello-bash.
  5. Run it and verify output.

Task 2: Experiment with CMD and ENTRYPOINT

  1. Create a Dockerfile based on alpine that:
    • Uses ENTRYPOINT for echo.
    • Uses CMD with default message "Hello, World!".
  2. Build as echocontainer.
  3. Run without arguments – should print "Hello, World!".
  4. Run with an argument: docker run echocontainer "Custom message" – should print "Custom message".
  5. Try to override the entrypoint with --entrypoint flag (advanced).

Task 3: Layer Caching

  1. Create a Dockerfile for a Python app with multiple steps:
    • FROM python:3.11-slim
    • WORKDIR /app
    • COPY requirements.txt . (create an empty requirements.txt for now)
    • RUN pip install -r requirements.txt
    • COPY . .
    • CMD ["python", "app.py"]
  2. Build the image (first build).
  3. Modify app.py (add a comment) and rebuild. Notice that the COPY . . step re-runs because a file changed, while the earlier steps, including RUN pip install, are served from the cache.
  4. Now modify requirements.txt (add a comment) and rebuild. Observe that the RUN pip install step is rerun because requirements.txt changed. This demonstrates why you should copy dependency files separately before copying source code.

Task 4: Use .dockerignore

  1. Create a file .dockerignore in your project directory with:
    *.log
    temp/
  2. Create a dummy log file (touch test.log) and a temp directory (mkdir temp).
  3. Build the image and verify (with docker build --no-cache to force fresh) that those files are not included. You can check by running a container and listing files.

Task 5: Inspect the Image Layers

  1. After building an image, run docker history <image> to see the layers and their sizes.
  2. Compare the size of an image built with combined RUN commands vs. separate RUNs.

Summary

Key Takeaways

  • A Dockerfile is a script of instructions to build an image.
  • Key instructions: FROM, RUN, CMD, ENTRYPOINT, COPY, WORKDIR.
  • Build context is the directory sent to the daemon during docker build.
  • Use .dockerignore to exclude unnecessary files.
  • docker build -t name . creates an image from a Dockerfile.
  • Run your custom image with docker run.

Check Your Understanding

  1. What is the purpose of the FROM instruction in a Dockerfile?
  2. Why is it a good practice to combine multiple shell commands into a single RUN instruction?
  3. What is the difference between CMD and ENTRYPOINT?
  4. If you have a Dockerfile that copies your source code, and you change only a comment in the source, will Docker reuse cached layers for the COPY step? Why or why not?
  5. What is the build context, and how do you specify it?
  6. How can you prevent node_modules or __pycache__ from being sent to the Docker daemon during build?

Answers

  1. FROM sets the base image that subsequent instructions build on. It must be the first instruction in a Dockerfile, apart from comments, parser directives, and ARG.
  2. Combining commands reduces the number of layers and keeps the image smaller, since each layer adds overhead. It also ensures temporary files are cleaned up in the same layer.
  3. CMD provides defaults that can be overridden at runtime with docker run arguments. ENTRYPOINT configures the container to act as an executable, and arguments are appended to it rather than replacing it.
  4. No. Docker computes a checksum of the copied files and compares it with the one recorded for the cached layer. A comment change modifies the file content, so the checksum differs and the cache is invalidated for the COPY step and every step after it.
  5. The build context is the directory whose contents are sent to the Docker daemon during build. You specify it as the last argument to docker build, e.g., docker build . for the current directory.
  6. By creating a .dockerignore file in the build context directory, listing the files and directories to exclude.

Next Up

In the next lesson, we'll dive deeper into layering and caching, and how to optimize your Dockerfiles for speed and size. See you there!