Lesson 2.1: Dockerfile Basics
Welcome to Phase 2! Now that you can run pre-built images, it's time to create your own. Dockerfiles are the recipe files that define how to build a Docker image. In this lesson, you'll learn the fundamental instructions, how to build an image, and best practices to get started.
Learning Objectives
By the end of this lesson, you will be able to:
- Explain what a Dockerfile is and why it's used.
- Write a simple Dockerfile using FROM, RUN, CMD, and ENTRYPOINT.
- Understand the concept of build context.
- Build an image from a Dockerfile using docker build.
- Run a container from your custom image.
- Follow basic best practices for writing Dockerfiles.
1. What is a Dockerfile?
A Dockerfile is a text document that contains all the commands a user could call on the command line to assemble an image. Using docker build, you can create an automated build that executes several command-line instructions in succession.
Think of a Dockerfile as a recipe:
- It starts with a base image (like a foundation).
- Each instruction adds a layer (like ingredients).
- The final image is the finished dish.
Why Dockerfiles?
- Reproducibility: Anyone can build the exact same image.
- Version control: Dockerfiles can be stored in Git.
- Automation: Integrate with CI/CD pipelines.
- Transparency: See exactly how an image was built.
2. Essential Dockerfile Instructions
Here are the most common instructions you'll use.
2.1. FROM
Every Dockerfile must begin with a FROM instruction; only comments, parser directives, and ARG instructions may come before it. FROM sets the base image for subsequent instructions.
Syntax:
FROM <image>[:<tag>] [AS <name>]
- <image>: The base image (e.g., ubuntu, alpine, python).
- <tag>: Optional tag (defaults to latest).
- AS <name>: Optional name for multi-stage builds (covered later).
Example:
FROM ubuntu:22.04
2.2. RUN
The RUN instruction executes commands in a new layer on top of the current image and commits the results. It's typically used to install packages, create directories, or run build steps.
Two forms:
- Shell form: RUN <command> (runs in a shell, /bin/sh -c by default on Linux, so &&, pipes, and variable expansion work).
  RUN apt-get update && apt-get install -y curl
- Exec form: RUN ["executable", "param1", "param2"] (runs the executable directly without a shell, which avoids shell string munging but also means shell features are unavailable).
  RUN ["apt-get", "update"]
  RUN ["apt-get", "install", "-y", "curl"]
TIP
Each RUN creates a new layer. To reduce layers and image size, combine commands using && (as shown above).
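As a rough sketch of the tip above, the fragment below installs two packages in a single RUN, using line continuations to keep the chain readable. The base image and the package names are only illustrative assumptions; substitute whatever your image needs.

```dockerfile
# Illustrative base image; any Debian/Ubuntu-based image works the same way.
FROM ubuntu:22.04

# One RUN, one layer: refresh the package index, install, then remove the
# downloaded package lists before the layer is committed.
RUN apt-get update \
    && apt-get install -y --no-install-recommends \
        ca-certificates \
        curl \
    && rm -rf /var/lib/apt/lists/*
```

Because the package lists are removed in the same RUN that downloaded them, they are never stored in any layer of the image.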
2.3. CMD
The CMD instruction provides defaults for an executing container. It can be overridden when running the container. Only one CMD takes effect per Dockerfile; if you list several, only the last one is used.
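To see the "last CMD wins" rule in action, consider this small sketch. The alpine:3.19 tag and the image name cmd-demo mentioned afterwards are assumptions chosen for illustration.

```dockerfile
FROM alpine:3.19

# This CMD is silently discarded because another CMD follows it.
CMD ["echo", "first"]

# Only the last CMD in the file becomes the container's default command.
CMD ["echo", "second"]
```

If the image is built as cmd-demo, docker run cmd-demo should print second, and docker run cmd-demo echo override should print override, since arguments after the image name replace CMD entirely.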
Three forms:
- Exec form (preferred): CMD ["executable", "param1", "param2"]
  CMD ["nginx", "-g", "daemon off;"]
- Shell form: CMD command param1 param2 (runs in /bin/sh -c).
- Parameter form: CMD ["param1", "param2"] (used as default arguments to ENTRYPOINT).
2.4. ENTRYPOINT
ENTRYPOINT configures a container that will run as an executable. Unlike CMD, it is not replaced by arguments passed to docker run. With the exec form, those arguments (or the CMD defaults when none are given) are appended to the entrypoint. Replacing the entrypoint itself requires the --entrypoint flag.
Exec form:
ENTRYPOINT ["python"]
CMD ["app.py"]
Now docker run myimage runs python app.py. If you run docker run myimage script.py, it runs python script.py (overriding CMD but preserving ENTRYPOINT).
TIP
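When to use which?
- Use ENTRYPOINT when you want the container to act like the binary itself.
- Use CMD for default arguments or simple commands that users might override.
Shell form: Avoid the shell form of ENTRYPOINT. It starts your command as a child of /bin/sh -c, which does not pass signals on, so the process is not PID 1 and does not receive the SIGTERM sent by docker stop. It also ignores CMD and docker run arguments.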
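The difference between the two ENTRYPOINT forms is easier to see side by side. This sketch assumes an app.py in the build context, as in the examples later in this lesson; the shell-form line is commented out and shown only for contrast.

```dockerfile
FROM python:3.11-slim
WORKDIR /app
# app.py is assumed to exist next to this Dockerfile.
COPY app.py .

# Exec form: python is started directly as PID 1, so signals from
# `docker stop` are sent to python itself rather than to a wrapping shell.
ENTRYPOINT ["python", "app.py"]

# Shell form (for comparison only): Docker would start
# /bin/sh -c "python app.py", making python a child of the shell.
# ENTRYPOINT python app.py
```

Keeping the exec form also leaves room for CMD or docker run arguments to be appended, which the shell form would ignore.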
2.5. LABEL (optional but recommended)
Adds metadata to an image.
LABEL version="1.0"
LABEL maintainer="yourname@example.com"
2.6. WORKDIR
Sets the working directory for any RUN, CMD, ENTRYPOINT, COPY, and ADD instructions that follow. If the directory doesn't exist, it's created.
WORKDIR /app
2.7. COPY and ADD
- COPY <src> <dest> copies files/directories from the build context into the image.
- ADD can also fetch files from URLs and automatically extract local tar archives.
Prefer COPY for local files (more transparent).
COPY . /app
3. Build Context
When you run docker build, you specify a build context – usually a directory path. Docker sends the entire context (recursively) to the Docker daemon. The context is where your source files and Dockerfile live.
Syntax:
docker build [OPTIONS] PATH
- PATH is the build context (often . for current directory).
- Docker looks for a Dockerfile at the root of the context by default. Use -f to specify a different file.
WARNING
Do not include unnecessary files in the context (like node_modules, .git). Use a .dockerignore file to exclude them.
Example .dockerignore:
node_modules
.git
*.log
4. Building an Image
4.1. Simple Dockerfile Example (Python App)
Create a new directory for your project:
mkdir mypythonapp
cd mypythonapp
Create a simple Python script app.py:
print("Hello from Docker!")
Now create a file named Dockerfile (no extension) with the following content:
# Use an official Python runtime as base image
FROM python:3.11-slim
# Set the working directory in the container
WORKDIR /app
# Copy the current directory contents into the container at /app
COPY . /app
# Run the script when the container launches
CMD ["python", "app.py"]
4.2. Build the Image
Run the build command:
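docker build -t mypythonapp .
- -t mypythonapp tags the image with a name.
- The final . is the build context (current directory).
Output: You'll see each step (instruction) executed. The exact format depends on the builder: the classic builder prints a hash for each layer and ends with Successfully tagged mypythonapp:latest, while BuildKit (the default in current Docker releases) prints numbered steps and ends with a line naming the image, such as naming to docker.io/library/mypythonapp:latest.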
4.3. Run the Container
docker run mypythonapp
Output: Hello from Docker!
4.4. Override the Command
Because we used CMD, we can override it:
docker run mypythonapp python -c "print('Overridden')"
Output: Overridden
5. More Examples
Example: Node.js App
app.js:
console.log("Hello from Node!");
Dockerfile:
FROM node:18-alpine
WORKDIR /app
COPY . .
CMD ["node", "app.js"]
Build and run similarly.
Example: Using ENTRYPOINT
Suppose you want a container that always runs ping to a specified host:
FROM alpine:latest
ENTRYPOINT ["ping"]
CMD ["localhost"]
Now, with the image built as myping:
- docker run myping pings localhost.
- docker run myping google.com pings google.com.
6. Basic Best Practices
Key Takeaways
- Use specific tags for base images (e.g., python:3.11-slim not python:latest) to avoid unexpected changes.
- Combine RUN commands to reduce layers (e.g., RUN apt-get update && apt-get install -y package).
- Clean up temporary files in the same RUN to keep layers small (e.g., && rm -rf /var/lib/apt/lists/*).
- Order instructions from least to most frequently changing to leverage caching.
- Use .dockerignore to exclude unnecessary files from the build context.
- Prefer COPY over ADD unless you need URL fetching or tar extraction.
- Run as non-root (we'll cover later) for security.
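The ordering advice is perhaps the least obvious item in this list, so here is a minimal annotated sketch of it. It assumes a Python project with a requirements.txt and an app.py in the build context; both file names are placeholders for your own.

```dockerfile
# Pinned tag instead of python:latest, so rebuilds start from the same base.
FROM python:3.11-slim
WORKDIR /app

# Changes rarely: copy only the dependency manifest first, so this layer
# and the install below stay cached while you edit source files.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Changes often: copy the rest of the source last.
COPY . .
CMD ["python", "app.py"]
```

With this layout, editing app.py should only re-run the final COPY, while editing requirements.txt re-runs the install as well.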
Hands-On Tasks
Task 1: Write a Dockerfile for a Simple Script
- Create a directory hello-bash.
- Inside, create a file hello.sh:
  #!/usr/bin/env bash
  echo "Hello from Bash!"
  Make it executable: chmod +x hello.sh. (The official bash image installs Bash at /usr/local/bin/bash rather than /bin/bash, so a #!/bin/bash shebang would fail there.)
- Create a Dockerfile that:
  - Uses bash:latest as base.
  - Copies hello.sh into the container.
  - Runs ./hello.sh when the container starts.
- Build the image with tag hello-bash.
- Run it and verify output.
Task 2: Experiment with CMD and ENTRYPOINT
- Create a Dockerfile based on alpine that:
  - Uses ENTRYPOINT for echo.
  - Uses CMD with default message "Hello, World!".
- Build as echocontainer.
- Run without arguments – should print "Hello, World!".
- Run with an argument: docker run echocontainer "Custom message" – should print "Custom message".
- Try to override the entrypoint with the --entrypoint flag (advanced).
Task 3: Layer Caching
- Create a Dockerfile for a Python app with multiple steps:
  FROM python:3.11-slim
  WORKDIR /app
  COPY requirements.txt . (create an empty requirements.txt for now)
  RUN pip install -r requirements.txt
  COPY . .
  CMD ["python", "app.py"]
- Build the image (first build).
- Modify app.py (add a comment) and rebuild. Notice that the COPY . . step runs again because a copied file changed, while the earlier COPY requirements.txt . and RUN pip install steps are served from the cache.
- Now modify requirements.txt (add a comment) and rebuild. Observe that the RUN pip install step is rerun because requirements.txt changed. This demonstrates why you should copy dependency files separately before copying source code.
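For contrast, here is a sketch of the ordering this task argues against. It uses the same assumed files (requirements.txt and app.py) but copies everything before installing dependencies.

```dockerfile
FROM python:3.11-slim
WORKDIR /app

# Everything is copied in one step, so an edit to any file in the
# build context invalidates this layer...
COPY . .

# ...and every later instruction, including the dependency install,
# has to run again.
RUN pip install --no-cache-dir -r requirements.txt
CMD ["python", "app.py"]
```

Rebuilding this variant after a one-line change to app.py should re-run pip install each time, which is the cost the separate COPY requirements.txt . step avoids.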
Task 4: Use .dockerignore
- Create a file .dockerignore in your project directory with:
  *.log
  temp/
- Create a dummy log file (touch test.log) and a temp directory (mkdir temp).
- Build the image and verify (with docker build --no-cache to force a fresh build) that those files are not included. You can check by running a container and listing files.
Task 5: Inspect the Image Layers
- After building an image, run docker history <image> to see the layers and their sizes.
- Compare the size of an image built with combined RUN commands vs. separate RUNs.
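One way to set up this comparison is to build the variant below, which splits the work across three RUN instructions, and then build a second version that chains the same three commands with && in one RUN. The base image and package are illustrative assumptions.

```dockerfile
FROM ubuntu:22.04

# Layer 1: the downloaded package lists are committed into this layer.
RUN apt-get update

# Layer 2: the installed package.
RUN apt-get install -y curl

# Layer 3: this only hides the package lists from the final filesystem;
# the data is still stored in layer 1.
RUN rm -rf /var/lib/apt/lists/*
```

In docker history, expect the apt-get update layer to keep its size even though the last instruction deletes the package lists: a later layer can hide files from an earlier one but cannot shrink it.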
Summary
Key Takeaways
- A Dockerfile is a script of instructions to build an image.
- Key instructions: FROM, RUN, CMD, ENTRYPOINT, COPY, WORKDIR.
- Build context is the directory sent to the daemon during docker build.
- Use .dockerignore to exclude unnecessary files.
- docker build -t name . creates an image from a Dockerfile.
- Run your custom image with docker run.
Check Your Understanding
- What is the purpose of the FROM instruction in a Dockerfile?
- Why is it a good practice to combine multiple shell commands into a single RUN instruction?
- What is the difference between CMD and ENTRYPOINT?
- If you have a Dockerfile that copies your source code, and you change only a comment in the source, will Docker reuse cached layers for the COPY step? Why or why not?
- What is the build context, and how do you specify it?
- How can you prevent node_modules or __pycache__ from being sent to the Docker daemon during build?
Answers
- FROM sets the base image that subsequent instructions will build on top of. It must be the first instruction in a Dockerfile (only comments, parser directives, and ARG may precede it).
- Combining commands reduces the number of layers and keeps the image smaller, since each layer adds overhead. It also ensures temporary files are cleaned up in the same layer.
- CMD provides defaults that can be overridden at runtime with docker run arguments. ENTRYPOINT configures the container to act as an executable, and arguments are appended to it rather than replacing it.
- No. Docker decides whether to reuse the COPY layer by comparing a checksum of the copied files. A comment change modifies the file content, so the checksum differs, invalidating the cache for that layer and every layer after it.
- The build context is the directory whose contents are sent to the Docker daemon during build. You specify it as the last argument to docker build, e.g., docker build . for the current directory.
- By creating a .dockerignore file in the build context directory, listing the files and directories to exclude.
Additional Resources
- Dockerfile reference (official)
- Best practices for writing Dockerfiles
- .dockerignore file reference
- Understanding image layers
Next Up
In the next lesson, we'll dive deeper into layering and caching, and how to optimize your Dockerfiles for speed and size. See you there!