Lesson 3.1: Container Storage Basics

Welcome to Phase 3! So far, you've built, tagged, and pushed images, and you've run containers. But every container you've run so far has been stateless: any data created inside the container vanished when the container was removed. In this lesson, we'll explore why containers are ephemeral by default, and lay the groundwork for persistent storage solutions like volumes and bind mounts.

Learning Objectives

By the end of this lesson, you will be able to:

Explain why containers are ephemeral and how the writable container layer works.
Understand the difference between image layers and the container layer.
Observe data loss when a container is removed.
List common use cases where persistent storage is needed.
Describe the storage drivers Docker uses (conceptually) and how they affect performance.
Identify the three main ways to manage data in Docker: volumes, bind mounts, and tmpfs mounts.

1. The Ephemeral Nature of Containers

A container is an instance of an image, with a thin writable layer added on top of the image's read-only layers. When you create a file, modify a configuration, or install software inside a running container, all those changes are written to this writable layer.

Key points:

The writable layer exists only as long as the container exists.
When you stop and restart a container, the writable layer persists. The exception is a container started with --rm, which Docker removes automatically as soon as it exits.
When you remove a container (docker rm), the writable layer is deleted, and all data stored there is lost.

This design is intentional: containers are meant to be disposable. It makes scaling, updating, and replacing containers simple and predictable.
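
To make the writable layer more concrete, here is a minimal sketch that writes a few megabytes inside a container and then asks Docker how large that container's writable layer is. The container name layer-size-demo and the file /big.bin are placeholders chosen for this example, and the alpine image is an assumption; any small image with a shell should behave the same way.

```shell
# Hypothetical container name used only for this sketch.
name=layer-size-demo

# Write about 5 MB into the container's writable layer, then let it exit.
docker run --name "$name" alpine sh -c 'dd if=/dev/zero of=/big.bin bs=1M count=5'

# Size is the writable layer; the "virtual" figure also counts the shared image layers.
docker ps -a --size --filter "name=$name" --format '{{.Names}}: {{.Size}}'

# Removing the container deletes that writable layer.
docker rm "$name"
```

The reported size should be roughly 5 MB, since the only thing written was the file created by dd.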

1.1. Why Ephemeral Is Good

Immutability: You can replace a container with a fresh one without worrying about leftover state.
Scalability: Easily spin up multiple copies of a container; each has its own isolated writable layer.
Consistency: Every container started from the same image begins with the same filesystem, regardless of what earlier containers did.

1.2. When Ephemeral Is a Problem

Many applications need to preserve data across container restarts and removals:

Databases (MySQL, PostgreSQL)
Content management systems (WordPress)
File uploads, logs, configuration
Any stateful service

For these, you need persistent storage that outlives the container.

2. Where Does Data Live?

Docker stores images and containers in its storage directory (usually /var/lib/docker on Linux). Inside, the storage driver (e.g., overlay2) manages the layers.

2.1. Image Layers

Read-only layers that are shared across containers.

2.2. Container Layer

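The writable layer unique to each container. When a container changes a file that comes from an image layer, the storage driver first copies that file into the writable layer (copy-on-write), so the image itself is never modified.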

2.3. Volumes

Volumes are directories that Docker creates and manages outside any container's writable layer, so they survive container removal. We'll dive deep into volumes in the next lesson.

2.4. Bind Mounts

Bind mounts let you mount a file or directory from the host machine into a container. The data lives on the host filesystem, so it persists, but the container gets direct access to that part of the host.

2.5. tmpfs Mounts

In-memory storage (Linux only) that is not persisted to disk and disappears when the container stops; useful for temporary files or secrets.

3. Demonstration: Data Loss on Container Removal

Let's see this in action.

3.1. Create a Container and Add Data

Run an Ubuntu container and create a file inside:

bash

docker run -it --name demo ubuntu bash

Inside the container:

bash

echo "Important data" > /tmp/data.txt
exit

3.2. Check the Container and Data

List containers:

bash

docker ps -a

You'll see the demo container in Exited state.

Restart the container and verify the file is still there:

bash

docker start -i demo

Inside:

bash

cat /tmp/data.txt   # Outputs: Important data
exit

3.3. Remove the Container and Lose Data

Now remove the container:

bash

docker rm demo

Run a new container from the same image and try to find the file:

bash

docker run --rm ubuntu cat /tmp/data.txt

The command fails with cat: /tmp/data.txt: No such file or directory. The data is gone.

This illustrates that data stored in the writable layer does not survive container removal.
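
The same experiment can be scripted without an interactive shell, which is handy for re-running it. This is only a sketch: the container name ephemeral-demo is made up for the example, alpine stands in for ubuntu to keep the download small, and a long sleep keeps the container alive between steps.

```shell
# Hypothetical name for this sketch; pick any unused container name.
name=ephemeral-demo

# Start a long-running container and write a file into its writable layer.
docker run -d --name "$name" alpine sleep 300
docker exec "$name" sh -c 'echo "Important data" > /tmp/data.txt'

# Stop and restart: the writable layer, and the file in it, are kept.
docker stop -t 1 "$name"
docker start "$name"
docker exec "$name" cat /tmp/data.txt

# Remove the container: its writable layer is deleted with it.
docker rm -f "$name"

# A fresh container from the same image starts without the file.
docker run --rm alpine cat /tmp/data.txt || echo "The file is gone"
```

The -t 1 on docker stop shortens the wait before Docker force-stops the sleep process; it does not change what happens to the data.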

4. Why Not Just Keep Containers Forever?

You could choose not to remove containers, but that leads to:

Accumulation of stopped containers consuming disk space.
Inability to easily update the image (you'd have to recreate the container).
State tied to a specific container instance, making scaling or moving workloads difficult.

Instead, we separate data from the container lifecycle.
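
If you want to see how much stopped containers cost on your own machine, the following commands are one way to check. They only read information and remove nothing; the numbers depend entirely on what you have run so far.

```shell
# List stopped containers along with the size of their writable layers.
docker ps -a --filter status=exited --format 'table {{.Names}}\t{{.Status}}\t{{.Size}}'

# Summarize disk usage by images, containers, volumes, and build cache.
docker system df
```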

5. Storage Drivers: Under the Hood (Conceptual)

Docker uses a storage driver to manage the layers and the container's writable layer. Common storage drivers:

overlay2: Default on modern Linux, efficient and stable.
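aufs and devicemapper: Legacy drivers, deprecated and removed from recent Docker Engine releases.
btrfs and zfs: Specialized drivers for hosts whose backing filesystem is Btrfs or ZFS.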

The storage driver affects performance, especially for write-heavy workloads, because writes to the container layer go through the driver's layering (copy-on-write) logic. Volumes and bind mounts bypass the storage driver: the container reads and writes those directories directly on the host filesystem, which often improves performance.

You can see the storage driver with docker info | grep "Storage Driver".
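
If you want these values in a script, docker info also accepts a Go template through --format. The sketch below assumes the Driver and DockerRootDir template fields, which correspond to the "Storage Driver" and "Docker Root Dir" lines of the plain-text output; the variable names are just for illustration.

```shell
# Read the storage driver and Docker's data directory from the daemon.
driver="$(docker info --format '{{.Driver}}')"
root_dir="$(docker info --format '{{.DockerRootDir}}')"

echo "Storage driver:  $driver"
echo "Docker root dir: $root_dir"
```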

6. Introduction to Persistent Storage Options

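Docker provides three main ways to mount data into a container. Volumes and bind mounts persist data beyond the container's life; tmpfs mounts are deliberately temporary: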

6.1. Volumes

Managed by Docker: Stored in /var/lib/docker/volumes/ (on Linux).
Portable: Managed with Docker CLI commands (docker volume create, ls, inspect, rm) and easier to back up or migrate than bind mounts.
Preferred for production: Volumes are the recommended way to persist data because they do not depend on the host's directory layout and work on all platforms (including Docker Desktop).
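
As a brief preview, the sketch below writes a file into a volume from one container and reads it back from a second one. The volume name lesson-demo-data and the mount path /data are placeholders invented for this example, and alpine is assumed as the image.

```shell
# Hypothetical volume name for this sketch.
vol=lesson-demo-data
docker volume create "$vol"

# First container writes into the volume, then is removed automatically (--rm).
docker run --rm -v "$vol":/data alpine sh -c 'echo "Important data" > /data/data.txt'

# A second, brand-new container still sees the file.
docker run --rm -v "$vol":/data alpine cat /data/data.txt

# Show where Docker keeps the volume on the host, then clean up.
docker volume inspect --format '{{.Mountpoint}}' "$vol"
docker volume rm "$vol"
```

Both containers are gone by the time the second command finishes; the data outlives them because it never lived in a writable layer.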

6.2. Bind Mounts

Host-controlled: You mount a specific directory from the host into the container.
Flexible: Great for development (live code reload) or providing configuration files.
Less portable: Paths are host-specific, and mounting sensitive host directories can be a security risk.
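
A minimal sketch of a bind mount in both directions follows. It assumes a Linux host, or a Docker Desktop setup where the temporary directory is shared with the Docker VM; the file names and the /data mount path are placeholders made up for this example.

```shell
# Create a throwaway host directory with one file in it.
host_dir="$(mktemp -d)"
echo "written on the host" > "$host_dir/from-host.txt"

# The container reads the host's file and writes one of its own.
docker run --rm -v "$host_dir":/data alpine sh -c '
  cat /data/from-host.txt
  echo "written in the container" > /data/from-container.txt
'

# The container is gone, but its file is on the host.
cat "$host_dir/from-container.txt"

# Clean up the temporary directory.
rm -rf "$host_dir"
```

Note that the file created by the container is typically owned by root on the host, which is one reason bind mounts need more care than volumes.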

6.3. tmpfs Mounts

In-memory: Stored in the host's memory (Linux only), outside the container's writable layer; not persisted on the host disk and gone once the container stops.
Ephemeral: Useful for temporary data that should not persist (e.g., secrets, cache).
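
To see a tmpfs mount from the inside, the sketch below mounts one at /scratch (a path made up for this example), writes a file there, and prints the matching line from /proc/mounts. It assumes a Linux container and the alpine image.

```shell
# Mount an in-memory filesystem at /scratch, write to it,
# and show how the kernel reports the mount.
docker run --rm --tmpfs /scratch alpine sh -c '
  echo "temporary" > /scratch/note.txt
  grep " /scratch " /proc/mounts
'
```

The printed line should list tmpfs as the filesystem type for /scratch. Once the container exits, the mount and the file are discarded.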

We'll cover each in depth in the next lessons.

Hands-On Tasks

Task 1: Verify Data Loss

Run an interactive Alpine container, create a file in /tmp, and exit.
Remove the container.
Run a new container from the same image and confirm the file is missing.

Task 2: Explore Storage Driver and Docker Root

Run docker info and look for "Storage Driver" and "Docker Root Dir".
If you're on Linux, navigate to /var/lib/docker (requires root) and see the overlay2 subdirectories. (On Docker Desktop, this is inside a VM; you can explore with docker run -it --privileged --pid=host alpine nsenter -t 1 -m -u -i -n sh but that's advanced.)

Task 3: Observe the Container Layer

Run a container with --name test and create a file.
Use docker diff test to see which files have been added (A), changed (C), or deleted (D) in the container's writable layer.
Remove the container and note the changes disappear.

Task 4: Run a Container with `--rm` and Attempt to Preserve Data

Run docker run --rm -it alpine sh, create a file, then exit. What happened to the container? Why couldn't you inspect it later?

Task 5: Multiple Containers from Same Image

Run two containers from the same image (ubuntu or alpine) in detached mode (with a sleep command).
In one container, create a file. Is it visible in the other? (No, because each container has its own writable layer.)
Stop and remove them.

Summary

Key Takeaways

Containers are ephemeral by design; data written inside the container's writable layer is lost when the container is removed.
The writable layer is separate from the image layers and is unique to each container.
For data that must outlive a container, Docker offers volumes and bind mounts; tmpfs mounts hold temporary data in memory.
Storage drivers manage the layering; using volumes/bind mounts bypasses the storage driver for those paths.
Understanding where data lives is the first step to designing stateful containerized applications.

Check Your Understanding

What happens to the data stored in a container's writable layer when the container is removed?
Why would a database need persistent storage in a container?
What command shows the changes made to a container's filesystem?
Name the three ways Docker can mount data into a container. Which of them persist data?
What is the main difference between volumes and bind mounts in terms of Docker management?

Answers

The data is permanently deleted. The writable layer is tied to the container and removed with it.
Databases write data to disk that must persist across container restarts, updates, and removals. Without persistent storage, the database would lose all its data as soon as the container is removed or replaced, for example during an image upgrade.
docker diff <container> shows added (A), changed (C), and deleted (D) files.
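Volumes, bind mounts, and tmpfs mounts. Only volumes and bind mounts persist data; a tmpfs mount lives in memory and is discarded when the container stops.
Volumes are created and managed by Docker (through docker volume commands) and stored in a location Docker controls. Bind mounts map an existing host path directly into the container and are managed by the user.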

Next Up

In the next lesson, we'll dive into volumes, the recommended way to persist data in production. We'll create volumes, inspect them, and see how they survive container removal. See you there!
