Docker Volumes and Storage: Persistent Data Strategies (2026)

Containers are ephemeral by design: when a container is removed, its writable layer is gone. For databases, uploaded files, logs, and any other data that must outlive the container, you need storage outside that layer. Docker offers three mount types: named volumes (managed by Docker, best for databases), bind mounts (map host directories, best for development), and tmpfs mounts (in-memory, best for sensitive temporary data). Only the first two persist data; tmpfs is deliberately non-persistent. This phase covers all three in depth, plus volume drivers for network and cloud storage, database backup and restore patterns, and sharing data between containers.

Storage Types Compared

# Three ways to keep data outside the container's writable layer:
#
# Named Volume          Bind Mount              tmpfs
# ─────────────────     ─────────────────────   ────────────────────
# Managed by Docker     Maps a host path         In-memory only
# /var/lib/docker/      Any directory you own    Not on disk at all
# volumes/name/
#
# Best for:             Best for:               Best for:
# Databases             Dev live-reload         Secrets/tokens
# Persistent app data   Config files            Scratch space
# Production            Source code mounts      Cache that shouldn't
#                                               survive restarts
#
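# Performance:          Performance:            Performance:
# Good (native FS),     Good on Linux           Fastest (RAM)
# incl. Docker Desktop  Slower on Mac/Win
# (lives inside the VM) (file sharing via VM)
#
# Survives:             Survives:               Survives:
# container rm ✅        container rm ✅          container stop ❌
# docker rm -v ✅        host rm -rf ❌           container rm ❌
# host reboot ✅         host reboot ✅           host reboot ❌
# docker volume rm ❌
#
# Note: docker rm -v removes only anonymous volumes; a named volume
# stays until docker volume rm (or docker compose down -v) deletes it.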
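
Which of these mount types you get from `-v` depends on the shape of the argument. The sketch below is a simplified approximation of that rule, not Docker's actual parser: it ignores Windows paths and relative host paths, and `classify_mount` is a hypothetical helper name introduced only for illustration.

```shell
# classify_mount SPEC
# Simplified rule (an assumption for illustration, not Docker's real parser):
#   no colon at all         -> anonymous volume (only a container path was given)
#   source starts with "/"  -> bind mount (absolute host path)
#   anything else           -> named volume
classify_mount() {
  case "$1" in
    *:*) ;;                                   # has a source part, keep going
    *)   echo "anonymous volume"; return 0 ;;
  esac
  case "${1%%:*}" in                          # text before the first colon
    /*) echo "bind mount" ;;
    *)  echo "named volume" ;;
  esac
}
```

For example, `classify_mount postgres_data:/var/lib/postgresql/data` reports a named volume, while `classify_mount /var/log/myapp:/app/logs` reports a bind mount.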

Named Volumes

# Create a named volume explicitly
docker volume create postgres_data

# Or let Docker create it automatically (from run or compose)
docker run -d \
  -v postgres_data:/var/lib/postgresql/data \
  postgres:16-alpine

# List volumes
docker volume ls
# DRIVER    VOLUME NAME
# local     myapp_postgres_data
# local     myapp_redis_data

# Inspect a volume — find its actual location on the host
docker volume inspect postgres_data
# [
#   {
#     "Name": "postgres_data",
#     "Driver": "local",
#     "Mountpoint": "/var/lib/docker/volumes/postgres_data/_data",
#     "Labels": {},
#     "Scope": "local"
#   }
# ]

# Remove a volume (containers that reference it, even stopped ones, must be removed first)
docker volume rm postgres_data

# Remove unused volumes
# Docker 23.0+ prunes only anonymous volumes by default; add --all to include named ones
docker volume prune

# In Compose — declare volumes at the top level
volumes:
  postgres_data:          # Simplest — uses local driver, Docker-managed path
  uploads:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /data/uploads   # Custom path on host, but managed as a volume

# Volume naming in Compose: {project}_{volume_name}
# Project 'myapp' + volume 'postgres_data' → myapp_postgres_data
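
The naming rule and the inspect command above can be combined to find where a Compose-managed volume lives on the host. This is a minimal sketch: `compose_volume_name` and `volume_mountpoint` are hypothetical helper names, and the first one assumes the project name is already in its final form (Compose normalises project names, which is not reproduced here). The `--format` flag of `docker volume inspect` takes a Go template.

```shell
# compose_volume_name PROJECT VOLUME
# Applies the {project}_{volume_name} rule described above.
# Assumes PROJECT is already the normalised project name.
compose_volume_name() {
  printf '%s_%s\n' "$1" "$2"
}

# volume_mountpoint NAME
# Prints only the Mountpoint field instead of the full JSON document.
volume_mountpoint() {
  docker volume inspect --format '{{ .Mountpoint }}' "$1"
}
```

A call such as `volume_mountpoint "$(compose_volume_name myapp postgres_data)"` would then print the host path for the `postgres_data` volume of project `myapp`, provided that volume exists.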

Bind Mounts

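# Bind mount: map a specific host path into the container
# Use an absolute path (or $(pwd) for the current directory)
# Keep comments off continued lines: text after a trailing backslash breaks the command

# Current directory at /app, plus a read-only config directory
docker run -d \
  -v $(pwd):/app \
  -v $(pwd)/config:/app/config:ro \
  myapp

# Explicit absolute host path
docker run -d -v /home/user/myapp:/app myapp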

# With --mount syntax (more explicit, preferred in scripts):
docker run -d \
  --mount type=bind,source=$(pwd),target=/app \
  --mount type=bind,source=$(pwd)/config,target=/app/config,readonly \
  myapp

# Development pattern: mount source + shadow node_modules
# The first -v mounts the entire project; the second creates an anonymous
# volume that "shadows" node_modules
docker run -d \
  -v $(pwd):/app \
  -v /app/node_modules \
  -p 3000:3000 \
  myapp:dev
# The anonymous volume at /app/node_modules prevents the host's node_modules
# (or lack thereof) from overwriting the container's installed dependencies.

# Seeding config files into containers
docker run -d \
  -v $(pwd)/nginx.conf:/etc/nginx/nginx.conf:ro \
  -v $(pwd)/certs:/etc/nginx/certs:ro \
  nginx:alpine

# Collecting logs from containers
docker run -d \
  -v /var/log/myapp:/app/logs \
  myapp
# Host can read /var/log/myapp — useful for log shippers (Filebeat, Fluentd)
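
One pitfall with the `-v` form: if the host path does not exist, Docker creates it as an empty directory, so a mistyped `nginx.conf` path silently becomes a directory inside the container. The sketch below guards against that by checking the source and emitting a `--mount` value with an absolute path. `bind_arg` is a hypothetical helper, and it assumes paths contain no commas (which would break the comma-separated `--mount` syntax).

```shell
# bind_arg SRC TARGET [ro]
# Prints a value for `docker run --mount ...`, or fails if SRC is missing.
# Assumption: neither path contains a comma.
bind_arg() {
  local src="$1" target="$2" mode="${3:-}" abs arg
  if [ ! -e "$src" ]; then
    echo "bind source does not exist: $src" >&2
    return 1
  fi
  # Resolve SRC to an absolute path, for a directory or a single file
  if [ -d "$src" ]; then
    abs="$(cd "$src" && pwd)"
  else
    abs="$(cd "$(dirname "$src")" && pwd)/$(basename "$src")"
  fi
  arg="type=bind,source=$abs,target=$target"
  if [ "$mode" = "ro" ]; then
    arg="$arg,readonly"
  fi
  printf '%s\n' "$arg"
}
```

It could be used as `docker run -d --mount "$(bind_arg ./nginx.conf /etc/nginx/nginx.conf ro)" nginx:alpine`.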

tmpfs Mounts

# tmpfs: memory-backed filesystem, not written to disk
# Data is lost when the container stops: intentional for security-sensitive temp files

# Plain tmpfs at /tmp, plus /run with mount options and a 64 MiB size cap
docker run -d \
  --tmpfs /tmp \
  --tmpfs /run:rw,noexec,nosuid,size=65536k \
  myapp

# With --mount syntax:
docker run -d \
  --mount type=tmpfs,target=/tmp,tmpfs-size=64m,tmpfs-mode=1777 \
  myapp

# In Compose:
services:
  web:
    tmpfs:
      - /tmp
      - /run
    # Or with options:
    # volumes:
    #   - type: tmpfs
    #     target: /tmp
    #     tmpfs:
    #       size: 67108864   # 64MB in bytes

# Use cases:
# - /tmp scratch space (ephemeral, fast)
# - Session tokens, API keys loaded from secrets at startup
# - Compiled assets that don't need to survive restarts
# - Test databases (SQLite in memory for CI)
# - Any data you explicitly don't want written to disk (compliance)
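
The examples above express the same 64 MiB limit three ways: `size=65536k`, `tmpfs-size=64m`, and `67108864` bytes in the Compose long syntax. A small conversion helper makes the arithmetic explicit. This is a sketch only; `to_bytes` is a hypothetical name, and it treats the suffixes as binary multiples (1k = 1024 bytes), which is an assumption that matches the numbers used in this section.

```shell
# to_bytes SIZE
# Converts sizes such as 65536k, 64m or 1g to a plain byte count.
# Assumes binary multiples: k = 1024, m = 1024^2, g = 1024^3.
to_bytes() {
  local num="${1%[kKmMgG]}"                 # strip one optional unit suffix
  case "$num" in
    ''|*[!0-9]*) echo "invalid size: $1" >&2; return 1 ;;
  esac
  case "$1" in
    *[kK]) echo $(( num * 1024 )) ;;
    *[mM]) echo $(( num * 1024 * 1024 )) ;;
    *[gG]) echo $(( num * 1024 * 1024 * 1024 )) ;;
    *)     echo "$num" ;;
  esac
}
```

Both `to_bytes 64m` and `to_bytes 65536k` yield 67108864, the value used in the Compose fragment.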

Volume Drivers

# The local driver stores data on the Docker host.
# Third-party drivers mount external storage (NFS, AWS EFS, Azure Files, GCS).

# NFS volume (share storage across multiple hosts)
docker volume create \
  --driver local \
  --opt type=nfs \
  --opt o=addr=nfs-server.example.com,rw,nfsvers=4 \
  --opt device=:/exports/data \
  nfs_data

# In Compose:
volumes:
  shared_data:
    driver: local
    driver_opts:
      type: nfs
      o: "addr=nfs-server.example.com,rw,nfsvers=4"
      device: ":/exports/data"

# AWS EFS (Elastic File System) — for ECS or EC2 with Docker
# Use the docker-efs-plugin or mount EFS at a host path first:
# sudo mount -t efs fs-abc123:/ /mnt/efs
docker run -v /mnt/efs/myapp:/app/data myapp

# Popular third-party volume plugins:
# - rexray/ebs  → AWS EBS volumes (block storage, single-host attach)
# - rexray/s3fs → AWS S3 via FUSE (slow but unlimited)
# - portworx     → Enterprise distributed storage
# - longhorn     → Rancher's distributed block storage for K8s

# Install a plugin
docker plugin install rexray/ebs:latest
docker plugin ls
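
Provisioning scripts often run more than once, so it can help to make the NFS volume creation from earlier in this section idempotent. The following is a minimal sketch under that assumption: `ensure_nfs_volume` is a hypothetical wrapper, and it reuses the same `local` driver options shown above without validating that the NFS server is reachable.

```shell
# ensure_nfs_volume NAME SERVER EXPORT_PATH
# Creates the NFS-backed volume only if no volume with that name exists.
# Does not check that an existing volume has the same options.
ensure_nfs_volume() {
  local name="$1" server="$2" export_path="$3"
  if docker volume inspect "$name" >/dev/null 2>&1; then
    echo "volume $name already exists"
    return 0
  fi
  docker volume create \
    --driver local \
    --opt type=nfs \
    --opt "o=addr=$server,rw,nfsvers=4" \
    --opt "device=:$export_path" \
    "$name"
}
```

With the values from this section, the call would be `ensure_nfs_volume nfs_data nfs-server.example.com /exports/data`.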

Backup and Restore

# Pattern: use a temporary container to access the volume and tar it up

# Backup a named volume to a tar file on the host
docker run --rm \
  -v postgres_data:/data:ro \
  -v $(pwd)/backups:/backup \
  alpine \
  tar czf /backup/postgres_data_$(date +%Y%m%d).tar.gz -C /data .

# Restore from tar back into a volume
docker run --rm \
  -v postgres_data:/data \
  -v $(pwd)/backups:/backup \
  alpine \
  tar xzf /backup/postgres_data_20260614.tar.gz -C /data

# Postgres-specific backup: pg_dump takes a consistent snapshot of a live database.
# A raw tar of the data directory is only safe while Postgres is stopped.
docker exec postgres-container \
  pg_dump -U myuser mydb > backup_$(date +%Y%m%d).sql

# Restore Postgres dump
docker exec -i postgres-container \
  psql -U myuser -d mydb < backup_20260614.sql

# Automated backup script (save as its own file; the shebang must be its first line)
#!/bin/bash
set -euo pipefail   # Abort on errors, including a failed pg_dumpall inside the pipe
BACKUP_DIR="/backups"
DATE=$(date +%Y%m%d_%H%M%S)
RETENTION_DAYS=7

# Dump Postgres
docker exec myapp-db pg_dumpall -U postgres | \
  gzip > "$BACKUP_DIR/postgres_$DATE.sql.gz"

# Copy to S3
aws s3 cp "$BACKUP_DIR/postgres_$DATE.sql.gz" \
  "s3://my-backups/docker/postgres_$DATE.sql.gz"

# Delete local backups older than RETENTION_DAYS days
find "$BACKUP_DIR" -name "*.gz" -mtime +"$RETENTION_DAYS" -delete
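
A backup is only useful if it restores. The tar commands used inside the temporary container earlier in this section can be rehearsed against ordinary directories first. This sketch wraps them in two hypothetical helpers, `backup_dir` and `restore_dir`, and adds a listing pass (`tar tzf`) as a basic readability check on the archive; it is not a substitute for a full restore test against a real database.

```shell
# backup_dir SRC_DIR ARCHIVE
# Same tar invocation as the volume backup, followed by a listing pass
# that fails if the archive cannot be read back.
backup_dir() {
  tar czf "$2" -C "$1" . && tar tzf "$2" >/dev/null
}

# restore_dir ARCHIVE DEST_DIR
# Same tar invocation as the volume restore; creates DEST_DIR if needed.
restore_dir() {
  mkdir -p "$2" && tar xzf "$1" -C "$2"
}
```

Inside a container the directories would be the mounted volume (`/data`) and the backup bind mount (`/backup`), as in the commands above.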

Sharing Between Containers

# Multiple containers can mount the same named volume simultaneously.
# Docker does no file locking: share read-only, or keep to one writer per file with many readers.

# Sidecar pattern: main app writes logs, log shipper reads them
services:
  web:
    image: myapp:latest
    volumes:
      - app_logs:/app/logs    # Web writes logs here

  filebeat:
    image: elastic/filebeat:8.14.0
    volumes:
      - app_logs:/app/logs:ro  # Filebeat reads logs (read-only)
      - ./filebeat.yml:/usr/share/filebeat/filebeat.yml:ro

volumes:
  app_logs:

# Shared uploads between web and a thumbnail generator
services:
  web:
    volumes:
      - uploads:/app/uploads

  thumbnailer:
    image: myapp-thumbnailer:latest
    volumes:
      - uploads:/data/uploads    # Reads originals, writes thumbnails to same volume

volumes:
  uploads:

# --volumes-from: mount all volumes from another container
# (legacy pattern — prefer named volumes in compose)
docker run -d --name helper --volumes-from web alpine sleep infinity
docker cp helper:/app/uploads ./local-uploads-backup
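
When one container writes files that another reads from the same volume, readers can observe half-written files. A common mitigation is to write to a temporary file in the same directory and rename it into place, since a rename within one filesystem is atomic. The sketch below shows that pattern; `atomic_write` is a hypothetical helper, and it assumes the writer and readers use the same volume (and therefore the same filesystem).

```shell
# atomic_write DEST
# Reads stdin into a temp file next to DEST, then renames it into place,
# so readers see either the old file or the complete new one.
# Note: mktemp creates the file with mode 0600; adjust permissions if the
# reading container runs as a different user.
atomic_write() {
  local dest="$1" tmp
  tmp="$(mktemp "$(dirname "$dest")/.tmp.XXXXXX")" || return 1
  if cat > "$tmp" && mv "$tmp" "$dest"; then
    return 0
  fi
  rm -f "$tmp"                              # clean up on failure
  return 1
}
```

A writer might call it as `generate_thumbnail | atomic_write /data/uploads/thumb.jpg`, where `generate_thumbnail` stands in for whatever produces the file.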

Production Patterns

# Pattern 1: Never store important data in the container's writable layer
# ❌ Bad: no named volume, so the image's VOLUME instruction creates an anonymous one
docker run -d postgres:16-alpine   # Data is orphaned on docker rm and easy to lose
# ✅ Good: named volume
docker run -d -v pgdata:/var/lib/postgresql/data postgres:16-alpine

# Pattern 2: Use read-only containers where possible
# --read-only makes the container filesystem read-only; writes are allowed
# only on the tmpfs mounts and the app_logs volume
docker run -d \
  --read-only \
  --tmpfs /tmp \
  --tmpfs /run \
  -v app_logs:/app/logs \
  myapp

# Pattern 3: Volume labels for organisation and cleanup
docker volume create \
  --label project=myapp \
  --label env=production \
  --label created=$(date +%Y-%m-%d) \
  myapp_postgres_data

docker volume ls --filter "label=project=myapp"
# On Docker 23.0+, --all is needed for prune to consider named volumes
docker volume prune --all --filter "label=env=staging"

# Pattern 4: Init containers for volume initialisation
services:
  init-perms:
    image: alpine
    command: chown -R 1000:1000 /data
    volumes:
      - app_data:/data
    restart: "no"

  web:
    depends_on:
      init-perms:
        condition: service_completed_successfully
    volumes:
      - app_data:/app/data
    user: "1000:1000"

# Pattern 5: Separate data volumes from config volumes
volumes:
  db_data:       # Critical — never prune
  db_config:     # Config — can be regenerated
  app_uploads:   # Critical — back up to S3
  app_cache:     # Safe to prune
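
The "never prune" rule in Pattern 5 can be backed by a small guard in cleanup scripts. This is a sketch under stated assumptions: `safe_volume_rm` and the `PROTECTED_VOLUMES` list are hypothetical, the list holds exact volume names (in a Compose project these would carry the project prefix), and the guard only protects against removals that go through this function.

```shell
# Space-separated list of volumes that must never be removed by scripts.
# Names are taken from Pattern 5; adjust to the real (prefixed) names.
PROTECTED_VOLUMES="db_data app_uploads"

# safe_volume_rm NAME
# Refuses to remove a protected volume, otherwise calls docker volume rm.
safe_volume_rm() {
  local p
  for p in $PROTECTED_VOLUMES; do
    if [ "$1" = "$p" ]; then
      echo "refusing to remove protected volume: $1" >&2
      return 1
    fi
  done
  docker volume rm "$1"
}
```

With this in place, `safe_volume_rm app_cache` proceeds while `safe_volume_rm db_data` exits with an error.
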

Next: Phase 7 (Multi-Stage Builds) covers building production-ready slim images using multi-stage Dockerfiles, separating build tools from runtime, and achieving minimal final image sizes.