Dockerfile Best Practices: Layers, Caching and Build Optimization (2026)
A Dockerfile is a recipe. Filesystem-changing instructions (RUN, COPY, ADD) each create a layer, layers are cached, and a cache miss cascades: every instruction after the changed line is rebuilt. Understanding this model unlocks large speedups. A well-ordered Dockerfile reuses its cached dependency layers and rebuilds in seconds after a source change; a poorly ordered one reinstalls everything on every change. This phase covers instruction ordering, .dockerignore, the differences between COPY/ADD and ARG/ENV, health checks, running as a non-root user, choosing slim base images, and BuildKit's cache mount feature for package managers.
Core Instructions
# Common Dockerfile instructions and what they do.
# This is a reference listing, not one buildable file. Dockerfile comments must
# sit on their own line; a trailing "# ..." is parsed as part of the instruction.
# Base image: must be the first instruction (only ARG may precede FROM)
FROM node:20-alpine
# Metadata only; adds no meaningful size
LABEL maintainer="team@co"
# Set the working directory (created if missing); use absolute paths
WORKDIR /app
# Copy files from the build context into the image
COPY package*.json ./
# Like COPY, but auto-extracts local tarballs and supports URLs
ADD archive.tar.gz /data/
# Execute a command; creates a new layer
RUN npm ci
# Chain commands to avoid extra layers, and clean the apt lists in the same RUN
# (Debian/Ubuntu bases; Alpine images use apk instead):
RUN apt-get update \
    && apt-get install -y curl \
    && rm -rf /var/lib/apt/lists/*
# Build-time variable (not set in the running container)
ARG NODE_ENV=production
# Runtime env var (persists in the image)
ENV PORT=3000
# Document which port the app listens on (informational only)
EXPOSE 3000
# Copy the remaining source (after deps; see the caching section)
COPY . .
# Switch to a non-root user
USER node
HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
    CMD curl -f http://localhost:3000/health || exit 1
# Fixed executable; it can only be replaced with docker run --entrypoint
ENTRYPOINT ["node"]
# Default arguments to ENTRYPOINT (or the default command if there is no ENTRYPOINT)
CMD ["server.js"]
# CMD can be overridden: docker run myapp other.js (runs: node other.js)
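The interplay between ENTRYPOINT and CMD is easiest to see in a small image. The following is a minimal sketch, assuming an image tagged myapp and two scripts named server.js and other.js in the build context (all hypothetical names); the docker run lines are shown as comments.

```dockerfile
FROM node:20-alpine
WORKDIR /app
# server.js and other.js are hypothetical files in the build context
COPY server.js other.js ./
USER node
# ENTRYPOINT fixes the executable; CMD supplies its default arguments
ENTRYPOINT ["node"]
CMD ["server.js"]

# Build (shell): docker build -t myapp .
# docker run myapp                      runs: node server.js
# docker run myapp other.js             runs: node other.js (CMD replaced)
# docker run -it --entrypoint sh myapp  runs: sh (ENTRYPOINT replaced, image CMD not passed)
```

Keeping the executable in ENTRYPOINT and only the arguments in CMD means anything typed after the image name is treated as arguments, not as a new command.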
Layer Caching Strategy
# The golden rule: put things that change LESS OFTEN higher in the file.
# A cache miss invalidates ALL layers below it.
#
# Change frequency (least → most):
# Base image → system deps → package.json → source code
# ❌ Bad: COPY . . before npm install
# Any source file change → npm install re-runs
FROM node:20-alpine
WORKDIR /app
# A source change invalidates this layer and everything after it
COPY . .
# Re-runs on every source change (slow)
RUN npm ci
CMD ["node", "server.js"]
# ✅ Good: copy package files first, install, then copy source
FROM node:20-alpine
WORKDIR /app
# Only changes when dependencies change
COPY package.json package-lock.json ./
# Cached until package.json or package-lock.json changes
RUN npm ci
# Source changes land here; this layer is cheap to rebuild
COPY . .
CMD ["node", "server.js"]
# Same pattern for Python
FROM python:3.12-slim
WORKDIR /app
# Copy only the requirements file first so pip install is cached separately
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "main.py"]
# Same pattern for Go
FROM golang:1.22-alpine AS builder
WORKDIR /app
# Copy only the module files first so the module download is cached separately
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN go build -o /app/server ./cmd/server
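The Go example above stops at the build stage. One way to finish it, sketched here under the assumption that the program has no cgo dependencies (hence CGO_ENABLED=0), is a second stage that copies only the compiled binary into a small runtime image. The alpine:3.19 runtime base and the install path are illustrative choices, not requirements.

```dockerfile
# Stage 1: build (dependency download is cached separately from the source)
FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
# CGO_ENABLED=0 produces a statically linked binary (assumes no cgo dependencies)
RUN CGO_ENABLED=0 go build -o /app/server ./cmd/server

# Stage 2: runtime image containing only the binary
FROM alpine:3.19
COPY --from=builder /app/server /usr/local/bin/server
# "nobody" is a non-root user that already exists in the Alpine base image
USER nobody
ENTRYPOINT ["/usr/local/bin/server"]
```

The Go toolchain and module cache stay in the builder stage, so the final image carries only the base image plus one binary.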
.dockerignore
# .dockerignore uses .gitignore-style patterns to exclude files from the build context.
# The build context is the set of files the builder can read; COPY . . copies all of it.
# Without .dockerignore, node_modules (often hundreds of MB), .git and .env files are
# sent to the builder and can end up in the image.
# .dockerignore
node_modules
.git
.gitignore
.env
.env.*
*.log
npm-debug.log*
coverage/
.nyc_output/
# Built output (let Docker build it)
dist/
.DS_Store
Thumbs.db
README.md
docs/
*.test.ts
*.spec.ts
__tests__/
.eslintrc*
.prettierrc*
docker-compose*.yml
# Exclude Dockerfiles themselves from the context
Dockerfile*
# For Python projects
__pycache__/
*.pyc
*.pyo
.venv/
venv/
.pytest_cache/
.mypy_cache/
*.egg-info/
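# Check how much build context is transferred
# (BuildKit prints a "transferring context" line; plain progress makes it easy to grep)
docker build --no-cache --progress=plain -t test . 2>&1 | grep "transferring context"
# transferring context: 1.23MB ← should be small (exact line format varies by version)
# BuildKit may transfer only changed files on later builds, so judge by the first build.
# The legacy builder printed: Sending build context to Docker daemon 1.23MB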
COPY vs ADD
# COPY: simple, explicit, preferred for most cases
COPY src/ /app/src/
COPY package.json package-lock.json ./
# Copy with ownership (avoids a separate RUN chown layer)
COPY --chown=node:node . .
# ADD: use only for these two specific cases:
# 1. Auto-extracting local tarballs (extracts into /app/ automatically)
ADD app.tar.gz /app/
# 2. Fetching from a URL (remote archives are not extracted; prefer RUN curl/wget
#    so you control verification and caching)
ADD https://example.com/file.txt /tmp/file.txt
# COPY --from: copy from another stage or image
COPY --from=builder /app/dist ./dist
COPY --from=nginx:alpine /etc/nginx/nginx.conf /etc/nginx/nginx.conf
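# COPY with wildcards (Go filepath.Match rules: *, ? and [...])
COPY *.json ./
# "**" is not recursive in COPY, and files matched by a wildcard are copied flat
# into the destination. To copy a tree, copy the directory and filter it with .dockerignore:
COPY src/ ./src/
# Always prefer COPY over ADD unless you specifically need tar extraction.
# ADD with a URL has to contact the remote server on each build to decide whether
# its cache entry is still valid, so builds depend on that server being reachable.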
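As a sketch of the RUN curl alternative mentioned above: downloading in a RUN step lets you verify the file and decide when the download is repeated. The URL is the placeholder from the example above, and FILE_URL and FILE_SHA256 are hypothetical build arguments; you would supply the file's real SHA-256 digest at build time.

```dockerfile
FROM debian:bookworm-slim
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl ca-certificates \
    && rm -rf /var/lib/apt/lists/*
# Hypothetical build args: pass the real digest with --build-arg FILE_SHA256=<digest>
ARG FILE_URL=https://example.com/file.txt
ARG FILE_SHA256
# Fail the build if the download fails or the digest does not match
RUN curl -fsSL "$FILE_URL" -o /tmp/file.txt \
    && echo "${FILE_SHA256}  /tmp/file.txt" | sha256sum -c -
```

The download layer stays cached until FILE_URL or FILE_SHA256 changes, so bumping the digest is what triggers a fresh download. If no digest is passed, the check fails and the build stops, which is the intended behavior for this sketch.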
ARG vs ENV
# ARG: build-time only — available during docker build, NOT at runtime
# ENV: runtime — available during build AND when the container runs
# ARG before FROM: controls which base image to use
ARG BASE_TAG=20-alpine
FROM node:${BASE_TAG}
# ARG for build customization
ARG NODE_ENV=production
ARG BUILD_DATE
ARG GIT_SHA
# Use ARG value to set a label (metadata)
LABEL build-date="${BUILD_DATE}" git-sha="${GIT_SHA}"
# ARG → ENV: promote a build arg to a runtime env var
ARG APP_VERSION
ENV APP_VERSION=${APP_VERSION}
# WARNING: ARG values are visible in docker history — don't use for secrets!
# docker history myimage --no-trunc ← shows all ARG values
# For secrets during build, use BuildKit secret mounts (see below).
# ENV sets runtime config
ENV NODE_ENV=production \
PORT=3000 \
LOG_LEVEL=info
# Override at runtime
docker run -e LOG_LEVEL=debug myapp # Overrides the ENV default
# Check which ENV vars an image sets
docker inspect myimage --format='{{json .Config.Env}}' | jq .
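To tie the ARG and ENV pieces together, here is a minimal sketch of one Dockerfile plus the build command that feeds it. The tag myapp and the values passed on the command line (including the version 1.2.3) are illustrative. The BASE_TAG redeclaration reflects how ARG scoping works across FROM.

```dockerfile
# Build (shell):
#   docker build \
#     --build-arg BUILD_DATE="$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
#     --build-arg GIT_SHA="$(git rev-parse --short HEAD)" \
#     --build-arg APP_VERSION=1.2.3 \
#     -t myapp .
ARG BASE_TAG=20-alpine
FROM node:${BASE_TAG}
# An ARG declared before FROM is out of scope inside the stage;
# redeclare it (without a value) to read it here
ARG BASE_TAG
ARG BUILD_DATE
ARG GIT_SHA
ARG APP_VERSION
LABEL base-tag="${BASE_TAG}" build-date="${BUILD_DATE}" git-sha="${GIT_SHA}"
# Promote the build arg to a runtime variable; LOG_LEVEL is a plain runtime default
ENV APP_VERSION=${APP_VERSION} \
    LOG_LEVEL=info
# Run (shell): docker run -e LOG_LEVEL=debug myapp
```

Only APP_VERSION and LOG_LEVEL are visible to the running container; BUILD_DATE and GIT_SHA survive only as labels.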
Non-Root User
# By default containers run as root (uid 0).
# Root in a container can become root on the host if container escape occurs.
# Always switch to a non-root user before CMD/ENTRYPOINT.
# Node.js — node:* images include a built-in 'node' user (uid 1000)
FROM node:20-alpine
WORKDIR /app
COPY --chown=node:node package*.json ./
RUN npm ci
COPY --chown=node:node . .
# Switch to non-root
USER node
CMD ["node", "server.js"]
# Python / generic — create a dedicated user
FROM python:3.12-slim
RUN groupadd --gid 1001 appuser \
&& useradd --uid 1001 --gid appuser --shell /bin/bash --create-home appuser
WORKDIR /app
COPY --chown=appuser:appuser requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY --chown=appuser:appuser . .
USER appuser
CMD ["python", "main.py"]
# Alpine — addgroup/adduser (BusyBox syntax)
FROM alpine:3.19
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser
# Verify: the running process should show non-root uid
# docker exec mycontainer whoami → node (or appuser)
# docker exec mycontainer id → uid=1000(node) gid=1000(node)
HEALTHCHECK
# HEALTHCHECK tells Docker how to test if a container is working.
# Used by Docker Compose (depends_on: condition: service_healthy)
# and orchestrators (K8s uses its own probes, but Docker Swarm uses HEALTHCHECK).
# HTTP health check
HEALTHCHECK --interval=30s \
--timeout=5s \
--start-period=10s \
--retries=3 \
CMD curl -f http://localhost:3000/health || exit 1
# --interval: how often to run the check (default 30s)
# --timeout: how long before the check is considered failed (default 30s)
# --start-period: grace period after container start (default 0s)
# --retries: consecutive failures before status = unhealthy (default 3)
# TCP check (when curl isn't available)
HEALTHCHECK CMD nc -z localhost 5432 || exit 1
# Using wget (Alpine doesn't have curl by default)
HEALTHCHECK CMD wget -qO- http://localhost:3000/health || exit 1
# Check health status
docker ps # STATUS column shows (healthy), (unhealthy) or (health: starting)
docker inspect --format='{{.State.Health.Status}}' mycontainer
docker inspect --format='{{json .State.Health}}' mycontainer | jq .
# Expose a /health endpoint in your app:
# GET /health → 200 OK {"status":"ok","uptime":1234}
# Return non-200 if the app is degraded (DB unreachable, etc.)
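Since slim and Alpine images often ship without curl, one option is to run the probe with the runtime the image already has. The sketch below assumes a Node.js 18+ image (where fetch is a global) and the /health endpoint on port 3000 used above. It targets 127.0.0.1 rather than localhost to avoid depending on how localhost resolves. Treat it as an illustration, not the only way to do it.

```dockerfile
FROM node:20-alpine
WORKDIR /app
COPY --chown=node:node . .
USER node
# Exec-form probe: exit code 0 = healthy, 1 = unhealthy
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
    CMD ["node", "-e", "fetch('http://127.0.0.1:3000/health').then(r => process.exit(r.ok ? 0 : 1)).catch(() => process.exit(1))"]
CMD ["node", "server.js"]
```

The exec form avoids needing a shell in the image, and no extra package has to be installed just for the health check.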
BuildKit Cache Mounts
# BuildKit (the default builder since Docker Engine 23.0) adds RUN --mount for advanced caching.
# A cache mount persists a directory between builds, so package manager caches
# survive even when the layer itself is rebuilt, which makes rebuilds much faster.
# To pin the Dockerfile syntax version, make this the very first line of the file:
# syntax=docker/dockerfile:1
# npm cache mount
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
RUN --mount=type=cache,target=/root/.npm \
npm ci
COPY . .
CMD ["node", "server.js"]
# pip cache mount (Python)
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN --mount=type=cache,target=/root/.cache/pip \
pip install -r requirements.txt
COPY . .
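# apt cache mount (Debian/Ubuntu). sharing=locked serializes concurrent builds,
# because apt needs exclusive access to its data. The official images delete
# downloaded packages via docker-clean, so remove that hook to let the cache fill.
FROM debian:bookworm-slim
RUN rm -f /etc/apt/apt.conf.d/docker-clean \
    && echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/keep-cache
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    apt-get update \
    && apt-get install -y --no-install-recommends curl ca-certificates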
# Go module cache
FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN --mount=type=cache,target=/go/pkg/mod \
go mod download
COPY . .
RUN --mount=type=cache,target=/go/pkg/mod \
--mount=type=cache,target=/root/.cache/go-build \
go build -o /app/server ./cmd/server
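# Secret mount: the secret is readable only during this RUN and is not stored
# in any layer or in docker history
RUN --mount=type=secret,id=npm_token \
    NPM_TOKEN=$(cat /run/secrets/npm_token) npm install
# Assumes the project's .npmrc reads the token from ${NPM_TOKEN}.
# Build: docker build --secret id=npm_token,env=NPM_TOKEN .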
Next: Phase 4 — Docker Compose covers multi-container applications: service definitions, dependency ordering, environment file management, networks, volumes and override files for dev vs production.