Kubernetes ImagePullPolicy and Private Registry Setup
Getting container images into Kubernetes nodes sounds straightforward, but ImagePullBackOff is one of the most common errors teams encounter in production. The root causes range from choosing the wrong imagePullPolicy to misconfigured credentials for private registries. This guide covers all three imagePullPolicy values, how to create and attach imagePullSecrets for popular private registries including AWS ECR, GCR, and Docker Hub, and how to set up a pull-through cache to reduce registry egress costs and improve resilience.
ImagePullPolicy: Always, IfNotPresent, Never
The imagePullPolicy field controls when the kubelet attempts to pull a container image from a registry versus using a locally cached copy on the node.
- Always: the kubelet contacts the registry every time a container starts, to verify that the image exists and check whether the digest has changed. If the digest matches the local cache, the cached image is used. If not, it pulls. Use this for mutable tags like latest or main.
- IfNotPresent: the kubelet only pulls if the image is not already present on the node. This is the default when a specific digest or versioned tag (anything other than latest) is specified. It is the most efficient policy for immutable, versioned images.
- Never: the kubelet never attempts a registry pull. The image must be pre-loaded on the node. Used in air-gapped environments or when images are pre-pulled by a DaemonSet.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
spec:
  # selector and matching template labels are required for a valid Deployment
  selector:
    matchLabels:
      app: api-server
  template:
    metadata:
      labels:
        app: api-server
    spec:
      containers:
        - name: api-server
          image: myrepo/api-server:1.4.2   # versioned tag
          imagePullPolicy: IfNotPresent    # safe for immutable tags
        - name: sidecar
          image: myrepo/sidecar:latest
          imagePullPolicy: Always          # always check for updates on latest
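When imagePullPolicy is omitted, Kubernetes picks a default from the image reference. The rules can be sketched as a small shell function; default_pull_policy is a hypothetical helper written for illustration, not part of kubectl, and it only mirrors the defaulting rules described above.

```shell
# Print the imagePullPolicy Kubernetes would default to for an image reference.
# default_pull_policy is a hypothetical helper, not a kubectl command.
default_pull_policy() {
  ref=$1
  case "$ref" in
    *@*) echo "IfNotPresent"; return 0 ;;   # pinned by digest
  esac
  # Look only at the last path segment so a registry port
  # (localhost:5000/app) is not mistaken for a tag.
  name=${ref##*/}
  case "$name" in
    *:latest) echo "Always" ;;        # explicit latest tag
    *:*)      echo "IfNotPresent" ;;  # any other tag
    *)        echo "Always" ;;        # no tag is treated as latest
  esac
}

default_pull_policy myrepo/api-server:1.4.2   # IfNotPresent
default_pull_policy myrepo/sidecar:latest     # Always
default_pull_policy localhost:5000/app        # Always
```

Setting the field explicitly, as in the manifest above, avoids relying on these defaults at all.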
Tip: Avoid latest tags in production. Pin images to immutable digests (image@sha256:abc123...) or versioned semver tags, and use imagePullPolicy: IfNotPresent. This ensures reproducible deployments and avoids accidental image changes during pod restarts.
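Pinning to a digest means replacing the tag in the image reference with the digest. A minimal sketch, assuming the digest has already been obtained from your registry tooling; pin_to_digest is a hypothetical helper name and the digest used in the test is a placeholder.

```shell
# pin_to_digest IMAGE DIGEST
# Rewrite repo[:tag] as repo@digest. Hypothetical helper for illustration.
pin_to_digest() {
  ref=$1
  digest=$2
  repo=${ref%%@*}              # drop an existing digest, if any
  last=${repo##*/}             # last path segment, where a tag would live
  case "$last" in
    *:*) repo=${repo%:*} ;;    # drop the tag, but keep any registry port
  esac
  echo "${repo}@${digest}"
}
```

In a CI pipeline, the output of such a function would be written into the Deployment manifest in place of the tagged reference.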
Diagnosing ImagePullBackOff
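ErrImagePull and ImagePullBackOff point to the same underlying problem: the kubelet cannot pull the image. ErrImagePull is the failed attempt itself, and ImagePullBackOff means the kubelet is waiting, with increasing delay, before it retries. The Events section of kubectl describe pod gives the specific error.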
kubectl describe pod failing-pod -n production
# Events:
# Warning Failed kubelet Failed to pull image "myrepo/app:1.0":
# rpc error: code = Unknown
# desc = failed to pull and unpack image:
# failed to resolve reference "myrepo/app:1.0":
# unexpected status code 401 Unauthorized
# Common causes and their Events messages:
# 401 Unauthorized → missing or wrong imagePullSecret
# 403 Forbidden → credentials valid but no pull permission
# not found / 404 → wrong image name or tag does not exist
# timeout → network issue reaching registry, or registry is down
# no space left → node disk is full (check with: df -h on the node)
# Verify the image exists before troubleshooting credentials
docker pull myrepo/app:1.0 # from your local machine with valid credentials
# Check that the secret exists in the right namespace
kubectl get secret my-registry-secret -n production
# Verify secret content (base64 decoded)
kubectl get secret my-registry-secret -n production \
  -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d | jq .
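The cause table above can be turned into a first-pass triage function. This is only a sketch: classify_pull_error is a hypothetical helper, and the patterns are simple substring matches on the event message, so an unusual message may fall through to the default branch.

```shell
# Map an image pull event message to the likely cause from the table above.
# classify_pull_error is a hypothetical helper; matching is substring-based.
classify_pull_error() {
  msg=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$msg" in
    *unauthorized*)          echo "missing or wrong imagePullSecret" ;;
    *forbidden*)             echo "credentials valid but no pull permission" ;;
    *"not found"*)           echo "wrong image name or tag does not exist" ;;
    *timeout*|*"timed out"*) echo "network issue reaching registry, or registry is down" ;;
    *"no space left"*)       echo "node disk is full" ;;
    *)                       echo "unknown, read the full event message" ;;
  esac
}

classify_pull_error 'unexpected status code 401 Unauthorized'
```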
Creating imagePullSecrets for Docker Hub
A Kubernetes imagePullSecret is a Secret of type kubernetes.io/dockerconfigjson containing base64-encoded registry credentials. The simplest way to create one is with kubectl create secret docker-registry.
# Create a Docker Hub pull secret
kubectl create secret docker-registry dockerhub-secret \
  --docker-server=https://index.docker.io/v1/ \
  --docker-username=myuser \
  --docker-password=my-access-token \
  --docker-email=ops@example.com \
  -n production
# Verify
kubectl get secret dockerhub-secret -n production -o yaml
# Reference in pod spec
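apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  namespace: production
spec:
  # selector and matching template labels are required for a valid Deployment
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      imagePullSecrets:
        - name: dockerhub-secret
      containers:
        - name: app
          image: myorg/myapp:1.2.0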
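It can help when debugging to know what the Secret actually holds. The sketch below builds the same kind of JSON document that ends up under the .dockerconfigjson key. dockerconfigjson is a hypothetical helper, and it assumes the values contain no quotes or backslashes, since it does no JSON escaping.

```shell
# dockerconfigjson SERVER USERNAME PASSWORD
# Print a Docker config JSON document for one registry.
# Hypothetical helper; assumes values need no JSON escaping.
dockerconfigjson() {
  server=$1
  user=$2
  pass=$3
  # The auth field is base64("username:password"); strip line wraps.
  auth=$(printf '%s:%s' "$user" "$pass" | base64 | tr -d '\n')
  printf '{"auths":{"%s":{"username":"%s","password":"%s","auth":"%s"}}}\n' \
    "$server" "$user" "$pass" "$auth"
}

dockerconfigjson https://index.docker.io/v1/ myuser my-access-token
```

Comparing this output with the decoded Secret from the diagnosis step is a quick way to spot a wrong server URL or a stale token.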
AWS ECR: Rotating Credentials Automatically
AWS ECR authorization tokens expire after 12 hours, which means a static Kubernetes Secret created with aws ecr get-login-password will stop working half a day after creation. The solution is automated token rotation using a CronJob or the amazon-ecr-credential-helper.
# One-time manual creation (expires in 12h — not for production)
kubectl create secret docker-registry ecr-secret \
  --docker-server=123456789.dkr.ecr.us-east-1.amazonaws.com \
  --docker-username=AWS \
  --docker-password="$(aws ecr get-login-password --region us-east-1)" \
  -n production
# Production: CronJob that refreshes the ECR secret every 6 hours
apiVersion: batch/v1
kind: CronJob
metadata:
  name: ecr-token-refresh
  namespace: production
spec:
  schedule: "0 */6 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          # ecr-refresher needs RBAC to create and update Secrets in this
          # namespace, plus AWS credentials allowing ecr:GetAuthorizationToken
          serviceAccountName: ecr-refresher
          containers:
            - name: refresh
              # NOTE: the image must provide both aws and kubectl;
              # amazon/aws-cli does not ship kubectl on its own
              image: amazon/aws-cli:latest
              command:
                - /bin/sh
                - -c
                - |
                  set -e
                  TOKEN=$(aws ecr get-login-password --region us-east-1)
                  kubectl create secret docker-registry ecr-secret \
                    --docker-server=123456789.dkr.ecr.us-east-1.amazonaws.com \
                    --docker-username=AWS \
                    --docker-password="$TOKEN" \
                    --namespace=production \
                    --dry-run=client -o yaml | kubectl apply -f -
          restartPolicy: OnFailure
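The refresh script names the region twice: once in --region and once inside the registry hostname. A minimal sketch, assuming the usual ECR hostname layout of account.dkr.ecr.region.amazonaws.com, that derives the region from the hostname so the two cannot drift apart. ecr_region is a hypothetical helper name.

```shell
# ecr_region HOSTNAME
# Print the AWS region embedded in an ECR registry hostname.
# Hypothetical helper; assumes the account.dkr.ecr.region.amazonaws.com layout.
ecr_region() {
  host=$1
  case "$host" in
    *.dkr.ecr.*.amazonaws.com*) ;;
    *) echo "not an ECR registry hostname: $host" >&2; return 1 ;;
  esac
  rest=${host#*.dkr.ecr.}   # region.amazonaws.com
  echo "${rest%%.*}"        # region
}

# Possible use inside the refresh script:
#   REGISTRY=123456789.dkr.ecr.us-east-1.amazonaws.com
#   TOKEN=$(aws ecr get-login-password --region "$(ecr_region "$REGISTRY")")
```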
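For EKS, the preferred approach is to avoid Secrets entirely. The kubelet authenticates to ECR with the node's IAM role, so attaching ECR pull permissions to the role in the node's EC2 instance profile makes ECR authentication happen transparently. IRSA (IAM Roles for Service Accounts) gives pods their own AWS permissions, which suits a job like the token refresher, but it is the node role, not the pod's service account, that is used for image pulls.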
GCR and Artifact Registry
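For Google Container Registry (GCR) and Artifact Registry on GKE, no key file is needed: nodes pull images with the node pool's service account, which only needs read access to the repository. Workload Identity removes service account keys for the pods' own Google API calls, but it does not govern image pulls. For non-GKE clusters, use a JSON service account key.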
# Create secret from GCP service account key file
kubectl create secret docker-registry gcr-secret \
  --docker-server=us-docker.pkg.dev \
  --docker-username=_json_key \
  --docker-password="$(cat my-sa-key.json)" \
  --docker-email=sa@my-project.iam.gserviceaccount.com \
  -n production
# For GCR (older):
# --docker-server=gcr.io
Attaching Secrets to ServiceAccounts
Instead of adding imagePullSecrets to every pod spec, you can attach the secret to the namespace's default ServiceAccount. Kubernetes then automatically injects the pull secret into every pod that uses that ServiceAccount.
# Patch the default ServiceAccount to include the pull secret
kubectl patch serviceaccount default \
  -n production \
  -p '{"imagePullSecrets": [{"name": "ecr-secret"}]}'
# Verify
kubectl get serviceaccount default -n production -o yaml
# imagePullSecrets:
# - name: ecr-secret
For namespaces that contain many deployments pulling from the same registry, this approach saves considerable repetition. The downside is that every pod running under the default ServiceAccount, including pods you did not intend, gets the pull credentials injected. For stricter security, create a dedicated ServiceAccount with the pull secret and reference it only in the pods that need registry access.
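The dedicated ServiceAccount approach uses the same patch as before, just against a different account. As a sketch, the patch document can be generated for any number of secrets; pull_secrets_patch and the account name registry-puller are hypothetical names chosen for illustration.

```shell
# pull_secrets_patch SECRET...
# Print a patch document that sets imagePullSecrets on a ServiceAccount.
# Hypothetical helper; secret names are assumed to need no JSON escaping.
pull_secrets_patch() {
  items=""
  for name in "$@"; do
    items="${items:+$items,}{\"name\":\"$name\"}"
  done
  printf '{"imagePullSecrets":[%s]}\n' "$items"
}

# Possible use with a dedicated account (registry-puller is a made-up name):
#   kubectl create serviceaccount registry-puller -n production
#   kubectl patch serviceaccount registry-puller -n production \
#     -p "$(pull_secrets_patch ecr-secret)"
```

Pods that need registry access would then set serviceAccountName to that account, and everything else in the namespace keeps running without the credentials.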
Pull-Through Cache with Harbor or ECR
A pull-through cache proxies image pulls from upstream registries and caches them locally. This reduces external bandwidth costs, speeds up pod startup (images are served from within the cluster VPC), and provides resilience when the upstream registry has an outage.
# Harbor project configured as a Docker Hub proxy
# In Harbor UI: New Project > Enable "Proxy Cache" > Target: Docker Hub
# Once Harbor is configured, reference images via your Harbor hostname:
# harbor.internal.example.com/dockerhub-cache/library/nginx:1.25
# ECR pull-through cache for Docker Hub images
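# aws ecr create-pull-through-cache-rule \
#   --ecr-repository-prefix dockerhub \
#   --upstream-registry-url registry-1.docker.io \
#   --region us-east-1
# Docker Hub as an upstream also needs --credential-arn, pointing at a
# Secrets Manager secret that holds Docker Hub credentials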
# Then use images as:
# 123456789.dkr.ecr.us-east-1.amazonaws.com/dockerhub/library/nginx:1.25
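Both caches follow the same naming pattern: the cache prefix, then the Docker Hub repository path, with official images living under library/. A minimal sketch of that rewrite, assuming the input is a Docker Hub short name with no registry host in it; via_cache is a hypothetical helper name.

```shell
# via_cache CACHE_PREFIX IMAGE
# Rewrite a Docker Hub image name to go through a pull-through cache.
# Hypothetical helper; assumes IMAGE carries no registry hostname.
via_cache() {
  prefix=$1
  image=$2
  case "$image" in
    */*) echo "$prefix/$image" ;;           # user or org repository
    *)   echo "$prefix/library/$image" ;;   # official image
  esac
}

via_cache harbor.internal.example.com/dockerhub-cache nginx:1.25
via_cache 123456789.dkr.ecr.us-east-1.amazonaws.com/dockerhub nginx:1.25
```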
Tip: Pre-pull your key images onto each node, for example with crictl pull. This is especially useful for large base images like Java or Python that take 30+ seconds to pull.
Production Tips
- Pin image digests in CI/CD: resolve the tag to a digest at build time and write the digest into the Deployment manifest. This guarantees exactly the same bits run in every environment.
- Namespace-scoped secrets: imagePullSecrets must exist in the same namespace as the pod that references them. Copy secrets to all relevant namespaces or use a controller like kubernetes-reflector to sync secrets across namespaces automatically.
- Image size matters: large images slow down node scale-outs. Use multi-stage builds and distroless or Alpine base images. A 50 MB image starts in seconds; a 2 GB image can take minutes on a cold node.
- Rate limiting: Docker Hub limits unauthenticated pulls to 100/6h per IP and authenticated pulls to 200/6h for free accounts. In clusters where many nodes share one egress IP, all nodes count against the same limit and exhaust it quickly. Use authenticated pulls or a pull-through cache.
# Check Docker Hub rate limit remaining
TOKEN=$(curl -s "https://auth.docker.io/token?service=registry.docker.io&scope=repository:ratelimitpreview/test:pull" | jq -r .token)
curl -s --head -H "Authorization: Bearer $TOKEN" \
  https://registry-1.docker.io/v2/ratelimitpreview/test/manifests/latest \
  | grep -i ratelimit
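For alerting, the remaining count is more useful as a bare number than as a raw header. A sketch that extracts it, assuming the response carries a header of the form ratelimit-remaining: N;w=SECONDS; ratelimit_remaining is a hypothetical helper that reads the headers on standard input.

```shell
# Read HTTP response headers on stdin and print the remaining pull count.
# Hypothetical helper; assumes a "ratelimit-remaining: N;w=SECONDS" header.
ratelimit_remaining() {
  tr -d '\r' \
    | awk -F': *' 'tolower($1) == "ratelimit-remaining" { split($2, a, ";"); print a[1] }'
}

# Possible use with the request above:
#   curl -s --head -H "Authorization: Bearer $TOKEN" \
#     https://registry-1.docker.io/v2/ratelimitpreview/test/manifests/latest \
#     | ratelimit_remaining
```

If the header is absent, the function prints nothing, which a calling script should treat as "unknown" rather than zero.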