Kubernetes Deployments: Rolling Updates, Rollbacks and Scaling (2026)
Deployments are the workhorse of Kubernetes for running stateless applications. They wrap ReplicaSets with rich update semantics — declarative rolling updates, one-command rollbacks, revision history, and horizontal scaling. Understanding every knob on the Deployment spec separates engineers who can confidently ship to production from those who cause incidents during deploys.
Deployment Spec Deep Dive
A Deployment manages a desired number of identical pod replicas. Under the hood it creates and manages ReplicaSets — when you update the pod template, a new ReplicaSet is created and the old one is scaled down.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
  namespace: production
  labels:
    app: api-server
spec:
  replicas: 4
  # Pods are selected by this label; it must match the pod template labels
  selector:
    matchLabels:
      app: api-server
  # Retain the last 10 ReplicaSets for rollback history
  revisionHistoryLimit: 10
  # Seconds the rollout may go without progress before it is reported as failed
  progressDeadlineSeconds: 600
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # Allow 1 extra pod above replicas during update
      maxUnavailable: 0  # Never reduce below desired replica count
  template:
    metadata:
      labels:
        app: api-server
      annotations:
        # Trigger a rolling restart when a ConfigMap changes by updating this annotation
        checksum/config: "abc123"
    spec:
      containers:
        - name: api
          image: myrepo/api-server:2.1.0
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: "250m"
              memory: "256Mi"
            limits:
              cpu: "1"
              memory: "512Mi"
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5
            failureThreshold: 3
          livenessProbe:
            httpGet:
              path: /health/live
              port: 8080
            periodSeconds: 10
            failureThreshold: 3
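The checksum/config annotation in the manifest only triggers a rollout if its value changes whenever the configuration changes. One way to get that is to derive the value from the config contents. A minimal sketch, assuming the rendered config is available as text; the function name config_checksum and the sample values are illustrative and not part of any Kubernetes API:

```python
import hashlib


def config_checksum(config_text: str) -> str:
    """Return a SHA-256 hex digest of the rendered config contents."""
    return hashlib.sha256(config_text.encode("utf-8")).hexdigest()


# Hypothetical config contents before and after an edit.
old = config_checksum("LOG_LEVEL=info\n")
new = config_checksum("LOG_LEVEL=debug\n")

# Different contents give a different annotation value, which changes
# the pod template and therefore starts a new rollout.
print(old != new)  # True
```

Writing the digest into the pod template annotation at deploy time is left to whatever templating tool you already use.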
Rolling Update Strategy
The rolling update strategy replaces old pods with new ones gradually, keeping the application available throughout. The two key parameters are:
- maxSurge: how many extra pods above replicas can exist during the update. Can be an integer or a percentage (e.g., 25%).
- maxUnavailable: how many pods below replicas are acceptable during the update. Can also be a percentage.
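A common production setting is maxUnavailable: 0 with maxSurge: 1 (or 25%). This ensures your service never drops below full capacity during a deploy. The trade-off is that you temporarily need capacity for the surge pods. For cost-sensitive clusters where capacity is tight, use maxSurge: 0, maxUnavailable: 1. This terminates one old pod before starting its replacement, so the rollout needs no extra capacity but runs one pod short while it proceeds. The two values cannot both be 0.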
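When the values are percentages, Kubernetes rounds maxSurge up and maxUnavailable down to whole pods. The sketch below works through that arithmetic; the helper names resolve and rollout_bounds are illustrative, and the code models only the rounding, not the controller itself:

```python
def resolve(value, replicas, round_up):
    """Resolve an int-or-percent field (e.g. 1 or "25%") to a pod count."""
    if isinstance(value, str) and value.endswith("%"):
        whole, remainder = divmod(replicas * int(value[:-1]), 100)
        # maxSurge rounds up, maxUnavailable rounds down.
        return whole + (1 if round_up and remainder else 0)
    return int(value)


def rollout_bounds(replicas, max_surge, max_unavailable):
    """Return (max total pods, min available pods) during a rollout."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    return replicas + surge, replicas - unavailable


print(rollout_bounds(4, 1, 0))           # (5, 4)
print(rollout_bounds(10, "25%", "25%"))  # (13, 8)
```

With 4 replicas, maxSurge: 1 and maxUnavailable: 0, the rollout never runs more than 5 pods and never fewer than 4 available ones.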
# Trigger a rolling update by changing the image
kubectl set image deployment/api-server api=myrepo/api-server:2.2.0 -n production
# Watch the rollout progress
kubectl rollout status deployment/api-server -n production
# Waiting for deployment "api-server" rollout to finish: 1 out of 4 new replicas have been updated...
# Waiting for deployment "api-server" rollout to finish: 2 out of 4 new replicas have been updated...
# Waiting for deployment "api-server" rollout to finish: 3 out of 4 new replicas have been updated...
# Waiting for deployment "api-server" rollout to finish: 1 old replicas are pending termination...
# deployment "api-server" successfully rolled out
# See which ReplicaSets exist (old ones are scaled to 0)
kubectl get rs -n production -l app=api-server
# NAME DESIRED CURRENT READY AGE
# api-server-7d9f8b6c4d 4 4 4 2m <-- new
# api-server-6c8b7a5d3e 0 0 0 3d <-- old
Rollbacks and History
Kubernetes keeps a configurable number of old ReplicaSets to enable instant rollbacks. No re-building of images required — the old ReplicaSet already has the correct pod template.
# View rollout history
kubectl rollout history deployment/api-server -n production
# REVISION CHANGE-CAUSE
# 1 Initial deployment
# 2 Update to v2.1.0
# 3 Update to v2.2.0
# Annotate the reason for a change (shows up in history)
kubectl annotate deployment/api-server kubernetes.io/change-cause="Update to v2.2.0" -n production
# Inspect a specific revision
kubectl rollout history deployment/api-server --revision=2 -n production
# Rollback to previous revision
kubectl rollout undo deployment/api-server -n production
# Rollback to a specific revision
kubectl rollout undo deployment/api-server --to-revision=1 -n production
# Watch the rollback
kubectl rollout status deployment/api-server -n production
revisionHistoryLimit defaults to 10. Each retained revision keeps a ReplicaSet object in etcd. For clusters with many deployments and frequent deploys, reduce this to 3–5 to keep etcd lean.
Manual and Automatic Scaling
Manual Scaling
# Scale a deployment
kubectl scale deployment/api-server --replicas=8 -n production
# Conditional scale (only if current replicas match)
kubectl scale deployment/api-server --replicas=8 --current-replicas=4 -n production
Horizontal Pod Autoscaler (HPA)
HPA automatically adjusts replicas based on observed metrics. The most common trigger is CPU utilisation, measured as a percentage of each pod's CPU request. Custom metrics (for example, Prometheus metrics exposed through a metrics adapter) and external metrics are also supported.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-server-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server
  minReplicas: 4
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60  # target 60% of requested CPU, averaged across pods
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 70
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300  # wait 5 min before scaling down
      policies:
        - type: Pods
          value: 2
          periodSeconds: 60  # remove at most 2 pods per minute
    scaleUp:
      stabilizationWindowSeconds: 0  # scale up immediately
      policies:
        - type: Pods
          value: 4
          periodSeconds: 60  # add at most 4 pods per minute
# Check HPA status
kubectl get hpa -n production
# NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS
# api-server-hpa Deployment/api-server 45%/60%, 30%/70% 4 20 4
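To see why the status above settles on 4 replicas, it helps to work through the scaling formula the HPA documents: desired = ceil(current * currentMetric / targetMetric), skipped when the ratio is within a 10% default tolerance of 1.0, with the largest result across metrics winning. The sketch below is a simplified model under those assumptions; the function names are illustrative, and it ignores the behavior policies, stabilization windows and not-yet-ready pods:

```python
import math


def desired_replicas(current, usage_pct, target_pct, tolerance=0.1):
    """Apply the HPA ratio formula to a single metric."""
    ratio = usage_pct / target_pct
    if abs(ratio - 1.0) <= tolerance:
        return current  # close enough to target: no change
    return math.ceil(current * ratio)


def hpa_recommendation(current, metrics, min_replicas, max_replicas):
    """Take the largest per-metric proposal, then clamp to min/max."""
    proposal = max(desired_replicas(current, used, target)
                   for used, target in metrics)
    return max(min_replicas, min(max_replicas, proposal))


# CPU 45%/60% and memory 30%/70% at 4 replicas, as in the status output.
print(hpa_recommendation(4, [(45, 60), (30, 70)], 4, 20))  # 4

# If CPU climbs to 90% of requests, CPU drives the scale-up.
print(hpa_recommendation(4, [(90, 60), (30, 70)], 4, 20))  # 6
```

In the first case the raw proposal is 3, but minReplicas: 4 holds the floor.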
Recreate vs Rolling Update
The Recreate strategy terminates all existing pods before creating new ones. This causes downtime but is necessary when your new version is incompatible with the old one (e.g., incompatible database schema changes that cannot run with the old code simultaneously).
strategy:
  type: Recreate
  # No rollingUpdate block needed
Use Recreate for:
- Stateful apps that cannot run two versions concurrently
- Apps that grab an exclusive lock on startup
- Batch jobs where a clean slate is required
Canary Deployment Pattern
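A canary release routes a small percentage of traffic to the new version while the majority still goes to the stable version. In Kubernetes without a service mesh, you approximate this with two Deployments whose pods all match the same Service selector. The Service spreads traffic across every ready pod, so the canary's share is roughly its share of the total replica count.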
# stable deployment: 9 replicas
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server-stable
  namespace: production
spec:
  replicas: 9
  selector:
    matchLabels:
      app: api-server
      track: stable  # keeps this selector from overlapping with the canary Deployment
  template:
    metadata:
      labels:
        app: api-server
        track: stable
    spec:
      containers:
        - name: api
          image: myrepo/api-server:2.1.0
---
# canary deployment: 1 replica (~10% traffic)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server-canary
  namespace: production
spec:
  replicas: 1
  selector:
    matchLabels:
      app: api-server
      track: canary  # keeps this selector from overlapping with the stable Deployment
  template:
    metadata:
      labels:
        app: api-server
        track: canary
    spec:
      containers:
        - name: api
          image: myrepo/api-server:2.2.0
---
# Single Service selects pods from BOTH deployments
apiVersion: v1
kind: Service
metadata:
  name: api-server
  namespace: production
spec:
  selector:
    app: api-server  # matches both stable and canary pods
  ports:
    - port: 80
      targetPort: 8080
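The "~10% traffic" figure follows directly from the replica counts. A small sketch of that arithmetic, assuming the Service spreads requests evenly across ready pods (long-lived keep-alive connections can skew this in practice); canary_share is an illustrative helper name:

```python
def canary_share(stable_replicas, canary_replicas):
    """Approximate fraction of requests that reach canary pods."""
    total = stable_replicas + canary_replicas
    return canary_replicas / total


print(f"{canary_share(9, 1):.0%}")  # 10%
```

The granularity is limited by the pod count: with 10 pods in total you can only move the canary share in steps of 10%. Finer splits need more replicas or traffic-level routing.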
Blue/Green Deployments
In a blue/green deployment, both versions run simultaneously but only one receives traffic. Switching over is instant — you just update the Service selector. If the new version has issues, switching back is equally instant.
# Blue deployment (currently live)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server-blue
  namespace: production
spec:
  replicas: 4
  selector:
    matchLabels:
      app: api-server
      slot: blue
  template:
    metadata:
      labels:
        app: api-server
        slot: blue
    spec:
      containers:
        - name: api
          image: myrepo/api-server:2.1.0
---
# Green deployment (new version, warming up)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server-green
  namespace: production
spec:
  replicas: 4
  selector:
    matchLabels:
      app: api-server
      slot: green
  template:
    metadata:
      labels:
        app: api-server
        slot: green
    spec:
      containers:
        - name: api
          image: myrepo/api-server:2.2.0
---
# Service: switch by changing slot: blue to slot: green
apiVersion: v1
kind: Service
metadata:
  name: api-server
  namespace: production
spec:
  selector:
    app: api-server
    slot: blue  # change to 'green' to cut over
  ports:
    - port: 80
      targetPort: 8080
# Instant cutover: patch the Service selector
kubectl patch service api-server -n production \
-p '{"spec":{"selector":{"app":"api-server","slot":"green"}}}'
# Instant rollback if issues found
kubectl patch service api-server -n production \
-p '{"spec":{"selector":{"app":"api-server","slot":"blue"}}}'
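A typo in the slot value would leave the Service selecting no pods at all, so it can be worth generating the patch body rather than typing it. A minimal sketch, assuming the two slot names used above; selector_patch is a hypothetical helper, and its output is meant to be passed to kubectl patch via -p:

```python
import json

VALID_SLOTS = ("blue", "green")


def selector_patch(slot):
    """Build the JSON patch body that points the Service at one slot."""
    if slot not in VALID_SLOTS:
        raise ValueError(f"unknown slot {slot!r}, expected one of {VALID_SLOTS}")
    return json.dumps(
        {"spec": {"selector": {"app": "api-server", "slot": slot}}}
    )


print(selector_patch("green"))
```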
Frequently Asked Questions
What is the difference between a Deployment and a ReplicaSet?
A ReplicaSet ensures a specified number of pod replicas are running at all times, but it has no concept of update history or rolling updates. A Deployment wraps and manages ReplicaSets, adding declarative update semantics, rollback capability, and revision history. You should almost always use Deployments rather than creating ReplicaSets directly.
How do I force a rolling restart without changing the image?
Use kubectl rollout restart deployment/api-server -n production. This triggers a rolling restart by adding a kubectl.kubernetes.io/restartedAt annotation to the pod template, which counts as a template change and triggers a new rollout.
My deployment is stuck with some old pods still running. What do I check?
First run kubectl rollout status deployment/api-server to see if it's progressing. Then check kubectl describe deployment/api-server for conditions. Common causes: new pods failing readiness probes (so the rollout doesn't proceed), insufficient cluster resources to schedule new pods, or an image pull error. Check new pod logs with kubectl logs and events with kubectl get events.
Can HPA and manual scaling conflict?
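Yes. If you manually set replicas with kubectl scale while an HPA is active, the HPA will override your manual setting on its next evaluation cycle (every 15 seconds by default). Treat the HPA as the authority for replica count when it's enabled. An HPA has no pause switch, so to override it temporarily either delete the HPA or set minReplicas and maxReplicas to the value you need. Scaling to zero is the exception: when the target's replicas is set to 0 and minReplicas is greater than 0, the HPA stops adjusting the target until you scale it back up.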
What is progressDeadlineSeconds and why does it matter?
progressDeadlineSeconds (default 600 seconds) is the maximum time Kubernetes will wait for a Deployment rollout to make progress before marking it as failed with a ProgressDeadlineExceeded condition. Kubernetes only reports the condition; it does not roll back on its own. This matters for CI/CD pipelines: kubectl rollout status deployment/api-server -n production --timeout=10m blocks until the rollout succeeds, fails, or times out, and exits non-zero in the latter two cases, so the pipeline does not need its own polling loop.
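A pipeline step built on that exit code might look like the sketch below. It assumes kubectl is on the PATH and already configured for the cluster. Rolling back automatically on failure is one possible policy, not something Kubernetes does for you. The function name gate_rollout and the injectable run parameter are illustrative:

```python
import subprocess


def gate_rollout(deployment, namespace, timeout="10m", run=subprocess.run):
    """Wait for a rollout; undo it if it fails. Returns True on success."""
    base = ["kubectl", "rollout"]
    target = [f"deployment/{deployment}", "-n", namespace]

    # rollout status exits non-zero on timeout or ProgressDeadlineExceeded.
    status = run(base + ["status"] + target + [f"--timeout={timeout}"])
    if status.returncode == 0:
        return True

    # Assumed policy: return to the previous revision on failure.
    run(base + ["undo"] + target)
    return False


# Example call from a pipeline script:
# ok = gate_rollout("api-server", "production")
```

Passing the runner in as a parameter keeps the function easy to exercise without a cluster.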