Kubernetes Pod Disruption Budgets: High Availability Guide (2026)

Understanding Kubernetes Disruptions

In a production Kubernetes cluster, pods are evicted and rescheduled constantly. Not all of these events are equal — some are planned by operators and some are imposed by the infrastructure. Kubernetes formalises this distinction as voluntary and involuntary disruptions.

Voluntary Disruptions

Voluntary disruptions are operator-initiated actions that intentionally remove pods from a node:

Node drain — kubectl drain evicts all pods before maintenance or decommissioning.
Node upgrades — rolling OS or Kubernetes version upgrades drain nodes one at a time.
Cluster autoscaler scale-down — the autoscaler evicts pods to consolidate workloads onto fewer nodes and reduce cost.
Deployment rollouts — new pod versions replace old ones; this is voluntary at the application layer.
Manual pod deletion — kubectl delete pod for debugging or forced restarts.

Involuntary Disruptions

Involuntary disruptions are caused by failures outside operator control:

Node hardware failure — a physical or virtual machine dies unexpectedly.
Kernel panic or OS crash — the node becomes NotReady and Kubernetes evicts its pods.
Out-of-memory (OOM) kill: the kernel OOM killer terminates containers that exceed their memory limits, and the kubelet may evict pods when the node itself runs low on memory.
Network partition — a node becomes unreachable and its pods are eventually evicted.
Cloud provider preemption — spot/preemptible instances are reclaimed by the cloud provider.

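Key insight: Pod Disruption Budgets only protect against voluntary disruptions that go through the Eviction API, such as node drains and autoscaler scale-downs. Deleting a pod or its controller directly bypasses them, and Deployment rollouts are governed by the rollout strategy rather than by the PDB. They have no effect on involuntary disruptions such as node hardware failures, although pods lost that way still count against the budget. PDBs work in combination with replica counts and topology spread to achieve full HA.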

PDB Spec: minAvailable vs maxUnavailable

A PodDisruptionBudget is a namespaced resource that tells the Kubernetes Eviction API how many pods of a given selector must remain healthy during a voluntary disruption. The spec has two mutually exclusive fields — use exactly one.

minAvailable

Specifies the minimum number (or percentage) of pods that must remain available after the disruption. If evicting a pod would drop availability below this threshold, the eviction is denied.

# Absolute number: at least 2 pods must always be running
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb
  namespace: production
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: web-frontend

# Percentage: at least 75% of pods must always be running
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb-pct
  namespace: production
spec:
  minAvailable: "75%"
  selector:
    matchLabels:
      app: web-frontend

maxUnavailable

Specifies the maximum number (or percentage) of pods that may be unavailable at any time. This is the mirror of minAvailable and is often more intuitive when you think in terms of "how much can I take down at once?"

# Allow at most 1 pod to be unavailable at a time
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-pdb
  namespace: production
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: api-service

# Allow at most 25% of pods to be unavailable
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-pdb-pct
  namespace: production
spec:
  maxUnavailable: "25%"
  selector:
    matchLabels:
      app: api-service

Percentages are rounded up for both minAvailable and maxUnavailable. For a 4-replica deployment with minAvailable: "75%", Kubernetes requires at least 3 pods (ceil(4 × 0.75) = 3), so at most 1 can be evicted at a time. With 7 replicas and minAvailable: "50%", 4 pods must stay available.
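The rounding can be checked with a few lines of shell arithmetic. This is only a sketch of the calculation, assuming the round-up (ceiling) behaviour that Kubernetes applies to percentage values in both fields; ceil_pct is a hypothetical helper name, not a kubectl feature.

```shell
# ceil_pct <total_pods> <percent>  ->  ceil(total_pods * percent / 100)
# Integer-only ceiling: add 99 before dividing by 100.
ceil_pct() {
  local total="$1" pct="$2"
  echo $(( (total * pct + 99) / 100 ))
}

ceil_pct 4 75   # minAvailable "75%" of 4 pods   -> 3
ceil_pct 7 50   # minAvailable "50%" of 7 pods   -> 4
ceil_pct 7 25   # maxUnavailable "25%" of 7 pods -> 2
```

Because maxUnavailable also rounds up, a percentage budget can allow slightly more disruption than the nominal percentage suggests (2 of 7 pods is about 29%).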

API Version

From Kubernetes 1.21+, policy/v1 is the stable API. The older policy/v1beta1 was removed in 1.25. Always use policy/v1 for new PDBs.

PDB with Deployments: Protecting a Web Service

The most common use case is protecting a stateless web or API service running as a Deployment. Here is a complete example for a 3-replica frontend deployment. See our Kubernetes Deployments guide for deployment fundamentals.

# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-frontend
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-frontend
  template:
    metadata:
      labels:
        app: web-frontend
    spec:
      containers:
      - name: frontend
        image: myregistry/frontend:v2.4.1
        ports:
        - containerPort: 8080
        resources:
          requests:
            cpu: "250m"
            memory: "256Mi"
          limits:
            cpu: "500m"
            memory: "512Mi"
        readinessProbe:
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 10

# pdb.yaml — companion PDB for the deployment above
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-frontend-pdb
  namespace: production
spec:
  minAvailable: 2          # always keep at least 2 of 3 pods running
  selector:
    matchLabels:
      app: web-frontend

With this configuration, a node drain can evict at most 1 frontend pod before it must wait for the replacement pod to become ready. This guarantees two-thirds of capacity during any single maintenance window.

Readiness probes are essential. A pod counts as healthy in PDB calculations only when its Ready condition is true. Without a readiness probe, a pod is marked Ready as soon as its containers start, so the budget may count replacement pods that cannot yet serve traffic and allow more disruptions than intended. Always define a readiness probe on pods protected by a PDB.

PDB with StatefulSets: Database Quorum Protection

Stateful workloads like databases, message brokers, and distributed caches require quorum — a majority of members must be available for the cluster to remain writable. A PDB is critical here. See our StatefulSets guide for the underlying concepts.

# statefulset.yaml (3-node database cluster)
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres-cluster
  namespace: data
spec:
  serviceName: postgres
  replicas: 3
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - name: postgres
        image: postgres:16
        ports:
        - containerPort: 5432
        env:
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: postgres-secret
              key: password

# pdb-database.yaml — quorum guard for 3-node Postgres
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: postgres-pdb
  namespace: data
spec:
  minAvailable: 2          # quorum = majority of 3 = 2
  selector:
    matchLabels:
      app: postgres

For a 5-node cluster (e.g., etcd or a Kafka KRaft controller quorum), quorum is 3, so set minAvailable: 3. Never set minAvailable below the quorum threshold, or a node drain can cost the cluster its quorum and leave it unavailable for writes.
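The quorum value is simply a strict majority of the member count. A small sketch, assuming a majority-based system and using a hypothetical helper named quorum:

```shell
# quorum <members>  ->  smallest strict majority, floor(members / 2) + 1
quorum() {
  echo $(( $1 / 2 + 1 ))
}

for n in 3 5 7; do
  echo "replicas=$n minAvailable=$(quorum "$n")"
done
# replicas=3 minAvailable=2
# replicas=5 minAvailable=3
# replicas=7 minAvailable=4
```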

Kafka tip: Size the broker PDB from min.insync.replicas rather than from the replication factor alone. For a 3-broker cluster with replication.factor=3 and min.insync.replicas=2, set minAvailable: 2 (or maxUnavailable: 1) so that at least 2 brokers are always up and every partition can keep its 2 in-sync replicas.

kubectl drain and the Eviction API

When you run kubectl drain <node>, Kubernetes does not simply delete pods. It calls the Eviction API for each pod, which enforces all active PDBs before proceeding.

Eviction Flow

kubectl sends an Eviction object to the API server for a pod.
The API server checks all PDBs whose selector matches the pod.
If evicting the pod would violate any PDB, the API server returns 429 Too Many Requests.
kubectl drain retries the eviction every few seconds until the PDB allows it (e.g., when a replacement pod becomes ready) or the drain timeout expires.
Once allowed, the pod is gracefully terminated (SIGTERM, then SIGKILL after terminationGracePeriodSeconds).
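The flow above can be approximated by hand, which is sometimes useful when testing a PDB against a single pod. The following is a minimal sketch and not how kubectl drain is implemented: eviction_body and evict_with_retry are hypothetical helper names, the KUBECTL variable exists only so the command can be swapped out, and the loop retries on any failure rather than checking specifically for a 429 response. It relies on kubectl create --raw to POST the Eviction object to the pod's eviction subresource.

```shell
KUBECTL="${KUBECTL:-kubectl}"

# Print a policy/v1 Eviction object for the given namespace and pod.
eviction_body() {
  local ns="$1" pod="$2"
  printf '{"apiVersion":"policy/v1","kind":"Eviction","metadata":{"name":"%s","namespace":"%s"}}\n' \
    "$pod" "$ns"
}

# evict_with_retry <namespace> <pod> [attempts] [delay_seconds]
# POSTs the Eviction and retries while the API server refuses it.
evict_with_retry() {
  local ns="$1" pod="$2" attempts="${3:-10}" delay="${4:-5}" i=1
  while [ "$i" -le "$attempts" ]; do
    if eviction_body "$ns" "$pod" |
       "$KUBECTL" create --raw "/api/v1/namespaces/${ns}/pods/${pod}/eviction" -f - >/dev/null; then
      echo "evicted ${ns}/${pod} on attempt ${i}"
      return 0
    fi
    echo "eviction of ${ns}/${pod} refused (attempt ${i}); retrying in ${delay}s" >&2
    sleep "$delay"
    i=$(( i + 1 ))
  done
  return 1
}

# Example (not run here): evict_with_retry production web-frontend-abc123 12 5
```

The pod name in the example is a placeholder. An eviction that succeeds here still triggers the normal graceful termination described in the last step.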

The --force and --disable-eviction Flags

kubectl drain --force does not bypass PDB checks. It lets the drain continue when the node hosts pods that are not managed by a controller (bare pods), which are then deleted for good. The flag that skips PDBs is --disable-eviction: it makes drain delete pods directly instead of calling the Eviction API. Never use it in production unless you fully understand the consequences, because it can take a quorum-sensitive service offline instantly.

# Safe drain (respects PDBs)
kubectl drain node-1 --ignore-daemonsets --delete-emptydir-data

# Dangerous drain (deletes pods directly, ignoring PDBs; production risk!)
kubectl drain node-1 --ignore-daemonsets --delete-emptydir-data --disable-eviction

Warning: Bare pods removed by --force will NOT be rescheduled anywhere. Only use --force in controlled recovery scenarios where data loss is acceptable.

Node Upgrade Workflow: Cordon → Drain → Upgrade → Uncordon

The standard pattern for zero-downtime node maintenance is a four-step process. Here is a complete bash script you can adapt for your cluster:

#!/bin/bash
# node-upgrade.sh — safe rolling node upgrade respecting PDBs
set -euo pipefail

NODE="${1:-}"   # default to empty so the usage check below still runs under set -u
if [ -z "$NODE" ]; then
  echo "Usage: $0 <node-name>"
  exit 1
fi

echo "=== Step 1: Cordon node (mark unschedulable) ==="
kubectl cordon "$NODE"

echo "=== Step 2: Drain node (evict pods, respects PDBs) ==="
kubectl drain "$NODE" \
  --ignore-daemonsets \
  --delete-emptydir-data \
  --grace-period=60 \
  --timeout=300s

echo "=== Step 3: Perform maintenance ==="
echo "Node $NODE is now empty. Perform your upgrade here."
echo "Press ENTER when upgrade is complete..."
read -r

echo "=== Step 4: Uncordon node (mark schedulable again) ==="
kubectl uncordon "$NODE"

echo "=== Done! Node $NODE is back in rotation ==="
kubectl get node "$NODE"

The --timeout=300s flag sets a 5-minute deadline for the entire drain. If PDBs still block evictions when the deadline passes, kubectl drain exits with an error instead of hanging indefinitely; because of set -e the script stops at that point and the node stays cordoned. Tune the timeout based on your pod startup time.

Pro tip: kubectl drain cordons the node itself before it starts evicting, so replacement pods cannot land back on the node being drained. The explicit cordon in Step 1 is still useful: it makes the intent visible and lets you stop new scheduling well before the drain begins.
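For a rolling upgrade the same workflow is repeated node by node, stopping at the first drain that fails. The sketch below is one way to structure that loop, under stated assumptions: drain_nodes_sequentially and do_maintenance are hypothetical names, do_maintenance is a placeholder to be replaced with the real upgrade step, and the KUBECTL variable is only there so the command can be substituted.

```shell
KUBECTL="${KUBECTL:-kubectl}"

# Placeholder for the real upgrade step; override it in your own script.
do_maintenance() {
  echo "maintenance placeholder for $1"
}

# drain_nodes_sequentially <node>...
# Drains, maintains, and uncordons one node at a time. Stops at the first
# failed drain so a blocked PDB never leads to several nodes being out at once.
drain_nodes_sequentially() {
  local node
  for node in "$@"; do
    echo "draining ${node}"
    if ! "$KUBECTL" drain "$node" \
         --ignore-daemonsets --delete-emptydir-data --timeout=300s; then
      echo "drain of ${node} failed or timed out; stopping (node stays cordoned)" >&2
      return 1
    fi
    do_maintenance "$node"
    "$KUBECTL" uncordon "$node"
  done
}

# Example (not run here): drain_nodes_sequentially node-1 node-2 node-3
```

Stopping on the first failure is a deliberate choice: continuing to the next node while a PDB is blocking would only queue up more stuck drains.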

Cluster Autoscaler and PDB Interaction

The Cluster Autoscaler (CA) scales down nodes by evicting pods and terminating idle nodes. PDBs directly affect this process — if a PDB would be violated by evicting a pod on a candidate scale-down node, the CA skips that node.

How CA Respects PDBs

CA calls the same Eviction API as kubectl drain.
If any pod on the candidate node is protected by a PDB that would be violated, the node is marked "not safe to remove" and skipped for that cycle.
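CA re-evaluates the node on later loops (the default --scan-interval is 10 seconds), and a node must stay unneeded for --scale-down-unneeded-time (10 minutes by default) before it is removed.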

The safe-to-evict Annotation

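The safe-to-evict: "true" annotation does not override a PDB. It tells CA that a pod which would otherwise block scale-down (for example a bare pod or a pod using local storage) may be evicted. Any matching PDB is still enforced. It suits pods such as batch jobs and log shippers:

# Tell CA this pod does not block scale-down (PDBs still apply)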
apiVersion: v1
kind: Pod
metadata:
  name: log-shipper
  annotations:
    cluster-autoscaler.kubernetes.io/safe-to-evict: "true"
spec:
  containers:
  - name: fluentd
    image: fluentd:v1.16

Conversely, to stop CA from evicting a particular pod, and therefore from removing the node it runs on:

cluster-autoscaler.kubernetes.io/safe-to-evict: "false"

Note: A single pod annotated safe-to-evict: "false" blocks CA from scaling down its node for as long as the pod runs there. Use sparingly, because it can lead to wasted cloud spend on idle nodes.

For more on resource-aware scheduling, see our Kubernetes Resource Management guide.

Reading PDB Status: disruptionsAllowed, currentHealthy, desiredHealthy

After applying a PDB, use kubectl get pdb to observe its live status. Understanding these fields is essential for diagnosing why a drain is stuck.

# List all PDBs in a namespace
kubectl get pdb -n production

# Detailed view including status
kubectl get pdb web-frontend-pdb -n production -o yaml

Example output:

NAME                 MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
web-frontend-pdb     2               N/A               1                     3d

The key status fields in the YAML output:

status:
  conditions:
  - lastTransitionTime: "2026-06-10T08:00:00Z"
    message: ""
    reason: SufficientPods
    status: "True"
    type: DisruptionAllowed
  currentHealthy: 3       # pods currently passing readiness probe
  desiredHealthy: 2       # minimum required healthy pods (= minAvailable)
  disruptionsAllowed: 1   # how many pods can be evicted right now
  expectedPods: 3         # total pods matched by selector
  observedGeneration: 1

Field meanings:

currentHealthy — pods that are Running and passing their readiness probe.
desiredHealthy — the floor derived from your minAvailable or maxUnavailable spec.
disruptionsAllowed: max(0, currentHealthy - desiredHealthy). When this is 0, evictions are blocked.
expectedPods — the total pods matched by the PDB selector, as reported by the controller.

Debugging stuck drains: If kubectl drain hangs, run kubectl get pdb -n <ns> and look for ALLOWED DISRUPTIONS: 0. Then check why currentHealthy is at or below desiredHealthy — usually a pod is stuck in Pending or failing its readiness probe.
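When many namespaces are involved, it helps to list only the budgets that currently allow nothing. A minimal sketch, assuming kubectl access to the cluster; filter_blocked_pdbs and list_blocked_pdbs are hypothetical helper names, and the filter is kept separate so it can be tried on sample text without a cluster.

```shell
KUBECTL="${KUBECTL:-kubectl}"

# Reads "<namespace> <name> <disruptionsAllowed>" rows on stdin and prints
# "<namespace>/<name>" for every budget that currently allows 0 disruptions.
filter_blocked_pdbs() {
  awk '$3 == 0 { print $1 "/" $2 }'
}

# Query every namespace and keep only the blocked budgets.
list_blocked_pdbs() {
  "$KUBECTL" get pdb --all-namespaces --no-headers \
    -o custom-columns='NS:.metadata.namespace,NAME:.metadata.name,ALLOWED:.status.disruptionsAllowed' |
    filter_blocked_pdbs
}

# Trying the filter on sample rows:
printf '%s\n' \
  'production web-frontend-pdb 1' \
  'data postgres-pdb 0' | filter_blocked_pdbs
# data/postgres-pdb
```

For each budget it reports, kubectl get pdb <name> -n <ns> -o yaml shows currentHealthy and desiredHealthy, which is where the investigation continues.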

Common Pitfall: PDB with a Single Replica

The most frequent misconfiguration seen in production clusters is applying minAvailable: 1 to a single-replica Deployment. This seems harmless but creates a permanent drain blocker.

# DANGEROUS: single-replica deployment with minAvailable: 1
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: single-app-pdb
spec:
  minAvailable: 1     # requires 1 pod, but there is only 1 pod total
  selector:
    matchLabels:
      app: single-app

With 1 replica and minAvailable: 1, disruptionsAllowed is always 0. No eviction is ever permitted. Node drains will hang indefinitely waiting for this pod to be evictable — which it never will be.

Solutions:

Scale the deployment to at least 2 replicas so the math works: currentHealthy(2) - desiredHealthy(1) = 1 disruption allowed.
Use maxUnavailable: 1 instead of minAvailable: 1. For a single-replica deployment this sets desiredHealthy to 0 and allows 1 disruption, so drains proceed, at the cost of brief downtime while the replacement pod starts.
Remove the PDB entirely if the workload does not require HA and can tolerate brief downtime during maintenance.
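The arithmetic behind these options can be worked through in a few lines. This sketch mirrors the status fields described earlier (desiredHealthy and disruptionsAllowed) for integer budgets only; allowed_disruptions is a hypothetical helper, not a kubectl command.

```shell
# allowed_disruptions <expected_pods> <healthy_pods> <min|max> <value>
#   min: desiredHealthy = minAvailable
#   max: desiredHealthy = expectedPods - maxUnavailable
# disruptionsAllowed = max(0, healthy - desiredHealthy)
allowed_disruptions() {
  local expected="$1" healthy="$2" mode="$3" value="$4" desired allowed
  case "$mode" in
    min) desired="$value" ;;
    max) desired=$(( expected - value )) ;;
    *)   echo "mode must be min or max" >&2; return 2 ;;
  esac
  if [ "$desired" -lt 0 ]; then desired=0; fi
  allowed=$(( healthy - desired ))
  if [ "$allowed" -lt 0 ]; then allowed=0; fi
  echo "$allowed"
}

allowed_disruptions 1 1 min 1   # single replica, minAvailable: 1   -> 0 (drain blocked)
allowed_disruptions 2 2 min 1   # two replicas,   minAvailable: 1   -> 1
allowed_disruptions 1 1 max 1   # single replica, maxUnavailable: 1 -> 1 (drain proceeds)
```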

Warning: The same trap applies to percentage-based budgets. minAvailable: "100%" on any deployment is equivalent — it permanently blocks all voluntary disruptions. Only use 100% if you truly require zero tolerance for pod eviction.

PDB for Jobs and CronJobs

PDBs can technically be applied to pods created by Jobs and CronJobs, but the semantics are different and often counter-productive.

When PDB Applies to Jobs

If a Job pod is running and a PDB with a matching selector is present, eviction of that pod will be gated by the budget.
This means a node drain may be blocked until a Job pod completes — which could be hours or days for long-running batch jobs.

When PDB Does NOT Apply

Completed pods (phase Succeeded or Failed) are not counted and are not subject to PDB eviction checks.
CronJob-spawned pods follow the same rules as regular Job pods once running.

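Recommendation: For interruptible batch Jobs, skip the PDB; the safe-to-evict: "true" annotation tells Cluster Autoscaler that the pod should not hold up scale-down. For long-running Jobs that must not be interrupted, a PDB with an integer minAvailable blocks evictions while the pods run, so plan maintenance windows around them. Because Jobs do not expose a scale subresource, only an integer minAvailable is supported for their pods; maxUnavailable and percentage values are not.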

# Job with safe-to-evict — allows CA to evict without a PDB
apiVersion: batch/v1
kind: Job
metadata:
  name: data-migration
spec:
  template:
    metadata:
      annotations:
        cluster-autoscaler.kubernetes.io/safe-to-evict: "true"
    spec:
      restartPolicy: OnFailure
      containers:
      - name: migrator
        image: myregistry/migrator:v1.2

Testing PDBs: Simulate a Drain in Dev

Before relying on a PDB in production, validate it in a staging or dev cluster. The goal is to verify that traffic is not dropped during a drain.

Step 1: Start a continuous curl loop

# In one terminal — fire requests every 0.5 seconds and log failures
while true; do
  HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" http://your-service-endpoint/healthz)
  if [ "$HTTP_CODE" != "200" ]; then
    echo "$(date) — FAILED: HTTP $HTTP_CODE"
  else
    echo "$(date) — OK: $HTTP_CODE"
  fi
  sleep 0.5
done

Step 2: Drain a node while the curl loop runs

# In a second terminal — drain the node
kubectl drain dev-node-1 \
  --ignore-daemonsets \
  --delete-emptydir-data \
  --grace-period=30 \
  --timeout=120s

Step 3: Observe

With a correctly configured PDB and readiness probe, the curl loop should show no failures. The drain will proceed one pod at a time, each replacement becoming ready before the next eviction is allowed.

Also verify with kubectl: In a third terminal, run kubectl get pods -n production -w to watch pod transitions in real time. You should see each old pod Terminating only after a new pod reaches Running/Ready.

Multi-Zone HA: PDB + topologySpreadConstraints

A PDB alone does not guarantee zone-level availability. If all replicas happen to land on nodes in the same availability zone, a single zone failure takes down your entire service — regardless of PDB. Combine PDB with topologySpreadConstraints to enforce zone spread. See also our Affinity Guide and Taints & Tolerations guide.

# Full HA deployment: 6 replicas spread across 3 zones + PDB
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ha-web-service
  namespace: production
spec:
  replicas: 6
  selector:
    matchLabels:
      app: ha-web
  template:
    metadata:
      labels:
        app: ha-web
    spec:
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app: ha-web
      - maxSkew: 1
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app: ha-web
      containers:
      - name: web
        image: myregistry/web:v3.1
        ports:
        - containerPort: 8080
        readinessProbe:
          httpGet:
            path: /healthz
            port: 8080
          periodSeconds: 5

# PDB companion — allow at most 1 disruption at a time
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: ha-web-pdb
  namespace: production
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: ha-web

With this setup:

topologySpreadConstraints ensures no single zone or node has more than 1 extra pod compared to others (maxSkew: 1).
The PDB ensures at most 1 pod is evicted at a time during any drain operation.
Even if an entire AZ fails (involuntary), 4 of 6 pods survive across the remaining 2 zones — enough to serve traffic.
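The surviving capacity after a zone loss follows directly from the replica count and the number of zones. A small sketch, assuming the pods are spread as evenly as maxSkew: 1 allows and that the fullest zone is the one that fails; survivors_after_zone_loss is a hypothetical helper name.

```shell
# survivors_after_zone_loss <replicas> <zones>
# With an even spread, the fullest zone holds ceil(replicas / zones) pods.
survivors_after_zone_loss() {
  local replicas="$1" zones="$2" worst_zone
  worst_zone=$(( (replicas + zones - 1) / zones ))
  echo $(( replicas - worst_zone ))
}

survivors_after_zone_loss 6 3   # -> 4
survivors_after_zone_loss 7 3   # -> 4 (the failed zone held 3 pods)
```

Comparing the result with the capacity needed at peak load is a quick way to decide whether the replica count leaves enough headroom.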

HPA + PDB + topology spread is the gold standard for stateless HA in Kubernetes. Add Horizontal Pod Autoscaling to dynamically adjust replicas under load. See our HPA Scaling guide for details.

Monitoring PDB Health

Integrate PDB status into your observability stack. Prometheus (via kube-state-metrics) exposes kube_poddisruptionbudget_status_pod_disruptions_allowed. Alert when this stays at 0 for an extended period, since it indicates that something (often a stuck pod) is blocking future maintenance. See our Kubernetes Monitoring with Prometheus guide for the full setup.

# Prometheus alert: PDB blocking all disruptions for 15 minutes
groups:
- name: pdb-alerts
  rules:
  - alert: PDBDisruptionsAllowedZero
    expr: kube_poddisruptionbudget_status_pod_disruptions_allowed == 0
    for: 15m
    labels:
      severity: warning
    annotations:
      summary: "PDB {{ $labels.poddisruptionbudget }} has 0 disruptions allowed"
      description: "Node drains will be blocked until this PDB allows at least 1 disruption."

Summary

Pod Disruption Budgets are a small but critical piece of Kubernetes high-availability architecture. Here is the quick-reference checklist:

Always create a PDB for every workload with 2 or more replicas in production.
Use minAvailable for quorum-sensitive stateful workloads; maxUnavailable is often more intuitive for stateless services.
Never set minAvailable: 1 on a single-replica deployment — it permanently blocks drains.
Define readiness probes on all PDB-protected pods so currentHealthy reflects actual service health.
Combine PDB with topologySpreadConstraints for zone-level resilience.
Monitor disruptionsAllowed in Prometheus and alert on persistent zeros.
Test your PDB with a curl loop + kubectl drain before relying on it in production.
