Kubernetes etcd Backup and Restore

etcd is the distributed key-value store that serves as the single source of truth for all Kubernetes cluster state — every pod, service, configmap, secret, and custom resource is stored in etcd. If etcd data is lost or corrupted without a backup, the entire cluster configuration is gone. A disciplined etcd backup strategy with tested restore procedures is therefore a non-negotiable requirement for production Kubernetes clusters.

Understanding etcd in Kubernetes

In a standard kubeadm-deployed cluster, etcd runs as a static pod on each control plane node. Its data directory is typically /var/lib/etcd. The kube-apiserver is the only component that communicates directly with etcd — all other components (controller manager, scheduler, kubelet) interact with the cluster state through the API server.

etcd uses the Raft consensus algorithm to maintain consistency across its members. In a 3-member etcd cluster, the cluster can tolerate 1 node failure. In a 5-member cluster, 2 failures can be tolerated. The minimum recommended configuration for production is 3 etcd members, deployed across separate availability zones.
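
The tolerance figures above follow from Raft's majority rule: a cluster of n members needs floor(n/2) + 1 members to agree on a write, and it can lose whatever is left over. A minimal sketch of that arithmetic (the function names are illustrative and not part of etcd):

```shell
#!/bin/sh
# Quorum size for an etcd cluster of N members: floor(N/2) + 1.
etcd_quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Number of members that can fail while the cluster still has quorum.
etcd_fault_tolerance() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 1 2 3 4 5; do
  echo "members=$n quorum=$(etcd_quorum "$n") tolerated_failures=$(etcd_fault_tolerance "$n")"
done
```

An even member count adds no tolerance: 4 members survive 1 failure, the same as 3, which is why odd cluster sizes are the usual choice.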

# Check etcd member list and health
sudo ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  member list

# Check etcd cluster health
sudo ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  endpoint health

Certificate paths: TLS client certificates are required whenever etcdctl talks to a secured etcd cluster, which is the kubeadm default. The paths shown above are the standard kubeadm locations. The etcd static pod is named etcd-<node-name>; to confirm the paths on your cluster, run kubectl describe pod etcd-<node-name> -n kube-system | grep -i -A20 command.
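
If kubectl is unavailable (for example during an outage), the same paths can be read from the static pod manifest on the node. A small sketch, assuming the default kubeadm manifest location; the helper name etcd_tls_flags is made up for illustration:

```shell
#!/bin/sh
# Print the server TLS flags found in an etcd static pod manifest.
# Usage: etcd_tls_flags [manifest-path]
etcd_tls_flags() {
  manifest="${1:-/etc/kubernetes/manifests/etcd.yaml}"
  # Matches entries such as "- --cert-file=/etc/kubernetes/pki/etcd/server.crt".
  # The leading "--" in the pattern excludes the --peer-* variants.
  grep -E -o -- '--(cert-file|key-file|trusted-ca-file)=[^[:space:]]+' "$manifest"
}
```

The values map onto the etcdctl flags used above: --trusted-ca-file corresponds to --cacert, --cert-file to --cert, and --key-file to --key.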

Manual Backup with etcdctl

The etcdctl snapshot save command creates a point-in-time snapshot of etcd data. The snapshot is consistent and safe to take on a live cluster — etcd creates the snapshot atomically without interrupting cluster operations.

# Create a snapshot — run on a control plane node
BACKUP_FILE="/backup/etcd-$(date +%Y%m%d-%H%M%S).db"

sudo ETCDCTL_API=3 etcdctl snapshot save "$BACKUP_FILE" \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key

# Verify the snapshot integrity
sudo ETCDCTL_API=3 etcdctl snapshot status "$BACKUP_FILE" \
  --write-out=table

The snapshot status output shows the snapshot hash, revision, total keys, and total size. A typical production cluster snapshot is 50-500 MB depending on the number of resources and secrets stored.

# Example output of snapshot status
+----------+----------+------------+------------+
|   HASH   | REVISION | TOTAL KEYS | TOTAL SIZE |
+----------+----------+------------+------------+
| 0x3d2a1b | 2847391  |    12453   |   128 MB   |
+----------+----------+------------+------------+
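
Because a truncated or empty snapshot is worse than none, a cheap extra guard is to reject files that are implausibly small before uploading them or pruning older backups. A minimal sketch; the helper name and the default threshold are assumptions you would tune to your own cluster:

```shell
#!/bin/sh
# Succeed (and print the size in bytes) only if the snapshot file exists
# and is at least MIN_BYTES large.
# Usage: snapshot_size_ok FILE [MIN_BYTES]
snapshot_size_ok() {
  file="$1"
  min_bytes="${2:-1024}"   # assumed floor; real snapshots are far larger
  [ -f "$file" ] || { echo "missing snapshot: $file" >&2; return 1; }
  size=$(wc -c < "$file" | tr -d ' ')
  if [ "$size" -lt "$min_bytes" ]; then
    echo "snapshot too small: $size bytes (< $min_bytes)" >&2
    return 1
  fi
  echo "$size"
}
```

In a backup script this could run right after snapshot status, for example: snapshot_size_ok "$BACKUP_FILE" 1048576 || exit 1.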

Automated Backup CronJob

Manual backups are error-prone and easily forgotten. Automate etcd backups using a Kubernetes CronJob that runs on the control plane node using the host network and etcd certificate mounts.

apiVersion: batch/v1
kind: CronJob
metadata:
  name: etcd-backup
  namespace: kube-system
spec:
  schedule: "0 */6 * * *"   # Every 6 hours
  concurrencyPolicy: Forbid
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 3
  jobTemplate:
    spec:
      template:
        spec:
          hostNetwork: true
          nodeSelector:
            node-role.kubernetes.io/control-plane: ""
          tolerations:
            - key: node-role.kubernetes.io/control-plane
              effect: NoSchedule
          restartPolicy: OnFailure
          containers:
            - name: etcd-backup
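              image: bitnami/etcd:3.5
              securityContext:
                runAsUser: 0   # the kubeadm key files and the hostPath backup directory are root-only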
              command:
                - /bin/sh
                - -c
                - |
                  set -eu   # stop on the first error so a failed snapshot never triggers pruning
                  BACKUP_FILE="/backup/etcd-$(date +%Y%m%d-%H%M%S).db"
                  ETCDCTL_API=3 etcdctl snapshot save "$BACKUP_FILE" \
                    --endpoints=https://127.0.0.1:2379 \
                    --cacert=/etc/kubernetes/pki/etcd/ca.crt \
                    --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
                    --key=/etc/kubernetes/pki/etcd/healthcheck-client.key
                  ETCDCTL_API=3 etcdctl snapshot status "$BACKUP_FILE" --write-out=table
                  # Keep only the 10 most recent backups
                  ls -t /backup/*.db | tail -n +11 | xargs rm -f
                  echo "Backup complete: $BACKUP_FILE"
              volumeMounts:
                - name: etcd-certs
                  mountPath: /etc/kubernetes/pki/etcd
                  readOnly: true
                - name: backup-dir
                  mountPath: /backup
          volumes:
            - name: etcd-certs
              hostPath:
                path: /etc/kubernetes/pki/etcd
            - name: backup-dir
              hostPath:
                path: /var/etcd-backups
                type: DirectoryOrCreate   # create the host directory on first run

Storing Backups in S3

Local disk backups are insufficient — if the control plane node is lost, the backups go with it. Upload every snapshot to S3 (or GCS/Azure Blob) immediately after creation.

#!/bin/bash
# etcd-backup-s3.sh — run as CronJob or systemd timer

set -euo pipefail

BACKUP_FILE="/tmp/etcd-$(date +%Y%m%d-%H%M%S).db"
S3_BUCKET="s3://my-company-etcd-backups"
CLUSTER_NAME="${CLUSTER_NAME:-production}"

# Take snapshot
ETCDCTL_API=3 etcdctl snapshot save "$BACKUP_FILE" \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
  --key=/etc/kubernetes/pki/etcd/healthcheck-client.key

# Verify
ETCDCTL_API=3 etcdctl snapshot status "$BACKUP_FILE"

# Upload to S3
aws s3 cp "$BACKUP_FILE" \
  "${S3_BUCKET}/${CLUSTER_NAME}/$(basename "$BACKUP_FILE")" \
  --sse aws:kms \
  --sse-kms-key-id alias/etcd-backups

# Delete local temp file
rm -f "$BACKUP_FILE"

echo "Backup uploaded to ${S3_BUCKET}/${CLUSTER_NAME}/"

# Delete backups older than 30 days (GNU date and grep -P assumed)
cutoff=$(date -d '30 days ago' +%Y%m%d)
aws s3 ls "${S3_BUCKET}/${CLUSTER_NAME}/" \
  | awk '{print $4}' \
  | while read -r key; do
    # Extract the YYYYMMDD stamp from the file name; skip keys without one
    stamp=$(echo "$key" | grep -oP '\d{8}' | head -n1 || true)
    [[ -n "$stamp" ]] || continue
    if [[ "$stamp" < "$cutoff" ]]; then
      aws s3 rm "${S3_BUCKET}/${CLUSTER_NAME}/$key"
    fi
  done

Encryption at rest: Always encrypt etcd backups using S3 server-side encryption with KMS. etcd snapshots contain all Kubernetes Secrets in plaintext (unless you have encryption at rest enabled on the API server). A leaked snapshot reveals every secret in your cluster.

Restoring etcd from a Snapshot

Restoring etcd is a disruptive operation that requires stopping the API server and all control plane components. For a single control plane node cluster, the procedure is straightforward. For HA clusters with multiple etcd members, you must restore all members from the same snapshot simultaneously.

# Step 1: Download the snapshot from S3
aws s3 cp s3://my-company-etcd-backups/production/etcd-20260616-060000.db /tmp/restore.db

# Step 2: Move the API server static pod manifest out of the manifests directory
# (this stops the API server without systemctl)
sudo mv /etc/kubernetes/manifests/kube-apiserver.yaml /tmp/
sudo mv /etc/kubernetes/manifests/kube-controller-manager.yaml /tmp/
sudo mv /etc/kubernetes/manifests/kube-scheduler.yaml /tmp/

# Wait for containers to stop
sleep 30

# Step 3: Stop etcd (by removing its manifest too)
sudo mv /etc/kubernetes/manifests/etcd.yaml /tmp/
sleep 10

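# Step 4: Move the old etcd data directory aside (keep it until the restore is verified)
sudo mv /var/lib/etcd /var/lib/etcd.bak

# Step 5: Restore the snapshot
# --name and the peer URLs below are examples; they must match the values in etcd.yaml.
# On etcd 3.5+ this subcommand is deprecated in favor of: etcdutl snapshot restore
sudo ETCDCTL_API=3 etcdctl snapshot restore /tmp/restore.db \
  --data-dir=/var/lib/etcd \
  --name=master-01 \
  --initial-cluster="master-01=https://10.0.0.1:2380" \
  --initial-cluster-token=etcd-cluster-1 \
  --initial-advertise-peer-urls=https://10.0.0.1:2380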

# Step 6: Restore the manifests — Kubernetes will restart all control plane pods
sudo mv /tmp/etcd.yaml /etc/kubernetes/manifests/
sleep 15
sudo mv /tmp/kube-apiserver.yaml /etc/kubernetes/manifests/
sudo mv /tmp/kube-controller-manager.yaml /etc/kubernetes/manifests/
sudo mv /tmp/kube-scheduler.yaml /etc/kubernetes/manifests/

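# Step 7: Wait for the API server to answer, then for the nodes to report Ready
until kubectl get --raw='/readyz' >/dev/null 2>&1; do sleep 5; done
kubectl wait --for=condition=Ready node --all --timeout=300s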
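
For an HA cluster, the same restore runs once per control plane node. Every member uses the same snapshot, the same --initial-cluster list, and the same token; only --name and the advertised peer URL differ. The sketch below prints the command for each member rather than running it, and the three member names and IP addresses are hypothetical:

```shell
#!/bin/sh
# Hypothetical three-member control plane; replace names and IPs with your own.
MEMBERS="master-01=https://10.0.0.1:2380,master-02=https://10.0.0.2:2380,master-03=https://10.0.0.3:2380"

# Print the restore command for one member (run it on that member's node).
restore_cmd() {
  name="$1"
  peer_url="$2"
  printf '%s\n' "etcdctl snapshot restore /tmp/restore.db --data-dir=/var/lib/etcd --name=$name --initial-cluster=$MEMBERS --initial-cluster-token=etcd-cluster-1 --initial-advertise-peer-urls=$peer_url"
}

# Split the member list into name/URL pairs and print one command per member.
echo "$MEMBERS" | tr ',' '\n' | while IFS='=' read -r name peer_url; do
  restore_cmd "$name" "$peer_url"
done
```

Stop etcd and the API server on all nodes first, run the matching command on each node, and only then move the manifests back, so that no member starts with stale data.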

Application-Level Backup with Velero

etcd snapshots back up all cluster state but do not back up persistent volume data. Velero is an open-source tool that backs up Kubernetes resources together with PersistentVolume data (through volume snapshots or file-system backup), making it complementary to etcd snapshots rather than a replacement.

# Install Velero with S3 backend
velero install \
  --provider aws \
  --plugins velero/velero-plugin-for-aws:v1.8.0 \
  --bucket my-velero-backups \
  --backup-location-config region=us-east-1 \
  --snapshot-location-config region=us-east-1 \
  --secret-file ./credentials-velero

# Create a scheduled backup of the production namespace
velero schedule create production-daily \
  --schedule="0 2 * * *" \
  --include-namespaces production \
  --ttl 720h    # 30 days retention

# Restore from a specific backup
velero restore create --from-backup production-daily-20260615020000

Testing Your Backup and Restore

An untested backup is not a backup — it is hope. Run a full restore drill at least quarterly in a non-production environment. Document the exact commands, time taken, and any surprises encountered. The worst time to discover your restore procedure is broken is during an actual disaster.

  • Spin up a temporary cluster (e.g., with kind or a cloud VM)
  • Restore your production etcd snapshot to it
  • Verify that key resources are present with kubectl get namespaces,deployments,secrets,configmaps -A (kubectl get all does not list namespaces, secrets, or configmaps)
  • Test that your most critical applications can be started from the restored state
  • Measure the total recovery time from snapshot download to cluster ready and compare it against your Recovery Time Objective (RTO)
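
To make the timing measurement repeatable, each drill step can be wrapped in a small timer. This is only a sketch: timed_step is a hypothetical helper, and the sleep commands stand in for the real download and restore commands.

```shell
#!/bin/sh
# Run a named drill step, report how long it took, and keep its exit status.
timed_step() {
  label="$1"
  shift
  start=$(date +%s)
  "$@"
  status=$?
  end=$(date +%s)
  echo "$label: $(( end - start ))s (exit $status)"
  return "$status"
}

drill_start=$(date +%s)
timed_step "download snapshot" sleep 1   # stand-in for: aws s3 cp ...
timed_step "restore etcd" sleep 1        # stand-in for: etcdctl snapshot restore ...
echo "total recovery time: $(( $(date +%s) - drill_start ))s"
```

Recording these numbers for every drill shows whether the recovery time is drifting as the cluster grows.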

Monitoring etcd Health

Proactive monitoring catches etcd degradation before it becomes a disaster. Key metrics and alerts to configure:

# Prometheus alerting rules for etcd
groups:
  - name: etcd
    rules:
      - alert: EtcdMemberDown
        expr: up{job="etcd"} == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "etcd member {{ $labels.instance }} is down"

      - alert: EtcdHighCommitDuration
        expr: histogram_quantile(0.99, rate(etcd_disk_backend_commit_duration_seconds_bucket[5m])) > 0.25
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "etcd commit p99 latency is {{ $value }}s — disk may be slow"

      - alert: EtcdDatabaseSizeHigh
        expr: (etcd_mvcc_db_total_size_in_bytes / etcd_server_quota_backend_bytes) > 0.8
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "etcd database is using {{ $value | humanizePercentage }} of its backend quota (default 2 GiB, suggested maximum 8 GiB)"

      - alert: EtcdNoRecentBackup
        # etcd_backup_last_success_timestamp is not exported by etcd; the backup job must publish it
        expr: time() - etcd_backup_last_success_timestamp > 86400
        for: 1m
        labels:
          severity: critical
        annotations:
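          summary: "No successful etcd backup in the last 24 hours"

Compaction: etcd keeps historical versions of keys until they are compacted. In Kubernetes, the kube-apiserver already requests a compaction every 5 minutes by default (--etcd-compaction-interval). You can additionally set --auto-compaction-retention=8 (hours, in the default periodic mode) in the etcd pod arguments. Compaction only frees space inside the database file for reuse; run etcdctl defrag to return that space to the filesystem.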
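
The EtcdNoRecentBackup rule above depends on a metric that etcd itself does not export, so the backup job has to publish it. One possible approach, sketched here as an assumption rather than a prescribed setup, is to write the timestamp to a file read by node_exporter's textfile collector. The directory path and the helper name are hypothetical and must match your own node_exporter configuration:

```shell
#!/bin/sh
# Publish the time of the last successful backup for Prometheus to scrape.
# The default directory is an assumption; it must match the
# --collector.textfile.directory flag of your node_exporter.
write_backup_metric() {
  dir="${1:-/var/lib/node_exporter/textfile_collector}"
  tmp="$dir/etcd_backup.prom.$$"
  {
    echo '# HELP etcd_backup_last_success_timestamp Unix time of the last successful etcd backup.'
    echo '# TYPE etcd_backup_last_success_timestamp gauge'
    echo "etcd_backup_last_success_timestamp $(date +%s)"
  } > "$tmp"
  # A rename within one filesystem is atomic, so a scrape never sees a partial file.
  mv "$tmp" "$dir/etcd_backup.prom"
}
```

Call write_backup_metric as the last step of the backup script, after the upload has succeeded, so the timestamp only advances for backups that actually reached remote storage.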