Kubernetes Admission Controllers: Webhooks and Policies

Kubernetes Admission Controllers are plugins that intercept requests to the API server after authentication and authorization but before the object is persisted to etcd. They are the last line of defense for enforcing cluster-wide policies — ensuring every pod has resource limits, every image comes from an approved registry, every namespace has required labels, and no workload runs as root. Understanding admission controllers is essential for platform teams building guardrails that developers cannot accidentally bypass.

Admission Request Flow
Built-in Admission Controllers
Writing a Mutating Admission Webhook
Writing a Validating Admission Webhook
Policy Enforcement with OPA Gatekeeper
Policy Enforcement with Kyverno
Webhook TLS and cert-manager
Production Considerations

Admission Request Flow

When a client sends a request to the Kubernetes API server (e.g., kubectl apply), it flows through three phases before the object is written to etcd:

Authentication — verifies who is making the request (certificates, tokens, OIDC)
Authorization — checks if the identity is allowed to perform the requested action (RBAC)
Admission — runs the request through all registered admission controllers in order

The admission phase itself has two sub-phases. Mutating webhooks run first, one after another, and can modify the object (e.g., inject a sidecar, add default resource limits). Once the mutated object has passed schema validation, validating webhooks run in parallel and can only accept or reject; they cannot modify the object. If any validating webhook rejects the request, the entire operation fails with the webhook's error message returned to the client.

Failure policy: Every webhook has a failurePolicy of either Fail (reject the request if the webhook cannot be reached, times out, or returns an error) or Ignore (allow the request in those cases). Fail is the default in admissionregistration.k8s.io/v1. In production, use Fail for security-critical webhooks and ensure they are highly available.
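
The ordering and failure-policy rules above can be modelled in a few lines of Python. This is a toy simulation under stated assumptions, not API server code: the Webhook class, the WebhookUnreachable exception, the admit function, and the three sample handlers are hypothetical names chosen for illustration, and the validating loop is sequential here only for simplicity.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


class WebhookUnreachable(Exception):
    """Raised by a handler to simulate a timeout or connection error."""


@dataclass
class Webhook:
    name: str
    # A handler returns (allowed, mutated_object_or_None, message).
    handler: Callable[[dict], Tuple[bool, Optional[dict], str]]
    failure_policy: str = "Fail"  # "Fail" or "Ignore"


def call(webhook, obj):
    """Call one webhook, translating outages according to its failure policy."""
    try:
        return webhook.handler(obj)
    except WebhookUnreachable:
        if webhook.failure_policy == "Ignore":
            return True, None, ""
        return False, None, f"webhook {webhook.name} unreachable"


def admit(obj, mutating, validating):
    """Return (allowed, final_object, message) for one request."""
    # Phase 1: mutating webhooks run in order and may replace the object.
    for wh in mutating:
        allowed, mutated, msg = call(wh, obj)
        if not allowed:
            return False, obj, msg
        if mutated is not None:
            obj = mutated
    # Phase 2: validating webhooks see the final object and can only veto it.
    for wh in validating:
        allowed, _, msg = call(wh, obj)
        if not allowed:
            return False, obj, msg
    return True, obj, ""


# Hypothetical sample handlers used to exercise the model.
def add_label(obj):
    labels = {**obj.get("labels", {}), "team": "platform"}
    return True, {**obj, "labels": labels}, ""


def require_label(obj):
    ok = "team" in obj.get("labels", {})
    return ok, None, ("" if ok else "missing team label")


def down(obj):
    raise WebhookUnreachable()
```

Calling admit({"name": "web"}, [Webhook("label", add_label)], [Webhook("check", require_label)]) admits the request because the mutating step adds the label before validation runs. Swapping in Webhook("flaky", down, "Fail") as the mutating webhook rejects it, while the same handler with "Ignore" lets it through.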

Built-in Admission Controllers

Kubernetes ships with over 30 built-in admission controllers, a subset of which is enabled by default; managed clusters usually enable a similar set. The most important ones for security and operations:

LimitRanger — applies default resource requests and limits from a LimitRange object to pods that don't specify them.
ResourceQuota — enforces namespace-level resource consumption limits from ResourceQuota objects.
PodSecurity — enforces Pod Security Standards (restricted, baseline, privileged) per namespace via labels. Replaced PodSecurityPolicy (deprecated in 1.21, removed in 1.25).
ServiceAccount — auto-creates and mounts ServiceAccount tokens for pods.
DefaultStorageClass — automatically sets a default StorageClass on PVCs that don't request one.
MutatingAdmissionWebhook / ValidatingAdmissionWebhook — the extension points that allow custom webhooks.

# Apply Pod Security Standards via namespace labels
kubectl label namespace production \
  pod-security.kubernetes.io/enforce=restricted \
  pod-security.kubernetes.io/enforce-version=latest \
  pod-security.kubernetes.io/warn=restricted \
  pod-security.kubernetes.io/audit=restricted

Writing a Mutating Admission Webhook

A mutating webhook is an HTTPS server that receives an AdmissionReview JSON object and returns an AdmissionReview whose response carries a patch field: a base64-encoded JSON Patch (RFC 6902) array to apply to the object. The most common use case is injecting sidecar containers or adding default annotations.

# MutatingWebhookConfiguration resource
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: sidecar-injector
webhooks:
  - name: inject-sidecar.example.com
    admissionReviewVersions: ["v1"]
    clientConfig:
      service:
        name: sidecar-injector-svc
        namespace: webhook-system
        path: /inject
      caBundle: BASE64_CA_CERT
    rules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE"]
        resources: ["pods"]
    namespaceSelector:
      matchLabels:
        sidecar-injection: enabled
    failurePolicy: Fail
    sideEffects: None
    timeoutSeconds: 5

# Minimal webhook server (Python/Flask example)
from flask import Flask, request, jsonify
import base64, json

app = Flask(__name__)

@app.route('/inject', methods=['POST'])
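def inject():
    review = request.get_json()
    uid = review['request']['uid']
    pod = review['request']['object']

    response = {"uid": uid, "allowed": True}

    # Only patch pods that do not already carry the sidecar, so a repeated
    # invocation cannot add a second copy
    names = [c['name'] for c in pod['spec'].get('containers', [])]
    if 'log-shipper' not in names:
        # JSON patch that appends a logging sidecar to the container list
        patch = [
            {
                "op": "add",
                "path": "/spec/containers/-",
                "value": {
                    "name": "log-shipper",
                    "image": "fluent/fluent-bit:3.0",
                    "resources": {
                        "limits": {"cpu": "50m", "memory": "64Mi"}
                    }
                }
            }
        ]
        # The API server expects the patch as a base64-encoded string
        response["patchType"] = "JSONPatch"
        response["patch"] = base64.b64encode(json.dumps(patch).encode()).decode()

    return jsonify({
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response
    })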
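
To check what a mutating response would do without a cluster, the patch can be decoded and applied to a sample pod locally. The sketch below uses only the standard library. decode_patch and apply_add_ops are hypothetical helpers, and apply_add_ops handles only "add" operations, so it is a preview tool rather than a full RFC 6902 implementation.

```python
import base64
import copy
import json


def decode_patch(admission_review):
    """Return the JSON Patch list carried in an AdmissionReview response."""
    response = admission_review["response"]
    if "patch" not in response:
        return []
    return json.loads(base64.b64decode(response["patch"]))


def apply_add_ops(obj, patch):
    """Apply 'add' operations to a deep copy of obj and return the copy."""
    result = copy.deepcopy(obj)
    for op in patch:
        if op["op"] != "add":
            raise ValueError(f"unsupported op: {op['op']}")
        # Split the JSON Pointer and unescape tokens: ~1 is '/', ~0 is '~'.
        tokens = [t.replace("~1", "/").replace("~0", "~")
                  for t in op["path"].split("/")[1:]]
        # Walk down to the parent of the target location.
        parent = result
        for token in tokens[:-1]:
            parent = parent[int(token)] if isinstance(parent, list) else parent[token]
        last = tokens[-1]
        if isinstance(parent, list):
            if last == "-":
                parent.append(op["value"])  # '-' means append to the array
            else:
                parent.insert(int(last), op["value"])
        else:
            parent[last] = op["value"]
    return result
```

Feeding the response from the handler above into decode_patch and then apply_add_ops with a one-container pod should yield a pod whose container list ends with log-shipper, which makes the injection easy to assert in a unit test.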

Writing a Validating Admission Webhook

A validating webhook returns an allowed: true/false response with an optional status message. A common use case is enforcing that all deployments have resource limits set.

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: require-resource-limits
webhooks:
  - name: resource-limits.example.com
    admissionReviewVersions: ["v1"]
    clientConfig:
      service:
        name: policy-webhook-svc
        namespace: webhook-system
        path: /validate
      caBundle: BASE64_CA_CERT
    rules:
      - apiGroups: ["apps"]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["deployments"]
    failurePolicy: Fail
    sideEffects: None

@app.route('/validate', methods=['POST'])
def validate():
    review = request.get_json()
    uid = review['request']['uid']
    deployment = review['request']['object']

    containers = deployment['spec']['template']['spec']['containers']
    violations = []

    for c in containers:
        # Treat a missing or null resources/limits block as "no limits"
        limits = (c.get('resources') or {}).get('limits') or {}
        if not limits.get('cpu') or not limits.get('memory'):
            violations.append(f"Container '{c['name']}' missing resource limits")

    if violations:
        return jsonify({
            "apiVersion": "admission.k8s.io/v1",
            "kind": "AdmissionReview",
            "response": {
                "uid": uid,
                "allowed": False,
                "status": {
                    "code": 403,
                    "message": "; ".join(violations)
                }
            }
        })

    return jsonify({"apiVersion": "admission.k8s.io/v1", "kind": "AdmissionReview",
                    "response": {"uid": uid, "allowed": True}})

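Keeping the check itself in a plain function makes it testable without Flask or a cluster. The sketch below is one possible refactoring, not part of the handler above: find_limit_violations is a hypothetical name, and unlike the handler it also inspects initContainers, which is an added assumption about what the policy should cover.

```python
def find_limit_violations(deployment):
    """Return one message per container that lacks a CPU or memory limit."""
    pod_spec = deployment.get("spec", {}).get("template", {}).get("spec", {})
    violations = []
    for field in ("initContainers", "containers"):
        for c in pod_spec.get(field) or []:
            # A missing or null resources/limits block counts as no limits.
            limits = (c.get("resources") or {}).get("limits") or {}
            missing = [r for r in ("cpu", "memory") if not limits.get(r)]
            if missing:
                violations.append(
                    f"{field} '{c.get('name', '?')}' missing limits: "
                    f"{', '.join(missing)}"
                )
    return violations
```

The Flask handler would then only translate the returned list into an AdmissionReview response, allowing the request when the list is empty.
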
Policy Enforcement with OPA Gatekeeper

Building and maintaining custom webhooks is time-consuming. OPA Gatekeeper provides a framework that lets you write policies in Rego (OPA's policy language) as Kubernetes CRDs, without writing any webhook server code.

# Install Gatekeeper
helm repo add gatekeeper https://open-policy-agent.github.io/gatekeeper/charts
helm upgrade --install gatekeeper gatekeeper/gatekeeper \
  --namespace gatekeeper-system \
  --create-namespace

# ConstraintTemplate: define the policy logic in Rego
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: requireresourcelimits
spec:
  crd:
    spec:
      names:
        kind: RequireResourceLimits
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package requireresourcelimits
        violation[{"msg": msg}] {
          container := input.review.object.spec.containers[_]
          not container.resources.limits.cpu
          msg := sprintf("Container '%v' must set resources.limits.cpu", [container.name])
        }
        violation[{"msg": msg}] {
          container := input.review.object.spec.containers[_]
          not container.resources.limits.memory
          msg := sprintf("Container '%v' must set resources.limits.memory", [container.name])
        }

---
# Constraint: apply the template to all pods in production
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: RequireResourceLimits
metadata:
  name: require-limits-production
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
    namespaces: ["production"]

Policy Enforcement with Kyverno

Kyverno is a Kubernetes-native policy engine that uses YAML-based policies instead of Rego. Many teams find Kyverno easier to adopt because policies look like Kubernetes resource patterns they already know.

helm repo add kyverno https://kyverno.github.io/kyverno/
helm upgrade --install kyverno kyverno/kyverno \
  --namespace kyverno \
  --create-namespace

# Kyverno policy: require resource limits on all pods
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-resource-limits
spec:
  validationFailureAction: Enforce
  rules:
    - name: check-container-limits
      match:
        any:
          - resources:
              kinds: ["Pod"]
      validate:
        message: "All containers must specify CPU and memory limits."
        pattern:
          spec:
            containers:
              - resources:
                  limits:
                    cpu: "?*"
                    memory: "?*"

---
# Kyverno mutating policy: add default labels if missing
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-default-labels
spec:
  rules:
    - name: add-managed-by-label
      match:
        any:
          - resources:
              kinds: ["Deployment"]
      mutate:
        patchStrategicMerge:
          metadata:
            labels:
              +(managed-by): platform-team

Kyverno vs Gatekeeper: Kyverno policies are easier to write for Kubernetes operators already comfortable with YAML patterns. Gatekeeper with Rego is more powerful for complex logic and is the better choice if you already use OPA elsewhere in your stack (e.g., for API authorization).
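
The two Kyverno idioms used in the policies above, the "?*" wildcard and the +(key) anchor, are easy to misread. The following Python sketch approximates their semantics so they can be tried on sample values. Both function names are hypothetical, and fnmatch is only a stand-in for Kyverno's own matcher (it also gives meaning to square brackets, which is assumed not to matter for simple patterns like "?*").

```python
from fnmatch import fnmatchcase


def matches_kyverno_wildcard(value, pattern):
    """Approximate Kyverno wildcards: '?' is one character, '*' is any run."""
    return isinstance(value, str) and fnmatchcase(value, pattern)


def add_if_absent(labels, defaults):
    """Mimic the +(key) anchor: set a key only when the object lacks it."""
    merged = dict(labels or {})
    for key, value in defaults.items():
        merged.setdefault(key, value)
    return merged
```

Under this reading, "?*" accepts any non-empty string such as "500m" and rejects an empty or missing value, and the mutate rule leaves an existing managed-by label untouched instead of overwriting it.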

Webhook TLS and cert-manager

All admission webhooks must serve HTTPS. The API server validates the webhook's certificate against the caBundle specified in the webhook configuration. Managing this TLS lifecycle manually is error-prone — use cert-manager to automate it.

# cert-manager Certificate for the webhook server
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: webhook-tls
  namespace: webhook-system
spec:
  secretName: webhook-tls-secret
  dnsNames:
    - policy-webhook-svc.webhook-system.svc
    - policy-webhook-svc.webhook-system.svc.cluster.local
  issuerRef:
    name: cluster-ca-issuer
    kind: ClusterIssuer

# Inject the CA bundle automatically using cert-manager annotation
# Add this annotation to the WebhookConfiguration:
# cert-manager.io/inject-ca-from: webhook-system/webhook-tls
# cert-manager will patch the caBundle field automatically when the cert is issued
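
On the server side, the webhook process has to load the key pair that cert-manager writes into the Secret. The sketch below shows one way to do that with Python's standard ssl module. The mount path /tls, the port 8443, and both function names are assumptions for illustration; tls.crt and tls.key are the keys cert-manager stores in the Secret.

```python
import ssl


def webhook_dns_names(service, namespace, cluster_domain="cluster.local"):
    """DNS names the serving certificate should cover for an in-cluster Service."""
    base = f"{service}.{namespace}.svc"
    return [base, f"{base}.{cluster_domain}"]


def build_tls_context(cert_file="/tls/tls.crt", key_file="/tls/tls.key"):
    """Create a server-side TLS context from the mounted Secret (assumed paths)."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context


# With a Flask app object named `app`, serving could look like this
# (the Service would then map its port 443 to container port 8443):
#   app.run(host="0.0.0.0", port=8443, ssl_context=build_tls_context())
```

The API server calls the webhook at the Service's DNS name, which is why the Certificate lists the .svc names that webhook_dns_names reproduces. cert-manager renews the Secret in place, so a long-running process would also need to reload the files or restart after renewal.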

Production Considerations

High availability — run at least 2 webhook replicas with a PodDisruptionBudget. With failurePolicy: Fail, a single-replica webhook that crashes blocks every request it is registered for, such as all pod creation in the affected namespaces.
Timeout — set timeoutSeconds: 5. The API server default is 10 seconds but short timeouts prevent slow webhooks from causing API latency cascades.
Exclude system namespaces — always add a namespaceSelector that excludes kube-system, kube-public, and the namespace the webhook itself runs in. A webhook failure that blocks system pod creation can take down the entire cluster, and a webhook that gates its own pods cannot recover once they are all down.
Dry-run mode — test new policies in a non-blocking mode (Gatekeeper's enforcementAction: dryrun or warn, Kyverno's Audit) before switching to enforce. This shows violations without blocking workloads.
Monitor webhook latency — expose Prometheus metrics from your webhook server and alert if p99 latency exceeds 3 seconds. Also watch the API server's own apiserver_admission_webhook_admission_duration_seconds metric, which measures each webhook call from the API server's side.
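
The namespace exclusion above is usually written as a matchExpressions selector on the kubernetes.io/metadata.name label, which Kubernetes sets on every namespace. The Python sketch below expresses such a selector as a dict and evaluates it against sample namespace labels, so the effect can be checked before the webhook configuration is applied. EXCLUDE_SYSTEM and selector_matches are hypothetical names, and the evaluator covers only the four standard operators.

```python
# Selector as it would appear under namespaceSelector in a webhook configuration.
EXCLUDE_SYSTEM = {
    "matchExpressions": [{
        "key": "kubernetes.io/metadata.name",
        "operator": "NotIn",
        "values": ["kube-system", "kube-public"],
    }]
}


def selector_matches(selector, labels):
    """Evaluate a label selector (matchLabels plus matchExpressions)."""
    for key, value in selector.get("matchLabels", {}).items():
        if labels.get(key) != value:
            return False
    for expr in selector.get("matchExpressions", []):
        key, op = expr["key"], expr["operator"]
        values = expr.get("values", [])
        if op == "In" and labels.get(key) not in values:
            return False
        if op == "NotIn" and labels.get(key) in values:
            return False
        if op == "Exists" and key not in labels:
            return False
        if op == "DoesNotExist" and key in labels:
            return False
    return True
```

With this selector, a namespace labelled kubernetes.io/metadata.name=kube-system is skipped by the webhook, while production is still covered.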