Kubernetes Audit Logging: Security Compliance Guide

Kubernetes audit logging records every API server request — who made it, what resource was affected, and what the outcome was. This audit trail is essential for security incident investigation, compliance frameworks like SOC2, PCI-DSS, and HIPAA, and detecting insider threats or compromised service accounts. Without audit logging, answering "who deleted that namespace?" or "which pod read that secret?" is impossible. This guide covers configuring a comprehensive audit policy, shipping logs to a SIEM, and writing detection rules for common attack patterns.

Audit Levels and Event Stages
Writing an Audit Policy
Enabling Audit Logging on the API Server
Webhook Backend for Real-Time Shipping
Querying and Analysing Audit Logs
Threat Detection Rules
Compliance Requirements (SOC2, PCI-DSS)
Audit Logging on Managed Clusters

Audit Levels and Event Stages

Kubernetes audit events are generated at four possible levels of detail. The level determines how much information is captured for each event:

None — no events are recorded for this rule
Metadata — records request metadata (user, verb, resource, namespace) but not the request or response body. Low volume, always safe to enable.
Request — records metadata plus the request body. Useful for capturing what was created or updated.
RequestResponse — records metadata, request body, and response body. Highest verbosity; use sparingly on large resources.

Events are also associated with stages in the request lifecycle:

RequestReceived — event generated as soon as the API server receives the request
ResponseStarted — for streaming requests (watch), when the response headers are sent
ResponseComplete — when the response body has been sent completely
Panic — when the API server panics handling a request

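Practical choice: For most compliance use cases, audit at Metadata level for most resources and at Request level or above for sensitive mutations (RBAC changes, workload changes). Keep secrets at Metadata level: Request and RequestResponse copy secret payloads into the audit log, which turns the log itself into a sensitive data store. RequestResponse is extremely verbose, so reserve it for a small set of high-value resources or for targeted forensic investigations.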
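
Each audit event records its own level and stage, so a quick tally of an existing log shows where the volume actually comes from before you tune the policy. The following is a minimal Python sketch, assuming JSON-lines input of the kind the file backend writes; the sample events are invented for illustration.

```python
import json
from collections import Counter

def tally_levels_and_stages(lines):
    """Count audit events per (level, stage) pair from JSON-lines input."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        # "level" and "stage" are top-level fields of an audit.k8s.io/v1 Event.
        counts[(event.get("level", "unknown"), event.get("stage", "unknown"))] += 1
    return counts

# Hypothetical sample; in practice pass an open file handle for audit.log.
sample = [
    '{"level": "Metadata", "stage": "ResponseComplete", "verb": "get"}',
    '{"level": "Metadata", "stage": "ResponseComplete", "verb": "list"}',
    '{"level": "Request", "stage": "ResponseComplete", "verb": "create"}',
    '{"level": "Metadata", "stage": "ResponseStarted", "verb": "watch"}',
]

for (level, stage), n in sorted(tally_levels_and_stages(sample).items()):
    print(f"{level:16} {stage:18} {n}")
```

If one level or stage dominates the tally, that is usually the first place to look when trimming log volume.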

Writing an Audit Policy

The audit policy file is a YAML document that defines ordered rules. The first matching rule wins. Design your policy from most specific to most general.

# /etc/kubernetes/audit-policy.yaml
apiVersion: audit.k8s.io/v1
kind: Policy
# Omit RequestReceived stage events (reduces volume by ~50%)
omitStages:
  - "RequestReceived"

rules:
  # Never audit health check endpoints — creates enormous noise
  - level: None
    nonResourceURLs:
      - "/healthz*"
      - "/livez*"
      - "/readyz*"
      - "/version"
      - "/metrics"

  # Skip list and watch requests from core controllers: very high volume
  - level: None
    users:
      - "system:kube-scheduler"
      - "system:kube-controller-manager"
    verbs:
      - "watch"
      - "list"

  # Log secret access at Metadata level only: who touched which secret, and when.
  # Request and RequestResponse would write secret payloads into the audit log.
  # The rule is explicit so that a broader rule added later cannot raise the level.
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets"]

  # Log exec, port-forward, attach — critical for insider threat detection
  - level: RequestResponse
    resources:
      - group: ""
        resources: ["pods/exec", "pods/portforward", "pods/attach"]

  # Log RBAC changes at RequestResponse
  - level: RequestResponse
    resources:
      - group: "rbac.authorization.k8s.io"
        resources: ["clusterroles", "clusterrolebindings", "roles", "rolebindings"]
    verbs:
      - "create"
      - "update"
      - "patch"
      - "delete"

  # Log namespace-level resource mutations at Request level
  - level: Request
    resources:
      - group: ""
        resources: ["pods", "services", "configmaps", "persistentvolumeclaims"]
      - group: "apps"
        resources: ["deployments", "statefulsets", "daemonsets"]
    verbs:
      - "create"
      - "update"
      - "patch"
      - "delete"

  # Catch-all: Metadata level for everything else
  - level: Metadata
    omitStages:
      - "RequestReceived"
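
Because the first matching rule wins, rule order matters as much as rule content. The sketch below is a deliberately simplified Python model of that behaviour, not the API server's actual matcher: the Rule class, the matches helper, and the POLICY list are hypothetical and cover only users, verbs, and (group, resource) pairs from a subset of the policy above.

```python
from dataclasses import dataclass, field

@dataclass
class Rule:
    level: str
    users: list = field(default_factory=list)      # empty list matches any user
    verbs: list = field(default_factory=list)      # empty list matches any verb
    resources: list = field(default_factory=list)  # (group, resource) pairs; empty matches any

def matches(rule, user, verb, group, resource):
    """A rule matches only if every non-empty selector matches the request."""
    if rule.users and user not in rule.users:
        return False
    if rule.verbs and verb not in rule.verbs:
        return False
    if rule.resources and (group, resource) not in rule.resources:
        return False
    return True

def audit_level(rules, user, verb, group, resource):
    """Return the level of the first matching rule."""
    for rule in rules:
        if matches(rule, user, verb, group, resource):
            return rule.level
    return "None"  # assumption: a request that matches no rule is not logged

MUTATIONS = ["create", "update", "patch", "delete"]
POLICY = [
    Rule("None", users=["system:kube-scheduler", "system:kube-controller-manager"],
         verbs=["watch", "list"]),
    Rule("RequestResponse", resources=[("", "pods/exec"), ("", "pods/portforward"),
                                       ("", "pods/attach")]),
    Rule("Request", resources=[("", "pods"), ("apps", "deployments")], verbs=MUTATIONS),
    Rule("Metadata"),  # catch-all
]

print(audit_level(POLICY, "system:kube-scheduler", "watch", "", "pods"))  # None
print(audit_level(POLICY, "alice", "delete", "apps", "deployments"))      # Request
print(audit_level(POLICY, "alice", "get", "", "pods"))                    # Metadata

# Moving the catch-all to the top shadows every more specific rule.
reordered = [POLICY[-1]] + POLICY[:-1]
print(audit_level(reordered, "alice", "delete", "apps", "deployments"))   # Metadata
```

The last call shows why the catch-all belongs at the bottom: placed first, it downgrades every request to Metadata regardless of the rules that follow.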

Enabling Audit Logging on the API Server

Audit logging is configured via API server flags. For kubeadm clusters, edit the static pod manifest at /etc/kubernetes/manifests/kube-apiserver.yaml.

# /etc/kubernetes/manifests/kube-apiserver.yaml — relevant section
spec:
  containers:
  - command:
    - kube-apiserver
    # ... existing flags ...
    - --audit-policy-file=/etc/kubernetes/audit-policy.yaml
    - --audit-log-path=/var/log/kubernetes/audit/audit.log
    - --audit-log-maxage=30          # days to retain log files
    - --audit-log-maxbackup=10       # max number of rotated files
    - --audit-log-maxsize=100        # max size in MB before rotation
    - --audit-log-compress=true      # gzip rotated files
    volumeMounts:
    - mountPath: /etc/kubernetes/audit-policy.yaml
      name: audit-policy
      readOnly: true
    - mountPath: /var/log/kubernetes/audit/
      name: audit-log
  volumes:
  - name: audit-policy
    hostPath:
      path: /etc/kubernetes/audit-policy.yaml
      type: File
  - name: audit-log
    hostPath:
      path: /var/log/kubernetes/audit/
      type: DirectoryOrCreate

Restart behaviour: Editing a static pod manifest causes kubelet to automatically restart the API server pod. The API server will be briefly unavailable (typically 10-30 seconds). Plan this change during low-traffic periods and verify the API server restarts cleanly with kubectl get pods -n kube-system | grep apiserver.

Webhook Backend for Real-Time Shipping

File-based audit logs must be scraped by a log agent (Fluent Bit, Promtail) and shipped to your SIEM or log store. The webhook backend instead sends audit events directly from the API server to an HTTP endpoint, enabling near-real-time alerting on suspicious activity.

# audit-webhook-config.yaml
apiVersion: v1
kind: Config
clusters:
  - name: audit-webhook
    cluster:
      server: https://siem.internal.example.com/kubernetes/audit
      certificate-authority: /etc/kubernetes/audit-webhook-ca.crt
contexts:
  - name: default
    context:
      cluster: audit-webhook
      user: audit-user
current-context: default
users:
  - name: audit-user
    user:
      token: my-webhook-token

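# Add to kube-apiserver flags (the config file and CA certificate must also be mounted into the API server pod)
- --audit-webhook-config-file=/etc/kubernetes/audit-webhook-config.yaml
- --audit-webhook-batch-max-size=400
- --audit-webhook-batch-max-wait=5s
- --audit-webhook-mode=batch   # buffered; events can be dropped if the buffer overflows. 'blocking' sends each event synchronously at the cost of API latency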
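
On the receiving side, the API server POSTs batches as a JSON EventList object with the events under items. The following is a minimal Python sketch of a receiver, assuming it sits behind a TLS-terminating proxy that exposes the HTTPS URL from the config above; summarize_event_list, AuditHandler, serve, and the port are hypothetical names, and a real SIEM endpoint would also authenticate the bearer token.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def summarize_event_list(body: bytes):
    """Return (user, verb, resource, name) tuples from an audit EventList payload."""
    payload = json.loads(body)
    summaries = []
    for event in payload.get("items", []):
        ref = event.get("objectRef") or {}  # absent for non-resource URLs
        user = event.get("user") or {}
        summaries.append((user.get("username"), event.get("verb"),
                          ref.get("resource"), ref.get("name")))
    return summaries

class AuditHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        for summary in summarize_event_list(self.rfile.read(length)):
            print(summary)  # replace with forwarding to your log store
        # Any 2xx response tells the API server the batch was accepted.
        self.send_response(200)
        self.end_headers()

def serve(port=8080):
    """Start the receiver on localhost; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), AuditHandler).serve_forever()
```

Calling serve() starts the listener. Keeping the parsing in a plain function makes it easy to test against captured payloads without running a server.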

Querying and Analysing Audit Logs

Audit log entries are JSON objects. Use jq for quick analysis of local log files, or structured queries in your SIEM platform.

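# Find all secret reads in the last hour (uses GNU date; timestamps are UTC RFC 3339, so string comparison works)
jq --arg since "$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)" \
  'select(.objectRef.resource == "secrets" and .verb == "get" and .requestReceivedTimestamp >= $since)
  | {time: .requestReceivedTimestamp, user: .user.username, secret: .objectRef.name, ns: .objectRef.namespace}' \
  /var/log/kubernetes/audit/audit.log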

# Find all exec commands (insider threat hunting); the command is in the request URI query string, not the request body
jq 'select(.objectRef.subresource == "exec")
  | {time: .requestReceivedTimestamp, user: .user.username, pod: .objectRef.name, ns: .objectRef.namespace, command: .requestURI}' \
  /var/log/kubernetes/audit/audit.log

# Find failed API calls; 401 and 403 in particular suggest brute force or misconfigured service accounts
jq 'select(.responseStatus.code >= 400)
  | {time: .requestReceivedTimestamp, user: .user.username, verb: .verb, resource: .objectRef.resource, code: .responseStatus.code}' \
  /var/log/kubernetes/audit/audit.log | head -50
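
Exec requests carry the command as repeated command parameters in the request URI, so recovering the argv means decoding the query string. The following is a small Python sketch using only the standard library; exec_command is a hypothetical helper and the sample URI is invented.

```python
from urllib.parse import urlsplit, parse_qs

def exec_command(request_uri: str):
    """Return the argv list carried in an exec request URI's query string."""
    query = parse_qs(urlsplit(request_uri).query)
    # Each argv element is sent as its own "command" parameter, in order.
    return query.get("command", [])

# Hypothetical requestURI value taken from an audit event.
uri = ("/api/v1/namespaces/prod/pods/web-0/exec"
       "?command=sh&command=-c&command=cat+%2Fetc%2Fpasswd"
       "&container=web&stdin=true&stdout=true")

print(exec_command(uri))  # ['sh', '-c', 'cat /etc/passwd']
```

Joining the list with spaces gives a readable command line for alerts, while the list form keeps the argument boundaries intact.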

Threat Detection Rules

These audit log patterns should trigger immediate security alerts in your SIEM or Falco rule set:

Anonymous API access: user.username == "system:anonymous" with any verb other than get on health endpoints
ClusterRole escalation: create or update of ClusterRoleBinding granting cluster-admin
Secret mass read: single user reads more than 10 secrets within 5 minutes
Exec into system pod: exec subresource on a pod in the kube-system namespace
Namespace deletion: delete verb on resource namespaces — almost always catastrophic if accidental
ServiceAccount token creation: create on serviceaccounts/token by non-controller users
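
The secret mass read rule above is a sliding-window count, which most SIEMs express natively. The following is a minimal Python sketch of the same logic for offline analysis, assuming events are already parsed into dicts; the function names are hypothetical and the threshold and window simply mirror the rule (more than 10 distinct secrets within 5 minutes).

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=5)
THRESHOLD = 10  # alert when a user reads more than this many distinct secrets

def parse_ts(value):
    """Parse an audit timestamp such as 2024-01-01T10:00:00.000000Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))

def mass_secret_readers(events):
    """Return (user, timestamp) for each read that pushes a user over THRESHOLD."""
    recent = defaultdict(deque)  # user -> deque of (timestamp, "namespace/name")
    alerts = []
    for event in sorted(events, key=lambda e: e["requestReceivedTimestamp"]):
        ref = event.get("objectRef") or {}
        if ref.get("resource") != "secrets" or event.get("verb") != "get":
            continue
        user = event["user"]["username"]
        ts = parse_ts(event["requestReceivedTimestamp"])
        window = recent[user]
        window.append((ts, f'{ref.get("namespace")}/{ref.get("name")}'))
        # Drop reads that have fallen out of the 5-minute window.
        while window and ts - window[0][0] > WINDOW:
            window.popleft()
        if len({name for _, name in window}) > THRESHOLD:
            alerts.append((user, ts))
    return alerts
```

The function reports every read that keeps a user above the threshold, so a production rule would normally deduplicate alerts per user and window.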

# Example Falco rule for Kubernetes audit events
- rule: K8s Cluster-Admin Binding Created
  desc: A ClusterRoleBinding granting cluster-admin was created
  condition: >
    ka.verb=create and
    ka.target.resource=clusterrolebindings and
    ka.req.binding.role=cluster-admin
  output: >
    Cluster admin binding created (user=%ka.user.name binding=%ka.target.name)
  priority: CRITICAL
  source: k8s_audit

Compliance Requirements (SOC2, PCI-DSS)

Audit logging satisfies multiple control requirements across common compliance frameworks. Here is the mapping:

SOC2 CC6.1, CC6.6: Logical and physical access restrictions — audit logs prove who accessed what resources and when
PCI-DSS Req 10: Track and monitor all access to network resources and cardholder data — Kubernetes audit logs cover API-level access to secrets and config containing cardholder data
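HIPAA §164.312(b): Audit controls. Audit logs of access to pods and secrets containing PHI help satisfy this requirement
CIS Kubernetes Benchmark 3.2.1: Ensure that a minimal audit policy is created. The benchmark's API server checks also call for --audit-log-path to be set and --audit-log-maxage to be 30 days or as appropriate (control numbers vary by benchmark version)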

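Retention: Requirements differ by framework. PCI-DSS requires at least 12 months of audit history with the most recent three months immediately available; other frameworks largely leave the period to your own documented policy. A 90-day hot tier plus a one-year archive is a common baseline. Store audit logs in tamper-evident storage (S3 with Object Lock or a WORM-compliant SIEM) to prevent log tampering after a breach.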
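
The rotation flags set on the API server earlier bound how much history stays on the node, and that is usually far less than any compliance retention period. The following is a rough Python sketch of the arithmetic; the daily volume of 500 MB is an assumed figure for illustration, and local_retention_days is a hypothetical helper.

```python
def local_retention_days(max_size_mb, max_backup, max_age_days, daily_volume_mb):
    """Approximate days of audit history kept on the node by log rotation."""
    # Each rotated file holds about max_size_mb of log data, gzipped or not,
    # so the rotated set covers max_size_mb * max_backup of raw events.
    days_by_size = (max_size_mb * max_backup) / daily_volume_mb
    # Files older than max_age_days are deleted regardless of count.
    return min(max_age_days, days_by_size)

# Flags from the API server section: maxsize=100, maxbackup=10, maxage=30.
print(local_retention_days(100, 10, 30, 500))  # 2.0
```

At that assumed volume the node keeps about two days of events, so long-term retention has to come from the log store that the agent or webhook ships to.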

Audit Logging on Managed Clusters

Managed Kubernetes services provide built-in audit logging that routes to the cloud provider's native logging service.

EKS: Enable API server audit logs via the EKS console or CLI (aws eks update-cluster-config --logging). Logs go to CloudWatch Logs.
GKE: Audit logs are automatically enabled and sent to Cloud Audit Logs (Data Access logs must be explicitly enabled for secret reads).
AKS: Enable via Diagnostic settings on the cluster resource (the kube-audit or kube-audit-admin log category) and send to a Log Analytics workspace for Kusto queries.

# Enable EKS control plane audit logging
aws eks update-cluster-config \
  --name my-cluster \
  --region us-east-1 \
  --logging '{"clusterLogging":[{"types":["api","audit","authenticator","controllerManager","scheduler"],"enabled":true}]}'

# Query EKS audit logs in CloudWatch
aws logs filter-log-events \
  --log-group-name /aws/eks/my-cluster/cluster \
  --log-stream-name-prefix kube-apiserver-audit \
  --filter-pattern '{ $.objectRef.resource = "secrets" && $.verb = "get" }'