Kubernetes Headless Services and StatefulSet DNS
A headless service is a Kubernetes Service with clusterIP: None — it creates no virtual IP and does no load balancing. Instead, DNS queries for the service name return A records for each individual pod IP, letting clients discover and connect directly to specific pods. This is the mechanism that gives StatefulSet pods stable, addressable DNS names like kafka-0.kafka-headless.production.svc.cluster.local — essential for distributed systems that need peer-to-peer communication with known, stable addresses.
Headless vs Normal ClusterIP Services
A normal ClusterIP service and a headless service differ in how they are addressed, resolved, and load balanced:
| Feature | ClusterIP Service | Headless Service (clusterIP: None) |
|---|---|---|
| Virtual IP | Yes: kube-proxy programs iptables or IPVS rules for it | No: no virtual IP is assigned |
| DNS response | Single A record pointing to the ClusterIP | Multiple A records, one per ready pod IP |
| Load balancing | Yes: kube-proxy spreads connections across pods (random selection in iptables mode, round-robin by default in IPVS mode) | No: the client chooses which pod to connect to |
| Pod-specific DNS | No | Yes, for pods that set a hostname and subdomain (StatefulSet pods do): pod-name.service-name.namespace.svc.cluster.local |
| Use case | Stateless services, any HTTP API | StatefulSets, databases, Kafka, Cassandra, Elasticsearch |
For stateless microservices, a regular ClusterIP service is almost always the right choice because kube-proxy balances connections transparently and clients need no discovery logic. For stateful distributed systems where individual pod identity matters (Kafka broker IDs, Cassandra node addresses, Elasticsearch master nodes), headless services are essential.
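The DNS difference described above can be observed from the client side. The following is a minimal sketch, not part of any Kubernetes library: `resolve_all` is a hypothetical helper that collects every address a name resolves to using the standard library's `socket.getaddrinfo`. Run from inside a pod, it would be expected to return one address for a ClusterIP service and one address per ready pod for a headless service. The resolver is injectable so the helper can be exercised without a cluster.

```python
import socket
from typing import Callable, List


def resolve_all(hostname: str, port: int,
                resolver: Callable = socket.getaddrinfo) -> List[str]:
    """Return the de-duplicated addresses `hostname` resolves to, sorted as strings.

    `resolver` defaults to socket.getaddrinfo; it is a parameter only so the
    function can be tested with canned results (an assumption of this sketch).
    """
    infos = resolver(hostname, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


# Example call from inside a pod (service name taken from this article):
#   resolve_all("my-app-headless.production.svc.cluster.local", 8080)
```

A single returned address does not prove the service is a ClusterIP service, since a headless service backed by one ready pod also resolves to one address.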
Creating a Headless Service
A headless service is identical to a regular Service except for clusterIP: None. The service still uses label selectors to associate with pods and still creates Endpoints objects — it just doesn't get a virtual IP or kube-proxy rules.
apiVersion: v1
kind: Service
metadata:
  name: my-app-headless
  namespace: production
spec:
  clusterIP: None  # This is what makes it headless
  selector:
    app: my-app
  ports:
    - name: app
      port: 8080
      targetPort: 8080
# Verify the service has no ClusterIP
kubectl get svc my-app-headless -n production
# NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
# my-app-headless ClusterIP None <none> 8080/TCP 5m
# Verify DNS returns individual pod IPs (run from inside a pod)
kubectl run dnstest --image=busybox:1.28 --restart=Never --rm -it \
-- nslookup my-app-headless.production.svc.cluster.local
# Returns multiple A records, one per pod
StatefulSet DNS: How Pod Names Work
When a StatefulSet is paired with a headless service (via spec.serviceName), Kubernetes creates a stable DNS entry for each pod. The naming pattern is deterministic and survives pod restarts: a pod that is deleted and recreated gets the same name and the same DNS entry, although the IP address behind that entry may change.
The DNS entry format for StatefulSet pods is:
{pod-name}.{service-name}.{namespace}.svc.{cluster-domain}
# Examples for a StatefulSet named "kafka" with serviceName "kafka-headless":
kafka-0.kafka-headless.production.svc.cluster.local
kafka-1.kafka-headless.production.svc.cluster.local
kafka-2.kafka-headless.production.svc.cluster.local
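Because the pattern is deterministic, the full list of pod names can be generated rather than looked up. A minimal Python sketch of the format above follows; `statefulset_pod_fqdns` is a hypothetical helper name, and the default cluster domain of cluster.local is an assumption that holds for most clusters but is configurable.

```python
from typing import List


def statefulset_pod_fqdns(statefulset: str, service: str, namespace: str,
                          replicas: int,
                          cluster_domain: str = "cluster.local") -> List[str]:
    """Build {pod-name}.{service-name}.{namespace}.svc.{cluster-domain} for each ordinal."""
    return [
        f"{statefulset}-{ordinal}.{service}.{namespace}.svc.{cluster_domain}"
        for ordinal in range(replicas)
    ]


for fqdn in statefulset_pod_fqdns("kafka", "kafka-headless", "production", 3):
    print(fqdn)
```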
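apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kafka
  namespace: production
spec:
  serviceName: kafka-headless  # MUST match the headless service name
  replicas: 3
  selector:
    matchLabels:
      app: kafka
  template:
    metadata:
      labels:
        app: kafka
    spec:
      containers:
        - name: kafka
          image: confluentinc/cp-kafka:7.6.0
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: KAFKA_BROKER_ID
              # Numeric ordinal of the pod (kafka-0 → 0). Kubernetes only substitutes
              # $(VAR) in env values; it does not evaluate shell expansions such as
              # ${POD_NAME##*-}. On Kubernetes 1.28+ the StatefulSet controller sets
              # this label; on older clusters, derive the ordinal in the entrypoint.
              valueFrom:
                fieldRef:
                  fieldPath: metadata.labels['apps.kubernetes.io/pod-index']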
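The broker ID comes from the pod's ordinal, the numeric suffix of its name. Where that value has to be computed inside the container (for example in an entrypoint wrapper), the extraction can be sketched in Python as follows. `pod_ordinal` is a hypothetical helper, and reading the name from a POD_NAME environment variable is an assumption based on the manifest above.

```python
import re

# A StatefulSet pod name is "<statefulset-name>-<ordinal>"; the StatefulSet
# name itself may contain hyphens, so only the trailing digits are the ordinal.
_ORDINAL_RE = re.compile(r"^(?P<name>.+)-(?P<ordinal>\d+)$")


def pod_ordinal(pod_name: str) -> int:
    """Return the numeric ordinal of a StatefulSet pod name (kafka-0 -> 0)."""
    match = _ORDINAL_RE.match(pod_name)
    if match is None:
        raise ValueError(f"not a StatefulSet pod name: {pod_name!r}")
    return int(match.group("ordinal"))


# Inside a container this might be used as:
#   broker_id = pod_ordinal(os.environ["POD_NAME"])
```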
DNS Lookup Patterns
CoreDNS handles all cluster DNS. Understanding the different lookup patterns helps you configure clients correctly and debug DNS failures.
# Full FQDN lookup — works from any namespace
nslookup kafka-0.kafka-headless.production.svc.cluster.local
# Short name — works from within the same namespace
nslookup kafka-0.kafka-headless
# SRV record lookup — returns all pods with port information
nslookup -type=SRV _kafka._tcp.kafka-headless.production.svc.cluster.local
# Returns:
# kafka-headless.production.svc.cluster.local service = 0 33 9092 kafka-0.kafka-headless.production.svc.cluster.local
# kafka-headless.production.svc.cluster.local service = 0 33 9092 kafka-1.kafka-headless.production.svc.cluster.local
# kafka-headless.production.svc.cluster.local service = 0 33 9092 kafka-2.kafka-headless.production.svc.cluster.local
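SRV records are useful for clients that need to discover all members of a cluster, together with their ports, at run time. A client that supports SRV-based discovery can be pointed at the record name and will pick up new pods as the StatefulSet scales, once those pods become ready. Support varies by system, so check whether the seed configuration of your Cassandra or Elasticsearch version accepts SRV names or expects plain hostnames; in the latter case, the headless service name or the per-pod names can be used instead.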
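To illustrate what a client does with these records, the sketch below parses SRV record data of the form "priority weight port target", as shown in the lookup output above, into host:port endpoints. It is an assumption-laden example: `SrvRecord`, `parse_srv_rdata`, and `endpoints` are hypothetical names, and the DNS query itself is left out because the Python standard library has no SRV resolver.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SrvRecord:
    priority: int
    weight: int
    port: int
    target: str


def parse_srv_rdata(rdata: str) -> SrvRecord:
    """Parse 'priority weight port target' (the data part of an SRV record)."""
    priority, weight, port, target = rdata.split()
    # Strip the trailing dot that fully qualified names carry in DNS output.
    return SrvRecord(int(priority), int(weight), int(port), target.rstrip("."))


def endpoints(records: List[SrvRecord]) -> List[str]:
    """Return host:port strings, lowest priority value first, then by target name."""
    ordered = sorted(records, key=lambda r: (r.priority, r.target))
    return [f"{r.target}:{r.port}" for r in ordered]
```

This sketch ignores the weight field; a real client would normally use it to distribute load among targets of equal priority.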
Real Example: Kafka with Headless Service
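A production Kafka StatefulSet typically uses two services: a headless service for inter-broker communication and pod-specific DNS, and a regular ClusterIP service that clients use as a bootstrap address. After bootstrapping, clients connect to individual brokers through their advertised listeners, so the per-pod DNS names must be resolvable from wherever the clients run.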
# Headless service: broker-to-broker communication
apiVersion: v1
kind: Service
metadata:
  name: kafka-headless
  namespace: production
spec:
  clusterIP: None
  selector:
    app: kafka
  ports:
    - name: kafka
      port: 9092
    - name: controller
      port: 9093
---
# Regular service: client access (load balanced)
apiVersion: v1
kind: Service
metadata:
  name: kafka
  namespace: production
spec:
  selector:
    app: kafka
  ports:
    - name: kafka
      port: 9092
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kafka
  namespace: production
spec:
  serviceName: kafka-headless
  replicas: 3
  selector:
    matchLabels:
      app: kafka
  template:
    metadata:
      labels:
        app: kafka  # Must match spec.selector and the selectors of both Services
    spec:
      containers:
        - name: kafka
          image: confluentinc/cp-kafka:7.6.0
          env:
            # POD_NAME must be defined before any value that references $(POD_NAME)
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: KAFKA_ADVERTISED_LISTENERS
              # Each broker advertises its own stable DNS name
              value: "PLAINTEXT://$(POD_NAME).kafka-headless.production.svc.cluster.local:9092"
            - name: KAFKA_CONTROLLER_QUORUM_VOTERS
              value: "0@kafka-0.kafka-headless.production.svc.cluster.local:9093,1@kafka-1.kafka-headless.production.svc.cluster.local:9093,2@kafka-2.kafka-headless.production.svc.cluster.local:9093"
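The quorum voters string is long and easy to mistype, but it follows directly from the replica count and the pod DNS pattern. A minimal Python sketch that generates it, for example from a templating or deployment script, is shown below. `controller_quorum_voters` is a hypothetical helper, and it assumes that each node ID equals the pod ordinal, as in the manifest above.

```python
def controller_quorum_voters(statefulset: str, service: str, namespace: str,
                             replicas: int, port: int = 9093,
                             cluster_domain: str = "cluster.local") -> str:
    """Build an 'id@host:port' list with one entry per StatefulSet pod."""
    voters = []
    for node_id in range(replicas):
        # Stable per-pod DNS name provided by the headless service.
        host = f"{statefulset}-{node_id}.{service}.{namespace}.svc.{cluster_domain}"
        voters.append(f"{node_id}@{host}:{port}")
    return ",".join(voters)


print(controller_quorum_voters("kafka", "kafka-headless", "production", 3))
```

Note that the generated value is fixed at deploy time, so changing the number of controller replicas also means regenerating it.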
Real Example: Cassandra StatefulSet
Cassandra's gossip protocol requires each node to know the addresses of seed nodes. Using stable StatefulSet DNS names as seeds means the configuration never needs to change when pods restart.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cassandra
  namespace: production
spec:
  serviceName: cassandra-headless
  replicas: 3
  selector:
    matchLabels:
      app: cassandra
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      containers:
        - name: cassandra
          image: cassandra:5.0
          env:
            - name: CASSANDRA_SEEDS
              # First two pods as seeds; stable because of StatefulSet naming
              value: "cassandra-0.cassandra-headless.production.svc.cluster.local,cassandra-1.cassandra-headless.production.svc.cluster.local"
            - name: CASSANDRA_CLUSTER_NAME
              value: "production-cluster"
            - name: POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: CASSANDRA_LISTEN_ADDRESS
              value: "$(POD_IP)"
          readinessProbe:
            exec:
              # bash expands ${POD_IP} from the container environment. $(POD_IP)
              # would be a command substitution here, and single quotes would
              # prevent any expansion.
              command:
                - /bin/bash
                - -c
                - nodetool status | grep -E "^UN\s+${POD_IP}\s"
            initialDelaySeconds: 30
            periodSeconds: 10
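The readiness check above passes only when nodetool reports this pod's own address in the UN (Up/Normal) state. The same logic can be sketched in Python to make the matching rule explicit. `is_up_normal` is a hypothetical helper, and `SAMPLE_STATUS` is hand-written illustrative input that follows the general layout of nodetool status, not captured output.

```python
def is_up_normal(nodetool_status: str, pod_ip: str) -> bool:
    """True if a line of `nodetool status` output reports pod_ip with state UN."""
    for line in nodetool_status.splitlines():
        fields = line.split()
        # Node lines start with a two-letter state code followed by the address.
        # Comparing whole fields avoids 10.0.0.1 matching 10.0.0.10.
        if len(fields) >= 2 and fields[0] == "UN" and fields[1] == pod_ip:
            return True
    return False


# Illustrative input only; real output has more columns per node line.
SAMPLE_STATUS = """\
Datacenter: datacenter1
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load  Tokens
UN  10.0.0.1   1 KiB  16
DN  10.0.0.2   1 KiB  16
UJ  10.0.0.10  1 KiB  16
"""
```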
Headless Services with Deployments
Headless services can also be used with regular Deployments — not just StatefulSets. In this case, DNS returns all pod IPs but there are no per-pod hostnames (since Deployment pods have random suffixes). This is useful for client-side load balancing in gRPC services, where the gRPC client library handles connection pooling and load balancing using the list of IP addresses returned by DNS.
# Headless service for a gRPC Deployment
apiVersion: v1
kind: Service
metadata:
  name: grpc-service-headless
  namespace: production
spec:
  clusterIP: None
  selector:
    app: grpc-service
  ports:
    - name: grpc
      port: 50051
# gRPC client using DNS-based discovery with round-robin load balancing
import grpc

channel = grpc.insecure_channel(
    'dns:///grpc-service-headless.production.svc.cluster.local:50051',
    options=[('grpc.lb_policy_name', 'round_robin')],
)
# gRPC resolves the DNS name to all pod IPs and distributes calls
Troubleshooting StatefulSet DNS
# Check that the headless service exists and has the right selector
kubectl get svc kafka-headless -n production -o yaml | grep -A5 selector
# Check that pods have the right labels to match the service selector
kubectl get pods -n production -l app=kafka --show-labels
# Check that Endpoints are populated (empty = selector mismatch or pods not Ready)
kubectl get endpoints kafka-headless -n production
# Verify DNS resolution from inside a pod
kubectl exec -it kafka-0 -n production -- \
nslookup kafka-0.kafka-headless.production.svc.cluster.local
# Check CoreDNS logs for resolution failures
kubectl logs -n kube-system -l k8s-app=kube-dns --tail=50
# Check pod hostname and subdomain (must match serviceName)
kubectl exec -it kafka-0 -n production -- hostname -f
# Expected: kafka-0.kafka-headless.production.svc.cluster.local
spec.serviceName in the StatefulSet must exactly match the metadata.name of the headless Service. A mismatch means pods start but don't get their stable DNS entries, causing cluster bootstrap failures in Kafka and Cassandra.
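These mismatches can also be caught before deployment by comparing the two manifests. The sketch below is a hypothetical pre-flight check, not an existing tool: `check_statefulset_dns` takes the StatefulSet and Service as plain dictionaries, such as a YAML parser would produce, and returns a list of human-readable problems.

```python
from typing import Dict, List


def check_statefulset_dns(statefulset: Dict, service: Dict) -> List[str]:
    """Return the problems that would prevent stable per-pod DNS entries."""
    problems = []
    sts_meta, sts_spec = statefulset["metadata"], statefulset["spec"]
    svc_meta, svc_spec = service["metadata"], service["spec"]

    if sts_spec.get("serviceName") != svc_meta["name"]:
        problems.append(
            f"serviceName {sts_spec.get('serviceName')!r} does not match "
            f"Service name {svc_meta['name']!r}"
        )
    # YAML parsers read `clusterIP: None` as the string "None", not as null.
    if svc_spec.get("clusterIP") != "None":
        problems.append("Service is not headless (clusterIP is not None)")
    if sts_meta.get("namespace", "default") != svc_meta.get("namespace", "default"):
        problems.append("StatefulSet and Service are in different namespaces")

    selector = svc_spec.get("selector") or {}
    labels = sts_spec["template"]["metadata"].get("labels") or {}
    unmatched = sorted(k for k, v in selector.items() if labels.get(k) != v)
    if unmatched:
        problems.append(f"pod labels do not satisfy Service selector keys: {unmatched}")
    return problems
```

An empty list means the manifests are consistent with each other; it does not confirm that the pods are Ready or that CoreDNS is healthy, which the kubectl commands above still need to verify.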