CoreDNS in Kubernetes
A pod cannot resolve an external domain. You exec into the pod and run
dig google.com. It returns SERVFAIL. CoreDNS pods are running. The node can resolve google.com just fine. Kubernetes DNS is broken, but only for this pod. You spend an hour checking CoreDNS logs, ConfigMaps, and network policies. Finally, you notice the pod spec has dnsPolicy: None with no dnsConfig — the pod has no DNS configuration at all. A one-line fix. But without understanding how Kubernetes DNS works end-to-end, you could not have found it.
CoreDNS: The DNS Server Inside Every Cluster
Since Kubernetes 1.13, CoreDNS is the default DNS server in every cluster. It replaced kube-dns and runs as a Deployment in the kube-system namespace (typically two replicas for high availability).
Every pod in the cluster is configured to use CoreDNS as its DNS resolver. When a pod resolves any hostname — internal service names like my-service.default.svc.cluster.local or external names like google.com — the query goes to CoreDNS first.
# Check CoreDNS is running
kubectl get pods -n kube-system -l k8s-app=kube-dns
# NAME                       READY   STATUS    RESTARTS
# coredns-5d78c9869d-abcde   1/1     Running   0
# coredns-5d78c9869d-fghij   1/1     Running   0
# CoreDNS runs as a Service (usually 10.96.0.10)
kubectl get svc -n kube-system kube-dns
# NAME       TYPE        CLUSTER-IP   PORT(S)
# kube-dns   ClusterIP   10.96.0.10   53/UDP,53/TCP,9153/TCP
# Every pod's /etc/resolv.conf points to this IP
kubectl exec my-pod -- cat /etc/resolv.conf
# nameserver 10.96.0.10
# search default.svc.cluster.local svc.cluster.local cluster.local
# options ndots:5
The CoreDNS Service IP (typically 10.96.0.10) is configured at cluster creation time and is hardcoded into every pod's /etc/resolv.conf by the kubelet. If CoreDNS pods are down, every pod in the cluster loses DNS resolution — including DNS for external services. CoreDNS is the single most critical service in your cluster after the API server.
The Corefile: CoreDNS Configuration
CoreDNS is configured through a ConfigMap called coredns in the kube-system namespace. The configuration is called the Corefile.
# View the CoreDNS configuration
kubectl get configmap coredns -n kube-system -o yaml
A typical Corefile:
.:53 {
    errors
    health {
        lameduck 5s
    }
    ready
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
        ttl 30
    }
    prometheus :9153
    forward . /etc/resolv.conf {
        max_concurrent 1000
    }
    cache 30
    loop
    reload
    loadbalance
}
Let us break down each plugin:
| Plugin | Purpose |
|---|---|
| errors | Log errors to stdout |
| health | Expose /health endpoint on port 8080 |
| ready | Expose /ready endpoint on port 8181 |
| kubernetes | Resolve Kubernetes Service and Pod DNS names |
| prometheus | Export metrics on port 9153 |
| forward | Forward non-cluster queries to upstream resolvers |
| cache | Cache responses for 30 seconds |
| loop | Detect and break forwarding loops |
| reload | Automatically reload the Corefile when the ConfigMap changes |
| loadbalance | Round-robin A record responses |
The forward plugin determines where external DNS queries go. By default, it forwards to the node's /etc/resolv.conf (which contains the VPC or host resolver). You can change this to forward to specific resolvers like 8.8.8.8 or 1.1.1.1 for more predictable behavior. Just edit the ConfigMap: forward . 8.8.8.8 1.1.1.1.
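For example, switching upstreams is a one-line edit to the forward block (a sketch — the resolver IPs here are public-resolver examples; pick resolvers appropriate for your environment). The reload plugin picks up the ConfigMap change without restarting CoreDNS:

```
# kubectl -n kube-system edit configmap coredns
# then replace the forward block inside the server block:
    forward . 8.8.8.8 1.1.1.1 {
        max_concurrent 1000
    }
```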
Kubernetes DNS Names: The Full Picture
CoreDNS creates DNS records for every Service and Pod in the cluster. The naming convention follows a strict hierarchy.
Service DNS Names
For a Service named my-service in namespace production:
my-service.production.svc.cluster.local
The components:
- my-service — the Service name
- production — the namespace
- svc — indicates this is a Service
- cluster.local — the default cluster domain
# Full DNS name
kubectl exec debug -- dig my-service.production.svc.cluster.local A +short
# 10.96.0.42
# Within the same namespace, you can use short names:
# From a pod in the "production" namespace:
kubectl exec debug -- dig my-service A +short
# 10.96.0.42
# (search domains make this work — more on this below)
Headless Service DNS
A headless service (clusterIP: None) does not get a ClusterIP. Instead, DNS returns the individual pod IPs.
apiVersion: v1
kind: Service
metadata:
  name: my-headless
spec:
  clusterIP: None
  selector:
    app: my-app
  ports:
    - port: 8080
# Headless service — returns all pod IPs
kubectl exec debug -- dig my-headless.default.svc.cluster.local A +short
# 10.244.0.5
# 10.244.1.8
# 10.244.2.3
# SRV records include port information
kubectl exec debug -- dig _http._tcp.my-headless.default.svc.cluster.local SRV +short
# 0 33 8080 10-244-0-5.my-headless.default.svc.cluster.local.
# 0 33 8080 10-244-1-8.my-headless.default.svc.cluster.local.
# 0 33 8080 10-244-2-3.my-headless.default.svc.cluster.local.
Headless services are essential for StatefulSets. Each StatefulSet pod gets a stable DNS name: pod-name.headless-svc.namespace.svc.cluster.local. For example, postgres-0.postgres-headless.production.svc.cluster.local always resolves to the specific pod. This stable identity persists across pod restarts, which is critical for databases and distributed systems.
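The stable names come from pairing a StatefulSet with its headless Service. A minimal sketch, following the postgres naming from the example above (illustrative, not production-ready):

```
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres-headless   # must name a headless Service
  replicas: 3
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:16
          ports:
            - containerPort: 5432
# Each pod then gets a stable DNS name:
#   postgres-0.postgres-headless.<namespace>.svc.cluster.local
```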
ExternalName Services
ExternalName services create a CNAME record pointing to an external hostname:
apiVersion: v1
kind: Service
metadata:
  name: external-db
spec:
  type: ExternalName
  externalName: mydb.us-east-1.rds.amazonaws.com
kubectl exec debug -- dig external-db.default.svc.cluster.local +short
# mydb.us-east-1.rds.amazonaws.com.
# 10.0.1.50
The ndots Problem: Why DNS Is Slow in Kubernetes
This is the single most impactful DNS configuration issue in Kubernetes. If you understand nothing else about K8s DNS, understand this.
Look at the default /etc/resolv.conf in every pod:
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
What ndots:5 means: If a hostname has fewer than 5 dots, the resolver appends each search domain before trying the name as-is.
When a pod resolves google.com (which has 1 dot, fewer than 5):
1. Query google.com.default.svc.cluster.local — NXDOMAIN
2. Query google.com.svc.cluster.local — NXDOMAIN
3. Query google.com.cluster.local — NXDOMAIN
4. Query google.com. (as-is, fully qualified) — SUCCESS
That is 4 DNS queries for a single external hostname resolution. Every external DNS lookup generates 3 wasted queries.
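The expansion order can be sketched as a small shell function that prints the candidate FQDNs in the order a glibc-style resolver would try them. This is a simulation for illustration, not the actual resolver code:

```shell
#!/bin/sh
# Simulate glibc-style search-domain expansion: print the candidate
# names in the order the resolver would try them.
expand_queries() {
  name="$1"; ndots="$2"; shift 2
  # count the dots in the name
  dots=$(printf '%s' "$name" | tr -cd '.' | wc -c)
  dots=$((dots + 0))   # normalize wc whitespace
  if [ "$dots" -ge "$ndots" ]; then
    # enough dots: the name is tried as-is first
    printf '%s.\n' "$name"
    for d in "$@"; do printf '%s.%s.\n' "$name" "$d"; done
  else
    # too few dots: every search domain is tried before the name itself
    for d in "$@"; do printf '%s.%s.\n' "$name" "$d"; done
    printf '%s.\n' "$name"
  fi
}

expand_queries google.com 5 \
  default.svc.cluster.local svc.cluster.local cluster.local
# google.com.default.svc.cluster.local.
# google.com.svc.cluster.local.
# google.com.cluster.local.
# google.com.
```

Rerunning the same function with ndots set to 2 puts api.stripe.com. first, which is exactly why lowering ndots eliminates the wasted queries for external names.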
# Watch the DNS queries in real time with tcpdump inside the pod
# (requires tcpdump in the image; alternatively capture on the node's veth)
kubectl exec debug -- tcpdump -i eth0 port 53 -nn
# When the pod resolves "google.com":
# 10:00:00.001 query google.com.default.svc.cluster.local A?
# 10:00:00.002 response NXDOMAIN
# 10:00:00.003 query google.com.svc.cluster.local A?
# 10:00:00.004 response NXDOMAIN
# 10:00:00.005 query google.com.cluster.local A?
# 10:00:00.006 response NXDOMAIN
# 10:00:00.007 query google.com A?
# 10:00:00.008 response A 142.250.80.46
[Diagram: CoreDNS in a Kubernetes cluster]
A team running a microservices platform noticed their P99 latency spiked by 15ms across all services. The root cause: CoreDNS was overloaded. The fleet of 200 pods was collectively making hundreds of external API calls per second, and each call generated 4 DNS queries due to ndots:5, so CoreDNS was handling roughly 800 queries per second of pure waste. Reducing ndots to 2 cut DNS query volume by 60% and brought latency back to normal.
Fixing the ndots Problem
Option 1: Reduce ndots in pod spec
apiVersion: v1
kind: Pod
spec:
  dnsConfig:
    options:
      - name: ndots
        value: "2"
With ndots:2, names with 2 or more dots (like api.external.com) are queried as-is first. Only names with fewer than 2 dots get search domains appended. This eliminates wasted queries for most external hostnames.
Option 2: Use fully qualified domain names (trailing dot)
# In your application code, add a trailing dot to external hostnames:
# "google.com." instead of "google.com"
# The trailing dot tells the resolver: this is already fully qualified, skip search domains
Option 3: Use a dedicated dnsPolicy
apiVersion: v1
kind: Pod
spec:
  dnsPolicy: "None"
  dnsConfig:
    nameservers:
      - 10.96.0.10
    searches:
      - default.svc.cluster.local
      - svc.cluster.local
    options:
      - name: ndots
        value: "2"
For most production workloads, setting ndots:2 is the best trade-off. It still allows short service names like my-service to resolve via search domains (because my-service has 0 dots, which is fewer than 2), while external domains like api.stripe.com (2 dots) are queried directly without wasted searches. Set this as a default in your pod templates or admission webhook.
Pod DNS Policy
The dnsPolicy field controls how a pod's /etc/resolv.conf is configured:
| Policy | Behavior | Use Case |
|---|---|---|
| ClusterFirst (default) | Use CoreDNS for everything. Cluster names resolved locally, external names forwarded upstream. | 99% of workloads |
| Default | Use the node's DNS configuration (/etc/resolv.conf from the host). Does not use CoreDNS. | Host-networking pods that need node DNS |
| ClusterFirstWithHostNet | Like ClusterFirst, but for pods with hostNetwork: true. | Pods on the host network that still need cluster DNS |
| None | No automatic DNS config. You must provide dnsConfig manually. | Custom DNS configurations |
# Check a pod's DNS policy
kubectl get pod my-pod -o jsonpath='{.spec.dnsPolicy}'
# ClusterFirst
# Check the resulting resolv.conf
kubectl exec my-pod -- cat /etc/resolv.conf
Setting dnsPolicy: Default means the pod uses the node's resolver, not CoreDNS. This means cluster service names like my-service.default.svc.cluster.local will NOT resolve. This is a common mistake when running pods with hostNetwork: true — you probably want ClusterFirstWithHostNet instead.
NodeLocal DNSCache: Scaling DNS
In large clusters (hundreds or thousands of pods), CoreDNS can become a bottleneck. Every DNS query from every pod goes to the CoreDNS Service IP, which is load-balanced to one of the CoreDNS pods. This creates:
- Network hops: queries cross the pod network to reach CoreDNS
- Conntrack pressure: each UDP DNS query creates a conntrack entry
- Single point of contention: two CoreDNS pods serve the entire cluster
NodeLocal DNSCache solves this by running a DNS cache on every node as a DaemonSet. Pods send DNS queries to the local cache (via a link-local IP like 169.254.20.10), which handles most queries from cache and only forwards cache misses to CoreDNS.
# Check if NodeLocal DNSCache is running
kubectl get daemonset -n kube-system node-local-dns
# NAME             DESIRED   CURRENT   READY
# node-local-dns   10        10        10
# With NodeLocal DNS, pod resolv.conf changes to:
kubectl exec my-pod -- cat /etc/resolv.conf
# nameserver 169.254.20.10 <-- Local cache on the node
# search default.svc.cluster.local svc.cluster.local cluster.local
# options ndots:5
NodeLocal DNSCache reduces DNS latency from milliseconds (network round-trip to CoreDNS pod) to microseconds (local cache hit on the same node). It also eliminates conntrack issues for DNS traffic since the queries go to a local address. If you run more than 50 nodes or have DNS-heavy workloads, deploying NodeLocal DNSCache is strongly recommended.
Common CoreDNS Failures
1. SERVFAIL for External Domains
Symptom: Pods can resolve cluster services but external domains return SERVFAIL.
kubectl exec debug -- dig google.com A
# ;; ->>HEADER<<- status: SERVFAIL
Causes:
- CoreDNS cannot reach the upstream resolver (check the forward plugin target)
- A NetworkPolicy is blocking CoreDNS egress on port 53
- The upstream resolver (VPC resolver, ISP resolver) is down
# Check CoreDNS logs
kubectl logs -n kube-system -l k8s-app=kube-dns --tail=50
# [ERROR] plugin/forward: no nameservers found
# Test upstream connectivity from a CoreDNS pod
# (recent CoreDNS images are distroless with no shell or nslookup;
#  if the exec fails, test from a debug pod on the same node instead)
kubectl exec -n kube-system coredns-abc123 -- nslookup google.com 8.8.8.8
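NetworkPolicies can break DNS in two ways: blocking CoreDNS's own egress to its upstream, or blocking application pods from reaching CoreDNS at all. A sketch of an egress rule that keeps DNS working for application pods in a locked-down namespace (the kubernetes.io/metadata.name label is set automatically on every namespace since Kubernetes 1.21; the k8s-app: kube-dns selector assumes the standard CoreDNS labels):

```
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
spec:
  podSelector: {}          # applies to all pods in this namespace
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```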
2. NXDOMAIN for Service Names
Symptom: External DNS works but cluster service names return NXDOMAIN.
kubectl exec debug -- dig my-service.default.svc.cluster.local A
# ;; ->>HEADER<<- status: NXDOMAIN
Causes:
- Service does not exist (typo in service name or namespace)
- Service has no endpoints (no pods match the selector)
- The CoreDNS kubernetes plugin is misconfigured
# Verify the service exists
kubectl get svc my-service -n default
# Verify endpoints exist
kubectl get endpoints my-service -n default
3. DNS Timeouts
Symptom: DNS queries hang and eventually time out.
Causes:
- CoreDNS pods are overloaded (CPU throttled, too many queries)
- Network issue between pods and CoreDNS Service IP
- conntrack table full (UDP DNS entries filling conntrack)
# Check CoreDNS resource usage
kubectl top pods -n kube-system -l k8s-app=kube-dns
# NAME             CPU    MEMORY
# coredns-abc123   980m   128Mi   <-- CPU maxed out!
# Check conntrack usage (run on the node)
cat /proc/sys/net/netfilter/nf_conntrack_count
cat /proc/sys/net/netfilter/nf_conntrack_max
A cluster with 500 pods experienced intermittent DNS timeouts every few minutes. CoreDNS metrics showed normal query volume. The issue was conntrack table exhaustion on the nodes. Every DNS query (UDP) created a conntrack entry with a 30-second timeout. With ndots:5 multiplying queries by 4x, the conntrack table hit its limit and started dropping packets silently. Fix: deploy NodeLocal DNSCache (bypasses conntrack for local queries) and increase nf_conntrack_max.
4. DNS Resolution Race Condition
Symptom: Applications intermittently fail DNS resolution during startup, especially for A and AAAA queries sent simultaneously.
This is a known Linux kernel bug (before 5.0) where simultaneous UDP packets to the same destination from the same source port can trigger a conntrack race condition, causing one packet to be dropped.
# Workaround: force sequential DNS requests in the pod spec
apiVersion: v1
kind: Pod
spec:
  dnsConfig:
    options:
      - name: single-request-reopen
Note that single-request-reopen is a glibc resolver option; musl-based images (such as Alpine) ignore it.
Key Concepts Summary
- CoreDNS is the DNS server in every Kubernetes cluster — it runs in kube-system and resolves both cluster and external names
- The Corefile configures CoreDNS via a ConfigMap — the kubernetes plugin handles cluster names, forward sends external queries upstream
- Service DNS format: service.namespace.svc.cluster.local — short names work within the same namespace via search domains
- Headless services return pod IPs directly — essential for StatefulSets and service discovery
- ndots:5 causes 4 DNS queries per external lookup — reduce to ndots:2 to eliminate wasted queries
- DNS policies control resolver configuration: ClusterFirst (default and correct for most pods), Default (host DNS only), None (custom)
- NodeLocal DNSCache reduces DNS latency and eliminates conntrack pressure — recommended for clusters with more than 50 nodes
- The most common CoreDNS failures are SERVFAIL (upstream unreachable), NXDOMAIN (service does not exist), and timeouts (CoreDNS overloaded or conntrack full)
Common Mistakes
- Setting dnsPolicy: Default on pods that need to resolve cluster services — they will fail because the node resolver does not know about svc.cluster.local
- Not reducing ndots from the default of 5 — wastes DNS queries and overloads CoreDNS at scale
- Forgetting that headless services return pod IPs that change when pods restart — applications must handle DNS re-resolution
- Blocking CoreDNS egress with NetworkPolicy — CoreDNS needs to reach upstream resolvers on port 53
- Not monitoring CoreDNS resource usage — CPU throttling causes cluster-wide DNS timeouts
- Assuming DNS resolution is instant — even cached responses take microseconds, and cache misses can take milliseconds
A pod in the production namespace resolves my-api.staging.svc.cluster.local. The request fails with NXDOMAIN, but the service exists in the staging namespace. What should you check first?