Kubernetes Service Types Deep Dive
You create a Service of type LoadBalancer in your Kubernetes cluster. You run `kubectl get svc` and wait. The EXTERNAL-IP column says `<pending>`. Five minutes pass. Still pending. Ten minutes. Still pending. You check the pod — it is Running. The Endpoints are populated. The Service has a ClusterIP. Everything looks correct from inside the cluster. But no external IP appears.
The problem? Your cluster does not have a cloud controller manager configured. There is nothing in the cluster that knows how to talk to your cloud provider and provision a load balancer. The Service is waiting for something that will never come.
To debug this kind of issue, you need to understand what each Service type actually creates under the hood — not just what `kubectl get svc` shows you.
Part 1: ClusterIP — The Foundation of All Services
Every Kubernetes Service starts as a ClusterIP. Even NodePort and LoadBalancer services have a ClusterIP underneath. Understanding ClusterIP means understanding how all Services work.
What ClusterIP Creates
When you create a ClusterIP Service, Kubernetes assigns a virtual IP address from the Service CIDR range (typically 10.96.0.0/12). This IP does not belong to any network interface. No pod, no node, no device has this IP. It exists only in iptables/IPVS rules.
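To get a feel for how large that pool is, a quick sketch with Python's standard `ipaddress` module (the `/12` range shown is the common default, not a guarantee for your cluster):

```python
import ipaddress

# The default Service CIDR (10.96.0.0/12) is just an address pool;
# every ClusterIP is allocated from it by the API server.
cidr = ipaddress.ip_network("10.96.0.0/12")

print(cidr.num_addresses)  # 1048576 (2**20) possible ClusterIPs
print(ipaddress.ip_address("10.96.45.123") in cidr)  # True
```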
apiVersion: v1
kind: Service
metadata:
  name: api-service
spec:
  type: ClusterIP        # default
  selector:
    app: api
  ports:
    - port: 80           # Service port (what clients connect to)
      targetPort: 8080   # Pod port (where the app listens)
      protocol: TCP
$ kubectl get svc api-service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
api-service ClusterIP 10.96.45.123 <none> 80/TCP 5m
$ kubectl get endpoints api-service
NAME ENDPOINTS AGE
api-service 10.244.1.5:8080,10.244.2.8:8080,10.244.3.2:8080 5m
The Service maps 10.96.45.123:80 to the three pod IPs on port 8080. But how does the traffic actually get from the ClusterIP to the pod? That depends on which kube-proxy mode your cluster uses.
A ClusterIP is a virtual IP that only exists in the data plane rules (iptables, IPVS, or eBPF maps). You cannot ping it. You cannot traceroute to it. No device has this IP on any network interface. It only works because every node in the cluster has rules that intercept traffic to this IP and redirect it to a real pod IP. If you are debugging and try to ping a ClusterIP and it does not respond — that is normal.
kube-proxy Mode: iptables (Default)
In iptables mode, kube-proxy writes NAT rules that intercept traffic destined for the ClusterIP and DNAT (destination NAT) it to a randomly chosen pod IP.
# Simplified view of what kube-proxy creates for a 3-pod Service:
# Step 1: Match traffic to the Service ClusterIP
iptables -t nat -A KUBE-SERVICES \
-d 10.96.45.123/32 -p tcp --dport 80 \
-j KUBE-SVC-ABCDEF
# Step 2: Probabilistic load balancing
# Rule 1: 33.3% chance → Pod A
iptables -t nat -A KUBE-SVC-ABCDEF \
-m statistic --mode random --probability 0.33333 \
-j KUBE-SEP-POD-A
# Rule 2: 50% of remaining (= 33.3% total) → Pod B
iptables -t nat -A KUBE-SVC-ABCDEF \
-m statistic --mode random --probability 0.50000 \
-j KUBE-SEP-POD-B
# Rule 3: Everything else (33.3%) → Pod C
iptables -t nat -A KUBE-SVC-ABCDEF \
-j KUBE-SEP-POD-C
# Step 3: DNAT to the actual pod IP
iptables -t nat -A KUBE-SEP-POD-A \
-p tcp -j DNAT --to-destination 10.244.1.5:8080
iptables mode does not do real load balancing. It uses random selection via the statistic module. There is no "least connections" or "round robin" — just random. For most workloads this is fine. But if you have long-lived connections (WebSocket, gRPC streams), the randomness can lead to uneven distribution. One pod might get 5 long-lived connections while another gets 1.
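The probabilities in the rules above follow a simple pattern: with n endpoints, rule i matches the traffic that reaches it with probability 1/(n - i), which works out to an equal 1/n share per pod overall. A small sketch of that arithmetic:

```python
def iptables_probabilities(n):
    """Per-rule match probabilities for an n-endpoint Service.

    Rule i (0-indexed) matches remaining traffic with probability
    1/(n - i); the last rule is effectively 1.0 (no statistic match,
    it catches everything left). Each pod ends up with a 1/n share.
    """
    return [1.0 / (n - i) for i in range(n)]

probs = iptables_probabilities(3)
print(probs)  # [0.3333..., 0.5, 1.0] — matches the three rules above

# Verify the effective per-pod share really is equal:
remaining, shares = 1.0, []
for p in probs:
    shares.append(remaining * p)   # traffic this rule captures
    remaining *= (1 - p)           # traffic that falls through
print(shares)  # each ~0.3333
```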
kube-proxy Mode: IPVS
IPVS (IP Virtual Server) is a Linux kernel module designed specifically for load balancing. When kube-proxy runs in IPVS mode, it creates IPVS virtual servers instead of iptables rules.
# Check if your cluster uses IPVS
kubectl logs -n kube-system -l k8s-app=kube-proxy | grep "Using ipvs"
# View IPVS rules
ipvsadm -Ln
# IP Virtual Server version 1.2.1 (size=4096)
# Prot LocalAddress:Port Scheduler Flags
# -> RemoteAddress:Port Forward Weight ActiveConn InActConn
# TCP 10.96.45.123:80 rr
# -> 10.244.1.5:8080 Masq 1 12 45
# -> 10.244.2.8:8080 Masq 1 11 42
# -> 10.244.3.2:8080 Masq 1 13 48
IPVS provides real load balancing algorithms:
| Algorithm | Flag | Description |
|---|---|---|
| Round Robin | rr | Sequential rotation (default) |
| Least Connections | lc | Fewest active connections |
| Destination Hashing | dh | Hash destination IP — consistent routing |
| Source Hashing | sh | Hash source IP — session affinity |
| Shortest Expected Delay | sed | Factors in connection count AND weight |
| Never Queue | nq | Sends to idle server if available, else SED |
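The real schedulers live in the kernel's ip_vs module; as a toy illustration of the two most common ones, here is what rr and lc selection look like in Python (backend addresses and connection counts mirror the `ipvsadm -Ln` output above):

```python
from itertools import cycle

backends = ["10.244.1.5:8080", "10.244.2.8:8080", "10.244.3.2:8080"]

# rr (round robin): strict sequential rotation through the backends.
rr = cycle(backends)
assigned = [next(rr) for _ in range(4)]
print(assigned)  # wraps back to the first backend on the 4th pick

# lc (least connections): the backend with the fewest active
# connections wins. Counts taken from the ActiveConn column above.
active = {
    "10.244.1.5:8080": 12,
    "10.244.2.8:8080": 11,
    "10.244.3.2:8080": 13,
}
target = min(active, key=active.get)
print(target)  # 10.244.2.8:8080 — the lowest ActiveConn
```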
# Switch kube-proxy to IPVS mode
kubectl edit configmap kube-proxy -n kube-system
# Change: mode: "ipvs"
# Change: ipvs.scheduler: "lc" # least connections
# Restart kube-proxy pods to apply
kubectl rollout restart daemonset kube-proxy -n kube-system
IPVS mode is significantly better than iptables at scale. With iptables, adding a Service with 100 endpoints means adding 100+ iptables rules. Rule evaluation is O(n) — the kernel walks through rules linearly. IPVS uses hash tables for O(1) lookups. If your cluster has more than 1,000 Services or more than 5,000 Endpoints, switch to IPVS mode. The performance difference is dramatic.
kube-proxy Replacement: eBPF (Cilium)
Cilium can replace kube-proxy entirely using eBPF programs attached to the kernel networking stack. This is the most performant option.
# Cilium kube-proxy replacement:
# - No iptables rules for Services
# - No IPVS virtual servers
# - eBPF programs attached to socket and TC hooks
# - Socket-level redirection: traffic is redirected BEFORE
# it even enters the kernel networking stack
# Check if Cilium is replacing kube-proxy
cilium status | grep KubeProxyReplacement
# KubeProxyReplacement: True
Part 2: NodePort — Exposing Services Outside the Cluster
NodePort builds on top of ClusterIP. It takes a ClusterIP Service and adds one thing: it opens a specific port on every node in the cluster.
apiVersion: v1
kind: Service
metadata:
  name: api-service
spec:
  type: NodePort
  selector:
    app: api
  ports:
    - port: 80           # ClusterIP port (internal)
      targetPort: 8080   # Pod port
      nodePort: 30080    # Port opened on every node (30000-32767)
      protocol: TCP
$ kubectl get svc api-service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
api-service NodePort 10.96.45.123 <none> 80:30080/TCP 5m
# Now reachable at ANY node IP on port 30080:
curl http://node-1-ip:30080 # works
curl http://node-2-ip:30080 # works (even if no api pods run on node-2)
curl http://node-3-ip:30080 # works
The traffic flow: Client → NodeIP:30080 → iptables DNAT → PodIP:8080
NodePort opens the port on ALL nodes, not just nodes running the target pods. If you have 100 nodes and only 3 run your pods, all 100 nodes accept traffic on port 30080 and forward it to one of the 3 pods. This is useful because external load balancers can point at all nodes without knowing which ones run the pods.
externalTrafficPolicy: Cluster vs Local
This is one of the most important and least understood Service settings.
apiVersion: v1
kind: Service
metadata:
  name: api-service
spec:
  type: NodePort
  externalTrafficPolicy: Local   # or Cluster (default)
Cluster (default): Traffic arriving at any node is distributed to ALL pods in the Service, regardless of which node they are on. This means a request might arrive at Node A, get forwarded to a pod on Node C. The downside: an extra network hop, and the source IP is lost (because the node SNATs the packet).
Local: Traffic arriving at a node is only sent to pods running ON THAT NODE. If no pods run on that node, the traffic is dropped; external load balancers rely on health checks against the NodePort to stop sending traffic to such nodes. The upside: no extra hop, and the source IP is preserved. The downside: uneven distribution if pods are not evenly spread across nodes.
# externalTrafficPolicy: Cluster (default)
# Client (1.2.3.4) → Node A:30080
# → iptables SNATs source to Node A IP
# → Forwards to Pod on Node C
# → Pod sees source IP = Node A (NOT the client)
# externalTrafficPolicy: Local
# Client (1.2.3.4) → Node A:30080
# → iptables forwards to Pod on Node A (local only)
# → Pod sees source IP = 1.2.3.4 (the real client IP!)
Our security team needed client source IPs for audit logging. All our Services were using the default externalTrafficPolicy: Cluster, which meant every request appeared to come from an internal node IP. We switched to Local, but immediately saw uneven load: 2 nodes had 3 pods each and got most traffic, while 3 other nodes had 1 pod each and got less. The fix was a combination of Local policy and a PodAntiAffinity rule to spread pods evenly across nodes.
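The anti-affinity half of that fix can be sketched as a Deployment fragment. This is illustrative, not the incident's actual manifest: the `app: api` labels, replica count, and image are assumptions.

```yaml
# Spread pods across nodes so externalTrafficPolicy: Local distributes
# evenly. "preferred" (soft) anti-affinity still schedules pods when
# there are more replicas than nodes; "required" would block them.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  replicas: 5
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: api
                topologyKey: kubernetes.io/hostname
      containers:
        - name: api
          image: example/api:1.0   # placeholder image
          ports:
            - containerPort: 8080
```

On newer clusters, topologySpreadConstraints with `topologyKey: kubernetes.io/hostname` achieves the same spreading with more direct control over allowed skew.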
Part 3: LoadBalancer — Cloud Integration
LoadBalancer type builds on NodePort. It creates a NodePort Service AND asks the cloud provider to create an external load balancer pointing at the NodePort on all nodes.
apiVersion: v1
kind: Service
metadata:
  name: api-service
  annotations:
    # AWS-specific: use NLB instead of Classic LB
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    # AWS: make it internal (VPC only)
    service.beta.kubernetes.io/aws-load-balancer-internal: "true"
spec:
  type: LoadBalancer
  selector:
    app: api
  ports:
    - port: 80
      targetPort: 8080
      protocol: TCP
$ kubectl get svc api-service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
api-service LoadBalancer 10.96.45.123 a1b2c3-1234567890.elb.amazonaws.com 80:31245/TCP 2m
How It Works Under the Hood
The cloud controller manager (running in the cluster or as a cloud-hosted component) watches for Services of type LoadBalancer. When it sees one, it:
- Creates a cloud load balancer (AWS NLB/CLB, GCP Network LB, Azure LB)
- Configures health checks pointing at the NodePort on all nodes
- Registers all nodes as targets
- Updates the Service with the external IP/hostname in the `status.loadBalancer.ingress` field
The "EXTERNAL-IP pending" problem almost always has one of three causes: (1) the cloud controller manager is not running or lacks valid cloud credentials; (2) you hit a cloud quota limit (e.g., max Elastic IPs, max load balancers); (3) you are running on bare metal with no cloud integration — use MetalLB instead. Check the cloud controller manager logs first: `kubectl logs -n kube-system -l component=cloud-controller-manager`.
Cloud Provider Specifics
# AWS: the legacy in-tree provider creates a Classic LB by default;
# the AWS Load Balancer Controller (or the "nlb" annotation) provisions NLBs.
# Annotations control LB behavior:
service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"
service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
service.beta.kubernetes.io/aws-load-balancer-ssl-cert: "arn:aws:acm:..."
# GCP: Creates a Network Load Balancer (L4)
# For L7, use GKE Ingress with BackendConfig instead
# Azure: Creates an Azure Load Balancer
# Annotations control SKU and internal/external
service.beta.kubernetes.io/azure-load-balancer-internal: "true"
Each LoadBalancer Service creates a separate cloud load balancer. If you have 20 services of type LoadBalancer, you have 20 cloud load balancers — each with its own IP, its own cost, and its own management overhead. This gets expensive fast. Use an Ingress Controller with a single LoadBalancer Service instead, and route multiple services through path-based or host-based rules.
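As a sketch of that consolidation, a single Ingress can front several Services through one LoadBalancer via host-based rules. The hostnames, service names, and ingress class here are assumptions for illustration:

```yaml
# One cloud LB (in front of the ingress controller) serving multiple
# Services. Each rule routes a hostname to a different backend Service.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: edge
spec:
  ingressClassName: nginx        # assumes an nginx ingress controller
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api-service
                port:
                  number: 80
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web-service   # hypothetical second Service
                port:
                  number: 80
```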
Part 4: ExternalName and Headless Services
ExternalName — DNS Alias
ExternalName is the simplest Service type. It creates a DNS CNAME record. No proxying, no load balancing, no iptables rules.
apiVersion: v1
kind: Service
metadata:
  name: database
spec:
  type: ExternalName
  externalName: mydb.abc123.us-east-1.rds.amazonaws.com
# Inside any pod:
dig database.default.svc.cluster.local
# Returns: CNAME mydb.abc123.us-east-1.rds.amazonaws.com
# Your app connects to "database:5432" and it resolves to the RDS endpoint.
# If you migrate databases, change the ExternalName — no app changes needed.
ExternalName is useful for giving cluster-internal DNS names to external services (RDS databases, SaaS APIs, services in other clusters). It lets your application code reference database.default.svc.cluster.local instead of a cloud-specific hostname. If you migrate from AWS RDS to a self-hosted database inside the cluster, you just change the Service type from ExternalName to ClusterIP — no application changes.
Headless Service (clusterIP: None)
A headless Service has no ClusterIP. Instead of returning a virtual IP, DNS returns the IP addresses of all pods directly.
apiVersion: v1
kind: Service
metadata:
  name: database
spec:
  clusterIP: None   # Headless — no virtual IP
  selector:
    app: database
  ports:
    - port: 5432
# Normal Service DNS:
dig api-service.default.svc.cluster.local
# Returns: 10.96.45.123 (ClusterIP — virtual IP)
# Headless Service DNS:
dig database.default.svc.cluster.local
# Returns: 10.244.1.5, 10.244.2.8, 10.244.3.2 (actual pod IPs!)
When to use headless Services:
- StatefulSets — each pod needs a stable DNS name (`database-0.database.default.svc.cluster.local`)
- Client-side load balancing — the client (or client library) picks which pod to connect to
- Service discovery — the client needs to know all pod IPs (e.g., for gossip protocols, Elasticsearch cluster formation)
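Client-side load balancing over a headless Service can be sketched like this. Inside a pod you would obtain the pod IPs by resolving the headless name (e.g., via `socket.getaddrinfo("database.default.svc.cluster.local", 5432)`); here the resolved IPs are stubbed so the logic stands alone:

```python
import itertools

# Pod IPs as a headless-Service DNS lookup would return them (stubbed).
pod_ips = ["10.244.1.5", "10.244.2.8", "10.244.3.2"]

# Simple client-side round robin over the resolved pod IPs. Sorting
# gives a stable order regardless of DNS shuffling between lookups.
picker = itertools.cycle(sorted(pod_ips))

def next_backend():
    """Return the (ip, port) pair the client should dial next."""
    return (next(picker), 5432)

for _ in range(3):
    print(next_backend())  # rotates through all three pod IPs
```

A real client would also re-resolve periodically, since headless DNS answers change as pods come and go.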
Kubernetes Service Types — Layered
- LoadBalancer: Everything below, PLUS creates a cloud load balancer with an external IP/hostname. The cloud controller manager provisions the LB and registers nodes as targets. Each LoadBalancer Service creates a separate cloud LB.
- NodePort: Everything below, PLUS opens a port (30000-32767) on every node in the cluster. External traffic reaches NodeIP:NodePort and gets forwarded to a pod via iptables/IPVS. externalTrafficPolicy controls routing and source IP preservation.
- ClusterIP: The foundation. Assigns a virtual IP from the Service CIDR. kube-proxy programs iptables/IPVS/eBPF rules on every node to DNAT traffic from ClusterIP:port to a randomly selected PodIP:targetPort. Only reachable from inside the cluster.
- Endpoints: The Endpoints controller watches pods matching the Service selector and populates the Endpoints object with their IPs and ports. This is the source of truth for which pods receive traffic. No matching pods = empty Endpoints = Service routes nowhere.
- Pods: The actual application containers. Each pod has a unique IP assigned by the CNI. Pods are ephemeral — their IPs change on restart. This is why Services exist: to provide a stable IP in front of changing pod IPs.
Part 5: Debugging Service Connectivity
When a Service is not routing traffic, work through this checklist:
# Step 1: Does the Service exist?
kubectl get svc api-service -n production
# Check: TYPE, CLUSTER-IP, PORT(S), EXTERNAL-IP
# Step 2: Do Endpoints exist?
kubectl get endpoints api-service -n production
# If empty: no pods match the selector, or pods are not Ready
# CRITICAL: selector labels must EXACTLY match pod labels
# Step 3: Does DNS resolve?
kubectl run debug --image=nicolaka/netshoot -it --rm -- \
dig api-service.production.svc.cluster.local
# Should return the ClusterIP
# Step 4: Can you reach the ClusterIP from inside the cluster?
kubectl run debug --image=nicolaka/netshoot -it --rm -- \
curl -v http://api-service.production.svc.cluster.local:80
# If timeout: check NetworkPolicy, kube-proxy logs, iptables rules
# Step 5: Can you reach the pod directly (bypassing the Service)?
kubectl run debug --image=nicolaka/netshoot -it --rm -- \
curl -v http://10.244.1.5:8080
# If this works but Step 4 fails: the issue is in kube-proxy/iptables
# Step 6: Check kube-proxy logs
kubectl logs -n kube-system -l k8s-app=kube-proxy | tail -50
# Step 7: Check iptables rules on the node (if iptables mode)
# SSH to the node, then:
iptables -t nat -L KUBE-SERVICES | grep api-service
A developer created a Service with selector: {app: api} but their pods had label app: api-server. The Service had zero Endpoints. Traffic went nowhere. They spent 2 hours checking NetworkPolicies, DNS, and firewall rules. The fix was changing one label. Always start debugging with kubectl get endpoints — if it is empty, the problem is in your selector, not the network.
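The matching rule behind that anecdote is plain subset matching: every key/value pair in the Service selector must appear in the pod's labels, while extra pod labels are ignored. A minimal sketch of that check:

```python
def selector_matches(selector: dict, pod_labels: dict) -> bool:
    """Subset match, as applied when populating Endpoints: every
    selector key/value must be present in the pod's labels. Extra
    pod labels are fine; one mismatched value means no match."""
    return all(pod_labels.get(k) == v for k, v in selector.items())

# The anecdote's bug: selector says "api", pods say "api-server".
print(selector_matches({"app": "api"}, {"app": "api-server"}))      # False
print(selector_matches({"app": "api"}, {"app": "api", "env": "p"})) # True
```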
Kubernetes 1.21+ introduced EndpointSlices as a more scalable replacement for Endpoints. If your Service has hundreds of pods, check kubectl get endpointslices instead of kubectl get endpoints. EndpointSlices break the endpoint list into smaller chunks (default 100 per slice) to reduce API server load when pods change frequently.
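The chunking itself is straightforward; a sketch of how a list of endpoints breaks into slice-sized groups (the 100-per-slice default is configurable on the controller):

```python
def slice_endpoints(endpoints, max_per_slice=100):
    """Split an endpoint list into EndpointSlice-sized chunks
    (the EndpointSlice controller defaults to 100 per slice)."""
    return [endpoints[i:i + max_per_slice]
            for i in range(0, len(endpoints), max_per_slice)]

# 250 pod IPs -> 3 slices; only the slices that change get rewritten
# when pods churn, instead of one giant Endpoints object.
ips = [f"10.244.0.{i}" for i in range(250)]
slices = slice_endpoints(ips)
print([len(s) for s in slices])  # [100, 100, 50]
```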
Key Concepts Summary
- ClusterIP is the foundation — a virtual IP that exists only in iptables/IPVS/eBPF rules, reachable only from inside the cluster
- kube-proxy iptables mode uses random selection, O(n) rule evaluation — fine for small clusters, degrades at scale
- kube-proxy IPVS mode provides real load balancing algorithms and O(1) lookups — switch to it for clusters with more than 1,000 Services
- eBPF (Cilium) replaces kube-proxy entirely with socket-level redirection — most performant option
- NodePort opens a port on every node (30000-32767) — all nodes route traffic, not just nodes with pods
- externalTrafficPolicy: Local preserves client source IPs but requires even pod distribution across nodes
- LoadBalancer creates a cloud load balancer automatically — each Service creates a separate LB, which gets expensive
- ExternalName is a DNS CNAME alias — no proxying, useful for referencing external services with cluster-internal DNS
- Headless Services (clusterIP: None) return pod IPs directly in DNS — required for StatefulSets and client-side load balancing
- Empty Endpoints is the number one cause of "Service not routing" — always check selector labels match pod labels
Common Mistakes
- Assuming you can ping a ClusterIP — it is a virtual IP in iptables rules, not a real network interface, and ping (ICMP) is not translated by DNAT rules
- Using `externalTrafficPolicy: Local` without pod anti-affinity — leads to severely uneven load distribution
- Creating many LoadBalancer Services instead of one Ingress Controller — wastes cloud load balancers and money
- Forgetting that NodePort range is 30000-32767 — specifying a port outside this range will be rejected
- Mismatched selector labels between Service and pods — the most common cause of empty Endpoints
- Not checking EndpointSlices in large clusters — the Endpoints object is capped at 1,000 addresses, so `kubectl get endpoints` can understate what the EndpointSlices actually hold
- Leaving `externalTrafficPolicy: Cluster` when source IP preservation is needed — all requests appear to come from internal node IPs
You have a NodePort Service with externalTrafficPolicy: Local. Node A has 2 pods, Node B has 0 pods. What happens when traffic arrives at Node B on the NodePort?