
A Pod in Your Cluster Just Got Compromised. Walk Me Through the Blast Radius.

One container gets popped — an RCE in an app, a malicious dependency, a leaked token. The junior answer is 'kill the pod.' The senior answer traces the blast radius: from the mounted ServiceAccount token to the API server, across a flat pod network to the cloud metadata endpoint, and through a privileged pod to the node and every secret on it. The attacker's path layer by layer, and the single control that caps the damage at each one — the difference between 'one pod' and 'whole cluster.'

By Sharon Sahadevan · 13 min read

"A pod in your cluster just had its container compromised — an RCE in the app, say. Walk me through the blast radius. What can the attacker reach, and what stops them?"

This is one of the most revealing questions in a senior Kubernetes security interview, and the answer that fails is the reflexive one: "I'd kill the pod and roll the deployment." That contains the symptom, not the breach. It also tells the interviewer you think of a compromised pod as an isolated event rather than the start of a path. The whole point of the question is the path — how far one container compromise propagates before something stops it, and whether anything stops it at all.

In a default-configured cluster, the honest answer is alarmingly far. Kubernetes optimizes for developer velocity out of the box, and most of its security boundaries are opt-in. A pod compromise in a cluster nobody hardened can become a cluster compromise in a handful of steps. In a cluster someone did harden, the same compromise dead-ends at the pod. This post is that path, layer by layer, and the single control that caps the damage at each one — because "blast radius" is not a vibe, it is a specific chain of reachable resources, and reasoning about it is the entire skill.

KEY CONCEPT

The security question is never "did a pod get compromised" — assume it will. The question is "how far does one container compromise propagate before a control stops it." That distance is your blast radius, and it is set entirely by configuration you chose (or didn't): RBAC scope, token mounting, network policy, pod security level, egress control. A flat default cluster collapses to "one pod = whole cluster." A hardened one holds the line at "one pod = one pod." Same breach, completely different incident.

Step 1: the ServiceAccount token sitting in the pod

The first thing a competent attacker does inside a container is look for credentials, and Kubernetes hands them one by default. Unless you opted out, every pod has a ServiceAccount token mounted at /var/run/secrets/kubernetes.io/serviceaccount/:

/var/run/secrets/kubernetes.io/serviceaccount/
├── token         # a JWT, bearer credential for the Kubernetes API
├── ca.crt        # the API server's CA, so the attacker can trust the endpoint
└── namespace     # which namespace this pod runs in

That token is a bearer credential to the Kubernetes API. The attacker reads it, points curl (or a smuggled kubectl) at the in-cluster API endpoint, and now acts as that ServiceAccount. How much that buys them is entirely a function of RBAC — what that SA is authorized to do.
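The "what can this identity do" question is a single API call, and defenders can ask it of their own workloads just as easily. The sketch below only builds (it does not send) a SelfSubjectRulesReview request, the same API that `kubectl auth can-i --list` relies on. The helper name `build_rules_review` and the default `https://kubernetes.default.svc` address are assumptions for illustration.

```python
import json
import urllib.request


def build_rules_review(token: str, namespace: str,
                       api: str = "https://kubernetes.default.svc") -> urllib.request.Request:
    """Build (not send) a SelfSubjectRulesReview for the token's own identity."""
    body = {
        "apiVersion": "authorization.k8s.io/v1",
        "kind": "SelfSubjectRulesReview",
        "spec": {"namespace": namespace},  # rules are reported per namespace
    }
    return urllib.request.Request(
        url=f"{api}/apis/authorization.k8s.io/v1/selfsubjectrulesreviews",
        data=json.dumps(body).encode(),
        headers={
            "Authorization": f"Bearer {token}",  # the mounted JWT is the only credential needed
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Running this against a workload's own token during an audit shows the same rule list an attacker would see after stealing it.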

This is where most of the real-world damage lives, because the default is better than people fear but the common practice is worse. A freshly created default ServiceAccount has essentially no permissions under modern RBAC. But clusters drift: an app was granted list secrets in its namespace "to read a config," a Helm chart bound a broad role "to make it work," someone attached a ClusterRole with * verbs because debugging was faster that way. The attacker enumerates exactly this — what can this token do? — and the answer is the next link in the chain.

  • The token is just a JWT, and the API server validates it the same way any resource server validates a bearer token — which is why the JWT validation post is more than tangential here: the same signature, audience, and expiry checks are what stand between a stolen token and impersonation.
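To make "just a JWT" concrete, here is a minimal sketch that decodes the payload segment of a token with the standard library only. It deliberately does not verify the signature, so it is an inspection aid and not a validation routine; the function name `jwt_claims` is hypothetical.

```python
import base64
import json


def jwt_claims(token: str) -> dict:
    """Decode the payload of a JWT WITHOUT verifying its signature."""
    payload = token.split(".")[1]           # header.payload.signature
    payload += "=" * (-len(payload) % 4)    # restore the base64 padding JWTs strip
    return json.loads(base64.urlsafe_b64decode(payload))
```

On a ServiceAccount token the interesting claims are typically `sub` (of the form `system:serviceaccount:<namespace>:<name>`), `aud`, and `exp`: who the token speaks for, who should accept it, and how long a stolen copy stays useful.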

The control: least-privilege RBAC, audited regularly, plus automountServiceAccountToken: false on every pod that does not actually call the API (which is most of them). If the compromised pod never needed API access, the token should not have been there to steal. A cluster with 47 ClusterRoleBindings to cluster-admin — the audit scenario from the security interview loop — is a cluster where this step is a coin flip away from total compromise.
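Whether a token actually lands in a pod depends on two fields, and the pod-level one wins. A rough sketch of that precedence, assuming manifests have already been loaded into plain dicts (the function name is hypothetical):

```python
from typing import Optional


def token_is_mounted(pod_spec: dict, service_account: Optional[dict] = None) -> bool:
    """Return True if the API token would be auto-mounted into this pod."""
    # An explicit pod-level setting takes precedence over the ServiceAccount.
    if "automountServiceAccountToken" in pod_spec:
        return bool(pod_spec["automountServiceAccountToken"])
    # Otherwise the ServiceAccount object decides, if it sets the field.
    if service_account and "automountServiceAccountToken" in service_account:
        return bool(service_account["automountServiceAccountToken"])
    # Neither set: Kubernetes mounts the token.
    return True
```

Running a check like this over every workload gives the list of pods carrying a credential; the audit question is then which of them actually call the API.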

Step 2: from the API to more credentials and more pods

Suppose the token can read Secrets in its namespace. Now the blast radius includes every credential those Secrets hold — database passwords, third-party API keys, other service tokens — and the breach starts spreading outside the cluster, into whatever those credentials unlock.

Suppose instead the token can create pods. That is quietly one of the most dangerous permissions in Kubernetes, because the attacker no longer needs to escape the current pod — they can schedule a new one with whatever they want: a pod that mounts the host filesystem via hostPath, runs privileged, and tolerates control-plane taints to land on a control-plane node. create pods plus no admission control is a direct path to node and control-plane access without ever touching a kernel exploit.

The control: RBAC that treats create/update pods, create rolebindings, escalate, impersonate, and Secret read access as the high-privilege verbs they are, granted to almost nothing — and admission control (Pod Security Admission, OPA Gatekeeper, or Kyverno) that refuses to admit the privileged, host-mounting pod even if the attacker is authorized to create pods. Authorization says who can ask; admission says what is allowed to exist. You need both.
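An RBAC audit for these grants can start very small. The following is a sketch under simplifying assumptions: it takes a Role or ClusterRole as a dict, ignores `apiGroups` and `resourceNames`, and uses a high-risk list that is my own reading of the verbs named above rather than any official set.

```python
# (resource, verb) pairs treated as high-privilege; an illustrative list, not a standard.
HIGH_RISK = {
    ("pods", "create"), ("pods", "update"), ("pods", "patch"),
    ("secrets", "get"), ("secrets", "list"), ("secrets", "watch"),
    ("rolebindings", "create"), ("clusterrolebindings", "create"),
}
# Verbs that are dangerous on whatever resource they are granted for.
HIGH_RISK_VERBS = {"escalate", "bind", "impersonate"}


def risky_grants(role: dict) -> list:
    """Return a human-readable finding for each high-privilege grant in the role."""
    findings = []
    for rule in role.get("rules", []):
        verbs = set(rule.get("verbs", []))
        resources = set(rule.get("resources", []))
        if "*" in verbs or "*" in resources:
            findings.append("wildcard verbs or resources")
            continue
        findings.extend(sorted(verbs & HIGH_RISK_VERBS))
        for resource in sorted(resources):
            for verb in sorted(verbs):
                if (resource, verb) in HIGH_RISK:
                    findings.append(f"{verb} {resource}")
    return findings
```

An empty result does not mean the role is safe; it only means none of the listed patterns matched, so the list needs to grow with whatever the cluster actually runs.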

Step 3: the flat network and the metadata endpoint

While the API path plays out, the attacker also probes the network — and here the default is genuinely dangerous. Kubernetes pod networking is flat by default: every pod can reach every other pod, the API server, and the node's link-local addresses, with no policy in the way. Absent a NetworkPolicy, a compromised pod in the frontend namespace can open connections to your database pods in data, your internal admin services, and everything in between. Lateral movement is the default posture, not an exploit.

The sharpest edge is the cloud metadata endpoint at 169.254.169.254. From inside almost any pod on a managed cluster, that address is reachable, and on AWS (without IMDSv2 hardening) it will hand back the node's instance IAM role credentials. Those are typically far broader than anything the pod should have — pull from ECR, read S3, sometimes describe or modify infrastructure. A single compromised pod reaching IMDS can mean cloud-account credentials, which is a breach that has now left Kubernetes entirely. This is the cloud-credential cousin of the federation mechanics in the GitHub Actions OIDC to AWS post: the same instance-role and token plumbing that makes CI/CD convenient is what an attacker harvests from a popped pod.

The control: a default-deny NetworkPolicy in every namespace (deny all ingress and egress, then allow only what each workload needs), which both stops lateral movement and, critically, blocks egress to 169.254.169.254. On AWS, enforce IMDSv2 with a hop limit of 1 so a pod on the pod network cannot reach the node's metadata credentials at all (a hostNetwork pod shares the node's network namespace and still can). Most clusters write a few ingress policies and forget egress entirely; the egress rule is the one that contains a breach.
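For reference, a default-deny policy is short enough to generate per namespace. A minimal sketch that builds the manifest as a dict and prints it as JSON, which `kubectl apply -f` accepts alongside YAML; the policy name `default-deny-all` is an arbitrary choice.

```python
import json


def default_deny(namespace: str) -> dict:
    """Build a NetworkPolicy that denies all ingress and egress in a namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            "podSelector": {},                     # empty selector: every pod in the namespace
            "policyTypes": ["Ingress", "Egress"],  # both listed, no rules given: both denied
        },
    }


if __name__ == "__main__":
    print(json.dumps(default_deny("frontend"), indent=2))
```

Two caveats apply before rolling this out: the policy only has an effect if the cluster's network plugin enforces NetworkPolicy, and denying all egress also denies DNS, so an allow rule for the cluster DNS service usually has to ship alongside it.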

WARNING

The most common network-security mistake is treating NetworkPolicy as inbound-only. Teams write ingress rules to restrict who can reach a service and consider it done, leaving egress wide open. But the blast radius of a compromised pod is almost entirely an egress story: reaching other pods, reaching the metadata endpoint, exfiltrating data, calling home. A policy that controls ingress and ignores egress locks the front door and leaves the back door open. Default-deny has to mean both directions.
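One way to test an existing policy for this gap is to ask whether its own rules would still let traffic reach the metadata address. The sketch below assumes `policyTypes` is set explicitly and ignores `ports`, so it errs on the side of reporting exposure; the function name is hypothetical.

```python
import ipaddress

METADATA_IP = ipaddress.ip_address("169.254.169.254")


def egress_allows_metadata(policy: dict) -> bool:
    """True if this NetworkPolicy's rules would permit egress to the metadata IP."""
    spec = policy.get("spec", {})
    if "Egress" not in spec.get("policyTypes", []):
        return True  # ingress-only policy: egress is not restricted at all
    for rule in spec.get("egress", []):
        peers = rule.get("to")
        if not peers:
            return True  # a rule without "to" matches every destination
        for peer in peers:
            block = peer.get("ipBlock")
            if not block:
                continue  # selectors match pods, not the metadata address
            if METADATA_IP not in ipaddress.ip_network(block["cidr"]):
                continue
            carved_out = any(METADATA_IP in ipaddress.ip_network(cidr)
                             for cidr in block.get("except", []))
            if not carved_out:
                return True
    return False
```

A common allow-the-internet rule such as `0.0.0.0/0` passes this check only when it carries an `except` entry covering `169.254.169.254/32` or the wider link-local range.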

Step 4: container escape to the node

If the compromised pod is privileged — or has hostPath mounts, hostPID, hostNetwork, or dangerous capabilities like CAP_SYS_ADMIN — the attacker may not need any of the above. They can escape the container to the node directly. A pod that mounts the host root filesystem can write to it; a privileged pod can manipulate the host's devices and cgroups; hostPID exposes every process on the node. Once on the node, the attacker has the kubelet's credentials and access to every container running there — including the secrets mounted into those pods.

Node Authorization and the NodeRestriction admission plugin limit a kubelet to the secrets of pods actually scheduled to its node, so a single node compromise is not automatically a cluster compromise. It is still every workload on that node, and if a privileged or control-plane-adjacent pod happens to run there, the escalation continues. The ultimate target behind all of this is etcd, where every Secret and the entire cluster state live, often unencrypted because encryption at rest is opt-in; reach etcd and the cluster is wholly owned.
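Whether Secrets sit in etcd as plaintext comes down to the API server's EncryptionConfiguration, where the first provider listed for a resource is the one used for writes. A hedged sketch of that check on a config loaded into a dict; it only looks for a literal `secrets` entry and does not handle wildcard resource entries.

```python
def secrets_written_encrypted(config: dict) -> bool:
    """True if new Secret writes would be encrypted under this EncryptionConfiguration."""
    for entry in config.get("resources", []):
        if "secrets" not in entry.get("resources", []):
            continue
        providers = entry.get("providers", [])
        # Only the first provider encrypts writes; "identity" means store as-is.
        return bool(providers) and "identity" not in providers[0]
    return False  # no entry for secrets: nothing encrypts them
```

Listing `identity` first, or not at all configuring the file, both lead to the same outcome: anyone who can read etcd can read every credential in the cluster.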

The control: Pod Security Admission set to restricted (or an equivalent Gatekeeper/Kyverno policy) on every namespace that runs application workloads. restricted forbids privileged containers, host namespaces, host path mounts, and added capabilities — it removes the escape primitives before they can be used. This is also the control most disrupted by migration: moving from the old PodSecurityPolicy to Pod Security Admission breaks workloads that quietly depended on privileges, which is why the audit-then-enforce rollout (label namespaces warn/audit first, find what breaks, fix it, then enforce) is the pattern that does not cause an outage.
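Before flipping enforcement on, it helps to know which running workloads carry these primitives. Below is a simplified sketch that scans a pod spec dict for the escape-relevant fields named in this step. It is not a reimplementation of the restricted profile, which also covers things like running as non-root, seccomp, and privilege escalation, and it skips ephemeral containers.

```python
HOST_NAMESPACE_FIELDS = ("hostNetwork", "hostPID", "hostIPC")


def escape_primitives(pod_spec: dict) -> list:
    """List the container-escape primitives present in a pod spec."""
    findings = []
    for field in HOST_NAMESPACE_FIELDS:
        if pod_spec.get(field):
            findings.append(field)
    for volume in pod_spec.get("volumes", []):
        if "hostPath" in volume:
            findings.append(f"hostPath:{volume['hostPath'].get('path', '?')}")
    containers = pod_spec.get("containers", []) + pod_spec.get("initContainers", [])
    for container in containers:
        ctx = container.get("securityContext", {})
        if ctx.get("privileged"):
            findings.append(f"{container['name']}: privileged")
        # restricted permits adding only NET_BIND_SERVICE
        added = set(ctx.get("capabilities", {}).get("add", [])) - {"NET_BIND_SERVICE"}
        for cap in sorted(added):
            findings.append(f"{container['name']}: cap {cap}")
    return findings
```

Note that capability names in a pod spec drop the `CAP_` prefix, so `CAP_SYS_ADMIN` appears as `SYS_ADMIN`.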

Step 5: did anyone even see it?

Run the whole chain back and ask the question that actually decides the incident: would you know? Every step above generates signal — anomalous API calls from a workload SA, a pod reaching the metadata endpoint, a container spawning a shell, an unexpected create pods. Whether that signal is captured is its own control surface, and it is the difference between a contained incident and a forensic guessing game weeks later.

  • API audit logging with a real audit policy captures the token's API calls — the enumeration, the Secret reads, the pod creation. Without it, you cannot reconstruct what the attacker touched.
  • Runtime detection (Falco, Tetragon, eBPF-based monitoring) catches the post-exploitation behavior — a shell in a container that should never spawn one, a process reading the SA token path, an outbound connection to 169.254.169.254. These fire on the actions, not the configuration, so they catch the steps your preventive controls missed.
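The three runtime signals above reduce to simple predicates over process, file, and network events. The toy matcher below illustrates the logic only: the event schema (`type`, `binary`, `path`, `dest_ip`) is invented for this example and is not the Falco or Tetragon event format or rule syntax.

```python
SHELLS = {"sh", "bash", "dash", "ash", "zsh"}
TOKEN_DIR = "/var/run/secrets/kubernetes.io/serviceaccount/"
METADATA_IP = "169.254.169.254"


def classify(event: dict):
    """Return an alert string for a suspicious event, or None if it looks benign."""
    kind = event.get("type")
    if kind == "exec" and event.get("binary", "").rsplit("/", 1)[-1] in SHELLS:
        return "shell spawned in container"
    if kind == "open" and event.get("path", "").startswith(TOKEN_DIR):
        return "ServiceAccount token read"
    if kind == "connect" and event.get("dest_ip") == METADATA_IP:
        return "connection to cloud metadata endpoint"
    return None
```

In practice each rule needs an allowlist: workloads that legitimately call the API read their own token on startup, and some images run shell entrypoints, so the alert is only useful once expected behavior is excluded.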

The control: audit logging tuned to record the high-value verbs (not everything — that drowns you), shipped off-cluster so an attacker who reaches the node cannot erase it, plus runtime detection with alerts wired to something a human reads. Detection does not shrink the blast radius; it bounds the time the attacker operates undetected, which is the other axis of how bad an incident gets.
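As an illustration of "high-value verbs, not everything," here is one possible audit policy expressed as a dict, plus a small function that mimics the first-matching-rule evaluation the API server applies. The rule selection is an assumption for this example, and the matcher ignores users, namespaces, and stages.

```python
AUDIT_POLICY = {
    "apiVersion": "audit.k8s.io/v1",
    "kind": "Policy",
    "rules": [
        # Secret access: record who and when, never the payload.
        {"level": "Metadata",
         "resources": [{"group": "", "resources": ["secrets"]}]},
        # Pod and RBAC writes: keep the request body for reconstruction.
        {"level": "Request", "verbs": ["create", "update", "patch"],
         "resources": [
             {"group": "", "resources": ["pods"]},
             {"group": "rbac.authorization.k8s.io",
              "resources": ["roles", "rolebindings", "clusterroles", "clusterrolebindings"]},
         ]},
        # Everything else: not logged in this sketch.
        {"level": "None"},
    ],
}


def _matches(target: dict, group: str, resource: str) -> bool:
    names = target.get("resources", [])
    # An empty resource list means every resource in the group.
    return target.get("group", "") == group and (not names or resource in names)


def audit_level(policy: dict, verb: str, group: str, resource: str) -> str:
    """Return the level of the first rule that matches the request."""
    for rule in policy["rules"]:
        if "verbs" in rule and verb not in rule["verbs"]:
            continue
        targets = rule.get("resources")
        if targets is not None and not any(_matches(t, group, resource) for t in targets):
            continue
        return rule["level"]
    return "None"
```

Rule order matters because evaluation stops at the first match. The catch-all `None` here is the aggressive end of the trade-off; a catch-all at `Metadata` costs more volume but would also record the attacker's read-only enumeration calls.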

The reasoning the question is testing

Notice the shape of a strong answer. It does not list security products. It traces a path (token, API, network, node, etcd), and at each step of that path it names both the attacker's next move and the specific control that caps it. That is the move a senior security interview is built to surface, and it is the same move that runs a real incident review: not "are we secure" (a question with no answer) but "for each layer, how far does a compromise propagate, and what stops it."

It also forces the trade-off conversation that separates seniority from certification knowledge. Default-deny egress will break workloads that legitimately call external services, so you need the egress allowlist and the rollout plan. restricted Pod Security breaks privileged workloads, so you need audit-then-enforce. Tight RBAC breaks the app that was quietly reading Secrets it shouldn't, so you need the audit before the lockdown. Every control has an operational cost, and the candidate who can sequence the lockdown without taking production down is the one who has actually done it. This is the security-round counterpart to the reasoning gap in "Most Courses Teach Tools. Senior DevOps Interviews Test Architecture.": knowing what restricted does is the knowledge answer; sequencing the migration without an outage is the reasoning answer.
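The audit-then-enforce sequencing for Pod Security Admission comes down to which namespace labels are set at each stage. A minimal sketch, where the phase names `observe` and `enforce` and the default starting level are assumptions for illustration; the label keys themselves are the standard `pod-security.kubernetes.io` ones.

```python
PREFIX = "pod-security.kubernetes.io"


def psa_labels(phase: str, target: str = "restricted", current: str = "privileged") -> dict:
    """Namespace labels for a two-phase Pod Security Admission rollout."""
    if phase == "observe":
        # Keep admitting what runs today, but surface every violation of the target level.
        return {f"{PREFIX}/enforce": current,
                f"{PREFIX}/audit": target,
                f"{PREFIX}/warn": target}
    if phase == "enforce":
        # Violations have been fixed: reject anything below the target level.
        return {f"{PREFIX}/enforce": target,
                f"{PREFIX}/audit": target,
                f"{PREFIX}/warn": target}
    raise ValueError(f"unknown phase: {phase!r}")
```

The first phase changes nothing about what is admitted, which is what makes it safe to apply broadly; the second phase is only applied per namespace once the audit and warning output for that namespace has gone quiet.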

Common mistakes

Mounting ServiceAccount tokens into pods that never call the API. The default is to mount; the right default for most workloads is automountServiceAccountToken: false. A token that isn't there can't be stolen.

RBAC that drifted to permissive. cluster-admin bindings handed out for convenience, list secrets granted "temporarily," wildcard verbs in app roles. Audit for the high-privilege verbs (create pods, escalate, impersonate, bind, Secret access) and shrink relentlessly.

Ingress-only NetworkPolicy. The blast radius is an egress story. Default-deny must cover egress, and the metadata endpoint must be explicitly blocked.

No Pod Security enforcement. Without restricted (or equivalent), one misconfigured deployment with privileged: true is a node-escape primitive sitting in your cluster waiting to be used.

Reachable cloud metadata. A pod that can hit 169.254.169.254 can often grab the node's IAM role. Enforce IMDSv2 hop-limit 1 and block egress to the link-local range.

Plaintext Secrets in etcd, no encryption at rest. etcd is the highest-value target; if it is unencrypted, reaching it hands over every credential at once. Turn on encryption at rest and lock down etcd access.

No audit log, no runtime detection. If you cannot see the API calls and the in-container behavior, you cannot scope the breach or even know it happened. Detection bounds dwell time; skipping it makes every incident maximal.

Treating "kill the pod" as remediation. Killing the pod removes the foothold if that was the only foothold. After a real compromise you rotate the credentials it could reach, review audit logs for what it touched, and assume lateral movement until proven otherwise.

The mental model

A Kubernetes cluster is not secure or insecure as a binary — it has a blast radius, the measured distance a single pod compromise travels before a control stops it, and your job is to make that distance as short as possible while keeping production running. The defaults make it long: a token in every pod, a flat network, reachable metadata, no enforced pod security, no egress control. Each hardening step is a wall placed across the attacker's path — least-privilege RBAC and no auto-mounted tokens cap step 1, admission control and verb-scoped RBAC cap step 2, default-deny egress and IMDSv2 cap step 3, restricted Pod Security caps step 4, audit and runtime detection bound the time across all of them.

You will never prevent every compromise; that is the wrong goal and a senior interviewer knows it. The goal is that when a pod is compromised (and one eventually will be), the answer to "walk me through the blast radius" is short, because every layer the attacker reaches for has a wall already standing. That is what it means to secure a cluster the way an attacker thinks: not a checklist of products, but a path you have already walked, and closed, before they get there.


The full attacker's-eye path through a cluster is the Kubernetes Security course: API server hardening, the RBAC and ServiceAccount-token deep dive, STRIDE threat modeling, Pod Security Admission with the PSP-migration patterns, default-deny network policy, runtime detection with Falco and Tetragon, supply-chain and image security, audit logging that actually catches incidents, and the IR playbook for a compromised pod. The identity primitives underneath it (tokens, OIDC, mTLS, PKI) are the Identity and Trust course, the security-system-design walkthrough lives in Kubernetes System Design Interview Prep, and the incident-response and forensics side pairs with Kubernetes Debugging.

Related reading: "Most Courses Teach Tools. Senior DevOps Interviews Test Architecture." for the reasoning frame this question tests, "How GitHub Actions OIDC to AWS Actually Works" for the cloud-credential and supply-chain plumbing an attacker harvests, and "Your JWT Validation Is Broken" for how the ServiceAccount token is verified, and how that verification fails.