Status Codes That Matter in Production
Your monitoring dashboard lights up. The 502 error rate on the API gateway just jumped from 0.01% to 15%. PagerDuty fires. Three teams jump on a call. The frontend team says it is a backend issue. The backend team says their pods are healthy. The platform team checks the ingress controller.
Everyone is guessing. But the status code itself — 502 — tells you exactly where the problem is. A 502 means the reverse proxy got a bad response from the upstream. The backend is reachable but responding with garbage, or crashing mid-response. This is fundamentally different from a 503 (overloaded) or a 504 (timeout).
This lesson covers every status code you will encounter in production Kubernetes environments, what each one actually means at the protocol level, and what to do when you see it.
Part 1: 2xx — Success
The 2xx range means the request was received, understood, and accepted. But which 2xx code you return matters.
The Codes
| Code | Name | Meaning | When to use |
|---|---|---|---|
| 200 | OK | Request succeeded, response body has the result | GET requests, search results |
| 201 | Created | A new resource was created | POST requests that create entities |
| 202 | Accepted | Request accepted for processing, not yet complete | Async operations (job started, will finish later) |
| 204 | No Content | Request succeeded, no body to return | DELETE requests, PUT that does not return the entity |
# 200 — standard success
curl -s -o /dev/null -w "%{http_code}" https://api.example.com/users
# 200
# 201 — resource created (check the Location header for the new URL)
curl -v -X POST https://api.example.com/users \
  -H "Content-Type: application/json" \
  -d '{"name": "Alice"}'
# < HTTP/2 201
# < Location: /users/456
# 204 — success, no body
curl -v -X DELETE https://api.example.com/users/456
# < HTTP/2 204
# (empty body)
When designing APIs, use 201 (not 200) for resource creation and include a Location header pointing to the new resource. Use 204 (not 200 with empty body) when there is nothing to return. These distinctions matter for client libraries, API documentation generators, and any middleware that behaves differently based on status codes. Many monitoring tools track 201s separately from 200s to measure creation rates.
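That mapping is small enough to sketch as a helper — an illustrative function, not any framework's API:

```shell
# sketch: which 2xx fits which operation (operation names are illustrative)
success_code() {
  case "$1" in
    create) echo 201 ;;  # plus a Location header for the new resource
    delete) echo 204 ;;  # nothing to return, so no body either
    async)  echo 202 ;;  # accepted; processing finishes later
    *)      echo 200 ;;  # default: success with a result body
  esac
}
success_code create   # → 201
success_code delete   # → 204
```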
Part 2: 3xx — Redirects
The 3xx range means the client needs to take additional action to complete the request — usually following a different URL.
The Codes
| Code | Name | Cacheable? | Method preserved? | Use case |
|---|---|---|---|---|
| 301 | Moved Permanently | Yes | No (may change to GET) | Domain migration, HTTP to HTTPS |
| 302 | Found | No | No (may change to GET) | Temporary redirect (legacy, ambiguous) |
| 307 | Temporary Redirect | No | Yes (must keep method) | Temporary redirect (correct behavior) |
| 308 | Permanent Redirect | Yes | Yes (must keep method) | Permanent redirect (preserves method) |
The critical difference between 301/302 and 307/308 is method preservation. A 301 redirect of a POST request may be changed to a GET by the browser (and most clients do this). A 308 redirect of a POST keeps it as a POST. If your API redirects are changing POST requests to GET and losing the body, you need 307/308 instead of 301/302.
# See the redirect chain
curl -v -L https://example.com/old-path
# < HTTP/2 301
# < location: https://example.com/new-path
# > GET /new-path HTTP/2
# < HTTP/2 200
# Common in Kubernetes: HTTP to HTTPS redirect
curl -v http://api.example.com/users
# < HTTP/1.1 308 Permanent Redirect
# < Location: https://api.example.com/users
# NGINX Ingress annotation for HTTP to HTTPS redirect
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
Redirect chains (A redirects to B, B redirects to C, C redirects to D) multiply latency and confuse debugging. Each redirect is a full HTTP round trip. Most clients follow a maximum of 5-10 redirects before giving up with "too many redirects." In Kubernetes, the most common redirect loop is when the ingress redirects HTTP to HTTPS, and a downstream proxy redirects HTTPS back to HTTP. Check your ingress annotations and backend configuration to ensure they agree on the protocol.
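The loop detection clients perform can be sketched without touching the network. `next_hop` below is a made-up redirect table standing in for real Location headers; the logic is the same: remember every URL visited and stop on a revisit:

```shell
# sketch: walk a chain of Location headers and flag loops (no network;
# next_hop is an illustrative stand-in for `curl -sI` responses)
next_hop() {
  case "$1" in
    /old)   echo /newer ;;
    /newer) echo /old ;;   # loops back: /old -> /newer -> /old
    *)      echo "" ;;     # no Location header: chain ends
  esac
}
follow() {
  url=$1 seen=" " hops=0
  while [ -n "$url" ] && [ "$hops" -lt 10 ]; do
    case "$seen" in *" $url "*) echo "redirect loop at $url"; return 1 ;; esac
    seen="$seen$url "
    url=$(next_hop "$url")
    hops=$((hops + 1))
  done
  echo "resolved after $hops hop(s)"
}
follow /old   # prints: redirect loop at /old
```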
Part 3: 4xx — Client Errors
The 4xx range means the request was wrong — bad syntax, missing authentication, forbidden resource, or nonexistent URL. The client needs to fix something before retrying.
The Codes
| Code | Name | What it really means | Common cause in K8s |
|---|---|---|---|
| 400 | Bad Request | Request syntax is invalid | Malformed JSON, missing required field, wrong Content-Type |
| 401 | Unauthorized | Authentication missing or invalid | Expired token, missing Authorization header |
| 403 | Forbidden | Authenticated but not authorized | RBAC denial, missing permissions |
| 404 | Not Found | The URL does not match any route | Wrong path, missing Ingress rule, backend route not defined |
| 405 | Method Not Allowed | The HTTP method is not supported for this path | POST to a GET-only endpoint |
| 408 | Request Timeout | Client took too long to send the request | Slow client, large upload on slow connection |
| 413 | Content Too Large | Request body exceeds server limits | File upload exceeds NGINX client_max_body_size |
| 415 | Unsupported Media Type | Content-Type header not accepted | Sending JSON without application/json header |
| 429 | Too Many Requests | Rate limit exceeded | API throttling, too many requests from one client |
The HTTP spec named 401 "Unauthorized" but it actually means "Unauthenticated" — the server does not know who you are. Code 403 "Forbidden" means "Authenticated but not Authorized" — the server knows who you are but you do not have permission. This naming confusion causes bugs in API design everywhere. When building or debugging APIs: 401 means "show the login page," 403 means "show the access denied page."
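A minimal sketch of that client-side rule (the messages are illustrative):

```shell
# sketch: translate auth-related status codes into the right client action
handle_auth_error() {
  case "$1" in
    401) echo "redirect to login" ;;   # unauthenticated: get (new) credentials
    403) echo "show access denied" ;;  # authenticated: a new login will not help
    *)   echo "not an auth error" ;;
  esac
}
handle_auth_error 401   # → redirect to login
handle_auth_error 403   # → show access denied
```

The practical difference: a 401 is often fixed by refreshing an expired token and retrying; a 403 should never trigger a retry, because the same credentials will be denied again.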
429 Too Many Requests — Rate Limiting
Rate limiting is essential in production. When you hit a 429, check the response headers:
curl -v https://api.example.com/users
# < HTTP/2 429
# < Retry-After: 30
# < X-RateLimit-Limit: 100
# < X-RateLimit-Remaining: 0
# < X-RateLimit-Reset: 1699900800
| Header | Meaning |
|---|---|
| Retry-After | Seconds to wait before retrying |
| X-RateLimit-Limit | Total requests allowed per window |
| X-RateLimit-Remaining | Requests left in the current window |
| X-RateLimit-Reset | Unix timestamp when the window resets |
When your service returns 429, always include a Retry-After header. Well-behaved clients use it to back off. Without it, clients will retry immediately and make the overload worse. On the client side, implement exponential backoff with jitter: wait 1s, then 2s, then 4s (plus random jitter) — never retry in a tight loop.
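That backoff rule can be sketched in a few lines. `backoff_delay` is a hypothetical helper: it prefers the server's Retry-After when one was sent, otherwise doubles a base delay and adds sub-second jitter. Note `$RANDOM` is a bashism; strict POSIX shells fall back to zero jitter here:

```shell
# sketch: exponential backoff with jitter for 429 responses
backoff_delay() {
  attempt=$1 retry_after=$2
  if [ -n "$retry_after" ]; then
    echo "$retry_after"                  # the server knows best: honor it
  else
    base=$((1 << attempt))               # 1s, 2s, 4s, 8s, ...
    jitter=$(( ${RANDOM:-0} % 1000 ))    # up to ~1s of jitter, in milliseconds
    echo "$base.$(printf '%03d' "$jitter")"
  fi
}
backoff_delay 3 30    # Retry-After was 30 → wait exactly 30s
backoff_delay 2 ""    # no header → ~4s plus jitter
```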
413 Content Too Large — The NGINX Gotcha
# Your file upload fails with 413
curl -v -X POST https://api.example.com/upload \
  -F "file=@large-report.pdf"
# < HTTP/1.1 413 Request Entity Too Large
In Kubernetes with NGINX Ingress, the default client_max_body_size is 1MB. Any request body larger than that gets rejected before it reaches your backend:
# Fix: increase the limit in the Ingress annotation
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/proxy-body-size: "50m"  # allow up to 50MB
A team deployed a new document upload feature that worked perfectly in development (no ingress, direct port-forward). In staging, uploads over 1MB silently failed. The frontend showed a generic "upload failed" error. The backend logs showed nothing — the request never reached the backend. It took four hours to realize the NGINX Ingress was rejecting the request before it was proxied. The 413 response was only visible in the browser network tab. Always check ingress-level limits when debugging request failures that do not appear in backend logs.
Part 4: 5xx — Server Errors
The 5xx range means the server failed to fulfill a valid request. The client did nothing wrong — the server (or the infrastructure in front of it) has a problem.
The Codes
| Code | Name | What broke | Where to look |
|---|---|---|---|
| 500 | Internal Server Error | The application crashed or threw an unhandled exception | Application logs |
| 502 | Bad Gateway | The reverse proxy got an invalid response from the backend | Ingress logs, backend health |
| 503 | Service Unavailable | The server is overloaded or explicitly refusing requests | Pod readiness, circuit breakers |
| 504 | Gateway Timeout | The reverse proxy timed out waiting for the backend | Backend latency, proxy timeout config |
502 vs 503 vs 504
Where each of these errors originates determines what to check first. The next three sections break down each code.
502 Bad Gateway — The Proxy Got Garbage
A 502 means the reverse proxy (NGINX Ingress, AWS ALB, Envoy) successfully connected to the backend but received an invalid response. Common causes:
# Cause 1: Backend pod crashed mid-response
# The proxy opened a TCP connection, sent the request, but the backend died
# before completing the response
# Cause 2: Protocol mismatch
# Ingress is sending HTTP/1.1 but the backend expects HTTP/2 (or vice versa)
# Fix for NGINX Ingress:
# nginx.ingress.kubernetes.io/backend-protocol: "GRPC" (for gRPC backends)
# nginx.ingress.kubernetes.io/backend-protocol: "HTTPS" (for HTTPS backends)
# Cause 3: targetPort mismatch
# Service points to port 80 but the pod listens on 8080
# The proxy connects to port 80, nothing responds properly → 502
# Cause 4: Pod terminated during rolling update
# Ingress still has the old pod IP in its upstream list
# Fix: proper readiness probes + preStop hooks
# Debug: check the NGINX Ingress error log
kubectl logs -n ingress-nginx deploy/ingress-nginx-controller | grep "502"
# upstream prematurely closed connection while reading response header
"502 Bad Gateway" almost always means the problem is between the reverse proxy and the backend, not between the client and the reverse proxy. The client's request was fine. The proxy forwarded it to the backend, and the backend either crashed, returned nonsense, or closed the connection prematurely. Start debugging at the backend pod — check logs, check if it is running, check if it is listening on the correct port and protocol.
503 Service Unavailable — Explicitly Overloaded
A 503 means the backend is explicitly saying "I cannot handle this request right now." This is different from a 502 (where the backend broke) — a 503 is an intentional response.
# Common sources of 503 in Kubernetes:
# 1. No ready pods — all pods are failing readiness probes
kubectl get pods -l app=api
# NAME READY STATUS RESTARTS AGE
# api-abc123 0/1 Running 0 5m ← not ready
# 2. All endpoints removed from the Service
kubectl get endpoints api-service
# NAME ENDPOINTS AGE
# api-service <none> 5m ← no backends
# 3. Circuit breaker open (Istio/Envoy)
# The service mesh detected too many errors and is short-circuiting requests
# 4. Application-level rate limiting
# The app itself returns 503 when it is at capacity
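Source 1 is the most common, so the probe itself is worth showing. A minimal container-spec sketch with an illustrative path and port — a pod failing this check is pulled from the Service endpoints, and with no endpoints left the ingress answers 503 on the service's behalf:

```yaml
# sketch: readiness probe (path and port are illustrative)
readinessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 10
  failureThreshold: 3   # 3 consecutive failures → removed from endpoints
```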
504 Gateway Timeout — Backend Too Slow
A 504 means the reverse proxy waited for the backend to respond, and the backend did not respond within the timeout period. The request may still be processing on the backend.
# NGINX Ingress default timeouts
# proxy-connect-timeout: 5s (time to establish TCP connection to backend)
# proxy-send-timeout: 60s (time to send the request to backend)
# proxy-read-timeout: 60s (time to read the response from backend)
# If your API endpoint takes 90 seconds to process a report:
# → NGINX gives up after 60 seconds → returns 504
# → But the backend is still processing the request!
# Fix: increase timeout for slow endpoints
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/proxy-read-timeout: "120"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "120"
Increasing timeouts is a band-aid, not a fix. If your backend regularly takes more than 60 seconds to respond, you have an architectural problem. Convert long-running operations to async: accept the request immediately (return 202 Accepted), process it in the background, and let the client poll for completion or receive a webhook. Never make a user's browser wait 2 minutes for an HTTP response.
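The client half of that 202 pattern is a bounded poll loop. A self-contained sketch — `fake_status` is a stand-in for the real status request, which is shown in the comment inside the loop:

```shell
# sketch: poll a 202-Accepted job until done, with a hard cap on attempts
fake_status() { [ "$1" -ge 3 ] && echo done || echo running; }
attempt=0 status=running
while [ "$status" != "done" ] && [ "$attempt" -lt 10 ]; do
  attempt=$((attempt + 1))
  # real client: status=$(curl -s "$JOB_URL" | jq -r .status); then sleep
  status=$(fake_status "$attempt")
  echo "poll $attempt: $status"
done
# once status=done, the client fetches the result from the job's URL
```

The poll interval should use the same backoff-with-jitter discipline as 429 handling; a webhook callback avoids polling entirely when the client can receive one.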
Part 5: Debugging Status Codes in Kubernetes
When you see an error status code, the first question is: who generated it? Was it the application, the ingress controller, the cloud load balancer, or the service mesh sidecar?
The Debugging Chain
# Step 1: Is the error from the application or the ingress?
# Check the ingress controller logs:
kubectl logs -n ingress-nginx deploy/ingress-nginx-controller --tail=50 | grep "api.example.com"
# If the error is in the ingress log with "upstream" messages → problem is behind the ingress
# If the error is NOT in the ingress log → problem is in front of the ingress (LB, DNS)
# Step 2: Can the ingress reach the backend?
# Port-forward directly to the pod, bypassing the ingress:
kubectl port-forward pod/api-abc123 8080:8080
curl -v http://localhost:8080/users
# If this works → problem is in the ingress configuration
# If this fails → problem is in the application
# Step 3: Is the Service routing correctly?
kubectl get endpoints api-service
# ENDPOINTS
# 10.244.1.5:8080,10.244.2.10:8080 ← healthy endpoints exist
# Step 4: Check for recent pod restarts
kubectl get pods -l app=api --sort-by=.status.containerStatuses[0].restartCount
# If restarts are happening, 502s coincide with pods restarting
# Step 5: Check backend response time
kubectl exec -n ingress-nginx deploy/ingress-nginx-controller -- \
  curl -s -o /dev/null -w "time_total: %{time_total}s\n" \
  http://api-service.default.svc:80/users
# If time_total exceeds proxy-read-timeout → 504s expected
Build a mental model of the request path: Client to Cloud LB to Ingress Controller to K8s Service to Pod. Each layer can generate different error codes. When debugging, bisect the path: port-forward directly to the pod to determine if the problem is in the application or in the infrastructure. This single technique eliminates half the search space immediately.
Key Concepts Summary
- 2xx means success: 200 (OK), 201 (Created), 204 (No Content) — use the specific code, not just 200 for everything
- 3xx means redirect: 301/308 are permanent, 307 is temporary. 307/308 preserve the HTTP method; 301/302 may change POST to GET
- 4xx means the client is wrong: 400 (bad syntax), 401 (not authenticated), 403 (not authorized), 404 (not found), 429 (rate limited)
- 401 means "unauthenticated" not "unauthorized" despite the confusing name — 403 is the real "unauthorized"
- 5xx means the server is broken: the critical trio is 502/503/504, each pointing to a different failure
- 502 Bad Gateway: proxy connected to backend but got an invalid response — check targetPort, protocol, pod health
- 503 Service Unavailable: backend explicitly refuses — check readiness probes, endpoints, circuit breakers
- 504 Gateway Timeout: proxy waited too long for backend — check backend latency, proxy-read-timeout
- The ingress controller is the most common source of 5xx in K8s — always check its logs first
Common Mistakes
- Returning 200 for everything and putting the real status in the JSON body — breaks HTTP caching, monitoring, and middleware
- Confusing 401 (not authenticated) with 403 (not authorized) — these have different UI and retry implications
- Assuming 502 means "the backend is down" — it means the backend responded badly, which could be a protocol mismatch or crash mid-response
- Increasing proxy-read-timeout to 5 minutes instead of making the operation async — you are masking a design problem
- Not setting proxy-body-size in the ingress annotation — file uploads fail silently at the 1MB default
- Ignoring Retry-After headers from 429 responses and retrying immediately — this makes rate limiting worse
- Not checking ingress controller logs when debugging 5xx — the application logs may show nothing if the request never reached the app
Your monitoring shows a spike in 502 errors on an NGINX Ingress. The backend pods show no errors in their logs. What is the most likely cause?