The Prometheus Data Model
A developer adds their first Prometheus metric. They pick Gauge because it "seems flexible" — it can go up and down. Two weeks later, someone on the team tries to query rate() on it and gets nonsense results. The developer shrugs: "rate must be broken." It is not broken — rate() is specifically for counters, which only go up. Their gauge was the wrong data type. Queries for one data type silently produce garbage when applied to another. Prometheus does not protect you; it assumes you chose the right metric type.

The Prometheus data model — time series, labels, samples, and the four metric types (Counter, Gauge, Histogram, Summary) — is small but easy to misuse. This lesson covers what each type is, when to use each, and the queries they enable. Every metric you create should start with a deliberate choice of type.
Time Series, Labels, Samples
Every Prometheus metric is a time series: a sequence of (timestamp, value) samples identified by a metric name + label set.
http_requests_total{method="GET", status="200", endpoint="/api/users"}
Samples:
2026-04-21T10:00:00 1000
2026-04-21T10:00:15 1015
2026-04-21T10:00:30 1030
...
The combination of metric name and label values uniquely identifies a time series. Changing any label value creates a new series. This is why cardinality (Lesson 1.3) matters — every unique combination is a separate series with its own memory and storage cost.
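The identity rule is small enough to model directly. A toy sketch of series identity (the names, timestamps, and dict-based store are illustrative, not the real TSDB):

```python
# A toy model of series identity: the key is the metric name plus the full
# sorted label set. Changing any label value yields a new key = a new series.
def series_key(name, labels):
    return (name, tuple(sorted(labels.items())))

store = {}

def append_sample(name, labels, ts, value):
    store.setdefault(series_key(name, labels), []).append((ts, value))

append_sample('http_requests_total', {'method': 'GET', 'status': '200'}, 0, 1000)
append_sample('http_requests_total', {'method': 'GET', 'status': '200'}, 15, 1015)
append_sample('http_requests_total', {'method': 'GET', 'status': '500'}, 0, 12)

print(len(store))  # 2 — the status="500" combination is a separate series
```

Two samples share a series only when the name and every label value match; the status="500" sample opened a second series with its own storage cost.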
Scrape intervals
Prometheus pulls metrics on a schedule — the scrape interval, default 15 seconds. Every 15 seconds, Prometheus hits every target's /metrics endpoint, reads the current values, and appends a sample to each series.
Samples are stored in compressed columnar blocks. Default retention: 15 days (configurable, often 30-90 days in practice).
# Point a browser at your service's /metrics endpoint to see the raw format
curl -s http://localhost:8080/metrics | head -10
# HELP http_requests_total Total HTTP requests
# TYPE http_requests_total counter
http_requests_total{method="GET",endpoint="/api/users",status="200"} 1030
http_requests_total{method="GET",endpoint="/api/users",status="500"} 12
# HELP node_cpu_seconds_total CPU time by mode
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="user"} 12345.67
This is the Prometheus exposition format — a plain-text format that every Prometheus client library emits, and every compatible scraper understands.
The Prometheus data model has a tiny vocabulary: time series = metric name + label set, sample = timestamp + value, metric type = counter/gauge/histogram/summary. Master these four concepts and every tool compatible with Prometheus (VictoriaMetrics, Mimir, Thanos, Cortex, Datadog's Prometheus API) is immediately familiar. The data model is small, durable, and portable — this is its major strength.
Counter
A Counter is a monotonically-increasing value. It only goes up (or resets to zero on process restart).
- Total HTTP requests served.
- Total bytes written.
- Total errors encountered.
- Cumulative CPU time used.
The ABSOLUTE value of a counter is usually meaningless — "the service has handled 1,247,893 requests since last restart." What matters is the rate of change: "requests per second over the last 5 minutes."
Query counters with rate() or increase():
# Requests per second, averaged over 5 minutes
rate(http_requests_total[5m])
# Total increase over 5 minutes (equivalent to rate * duration)
increase(http_requests_total[5m])
# Per-second rate summed across all endpoints
sum(rate(http_requests_total[5m]))
Counter resets
When a process restarts, its counter drops to 0. Prometheus' rate() function detects this and handles it correctly — a drop is treated as a reset, not a negative value. This is why you should never subtract one raw counter value from another manually; use rate() / increase().
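The reset rule is simple enough to sketch. This hypothetical helper mimics the core of what increase() does when it sees a drop (the real implementation additionally extrapolates to the edges of the range window):

```python
def counter_increase(samples):
    """Total increase across raw counter samples, reset-aware."""
    total = 0.0
    for prev, cur in zip(samples, samples[1:]):
        if cur >= prev:
            total += cur - prev   # normal monotonic growth
        else:
            total += cur          # drop => restart from zero; cur is new growth
    return total

values = [1000, 1015, 1030, 5, 20]  # process restarted after the 1030 sample
print(counter_increase(values))      # 50.0 — naive 20 - 1000 would give -980
```

The three pre-restart deltas (15 + 15) plus the post-restart growth (5 + 15) sum to 50, which is the real work done; the naive subtraction is nonsense.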
Counter naming
Convention: end counter names with _total. Examples:
- http_requests_total
- errors_total
- bytes_written_total
The _total suffix is not required but widely adopted; tooling (recording rules, dashboards) often assumes it.
Gauge
A Gauge is a value that can go up or down. It represents a "current state" snapshot.
- Current memory usage in bytes.
- Current number of active connections.
- Current temperature.
- Current queue depth.
Gauges are simple. You just set the value:
# Python prometheus_client
queue_depth = Gauge('queue_depth', 'Items waiting in queue')
queue_depth.set(42) # set absolute
queue_depth.inc() # add 1
queue_depth.dec(3) # subtract 3
Querying gauges
Just query them directly — no rate() needed:
# Current queue depth per service
queue_depth
# Average memory usage across all pods
avg(memory_usage_bytes)
# Max queue depth over the last 5 minutes
max_over_time(queue_depth[5m])
# Change in queue depth over time (delta, not rate)
delta(queue_depth[5m])
Gauges support delta() (for signed changes) but NOT rate()/increase() (which are for monotonic counters only).
Common mistake: counter as gauge
You ship a total_requests_served as a gauge. Your service restarts; the value drops from 10,000 to 0. You query delta(total_requests_served[5m]) and get a huge negative number. Wrong type. Counters have reset-awareness; gauges do not.
Histogram
A Histogram samples observations into buckets, letting you compute percentiles server-side.
Example: observing request durations.
REQUEST_DURATION = Histogram(
    'http_request_duration_seconds',
    'HTTP request duration in seconds',
    buckets=[0.01, 0.05, 0.1, 0.5, 1, 2, 5, 10]
)

# Every request
with REQUEST_DURATION.time():
    handle_request()
Internally, a histogram exposes three things:
http_request_duration_seconds_bucket{le="0.01"} 142
http_request_duration_seconds_bucket{le="0.05"} 820
http_request_duration_seconds_bucket{le="0.1"} 1500
http_request_duration_seconds_bucket{le="0.5"} 1800
http_request_duration_seconds_bucket{le="1"} 1850
http_request_duration_seconds_bucket{le="+Inf"} 1890
http_request_duration_seconds_sum 120.45
http_request_duration_seconds_count 1890
Each _bucket{le="X"} is a counter of requests completed in X seconds or less (cumulative). _sum is total observed time; _count is total observations.
Computing percentiles
Prometheus computes percentiles from histogram buckets using histogram_quantile():
# p99 request duration over the last 5 minutes
histogram_quantile(0.99,
rate(http_request_duration_seconds_bucket[5m]))
# p50 (median)
histogram_quantile(0.50,
rate(http_request_duration_seconds_bucket[5m]))
# p95 per endpoint
histogram_quantile(0.95,
sum by (endpoint, le) (
rate(http_request_duration_seconds_bucket[5m])
))
Key insight: you can compute any percentile, aggregate across any dimensions, after the fact. This is what makes histograms the right choice for latency metrics — you do not need to decide in advance "record p95 and p99 but not p50."
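To demystify histogram_quantile(), here is a sketch of its estimation rule applied to the cumulative bucket counts from the example above (PromQL applies the same math to per-second bucket rates, but the arithmetic is identical):

```python
def histogram_quantile(q, buckets):
    """buckets: [(upper_bound, cumulative_count), ...], last bound = +Inf."""
    total = buckets[-1][1]
    rank = q * total                      # target observation rank
    lower_bound, lower_count = 0.0, 0
    for upper_bound, count in buckets:
        if count >= rank:
            if upper_bound == float('inf'):
                return lower_bound        # PromQL caps at highest finite bound
            # Linearly interpolate within the bucket containing the rank.
            frac = (rank - lower_count) / (count - lower_count)
            return lower_bound + (upper_bound - lower_bound) * frac
        lower_bound, lower_count = upper_bound, count

buckets = [(0.01, 142), (0.05, 820), (0.1, 1500), (0.5, 1800),
           (1, 1850), (float('inf'), 1890)]
print(histogram_quantile(0.50, buckets))  # median lands in the 0.05-0.1 bucket
```

The answer is an estimate: the true median could be anywhere inside the 0.05-0.1 bucket, and linear interpolation assumes observations are spread evenly within it. This is why bucket boundaries near the percentiles you care about improve accuracy.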
Bucket choice matters
Histogram buckets are fixed at metric definition time. Choose them to span the range you care about with reasonable resolution:
# For a web service with typical latencies 10ms-5s
buckets=[0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10]
# For a slow batch service with typical latencies 1s-60s
buckets=[0.5, 1, 2, 5, 10, 30, 60, 120]
Too few buckets → percentile estimates are coarse. Too many buckets → cardinality explodes (each bucket is a separate series × your label set).
Client-library default bucket sets are modest (for example, Go's prometheus.DefBuckets has 11 boundaries). Custom bucket counts work but watch cardinality.
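The cardinality cost is plain arithmetic: each bucket, plus _sum and _count, is a separate series for every label combination. With made-up label counts:

```python
# Hypothetical label-value counts for one histogram metric.
methods, endpoints, statuses = 5, 30, 5
label_combos = methods * endpoints * statuses   # 750 combinations

buckets = 11 + 1                                # 11 boundaries + implicit +Inf
series_per_combo = buckets + 2                  # plus _sum and _count
total_series = label_combos * series_per_combo
print(total_series)                             # 10500 series for ONE metric
```

Doubling the bucket count or adding one more labelled dimension multiplies this number, which is why bucket and label choices deserve the same scrutiny.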
Native histograms (Prometheus 2.40+)
Prometheus' new native histogram type stores buckets dynamically with exponential boundaries — no pre-chosen buckets, low cardinality, much better percentile accuracy. As of 2024-2025 it is gaining adoption. If you are instrumenting new services, native histograms are worth evaluating.
Summary
A Summary is similar to a histogram but computes percentiles on the client side.
# Python prometheus_client — note: the Python client's Summary does not
# implement quantiles (it exposes only _count and _sum); the quantile
# output below comes from clients that do, e.g. the Java client.
REQUEST_DURATION = Summary('http_request_duration_seconds', '...',
                           ['method', 'endpoint'])
Output (from a quantile-capable client) looks like:
http_request_duration_seconds{quantile="0.5"} 0.050
http_request_duration_seconds{quantile="0.9"} 0.120
http_request_duration_seconds{quantile="0.99"} 0.450
http_request_duration_seconds_sum 2400.5
http_request_duration_seconds_count 18900
The Summary problem
Because each Prometheus client computes percentiles locally, you cannot aggregate summaries across instances. You cannot take p99 across 10 pods; you would get the average of each pod's p99, which is not the p99 of the combined distribution.
Use Histograms, not Summaries, in almost all cases. Summaries exist for specific edge cases (when you need very precise percentiles within one instance) but Histograms aggregate correctly across the fleet.
If you inherit a Prometheus-based system that uses Summaries for latency, the percentile values on your dashboards are wrong whenever you aggregate (sum/avg) across pods. Migrate to Histograms. The effort is small (change the metric type in your code); the correctness win is large.
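The failure is easy to demonstrate with two synthetic pods (the latency distributions are invented; only the principle matters):

```python
import random
random.seed(1)

def p99(xs):
    """99th percentile by sorting — fine for a demonstration."""
    xs = sorted(xs)
    return xs[int(0.99 * len(xs)) - 1]

# Two hypothetical pods with very different latency profiles (seconds).
fast_pod = [random.uniform(0.01, 0.1) for _ in range(1000)]
slow_pod = [random.uniform(0.5, 2.0) for _ in range(1000)]

avg_of_p99s = (p99(fast_pod) + p99(slow_pod)) / 2   # what averaging summaries gives
true_p99 = p99(fast_pod + slow_pod)                  # p99 of the combined distribution
print(avg_of_p99s, true_p99)
```

In this synthetic example, averaging the per-pod p99s lands between the two profiles (~1.0s) while the true combined p99 sits deep in the slow pod's tail (~2.0s) — the "aggregated" dashboard number is off by roughly a factor of two.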
The Decision Matrix
When you add a new metric, pick the type with this table:
| Question | Type |
|---|---|
| Count of events (requests, errors, bytes written) | Counter |
| Current value (temperature, queue depth, memory) | Gauge |
| Distribution (latency, response size, batch size) | Histogram |
| (Avoid) Per-instance percentile only | Summary |
The most common mistakes:
- Gauge for count-like things — you lose reset-awareness.
- Counter for current-state values — rate() does not make sense.
- Summary for latency — cannot aggregate across pods.
Client Libraries
Every major language has an official Prometheus client:
# Python — prometheus_client
from prometheus_client import Counter, Gauge, Histogram
REQUESTS = Counter('http_requests_total', 'Total HTTP requests', ['method', 'endpoint', 'status'])
ACTIVE_CONNS = Gauge('active_connections', 'Active connections')
LATENCY = Histogram('http_request_duration_seconds', 'Request duration')
@app.route('/')
def index():
    with LATENCY.time():
        # ... handle ...
        REQUESTS.labels(method='GET', endpoint='/', status='200').inc()
    return 'OK'
// Go — client_golang
var (
    requests = promauto.NewCounterVec(prometheus.CounterOpts{
        Name: "http_requests_total",
    }, []string{"method", "endpoint", "status"})

    activeConns = promauto.NewGauge(prometheus.GaugeOpts{
        Name: "active_connections",
    })

    latency = promauto.NewHistogramVec(prometheus.HistogramOpts{
        Name:    "http_request_duration_seconds",
        Buckets: []float64{0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10},
    }, []string{"method", "endpoint"})
)

func handler(w http.ResponseWriter, r *http.Request) {
    start := time.Now()
    defer func() {
        latency.WithLabelValues(r.Method, r.URL.Path).Observe(time.Since(start).Seconds())
    }()
    // ... handle ...
    requests.WithLabelValues(r.Method, r.URL.Path, "200").Inc()
}
Similar APIs in Node.js (prom-client), Java (simpleclient), Rust (prometheus), .NET (prometheus-net).
The Exposition Format
Your service exposes metrics by running an HTTP handler that returns the current state of every metric. The format:
# HELP metric_name Human-readable description
# TYPE metric_name counter|gauge|histogram|summary
metric_name{label1="value1",label2="value2"} value
# Another metric...
Prometheus scrapes this text once per interval. Your service rebuilds and emits it on every scrape (client libraries maintain internal state and render on demand).
Performance implication: metric collection has near-zero cost at scrape time (a few milliseconds to serialize even large metric sets). The cost is in the client library's per-update overhead (atomic increments, etc.), which is negligible for most workloads.
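You can watch the render-on-demand behavior with the Python client (assumes prometheus_client is installed): generate_latest() serializes the current state of a registry, which is what the /metrics handler returns on each scrape.

```python
from prometheus_client import CollectorRegistry, Counter, generate_latest

# A private registry keeps the example self-contained.
registry = CollectorRegistry()
errors = Counter('errors_total', 'Total errors', registry=registry)
errors.inc(3)

# The client keeps metric state in memory and serializes it on demand;
# start_http_server() (or your framework's /metrics route) returns this text.
text = generate_latest(registry).decode()
print(text)
```

Nothing is pushed anywhere between scrapes; the HELP/TYPE lines and current sample values are rendered fresh each time the endpoint is hit.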
Integration With OpenTelemetry
OpenTelemetry (Module 4) is the vendor-neutral observability standard. Its metrics API can output to Prometheus format — you instrument your code once with OTel, and a Prometheus exporter renders it in the right format.
# OpenTelemetry metrics → Prometheus (a sketch; API names per the
# opentelemetry-python SDK with the opentelemetry-exporter-prometheus package)
from opentelemetry.metrics import set_meter_provider, get_meter
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.exporter.prometheus import PrometheusMetricReader

reader = PrometheusMetricReader()
set_meter_provider(MeterProvider(metric_readers=[reader]))
meter = get_meter("myapp")

# Use OTel API:
request_counter = meter.create_counter("http_requests_total")
request_counter.add(1, {"method": "GET", "status": "200"})
# Prometheus scrapes /metrics — sees data in Prometheus format
This is the direction of modern instrumentation — OTel as the vendor-neutral SDK, Prometheus as one of many possible backends.
Key Concepts Summary
- Time series = metric name + label set. Each unique combination is a separate series.
- Samples = timestamp + value. Prometheus scrapes every 15s by default.
- Four metric types: Counter, Gauge, Histogram, Summary.
- Counter: only goes up; query with rate()/increase(); use for request counts, error counts, bytes.
- Gauge: goes up and down; query directly; use for current values like memory, queue depth.
- Histogram: bucketed distribution; use histogram_quantile() to compute percentiles; aggregates across pods correctly.
- Summary: client-side percentiles; do NOT aggregate across pods; avoid in favor of Histogram.
- Picking the wrong type is a silent bug — queries run, numbers are wrong.
- Cardinality = product of label-value counts. Covered in Lesson 1.3.
- Exposition format is plain text at /metrics; every client library and scraper speaks it.
- Native histograms (Prometheus 2.40+) give better percentile accuracy with lower cardinality — worth adopting for new services.
Common Mistakes
- Using Gauge for count-like metrics. You lose reset-awareness.
- Using Counter for current-state values. rate() does not make sense.
- Subtracting two raw counter values instead of using rate(). You miss counter resets.
- Using Summary for latency then aggregating across pods. Percentiles are wrong.
- Choosing histogram buckets that do not span the range you actually see. Percentiles become useless.
- Too many histogram buckets. Cardinality blow-up per label set.
- Forgetting the _total suffix on counters. Tooling and conventions assume it.
- Not reading your /metrics endpoint during development. You learn format, labels, and types by inspecting the output.
- Skipping native histograms for new services. They reduce cardinality significantly without sacrificing percentile accuracy.
- Building custom exporters instead of using Prometheus client libraries. Every language has one; roll-your-own is almost never right.
You have a metric `active_db_connections` that reports the current connection pool size (0-50). You want to see the average, max, and recent spike behavior. Which Prometheus metric type should this be, and which queries would you write?