How Auth0 Detects Stolen Refresh Tokens (and Why You Should Implement the Same)

Refresh-token rotation is a known good practice. The 'reuse detection' that goes with it is what actually catches stolen tokens. Here is how the mechanism works and how to implement it correctly.

By Sharon Sahadevan·May 15, 2026·11 min read

Refresh-token rotation is one of those security practices that everyone agrees with and almost nobody implements completely. The "rotation" part is easy: every refresh produces a new refresh token, and the old one is invalidated. The "reuse detection" part is what actually makes the system catch stolen tokens, and it is a single elegant mechanism that most implementations skip. The whole reason refresh tokens exist in the first place is to keep access tokens short-lived, which is the right way to escape the JWT-as-session trap.

This post is the full mechanism: rotation plus reuse detection plus token-family revocation. It is what Auth0 documents and what your IdP probably already does. If you build your own auth, this is the pattern to copy.

The problem rotation alone does not solve

The "rotating refresh tokens" pattern, as commonly described:

1. User logs in. Server issues access_token (15 min) + refresh_token_A (7 days).
2. Access token expires. Client calls /token with refresh_token_A.
3. Server validates A; issues access_token + refresh_token_B; invalidates A.
4. Client uses B from now on.
5. Repeat: B -> C, C -> D, etc.

This is rotation. It is good. It limits the lifetime of any single refresh token to one use.

What it does not solve: an attacker who steals refresh_token_A and uses it before the legitimate client does.

Attacker steals refresh_token_A (XSS, malware, log leak, browser exploit).
Attacker calls /token with A first. Server issues new tokens to attacker.
A is now invalidated.
Legitimate client tries to refresh with A.
Server says "A is invalid." Legitimate client is logged out.
User logs back in and starts a new token chain; the attacker keeps refreshing the stolen one.

The user notices something is wrong (they were logged out unexpectedly). But the attacker still holds working tokens, and nothing on the server has flagged the theft. The defense is incomplete.
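
The failure mode above can be reproduced in a few lines. This is a minimal in-memory sketch, assuming a hypothetical `RotationOnlyServer` class (not from any library) that rotates tokens but has no family concept and no reuse alarm:

```python
import secrets


class RotationOnlyServer:
    """Hypothetical in-memory server: rotates refresh tokens, no reuse detection."""

    def __init__(self):
        self.valid = {}  # refresh token -> user_id

    def login(self, user_id):
        token = secrets.token_urlsafe(32)
        self.valid[token] = user_id
        return token

    def refresh(self, token):
        # Popping the token invalidates it: each token works exactly once.
        user_id = self.valid.pop(token, None)
        if user_id is None:
            return None  # "A is invalid": the caller is logged out
        new_token = secrets.token_urlsafe(32)
        self.valid[new_token] = user_id
        return new_token


server = RotationOnlyServer()
token_a = server.login("alice")

attacker_b = server.refresh(token_a)     # attacker redeems the stolen A first
client_result = server.refresh(token_a)  # legitimate client is rejected
attacker_c = server.refresh(attacker_b)  # attacker's chain keeps working
```

The second use of A is rejected, but the server learns nothing from it: the attacker's chain (B, then C) stays valid.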

The completion is reuse detection: when the server sees the SECOND attempt to use refresh_token_A, that is the signal that one of the two parties is illegitimate. Revoke everything; force everyone (including the legitimate user) to log in again. The attacker is locked out; the user re-authenticates and continues.

The reuse-detection mechanism

The full pattern:

1. User logs in. Server creates a token "family" (a unique family_id):
     family_id = random(256 bits)
   Server issues: refresh_token_A (with family_id), access_token, etc.

2. Client uses access token until expiry.

3. Client refreshes:
     POST /token, refresh_token_A
   Server:
     - Looks up A in the database. Found, marked unused.
     - Issues access_token + refresh_token_B. B is in the same family_id.
     - Marks A as used (atomically).
   Returns the new tokens to the client.

4. Client stores B; discards A.

5. Every later refresh follows the same flow: B -> C, C -> D, all in the same family.

6. Now imagine an attacker has stolen B (somehow) and redeems it before the
   legitimate client does (the client has just used A and received B, but has
   not yet used B).
     Attacker calls /token with B.
     Server: B is valid; issues C; marks B as used.
     Attacker has C.

7. Legitimate client tries to refresh with B (next time it needs to).
     Server: B is marked used. This is a reuse attempt.
     Server: detects reuse; revokes the entire family (B, C, and any other
     tokens with this family_id).
     Server: returns 401.

8. Legitimate client is logged out. User re-authenticates.
9. Attacker tries to refresh with C. Server: family was revoked; reject.
10. Attacker is locked out. User has a fresh family.

The detection point is step 7. The server sees a refresh attempt with a token marked "already used." That is the signal. A well-behaved client never reuses a refresh token, because it received a new one in the response; the only benign exceptions are network retries and races between tabs, which the grace period described later handles.

The remediation is family-wide revocation: not just "revoke this token" but "revoke every token descended from the original login." The attacker's most recent refresh (C in the example) is also invalidated.
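
The ten steps above can be traced with a small in-memory model. This is a sketch under stated assumptions: `FamilyStore` and `Record` are hypothetical names, tokens live in a dict rather than a database, and expiry is left out to keep the focus on families and reuse.

```python
import secrets
from dataclasses import dataclass


@dataclass
class Record:
    family_id: str
    user_id: str
    used: bool = False


class FamilyStore:
    """Hypothetical in-memory store; a real one would be a database table."""

    def __init__(self):
        self.tokens = {}               # token -> Record
        self.revoked_families = set()  # family_ids revoked after reuse

    def _issue(self, family_id, user_id):
        token = secrets.token_urlsafe(32)
        self.tokens[token] = Record(family_id, user_id)
        return token

    def login(self, user_id):
        # Each login starts a new family.
        return self._issue(secrets.token_urlsafe(32), user_id)

    def refresh(self, token):
        rec = self.tokens.get(token)
        if rec is None or rec.family_id in self.revoked_families:
            return None
        if rec.used:
            # Reuse detected: revoke every token in the family.
            self.revoked_families.add(rec.family_id)
            return None
        rec.used = True
        return self._issue(rec.family_id, rec.user_id)


store = FamilyStore()
a = store.login("alice")
b = store.refresh(a)              # step 3: legitimate client rotates A -> B
c = store.refresh(b)              # step 6: attacker redeems stolen B, gets C
client_retry = store.refresh(b)   # step 7: client presents B again -> reuse
attacker_next = store.refresh(c)  # step 9: C belongs to the revoked family
```

Both `client_retry` and `attacker_next` come back as `None`: the legitimate client must log in again, and the attacker's newest token is dead.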

The implementation

Server-side code (Python, simplified):

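import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Assumed to exist elsewhere in the application: db, Response,
# mint_access_token, log_security_event, alert_security_team.

@dataclass
class RefreshToken:
    """Simplified model."""
    token: str                    # the actual token (or a hash)
    family_id: str                # links siblings
    user_id: str
    used: bool                    # has this token been redeemed yet?
    used_at: Optional[datetime]   # when it was redeemed (for forensics); None until used
    issued_at: datetime
    expires_at: datetime
    revoked: bool                 # was this family revoked?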

# Helper: revoke entire family
def revoke_family(family_id):
    db.execute("UPDATE refresh_tokens SET revoked = TRUE WHERE family_id = ?",
               (family_id,))

# Login: issue tokens and start a new family
def login(user_id):
    family_id = secrets.token_urlsafe(32)
    rt = secrets.token_urlsafe(32)
    db.insert_refresh_token({
        "token": rt,
        "family_id": family_id,
        "user_id": user_id,
        "used": False,
        "issued_at": datetime.utcnow(),
        "expires_at": datetime.utcnow() + timedelta(days=7),
        "revoked": False,
    })
    access_token = mint_access_token(user_id)
    return {"access_token": access_token, "refresh_token": rt}

# Refresh handler
def refresh(presented_refresh_token):
    record = db.get_refresh_token(presented_refresh_token)
    
    # 1. Must exist
    if not record:
        return Response(401, "unknown token")
    
    # 2. Must not be revoked (family revocation)
    if record.revoked:
        return Response(401, "token family revoked")
    
    # 3. Must not be expired
    if datetime.utcnow() > record.expires_at:
        return Response(401, "expired")
    
    # 4. REUSE DETECTION: must not be already used
    if record.used:
        # This is the signal. Revoke the entire family.
        revoke_family(record.family_id)
        log_security_event("refresh_token_reuse_detected",
                          user=record.user_id, family=record.family_id)
        alert_security_team(record)
        return Response(401, "token reuse detected; please re-authenticate")
    
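    # 5. Claim the token and issue its successor in one transaction.
    # The "AND used = FALSE" guard makes the claim itself atomic: if two
    # requests race past the check above, only one UPDATE matches a row.
    # (Assumes db.execute returns a cursor-like object exposing rowcount.)
    with db.transaction():
        claimed = db.execute(
            "UPDATE refresh_tokens SET used = TRUE, used_at = ? "
            "WHERE token = ? AND used = FALSE",
            (datetime.utcnow(), presented_refresh_token))
        if claimed.rowcount == 0:
            # Lost the race to a concurrent redemption: treat as reuse.
            revoke_family(record.family_id)
            return Response(401, "token reuse detected; please re-authenticate")

        # Issue new tokens in the same family
        new_rt = secrets.token_urlsafe(32)
        db.insert_refresh_token({
            "token": new_rt,
            "family_id": record.family_id,   # SAME family
            "user_id": record.user_id,
            "used": False,
            "issued_at": datetime.utcnow(),
            "expires_at": datetime.utcnow() + timedelta(days=7),
            "revoked": False,
        })

    return {
        "access_token": mint_access_token(record.user_id),
        "refresh_token": new_rt,
    }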

The pattern in plain English:

Each login starts a token family.
Each refresh inside that family rotates the refresh token but stays in the family.
Any reuse of an already-used token in the family triggers family-wide revocation.

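The atomicity of the last step (marking used + issuing new) is important. The check that the token is still unused and the write that marks it used must happen as one atomic operation, for example a conditional UPDATE or a row lock. Without it, you get a race condition where two parties can redeem the same token at the same moment; both get tokens; the reuse is not detected.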
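
One way to get that atomicity is a conditional UPDATE whose row count tells the caller whether it won the claim. The sketch below uses Python's standard `sqlite3` module with an in-memory database; the table layout and the `claim` helper are illustrative assumptions, and other databases may need a row lock or a stricter isolation level instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE refresh_tokens ("
    "token TEXT PRIMARY KEY, family_id TEXT, used INTEGER DEFAULT 0)"
)
conn.execute(
    "INSERT INTO refresh_tokens (token, family_id) VALUES ('tok-A', 'fam-1')"
)
conn.commit()


def claim(conn, token):
    """Return True only for the one caller that flips used from 0 to 1."""
    with conn:  # commits on success, rolls back on exception
        cur = conn.execute(
            "UPDATE refresh_tokens SET used = 1 WHERE token = ? AND used = 0",
            (token,),
        )
        # rowcount is 1 if this call claimed the token, 0 if it was already used.
        return cur.rowcount == 1


first = claim(conn, "tok-A")   # wins the claim
second = claim(conn, "tok-A")  # sees zero rows updated: reuse
```

The losing caller never reads a stale `used` value, because the test and the write are the same statement.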

The grace period

A complication: legitimate clients sometimes retry refresh calls. Network blip, client bug, race condition between tabs. If two refresh attempts with the same token arrive within seconds, you do not want to falsely flag this as reuse.

Auth0 (and others) implement a small grace period: if the second use comes within N seconds (typically 5-10) of the first, treat it as a retry and return the same new token both times.

def refresh(presented_refresh_token):
    record = db.get_refresh_token(presented_refresh_token)
    
    # ... existing checks ...
    
    if record.used:
        # Check grace period
        elapsed = (datetime.utcnow() - record.used_at).total_seconds()
        if elapsed < 5:   # 5 second grace
            # Treat as retry; return the same new token.
            new_rt = db.get_child_token(record.family_id, after=record.used_at)
            return {
                "access_token": mint_access_token(record.user_id),
                "refresh_token": new_rt.token,
            }
        else:
            # Beyond grace; this is reuse.
            revoke_family(record.family_id)
            return Response(401, "token reuse detected")

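The grace period balances false positives (legitimate retries flagged as reuse) against false negatives (an attacker who replays the token inside the window is treated as a retry and receives the new token). An attacker who shows up after the window is caught. Five seconds is a reasonable default; tune based on your client's retry behavior.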
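
The decision itself is a single timestamp comparison. A minimal sketch, assuming a hypothetical `classify_second_use` helper and the five-second window used above:

```python
from datetime import datetime, timedelta, timezone

GRACE = timedelta(seconds=5)  # assumed default; tune to your clients


def classify_second_use(used_at, now, grace=GRACE):
    """Return "retry" inside the grace window, otherwise "reuse"."""
    return "retry" if now - used_at < grace else "reuse"


first_use = datetime(2026, 5, 15, 12, 0, 0, tzinfo=timezone.utc)
quick = classify_second_use(first_use, first_use + timedelta(seconds=2))
slow = classify_second_use(first_use, first_use + timedelta(seconds=30))
```

Here `quick` is classified as a retry and `slow` as reuse. Passing `now` in as an argument keeps the function easy to test; anyone who presents the token inside the window, attacker included, is treated as a retry, which is the reason to keep the window short.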

What to do when reuse is detected

The minimum: revoke the family. The user has to re-authenticate. The attacker is locked out.

Beyond the minimum:

1. Alert the user. Send an email: "we detected suspicious activity on your account; please log back in." Lets them know something happened.

2. Alert the security team. Reuse detection is a strong signal of compromise. Include: user ID, family ID, source IPs of both refresh attempts, timestamps. The security team can investigate further (was the user's machine compromised, was a session leaked).

3. Force MFA on next login. Even if the user normally has trusted-device skip-MFA, require fresh MFA after a reuse event.

4. Audit access during the suspected exposure. From the time of the first refresh of the suspect token until the reuse detection, what did the attacker do with the access tokens they had? Audit the API calls.

The reuse detection is the alarm; what comes after is the response.
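
As an illustration of the alert in item 2, here is one possible shape for the event, built with the standard library only. The `ReuseEvent` class and its field names are assumptions, and recording the IP of the first redemption requires storing it alongside `used_at`, which the simplified model above does not do.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ReuseEvent:
    """Hypothetical alert payload for a detected refresh-token reuse."""
    user_id: str
    family_id: str
    first_use_ip: str
    first_use_at: str  # ISO 8601 timestamp of the first redemption
    reuse_ip: str
    reuse_at: str      # ISO 8601 timestamp of the second attempt
    event: str = "refresh_token_reuse_detected"


def to_alert(event):
    """Serialize the event as JSON for a log pipeline or pager."""
    return json.dumps(asdict(event), sort_keys=True)


alert = to_alert(ReuseEvent(
    user_id="user-123",
    family_id="fam-1",
    first_use_ip="203.0.113.7",
    first_use_at="2026-05-15T12:00:00+00:00",
    reuse_ip="198.51.100.9",
    reuse_at="2026-05-15T12:04:10+00:00",
))
```

The two timestamps also bound the audit window from item 4: access tokens minted between them are the ones to review.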

The CI / multi-instance gotcha

A common production failure mode: refresh-token rotation is deployed, but multiple app instances behind a load balancer share the same refresh token. Two instances try to refresh simultaneously; the first wins; the second sees "used" and triggers a false reuse detection.

Workarounds:

Sticky sessions: each user always hits the same instance. The instance serializes refreshes for that user.
Centralized refresh logic: only one instance is responsible for refreshing; others ask it. Adds a hop.
Distributed lock: serialize refreshes for the same token across instances using Redis or similar. Adds a dependency.
Grace period: the simplest, with caveats. If the grace period is long enough to cover the load-balancer racing window, false positives go away.

Pick based on architecture. For most apps, a 5-second grace period plus single-flight per user (a per-user lock in the app) handles the large majority of cases.
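
Single-flight inside one process can be sketched with a lock and a "did someone already rotate?" check. `SingleFlightRefresher` is a hypothetical wrapper, and `fake_refresh` stands in for the real call to /token; across separate instances the lock would have to be the distributed kind listed above.

```python
import threading


class SingleFlightRefresher:
    """Hypothetical wrapper: at most one refresh call per stale token."""

    def __init__(self, do_refresh, initial_token):
        self._do_refresh = do_refresh  # callable: old_token -> new_token
        self._token = initial_token
        self._lock = threading.Lock()

    def refresh(self, seen_token):
        with self._lock:
            # Another thread already rotated past the token this caller saw,
            # so hand back the current token instead of redeeming again.
            if self._token != seen_token:
                return self._token
            self._token = self._do_refresh(seen_token)
            return self._token


calls = []


def fake_refresh(old_token):
    calls.append(old_token)  # record every call that reaches the "server"
    return old_token + "-next"


refresher = SingleFlightRefresher(fake_refresh, "rt-1")
threads = [
    threading.Thread(target=refresher.refresh, args=("rt-1",))
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Eight threads ask for a refresh of the same token, but only one call reaches `fake_refresh`; the other seven receive the already-rotated token.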

Common mistakes

Rotation without reuse detection. Half the answer; a stolen token chain keeps working for the attacker until it expires.
Reuse detection without grace period. Legitimate retries trigger false revocations; users get logged out unexpectedly.
No family concept; revoking only the specific reused token. Attacker still has the next token in the chain. Revoke the entire family.
Revocation only at the auth server; not propagated to JWT validators. If access tokens are JWTs validated offline, the family revocation does not affect them until they expire. Either keep access tokens short-lived (acceptable) or have the validators check revocation with the auth server (token introspection).
Refresh tokens stored in localStorage. XSS reads them. Use httpOnly cookies (web) or the OS keyring (native).
No alert on reuse detection. The strongest signal of compromise; not surfaced. Page the security team.
No audit log of refresh events. Cannot reconstruct the timeline of who refreshed when. Log every refresh.
Same refresh token reused for many tabs/devices. Each tab gets the same token; refresh from one invalidates for all. Either issue per-device tokens or accept that one tab's refresh logs out the others.
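
For the localStorage item, the web-side alternative is a cookie that page scripts cannot read. A minimal sketch using the standard library's `http.cookies`; the cookie name, the `/token` path scope, and the 7-day lifetime are assumptions chosen to match the examples in this post:

```python
from http.cookies import SimpleCookie


def refresh_cookie_header(token, max_age_seconds=7 * 24 * 3600):
    """Build a Set-Cookie value that keeps the refresh token away from page scripts."""
    cookie = SimpleCookie()
    cookie["refresh_token"] = token
    morsel = cookie["refresh_token"]
    morsel["httponly"] = True       # not readable via document.cookie
    morsel["secure"] = True         # only sent over HTTPS
    morsel["samesite"] = "Strict"   # not attached to cross-site requests
    morsel["path"] = "/token"       # only sent to the refresh endpoint
    morsel["max-age"] = max_age_seconds
    return morsel.OutputString()


header_value = refresh_cookie_header("example-token")
```

The returned string is what a server would send in a `Set-Cookie` response header. Most web frameworks expose the same attributes through their own response API.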

Quick reference: the rotation+detection checklist

For your refresh-token implementation:

[ ] Each refresh issues a new token; old token invalidated.
[ ] Tokens carry a family_id linking them to the original login.
[ ] Database table tracks: token, family_id, used, used_at, expires_at, revoked.
[ ] On refresh: if token already used, revoke entire family.
[ ] Grace period (5-10s) for legitimate retries.
[ ] Family-wide revocation on reuse detection.
[ ] Alert security team on reuse events.
[ ] Force MFA on user's next login after a reuse.
[ ] Refresh tokens in httpOnly cookies (web) or OS keyring (native).
[ ] Per-device tokens if multi-device support matters.
[ ] Atomic mark-used + issue-new (same DB transaction).
[ ] Logout endpoint revokes the family (not just the access token).
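
Two of the checklist items, the table columns and the logout behavior, can be sketched together with `sqlite3`. The column types, the index, and the `logout` helper are illustrative assumptions rather than a prescribed schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE refresh_tokens (
        token      TEXT PRIMARY KEY,
        family_id  TEXT NOT NULL,
        user_id    TEXT NOT NULL,
        used       INTEGER NOT NULL DEFAULT 0,
        used_at    TEXT,
        expires_at TEXT NOT NULL,
        revoked    INTEGER NOT NULL DEFAULT 0
    )
""")
# Family-wide revocation filters on family_id, so index it.
conn.execute("CREATE INDEX idx_refresh_family ON refresh_tokens (family_id)")
conn.executemany(
    "INSERT INTO refresh_tokens (token, family_id, user_id, used, expires_at) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        ("tok-A", "fam-1", "alice", 1, "2026-05-22T00:00:00+00:00"),
        ("tok-B", "fam-1", "alice", 0, "2026-05-22T00:00:00+00:00"),
        ("tok-X", "fam-2", "alice", 0, "2026-05-22T00:00:00+00:00"),
    ],
)
conn.commit()


def logout(conn, presented_token):
    """Revoke every token in the presented token's family; return rows revoked."""
    with conn:
        cur = conn.execute(
            "UPDATE refresh_tokens SET revoked = 1 "
            "WHERE family_id = "
            "(SELECT family_id FROM refresh_tokens WHERE token = ?)",
            (presented_token,),
        )
        return cur.rowcount


revoked_rows = logout(conn, "tok-B")
```

Logging out with `tok-B` revokes both tokens in `fam-1` and leaves `fam-2` (a second device, say) untouched.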

What Auth0 does (and what you should copy)

Auth0's documented behavior:

Refresh tokens rotate on every use by default.
Reuse detection enabled by default; configurable grace period.
Family-wide revocation on detected reuse.
Webhook event fires on reuse for security teams to consume.
"Inactivity expiration": if a refresh token is not used for N days, it expires independently of its absolute lifetime.

These are sensible defaults; copy them if you build your own.
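
The inactivity rule combines with the absolute lifetime as two independent deadlines. A sketch assuming hypothetical limits of 3 idle days and 7 absolute days (illustrative values, not Auth0's defaults) and assuming activity is tracked per family:

```python
from datetime import datetime, timedelta, timezone

IDLE_LIMIT = timedelta(days=3)      # assumed inactivity window
ABSOLUTE_LIMIT = timedelta(days=7)  # assumed absolute lifetime


def family_expired(started_at, last_refresh_at, now,
                   idle=IDLE_LIMIT, absolute=ABSOLUTE_LIMIT):
    """Expired if idle for too long OR older than the absolute lifetime."""
    last_activity = last_refresh_at or started_at
    return (now - last_activity > idle) or (now - started_at > absolute)


start = datetime(2026, 5, 1, tzinfo=timezone.utc)

# Refreshed yesterday, family is 5 days old: still valid.
active = family_expired(start, start + timedelta(days=4), start + timedelta(days=5))

# Never refreshed, 4 days since login: idle limit exceeded.
idle_out = family_expired(start, None, start + timedelta(days=4))

# Refreshed daily, but the family is 8 days old: absolute limit exceeded.
too_old = family_expired(start, start + timedelta(days=7), start + timedelta(days=8))
```

Only the first case is still valid. Regular use resets the idle clock but never extends the absolute deadline.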

The mental model

Refresh-token rotation alone is "a per-use credential." Reuse detection is "a per-use credential plus an alarm that fires when two parties try to use it." The combination is what catches stolen refresh tokens; the alarm is what makes refresh-token theft self-detecting.

The cost is one extra column in your refresh-token table (used), plus a check on every refresh, plus the family concept. The benefit is that the worst case for refresh-token theft becomes "the user gets logged out and has to re-authenticate" instead of "the attacker has long-term access."

Modern IdPs implement this pattern. If you use one (Okta, Auth0, Microsoft Entra ID, AWS Cognito), you get it for free; verify it is enabled. If you build your own auth, the implementation is a hundred lines of code; the protection is real.

Refresh tokens, OAuth, OIDC, and the broader token-lifecycle patterns are covered in depth in the Identity and Trust for DevOps Engineers course. The Kubernetes-specific token mechanics (ServiceAccount tokens, projected tokens, IRSA federation) are part of the Kubernetes Security course. Related reading: the OAuth flows overview shows where refresh tokens fit in each flow, and PKCE protects the authorization code that issues them in the first place.
