Fix: Expired ID Token (CODE_611) Reporting

Date: 2026-06-23
Related ADR: 0018. Recoverable Errors as GraphQLErrors
Original symptom: Relay settingsQuery (and other queries) failing with “Internal server error” while the server logged ApiError [CODE_611] and the error reached Sentry even though the condition is recoverable.

Summary

Expired Firebase ID tokens are a legitimate, client-recoverable condition. They must never produce per-occurrence Sentry errors. We still need to know when their rate becomes an anomaly.

We apply the pattern from ADR 0018:

  • Make the server always emit a GraphQLError (with the [CODE_611] message intact) for this family.
  • This re-uses the existing “GraphQLError = do not report” machinery.
  • Emit cheap breadcrumbs (and optional sampled info messages).
  • Rely on Sentry Alerts for rate/spike detection.

Root Cause

  1. authMiddleware (called only when the Postgres migration flags are on) runs verifyIdToken and, on failure, calls throwAsTokenApiError, which throws ApiError(CODE_611).
  2. createResolversContext wraps it and propagates.
  3. createGraphQLResolversContext in yogaHandler.ts correctly identifies it via isExpectedAuthError and does throw result.error, but the thrown value is still an ApiError.
  4. Yoga turns some non-GraphQLError context throws into responses that cause the outer handler in pages/api/graphql.ts, new Try(handler).report('Error in graphql api handler').unwrap(), to fire. The useSentry envelop plugin can also see the error.
  5. maskError only preserves the coded message for GraphQLError instances whose originalError is an ApiError. A plain ApiError can become “Internal server error” on some paths, which explains the symptom users saw before the client middleware fully covered these queries.

The client-side expiredTokenMiddleware (PR #6408) and _app.js detection were added precisely because the top-level QueryRenderer didn’t cover inner queries, but they still depend on the server putting the real code in errors[].message.

Changes Required

All changes are small and mechanical. They make the “expected auth” path behave exactly like the deactivated-account path (GraphQLError).

1. lib/graphql/yogaHandler.ts

Turn the expected-auth throw into an explicit GraphQLError, built with yoga’s createGraphQLError and without an originalError (this is the core of the pattern for the GraphQL layer).

  import { createYoga, createGraphQLError } from 'graphql-yoga';
  ...
  if (isExpectedAuthError(result.error)) {
-   throw result.error;
+   const message = result.error instanceof Error ? result.error.message : 'Authentication error';
+   throw createGraphQLError(message);
  }

Why createGraphQLError(message) and not new GraphQLError(message, { originalError: result.error }):

  • Attaching the ApiError as originalError makes yoga classify the error as unexpected and answer with HTTP 500. On HTTP ≥ 400, react-relay-network-modern throws inside convertResponse before expiredTokenMiddleware can read res.errors, so the middleware is bypassed and inner-query recovery is lost. A bare GraphQLError yields HTTP 200, which keeps the coded message and lets the client middleware resolve and act.
  • graphql ships dual CJS/ESM. A GraphQLError built from a different module instance fails yoga’s internal instanceof check and is re-wrapped (back to 500). createGraphQLError builds it from yoga’s own graphql instance.
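
The module-instance problem in the second bullet can be reproduced without any packages. The sketch below is only an illustration: loadGraphQLErrorClass is a hypothetical factory that stands in for loading the CJS and ESM copies of graphql, each of which defines its own GraphQLError class.

```typescript
// Each call returns a fresh class, the way two copies of a package would.
// loadGraphQLErrorClass is hypothetical and not part of graphql or yoga.
function loadGraphQLErrorClass() {
  return class GraphQLError extends Error {
    constructor(message: string) {
      super(message);
      this.name = 'GraphQLError';
    }
  };
}

const YogaGraphQLError = loadGraphQLErrorClass(); // the copy yoga checks against
const AppGraphQLError = loadGraphQLErrorClass(); // the copy application code imported

const fromApp = new AppGraphQLError('[CODE_611] ID token expired');

// Same name and shape, but a different constructor identity.
const passesYogaCheck = fromApp instanceof YogaGraphQLError; // false
const passesOwnCheck = fromApp instanceof AppGraphQLError; // true
const sameName = fromApp.name === 'GraphQLError'; // true
```

Because the instanceof check depends on constructor identity rather than shape, an error built from the wrong copy looks like a non-GraphQLError to the other copy, which is the situation createGraphQLError is used to avoid.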

Update the comment above it to reference ADR 0018.

2. lib/graphql/maskError.ts

Detect GraphQLErrors by shape, not instanceof, so an expected coded error coming from a different graphql module instance is not masked into “Internal server error”.

+ function isGraphQLErrorLike(error: unknown): error is GraphQLError {
+   return error instanceof GraphQLError || (error instanceof Error && error.name === 'GraphQLError');
+ }
  ...
- if (error instanceof GraphQLError) {
+ if (isGraphQLErrorLike(error)) {
    if (!error.originalError || error.originalError instanceof ApiError) {
      return error;
    }
    ...
  }
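
To make the effect of the shape check concrete, here is a minimal, self-contained sketch. ApiErrorStandIn, GraphQLErrorStandIn and maskErrorSketch are local stand-ins introduced only for this example; the project's maskError has more branches than shown, and the assumption here is simply that anything not explicitly preserved is masked.

```typescript
// Stand-ins so the sketch runs without the graphql package or the project's ApiError.
class ApiErrorStandIn extends Error {
  code: string;
  constructor(code: string, message: string) {
    super(`[${code}] ${message}`);
    this.name = 'ApiError';
    this.code = code;
  }
}

class GraphQLErrorStandIn extends Error {
  originalError?: Error;
  constructor(message: string, originalError?: Error) {
    super(message);
    this.name = 'GraphQLError';
    this.originalError = originalError;
  }
}

// Shape check: accepts errors from another module instance by name.
function isGraphQLErrorLike(error: unknown): error is GraphQLErrorStandIn {
  return (
    error instanceof GraphQLErrorStandIn ||
    (error instanceof Error && error.name === 'GraphQLError')
  );
}

// Simplified masking: keep coded GraphQL errors, mask everything else.
function maskErrorSketch(error: unknown): Error {
  if (isGraphQLErrorLike(error)) {
    if (!error.originalError || error.originalError instanceof ApiErrorStandIn) {
      return error;
    }
  }
  return new GraphQLErrorStandIn('Internal server error');
}

// An error that only looks like a GraphQLError (as if from another graphql copy).
const foreignError = new Error('[CODE_611] ID token expired');
foreignError.name = 'GraphQLError';
const keptMessage = maskErrorSketch(foreignError).message; // coded message survives
```

With an instanceof-only check, foreignError would fall through to the masked branch; with the shape check it is returned unchanged.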

3. server/graphql/helpers/authMiddleware.js (noise reduction)

Stop logging full stacks for expected cases (the server already has the pattern in set-auth.js).

  const decodedToken = await admin.auth().verifyIdToken(token).catch((err) => {
-   console.error('Error verifying Firebase token:', err);
+   if (!isExpectedAuthError(err)) {
+     console.error('Error verifying Firebase token:', err);
+   }
    throwAsTokenApiError(err);
  });

Import isExpectedAuthError (it is already re-exported from the same module that exports throwAsTokenApiError).
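
The conditional-logging idea can be exercised in isolation with an injected logger. In the sketch below, isExpectedAuthErrorSketch and verifyQuietly are hypothetical names, and the classifier only recognises the Firebase Admin error code auth/id-token-expired; the project's isExpectedAuthError may well cover more cases.

```typescript
// Hypothetical classifier: treats only Firebase's expired-token code as expected.
function isExpectedAuthErrorSketch(err: unknown): boolean {
  return (
    typeof err === 'object' &&
    err !== null &&
    (err as { code?: unknown }).code === 'auth/id-token-expired'
  );
}

type LogFn = (message: string, err: unknown) => void;

// Runs a verification step and logs only unexpected failures before rethrowing.
async function verifyQuietly<T>(
  verify: () => Promise<T>,
  logError: LogFn,
): Promise<T> {
  try {
    return await verify();
  } catch (err) {
    if (!isExpectedAuthErrorSketch(err)) {
      logError('Error verifying Firebase token:', err);
    }
    throw err; // the real middleware calls throwAsTokenApiError(err) at this point
  }
}
```

Passing the logger in as a parameter is a choice made for the sketch so the behaviour can be asserted without patching console.error.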

4. Tests

  • In _tests_/api/graphql-v2.test.ts (or a dedicated auth context test) add a case that mocks createResolversContext (or the inner auth) to reject with a token-expired ApiError and asserts:
    • The response contains a GraphQL error whose message includes CODE_611.
    • No Sentry.captureException was called for the expected path.
  • Update / add assertions in authMiddleware.test.ts and classifyTokenError.test.ts if the conditional logging changes behaviour they assert.
  • The existing expiredTokenMiddleware.test.ts must continue to pass (it only cares that the message containing the code reaches the network layer).
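
The shape of the first test case can be sketched without a test framework. Everything below is a hand-rolled stand-in: ApiErrorStub, GraphQLErrorStub, captureExceptionSpy and buildContextSketch are hypothetical, and the assumption that unexpected errors are reported and rethrown is made only to give the spy something to distinguish.

```typescript
// Hand-rolled stand-ins; none of these are the project's real modules.
class ApiErrorStub extends Error {}

class GraphQLErrorStub extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'GraphQLError';
  }
}

const capturedReports: unknown[] = []; // plays the role of a Sentry.captureException spy
const captureExceptionSpy = (err: unknown): void => {
  capturedReports.push(err);
};

// Assumed context step: expected auth errors become bare GraphQLErrors and are
// not reported; anything else is reported and rethrown.
async function buildContextSketch(
  createContext: () => Promise<object>,
  isExpected: (err: unknown) => boolean,
): Promise<object> {
  try {
    return await createContext();
  } catch (err) {
    if (isExpected(err)) {
      const message = err instanceof Error ? err.message : 'Authentication error';
      throw new GraphQLErrorStub(message);
    }
    captureExceptionSpy(err);
    throw err;
  }
}

// Runs the expired-token scenario and returns the values a test would assert on.
async function runExpiredTokenCase(): Promise<{ message: string; reports: number }> {
  const expired = new ApiErrorStub('[CODE_611] ID token expired');
  try {
    await buildContextSketch(
      () => Promise.reject(expired),
      (err) => err instanceof ApiErrorStub,
    );
    return { message: '', reports: capturedReports.length };
  } catch (err) {
    const message = err instanceof Error ? err.message : '';
    return { message, reports: capturedReports.length };
  }
}
```

In the real suite the two returned values correspond to the two assertions listed above: the message contains CODE_611, and the capture spy was never called.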

5. Client (already mostly done by #6408)

No functional change required. expiredTokenMiddleware.ts + the wiring in lib/createEnvironment/client.js already extract messages and call getCodeFromErrorMessage(message) === codes.tokenExpired. Because we now guarantee a GraphQLError with the real message, even context-level failures will be visible to every Relay operation.
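
For readers unfamiliar with the client check, the following is a rough sketch of how a coded message could be matched. getCodeFromErrorMessageSketch, codesSketch and hasExpiredTokenError are hypothetical re-implementations; the sketch assumes codes appear in messages as "[CODE_<digits>]", and the project's real helpers may differ.

```typescript
// Hypothetical codes map; only the expired-token entry is taken from this document.
const codesSketch = { tokenExpired: 'CODE_611' } as const;

// Extracts the first "[CODE_<digits>]" token from a message, or null if absent.
function getCodeFromErrorMessageSketch(message: string): string | null {
  const match = /\[(CODE_\d+)\]/.exec(message);
  return match?.[1] ?? null;
}

interface GraphQLResponseLike {
  errors?: ReadonlyArray<{ message?: string }>;
}

// True when any error in the response carries the expired-token code.
function hasExpiredTokenError(response: GraphQLResponseLike): boolean {
  return (response.errors ?? []).some(
    (e) => getCodeFromErrorMessageSketch(e.message ?? '') === codesSketch.tokenExpired,
  );
}
```

The sketch also shows why masking matters: a response whose only error message is "Internal server error" carries no code, so the check returns false and no recovery is triggered.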

Optionally add a breadcrumb in the middleware when it decides to redirect (this is a good place for the “info signal” part of the pattern):

if (messages.length > 0 && deps.isExpiredTokenError(messages)) {
  Sentry.addBreadcrumb({
    category: 'auth.token',
    message: 'id-token-expired',
    level: 'info',
  });
  deps.onExpiredToken();
}

6. Sentry configuration (one-time)

In sentry.server.config.js / sentry.client.config.js (or a shared util) you may add a stable fingerprint for the family:

beforeSend(event) {
  // existing token stripping ...
  const msg = event.message || event.exception?.values?.[0]?.value || '';
  if (/CODE_611|id-token-expired/.test(msg)) {
    event.fingerprint = ['recoverable', 'auth', 'expired-id-token'];
    // event.level = 'warning'; // if you want them visible but not "error" severity
  }
  return event;
}

Create the alert (do this in the Sentry UI, not code):

  • Scope: events whose fingerprint or message matches the above, or breadcrumbs with category:auth.token.
  • Condition examples:
    • “Number of events > 20 in the last 10 minutes”
    • “Spike detection: > 3× 1-hour baseline”
    • “Affected users/tenants > X”
  • Action: Slack / email / page the on-call or the auth/platform team.
  • Name it clearly: “Auth token expiry anomaly (CODE_611)”.

Document the change:

  • Add a one-sentence reference in docs/user-guides/dev-manual/token-refresh.md under the “Error handling” section: “Server-side the condition is delivered as a GraphQLError per ADR 0018 so it does not generate per-occurrence Sentry errors.”
  • Mention it in docs/development/project-description/07-auth-and-permissions.md (if it exists) or in the graphql chapter.

Verification Steps (run before merging)

  1. pnpm typecheck && pnpm flow:check
  2. pnpm lint
  3. pnpm test (especially the graphql-v2, authMiddleware, classifyTokenError, and expiredTokenMiddleware suites)
  4. Manually (or via a test helper):
    • Force an expired token into the idToken cookie (or make verifyIdToken reject with the firebase expired error).
    • Execute any Relay query (including ones that previously only went through context, e.g. settings-related).
    • Assert the response JSON contains errors[0].message with CODE_611.
    • Assert no new Sentry event of level error was created for this request.
  5. Check that the expiredTokenMiddleware still fires onExpiredToken (the redirect guard test is sufficient if it exercises the message path).
  6. (After deploy) confirm the new Sentry alert exists and the normal volume of token expiry produces only breadcrumbs/info, not issues.

Rollback

The change is almost entirely additive in the “expected” branch. Worst case: a mis-classified error becomes a GraphQLError and is not reported. Because we only do this for errors that already pass isExpectedAuthError, the blast radius is limited to the auth-expiry family. Reverting the createGraphQLError(...) wrapper in yogaHandler.ts restores the previous (noisy) behaviour.

References

  • ADR 0018 (the pattern this document applies)
  • lib/graphql/yogaHandler.ts
  • pages/api/graphql.ts
  • server/graphql/helpers/auth/classifyTokenError.ts
  • lib/auth/expiredTokenMiddleware.ts
  • Investigation notes: 2026-04-27-firebase-token-refresh-findings.md