Rotating JWT signing keys with JWKS
When a webhook provider signs its callbacks as JWTs, the hardest operational problem is not signing — it is changing the signing key without dropping a single in-flight delivery. This guide walks through rotating JWT signing keys behind a JWKS (JSON Web Key Set) endpoint, the pattern that lets you swap keys while old and new tokens both verify cleanly. It builds on the JWT-Based Webhook Auth reference and assumes you already verify incoming tokens as described in Validating JWT tokens in webhook payloads. The specific scenario here: you sign outbound webhook JWTs, consumers verify them, and you need to retire the current signing key on a schedule with zero verification failures during the cutover.
The mechanism that makes this safe is the kid (key id) header. Each JWT names the key that signed it; the verifier looks that key up in your published JWKS rather than hard-coding a single key. Rotation then becomes a matter of publishing the new key before you sign with it and retiring the old key after the last token signed with it has expired.
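To make the lookup concrete, here is a minimal sketch of what a verifier does with the kid before any signature check. The names readKid and selectKey and the trimmed-down PublicJwk shape are illustrative assumptions, not part of jose or of this guide's code; in practice the library performs this step for you.

```typescript
// Hypothetical, trimmed-down shapes; a real JWK carries more members (crv, x, y, ...).
interface PublicJwk { kid: string; kty: string; alg?: string; use?: string }
interface Jwks { keys: PublicJwk[] }

/** Read the kid from the protected header (first segment) of a compact JWT, without verifying it. */
function readKid(token: string): string | undefined {
  const [headerSegment] = token.split(".");
  if (!headerSegment) return undefined;
  try {
    const header = JSON.parse(Buffer.from(headerSegment, "base64url").toString("utf8"));
    return typeof header.kid === "string" ? header.kid : undefined;
  } catch {
    return undefined; // not valid base64url-encoded JSON
  }
}

/** Pick the published key whose kid matches the token header, if any. */
function selectKey(jwks: Jwks, token: string): PublicJwk | undefined {
  const kid = readKid(token);
  if (kid === undefined) return undefined;
  return jwks.keys.find((key) => key.kid === kid);
}
```

The header is untrusted input at this point: the kid only selects a candidate key, and the token is accepted only if the signature then verifies against that key.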
Prerequisites
- Runtime: Node.js 20+ with TypeScript.
- Library: jose (npm i jose) for signing, JWK export and thumbprint helpers, and the remote JWKS verifier.
- Key storage: A secret store (KMS, Vault, or env-injected PEMs) holding the active and previous private keys, each tagged with a kid.
- A publicly reachable JWKS URL, e.g. https://provider.example.com/.well-known/jwks.json, served over TLS.
- Familiarity with claim validation (iss, aud, exp) from the sibling JWT validation walkthrough.
Step 1: Generate a key pair with a stable kid
Generate an asymmetric key pair and assign it a unique, immutable kid. Use a content-derived id (a thumbprint) so the same key never gets two ids. ECDSA P-256 (ES256) keeps keys and signatures small; RSA (RS256) is fine where a consumer mandates it.
import { generateKeyPair, exportJWK, calculateJwkThumbprint } from 'jose';

export async function newSigningKey() {
  const { publicKey, privateKey } = await generateKeyPair('ES256', { extractable: true });
  const publicJwk = await exportJWK(publicKey);
  // Stable kid derived from the key material itself; never reuse a kid for a different key.
  const kid = await calculateJwkThumbprint(publicJwk);
  publicJwk.kid = kid;
  publicJwk.use = 'sig';
  publicJwk.alg = 'ES256';
  return { kid, privateKey, publicJwk };
}
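Engineering Note: Treat the kid as a permanent label. If you ever replace the underlying key but keep the kid, verifiers fail in two ways: those holding a cached JWKS keep the old public key under that id and reject tokens signed with the new key, while those that refresh pull the new public key and reject every token still signed by the original key.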
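For readers who want to see why a thumbprint is stable, the sketch below computes an RFC 7638 style SHA-256 thumbprint for an EC public JWK using only node:crypto. It is an illustration of the idea, assuming an EC key; the function name ecThumbprint and the EcPublicJwk type are hypothetical, and in real code calculateJwkThumbprint from jose is the better choice.

```typescript
import { createHash } from "node:crypto";

// Hypothetical type for illustration: the four required EC members plus anything else.
interface EcPublicJwk {
  kty: "EC";
  crv: string;
  x: string;
  y: string;
  [extra: string]: unknown;
}

function ecThumbprint(jwk: EcPublicJwk): string {
  // Only the required members, in lexicographic order, serialized without whitespace.
  // Optional members such as kid, use, and alg are deliberately left out of the hash.
  const canonical = JSON.stringify({ crv: jwk.crv, kty: jwk.kty, x: jwk.x, y: jwk.y });
  return createHash("sha256").update(canonical, "utf8").digest("base64url");
}
```

Because only the key material feeds the hash, re-exporting the same key with a different member order or extra metadata yields the same id, and a different key yields a different id.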
Step 2: Publish the JWKS endpoint
Serve every currently valid public key — the active key plus any previous keys still inside their token lifetime — as a JWKS document. Add cache headers so verifiers cache it but refresh within your overlap window.
import express from 'express';

// In practice these come from your key store, not module state.
const publishedKeys: Record<string, object> = {}; // kid -> public JWK

const app = express();

app.get('/.well-known/jwks.json', (_req, res) => {
  res.set('Cache-Control', 'public, max-age=600'); // 10 min; keep < overlap window
  res.json({ keys: Object.values(publishedKeys) });
});
Engineering Note: max-age is a contract with your consumers. The overlap window in Step 5 must be comfortably longer than this max-age plus your longest token lifetime, or a verifier holding a stale JWKS will see a kid it does not have.
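The arithmetic behind that contract can be written down as a small helper. This is a sketch under stated assumptions: the RotationPolicy shape, the rotationTimings name, and the safety margin (slack for clock skew and slow caches) are illustrative and do not come from the guide or from any library.

```typescript
// Hypothetical policy object; all durations are in milliseconds.
interface RotationPolicy {
  jwksMaxAgeMs: number;   // Cache-Control max-age on the JWKS response
  maxTokenTtlMs: number;  // longest exp - iat you ever issue
  safetyMarginMs: number; // assumed slack for clock skew and slow caches
}

function rotationTimings(policy: RotationPolicy) {
  // Wait this long after publishing the new key before signing with it.
  const warmUpMs = policy.jwksMaxAgeMs + policy.safetyMarginMs;
  // Wait this long after switching before retiring the old key.
  const drainMs = policy.maxTokenTtlMs + policy.safetyMarginMs;
  // Total time both keys must stay published.
  return { warmUpMs, drainMs, overlapMs: warmUpMs + drainMs };
}
```

With the values used in this guide (a 10-minute max-age and a 5-minute token lifetime) and an assumed 1-minute margin, the warm-up is 11 minutes, the drain is 6 minutes, and both keys stay published for 17 minutes.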
Step 3: Sign tokens with the kid header
Sign each outbound webhook JWT with the active private key and stamp the matching kid into the protected header so verifiers know which key to fetch.
import { SignJWT } from 'jose';
import type { KeyLike } from 'jose';

export async function signWebhookJwt(
  payload: Record<string, unknown>,
  activeKid: string,
  privateKey: KeyLike,
) {
  return new SignJWT(payload)
    .setProtectedHeader({ alg: 'ES256', kid: activeKid }) // kid is mandatory for rotation
    .setIssuer('https://provider.example.com')
    .setAudience('https://consumer.example.com/webhooks')
    .setIssuedAt()
    .setExpirationTime('5m') // short lifetimes shrink the overlap window you must hold
    .sign(privateKey);
}
Engineering Note: Keep exp short. The overlap window you hold the old key for in Step 5 is bounded below by the maximum token lifetime; a 5-minute exp lets you retire an old key minutes after switching, whereas a 24-hour token forces a 24-hour overlap.
Step 4: Verify with a caching JWKS resolver
On the consumer side, resolve the verification key by kid from a cached JWKS set. jose’s createRemoteJWKSet fetches the JWKS, caches it, and on a cache miss (an unknown kid) performs a single rate-limited refetch — which is exactly the behavior that makes overlapping rollover transparent.
import { jwtVerify, createRemoteJWKSet } from 'jose';

const JWKS = createRemoteJWKSet(
  new URL('https://provider.example.com/.well-known/jwks.json'),
  {
    cacheMaxAge: 600_000,     // cache for 10 min
    cooldownDuration: 30_000, // min 30s between forced refetches on unknown kid
  },
);

export async function verifyWebhookJwt(token: string) {
  // JWKS reads the kid from the token header and returns the matching key,
  // refetching once if the kid is unknown (a freshly rotated key).
  const { payload } = await jwtVerify(token, JWKS, {
    issuer: 'https://provider.example.com',
    audience: 'https://consumer.example.com/webhooks',
  });
  return payload;
}
Engineering Note: The cooldownDuration throttle is a deliberate DoS guard — without it, an attacker spraying tokens with random kid values would force unbounded refetches of your JWKS. Never disable it.
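The interaction between cache age and cooldown is easier to reason about with a toy model. The class below is a simplified illustration of the behavior described above, not jose's implementation: CooldownKeyCache is a hypothetical name, the loader is synchronous where a real resolver fetches over HTTPS, and the clock is injected so the timing can be tested.

```typescript
interface CachedJwk { kid: string }

class CooldownKeyCache {
  private keys = new Map<string, CachedJwk>();
  private lastFetchAt = Number.NEGATIVE_INFINITY;
  private readonly load: () => CachedJwk[]; // stand-in for the HTTPS fetch
  private readonly now: () => number;       // injected clock, in milliseconds
  private readonly cacheMaxAgeMs: number;
  private readonly cooldownMs: number;

  constructor(load: () => CachedJwk[], now: () => number, cacheMaxAgeMs = 600_000, cooldownMs = 30_000) {
    this.load = load;
    this.now = now;
    this.cacheMaxAgeMs = cacheMaxAgeMs;
    this.cooldownMs = cooldownMs;
  }

  resolve(kid: string): CachedJwk | undefined {
    const age = this.now() - this.lastFetchAt;
    if (age >= this.cacheMaxAgeMs) {
      this.refresh(); // cache expired (or never filled)
    } else if (!this.keys.has(kid) && age >= this.cooldownMs) {
      this.refresh(); // unknown kid and the cooldown has elapsed
    }
    // Still unknown after at most one refresh: the caller must reject, not retry.
    return this.keys.get(kid);
  }

  private refresh(): void {
    const next = new Map<string, CachedJwk>();
    for (const key of this.load()) next.set(key.kid, key);
    this.keys = next;
    this.lastFetchAt = this.now();
  }
}
```

In this model a burst of tokens with random kid values triggers at most one fetch per cooldown period, which is the property the note above asks you to preserve.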
Step 5: Execute an overlapping rollover
Rotate in four ordered moves, each respecting the cache windows above:
- Publish the new key (kid-2) into the JWKS alongside the old key (kid-1). Do not sign with it yet.
- Wait at least the JWKS max-age (here 10 minutes) so consumer caches contain kid-2 before any kid-2 token can arrive.
- Switch the active signing key to kid-2. New tokens carry kid-2; in-flight kid-1 tokens still verify because kid-1 is still published.
- Retire kid-1 from the JWKS only after the longest kid-1 token lifetime (exp) has fully elapsed past the switch. Then delete the kid-1 private key from the store.
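// Orchestration sketch: sequencing is the security property, not the code.
// KeyStore is a hypothetical interface over your secret store; newSigningKey comes from Step 1.
const JWKS_MAX_AGE_MS = 10 * 60_000;  // matches the Cache-Control max-age in Step 2
const MAX_TOKEN_TTL_MS = 5 * 60_000;  // matches the 5m expiration in Step 3
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function rollover(store: KeyStore) {
  const next = await newSigningKey();
  store.publishPublic(next.kid, next.publicJwk);        // 1. publish
  await sleep(JWKS_MAX_AGE_MS);                         // 2. let caches warm
  store.setActiveSigningKey(next.kid, next.privateKey); // 3. switch signing
  await sleep(MAX_TOKEN_TTL_MS);                        // 4. drain old tokens
  store.unpublishAndDestroy(store.previousKid);         // retire old key
}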
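The orchestration sketch leaves the store unspecified. One possible in-memory shape, useful mainly for tests, is shown below. It is an assumption about what such a store could look like: the class name InMemoryKeyStore and its guards are illustrative, and a production store would sit on top of KMS or Vault rather than process memory.

```typescript
type StoredJwk = { kid: string; [member: string]: unknown };

class InMemoryKeyStore<PrivateKey> {
  private published = new Map<string, StoredJwk>();
  private privateKeys = new Map<string, PrivateKey>();
  activeKid: string | undefined;
  previousKid: string | undefined;

  publishPublic(kid: string, jwk: StoredJwk): void {
    // kids are write-once: refuse to overwrite an id that is already published.
    if (this.published.has(kid)) throw new Error(`kid ${kid} is already published`);
    this.published.set(kid, jwk);
  }

  setActiveSigningKey(kid: string, privateKey: PrivateKey): void {
    // Enforce the ordering: a key must be published before it may sign.
    if (!this.published.has(kid)) throw new Error(`publish ${kid} before signing with it`);
    this.previousKid = this.activeKid;
    this.activeKid = kid;
    this.privateKeys.set(kid, privateKey);
  }

  unpublishAndDestroy(kid: string | undefined): void {
    if (kid === undefined) return; // first rotation: nothing to retire
    if (kid === this.activeKid) throw new Error("refusing to retire the active key");
    this.published.delete(kid);
    this.privateKeys.delete(kid);
    if (this.previousKid === kid) this.previousKid = undefined;
  }

  // The document served at /.well-known/jwks.json.
  jwks(): { keys: StoredJwk[] } {
    return { keys: Array.from(this.published.values()) };
  }
}
```

The guards encode the ordering rules in the store itself, so a bug in the orchestration code fails loudly instead of silently signing with an unpublished key or retiring the active one.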
Verification and testing
Confirm the rollover is correct before relying on it in production:
- JWKS shape: curl -s https://provider.example.com/.well-known/jwks.json | jq '.keys[].kid' should list both kids during the overlap window and only the new one after retirement.
- Cross-key verification test: sign one token with kid-1 and one with kid-2, then assert both pass verifyWebhookJwt while both keys are published.
- Stale-cache simulation: point the verifier at a JWKS snapshot that contains only kid-1, send a kid-2 token, and assert the resolver refetches once and then verifies. The refetch is a safety net, not a substitute for the warm-up wait in Step 5: a verifier still inside its cooldownDuration will not refetch and will reject the token.
- Retirement assertion: after retiring kid-1 and letting the verifier's cache refresh, send a freshly minted kid-1 token and assert verification fails. The old key must be genuinely gone, not merely unused.
curl -s https://provider.example.com/.well-known/jwks.json \
| jq '.keys | map({kid, alg, use})'
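The cross-key and retirement checks can also be exercised offline, without a running JWKS endpoint. The sketch below uses only node:crypto to mint and verify compact ES256 tokens against an in-memory map of published keys. It is an assumption-laden test harness, not the guide's verifier: mint and verifies are hypothetical helpers, and claim validation (iss, aud, exp) is intentionally omitted.

```typescript
import { generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

const b64u = (buf: Buffer): string => buf.toString("base64url");

// Mint a compact ES256 JWT. JWS uses the raw r||s signature form (ieee-p1363), not DER.
function mint(kid: string, privateKey: KeyObject, claims: object): string {
  const header = b64u(Buffer.from(JSON.stringify({ alg: "ES256", kid })));
  const payload = b64u(Buffer.from(JSON.stringify(claims)));
  const signingInput = `${header}.${payload}`;
  const signature = sign("sha256", Buffer.from(signingInput), { key: privateKey, dsaEncoding: "ieee-p1363" });
  return `${signingInput}.${b64u(signature)}`;
}

// Verify the signature against whichever keys are currently published (kid -> public key).
function verifies(token: string, published: Map<string, KeyObject>): boolean {
  const [h, p, s] = token.split(".");
  if (!h || !p || !s) return false;
  const { kid } = JSON.parse(Buffer.from(h, "base64url").toString("utf8"));
  const key = published.get(kid);
  if (!key) return false; // unknown or retired kid
  return verify("sha256", Buffer.from(`${h}.${p}`), { key, dsaEncoding: "ieee-p1363" }, Buffer.from(s, "base64url"));
}

const k1 = generateKeyPairSync("ec", { namedCurve: "P-256" });
const k2 = generateKeyPairSync("ec", { namedCurve: "P-256" });
const published = new Map<string, KeyObject>();
published.set("kid-1", k1.publicKey);
published.set("kid-2", k2.publicKey);

// Overlap window: tokens from both keys should verify.
const t1 = mint("kid-1", k1.privateKey, { iss: "https://provider.example.com" });
const t2 = mint("kid-2", k2.privateKey, { iss: "https://provider.example.com" });
console.log("overlap:", verifies(t1, published), verifies(t2, published));

// Retirement: a freshly minted kid-1 token should no longer verify.
published.delete("kid-1");
console.log("after retirement:", verifies(mint("kid-1", k1.privateKey, {}), published));
```

Running the same scenarios through verifyWebhookJwt against a staging JWKS endpoint is still worthwhile, since only that path exercises caching, cooldown, and claim checks.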
Failure modes and gotchas
- Signing before caches warm (the classic outage). If you skip the warm-up wait in Step 5 and switch signing immediately after publishing, every consumer still holding the old cached JWKS sees an unknown kid and, unless it refetches, rejects tokens until its cache expires. Always wait at least the JWKS max-age between publish and switch.
- Reusing a kid for a new key. A verifier that has cached the old public key under that kid will use the wrong key and fail every signature. kids are write-once; derive them from a thumbprint so reuse is impossible by accident.
- Overlap shorter than token lifetime. Retiring the old key while tokens signed with it are still within their exp strands those deliveries. The overlap must exceed your maximum token TTL; keep TTLs short so the overlap can be short too.
- Unbounded JWKS refetch on bad kid. A verifier that refetches the JWKS on every unknown kid with no cooldown is a free amplification target. Keep cooldownDuration set; an unknown kid after a single refetch is a hard rejection, not a retry loop.