Zero-downtime webhook secret rotation

Rotating a shared HMAC secret is the one routine security task most likely to cause an outage, because the naive approach — swap the secret on both ends at once — guarantees a window where the sender and receiver disagree and every delivery fails its signature check. This guide covers the dual-secret overlap pattern that eliminates that window: for a defined period the verifier accepts either the old or the new secret, so the sender can switch over at its own pace without a single rejected webhook. It extends the Key Rotation Strategies reference and complements the broader procedure in How to implement secure key rotation for webhooks. The scenario here is narrow and concrete: you verify inbound webhooks with an HMAC shared secret, and you need to replace that secret on a schedule with provably zero dropped deliveries.

The whole technique rests on one idea — never have exactly one valid secret during a change. You go from one secret, to two valid secrets (overlap), to one secret again. As long as the overlap window is open on the verifier before the sender starts using the new secret, and stays open until the sender has fully stopped using the old one, no in-flight request can land on a secret the verifier does not hold.

[Figure: Dual-secret overlap window. Along the time axis the verifier accepts old only, then old + new (overlap), then new only. The sender signs with the old secret until it switches, then with the new secret. Marked events: "sender switches" and "old retired".]
The verifier opens the dual-secret overlap before the sender switches and keeps it open until after the switch, so no request ever hits a secret the verifier lacks.
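
The timeline in the figure can also be written down as data and checked mechanically. The sketch below is illustrative only: `RotationState`, `timeline`, `isSafe`, and `naiveMidSwap` are hypothetical names, not part of any library. It encodes the invariant that at every stage the secret the sender signs with is one the verifier holds.

```typescript
// Hypothetical model of the rotation timeline; all names are illustrative.
type Secret = 'old' | 'new';

interface RotationState {
  label: string;
  verifierAccepts: Secret[]; // secrets the verifier currently holds
  senderSignsWith: Secret;   // secret the sender currently signs with
}

// The states the system passes through, in order.
export const timeline: RotationState[] = [
  { label: 'before rotation', verifierAccepts: ['old'], senderSignsWith: 'old' },
  { label: 'overlap opened', verifierAccepts: ['new', 'old'], senderSignsWith: 'old' },
  { label: 'sender switched', verifierAccepts: ['new', 'old'], senderSignsWith: 'new' },
  { label: 'old retired', verifierAccepts: ['new'], senderSignsWith: 'new' },
];

// Safe means: whatever the sender signs with, the verifier holds it.
export function isSafe(state: RotationState): boolean {
  return state.verifierAccepts.includes(state.senderSignsWith);
}

// A simultaneous swap that lands on the sender first produces this state.
export const naiveMidSwap: RotationState = {
  label: 'naive swap, sender updated first',
  verifierAccepts: ['old'],
  senderSignsWith: 'new',
};
```

Every entry in `timeline` satisfies `isSafe`, while `naiveMidSwap` does not; that unsafe state is the window the overlap removes.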

Prerequisites

Step 1: Add the new secret alongside the old

Generate a strong new secret and load it into the verifier in addition to the existing one. The verifier reads a list of secrets, ordered newest-first, rather than a single value.

import crypto from 'node:crypto';

// Generate a 32-byte secret; store it, do not log it.
export const newSecret = () => crypto.randomBytes(32).toString('hex');

// Verifier reads an ordered list. During normal operation this has one entry;
// during rotation it has two. e.g. WEBHOOK_SECRETS="<new>,<old>"
export function loadSecrets(): string[] {
  return (process.env.WEBHOOK_SECRETS ?? '')
    .split(',')
    .map((s) => s.trim())
    .filter(Boolean);
}

Engineering Note: Deploy this list-aware verifier before you ever add a second secret. The first rotation should not also be the first time the multi-secret code path runs in production.
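
As a rough illustration of Step 1, the value written to `WEBHOOK_SECRETS` can be derived from the current list instead of being edited by hand. `beginRotation` and `toEnvValue` below are hypothetical helpers, and the sketch assumes secrets are hex strings (as produced by `newSecret` above), so they never contain the comma used as a separator.

```typescript
import { randomBytes } from 'node:crypto';

// Hypothetical helper: given the verifier's current list (exactly one secret
// in normal operation), return the overlap list, ordered newest-first.
export function beginRotation(
  current: string[],
  fresh: string = randomBytes(32).toString('hex'),
): string[] {
  const [active, ...rest] = current;
  if (active === undefined) throw new Error('no active secret to rotate from');
  if (rest.length > 0) throw new Error('a rotation is already in progress');
  if (fresh === active) throw new Error('new secret must differ from the active one');
  return [fresh, active];
}

// Serialize to the comma-separated form that loadSecrets() parses.
export function toEnvValue(secrets: string[]): string {
  return secrets.join(',');
}
```

Refusing to start when two secrets are already loaded keeps a second rotation from silently dropping a secret that is still in use.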

Step 2: Verify against both secrets

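Accept the request if its signature validates against any secret in the list, using a constant-time comparison for each candidate. Keep the list ordered newest-first so that index 0 always identifies the current secret. Because the loop below evaluates every secret regardless of order, the ordering is a bookkeeping convention (useful when reporting which secret matched), not a performance optimization.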

import crypto from 'node:crypto';

function sign(rawBody: Buffer, secret: string): Buffer {
  return crypto.createHmac('sha256', secret).update(rawBody).digest();
}

export function verifyAgainstAny(rawBody: Buffer, signatureHex: string, secrets: string[]): boolean {
  const provided = Buffer.from(signatureHex, 'hex');
  let valid = false;
  for (const secret of secrets) {
    const expected = sign(rawBody, secret);
    // Evaluate every secret without early-return so timing does not reveal
    // which secret (if any) matched.
    const match =
      expected.length === provided.length && crypto.timingSafeEqual(expected, provided);
    valid = valid || match;
  }
  return valid;
}

Engineering Note: Do not return true on the first match. Iterating all secrets keeps total verification time independent of which secret matched, preserving the constant-time property across the whole list. With only two secrets the cost is negligible.
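
One detail worth checking in Step 2 is how the signature header is decoded. Node's `Buffer.from(value, 'hex')` stops at the first non-hex character instead of throwing, so a valid 64-character signature followed by junk still decodes to the correct 32 bytes. A stricter parse, sketched below with the hypothetical helper `parseSignatureHex`, rejects anything that is not exactly 64 hex characters before it reaches the comparison. It assumes HMAC-SHA256 with a bare hex digest and no scheme prefix.

```typescript
import { Buffer } from 'node:buffer';

// Exactly 64 hex characters: the hex encoding of a 32-byte SHA-256 HMAC.
const HEX_SHA256 = /^[0-9a-f]{64}$/i;

// Hypothetical helper: returns the decoded signature, or null if the input
// is not a well-formed hex digest. Callers treat null as a failed check.
export function parseSignatureHex(signatureHex: string): Buffer | null {
  if (!HEX_SHA256.test(signatureHex)) return null;
  return Buffer.from(signatureHex, 'hex');
}
```

The format check runs on attacker-supplied input only and does not touch any secret, so it does not need to be constant-time.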

Step 3: Switch the sender to the new secret

Once every verifier instance is running with both secrets loaded, confirmed from the deploy rollout status rather than assumed, update the signing side to sign exclusively with the new secret.

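import crypto from 'node:crypto';

// Signer: after the verifier fleet accepts both, sign only with the new secret.
function signOutbound(rawBody: Buffer): string {
  const secret = process.env.WEBHOOK_ACTIVE_SECRET; // now the new value
  // Fail loudly rather than sign with a missing or empty key.
  if (!secret) throw new Error('WEBHOOK_ACTIVE_SECRET is not set');
  return crypto.createHmac('sha256', secret).update(rawBody).digest('hex');
}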

Engineering Note: The ordering between Step 1 and Step 3 is the entire safety guarantee. The verifier must accept the new secret before the sender emits anything signed with it. If you reverse them, you reintroduce the exact failure window this pattern exists to remove.
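
The ordering requirement can be enforced with a gate that runs before the sender's configuration changes. The sketch below assumes each verifier instance can report short fingerprints of the secrets it has loaded, for example through a health endpoint. That reporting mechanism is an assumption of this example and not something the guide prescribes, and `fingerprint` and `safeToSwitch` are hypothetical names. Exposing a truncated SHA-256 fingerprint is only reasonable for high-entropy secrets such as the 32 random bytes generated in Step 1.

```typescript
import { createHash } from 'node:crypto';

// Short, non-reversible identifier for a high-entropy secret.
export function fingerprint(secret: string): string {
  return createHash('sha256').update(secret).digest('hex').slice(0, 12);
}

// reports: one entry per verifier instance, listing the fingerprints of the
// secrets that instance has loaded (hypothetical shape).
// Returns true only if every instance already accepts the new secret.
export function safeToSwitch(newSecret: string, reports: string[][]): boolean {
  const wanted = fingerprint(newSecret);
  // An empty report set means nothing was confirmed, so do not switch.
  return reports.length > 0 && reports.every((loaded) => loaded.includes(wanted));
}
```

A single instance that still holds only the old secret makes `safeToSwitch` return false, which matches the rule that the whole fleet must accept the new secret before anything is signed with it.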

Step 4: Retire the old secret

Leave the old secret in the verifier’s list until you are certain no traffic still uses it. At minimum, keep it longer than your delivery timeout plus retry budget, so that a request signed with the old secret before the switch and retried afterward cannot arrive after the verifier has dropped that secret. Then remove it from WEBHOOK_SECRETS and delete it from the secret store.
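
To make "delivery timeout plus retry budget" concrete, the lower bound can be computed from the sender's retry schedule. The sketch below assumes a simple model in which each attempt can take up to a fixed timeout and retries wait a fixed list of delays; `minOverlapMs` is a hypothetical helper, and the numbers in the usage note are made up for illustration.

```typescript
// Worst case for a request signed just before the switch: every attempt runs
// to its timeout, and every retry delay is waited out in full.
export function minOverlapMs(attemptTimeoutMs: number, retryDelaysMs: number[]): number {
  const attempts = retryDelaysMs.length + 1; // first try plus one per retry
  const waiting = retryDelaysMs.reduce((sum, delay) => sum + delay, 0);
  return attempts * attemptTimeoutMs + waiting;
}
```

With a 30-second timeout and retries after 1, 5, and 30 minutes, `minOverlapMs(30_000, [60_000, 300_000, 1_800_000])` gives 2,280,000 ms, or 38 minutes. Treat that as a floor and add headroom for queue backlog and clock skew.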

// After the overlap window: WEBHOOK_SECRETS shrinks back to a single value.
// Confirm zero verifications matched the old secret before removing it (see Verification).

Engineering Note: Tag each verification with which secret matched and emit it as a metric. Retire the old secret only after that metric has read zero for a full overlap window — let observed traffic, not a wall-clock guess, gate the retirement.
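
One way to tag verifications is to have the verifier return the index of the matching secret instead of a boolean. The sketch below is a variant of `verifyAgainstAny` under that assumption. `matchIndex`, `signHex`, and `logLine` are hypothetical names, the `webhook_rejected` event name is invented for the example, and the `webhook_verified` and `match_secret_index` fields mirror the log query in the Verification section.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Helper for tests and senders: hex HMAC-SHA256 of the raw body.
export function signHex(rawBody: Buffer, secret: string): string {
  return createHmac('sha256', secret).update(rawBody).digest('hex');
}

// Returns the index of the first secret that validates the signature, or -1.
// Like verifyAgainstAny, it evaluates every secret without returning early.
export function matchIndex(rawBody: Buffer, signatureHex: string, secrets: string[]): number {
  const provided = Buffer.from(signatureHex, 'hex');
  let matched = -1;
  secrets.forEach((secret, i) => {
    const expected = createHmac('sha256', secret).update(rawBody).digest();
    const ok = expected.length === provided.length && timingSafeEqual(expected, provided);
    if (ok && matched === -1) matched = i;
  });
  return matched;
}

// One JSON log line per verification. With a newest-first list, index 0 is
// the new secret and index 1 is the old one during the overlap.
export function logLine(index: number): string {
  return index >= 0
    ? JSON.stringify({ event: 'webhook_verified', match_secret_index: index })
    : JSON.stringify({ event: 'webhook_rejected' });
}
```

During the overlap, a steady stream of index 1 means the sender has not finished switching. Once only index 0 appears for a full overlap window, the old secret can be retired.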

Verification and testing

# Watch which secret is matching in production logs before retiring the old one.
grep 'webhook_verified' /var/log/app/webhook.log | jq '.match_secret_index' | sort | uniq -c

A minimal unit test:

import { verifyAgainstAny } from './verify';
import crypto from 'node:crypto';

const OLD = 'old-secret', NEW = 'new-secret';
const body = Buffer.from('{"event":"test"}');
const sig = (s: string) => crypto.createHmac('sha256', s).update(body).digest('hex');

test('accepts both during overlap', () => {
  expect(verifyAgainstAny(body, sig(OLD), [NEW, OLD])).toBe(true);
  expect(verifyAgainstAny(body, sig(NEW), [NEW, OLD])).toBe(true);
});
test('rejects retired secret', () => {
  expect(verifyAgainstAny(body, sig(OLD), [NEW])).toBe(false);
});

Failure modes and gotchas