At-least-once vs exactly-once webhook delivery trade-offs
Every webhook platform must pick a delivery semantic, and the choice cascades into retry design, consumer complexity, and operational cost. This comparison builds on message ordering guarantees and sits alongside implementing strict ordering for financial webhooks, which layers ordering on top of these semantics. For the full taxonomy of guarantees a platform can offer, see the cross-cutting treatment in delivery guarantee levels.
The short version: true exactly-once delivery over an unreliable network is impossible, because when an acknowledgement fails to arrive the sender cannot tell whether the message itself was lost or only the acknowledgement. What real systems ship is at-least-once delivery plus idempotent consumers, which together produce an effect that is observed exactly once. Understanding why is the difference between chasing an unattainable guarantee and building one that works.
What at-least-once delivery guarantees
At-least-once means the sender keeps retrying until it receives an acknowledgement, so the consumer is guaranteed to see every event — but possibly more than once. It is the default for essentially every commercial webhook provider because it is the only semantic that survives network partitions without dropping events. The sender’s contract is simple: store the event, deliver, and on any timeout or non-2xx response, retry with backoff.
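The sender's contract above can be sketched as a small retry loop. This is a minimal illustration, not any provider's implementation: `deliver_at_least_once`, `DeliveryFailed`, and the injected `send` callable (assumed to return an HTTP status code or raise `TimeoutError`) are hypothetical names, and durable storage of the event before the first attempt is omitted.

```python
import time
from typing import Callable


class DeliveryFailed(Exception):
    """Raised when every attempt is exhausted without an acknowledgement."""


def deliver_at_least_once(
    send: Callable[[dict], int],
    event: dict,
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Retry until the consumer answers 2xx; return the number of attempts used."""
    for attempt in range(1, max_attempts + 1):
        try:
            status = send(event)
        except TimeoutError:
            status = None  # no answer: the consumer may or may not have the event
        if status is not None and 200 <= status < 300:
            return attempt
        if attempt < max_attempts:
            # Exponential backoff with the defaults: 0.5s, 1s, 2s, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise DeliveryFailed(f"no ack after {max_attempts} attempts")
```

Note that a timeout is treated exactly like a failure response. The sender retries even though the consumer may already have processed the event, which is where duplicates come from.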
The burden moves to the consumer, which must tolerate duplicates. That is the whole reason idempotency matters in webhooks; with an idempotent consumer, at-least-once delivery yields effectively-once processing.
Why exactly-once delivery is unattainable
Exactly-once delivery would require the sender to know, with certainty, that the consumer received and durably stored each event exactly once: never zero times, never twice. This runs into the Two Generals' Problem. After the consumer processes an event it sends an ack, but if that ack is lost the sender cannot distinguish “consumer never got it” from “consumer got it, ack vanished.” Its only safe move is to retry, which produces a duplicate. No additional acknowledgement round solves this; the last message in any finite exchange can always be the one that is lost.
So exactly-once is achievable only as exactly-once processing — the effect happens once even though the message may arrive several times. That is delivered by at-least-once transport plus a consumer that discards duplicates.
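The lost-ack scenario can be reproduced in a few lines. The following is a toy in-memory simulation under stated assumptions: `Consumer`, `send_until_acked`, and the `ack_lost` list (which scripts whether each attempt's ack vanishes) are hypothetical names invented for illustration, and no real network is involved.

```python
class Consumer:
    def __init__(self) -> None:
        self.seen: set[str] = set()  # ids of events already applied
        self.balance = 0             # the side effect we want applied once
        self.deliveries = 0          # how many times a message arrived

    def receive(self, event: dict) -> bool:
        self.deliveries += 1
        if event["id"] not in self.seen:  # discard duplicates
            self.seen.add(event["id"])
            self.balance += event["amount"]
        return True  # ack either way, so the sender stops retrying


def send_until_acked(consumer: Consumer, event: dict, ack_lost: list[bool]) -> int:
    """ack_lost[i] says whether the ack for attempt i+1 vanishes in transit."""
    for attempt, lost in enumerate(ack_lost, start=1):
        acked = consumer.receive(event)
        if acked and not lost:
            return attempt
        # From the sender's side a lost ack looks like a lost message: retry.
    raise RuntimeError("every ack was lost")


consumer = Consumer()
attempts = send_until_acked(consumer, {"id": "evt_1", "amount": 100}, [True, False])
```

Here the first ack is lost, so the message is delivered twice (`consumer.deliveries` is 2) while the balance changes only once (`consumer.balance` is 100): at-least-once transport, exactly-once effect.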
At-most-once for contrast
At-most-once delivery fires each event once and never retries: zero duplicates, but events are silently lost on any failure. It suits pure telemetry or best-effort notifications where a missed event costs nothing. It is the wrong choice for anything stateful, which is why it rarely appears in webhook platforms beyond fire-and-forget pings.
Comparison
| Dimension | At-most-once | At-least-once | Exactly-once (effective) |
|---|---|---|---|
| Duplicates | Never | Possible | Suppressed by consumer |
| Lost events | Possible | Never (with retries) | Never |
| Sender complexity | Trivial (fire and forget) | Retry + backoff + DLQ | Same as at-least-once |
| Consumer complexity | None | Must tolerate duplicates | Must be idempotent + store keys |
| Network-partition behavior | Drops events | Survives, retries later | Survives, retries later |
| Realistic to implement | Yes | Yes | Only as effectively-once |
| Typical fit | Telemetry, pings | The default for webhooks | Money movement, account state |
Choosing for a workload
Default to at-least-once delivery with an idempotent consumer for almost every webhook integration. It is the only combination that neither drops events nor double-applies them, and it degrades gracefully under partitions. Add a durable idempotency key when the side effect is irreversible, and back retries with exponential backoff and a dead-letter queue so a permanently failing consumer does not stall the pipeline.
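One way to combine capped exponential backoff with a dead-letter cutoff is sketched below. The function names, the full-jitter strategy, and the default values (1 s base, 300 s cap, 8 attempts) are illustrative assumptions, not recommendations taken from any particular platform.

```python
import random
from typing import Optional, Tuple


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 300.0,
                  rng: Optional[random.Random] = None) -> float:
    """Full-jitter backoff: a random delay in [0, min(cap, base * 2**attempt)]."""
    ceiling = min(cap, base * 2 ** attempt)
    return (rng or random).uniform(0.0, ceiling)


def plan_retry(attempts_made: int, max_attempts: int = 8,
               rng: Optional[random.Random] = None) -> Tuple[str, float]:
    """Decide what happens after a failed attempt."""
    if attempts_made >= max_attempts:
        # Stop retrying and park the event for out-of-band handling.
        return ("dead-letter", 0.0)
    return ("retry", backoff_delay(attempts_made, rng=rng))
```

Jitter spreads retries out so that many events failing at the same moment do not all hit a recovering consumer at once, and the attempt ceiling keeps one permanently failing event from occupying the pipeline indefinitely.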
Pick at-most-once only for genuinely disposable signals. Never reach for it on anything that mutates state, because the lost-event mode is invisible until reconciliation surfaces the gap. When ordering also matters, layer the techniques from implementing strict ordering for financial webhooks on top of the at-least-once base.
Implementing effectively-once on the consumer
```python
import hashlib
import json

import psycopg2  # conn below is assumed to be a psycopg2 connection

# A uniqueness constraint on event_id (here, the primary key) turns
# at-least-once into effectively-once:
#   CREATE TABLE processed_events (
#       event_id     TEXT PRIMARY KEY,
#       processed_at TIMESTAMPTZ NOT NULL DEFAULT now()
#   );


def stable_event_id(headers: dict, payload: dict) -> str:
    # Prefer the provider's event id; it is stable across retries.
    if eid := headers.get("X-Event-Id"):
        return eid
    # Fallback: hash the canonical payload. This only deduplicates if the
    # payload is byte-for-byte identical on every retry.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()


def handle(conn, headers: dict, payload: dict) -> str:
    eid = stable_event_id(headers, payload)
    with conn:  # one transaction: claim + side effect commit together
        with conn.cursor() as cur:
            cur.execute(
                "INSERT INTO processed_events (event_id) VALUES (%s) "
                "ON CONFLICT (event_id) DO NOTHING RETURNING event_id",
                (eid,),
            )
            if cur.fetchone() is None:
                # The row already existed: this is a retry of a processed event.
                return "duplicate-ignored"
            apply_side_effect(cur, payload)  # runs in the same transaction
    return "processed"


def apply_side_effect(cur, payload: dict) -> None:
    ...  # the actual work, committed atomically with the claim
```
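Claiming the event_id and applying the side effect in one transaction is what makes it correct: either both commit or neither does, so a crash mid-flight is safely retried rather than leaving a claimed-but-unprocessed event. This atomicity only covers side effects written through the same database transaction; a call to an external system (for example, charging a card) is not rolled back with the claim and needs its own idempotency key.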
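The crash-then-retry behavior can be demonstrated without a Postgres instance. The sketch below swaps in the standard-library `sqlite3` module and `INSERT OR IGNORE` in place of psycopg2 and `ON CONFLICT ... RETURNING`; the `accounts` table, the `handle_once` function, and the `fail` flag that simulates a crash are all hypothetical additions for illustration.

```python
import sqlite3


def handle_once(conn: sqlite3.Connection, event_id: str, amount: int,
                fail: bool = False) -> str:
    with conn:  # commits on success, rolls back if the block raises
        cur = conn.execute(
            "INSERT OR IGNORE INTO processed_events (event_id) VALUES (?)",
            (event_id,),
        )
        if cur.rowcount == 0:
            return "duplicate-ignored"  # the claim already existed
        if fail:
            raise RuntimeError("simulated crash before the side effect finished")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = 1", (amount,)
        )
    return "processed"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE processed_events (event_id TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.execute("INSERT INTO accounts VALUES (1, 0)")
conn.commit()
```

Calling `handle_once(conn, "evt_1", 100, fail=True)` raises and rolls back the claim along with everything else, so a later retry of `evt_1` is processed normally and any further retry is ignored.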
Verification
```python
def test_duplicate_is_ignored(conn):
    headers = {"X-Event-Id": "evt_123"}
    payload = {"amount": 100}
    assert handle(conn, headers, payload) == "processed"
    # An identical retry must not re-run the side effect.
    assert handle(conn, headers, payload) == "duplicate-ignored"
```
You can also force the duplicate path with curl by replaying the same delivery twice; the second call should be acknowledged with a 200 but leave state unchanged:
```sh
BODY='{"amount":100}'
for i in 1 2; do
  curl -fsS -X POST localhost:8000/webhooks \
    -H 'X-Event-Id: evt_123' -H 'content-type: application/json' --data "$BODY"
done
# Expect one state change in the database, two 200 responses.
```
Failure modes and gotchas
- Claiming and applying in separate transactions. If you insert the event_id, commit, then crash before the side effect, the retry sees the row and skips the work, so the event is lost despite at-least-once transport. Keep both in one transaction.
- Treating “exactly-once” as a transport setting. No retry tuning or ack scheme buys exactly-once on the wire. Chasing it wastes effort; invest in consumer idempotency instead.
- Using a non-stable event id. Deriving the id from a field the provider rewrites on retry (a timestamp, a delivery attempt counter) defeats deduplication entirely. Use the provider’s event id, or hash only retry-stable fields.
- Unbounded retries with no dead-letter path. At-least-once without a ceiling will retry a poison event forever, blocking the queue. Cap attempts and route exhausted events to a dead-letter queue for out-of-band handling.
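For the non-stable id gotcha, one possible approach when no provider event id is available is to hash only the fields that stay the same across retries. The field names in `VOLATILE_FIELDS` are hypothetical; which fields a given provider rewrites per attempt has to be checked against its documentation.

```python
import hashlib
import json

# Hypothetical names for fields a provider might rewrite on each attempt.
VOLATILE_FIELDS = frozenset({"delivery_attempt", "sent_at"})


def retry_stable_id(payload: dict, volatile: frozenset = VOLATILE_FIELDS) -> str:
    """Hash the payload with per-attempt fields removed."""
    stable = {k: v for k, v in payload.items() if k not in volatile}
    canonical = json.dumps(stable, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()


first = {"order": "ord_7", "amount": 100, "delivery_attempt": 1, "sent_at": "10:00:00"}
retry = {"order": "ord_7", "amount": 100, "delivery_attempt": 2, "sent_at": "10:00:30"}
other = {"order": "ord_8", "amount": 100, "delivery_attempt": 1, "sent_at": "10:00:00"}
```

With these inputs `first` and `retry` hash to the same id while `other` does not. The trade-off is that two genuinely distinct events with identical stable fields would also collide, so a provider-issued event id remains the better key when one exists.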