Applying Backpressure to Webhook Consumers
This guide implements consumer-side backpressure for a single Python webhook receiver under load, the concrete companion to Webhook Rate Limiting & Backpressure. The scenario: a FastAPI endpoint accepts events, hands them to background workers that do slow processing (a database write, a downstream call), and during a traffic spike the workers fall behind. Without backpressure the receiver either runs out of memory buffering or silently drops events. The fix is to make saturation explicit — a bounded queue, a 429 Retry-After response that tells the sender to slow down, and pause/resume control with hysteresis. Because the producer must honor that 429 to break the loop, pair this with sender-side protection such as per-endpoint circuit breaker state machines.
Prerequisites
- Python 3.11+ with
asyncio, a FastAPI/Starlette receiver, andhttpxfor the test sender. - A receiver that already does its real work in background workers off the request path (synchronous processing inside the handler cannot be paused independently of intake).
- Senders that honor HTTP 429 and
Retry-After. If you control the dispatcher, see Webhook Rate Limiting & Backpressure for the producer side. - Idempotent processing, because backpressure relies on the sender retrying rejected deliveries.
Step 1: Bound the work queue
Replace any unbounded buffer with a fixed-size asyncio.Queue. The size is a deliberate memory budget: roughly the number of in-flight events you can hold without risking the process. A bounded queue turns “we are overloaded” from a silent memory leak into an observable, actionable condition.
import asyncio
QUEUE_MAX = 1000
HIGH_WATER = int(QUEUE_MAX * 0.8) # start shedding
LOW_WATER = int(QUEUE_MAX * 0.4) # resume accepting
queue: asyncio.Queue = asyncio.Queue(maxsize=QUEUE_MAX)
Step 2: Signal backpressure with 429 and Retry-After
When the queue is at or above the high-water mark, reject new deliveries with 429 Too Many Requests and a Retry-After header. This is the entire point of consumer backpressure: you are telling the sender to come back later instead of accepting work you cannot process.
from fastapi import FastAPI, Request, Response
app = FastAPI()
@app.post("/webhooks")
async def receive(request: Request):
if queue.qsize() >= HIGH_WATER or not accepting.is_set():
# Suggest a wait proportional to the backlog drain time.
retry_after = max(1, queue.qsize() // max(1, DRAIN_PER_SEC))
return Response(
status_code=429,
headers={"Retry-After": str(retry_after)},
)
event = await request.json()
try:
queue.put_nowait(event) # never block the request on a full queue
except asyncio.QueueFull:
return Response(status_code=429, headers={"Retry-After": "5"})
return Response(status_code=202)
Use put_nowait rather than await queue.put(...): blocking the HTTP handler on a full queue holds a connection open and converts backpressure into latency. Reject fast and let the sender retry.
Step 3: Pause and resume intake with hysteresis
A single threshold causes flapping — the receiver toggles 202/429 on every event near the boundary. Use two thresholds: stop accepting at the high-water mark, and only resume once the queue drains below the low-water mark. An asyncio.Event makes the accepting/paused state explicit and cheap to check.
accepting = asyncio.Event()
accepting.set() # start in the accepting state
DRAIN_PER_SEC = 50 # measured worker throughput
async def watermark_monitor():
while True:
depth = queue.qsize()
if depth >= HIGH_WATER and accepting.is_set():
accepting.clear() # pause intake
elif depth <= LOW_WATER and not accepting.is_set():
accepting.set() # resume intake
await asyncio.sleep(0.25)
async def worker():
while True:
event = await queue.get()
try:
await process(event) # the slow work
finally:
queue.task_done()
@app.on_event("startup")
async def startup():
asyncio.create_task(watermark_monitor())
for _ in range(8): # concurrency cap = worker count
asyncio.create_task(worker())
The gap between HIGH_WATER and LOW_WATER is the hysteresis band; widen it if you observe rapid accept/reject oscillation.
Step 4: Detect slow consumers from drain rate
Backpressure is reactive; detection lets you act before the queue saturates. Track per-event dwell time (enqueue-to-dequeue) and processing latency. A rising dwell time with steady intake means workers are falling behind — the leading indicator of an impending 429 storm.
import time
dwell_samples: list[float] = []
async def worker_instrumented():
while True:
event = await queue.get()
dwell = time.monotonic() - event["_enqueued_at"]
dwell_samples.append(dwell)
try:
await process(event)
finally:
queue.task_done()
def health_snapshot() -> dict:
recent = dwell_samples[-100:] or [0.0]
return {
"queue_depth": queue.qsize(),
"accepting": accepting.is_set(),
"p95_dwell_seconds": sorted(recent)[int(len(recent) * 0.95)],
}
Stamp _enqueued_at = time.monotonic() when you enqueue in Step 2. Export health_snapshot() to your metrics pipeline and alert on p95_dwell_seconds trending up.
Verification / Testing
Drive the receiver faster than it drains and assert it sheds load instead of growing without bound.
import asyncio, httpx
async def flood():
async with httpx.AsyncClient(base_url="http://localhost:8000") as c:
codes = await asyncio.gather(*[
c.post("/webhooks", json={"i": i}) for i in range(5000)
], return_exceptions=True)
statuses = [r.status_code for r in codes if hasattr(r, "status_code")]
assert 429 in statuses, "receiver never applied backpressure"
assert 202 in statuses, "receiver rejected everything"
print("202:", statuses.count(202), "429:", statuses.count(429))
asyncio.run(flood())
A passing test shows a mix of 202 and 429, and the receiver’s memory stays flat throughout the flood — confirming the bounded queue, not the heap, absorbs the spike.
Failure modes and gotchas
- Blocking the handler on a full queue — using
await queue.put()instead ofput_nowaitholds the request open under load, turning backpressure into mounting latency and connection exhaustion. Reject with 429 immediately. - Single threshold flapping — without the high/low-water hysteresis band, the receiver oscillates between accepting and rejecting near the boundary, producing noisy 429s. Widen the gap between the marks.
- Senders ignoring 429 — if the dispatcher retries instantly regardless of
Retry-After, your 429s amplify load instead of relieving it. Confirm sender behavior, and pair with the producer-side controls in Webhook Rate Limiting & Backpressure. - Non-idempotent processing — backpressure depends on rejected deliveries being retried, which means some events arrive twice. Deduplicate downstream so retries are safe.
Related
- per-endpoint circuit breaker state machines — sender-side protection that complements consumer backpressure.
- Webhook Rate Limiting & Backpressure — the producer-side rate limiting and feedback model behind this guide.