Why Webhook Handlers Should Return Fast and Process Later
A common mistake in webhook processing is doing too much work inside the request handler.
The provider sends an event. Your endpoint receives it. Then the handler verifies the event, calls a database, calls external APIs, sends an email, generates a report, and maybe calls an LLM. Only after all that does it return 200 OK.
This is fragile.
Webhook handlers should usually return fast and process later.
The webhook provider only needs acknowledgement
Most webhook providers need to know one thing first:
Did you receive the event?
They do not necessarily need your entire downstream workflow to complete before you respond.
A better handler flow is:
receive event
→ verify signature
→ store event or start job
→ return 200
→ process later
The provider gets fast acknowledgement. Your system gets time to complete work safely.
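The "verify signature" step above can be sketched in a few lines. This is a simplified HMAC-SHA256 check in the spirit of what providers like Stripe use; real schemes usually also include a timestamp to prevent replay, so treat the function name and wire format here as illustrative:

```python
import hashlib
import hmac

def verify_signature(payload: bytes, received_sig: str, secret: str) -> bool:
    """Return True if received_sig matches an HMAC-SHA256 of the raw payload."""
    expected = hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()
    # compare_digest does a constant-time comparison to avoid timing leaks
    return hmac.compare_digest(expected, received_sig)
```

Note that verification must run over the raw request bytes, not a re-serialized copy, since any formatting difference changes the digest.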
What goes wrong when handlers are slow
Provider timeout
If your endpoint takes too long, the provider may assume delivery failed. It can retry the same event, causing duplicate processing.
Duplicate side effects
A duplicate webhook can create duplicate records, send duplicate emails, or trigger duplicate billing logic unless your processing is idempotent.
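The standard defense is to key every side effect on the provider's event ID. A minimal sketch, using an in-memory set where production code would use a database table with a unique index:

```python
# In production this would be a persistent store with a unique constraint,
# so the dedup check survives restarts and works across processes.
processed_ids: set = set()

def handle_once(event_id: str, apply_side_effect) -> bool:
    """Apply the side effect only the first time an event ID is seen."""
    if event_id in processed_ids:
        return False  # duplicate delivery: acknowledge, but do nothing
    apply_side_effect()
    processed_ids.add(event_id)
    return True
```

With this in place, a retried delivery of the same event becomes a harmless no-op instead of a second charge or a second email.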
Bad user experience
A user may complete an action in one system but see no result in yours because the handler timed out halfway. Slow handlers are also harder to debug, since a timeout mid-chain leaves it unclear which steps actually ran.
Fragile dependency chain
If your webhook response depends on five downstream services, any one of them can make the whole provider delivery fail.
The fast acknowledgement pattern
The handler should do only the minimum required work:
- parse the request;
- verify the signature;
- validate the event type;
- store the event or create a job;
- return success.
Everything else can happen in a background job.
Provider request
→ Webhook route
→ Event stored
→ 200 OK
Background job
→ Business logic
→ External APIs
→ Notifications
→ Logs
This pattern makes failures easier to manage because provider delivery and internal processing are separate.
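The whole fast-acknowledgement handler can be sketched framework-free. The event type `"order.paid"` and the use of `queue.Queue` as a stand-in for a real job backend are assumptions for illustration:

```python
import json
import queue

# Stand-in for a real job backend (database table, task queue, etc.)
job_queue: queue.Queue = queue.Queue()

def webhook_handler(body: bytes, signature_ok: bool) -> int:
    """Do only the minimum work, then acknowledge. Returns an HTTP status code."""
    if not signature_ok:
        return 400  # reject unverifiable requests before touching the payload
    event = json.loads(body)
    if event.get("type") != "order.paid":  # hypothetical event type
        return 200  # acknowledge and ignore types we do not handle
    # Store/enqueue and return; all slow work happens off the queue later.
    job_queue.put({"event_id": event["id"], "payload": event})
    return 200
```

Nothing in the handler calls an external service, so its latency is bounded by parsing and one enqueue.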
Example: payment webhook
Bad pattern:
Stripe event
→ verify
→ create order
→ generate invoice
→ send email
→ update CRM
→ return 200
Better pattern:
Stripe event
→ verify
→ store event.id
→ start fulfill-order job
→ return 200
fulfill-order job
→ create order
→ generate invoice
→ send email
→ update CRM
→ mark event processed
If the CRM call fails, the provider does not need to resend the event. Your internal job can retry.
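An internal retry for a step like the CRM update might look like the sketch below: exponential backoff around a single job step, entirely decoupled from provider delivery. The helper name and defaults are illustrative:

```python
import time

def run_with_retry(step, attempts: int = 3, base_delay: float = 0.0):
    """Retry one job step with exponential backoff; re-raise after the last attempt."""
    for attempt in range(attempts):
        try:
            return step()
        except Exception:
            if attempt == attempts - 1:
                raise  # let the job runner mark the job failed
            time.sleep(base_delay * (2 ** attempt))
```

Because the event is already stored, a step that exhausts its retries can simply leave the job in a failed state for later inspection, without the provider ever resending anything.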
Example: AI webhook workflow
Suppose a webhook starts an AI workflow:
new support ticket
→ classify urgency
→ summarize conversation
→ suggest reply
→ notify team
This can be slow. It may involve multiple model calls and external API requests. Running it directly inside the webhook request is risky.
A better flow:
/support/webhook
→ verify event
→ create classification job
→ return 200
classification job
→ retrieve ticket
→ call LLM
→ store result
→ notify Slack
The webhook endpoint stays fast. The AI work becomes observable.
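The classification job above can be sketched with the slow dependencies injected, which is also what makes it easy to test. Here `classify` stands in for the LLM call and `notify` for the Slack client; all names are hypothetical:

```python
def run_classification_job(ticket: dict, classify, notify) -> dict:
    """Run the slow AI work off the request path, one inspectable step at a time."""
    urgency = classify(ticket["text"])          # slow model call, retryable
    result = {"ticket_id": ticket["id"], "urgency": urgency}
    notify(f"ticket {ticket['id']} classified as {urgency}")  # e.g. Slack
    return result                               # stored result, tied to the event
```

Because the job is an ordinary function, each step can be retried, logged, and timed independently of the webhook request that triggered it.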
Why this improves retries
Provider retries and internal retries should not be the same mechanism.
Provider retry means:
the provider could not deliver the event
Internal retry means:
your system received the event but processing failed
Those are different problems. Mixing them creates confusion.
Fast acknowledgement lets your platform take ownership of the event after receipt.
What to store before returning
Before returning 200, store enough data to process later:
- provider name;
- event ID;
- event type;
- raw or normalized payload;
- received timestamp;
- signature verification result;
- processing status;
- tenant or account context.
You do not always need to store the entire raw payload forever, but you need enough context to retry and debug.
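The fields above map naturally onto a small record type. A sketch, with field names chosen for illustration:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class WebhookEvent:
    """Minimal record to persist before returning 200."""
    provider: str          # e.g. "stripe"
    event_id: str          # provider's ID, used for idempotency
    event_type: str
    payload: dict          # raw or normalized payload
    signature_valid: bool
    tenant_id: str         # tenant or account context
    status: str = "received"  # received -> processing -> processed / failed
    received_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
```

A unique constraint on `(provider, event_id)` in the backing table gives you duplicate detection for free.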
Where Inquir Compute fits
Inquir Compute supports this pattern with API routes and background jobs or pipelines.
You can expose a webhook route:
POST /webhooks/stripe
Then move slow work into a job:
fulfill-order
send-notification
run-ai-classification
sync-customer
This keeps the public endpoint responsive while giving the internal workflow logs and execution history.
When direct processing is acceptable
Direct processing can be acceptable when the work is very small and safe:
- log an internal event;
- update a lightweight counter;
- trigger a fire-and-forget notification;
- handle a low-risk internal webhook.
But for payments, provisioning, AI workflows, external API chains, and customer-facing actions, process later.
Checklist
A good webhook handler should answer:
- Can it return quickly?
- Is the event verified?
- Is there an event ID for idempotency?
- Is slow work moved to a job?
- Can the job be retried safely?
- Are logs tied to the event ID?
- Can duplicate events be ignored?
Conclusion
Webhook handlers should be boring and fast. Their job is to receive, verify, record, and acknowledge.
The real work belongs in background jobs or pipelines where it can be retried, logged, and inspected.
Fast acknowledgement reduces duplicate events, improves reliability, and makes webhook systems easier to operate.