Inquir Compute · webhooks

Webhook retry platform: survive provider redeliveries

Stripe and Slack automatically retry failed webhook deliveries, for up to three days in Stripe's case, and GitHub lets you redeliver failed ones after the fact. A webhook retry platform needs three layers: reject forgeries fast, ACK inside provider timeouts, and retry downstream work without duplicating side effects. Inquir gives you idempotency, pipeline retries, and per-delivery traces in one serverless stack.

Last updated: 2026-06-23

Direct answer

The webhook function verifies signatures, writes the provider event ID to durable storage, enqueues work to a pipeline, and returns 200 without waiting for that work to run. Provider retries hit the idempotency check and return 200 without re-processing.

When it fits

  • SaaS webhooks from Stripe, GitHub, Slack, Shopify, or HubSpot with aggressive retry policies
  • Handlers where downstream work can fail after you already returned 200 to the provider

Tradeoffs

  • An in-memory set of seen event IDs is lost on restart and is not shared across horizontally scaled instances. You need durable idempotency keys written before any side effect.
  • Retrying the entire webhook handler on downstream failure re-runs signature verification and risks double-processing if the first attempt partially succeeded.

Two kinds of webhook retries—and both hurt if you ignore them

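Provider retries: Stripe retries for up to 72 hours with exponential backoff when your endpoint returns non-2xx or times out. Slack retries a failed event up to three times. GitHub does not retry automatically, but failed deliveries can be redelivered manually or through the REST API for 3 days. Each redelivery is a new HTTP request with the same event ID, so you must detect duplicates before mutating state.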

Downstream retries: your handler ACKed fast, but the fulfillment API failed. Without a retry platform, that failure is silent—or you manually replay from logs. Pipeline step retries solve this without re-triggering the provider.

Why ad-hoc retry logic fails at scale

An in-memory set of seen event IDs is lost on restart and is not shared across horizontally scaled instances. You need durable idempotency keys written before any side effect.

Retrying the entire webhook handler on downstream failure re-runs signature verification and risks double-processing if the first attempt partially succeeded.
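A minimal sketch of a durable idempotency check, assuming a PostgreSQL-style table named webhook_events with a unique event_id column. The table, its columns, and the injected query function are illustrative assumptions, not part of Inquir's API. The query function is expected to behave like node-postgres's pool.query and resolve to an object with a rowCount.

```javascript
// Hypothetical helper. `query(text, values)` is assumed to run one SQL statement
// and resolve to { rowCount }, as node-postgres's pool.query does.
async function tryInsertWebhookEvent(query, eventId, eventType) {
  const result = await query(
    `INSERT INTO webhook_events (event_id, event_type)
     VALUES ($1, $2)
     ON CONFLICT (event_id) DO NOTHING`,
    [eventId, eventType],
  );
  // One inserted row means a first delivery; zero rows means a redelivery.
  return result.rowCount === 1;
}
```

Because the unique constraint lives in the database, two instances that receive the same redelivery at the same moment cannot both see the event as new.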

Idempotent ingress + retriable pipelines

The webhook function verifies signatures, writes the provider event ID to durable storage, enqueues work to a pipeline, and returns 200 without waiting for that work to run. Provider retries hit the idempotency check and return 200 without re-processing.

Pipeline steps retry independently with exponential backoff. Execution traces show every delivery attempt, every step retry, and the final outcome—so on-call can answer "was this event processed?" in seconds.
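To make the backoff behavior concrete, here is a small sketch of retrying a single step with exponential backoff. It illustrates the general technique only; the function names, the 500 ms base delay, the 30 s cap, and the attempt limit are assumptions, not Inquir's actual retry policy or configuration format.

```javascript
// Delay before the next try: base * 2^(attempt - 1), capped at maxMs.
function backoffDelayMs(attempt, baseMs = 500, maxMs = 30_000) {
  return Math.min(maxMs, baseMs * 2 ** (attempt - 1));
}

const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Runs `step` until it succeeds or maxAttempts is reached, then rethrows the last error.
// `sleep` is injectable so the wait can be replaced in tests.
async function retryStep(
  step,
  { maxAttempts = 5, baseMs = 500, maxMs = 30_000, sleep = defaultSleep } = {},
) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await step(attempt);
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts) {
        await sleep(backoffDelayMs(attempt, baseMs, maxMs));
      }
    }
  }
  throw lastError;
}
```

Production retry policies often add random jitter to each delay so that many failed steps do not retry at the same instant; it is omitted here to keep the sketch deterministic.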

Webhook retry platform features

Provider idempotency

Upsert event IDs before mutations. Duplicate deliveries from Stripe, GitHub, or Slack return 200 without side effects.

Fast ACK under timeout pressure

Return 200 within Slack's 3s window, Stripe's 30s limit, and GitHub's 10s limit. Heavy work runs in pipelines.

Downstream step retries

Pipeline steps retry failed API calls, database writes, and notifications without re-invoking the webhook ingress function.

Per-delivery execution traces

Inspect headers, timing, retry count, and step outputs for every webhook delivery—not a black-box worker log.

How to build a webhook retry platform on Inquir

Separate provider retry handling (idempotency at ingress) from downstream retry handling (pipeline step policy).

1

Verify and record event ID

Check HMAC on raw body, upsert provider event ID to durable storage, return 200—even on duplicate delivery.

2

Enqueue durable work

Call global.durable.startNew() with the parsed event payload. The HTTP response completes before downstream work begins.

3

Retry failed steps, not the webhook

Configure pipeline step retry policy for downstream failures. Completed steps are not re-run when a later step fails.
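The idea behind step 3 can be sketched as a runner that checkpoints each step's output and skips completed steps on the next attempt. This is an illustration of the semantics only, not Inquir's pipeline engine: runPipeline, the step shape, and the in-memory Map (which would be durable storage in practice) are all assumptions.

```javascript
// Runs steps in order, feeding each step the previous step's output.
// `checkpoints` maps a step name to its saved output; steps found there are skipped.
async function runPipeline(steps, input, checkpoints = new Map()) {
  let value = input;
  for (const { name, run } of steps) {
    if (checkpoints.has(name)) {
      value = checkpoints.get(name); // completed on an earlier attempt
      continue;
    }
    value = await run(value); // a throw here leaves earlier checkpoints intact
    checkpoints.set(name, value);
  }
  return value;
}
```

Calling runPipeline again with the same checkpoints after a failure resumes at the step that failed, so a charge or email sent by an earlier step is not repeated.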

Idempotent webhook ingress with pipeline handoff

Provider retries hit the idempotency check. Downstream failures retry at the pipeline step level.

webhooks/stripe-retry-safe.mjs
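// verifyStripeSignature and db are assumed helpers defined elsewhere in the project.
export async function handler(event) {
  const rawBody = event.body ?? '';
  // Verify the raw body; await works whether the helper is sync or async.
  if (!(await verifyStripeSignature(rawBody, event.headers['stripe-signature']))) {
    return { statusCode: 400, body: 'invalid signature' };
  }
  const evt = JSON.parse(rawBody);
  // Record the event ID before any side effect; redeliveries are ACKed and skipped.
  const isNew = await db.tryInsertWebhookEvent(evt.id, evt.type);
  if (!isNew) return { statusCode: 200, body: 'already processed' };
  try {
    await global.durable.startNew('stripe-fulfillment', undefined, {
      eventId: evt.id, type: evt.type, object: evt.data.object,
    });
  } catch (err) {
    // Enqueue failed: release the key and return 500 so the provider's retry is not skipped as a duplicate.
    await db.deleteWebhookEvent(evt.id); // hypothetical helper that removes the record
    return { statusCode: 500, body: 'enqueue failed' };
  }
  return { statusCode: 200, body: 'accepted' };
}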

Good fit for a webhook retry platform

When this works

  • SaaS webhooks from Stripe, GitHub, Slack, Shopify, or HubSpot with aggressive retry policies
  • Handlers where downstream work can fail after you already returned 200 to the provider

When to skip it

  • Webhooks forwarded to a third-party iPaaS with no custom runtime

FAQ

How long do providers retry?

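Stripe: up to 72 hours with exponential backoff. GitHub: no automatic retries, but failed deliveries can be redelivered manually or through the REST API for 3 days. Slack expects a response within 3 seconds and retries a failed event up to three times. Design for both fast ACK and duplicate delivery.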

What if the pipeline step exhausts retries?

The pipeline run is marked failed with full step logs. Alert on failure rates; replay manually from the execution history using the stored event payload.

Do I need a separate dead-letter queue?

Failed pipeline runs serve as dead letters with searchable execution history. No separate DLQ infrastructure to provision.