Building a REST API on serverless functions
A pragmatic guide to shipping a serverless REST API on gateway-backed functions: map resources to routes, enforce API keys at the API gateway, standardize your request and response shape, add cursor pagination and a single error contract, and know exactly where the limits are.
A REST API is still the most boring, most reliable contract you can hand another team: JSON in, JSON out, predictable status codes. The interesting question is no longer whether to build one, but where it should run. This post makes the pragmatic case for building a serverless REST API on gateway-backed functions — one function per resource, auth at the edge, and a request shape you already know — and it is honest about the few places where serverless makes you change how you think.
Why a serverless REST API beats a routing monolith
The default instinct is to reach for a framework — Express, FastAPI, Gin — and put every route behind one long-running process. That works right up until it doesn’t. When /users, /invoices, and /reports all live in the same binary, one bad deploy touches all three, and bisecting an incident means reading logs that interleave three unrelated concerns.
Scaling has the same problem. If /search gets hammered, you scale the entire monolith — including the sleepy /settings routes that see ten requests a day. You pay for memory you are not using because scaling is coupled at the process boundary, not the route boundary.
A serverless REST API flips that. Each resource is its own function with its own dependency set, its own deploy, and its own scaling curve. A hot route scales on its own; a risky change ships on its own; a failing endpoint shows up in observability attributed to a specific capability instead of a shared log stream.
The honest counterpoint is microservice sprawl. Split every path into its own repo and pipeline and you trade one problem for a worse one: dozens of tiny services that still need shared auth, CORS, and rate limits. The answer is not “many services” — it is route groups behind one API gateway. Group the serverless API endpoints that deploy and fail together into a function, and let the gateway own the cross-cutting concerns.
When does the monolith still win? When your team is tiny and a framework app is already working. Do not rewrite a healthy system for architectural fashion. Reach for serverless when deploy coupling, uneven scaling, or blast radius is actually hurting you.
Mapping REST resources to serverless API endpoints
Start where REST tells you to start: model resources as nouns. users, orders, invoices. Each noun becomes a route group, and each route group becomes a function. The verbs — GET, POST, PUT, PATCH, DELETE — are handled inside the function by switching on the HTTP method.
On Inquir Compute a gateway route is a method plus a path. Methods cover the full set — GET, POST, PUT, PATCH, DELETE, HEAD, OPTIONS, and a catch-all ANY. Paths support named parameters written as :id or {id} and */+ wildcards, so a single route like ANY /v1/users* can back an entire resource. The route’s target is a function (a lambda) — or, when you need multi-step work, a pipeline.
This is what serverless routing actually means here: the gateway matches the request to a route and hands your function an event; your function decides what the verb means. Publicly, routes are reachable at /gw/{tenant}/{route-path}, and named APIs can sit behind a verified custom domain so consumers see api.yourcompany.com rather than a platform URL.
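To make "the gateway matches the request to a route" concrete, here is a toy matcher for named parameters. It is only an illustration of the idea, not the gateway's implementation: matchRoute is a hypothetical helper, and it deliberately ignores the * and + wildcards, whose exact semantics are not described here.

```javascript
// Illustrative only: a toy matcher for ":id" and "{id}" path parameters.
// NOT the gateway's implementation; wildcards are intentionally left out.
function matchRoute(route, method, path) {
  // "ANY" accepts every HTTP method; otherwise the method must match exactly.
  if (route.method !== 'ANY' && route.method !== method) return null;

  const patternParts = route.path.split('/').filter(Boolean);
  const pathParts = path.split('/').filter(Boolean);
  if (patternParts.length !== pathParts.length) return null;

  const pathParameters = {};
  for (let i = 0; i < patternParts.length; i++) {
    const part = patternParts[i];
    const named = part.match(/^:(\w+)$/) || part.match(/^\{(\w+)\}$/);
    if (named) {
      pathParameters[named[1]] = pathParts[i]; // capture the segment by name
    } else if (part !== pathParts[i]) {
      return null; // literal segment mismatch
    }
  }
  return { pathParameters };
}

const route = { method: 'GET', path: '/v1/users/:id' };
console.log(matchRoute(route, 'GET', '/v1/users/42'));  // { pathParameters: { id: '42' } }
console.log(matchRoute(route, 'POST', '/v1/users/42')); // null
```

The captured values are what a handler would later read from pathParameters.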
How granular should you go? Group routes that change together. Splitting every single path into its own function feels clean on a whiteboard but explodes operational noise: more deploys, more dashboards, more cold paths. A good default is one function per resource, and you split further only when a route has genuinely different dependencies or a different scaling profile. The goal is a small number of serverless API endpoints that each own a coherent slice of the domain.
Route-level auth with API keys
Authentication is the first thing a public API needs and the last thing you want scattered across handlers. On the gateway, auth is configured per route, with three modes: none, api-key, and bearer. That single decision — set on the route, not in your code — determines who reaches the handler at all.
For server-to-server and partner traffic, api-key is the workhorse. Callers send an X-Api-Key header, and the gateway validates it before your function runs. Keys are 32-byte base64url values, stored only as a SHA-256 hash, and shown to the creator exactly once — so a leaked database never leaks usable keys. You can scope a route to specific keys with allowedApiKeyIds, which is how you give each partner a credential that only opens the endpoints they are entitled to.
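From the caller's side, api-key mode is just one extra header. The sketch below assumes a placeholder origin (gateway.example.com) and a tenant named by the caller; only the /gw/{tenant}/{route-path} shape and the X-Api-Key header come from the description above. buildUsersRequest and getUser are hypothetical helper names.

```javascript
// Placeholder origin: substitute your platform URL or verified custom domain.
const GATEWAY_ORIGIN = 'https://gateway.example.com';

// Build the URL and fetch options for GET /v1/users/:id behind the gateway.
function buildUsersRequest(tenant, apiKey, userId) {
  const url =
    `${GATEWAY_ORIGIN}/gw/${encodeURIComponent(tenant)}` +
    `/v1/users/${encodeURIComponent(userId)}`;
  return {
    url,
    options: {
      method: 'GET',
      headers: { 'X-Api-Key': apiKey, Accept: 'application/json' },
    },
  };
}

// Perform the call; any non-2xx status surfaces as an error to the caller.
async function getUser(tenant, apiKey, userId) {
  const { url, options } = buildUsersRequest(tenant, apiKey, userId);
  const res = await fetch(url, options);
  if (!res.ok) throw new Error(`Request failed with status ${res.status}`);
  return res.json();
}
```

Keep the key in a secret store or environment variable on the calling server; it should never ship to a browser.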
For user-facing clients that already carry a token, bearer mode checks the Authorization: Bearer … header. Either way, the important architectural property is the same: your handler assumes an already-authenticated caller. There is no per-function auth boilerplate to copy, no forgotten check that accidentally leaves an endpoint open, and no auth logic to unit-test inside business code. The API gateway is the single chokepoint, and handlers stay focused on the resource.
The request and response shape of a serverless handler
If you have ever written an AWS Lambda behind API Gateway, this will feel immediately familiar. Your handler has the signature (event, context), and the gateway hands it an API-Gateway-style event with the fields you expect:
- httpMethod: "GET", "POST", and so on.
- path: the matched request path.
- headers: a plain object of request headers.
- queryStringParameters: the parsed query string, or null.
- pathParameters: the named path params, so :id arrives as pathParameters.id.
- body: the raw request body as a string (or null).
The body arriving as a string is deliberate, not an inconvenience. It preserves the exact bytes the client sent, which matters for signature verification and content types you would rather not have a framework silently reparse. For JSON, you call JSON.parse(event.body || '{}') yourself and stay in control.
On the way out, return { statusCode, body }, where body is a string — almost always JSON.stringify(payload). Returning a bare object is also accepted and treated as a 200 JSON response, which is handy for quick internal endpoints; but a real REST API on serverless functions should be explicit about status codes, because 201 Created, 404 Not Found, and 422 Unprocessable Entity are part of your contract. Explicit is better than implicit when other people’s code depends on you.
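As a minimal sketch of that round trip, here is a hand-written event and a handler that parses the string body and answers with an explicit status code. The event values are made up, and parseJsonBody plus the invalid_json code are hypothetical choices for this illustration rather than anything the platform prescribes.

```javascript
// A hand-written event with the fields described above (values are made up).
const sampleEvent = {
  httpMethod: 'POST',
  path: '/v1/users',
  headers: { 'content-type': 'application/json' },
  queryStringParameters: null,
  pathParameters: null,
  body: '{"email":"ada@example.com"}',
};

// Parse the raw string body; report malformed JSON instead of throwing.
function parseJsonBody(event) {
  try {
    return { ok: true, value: JSON.parse(event.body || '{}') };
  } catch {
    return { ok: false, value: null };
  }
}

// Always return { statusCode, body } with body serialized to a string.
function handler(event) {
  const parsed = parseJsonBody(event);
  if (!parsed.ok) {
    return {
      statusCode: 400,
      body: JSON.stringify({ error: { code: 'invalid_json', message: 'Body is not valid JSON' } }),
    };
  }
  return { statusCode: 201, body: JSON.stringify({ data: parsed.value }) };
}

console.log(handler(sampleEvent));
```

Because the body stays a string until you parse it, a malformed payload becomes a deliberate 400 rather than an accidental exception.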
A realistic handler: a users resource in JavaScript
Here is one function backing the entire /v1/users resource. It routes on the HTTP method, paginates the list endpoint with a cursor, validates input at the boundary, and returns a single consistent error envelope. Wire it to a gateway route such as ANY /v1/users* with api-key auth, and the handler can assume every caller is already authenticated.
// users.mjs: one function backing the /v1/users resource
// Gateway route: ANY /v1/users* · auth: api-key (enforced before this runs)

// Assumption: `db` is your own data-access module; this path is a placeholder.
import { db } from './db.mjs';

// Build a JSON response with an explicit status code.
function json(statusCode, payload) {
  return { statusCode, body: JSON.stringify(payload) };
}

// Build the shared error envelope: { error: { code, message } }.
function jsonError(statusCode, code, message) {
  return json(statusCode, { error: { code, message } });
}

export async function handler(event) {
  const { httpMethod, pathParameters, queryStringParameters } = event;
  const id = pathParameters?.id;

  try {
    // GET /v1/users/:id: fetch one
    if (httpMethod === 'GET' && id) {
      const user = await db.users.findById(id);
      if (!user) return jsonError(404, 'not_found', 'User not found');
      return json(200, { data: user });
    }

    // GET /v1/users: list, cursor-paginated
    if (httpMethod === 'GET') {
      // Clamp limit to 1..100; fall back to 25 when missing or not a number
      const requested = Math.trunc(Number(queryStringParameters?.limit)) || 25;
      const limit = Math.min(Math.max(requested, 1), 100);
      const cursor = queryStringParameters?.cursor ?? null;
      // Fetch one extra row to learn whether another page exists
      const rows = await db.users.list({ limit: limit + 1, cursor });
      const hasMore = rows.length > limit;
      const data = hasMore ? rows.slice(0, limit) : rows;
      return json(200, {
        data,
        next_cursor: hasMore ? data[data.length - 1].id : null,
      });
    }

    // POST /v1/users: create
    if (httpMethod === 'POST') {
      let input;
      try {
        input = JSON.parse(event.body || '{}');
      } catch {
        // Malformed JSON is a client error, not a 500
        return jsonError(400, 'invalid_json', 'Body must be valid JSON');
      }
      if (!input?.email) return jsonError(422, 'invalid_body', 'email is required');
      const user = await db.users.insert({ email: input.email, name: input.name ?? null });
      return json(201, { data: user });
    }

    return jsonError(405, 'method_not_allowed', `${httpMethod} not supported`);
  } catch (err) {
    // Never leak internals: log the detail, return a stable envelope
    console.error('users handler failed', err);
    return jsonError(500, 'internal_error', 'Unexpected error');
  }
}
A few things are worth calling out because they are the difference between a demo and an API clients can build on.
Pagination. The list branch reads limit and cursor from the query string, clamps limit so a client cannot ask for a million rows, and fetches one extra row to detect whether there is a next page. It returns next_cursor — the id of the last item — or null at the end. Cursor pagination stays stable under inserts and deletes in a way that offset/page does not, and it keeps each response comfortably under the body limit.
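To see the "fetch one extra row" trick end to end, the sketch below swaps the database for an in-memory array and adds a client loop that follows next_cursor. USERS, listUsers, getPage, and fetchAll are hypothetical names, and the "id greater than cursor" query is one plausible way a list call could interpret the cursor, not a statement about any particular data layer.

```javascript
// In-memory stand-in for a table ordered by id (hypothetical data).
const USERS = Array.from({ length: 7 }, (_, i) => ({
  id: i + 1,
  email: `user${i + 1}@example.com`,
}));

// Hypothetical list(): rows with id greater than the cursor, ascending by id.
function listUsers({ limit, cursor }) {
  const after = cursor === null ? 0 : Number(cursor);
  return USERS.filter((u) => u.id > after).slice(0, limit);
}

// Same paging logic as the list branch: ask for limit + 1 to detect a next page.
function getPage({ limit, cursor }) {
  const rows = listUsers({ limit: limit + 1, cursor });
  const hasMore = rows.length > limit;
  const data = hasMore ? rows.slice(0, limit) : rows;
  return { data, next_cursor: hasMore ? data[data.length - 1].id : null };
}

// A client drains the collection by following next_cursor until it is null.
function fetchAll(limit) {
  const all = [];
  let cursor = null;
  do {
    const page = getPage({ limit, cursor });
    all.push(...page.data);
    cursor = page.next_cursor;
  } while (cursor !== null);
  return all;
}

console.log(fetchAll(3).map((u) => u.id).join(',')); // 1,2,3,4,5,6,7
```

With seven rows and a limit of three, the client makes three requests (3, 3, and 1 rows) and stops when next_cursor comes back null.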
Error contracts. Every failure path returns the same envelope: { "error": { "code": "...", "message": "..." } }. The code is a stable machine string clients switch on; the message is a human hint that can change without breaking anyone. One shape across every route means your client SDK writes error handling once. Validate at the handler boundary and fail fast — a missing email becomes a 422 immediately, never a 500 three layers deep. (If you would rather reject malformed payloads before your code even runs, the gateway also offers opt-in per-route body validation.)
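On the client side, a single envelope means a single unwrapping function. The following is a rough sketch of what that could look like; ApiError, unwrap, and describe are hypothetical names, and the reactions in describe are placeholders for whatever your UI or SDK would actually do.

```javascript
// Error type carrying the stable machine code from the envelope.
class ApiError extends Error {
  constructor(status, code, message) {
    super(message);
    this.name = 'ApiError';
    this.status = status;
    this.code = code;
  }
}

// Unwrap a response: return `data` on success, throw ApiError on failure.
function unwrap(statusCode, bodyText) {
  const payload = JSON.parse(bodyText);
  if (statusCode >= 400) {
    const { code = 'unknown', message = 'Unknown error' } = payload.error ?? {};
    throw new ApiError(statusCode, code, message);
  }
  return payload.data;
}

// Callers branch on err.code, never on the human-readable message.
function describe(err) {
  switch (err.code) {
    case 'not_found':
      return 'show empty state';
    case 'invalid_body':
      return 'highlight the form field';
    default:
      return 'retry later';
  }
}

try {
  unwrap(404, '{"error":{"code":"not_found","message":"User not found"}}');
} catch (err) {
  console.log(err.code, '->', describe(err)); // not_found -> show empty state
}
```

Because the branch keys on code, the server can reword message at any time without breaking this client.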
Defensive edges. An unsupported verb returns 405 instead of pretending to succeed, and the outer try/catch guarantees that an unexpected exception becomes a clean 500 with no stack trace leaking to the caller. Small habits, but they are what separate a public surface from an internal script.
CORS, versioning, and rate limits as gateway patterns
Three cross-cutting concerns show up on every public API. Two are gateway features; one is a discipline you adopt.
CORS is handled per route on the gateway, and it is on by default (corsEnabled: true). The gateway answers preflight OPTIONS requests and sends back allowed methods (GET, POST, PUT, PATCH, DELETE, OPTIONS) and headers (Content-Type, Authorization, X-Api-Key). Because it lives on the route, browser clients work without a single line of header code in your handler — and you tighten the allow-list centrally rather than auditing every function.
Versioning is a pattern, not a toggle. There is no magic “v2” switch; you version by path prefix. Ship /v1/users, and when a breaking schema change is unavoidable, stand up /v2/users as new routes — often new function versions — while the /v1 prefix keeps serving existing clients. This is exactly the discipline of versioning before you break a contract, and the gateway’s route model makes running both prefixes side by side cheap.
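One way to keep two prefixes cheap is to share the core logic and vary only the serializer. The sketch below assumes event.path begins with the version prefix, and the v2 change (renaming name to display_name) is an invented example of a breaking change; toV1, toV2, and serializerFor are hypothetical helpers.

```javascript
// Hypothetical stored record shared by both API versions.
const stored = { id: 'u_1', email: 'ada@example.com', name: 'Ada Lovelace' };

// v1 contract: flat "name" field, frozen for existing clients.
function toV1(user) {
  return { id: user.id, email: user.email, name: user.name };
}

// v2 contract (invented breaking change): "name" becomes "display_name".
function toV2(user) {
  return { id: user.id, email: user.email, display_name: user.name };
}

// Pick the serializer from the path prefix of the incoming event.
function serializerFor(path) {
  if (path.startsWith('/v2/')) return toV2;
  if (path.startsWith('/v1/')) return toV1;
  return null;
}

function handler(event) {
  const serialize = serializerFor(event.path);
  if (!serialize) {
    return {
      statusCode: 404,
      body: JSON.stringify({ error: { code: 'not_found', message: 'Unknown API version' } }),
    };
  }
  return { statusCode: 200, body: JSON.stringify({ data: serialize(stored) }) };
}

console.log(handler({ path: '/v1/users/u_1' }).body);
console.log(handler({ path: '/v2/users/u_1' }).body);
```

If /v2 ships as its own function instead, the same serializers can live in a shared module so the two versions differ only in the shape they emit.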
Rate limits are enforced per route, and here honesty matters: the limiter is per source-IP, per-minute, and in-memory — not a distributed global quota. When a client exceeds the limit it gets a 429 with Retry-After: 60. That is exactly right for shielding a shared dependency from a runaway or abusive client. But because it is per-IP and per-instance, do not treat it as a precise account-level billing quota; if you need global, per-tenant quotas, enforce those in the handler against a shared store and let the gateway limiter be your coarse first line of defense.
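A toy limiter makes the "per-IP, in-memory, per-instance" caveat easy to see. This is an illustration under assumptions, not the gateway's code: it uses a simple fixed one-minute window (the actual windowing strategy is not specified above), and createLimiter is a hypothetical name.

```javascript
// Toy fixed-window limiter: per IP, per minute, in memory.
// An illustration under assumptions, NOT the gateway's implementation.
function createLimiter({ limit, windowMs = 60_000, now = Date.now }) {
  const windows = new Map(); // ip -> { start, count }

  return function check(ip) {
    const t = now();
    let w = windows.get(ip);
    // Start a fresh window for new IPs or when the old window has expired.
    if (!w || t - w.start >= windowMs) {
      w = { start: t, count: 0 };
      windows.set(ip, w);
    }
    w.count += 1;
    if (w.count > limit) {
      return { allowed: false, statusCode: 429, headers: { 'Retry-After': '60' } };
    }
    return { allowed: true };
  };
}

// Two instances do not share counters, so each admits up to `limit` per IP.
const instanceA = createLimiter({ limit: 2 });
const instanceB = createLimiter({ limit: 2 });
console.log(instanceA('203.0.113.7').allowed); // true
console.log(instanceA('203.0.113.7').allowed); // true
console.log(instanceA('203.0.113.7').allowed); // false
console.log(instanceB('203.0.113.7').allowed); // true
```

The last line is the point: a client spread across instances can exceed the nominal limit, which is why account-level quotas belong in a shared store.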
Honest limits and when to reach for background jobs
Serverless changes the shape of a few problems, and pretending otherwise just produces outages later.
Request size and time are bounded. The request body limit is about 2 MB by default — generous for JSON, wrong for accepting large file uploads directly (use object storage and a signed URL instead). Each function has a timeout of 5 seconds by default, 15 minutes maximum. That ceiling is for the long tail, not your happy path; a synchronous REST call should return in well under a second.
Cold starts are reduced, not eliminated. Hot/warm container pools keep instances ready and absorb most of the latency, but a first invoke or one after idle is still a cold start. Measure your p95/p99 under realistic traffic rather than assuming zero.
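If you collect per-request latencies from a load test, a nearest-rank percentile is enough to see the cold-start tail. The samples below are invented for illustration, and percentile is a hypothetical helper rather than a platform feature.

```javascript
// Nearest-rank percentile over latency samples in milliseconds.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error('no samples');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// Invented samples: mostly warm invocations plus two cold starts.
const latenciesMs = [
  12, 14, 11, 13, 15, 12, 480, 13, 14, 12,
  11, 13, 16, 12, 14, 13, 12, 520, 13, 12,
];

console.log(percentile(latenciesMs, 50)); // 13
console.log(percentile(latenciesMs, 95)); // 480
console.log(percentile(latenciesMs, 99)); // 520
```

The median looks perfect while p95 and p99 expose the cold starts, which is why an average or a median alone is not a useful latency target.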
Slow work does not belong on the request thread. This is the most important limit to internalize: there is no durable orchestration engine you can lean on inside a handler. When a request would blow past the timeout — rebuilding an index, generating a big export, calling a slow third party — do not hold the socket open. Return 202 Accepted with a job id and continue in a durable, Postgres-backed background job.
// export.mjs: hand slow work to a durable background job, return fast
export async function handler(event) {
  const { datasetId } = JSON.parse(event.body || '{}');
  if (!datasetId) {
    return {
      statusCode: 400,
      body: JSON.stringify({ error: { code: 'invalid_body', message: 'datasetId required' } }),
    };
  }

  const { instanceId: jobId } = await global.durable.startNew('reindex-dataset', undefined, { datasetId });

  // Client polls GET /v1/jobs/:jobId or waits for a webhook
  return { statusCode: 202, body: JSON.stringify({ data: { jobId, status: 'queued' } }) };
}
The job survives restarts, retries with backoff are available, and exhausted jobs dead-letter. What it does not give you is exactly-once delivery or guaranteed ordering — so make handlers idempotent and key writes on stable ids. Two other defaults are worth knowing: outbound network access is off unless you enable it, and the root filesystem is read-only with a writable /tmp. Treat these as security defaults, not obstacles.
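Since a job may be delivered more than once, the work itself has to tolerate replays. Here is a minimal sketch of keying work on a stable id. The Map stands in for a shared store with a unique key, which is what you would need in production, and runOnce and reindexDataset are hypothetical names rather than part of the durable job API.

```javascript
// In-memory stand-in for a shared store with a unique key per job id.
// Illustration only: real deployments need storage shared across instances.
const processed = new Map();

// Run `work` at most once per key; replays get the recorded outcome.
function runOnce(key, work) {
  if (!processed.has(key)) {
    const attempt = Promise.resolve().then(work);
    processed.set(key, attempt);
    // Forget failed attempts so a later retry can run the work again.
    attempt.catch(() => processed.delete(key));
  }
  return processed.get(key);
}

// Simulate the same job being delivered twice.
let reindexRuns = 0;
async function reindexDataset({ datasetId }) {
  return runOnce(`reindex:${datasetId}`, async () => {
    reindexRuns += 1;
    return { datasetId, status: 'done' };
  });
}

Promise.all([
  reindexDataset({ datasetId: 'ds_1' }),
  reindexDataset({ datasetId: 'ds_1' }),
]).then(() => console.log(reindexRuns)); // 1
```

The key is derived from the dataset id rather than from the delivery, so a duplicate or retried delivery maps to the same record instead of doing the work twice.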
Takeaway
Building a serverless REST API is not exotic. Model resources as nouns, back each with a function, and let the API gateway own routing, API-key auth, CORS, and a first line of rate limiting. Keep the request and response shape explicit (an API-Gateway-style event in, { statusCode, body } out) and standardize one error envelope and cursor pagination so clients are cheap to write. Version by path prefix before you break a contract, respect the ~2 MB body limit and the 5-second default and 15-minute maximum timeouts, and hand anything slow to a durable background job with a 202. Do that, and you get the deploy isolation and per-route scaling of many services with the shared gateway hygiene of one product, without a routing monolith and without microservice sprawl. Split one read-heavy route out first, learn the deploy loop, and grow the surface from there.