Use case · Inquir Compute

AI agent backend on serverless functions

Model tools become HTTP tool routes on the gateway: lock them down with API keys, inject secrets per function, offload long steps to jobs or pipelines, and ship handlers on Node.js, Python, or Go with the same observability as the rest of your stack.

Last updated: 2026-06-28

Direct answer

One function per tool keeps dependencies and deploy risk localized, and the execution history in the console lines up with each tool invocation, which makes on-call work easier than with a shared monolith.

When it fits

  • Multi-step agents
  • Tool access to private data

Tradeoffs

  • Running privileged tools on end-user devices breaks the moment data is regulated or the user is on a locked-down laptop.
  • Throwing every tool into one giant server turns every deploy into a high-risk change and makes log lines impossible to attribute to a specific capability.

What agents actually need

A chat completion is only the headline. Production agents also fetch private data, write to systems of record, escalate to humans, and enforce guardrails—and each of those steps needs clear failure and retry semantics.

When there is no real backend, side effects creep into prompt text or the user’s browser, where they are nearly impossible to audit or revoke cleanly.

Patterns that break in production

Running privileged tools on end-user devices breaks the moment data is regulated or the user is on a locked-down laptop.

Throwing every tool into one giant server turns every deploy into a high-risk change and makes log lines impossible to attribute to a specific capability.

Composable serverless tools for AI agents

One function per tool keeps dependencies and deploy risk localized; execution history in the console matches each tool invocation—easier on-call than a shared monolith.

Handlers use the same Node.js, Python, or Go runtimes as the rest of the platform. Optional warm pools trim cold-start overhead when the model calls tools in a tight loop—measure under realistic load.

Implementation rules for AI agent tools

One function per tool

Give each tool its own function unless two tools share tightly coupled dependencies. One tool per function keeps deploy risk small and makes every log line attributable to a specific capability.

Validate inputs, return structured JSON

Define required fields and reject invalid payloads early. Return a stable JSON shape the orchestrator can parse without special-casing.
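A minimal sketch of that rule, assuming the gateway event shape used in the handler examples on this page (a string `body` in, `{statusCode, body}` out). The `REQUIRED_FIELDS` contract, the `ok`/`error`/`data` envelope, and the error codes are illustrative assumptions, not part of the Inquir Compute API.

```python
import json

# Hypothetical input contract for a lookup tool: field name -> expected type
REQUIRED_FIELDS = {"id": str}


def error(status, code, message):
    """Build the single error shape every tool returns."""
    body = {"ok": False, "error": {"code": code, "message": message}}
    return {"statusCode": status, "body": json.dumps(body)}


def parse_and_validate(event):
    """Return (payload, None) on success or (None, error_response) on failure."""
    try:
        payload = json.loads(event.get("body") or "{}")
    except (json.JSONDecodeError, TypeError):
        return None, error(400, "invalid_json", "body must be valid JSON")
    if not isinstance(payload, dict):
        return None, error(400, "invalid_payload", "body must be a JSON object")
    for field, expected_type in REQUIRED_FIELDS.items():
        value = payload.get(field)
        if not isinstance(value, expected_type) or value == "":
            return None, error(400, "invalid_field", f"{field} is required")
    return payload, None


def handler(event, context=None):
    payload, err = parse_and_validate(event)
    if err:
        return err  # reject early, before any side effects
    # Success uses the same envelope, so the orchestrator parses one shape
    return {"statusCode": 200, "body": json.dumps({"ok": True, "data": {"id": payload["id"]}})}
```

Keeping success and failure in one envelope means the orchestrator can branch on a single field instead of special-casing each tool.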

Secrets in environment variables, never in prompts

Scope API keys per tool function in workspace secrets. Rotate keys independently of model versions without touching prompt templates—secrets never appear in logs or context windows.
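One way this could look inside a handler, as a sketch: read the injected secret from the environment, fail fast when it is missing, and mask it if it ever has to appear in a log line. The `CRM_API_KEY` name and both helper functions are hypothetical.

```python
import os


class MissingSecret(RuntimeError):
    """Raised when an expected secret was not injected into the environment."""


def require_secret(name, environ=os.environ):
    """Return the secret value, or fail loudly with the name only (never the value)."""
    value = environ.get(name)
    if not value:
        raise MissingSecret(f"secret {name} is not configured")
    return value


def redact(value, keep=4):
    """Mask a secret for log lines, keeping only the last few characters."""
    if len(value) <= keep:
        return "*" * len(value)
    return "*" * (len(value) - keep) + value[-keep:]
```

A handler would call `require_secret("CRM_API_KEY")` once at module load, so a missing secret surfaces at deploy time rather than in the middle of an agent run.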

Gateway auth before handler

Every tool route requires an API key enforced at the gateway level. Tool functions receive only already-authenticated requests—no ad-hoc auth logic inside handlers, no accidental open endpoints.

Use jobs or pipelines for long-running work

When a tool step exceeds the gateway timeout, return a job ID immediately and continue enrichment or side effects in a background pipeline. The orchestrator polls or receives a webhook when the pipeline completes.
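On the orchestrator side, the polling half of this handoff might look like the sketch below. `get_status` stands in for whatever call fetches job state (for example an HTTP GET on a jobs route); the `state` values `succeeded` and `failed` are assumed names, not documented ones.

```python
import time


def wait_for_job(get_status, job_id, timeout_s=30.0, interval_s=0.5,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll get_status(job_id) with exponential backoff until a terminal state.

    Raises TimeoutError if the next wait would pass the deadline.
    """
    deadline = clock() + timeout_s
    delay = interval_s
    while True:
        status = get_status(job_id)
        if status["state"] in ("succeeded", "failed"):  # assumed terminal states
            return status
        if clock() + delay > deadline:
            raise TimeoutError(f"job {job_id} still {status['state']} after {timeout_s}s")
        sleep(delay)
        delay = min(delay * 2, 5.0)  # back off, capped at 5 seconds
```

Injecting `sleep` and `clock` keeps the loop testable without real waiting. A webhook removes the loop entirely when the orchestrator can receive callbacks.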

Warm pools for tight tool loops

When the model calls tools in rapid succession, cold-start latency adds up. Enable warm pools for tool functions with steady traffic—measure p95/p99 before and after to validate the gain.
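A small sketch of the before/after comparison, assuming latency samples in milliseconds have already been exported from your own metrics. It uses the nearest-rank percentile definition; other tools may interpolate and report slightly different values.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest value with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def compare(before_ms, after_ms):
    """Summarize p50/p95/p99 for two sample sets, e.g. without and with warm pools."""
    return {
        f"p{p}": {"before": percentile(before_ms, p), "after": percentile(after_ms, p)}
        for p in (50, 95, 99)
    }
```

Tail percentiles need enough samples to mean anything: with fewer than about a hundred calls, p99 is simply the slowest request.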

Track failures per tool endpoint

Alert on error rates per tool route, not only per chat session. One failing tool should surface in observability before it silently degrades agent quality.
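As an illustration, per-route error rates can be derived from invocation records like this. The record fields (`route`, `status`), the 5% threshold, and the minimum call count are assumptions to adapt to whatever your observability export provides.

```python
from collections import defaultdict


def error_rates(records):
    """Aggregate calls, errors (5xx), and error rate per tool route."""
    stats = defaultdict(lambda: {"calls": 0, "errors": 0})
    for rec in records:
        s = stats[rec["route"]]
        s["calls"] += 1
        if rec["status"] >= 500:
            s["errors"] += 1
    return {route: {**s, "rate": s["errors"] / s["calls"]} for route, s in stats.items()}


def routes_to_alert(records, threshold=0.05, min_calls=20):
    """Routes whose error rate exceeds the threshold, ignoring low-traffic noise."""
    return sorted(
        route
        for route, s in error_rates(records).items()
        if s["calls"] >= min_calls and s["rate"] > threshold
    )
```

Grouping by route rather than by chat session is what lets a single broken tool stand out even when most sessions never call it.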

How to build AI agent tools on Inquir Compute

1. Define per-tool input contract

Document required fields, validation behavior, and the error shape the orchestrator should handle.

2. Define output schema and auth model

Keep return shapes stable across versions. Use route-level API key auth so tools are not accidentally open to public traffic.

3. Define retries, idempotency, and timeout handoff

Decide when to retry in-place, when to return a job ID and continue via pipeline, and how to key writes idempotently so retries do not create duplicates.
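The retry-in-place branch of that decision could be sketched as follows. `TransientError` is a hypothetical marker for failures worth retrying (timeouts, rate limits, 503s); anything else propagates immediately so the caller can fail or hand off to a job.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


def call_with_retry(fn, attempts=3, base_delay_s=0.2, sleep=time.sleep):
    """Call fn(), retrying only TransientError with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except TransientError:
            if attempt == attempts:
                raise  # budget exhausted: surface the error to the caller
            sleep(base_delay_s * 2 ** (attempt - 1))
```

Keeping the total retry budget well under the gateway timeout leaves room to return a job ID instead of timing out mid-retry.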

Tool handler patterns — Node.js, Python, Go, and async handoff

Sync tools share one gateway event contract: the request body arrives as a string, and the handler returns {statusCode, body}. Languages can be mixed per tool: Python for ML inference, Node.js for API calls, Go for high-throughput lookups. When a step outlasts the gateway timeout, return 202 with a job ID and continue in a pipeline.

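tools/lookup.mjs (Node.js 22)
// `db` is assumed to be your own data-access module; the import path is illustrative
import { db } from './db.mjs';

export async function handler(event) {
  let payload;
  try {
    payload = JSON.parse(event.body || '{}');
  } catch {
    // Malformed JSON is a client error, not an unhandled exception
    return { statusCode: 400, body: JSON.stringify({ error: 'invalid JSON' }) };
  }
  const { id } = payload ?? {};
  if (!id) return { statusCode: 400, body: JSON.stringify({ error: 'id required' }) };
  // API key auth is enforced at the gateway route, so the handler assumes an authenticated caller
  const row = await db.findById(id);
  if (!row) return { statusCode: 404, body: JSON.stringify({ error: 'not found' }) };
  return { statusCode: 200, body: JSON.stringify({ row }) };
}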
tools/classify.py (Python 3.12)
import json
import os

from openai import OpenAI

# Created once per container and reused across warm invocations
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])  # injected from workspace secrets

def handler(event, context):
    try:
        body = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError:
        # Malformed JSON is a client error, not an unhandled exception
        return {"statusCode": 400, "body": json.dumps({"error": "invalid JSON"})}
    text = body.get("text") if isinstance(body, dict) else None
    if not text:
        return {"statusCode": 400, "body": json.dumps({"error": "text required"})}
    r = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": f"Classify intent of: {text}"}],
    )
    return {"statusCode": 200, "body": json.dumps({"intent": r.choices[0].message.content})}
tools/lookup.go (Go 1.22)
package main

import (
	"encoding/json"
)

func parsePayload(event map[string]interface{}) map[string]interface{} {
	if s, ok := event["body"].(string); ok && s != "" {
		var out map[string]interface{}
		if err := json.Unmarshal([]byte(s), &out); err != nil || out == nil {
			return map[string]interface{}{}
		}
		return out
	}
	return event
}

// Handler — API key auth enforced at the gateway route
func Handler(event map[string]interface{}, ctx map[string]interface{}) (interface{}, error) {
	payload := parsePayload(event)
	id, _ := payload["id"].(string)
	if id == "" {
		b, _ := json.Marshal(map[string]string{"error": "id required"})
		return map[string]interface{}{"statusCode": 400, "body": string(b)}, nil
	}
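	// db is assumed to be your own data-access package (not shown here).
	// This sketch maps every lookup error to 404; in production, return 500
	// for failures other than "no such row" so outages are not masked.
	row, err := db.FindByID(id)
	if err != nil {
		b, _ := json.Marshal(map[string]string{"error": "not found"})
		return map[string]interface{}{"statusCode": 404, "body": string(b)}, nil
	}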
	b, _ := json.Marshal(map[string]interface{}{"row": row})
	return map[string]interface{}{"statusCode": 200, "body": string(b)}, nil
}
tools/enrich-async.mjs (async handoff)
export async function handler(event) {
  let payload;
  try {
    payload = JSON.parse(event.body || '{}');
  } catch {
    // Malformed JSON is a client error, not an unhandled exception
    return { statusCode: 400, body: JSON.stringify({ error: 'invalid JSON' }) };
  }
  const { customerId } = payload ?? {};
  if (!customerId) return { statusCode: 400, body: JSON.stringify({ error: 'customerId required' }) };
  // Return fast and continue in a pipeline; the orchestrator polls /jobs/:jobId or receives a webhook
  const { instanceId: jobId } = await global.durable.startNew('enrich-customer', undefined, { customerId });
  return { statusCode: 202, body: JSON.stringify({ jobId }) };
}

Good fit

When this works

  • Multi-step agents
  • Tool access to private data

When to skip it

  • Stateless single-shot completions with no side effects

FAQ

Should agent tools be separate HTTP functions?

Yes for production: one function per tool (or tight group) keeps dependencies isolated, deploy risk small, and logs attributable—easier than a monolith that mixes user sessions and tool IO.

How do I store secrets for tool calls?

Use Inquir workspace secrets and environment injection so API keys never live in prompts or client bundles; rotate keys independently of model versions.

What about streaming responses to the user?

End-user streaming is a gateway concern; many tool-calling stacks still use plain request/response JSON between the orchestrator and each tool because retries and idempotency stay simpler that way.

How do I make tool calls idempotent when the model retries?

Key writes with stable IDs from the tool payload (customer ID, order ID, external record key). Return the same JSON shape on replay so the orchestrator can treat duplicate invocations as safe no-ops.
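A minimal sketch of an idempotent write, using an in-memory dictionary as a stand-in for a database with a unique key constraint. The refund tool, its field names, and `InMemoryStore` are all hypothetical.

```python
import json


class InMemoryStore:
    """Stand-in for a real database table with a unique key."""

    def __init__(self):
        self.rows = {}

    def put_if_absent(self, key, row):
        """Insert row under key unless one exists; return the stored row either way."""
        return self.rows.setdefault(key, row)


def create_refund(store, payload):
    """Create a refund keyed by the order ID carried in the tool payload."""
    key = f"refund:{payload['orderId']}"  # stable ID from the payload, not a random one
    row = store.put_if_absent(key, {"orderId": payload["orderId"], "amount": payload["amount"]})
    # The first call and every replay return the same JSON shape
    return {"statusCode": 200, "body": json.dumps({"refund": row})}
```

With a real database, the same effect usually comes from a unique constraint plus an insert-or-return-existing query, so concurrent retries cannot both succeed in writing.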