Use case

LLM pipelines without mystery state machines

Stages you can retry one at a time: cleaner than one giant prompt, especially when tools touch production.

Why giant single prompts fail

You cannot retry one sub-step without redoing everything.

Costs balloon when every branch re-embeds the same context.

Script spaghetti

Notebooks are great for exploration, not for durable production graphs.

Stages as functions

Each stage is deployable and loggable; compose with pipelines for async gaps.

Tool calls remain HTTP functions with explicit auth.
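A minimal sketch of both ideas, assuming plain async functions as stages (the names `buildToolRequest` and `runStages` are illustrative, not Inquir APIs):

```javascript
// A tool call stays an ordinary HTTP request with explicit, auditable auth,
// never ambient credentials baked into a prompt. This only builds the request;
// pass it to fetch() in your runtime.
function buildToolRequest(toolUrl, payload, apiKey) {
  return {
    url: toolUrl,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`, // explicit auth per tool call
      },
      body: JSON.stringify(payload),
    },
  };
}

// Compose stages sequentially; because each stage is its own function,
// a failed stage can be retried without redoing the earlier ones.
async function runStages(input, stages) {
  let state = input;
  for (const stage of stages) {
    state = await stage(state);
  }
  return state;
}
```

Each stage takes the previous stage's output as input, which is what makes the pipeline loggable at every boundary.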

Stages to consider

Retrieve

Isolate embedding and search calls.

Moderate

Fail fast before expensive generation.

Act

Call tools with tight input validation.

Summarize

Compress for storage or user display.
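Illustrative signatures for the four stages above, with stubbed bodies; all names and state fields are assumptions, not Inquir APIs:

```javascript
// Retrieve: isolate embedding and search calls (stubbed here).
async function retrieve(state) {
  return { ...state, docs: [`doc for: ${state.query}`] };
}

// Moderate: fail fast before expensive generation (stubbed as a regex check).
function moderate(state) {
  if (/forbidden/i.test(state.query)) {
    return { ...state, blocked: true };
  }
  return state;
}

// Act: tight input validation before the tool call.
async function act(state, tool) {
  if (typeof state.query !== "string" || state.query.length > 2000) {
    throw new Error("invalid tool input");
  }
  return { ...state, toolResult: await tool(state.query) };
}

// Summarize: compress for storage or user display (stubbed as truncation).
function summarize(state) {
  return { ...state, summary: state.docs.join(" ").slice(0, 120) };
}
```

Because each stage takes and returns plain JSON state, any one of them can be retried, logged, or swapped out independently.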

How to stage LLM work with Inquir pipelines

1. Draw the dataflow

Name inputs/outputs per box.

2. Codify

Implement each box as a function or pipeline step.

3. Measure cost

Track tokens and wall time per stage.
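Step 3 can be as small as a wrapper around each stage. A sketch, assuming stages report their own token usage on the returned state (in practice this comes from your model provider's usage field):

```javascript
// Wrap a stage so every call records wall time and token spend.
// `name`, `metrics`, and the `usageTokens` field are illustrative assumptions.
function instrument(name, stage, metrics) {
  return async (state) => {
    const start = Date.now();
    const result = await stage(state);
    metrics.push({
      stage: name,
      ms: Date.now() - start,          // wall time for this stage
      tokens: result.usageTokens ?? 0, // tokens the stage reported
    });
    return result;
  };
}
```

With every stage wrapped, per-stage cost attribution falls out of the `metrics` array instead of a guess over one giant prompt.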

Pseudo stages

Replace with your orchestrator’s actual calls.

flow.js
// retrieve → moderate → generate → tool? → summarize → persist
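The comment above, sketched as control flow. Every stage is a stub standing in for your orchestrator's real calls; only the shape of the flow is the point (the `blocked` and `needsTool` fields are illustrative assumptions):

```javascript
// retrieve → moderate → generate → tool? → summarize → persist
async function flow(query, stages) {
  let state = await stages.retrieve({ query });
  state = stages.moderate(state);
  if (state.blocked) return state;       // short-circuit: never pay for generation
  state = await stages.generate(state);
  if (state.needsTool) {
    state = await stages.tool(state);    // "tool?" — only when the model asks
  }
  state = stages.summarize(state);
  await stages.persist(state);
  return state;
}
```

Because `stages` is just an object of functions, each box from your dataflow drawing maps to one property, and any box can be replayed in isolation.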

When to use pipelines

  • Multi-model flows
  • Human-in-the-loop handoffs
  • Long-running enrichment

When not to use pipelines

  • Single prompt demos

FAQ

Why split an LLM workflow into stages?

Retries, cost attribution, and debugging improve when retrieval, moderation, tool calls, and summarization are separate steps with their own logs.

How should I stream tokens to end users?

Keep user-visible streaming at the boundary; internal stages can use request/response for simpler failure handling and replays.

How do I control cost across stages?

Measure tokens and wall time per stage in observability; cap expensive steps with budgets and short-circuit when moderation fails.
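One way to sketch the budget cap, assuming stages report token usage on their results (`withBudget` and the `spent` accumulator are illustrative names, not Inquir APIs):

```javascript
// Wrap a stage so it refuses to run once the shared token budget is spent.
// Combine with a moderation short-circuit so blocked requests cost nothing more.
function withBudget(stage, maxTokens, spent) {
  return async (state) => {
    if (spent.tokens >= maxTokens) {
      throw new Error("token budget exceeded");
    }
    const result = await stage(state);
    spent.tokens += result.usageTokens ?? 0;
    return result;
  };
}
```

Wrapping only the expensive stages (generation, tool calls) keeps cheap stages like moderation free to run and fail fast.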

Inquir Compute

The simplest way to run AI agents and backend jobs without infrastructure.

Contact info@inquir.org

© 2025 Inquir Compute. All rights reserved.