How pgflow Works
pgflow is deliberately simple inside: a few SQL tables, a couple of SQL functions, and thin helpers around them. Yet those pieces compose into a full workflow engine with Postgres as the single source of truth.
This page walks you through the mental model in three short hops:
- The three layers (DSL → SQL Core → Worker)
- The step state machine (created → started → completed/failed)
- How workers execute steps (the implementation detail)
pgflow follows four core design principles:
- Postgres-first: All state and orchestration logic in your database
- Opinionated over configurable: Simple, sensible defaults that work for most cases
- Robust yet simple: Reliability without complexity
- Compile-time safety: Catch flow definition errors before runtime
1. Three thin layers
Layer cheat-sheet:
| Layer | What it does | Lives where |
| --- | --- | --- |
| TypeScript DSL | Describe the flow shape with full type inference | Your repo |
| SQL Core | Owns all state & orchestration logic | A handful of Postgres tables + functions |
| Worker | Executes user code, then reports back | Edge Function by default (but can be anything) |
2. All state lives in SQL 📊
pgflow uses two categories of tables:
Definition tables:
- `flows` - Workflow identities and global options
- `steps` - DAG nodes with option overrides
- `deps` - DAG edges between steps
Runtime tables:
- `runs` - One row per workflow execution
- `step_states` - Tracks each step's current state
- `step_tasks` - Execution units for step processing (implementation detail)
Because updates happen inside the same transaction that handles the queue message, pgflow gets ACID guarantees “for free”.
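Because everything is plain tables, you can watch a run progress with an ordinary query. The sketch below uses the `postgres` npm client; the `run_id`, `step_slug`, and `status` column names are assumptions about the schema, so check the tables in your database.

```typescript
// Peek at a run's step states straight from Postgres.
// Assumes the `postgres` npm client and that step_states exposes
// run_id / step_slug / status columns (verify against your installed schema).
import postgres from "npm:postgres";

const sql = postgres(Deno.env.get("DATABASE_URL")!);

const runId = "00000000-0000-0000-0000-000000000000"; // a run_id from pgflow.runs

const steps = await sql`
  SELECT step_slug, status
  FROM pgflow.step_states
  WHERE run_id = ${runId}
`;

console.table(steps);

await sql.end();
```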
3. Steps drive your workflow
A workflow run finishes when every step reaches a terminal status (completed or failed). Each step in `pgflow.step_states` follows this lifecycle:

created → started → completed / failed
- created: Step exists but dependencies aren’t met yet
- started: All dependencies complete, step is executing
- completed/failed: Step has finished
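If you mirror step states in application code, the lifecycle maps onto a small union type. This is a sketch based on the statuses listed above, not a type exported by pgflow.

```typescript
// The step lifecycle from this section as plain TypeScript.
type StepStatus = "created" | "started" | "completed" | "failed";

// A run finishes when every step reaches one of the terminal statuses.
const isTerminal = (status: StepStatus): boolean =>
  status === "completed" || status === "failed";
```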
How workers execute steps
Behind the scenes, when a step becomes ready, workers handle the execution through a simple 3-call sequence:
1. `read_with_poll` → locks queue messages
2. `start_tasks` → marks task started, builds input
3. `complete_task` / `fail_task` → finishes task, moves step forward
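To make the sequence concrete, here is a sketch of a worker's control flow. The four SQL functions are the ones named above, but the `deps` wrappers and their signatures are hypothetical; the bundled Edge Function worker implements this loop for you.

```typescript
// Sketch of a worker loop around the 3-call sequence. The members of `deps`
// are hypothetical wrappers over the pgflow SQL functions named above; their
// real signatures live in the SQL Core.
type Task = {
  input: unknown;                                // built by start_tasks
  handler: (input: unknown) => Promise<unknown>; // your step handler
};

type WorkerDeps = {
  pollMessages: () => Promise<unknown[]>;                       // pgflow.read_with_poll
  startTasks: (messages: unknown[]) => Promise<Task[]>;         // pgflow.start_tasks
  completeTask: (task: Task, output: unknown) => Promise<void>; // pgflow.complete_task
  failTask: (task: Task, error: unknown) => Promise<void>;      // pgflow.fail_task
};

async function workerLoop(deps: WorkerDeps): Promise<void> {
  while (true) {
    const messages = await deps.pollMessages();    // 1. lock queue messages
    const tasks = await deps.startTasks(messages); // 2. mark tasks started, get inputs

    for (const task of tasks) {
      try {
        const output = await task.handler(task.input);
        await deps.completeTask(task, output);     // 3a. success: step moves forward
      } catch (error) {
        await deps.failTask(task, error);          // 3b. failure: retry or fail the step
      }
    }
  }
}
```

Because every state transition happens inside the SQL Core, a worker that dies mid-loop can simply be replaced; the state that matters lives in Postgres, not in the process.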
4. Ultra-short example (2 sequential steps)
Flow definition
```typescript
import { Flow } from "npm:@pgflow/dsl";

type Input = { first: string; last: string };

export default new Flow<Input>({ slug: "greet_user" })
  .step(
    { slug: "full_name" },
    (input) => `${input.run.first} ${input.run.last}` // return type inferred
  )
  .step(
    { slug: "greeting", dependsOn: ["full_name"] },
    (input) => `Hello, ${input.full_name}!` // safe access, full IntelliSense
  );
```
What the compiler generates
```sql
SELECT pgflow.create_flow('greet_user');
SELECT pgflow.add_step('greet_user', 'full_name');
SELECT pgflow.add_step('greet_user', 'greeting', ARRAY['full_name']);
```
No boilerplate, no hand-written DAG SQL.
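Starting a run is one more SQL call. The sketch below assumes a `pgflow.start_flow(flow_slug, input)` function and the `postgres` npm client; treat the exact name and arguments as assumptions and confirm them against the API reference for your version.

```typescript
// Kick off a run of greet_user with a JSON input.
// pgflow.start_flow(flow_slug, input) is assumed here; verify the exact
// signature against your installed pgflow version.
import postgres from "npm:postgres";

const sql = postgres(Deno.env.get("DATABASE_URL")!);

const [run] = await sql`
  SELECT *
  FROM pgflow.start_flow('greet_user', ${sql.json({ first: "Jane", last: "Doe" })})
`;

console.log(run); // the new row in pgflow.runs

await sql.end();
```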
5. Type safety from end to end
Only the initial input type is annotated (`Input`). Every other type is inferred:
- Return type of `full_name` ➜ becomes the `input.full_name` type
- The compiler prevents you from referencing `input.summary` if it does not exist
- Refactors propagate instantly: change one handler's return type and dependent steps turn red in your IDE
When a step executes, its input combines the flow input with outputs from dependencies:
```jsonc
// If 'greeting' depends on 'full_name', its input looks like:
{
  "run": { "first": "Jane", "last": "Doe" },
  "full_name": "Jane Doe"
}
```
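The flip side is what the compiler rejects. A sketch reusing the flow shape from above: a handler that reaches for a key no dependency produced fails to type-check, which is exactly the `input.summary` case mentioned earlier.

```typescript
import { Flow } from "npm:@pgflow/dsl";

type Input = { first: string; last: string };

// Same shape as the greet_user flow above, but the second handler references
// a key that no upstream step produced, so TypeScript rejects it.
export const broken = new Flow<Input>({ slug: "greet_user_broken" })
  .step({ slug: "full_name" }, (input) => `${input.run.first} ${input.run.last}`)
  .step(
    { slug: "greeting", dependsOn: ["full_name"] },
    // @ts-expect-error 'summary' does not exist on this step's input
    (input) => `Hello, ${input.summary}!`
  );
```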
6. Why the three-layer design is liberating
Because all orchestration lives in the SQL Core:
- Workers are stateless – they can crash or scale horizontally
- You can mix & match: one flow processed by Edge Functions, another by a Rust service
- Observability is trivial: `SELECT * FROM pgflow.runs` is your dashboard
The modular design means:
- Any worker can process any flow (workers just call SQL functions)
- Automatic retries with exponential backoff for transient failures (see the configuration sketch after this list)
- Horizontal scaling with multiple worker instances
- Zero vendor lock-in – swap the worker layer anytime
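The retry bullet above is worth a sketch. The option names used here (`maxAttempts`, `baseDelay`) are assumptions for illustration, not confirmed by this page; check pgflow's configuration reference for the exact names and units.

```typescript
import { Flow } from "npm:@pgflow/dsl";

type Input = { url: string };

// Hedged sketch: flow-level retry defaults plus a per-step override.
// maxAttempts / baseDelay are assumed option names; verify before copying.
export default new Flow<Input>({ slug: "fetch_page", maxAttempts: 3, baseDelay: 5 })
  .step(
    { slug: "download", maxAttempts: 10 }, // per-step override of the flow default
    async (input) => await (await fetch(input.run.url)).text()
  );
```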
- Three simple layers keep concerns separate.
- All state is in Postgres tables—easy to query, backup, and reason about.
- Four SQL functions (`read_with_poll`, `start_tasks`, `complete_task`, `fail_task`) advance the graph transactionally.
- The model feels like “a job queue that enqueues the next job”, but pgflow does the wiring.
- Type inference guarantees that every step only consumes data that actually exists.
You now have the core mental model needed for the rest of the docs—no more than ten minutes well spent 🚀