
How pgflow Works

pgflow is deliberately simple inside: a few SQL tables, a couple of SQL functions, and thin helpers around them. Yet those pieces compose into a full workflow engine with Postgres as the single source of truth.

This page walks you through the mental model in three short hops:

  1. The three layers (DSL → SQL Core → Worker)
  2. The step state machine (created → started → completed/failed)
  3. How workers execute steps (the implementation detail)

pgflow follows four core design principles:

  • Postgres-first: All state and orchestration logic in your database
  • Opinionated over configurable: Simple, sensible defaults that work for most cases
  • Robust yet simple: Reliability without complexity
  • Compile-time safety: Catch flow definition errors before runtime

Diagram: the three layers of pgflow architecture (TypeScript DSL, SQL Core, and Worker).

Layer cheat-sheet:

Layer          | What it does                                      | Lives where
---------------|---------------------------------------------------|-----------------------------------------------
TypeScript DSL | Describes the flow shape with full type inference | Your repo
SQL Core       | Owns all state & orchestration logic              | A handful of Postgres tables + functions
Worker         | Executes user code, then reports back             | Edge Function by default (but can be anything)

pgflow uses two categories of tables:

Definition tables:

  • flows - Workflow identities and global options
  • steps - DAG nodes with option overrides
  • deps - DAG edges between steps

Runtime tables:

  • runs - One row per workflow execution
  • step_states - Tracks each step’s current state
  • step_tasks - Execution units for step processing (implementation detail)

Because updates happen inside the same transaction that handles the queue message, pgflow gets ACID guarantees “for free”.


A workflow run finishes when every step reaches a terminal status (completed or failed). Each step in pgflow.step_states follows this lifecycle:

created → started → completed
                  ↘ failed
  • created: Step exists but dependencies aren’t met yet
  • started: All dependencies complete, step is executing
  • completed/failed: Step has finished
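
Because these states live in an ordinary table, you can check a run's progress with plain SQL. The query below is a sketch: pgflow.step_states is the real table name, but the run_id and status column names are assumptions to verify against your schema.

-- Sketch: how many steps of a run are still outside a terminal status?
SELECT count(*) AS unfinished_steps
FROM pgflow.step_states
WHERE run_id = 'your-run-id'
  AND status NOT IN ('completed', 'failed');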

Behind the scenes, when a step becomes ready, workers handle the execution through a simple 3-call sequence:

  1. read_with_poll → locks queue messages
  2. start_tasks → marks task started, builds input
  3. complete_task/fail_task → finishes task, moves step forward
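
Here is a minimal sketch of that polling loop, assuming hypothetical TypeScript wrappers around the SQL functions above; the wrapper names and argument shapes are illustrative, not the real pgflow worker API.

// Hypothetical wrappers around pgflow's SQL Core functions:
type Task = { runId: string; stepSlug: string; input: unknown };

declare function readWithPoll(queue: string): Promise<Task[]>;              // pgflow.read_with_poll
declare function startTasks(tasks: Task[]): Promise<Task[]>;                // pgflow.start_tasks
declare function completeTask(task: Task, output: unknown): Promise<void>;  // pgflow.complete_task
declare function failTask(task: Task, error: string): Promise<void>;        // pgflow.fail_task
declare function runHandler(task: Task): Promise<unknown>;                  // your step handler

export async function pollOnce(queue: string): Promise<void> {
  const locked = await readWithPoll(queue);   // 1. lock queue messages
  if (locked.length === 0) return;
  const started = await startTasks(locked);   // 2. mark tasks started, build inputs
  for (const task of started) {
    try {
      const output = await runHandler(task);
      await completeTask(task, output);       // 3a. finish task, advance the graph
    } catch (err) {
      await failTask(task, String(err));      // 3b. record failure so pgflow can retry or fail
    }
  }
}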

Ultra-short example (2 sequential steps)

supabase/functions/_flows/greet_user.ts
import { Flow } from "npm:@pgflow/dsl";

type Input = { first: string; last: string };

export default new Flow<Input>({ slug: "greet_user" })
  .step(
    { slug: "full_name" },
    (input) => `${input.run.first} ${input.run.last}` // return type inferred
  )
  .step(
    { slug: "greeting", dependsOn: ["full_name"] },
    (input) => `Hello, ${input.full_name}!` // safe access, full IntelliSense
  );

The same flow shape, registered in the SQL Core:

SELECT pgflow.create_flow('greet_user');
SELECT pgflow.add_step('greet_user', 'full_name');
SELECT pgflow.add_step('greet_user', 'greeting', ARRAY['full_name']);

No boilerplate, no hand-written DAG SQL.
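
Starting a run is one more SQL call. A minimal sketch, assuming the SQL Core's start function is pgflow.start_flow(flow_slug, input); check the exact signature for your pgflow version.

SELECT * FROM pgflow.start_flow(
  'greet_user',
  '{"first": "Jane", "last": "Doe"}'::jsonb
);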


Only the initial input type is annotated (Input). Every other type is inferred:

  • Return type of full_name ➜ becomes input.full_name type
  • The compiler prevents you from referencing input.summary, because no step produces it (see the snippet after this list)
  • Refactors propagate instantly—change one handler’s return type, dependent steps turn red in your IDE
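
For instance, adding a hypothetical step that reads input.summary is rejected at compile time. This snippet intentionally does not type-check; it assumes the same imports and Input type as greet_user.ts above.

// Hypothetical addition; TypeScript rejects it because 'summary' is neither
// the run input nor any dependency's output in this flow.
export const broken = new Flow<Input>({ slug: "greet_user_broken" })
  .step({ slug: "full_name" }, (input) => `${input.run.first} ${input.run.last}`)
  .step(
    { slug: "shout", dependsOn: ["full_name"] },
    (input) => input.summary.toUpperCase()
    //               ~~~~~~~ Property 'summary' does not exist on this step's input
  );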

When a step executes, its input combines the flow input with outputs from dependencies:

// If 'greeting' depends on 'full_name', its input looks like:
{
  "run": { "first": "Jane", "last": "Doe" },
  "full_name": "Jane Doe"
}

Why the three-layer design is liberating


Because all orchestration lives in the SQL Core:

  • Workers are stateless – they can crash or scale horizontally
  • You can mix & match: one flow processed by Edge Functions, another by a Rust service
  • Observability is trivial—SELECT * FROM pgflow.runs is your dashboard
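
A sketch of such a dashboard query (the join and the run_id/status columns are assumptions; adapt them to your schema):

SELECT r.run_id, r.status AS run_status, s.step_slug, s.status AS step_status
FROM pgflow.runs r
JOIN pgflow.step_states s USING (run_id)
ORDER BY r.run_id, s.step_slug;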

The modular design means:

  • Any worker can process any flow (workers just call SQL functions)
  • Automatic retries with exponential backoff for transient failures (retry options sketched after this list)
  • Horizontal scaling with multiple worker instances
  • Zero vendor lock-in – swap the worker layer anytime
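
Retry behavior is configured on the flow (and can be overridden per step). The option names below (maxAttempts, baseDelay, timeout) are assumptions based on my reading of the DSL; verify them against the @pgflow/dsl reference for your version.

import { Flow } from "npm:@pgflow/dsl";

type Input = { first: string; last: string };

// Hedged sketch: per-flow retry options (names are assumptions, see note above)
export default new Flow<Input>({
  slug: "greet_user",
  maxAttempts: 3, // retry a failed task up to 3 times
  baseDelay: 5,   // seconds before the first retry, growing exponentially
  timeout: 60,    // seconds a single task may run before it is failed
})
  .step({ slug: "full_name" }, (input) => `${input.run.first} ${input.run.last}`);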

  1. Three simple layers keep concerns separate.
  2. All state is in Postgres tables—easy to query, backup, and reason about.
  3. Four SQL functions (read_with_poll, start_tasks, complete_task, fail_task) advance the graph transactionally.
  4. The model feels like “a job queue that enqueues the next job”, but pgflow does the wiring.
  5. Type inference guarantees that every step only consumes data that actually exists.

You now have the core mental model needed for the rest of the docs—no more than ten minutes well spent 🚀