Workflows Explained

How workflows compose functions into a directed graph with one entry point

A workflow is a directed graph of bem functions, called as a single endpoint. You define the graph once; bem runs it on every call, managing state and data flow between nodes. This page covers the structure, the common shapes (sequential, branching, splitting, joining), and how to create and update workflows from the API.

What is a Workflow?

A workflow is a versioned, reusable orchestration layer that wraps one or more functions into a cohesive processing unit. Think of it as a directed graph where:

  • Nodes are named call sites that point at a function (Extract, Classify, Split, Join, Enrich, etc.)
  • Edges are directed connections that define how data flows between nodes

                         Workflow
+--------------------------------------------------------+
|                                                        |
|   +-----------+           +-----------+                |
|   |  Extract  |   ---->   |  Enrich   |                |
|   |  (main)   |           |           |                |
|   +-----------+           +-----------+                |
|                                                        |
+--------------------------------------------------------+
        ^
        |
   Input (PDF, email, JSON, etc.)

When you call a workflow, bem runs each node the input reaches, in the order the edges define, and manages state and data flow between them.

Why Use Workflows?

Single Entry Point

Instead of managing multiple function calls and tracking their dependencies yourself, workflows provide a single API endpoint. Call the workflow once, and bem handles the orchestration:

curl -X POST "https://api.bem.ai/v3/workflows/invoice-processing/call?wait=true" \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_API_KEY" \
  -d '{
    "input": { "singleFile": { "inputType": "pdf", "inputContent": "..." } }
  }'

Versioned Configuration

Workflows are versioned independently from the functions they contain. Each time you update a workflow's structure (changing nodes or edges), a new version is created. This means:

  • Existing integrations continue working on their version
  • You can test new workflow configurations without disrupting production
  • Rollback is straightforward: point to a previous version

Unified Monitoring

Track the entire processing pipeline through a single workflow call. The response surfaces the terminal outputs and errors produced by the workflow; fetch the full per-node execution graph via GET /v3/calls/{callID}/trace.

{
  "call": {
    "callID": "wc_abc123",
    "status": "completed",
    "workflowName": "invoice-processing",
    "workflowVersionNum": 1,
    "outputs": [
      { "eventID": "ev_abc", "eventType": "transform", "transformation": { ... } }
    ],
    "errors": [],
    "url": "/v3/calls/wc_abc123",
    "traceUrl": "/v3/calls/wc_abc123/trace"
  }
}
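
As a rough illustration, the response above can be reduced to the handful of fields most callers check. This is a minimal sketch that assumes the response arrives as parsed JSON with exactly the keys shown in the sample; summarize_call is a hypothetical helper, not part of any bem SDK, and the elided transformation body is replaced with an empty placeholder.

```python
from typing import Any


def summarize_call(response: dict[str, Any]) -> dict[str, Any]:
    """Pull the commonly checked fields out of a workflow call response."""
    call = response["call"]
    return {
        "call_id": call["callID"],
        "status": call["status"],
        "workflow": f'{call["workflowName"]}@v{call["workflowVersionNum"]}',
        "output_event_ids": [o["eventID"] for o in call.get("outputs", [])],
        "error_count": len(call.get("errors", [])),
        "trace_url": call.get("traceUrl"),
    }


# Sample mirroring the response shown above; the transformation body is
# elided in the docs, so an empty dict stands in for it here.
sample = {
    "call": {
        "callID": "wc_abc123",
        "status": "completed",
        "workflowName": "invoice-processing",
        "workflowVersionNum": 1,
        "outputs": [
            {"eventID": "ev_abc", "eventType": "transform", "transformation": {}}
        ],
        "errors": [],
        "url": "/v3/calls/wc_abc123",
        "traceUrl": "/v3/calls/wc_abc123/trace",
    }
}

summary = summarize_call(sample)
print(summary["workflow"], summary["status"], summary["trace_url"])
```

The trace_url value is the path to request when the per-node execution graph is needed.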

Workflow Structure

Every workflow has three key components:

1. Nodes

Nodes are named call sites in the workflow's directed acyclic graph (DAG). Each node points at a function (optionally pinned to a version). At least one node is required:

{
  "name": "invoice-processing",
  "nodes": [
    {
      "name": "invoice-extractor",
      "function": { "name": "invoice-extractor" }
    }
  ]
}

A node's name is unique within the workflow version and is referenced by mainNodeName and by edges. If omitted, it defaults to the function's own name.

2. Main Node

The main node is the entry point—the node that receives input when the workflow is called. Set mainNodeName to the name of one of the nodes declared above:

{
  "mainNodeName": "invoice-extractor"
}

The main node must not be the destination of any edge.

3. Edges

Edges define how data flows between nodes. Each edge specifies a source node and a destination node, creating the processing graph:

{
  "edges": [
    {
      "sourceNodeName": "invoice-extractor",
      "destinationNodeName": "sku-matcher"
    }
  ]
}

The output of the source node becomes the input for the destination node. Edges are optional — a single-node workflow has no edges.
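
The structural rules in this section can be checked locally before a definition is sent. The sketch below is a client-side lint only: validate_workflow and node_name are hypothetical helpers, the server may enforce more (or different) rules, and the choice to reject an unnamed node whose function is referenced by id is an assumption made because the function's name cannot be resolved offline.

```python
def node_name(node: dict) -> str:
    """A node's name defaults to its function's name when omitted."""
    name = node.get("name") or node["function"].get("name")
    if not name:
        # Assumption: without a name we cannot resolve the default locally.
        raise ValueError("node needs a name when its function is referenced by id")
    return name


def validate_workflow(workflow: dict) -> list[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    nodes = workflow.get("nodes", [])
    if not nodes:
        problems.append("at least one node is required")
    names = [node_name(n) for n in nodes]
    duplicates = sorted({n for n in names if names.count(n) > 1})
    if duplicates:
        problems.append(f"duplicate node names: {duplicates}")
    main = workflow.get("mainNodeName")
    if main not in names:
        problems.append(f"mainNodeName {main!r} is not a declared node")
    for edge in workflow.get("edges", []):
        for key in ("sourceNodeName", "destinationNodeName"):
            if edge.get(key) not in names:
                problems.append(f"edge {key} {edge.get(key)!r} is not a declared node")
        if main is not None and edge.get("destinationNodeName") == main:
            problems.append("the main node must not be the destination of an edge")
    return problems


workflow = {
    "name": "invoice-processing",
    "mainNodeName": "invoice-extractor",
    "nodes": [
        {"name": "invoice-extractor", "function": {"name": "invoice-extractor"}},
        {"function": {"name": "sku-matcher"}},  # name defaults to "sku-matcher"
    ],
    "edges": [
        {"sourceNodeName": "invoice-extractor", "destinationNodeName": "sku-matcher"}
    ],
}
print(validate_workflow(workflow))
```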

Common Workflow Patterns

Sequential Pipeline

Chain functions in sequence for multi-step processing:

Input --> Extract --> Enrich --> Payload Shaping

Example: Extract invoice data, match to product catalog, then format for ERP system.

{
  "name": "invoice-to-erp",
  "mainNodeName": "invoice-extractor",
  "nodes": [
    { "name": "invoice-extractor", "function": { "name": "invoice-extractor" } },
    { "name": "product-matcher", "function": { "name": "product-matcher" } },
    { "name": "erp-formatter", "function": { "name": "erp-formatter" } }
  ],
  "edges": [
    { "sourceNodeName": "invoice-extractor", "destinationNodeName": "product-matcher" },
    { "sourceNodeName": "product-matcher", "destinationNodeName": "erp-formatter" }
  ]
}
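
A linear pipeline like this one follows mechanically from an ordered list of function names. The sketch below shows one way to generate the definition, assuming each node is named after its function as in the example; sequential_workflow is a hypothetical helper, not a bem API.

```python
def sequential_workflow(name: str, function_names: list[str]) -> dict:
    """Build a linear workflow: each function feeds the next one in the list."""
    if not function_names:
        raise ValueError("a workflow needs at least one node")
    workflow = {
        "name": name,
        "mainNodeName": function_names[0],  # the first function is the entry point
        "nodes": [{"name": fn, "function": {"name": fn}} for fn in function_names],
    }
    # Pair each function with its successor to form the chain of edges.
    edges = [
        {"sourceNodeName": src, "destinationNodeName": dst}
        for src, dst in zip(function_names, function_names[1:])
    ]
    if edges:  # a single-node workflow has no edges, so the key is left out
        workflow["edges"] = edges
    return workflow


invoice_to_erp = sequential_workflow(
    "invoice-to-erp",
    ["invoice-extractor", "product-matcher", "erp-formatter"],
)
print(invoice_to_erp["edges"])
```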

Branching with Classify

Use a Classify function to route data down different paths based on content:

                       +--> Invoice Extract
                       |
Input --> Classify ----+--> Receipt Extract
                       |
                       +--> PO Extract

Example: Classify incoming documents and process each type differently.

{
  "name": "document-processor",
  "mainNodeName": "document-classifier",
  "nodes": [
    { "name": "document-classifier", "function": { "name": "document-classifier" } },
    { "name": "invoice-extractor", "function": { "name": "invoice-extractor" } },
    { "name": "receipt-extractor", "function": { "name": "receipt-extractor" } },
    { "name": "po-extractor", "function": { "name": "po-extractor" } }
  ],
  "edges": [
    { "sourceNodeName": "document-classifier", "destinationName": "invoice", "destinationNodeName": "invoice-extractor" },
    { "sourceNodeName": "document-classifier", "destinationName": "receipt", "destinationNodeName": "receipt-extractor" },
    { "sourceNodeName": "document-classifier", "destinationName": "purchase_order", "destinationNodeName": "po-extractor" }
  ]
}

The destinationName field on an edge maps to the classifications[].name values defined in your Classify function configuration.
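
Because every branch edge has the same shape, the edge list can be generated from a mapping of classification name to node name. The following is a minimal sketch: classify_edges and unrouted_classifications are hypothetical helpers, the "other" label is invented for illustration, and whether bem itself rejects a classification with no edge is not stated here, so the second function is only a local lint.

```python
def classify_edges(classifier_node: str, routes: dict[str, str]) -> list[dict]:
    """Build one edge per classification name -> destination node."""
    return [
        {
            "sourceNodeName": classifier_node,
            "destinationName": label,
            "destinationNodeName": node,
        }
        for label, node in routes.items()
    ]


def unrouted_classifications(classification_names: list[str], edges: list[dict]) -> list[str]:
    """List classification names that no edge handles."""
    routed = {e["destinationName"] for e in edges if "destinationName" in e}
    return [c for c in classification_names if c not in routed]


edges = classify_edges(
    "document-classifier",
    {
        "invoice": "invoice-extractor",
        "receipt": "receipt-extractor",
        "purchase_order": "po-extractor",
    },
)
# "other" is a hypothetical classification with no branch defined for it.
print(unrouted_classifications(["invoice", "receipt", "purchase_order", "other"], edges))
```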

Split and Process

Handle multi-document files by splitting and processing each piece:

              +--> Doc 1 --> Extract A
              |
PDF --> Split-+--> Doc 2 --> Extract B
              |
              +--> Doc 3 --> Extract C

Example: A PDF containing multiple shipment documents, each needing extraction.

{
  "name": "shipment-bundle-processor",
  "mainNodeName": "shipment-splitter",
  "nodes": [
    { "name": "shipment-splitter", "function": { "name": "shipment-splitter" } },
    { "name": "bol-extractor", "function": { "name": "bol-extractor" } },
    { "name": "commercial-invoice-extractor", "function": { "name": "commercial-invoice-extractor" } },
    { "name": "packing-list-extractor", "function": { "name": "packing-list-extractor" } }
  ],
  "edges": [
    { "sourceNodeName": "shipment-splitter", "destinationName": "bill_of_lading", "destinationNodeName": "bol-extractor" },
    { "sourceNodeName": "shipment-splitter", "destinationName": "commercial_invoice", "destinationNodeName": "commercial-invoice-extractor" },
    { "sourceNodeName": "shipment-splitter", "destinationName": "packing_list", "destinationNodeName": "packing-list-extractor" }
  ]
}

Aggregation with Join

Combine outputs from multiple sources into a unified result:

Source A --+
           |
Source B --+--> Join --> Unified Extract Output
           |
Source C --+

Example: Merge data from multiple related documents into a comprehensive record.

Function Reference Options

A node's function field accepts a FunctionVersionIdentifier. Provide either id or name (not both), and optionally a versionNum to pin to a specific version:

| Reference Style          | Example                                                      | Description              |
|--------------------------|--------------------------------------------------------------|--------------------------|
| By name (latest version) | "function": { "name": "invoice-extractor" }                  | Uses the current version |
| By ID (latest version)   | "function": { "id": "f_abc123" }                             | Uses the current version |
| By name with version     | "function": { "name": "invoice-extractor", "versionNum": 2 } | Pins to specific version |
| By ID with version       | "function": { "id": "f_abc123", "versionNum": 2 }            | Pins to specific version |

Pinning to specific versions is useful when you need deterministic behavior and want to control when updates propagate.
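
The "either id or name, not both" rule is easy to enforce when references are built in code. The sketch below uses a hypothetical function_ref helper that produces the four reference styles and rejects invalid combinations before anything is sent.

```python
from typing import Optional


def function_ref(
    *,
    name: Optional[str] = None,
    function_id: Optional[str] = None,
    version_num: Optional[int] = None,
) -> dict:
    """Build a function reference from exactly one of name or id."""
    if (name is None) == (function_id is None):
        raise ValueError("provide exactly one of name or function_id")
    ref: dict = {"name": name} if name is not None else {"id": function_id}
    if version_num is not None:
        ref["versionNum"] = version_num  # pin to a specific version
    return ref


latest = function_ref(name="invoice-extractor")
pinned = function_ref(function_id="f_abc123", version_num=2)
print(latest, pinned)
```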

Creating a Workflow

Create a workflow with POST /v3/workflows:

curl -X POST https://api.bem.ai/v3/workflows \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_API_KEY" \
  -d '{
    "name": "invoice-processing",
    "displayName": "Invoice Processing Pipeline",
    "tags": ["finance", "invoices"],
    "mainNodeName": "invoice-extractor",
    "nodes": [
      { "name": "invoice-extractor", "function": { "name": "invoice-extractor" } },
      { "name": "sku-matcher", "function": { "name": "sku-matcher" } }
    ],
    "edges": [
      { "sourceNodeName": "invoice-extractor", "destinationNodeName": "sku-matcher" }
    ]
  }'

Configuration Fields

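| Field        | Type     | Required | Description                                            |
|--------------|----------|----------|--------------------------------------------------------|
| name         | string   | Yes      | Unique identifier (alphanumeric, hyphens, underscores) |
| displayName  | string   | No       | Human-readable name for the UI                         |
| tags         | string[] | No       | Tags for organizing workflows                          |
| mainNodeName | string   | Yes      | Name of the entry-point node                           |
| nodes        | array    | Yes      | Call-site nodes in the DAG (at least one)              |
| edges        | array    | No       | Directed edges between nodes                           |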
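
The field rules above can be turned into a quick pre-flight check. This sketch makes two assumptions that the field table does not confirm: that "alphanumeric, hyphens, underscores" means ASCII characters only (hence the regular expression), and that types are exactly as listed. check_create_payload is a hypothetical helper.

```python
import re

# Assumption: ASCII letters, digits, hyphens, and underscores only.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+")
REQUIRED_FIELDS = ("name", "mainNodeName", "nodes")


def check_create_payload(payload: dict) -> list[str]:
    """Return problems found in a create-workflow payload (empty list if none)."""
    problems = [
        f"missing required field: {field}"
        for field in REQUIRED_FIELDS
        if field not in payload
    ]
    name = payload.get("name")
    if isinstance(name, str) and not NAME_PATTERN.fullmatch(name):
        problems.append(f"name {name!r} has characters outside [A-Za-z0-9_-]")
    tags = payload.get("tags", [])
    if not isinstance(tags, list) or not all(isinstance(t, str) for t in tags):
        problems.append("tags must be a list of strings")
    return problems


payload = {
    "name": "invoice-processing",
    "displayName": "Invoice Processing Pipeline",
    "tags": ["finance", "invoices"],
    "mainNodeName": "invoice-extractor",
    "nodes": [{"name": "invoice-extractor", "function": {"name": "invoice-extractor"}}],
}
print(check_create_payload(payload))
```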

Updating a Workflow

Updates create a new version, preserving the previous configuration. Use PATCH /v3/workflows/{workflowName}:

curl -X PATCH https://api.bem.ai/v3/workflows/invoice-processing \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_API_KEY" \
  -d '{
    "mainNodeName": "invoice-extractor",
    "nodes": [
      { "name": "invoice-extractor", "function": { "name": "invoice-extractor" } },
      { "name": "sku-matcher", "function": { "name": "sku-matcher" } },
      { "name": "erp-formatter", "function": { "name": "erp-formatter" } }
    ],
    "edges": [
      { "sourceNodeName": "invoice-extractor", "destinationNodeName": "sku-matcher" },
      { "sourceNodeName": "sku-matcher", "destinationNodeName": "erp-formatter" }
    ]
  }'

When updating structure, you must provide mainNodeName, nodes, and edges together. Omit all three to keep the topology unchanged from the current version while updating displayName, tags, or name.
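
The all-or-nothing rule for structural fields can be guarded on the client. A minimal sketch, assuming the three topology fields are the only ones subject to this rule; check_update_body is a hypothetical helper.

```python
TOPOLOGY_FIELDS = ("mainNodeName", "nodes", "edges")


def check_update_body(body: dict) -> None:
    """Raise if the body carries only some of the three topology fields."""
    present = [f for f in TOPOLOGY_FIELDS if f in body]
    if present and len(present) != len(TOPOLOGY_FIELDS):
        missing = [f for f in TOPOLOGY_FIELDS if f not in body]
        raise ValueError(
            f"structural update is missing {missing}; send all of "
            f"{list(TOPOLOGY_FIELDS)} or none of them"
        )


# Metadata-only update: the topology stays as it is in the current version.
check_update_body({"displayName": "Invoice Pipeline", "tags": ["finance"]})

# Structural update: all three fields travel together.
check_update_body({
    "mainNodeName": "invoice-extractor",
    "nodes": [{"name": "invoice-extractor", "function": {"name": "invoice-extractor"}}],
    "edges": [],
})
```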

Workflow Versions

List all versions of a workflow with GET /v3/workflows/{workflowName}/versions:

curl https://api.bem.ai/v3/workflows/invoice-processing/versions \
  -H "x-api-key: YOUR_API_KEY"

Get a specific version with GET /v3/workflows/{workflowName}/versions/{versionNum}:

curl https://api.bem.ai/v3/workflows/invoice-processing/versions/2 \
  -H "x-api-key: YOUR_API_KEY"

Executing Workflows

Call a workflow with POST /v3/workflows/{workflowName}/call. The workflow name is derived from the URL path:

curl -X POST "https://api.bem.ai/v3/workflows/invoice-processing/call?wait=true" \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_API_KEY" \
  -d '{
    "callReferenceID": "my-tracking-id-001",
    "input": {
      "singleFile": {
        "inputType": "pdf",
        "inputContent": "JVBERi0xLjQK..."
      }
    }
  }'

You can also upload files as multipart/form-data against the same endpoint — see Call a Workflow.

The callReferenceID is your custom identifier for tracking this execution in your systems. Pass wait=true to have the endpoint wait up to 30 seconds for the call to complete. If the call is still running when the timeout elapses, the response returns status "pending" or "running"; you can then poll GET /v3/calls/{callID} or configure a webhook subscription.
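
The call-then-poll flow might be sketched as below, using only the Python standard library. Several things here are assumptions rather than documented behavior: that inputContent is the base64-encoded file (the sample value in the curl example decodes to a PDF header), that GET /v3/calls/{callID} returns the same {"call": {...}} envelope as the call response, and that any status other than "pending" or "running" is terminal. build_call_request and poll_until_done are hypothetical helpers; the fetch step is passed in as a callable so the polling logic stays independent of the HTTP client.

```python
import base64
import json
import time
import urllib.request
from typing import Callable

BASE_URL = "https://api.bem.ai"
IN_PROGRESS = ("pending", "running")


def build_call_request(
    workflow_name: str,
    pdf_bytes: bytes,
    api_key: str,
    reference_id: str,
    wait: bool = True,
) -> urllib.request.Request:
    """Build (but do not send) the POST request for a workflow call."""
    body = {
        "callReferenceID": reference_id,
        "input": {
            "singleFile": {
                "inputType": "pdf",
                # Assumption: the API expects base64-encoded file content.
                "inputContent": base64.b64encode(pdf_bytes).decode("ascii"),
            }
        },
    }
    url = f"{BASE_URL}/v3/workflows/{workflow_name}/call"
    if wait:
        url += "?wait=true"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


def poll_until_done(
    fetch_call: Callable[[], dict],
    interval_s: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_call until the status leaves pending/running, then return the call."""
    for attempt in range(max_attempts):
        call = fetch_call()["call"]
        if call["status"] not in IN_PROGRESS:
            return call
        if attempt + 1 < max_attempts:
            sleep(interval_s)
    raise TimeoutError(f"call still in progress after {max_attempts} polls")


request = build_call_request(
    "invoice-processing", b"%PDF-1.4\n", "YOUR_API_KEY", "my-tracking-id-001"
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request), then json.load() the response.
```

In real use, fetch_call would wrap a GET to /v3/calls/{callID} with the same x-api-key header; a webhook subscription avoids polling altogether.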

Ad-hoc Function Calls

V3 executes every call through a workflow — there is no standalone "call a function" endpoint. For a single-function pipeline, wrap the function in a one-node workflow with no edges:

curl -X POST https://api.bem.ai/v3/workflows \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_API_KEY" \
  -d '{
    "name": "invoice-extractor-wrapper",
    "mainNodeName": "invoice-extractor",
    "nodes": [
      { "name": "invoice-extractor", "function": { "name": "invoice-extractor" } }
    ]
  }'

Then call it like any other workflow via POST /v3/workflows/invoice-extractor-wrapper/call. Adding more nodes and edges later produces a new workflow version without breaking existing callers.

Best Practices

Start Simple, Extend Incrementally

Begin with a single-node workflow. As requirements grow, add nodes and edges:

  1. Create a workflow with just an Extract function
  2. Add Enrich to augment data with your catalogs
  3. Add Payload Shaping to format for downstream systems

Use Meaningful Names

Workflow and function names should describe their purpose:

  • invoice-processing (workflow)
  • invoice-extractor (extract function)
  • product-catalog-matcher (enrich function)

Leverage Tags for Organization

Use tags to categorize workflows by domain, team, or use case:

{
  "tags": ["finance", "ap-automation", "production"]
}

Pin Versions for Stability

In production, consider pinning function versions to prevent unexpected changes:

{
  "nodes": [
    { "name": "invoice-extractor", "function": { "name": "invoice-extractor", "versionNum": 3 } }
  ]
}
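
When a workflow has many nodes, pinning can be applied in one pass from a map of node name to version number. This is a sketch with a hypothetical pin_function_versions helper and illustrative version numbers; it returns a copy and leaves unlisted nodes on the latest version.

```python
import copy


def pin_function_versions(nodes: list[dict], versions: dict[str, int]) -> list[dict]:
    """Return a copy of nodes with versionNum set for every node named in versions."""
    pinned = copy.deepcopy(nodes)  # leave the caller's definition untouched
    for node in pinned:
        # A node's name defaults to its function's name when omitted.
        name = node.get("name") or node["function"].get("name")
        if name in versions:
            node["function"]["versionNum"] = versions[name]
    return pinned


nodes = [
    {"name": "invoice-extractor", "function": {"name": "invoice-extractor"}},
    {"name": "sku-matcher", "function": {"name": "sku-matcher"}},
]
production_nodes = pin_function_versions(nodes, {"invoice-extractor": 3})
print(production_nodes)
```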
