System Overview

This page covers bem's core primitives — functions, workflows, calls, events, transformations, subscriptions, views — and how they fit together. If you've already done the Quickstart and want a model of what's actually happening under the API, this is the page.

The Big Picture

+----------------+     +------------------+     +-----------------+
|                |     |                  |     |                 |
|   Your Input   | --> |    Workflow      | --> |   Structured    |
|  (PDF, email,  |     |  (orchestrates   |     |  Output (JSON)  |
|   image, etc.) |     |   Functions)     |     |                 |
|                |     |                  |     |                 |
+----------------+     +------------------+     +-----------------+
                              |
                              v
                   +--------------------+
                   |   Subscriptions    |
                   |   (webhooks to     |
                   |    your systems)   |
                   +--------------------+

You send documents to bem, workflows orchestrate processing using functions, and you receive structured JSON via polling or webhooks.

Core Primitives

Functions

A function is a single, reusable processing operation. Functions are the atomic building blocks of bem:

| Type            | Description                                                                            | Use case                                                 |
|-----------------|----------------------------------------------------------------------------------------|----------------------------------------------------------|
| extract         | 1:1 extraction of structured JSON from documents, images, and media                    | Invoices, forms, receipts, visual analysis               |
| classify        | Classifies inputs and directs them down labeled paths                                  | Document type classification, branching workflows        |
| split           | 1:N breakdown of multi-document files                                                  | Processing bundled PDFs                                  |
| join            | N:1 combination of multiple inputs                                                     | Merging related documents                                |
| enrich          | Augments data via semantic search against collections                                  | SKU matching, catalog lookup                             |
| parse           | Renders documents into a navigable structure of sections, entities, and relationships  | LLM-agent retrieval over a corpus, cross-document memory |
| payload_shaping | Transforms JSON structure using JMESPath                                               | Formatting for downstream APIs                           |
| send            | Delivers workflow outputs to an external destination                                   | Webhooks, S3 sync, Google Drive                          |

Functions are versioned — each configuration change creates a new version. Workflow nodes that pin a versionNum continue to use that version even after the function is updated.
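
The pinning behavior can be pictured with a toy in-memory model. This is only an illustrative sketch, not bem's implementation: the FunctionRegistry class, its method names, 1-based version numbers, and the "unpinned means latest" rule are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FunctionRegistry:
    """Hypothetical in-memory store of versioned function configurations."""

    versions: dict[str, list[dict]] = field(default_factory=dict)

    def update(self, name: str, config: dict) -> int:
        """Record a configuration change and return the new version number."""
        history = self.versions.setdefault(name, [])
        history.append(dict(config))  # copy so later edits don't mutate history
        return len(history)  # assumption: versions are numbered from 1

    def resolve(self, name: str, version_num: Optional[int] = None) -> dict:
        """Return the pinned version if one is given, otherwise the latest."""
        history = self.versions[name]
        if version_num is None:
            return history[-1]  # assumption: unpinned nodes follow the latest
        if not 1 <= version_num <= len(history):
            raise KeyError(f"{name} has no version {version_num}")
        return history[version_num - 1]


registry = FunctionRegistry()
v1 = registry.update("invoice-extract", {"fields": ["total"]})
v2 = registry.update("invoice-extract", {"fields": ["total", "vendor"]})
pinned = registry.resolve("invoice-extract", version_num=v1)
latest = registry.resolve("invoice-extract")
```

Here a node pinned to v1 keeps seeing the original configuration even though the function has since been updated to v2.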

If you're coming from V1/V2, see V3 migration for the rename map.

Workflows

A workflow orchestrates multiple functions into a unified processing pipeline. Workflows are configured as a directed graph:

                           Workflow
+-------------------------------------------------------------+
|                                                             |
|   +-----------+      +----------+      +----------------+   |
|   |  Extract  | ---> |  Enrich  | ---> | Payload Shaping|   |
|   +-----------+      +----------+      +----------------+   |
|                                                             |
+-------------------------------------------------------------+
         ^
         |
    Single Entry Point
  • Main Function: The entry point that receives input
  • Relationships: Define how data flows between functions
  • Versioned: Update workflows safely without disrupting production
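
As a rough mental model, the directed graph above can be written down as a main function plus a list of edges. The Workflow class, its field names, and the reachable_functions helper below are hypothetical and exist only to illustrate the structure; bem's actual workflow configuration format may differ.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Workflow:
    """Hypothetical model: one entry point plus directed edges between functions."""

    main_function: str
    relationships: list[tuple[str, str]] = field(default_factory=list)

    def reachable_functions(self) -> list[str]:
        """List functions reachable from the entry point, in breadth-first order."""
        successors: dict[str, list[str]] = {}
        for source, target in self.relationships:
            successors.setdefault(source, []).append(target)

        order: list[str] = []
        seen = {self.main_function}
        queue = deque([self.main_function])
        while queue:
            node = queue.popleft()
            order.append(node)
            for nxt in successors.get(node, []):
                if nxt not in seen:  # visit each function once
                    seen.add(nxt)
                    queue.append(nxt)
        return order


pipeline = Workflow(
    main_function="extract",
    relationships=[("extract", "enrich"), ("enrich", "payload_shaping")],
)
steps = pipeline.reachable_functions()
```

For the linear pipeline in the diagram, the traversal visits extract, then enrich, then payload_shaping.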

Calls

A call is an execution request. When you send data to bem, you create a call:

POST /v2/calls
        |
        v
+------------------+
|   WorkflowCall   |   <-- Your execution request
+------------------+
        |
        | spawns one per function
        v
+------------------+      +------------------+
|  FunctionCall 1  | ---> |  FunctionCall 2  |
+------------------+      +------------------+
  • Workflow Call: Executes an entire workflow
  • Ad-hoc Function Call: Executes a single function directly

Calls progress through statuses: pending --> running --> completed (or failed).
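
If you poll rather than use webhooks, the status progression suggests a loop that stops at a terminal status. The sketch below assumes the call object exposes a "status" field with the values listed above; wait_for_call and its fetch_call argument are hypothetical names, and fetch_call stands in for whatever HTTP client code retrieves the call.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_call(
    fetch_call: Callable[[], dict],
    interval_s: float = 1.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the call reaches a terminal status, then return it.

    fetch_call is a caller-supplied function that returns the current call
    as a dict (assumed to carry a "status" key).
    """
    for attempt in range(max_attempts):
        call = fetch_call()
        if call.get("status") in TERMINAL_STATUSES:
            return call
        if attempt < max_attempts - 1:
            sleep(interval_s)  # wait before the next poll
    raise TimeoutError(f"call not finished after {max_attempts} attempts")


# Simulated responses standing in for real API reads.
_responses = iter(
    [{"status": "pending"}, {"status": "running"}, {"status": "completed"}]
)
final_call = wait_for_call(lambda: next(_responses), sleep=lambda _s: None)
```

Passing sleep as a parameter keeps the loop testable without real delays; in production you would leave the default in place and likely add backoff.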

Events

An event is the output notification from a function execution. When a function completes, it produces an event containing the results.

FunctionCall completes
         |
         v
+------------------+
|      Event       |
+------------------+
         |
         +-- eventType  ("extract" | "enrich" | "transform" | "classify" | ...)
         |
         +-- Content field (payload, varies by eventType — see below)
         |
         +-- Triggers Subscriptions (webhooks)

Events are what subscriptions listen to: when an event is created, bem delivers it to your configured webhook endpoints. Events are also what a workflow call returns synchronously: the call object carries its terminal events in call.outputs, and each event holds the extracted data at transformedContent, enrichedContent, choice, or another field, depending on eventType. See Reading workflow call outputs for the full path map and accessor patterns in every SDK.
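
Because the content field varies by eventType, reading outputs generically means checking which field is present. The helper below is a minimal sketch that uses only the three field names mentioned on this page; the list is not exhaustive, and event_content and call_outputs are hypothetical helpers rather than SDK functions.

```python
from typing import Any, Optional

# Content fields named on this page; other event types may use other fields.
CONTENT_FIELDS = ("transformedContent", "enrichedContent", "choice")


def event_content(event: dict) -> Optional[Any]:
    """Return the first known content field present on the event, else None."""
    for name in CONTENT_FIELDS:
        if name in event:
            return event[name]
    return None


def call_outputs(call: dict) -> list[tuple[Optional[str], Any]]:
    """Pair each terminal event's eventType with its content payload."""
    return [
        (event.get("eventType"), event_content(event))
        for event in call.get("outputs", [])
    ]


# Hand-written sample shaped like the description above, not a real response.
sample_call = {
    "outputs": [
        {"eventType": "extract", "transformedContent": {"invoiceNumber": "INV-2024-001"}},
        {"eventType": "classify", "choice": "invoice"},
    ]
}
results = call_outputs(sample_call)
```

The linked page on reading workflow call outputs remains the authoritative source for which field belongs to which eventType.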

Transformations

A Transformation is a persisted record of one function's structured output, stored in bem and queryable via the legacy /v1-beta/transformations endpoints. The shape is:

{
  "transformID": "tr_abc123",
  "extractedJSON": {
    "invoiceNumber": "INV-2024-001",
    "vendor": "Acme Corp",
    "totalAmount": 1250.00
  },
  "referenceID": "your-tracking-id"
}

Transformations adhere to the outputSchema defined in the function configuration.
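
For code that still reads these legacy records, a small typed wrapper can make the shape explicit. This sketch mirrors only the three keys shown above; the Transformation dataclass is hypothetical, and treating referenceID as optional is an assumption.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Transformation:
    """Typed view of the legacy record shape shown above (hypothetical helper)."""

    transform_id: str
    extracted_json: dict[str, Any]
    reference_id: Optional[str] = None  # assumption: may be absent

    @classmethod
    def from_json(cls, raw: str) -> "Transformation":
        data = json.loads(raw)
        return cls(
            transform_id=data["transformID"],
            extracted_json=data["extractedJSON"],
            reference_id=data.get("referenceID"),
        )


record = Transformation.from_json(
    """
    {
      "transformID": "tr_abc123",
      "extractedJSON": {
        "invoiceNumber": "INV-2024-001",
        "vendor": "Acme Corp",
        "totalAmount": 1250.00
      },
      "referenceID": "your-tracking-id"
    }
    """
)
```

The keys inside extracted_json depend on the function's outputSchema, so they are left as a plain dictionary here.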

V3 workflow callers, take note: POST /v3/workflows/{name}/call does not return Transformation records. It returns Events whose extracted JSON is at outputs[].transformedContent (or enrichedContent etc., per the field map above). The legacy Transformation record shape only shows up if you read it through the V1/V2 endpoints.

Subscriptions

A subscription configures webhook delivery for events, connecting function outputs to your systems:

Function completes --> Event created --> Subscription triggers --> Webhook sent

Subscribe to specific functions to receive notifications when they complete.
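
On the receiving side, a webhook endpoint typically parses the delivered body and routes it by the function that produced it. The sketch below assumes the webhook body is the event as JSON and that it carries a functionName field (as listed in the data model summary); EventRouter and its methods are hypothetical, and a real endpoint would also need authentication and error handling.

```python
import json
from typing import Callable

Handler = Callable[[dict], None]


class EventRouter:
    """Hypothetical dispatcher: routes webhook events by functionName."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Handler]] = {}

    def on(self, function_name: str, handler: Handler) -> None:
        """Register a handler for events produced by one function."""
        self._handlers.setdefault(function_name, []).append(handler)

    def dispatch(self, raw_body: bytes) -> int:
        """Parse a webhook body and call matching handlers; return how many ran."""
        event = json.loads(raw_body)
        handlers = self._handlers.get(event.get("functionName"), [])
        for handler in handlers:
            handler(event)
        return len(handlers)


router = EventRouter()
received: list[dict] = []
router.on("invoice-extract", received.append)

# Sample body written for this example, not a captured bem payload.
handled = router.dispatch(
    b'{"functionName": "invoice-extract", "eventType": "extract"}'
)
```

The dispatch method can be called from any HTTP framework's request handler once the raw request body is available.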

Views

A view provides insight into transformation outputs. Views can include columns, filters, and aggregations—useful for monitoring and analyzing results across many function executions.

How Everything Connects

1. SETUP (once)
   +-- Create Functions (define extraction logic)
   +-- Create Workflow (chain functions together)
   +-- Create Subscriptions (configure webhooks)

2. EXECUTE (per document)
   POST /v2/calls
        |
        v
   WorkflowCall created (status: pending)
        |
        v
   FunctionCalls execute in sequence
        |
        v
   Events produced with Transformations
        |
        v
   Subscriptions trigger webhooks to your systems

3. RETRIEVE
   GET /v2/calls/{id} --> Full results with all function outputs

Data Model Summary

| Concept        | What It Is                                              | Contains                                                                                      |
|----------------|---------------------------------------------------------|-----------------------------------------------------------------------------------------------|
| Function       | Reusable processing unit                                | Configuration, output schema                                                                  |
| Workflow       | Orchestration layer                                     | Main function, relationships                                                                  |
| Call           | Execution request                                       | Input data, reference ID, terminal outputs[]                                                  |
| Function Call  | Single function execution                               | Status, attempt info                                                                          |
| Event          | Output notification (one per function execution)        | eventType, functionName, content payload (transformedContent / enrichedContent / choice / …)  |
| Transformation | Persisted record of a function's output (legacy V1/V2)  | transformID, extractedJSON, referenceID                                                       |
| Subscription   | Webhook config                                          | Function ID, webhook URL                                                                      |
