System Overview
This page covers bem's core primitives — functions, workflows, calls, events, transformations, subscriptions, views — and how they fit together. If you've already done the Quickstart and want a model of what's actually happening under the API, this is the page.
The Big Picture
+----------------+ +------------------+ +-----------------+
| | | | | |
| Your Input | --> | Workflow | --> | Structured |
| (PDF, email, | | (orchestrates | | Output (JSON) |
| image, etc.) | | Functions) | | |
| | | | | |
+----------------+ +------------------+ +-----------------+
|
v
+--------------------+
| Subscriptions |
| (webhooks to |
| your systems) |
+--------------------+
You send documents to bem, workflows orchestrate processing using functions, and you receive structured JSON via polling or webhooks.
Core Primitives
Functions
A function is a single, reusable processing operation. Functions are the atomic building blocks of bem:
| Type | Description | Use case |
|---|---|---|
| extract | 1:1 extraction of structured JSON from documents, images, and media | Invoices, forms, receipts, visual analysis |
| classify | Classifies inputs and directs them down labeled paths | Document type classification, branching workflows |
| split | 1:N breakdown of multi-document files | Processing bundled PDFs |
| join | N:1 combination of multiple inputs | Merging related documents |
| enrich | Augments data via semantic search against collections | SKU matching, catalog lookup |
| parse | Renders documents into a navigable structure of sections, entities, and relationships | LLM-agent retrieval over a corpus, cross-document memory |
| payload_shaping | Transforms JSON structure using JMESPath | Formatting for downstream APIs |
| send | Delivers workflow outputs to an external destination | Webhooks, S3 sync, Google Drive |
Functions are versioned — each configuration change creates a new version. Workflow nodes that pin a versionNum continue to use that version even after the function is updated.
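The pinning behavior described above can be pictured with a small in-memory model. This is only a sketch, not bem's API: the `FunctionRegistry` class, its method names, the function name `invoice-extract`, and the assumption that version numbers start at 1 are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FunctionRegistry:
    """Toy model of versioned function configurations (hypothetical, not bem's API)."""

    versions: dict[str, list[dict]] = field(default_factory=dict)

    def update(self, name: str, config: dict) -> int:
        """Store a new configuration and return its version number (assumed 1-based)."""
        history = self.versions.setdefault(name, [])
        history.append(dict(config))
        return len(history)

    def resolve(self, name: str, version_num: Optional[int] = None) -> dict:
        """Return the pinned version if one is given, otherwise the latest."""
        history = self.versions[name]
        if version_num is None:
            return history[-1]
        if not 1 <= version_num <= len(history):
            raise KeyError(f"{name} has no version {version_num}")
        return history[version_num - 1]


registry = FunctionRegistry()
v1 = registry.update("invoice-extract", {"fields": ["invoiceNumber"]})
v2 = registry.update("invoice-extract", {"fields": ["invoiceNumber", "vendor"]})

# A node that pins v1 keeps the old configuration; an unpinned node sees v2.
pinned = registry.resolve("invoice-extract", version_num=v1)
latest = registry.resolve("invoice-extract")
```

In this model, updating a function never mutates an earlier version, which is why a pinned node is unaffected by later changes.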
If you're coming from V1/V2, see V3 migration for the rename map.
Workflows
A workflow orchestrates multiple functions into a unified processing pipeline. Workflows are configured as a directed graph:
Workflow
+-------------------------------------------------------------+
| |
| +-----------+ +----------+ +----------------+ |
| | Extract | ---> | Enrich | ---> | Payload Shaping| |
| +-----------+ +----------+ +----------------+ |
| |
+-------------------------------------------------------------+
^
|
Single Entry Point
- Main Function: The entry point that receives input
- Relationships: Define how data flows between functions
- Versioned: Update workflows safely without disrupting production
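To make the directed-graph idea concrete, here is a minimal sketch that represents a workflow as a main function plus a list of relationships and walks it from the entry point. The dictionary layout and key names are illustrative assumptions, not bem's actual workflow configuration format.

```python
from collections import deque

# Hypothetical workflow description: key names and layout are assumptions.
workflow = {
    "main_function": "extract",
    "relationships": [
        ("extract", "enrich"),
        ("enrich", "payload_shaping"),
    ],
}


def execution_order(wf: dict) -> list[str]:
    """List the functions reachable from the main function, breadth-first."""
    successors: dict[str, list[str]] = {}
    for source, target in wf["relationships"]:
        successors.setdefault(source, []).append(target)

    order: list[str] = []
    seen: set[str] = set()
    queue = deque([wf["main_function"]])
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # a node can be reached by more than one path
        seen.add(node)
        order.append(node)
        queue.extend(successors.get(node, []))
    return order


order = execution_order(workflow)
```

For the linear pipeline in the diagram, the walk visits extract, then enrich, then payload_shaping. A branching workflow (for example, after a classify node) would simply add more outgoing edges from one source.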
Calls
A call is an execution request. When you send data to bem, you create a call:
POST /v2/calls
|
v
+------------------+
| WorkflowCall | <-- Your execution request
+------------------+
|
| spawns one per function
v
+------------------+ +------------------+
| FunctionCall 1 | ---> | FunctionCall 2 |
+------------------+ +------------------+
- Workflow Call: Executes an entire workflow
- Ad-hoc Function Call: Executes a single function directly
Calls progress through statuses: pending → running → completed (or failed).
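If you poll instead of using webhooks, the status progression suggests a simple loop that stops at a terminal status. The sketch below assumes the call object exposes a `status` key with the values listed above; `fetch_call` is a hypothetical callable standing in for whatever request you use to retrieve the call, and the responses here are simulated.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_call(
    fetch_call: Callable[[], dict],
    interval_s: float = 1.0,
    timeout_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the call reaches a terminal status or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        call = fetch_call()
        if call["status"] in TERMINAL_STATUSES:
            return call
        if time.monotonic() >= deadline:
            raise TimeoutError(f"call still {call['status']!r} after {timeout_s}s")
        sleep(interval_s)


# Simulated responses standing in for repeated retrievals of the same call.
responses = iter([
    {"status": "pending"},
    {"status": "running"},
    {"status": "completed"},
])
result = wait_for_call(lambda: next(responses), sleep=lambda _seconds: None)
```

Passing `sleep` as a parameter keeps the loop easy to test; in real use you would leave the default and likely add backoff between attempts.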
Events
An event is the output notification from a function execution. When a function completes, it produces an event containing the results.
FunctionCall completes
|
v
+------------------+
| Event |
+------------------+
|
+-- eventType ("extract" | "enrich" | "transform" | "classify" | ...)
|
+-- Content field (payload, varies by eventType — see below)
|
+-- Triggers Subscriptions (webhooks)
Events are what subscriptions listen to: when an event is created, bem delivers it to your configured webhook endpoints. Events are also what a workflow call returns synchronously: a call object whose terminal events are in call.outputs, with the extracted data on each event at transformedContent, enrichedContent, choice, or another field, depending on eventType. See Reading workflow call outputs for the full path map and accessor patterns in every SDK.
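Because the content field varies by eventType, a small accessor can save repeated branching. The sketch below checks only the three field names mentioned on this page and does not assert which eventType maps to which field; the sample call object and the pairing of event types with fields in it are assumptions for illustration.

```python
from typing import Any, Optional

# Content fields named on this page; other event types may use other fields.
CONTENT_FIELDS = ("transformedContent", "enrichedContent", "choice")


def event_content(event: dict) -> Optional[Any]:
    """Return the first known content field present on the event, else None."""
    for field_name in CONTENT_FIELDS:
        if field_name in event:
            return event[field_name]
    return None


# Hypothetical call object. The pairing of eventType with a content field
# below is an assumption, not a documented mapping.
call = {
    "outputs": [
        {"eventType": "extract", "transformedContent": {"vendor": "Acme Corp"}},
        {"eventType": "classify", "choice": "invoice"},
    ]
}

payloads = [(event["eventType"], event_content(event)) for event in call["outputs"]]
```

Returning `None` for unknown shapes keeps the helper safe to run on event types it does not know about; the linked path map remains the authoritative source for the real field names.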
Transformations
A Transformation is a persisted record of one function's structured output, stored in bem and queryable via the legacy /v1-beta/transformations endpoints. The shape is:
{
"transformID": "tr_abc123",
"extractedJSON": {
"invoiceNumber": "INV-2024-001",
"vendor": "Acme Corp",
"totalAmount": 1250.00
},
"referenceID": "your-tracking-id"
}
Transformations adhere to the outputSchema defined in the function configuration.
V3 workflow callers, take note:
POST /v3/workflows/{name}/call does not return Transformation records. It returns Events whose extracted JSON is at outputs[].transformedContent (or enrichedContent etc., per the field map above). The legacy Transformation record shape only shows up if you read it through the V1/V2 endpoints.
Subscriptions
A subscription configures webhook delivery for events, connecting function outputs to your systems:
Function completes --> Event created --> Subscription triggers --> Webhook sent
Subscribe to specific functions to receive notifications when they complete.
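On the receiving side, a webhook endpoint is just an HTTP handler that accepts a POST. The following is a minimal standard-library sketch, assuming the webhook body is a JSON object and that a 200 response acknowledges delivery; both are assumptions, and the sketch omits any authentication or signature checking your deployment may need. All names in it are hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

received_events: list[dict] = []


def handle_event(body: bytes) -> int:
    """Parse a webhook body, record the event, and return an HTTP status code."""
    try:
        event = json.loads(body)
    except ValueError:  # covers malformed JSON and undecodable bytes
        return 400
    if not isinstance(event, dict):
        return 400
    received_events.append(event)
    return 200


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        status = handle_event(self.rfile.read(length))
        self.send_response(status)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Start a blocking local server; call this from your own entry point."""
    HTTPServer(("127.0.0.1", port), WebhookHandler).serve_forever()
```

Keeping the parsing in `handle_event` separates it from the HTTP plumbing, so the same logic can be reused inside whatever web framework you already run.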
Views
A view provides insight into transformation outputs. Views can include columns, filters, and aggregations—useful for monitoring and analyzing results across many function executions.
How Everything Connects
1. SETUP (once)
+-- Create Functions (define extraction logic)
+-- Create Workflow (chain functions together)
+-- Create Subscriptions (configure webhooks)
2. EXECUTE (per document)
POST /v2/calls
|
v
WorkflowCall created (status: pending)
|
v
FunctionCalls execute in sequence
|
v
Events produced with Transformations
|
v
Subscriptions trigger webhooks to your systems
3. RETRIEVE
GET /v2/calls/{id} --> Full results with all function outputs
Data Model Summary
| Concept | What It Is | Contains |
|---|---|---|
| Function | Reusable processing unit | Configuration, output schema |
| Workflow | Orchestration layer | Main function, relationships |
| Call | Execution request | Input data, reference ID, terminal outputs[] |
| Function Call | Single function execution | Status, attempt info |
| Event | Output notification (one per function execution) | eventType, functionName, content payload (transformedContent / enrichedContent / choice / …) |
| Transformation | Persisted record of a function's output (legacy V1/V2) | transformID, extractedJSON, referenceID |
| Subscription | Webhook config | Function ID, webhook URL |
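As a reading aid, part of the table above can be written as plain data classes. This is a sketch of the relationships only: the attribute names are loosely derived from the "Contains" column, and the real API objects may be shaped differently.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Function:
    name: str
    configuration: dict
    output_schema: dict


@dataclass
class Workflow:
    main_function: str
    relationships: list[tuple[str, str]]  # (source, target) function names


@dataclass
class Event:
    event_type: str
    function_name: str
    content: Any  # transformedContent / enrichedContent / choice / ...


@dataclass
class Call:
    input_data: Any
    reference_id: str
    outputs: list[Event] = field(default_factory=list)  # terminal events


# Illustrative values only.
call = Call(input_data=b"%PDF-...", reference_id="your-tracking-id")
call.outputs.append(Event("extract", "invoice-extract", {"vendor": "Acme Corp"}))
```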