Introduction
bem turns documents (PDFs, emails, spreadsheets, images, audio, video) into structured JSON that conforms to a schema you define. You compose small typed functions into a workflow, call the workflow with a file, and bem returns extracted data ready to write to your systems.
How it fits together
| Primitive | What it is |
|---|---|
| Function | A single processing step. One of extract, classify, split, join, enrich, parse, payload_shaping, or send. Versioned. |
| Workflow | A directed graph of functions with one entry point. Versioned. |
| Call | A single execution of a workflow against a specific input. |
| Event | The output of a function within a call. Successful events carry a transformation; failed events carry an error. |
| Subscription | A binding from a function to a webhook URL. |
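As a rough mental model, the five primitives can be sketched as plain Python dataclasses. Every class and field name below is an assumption made for illustration, not bem's actual API schema. Only the relationships come from the table above: a workflow is a graph of functions with one entry point, and an event carries either a transformation or an error.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

# The eight function types named in the table above.
FUNCTION_TYPES = {
    "extract", "classify", "split", "join",
    "enrich", "parse", "payload_shaping", "send",
}


@dataclass(frozen=True)
class Function:
    name: str
    type: str
    version: int = 1  # functions are versioned

    def __post_init__(self) -> None:
        if self.type not in FUNCTION_TYPES:
            raise ValueError(f"unknown function type: {self.type}")


@dataclass
class Workflow:
    name: str
    entry: str  # name of the single entry-point function (hypothetical field)
    edges: Dict[str, List[str]] = field(default_factory=dict)
    version: int = 1  # workflows are versioned too


@dataclass
class Event:
    function_name: str
    transformation: Optional[Dict[str, Any]] = None
    error: Optional[str] = None

    def __post_init__(self) -> None:
        # Exactly one of transformation / error must be set.
        if (self.transformation is None) == (self.error is None):
            raise ValueError("an event carries a transformation or an error")

    @property
    def ok(self) -> bool:
        return self.error is None


@dataclass
class Call:
    workflow_name: str
    input_ref: str  # hypothetical pointer to the input file
    events: List[Event] = field(default_factory=list)


@dataclass(frozen=True)
class Subscription:
    function_name: str
    webhook_url: str
```

The `Event` check mirrors the success-or-failure split in the table; how bem itself represents that split on the wire is not specified here.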
The function types
| Function | What it does |
|---|---|
| Extract | Pulls structured JSON from any supported file against your outputSchema. |
| Classify | Sends the input down one of several labeled paths based on content. |
| Split | Breaks a multi-document file into individual pieces for downstream processing. |
| Join | Combines the outputs of upstream nodes into a single payload. |
| Enrich | Augments extracted data with context from a collection via semantic search. |
| Parse | Renders documents into a navigable structure — sections, entities, relationships — for an LLM agent to walk via the File System API. |
| Payload Shaping | Reshapes JSON with JMESPath for ingestion into downstream systems. |
| Send | Delivers workflow outputs to a webhook, S3 bucket, or Google Drive folder. |
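To make the composition concrete, here is a minimal sketch of a workflow as a directed graph built from the function types above. The node names and the dictionary layout are hypothetical, and this is not bem's workflow definition format; it only illustrates the "one entry point" rule and how a classify node fans out to labeled paths.

```python
from collections import deque

# Hypothetical workflow: node name -> (function type, downstream node names).
WORKFLOW = {
    "route_by_doc_type": ("classify", ["extract_invoice", "extract_receipt"]),
    "extract_invoice": ("extract", ["shape_for_erp"]),
    "extract_receipt": ("extract", ["shape_for_erp"]),
    "shape_for_erp": ("payload_shaping", ["deliver"]),
    "deliver": ("send", []),
}


def entry_points(workflow):
    """Return the nodes that no other node points to."""
    targets = {t for _, downstream in workflow.values() for t in downstream}
    return [name for name in workflow if name not in targets]


def walk(workflow, start):
    """Breadth-first order of nodes reachable from start, each visited once."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in workflow[node][1]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order


entries = entry_points(WORKFLOW)
if len(entries) != 1:
    raise ValueError(f"expected exactly one entry point, found {entries}")
print(walk(WORKFLOW, entries[0]))
```

Note that `walk` lists the static structure of the graph. At run time a classify node sends each input down only one of its paths, so a single call would touch one of the two extract nodes, not both.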
bem is SOC 2 Type 2, HIPAA, and GDPR compliant. Outputs are validated against the schema you provide, and low-confidence transformations can be routed to human review automatically.
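The validate-then-review flow can be pictured with a toy example. Everything below is an assumption for illustration: the field-to-type mapping is a simplified stand-in for a real output schema, and the numeric confidence score and the 0.8 threshold are hypothetical, since this page does not say how bem represents confidence or where the review cutoff sits.

```python
# Hypothetical, simplified schema: field name -> expected Python type.
INVOICE_SCHEMA = {"invoice_number": str, "total": float}

# Hypothetical cutoff below which an output goes to human review.
REVIEW_THRESHOLD = 0.8


def validate(output, schema):
    """Return a list of problems; an empty list means the output is valid."""
    problems = []
    for key, expected in schema.items():
        if key not in output:
            problems.append(f"missing field: {key}")
        elif not isinstance(output[key], expected):
            problems.append(f"wrong type for field: {key}")
    return problems


def disposition(output, confidence, schema=INVOICE_SCHEMA,
                threshold=REVIEW_THRESHOLD):
    """Decide what happens to one extracted output."""
    if validate(output, schema):
        return "rejected"        # does not match the schema
    if confidence < threshold:
        return "human_review"    # valid but low confidence
    return "accepted"
```

For example, `disposition({"invoice_number": "A-17", "total": 120.5}, 0.65)` lands in the review queue under these assumptions, while the same output with a confidence of 0.95 is accepted.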
Get started
Quickstart
Your first synchronous workflow call end-to-end — pick a language and ship in five minutes.
System overview
The full data model: functions, workflows, calls, events, subscriptions.
Workflows explained
Compose functions into branching, splitting, and aggregating graphs.
API reference
Authentication, endpoints, request and response shapes.
SDKs and tools
Official client libraries cover TypeScript, Python, Go, and C#. There's also a CLI, an MCP server for agent-driven access, and a Terraform provider for declarative configuration. All read BEM_API_KEY from the environment by default.
See SDKs for the install commands and links, or jump to the Quickstart for side-by-side examples.
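Reading the key from the environment by default typically looks like the sketch below. The `resolve_api_key` helper and its explicit-argument override are hypothetical and are not taken from any bem SDK; only the `BEM_API_KEY` variable name comes from this page.

```python
import os
from typing import Optional


def resolve_api_key(explicit: Optional[str] = None) -> str:
    """Prefer an explicitly passed key, else fall back to BEM_API_KEY."""
    key = explicit or os.environ.get("BEM_API_KEY")
    if not key:
        raise RuntimeError("Set BEM_API_KEY or pass an API key explicitly")
    return key
```

Keeping the key in the environment means the same code can run locally, in CI, and in production without credentials appearing in source control.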
V3 and the legacy API
If you have an existing integration on V1 or V2, see V3 migration for the rename map and endpoint changes. Legacy types (transform, analyze, route) remain readable and callable — no migration is required for deployed pipelines.