Quickstart
Set up your first workflow with bem
This guide walks you through setting up a simple workflow that extracts structured data from invoices using bem's API. You'll learn how to:
- Create and configure a transform function with a data schema you can customize
- Create and configure a workflow that uses the function
- Call the workflow with a file as input
- Retrieve your results via polling or webhooks
Prerequisites
- A bem account (you can sign up for free!)
- An API key (you can generate one under Settings > API Keys in our UI)
Step 1: Create a Transform Function
First, create a transform function that defines the structure you want to extract from invoices.
curl -X POST https://api.bem.ai/v2/functions \
-H "Content-Type: application/json" \
-H "x-api-key: YOUR_API_KEY" \
-d '{
"functionName": "invoice-extractor",
"type": "transform",
"displayName": "Invoice Extractor",
"outputSchemaName": "Invoice",
"outputSchema": {
"type": "object",
"required": ["invoiceNumber", "vendor", "totalAmount"],
"properties": {
"invoiceNumber": {
"type": "string",
"description": "The unique invoice identifier"
},
"invoiceDate": {
"type": "string",
"description": "Date of the invoice in YYYY-MM-DD format"
},
"dueDate": {
"type": "string",
"description": "Payment due date in YYYY-MM-DD format"
},
"vendor": {
"type": "object",
"properties": {
"name": {
"type": "string",
"description": "Company name of the vendor"
},
"address": {
"type": "string",
"description": "Full address of the vendor"
}
}
},
"lineItems": {
"type": "array",
"description": "Individual items on the invoice",
"items": {
"type": "object",
"properties": {
"description": {
"type": "string",
"description": "Description of the item or service"
},
"quantity": {
"type": "number",
"description": "Number of units"
},
"unitPrice": {
"type": "number",
"description": "Price per unit"
},
"amount": {
"type": "number",
"description": "Total amount for this line item"
}
}
}
},
"subtotal": {
"type": "number",
"description": "Sum of all line items before tax"
},
"taxAmount": {
"type": "number",
"description": "Total tax amount"
},
"totalAmount": {
"type": "number",
"description": "Final total amount due"
}
}
}
}'
Response:
{
"functionID": "fn_2abc123xyz",
"functionName": "invoice-extractor",
"displayName": "Invoice Extractor",
"type": "transform",
"currentVersionNum": 1
}
Step 2: Create a Workflow
Create a workflow that uses your transform function. Workflows provide a stable entry point for processing and can be extended later with additional functions.
curl -X POST https://api.bem.ai/v2/workflows \
-H "Content-Type: application/json" \
-H "x-api-key: YOUR_API_KEY" \
-d '{
"name": "invoice-processing",
"displayName": "Invoice Processing Workflow",
"tags": ["invoices", "financial-data"],
"mainFunction": {
"name": "invoice-extractor"
}
}'
Response:
{
"workflow": {
"workflowID": "w_2def456abc",
"name": "invoice-processing",
"displayName": "Invoice Processing Workflow",
"currentVersionNum": 1,
"tags": ["invoices", "financial-data"],
"mainFunction": {
"functionID": "fn_2abc123xyz",
"functionName": "invoice-extractor",
"versionNum": 1
}
}
}
Step 3: Call the Workflow
Now you can send invoices to your workflow for processing using the Calls API. Provide your file as base64-encoded content.
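The inputContent value is the raw file base64-encoded as text. As a minimal Python sketch of preparing that request body: the field names are taken from the request shown in this step, while the helper name build_call_payload is hypothetical and not part of any bem SDK.

```python
import base64
import json


def build_call_payload(pdf_bytes: bytes, workflow_name: str, reference_id: str) -> str:
    """Return a JSON body for POST /v2/calls with the PDF base64-encoded.

    build_call_payload is a hypothetical helper; the field names mirror
    the curl request in this guide.
    """
    encoded = base64.b64encode(pdf_bytes).decode("ascii")
    body = {
        "calls": [
            {
                "workflowName": workflow_name,
                "callReferenceID": reference_id,
                "input": {
                    "singleFile": {
                        "inputType": "pdf",
                        "inputContent": encoded,
                    }
                },
            }
        ]
    }
    return json.dumps(body)
```

For a file on disk, the bytes can be read with pathlib.Path("invoice.pdf").read_bytes() (a placeholder path) and passed in as pdf_bytes. The returned string is what goes after -d in the curl command.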
curl -X POST https://api.bem.ai/v2/calls \
-H "Content-Type: application/json" \
-H "x-api-key: YOUR_API_KEY" \
-d '{
"calls": [
{
"workflowName": "invoice-processing",
"callReferenceID": "invoice-001",
"input": {
"singleFile": {
"inputType": "pdf",
"inputContent": "JVBERi0xLjQKJeLjz9..."
}
}
}
]
}'
Response:
{
"calls": [
{
"callID": "wc_2ghi789def",
"callType": "workflow",
"status": "pending",
"workflowID": "w_2def456abc",
"workflowName": "invoice-processing",
"callReferenceID": "invoice-001",
"createdAt": "2024-01-15T10:30:00Z"
}
]
}
Step 4: Get Results
All processing is asynchronous. You can retrieve results either by polling or by setting up webhooks.
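If you go with polling (Option A below), scripts usually wrap the request in a loop with a timeout. The following Python sketch shows one way to do that using only the standard library. It assumes the endpoint and the "pending" and "running" statuses shown in this guide, and it treats any other status as final, which is an assumption rather than documented behavior. The names fetch_call and wait_for_call are hypothetical.

```python
import json
import time
import urllib.request

BASE_URL = "https://api.bem.ai/v2/calls"
# Statuses seen in this guide while a call is still in flight.
IN_PROGRESS = ("pending", "running")


def fetch_call(call_id: str, api_key: str) -> dict:
    """GET /v2/calls/{call_id} and return the decoded JSON body."""
    req = urllib.request.Request(
        f"{BASE_URL}/{call_id}", headers={"x-api-key": api_key}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def wait_for_call(call_id, api_key, fetch=fetch_call,
                  interval=2.0, timeout=120.0, sleep=time.sleep):
    """Poll until the call leaves an in-progress status or the timeout passes.

    Assumption: any status other than "pending" or "running" is final.
    """
    deadline = time.monotonic() + timeout
    while True:
        call = fetch(call_id, api_key)["call"]
        if call["status"] not in IN_PROGRESS:
            return call
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"call {call_id} still {call['status']} after {timeout}s"
            )
        sleep(interval)
```

The fetch and sleep parameters exist so the loop can be exercised without network access. In normal use, wait_for_call("wc_2ghi789def", "YOUR_API_KEY") is enough.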
Option A: Polling
Poll the call endpoint until the status is completed:
curl -X GET https://api.bem.ai/v2/calls/wc_2ghi789def \
-H "x-api-key: YOUR_API_KEY"
Response when processing:
{
"call": {
"callID": "wc_2ghi789def",
"callType": "workflow",
"status": "running",
"workflowName": "invoice-processing",
"callReferenceID": "invoice-001",
"createdAt": "2024-01-15T10:30:00Z"
}
}
Response when complete:
{
"call": {
"callID": "wc_2ghi789def",
"callType": "workflow",
"status": "completed",
"workflowName": "invoice-processing",
"callReferenceID": "invoice-001",
"createdAt": "2024-01-15T10:30:00Z",
"finishedAt": "2024-01-15T10:30:15Z",
"functionCalls": [
{
"functionCallID": "fc_3jkl012mno",
"functionName": "invoice-extractor",
"type": "transform",
"referenceID": "invoice-001",
"status": "completed",
"transformedContent": {
"invoiceNumber": "INV-2024-0042",
"invoiceDate": "2024-01-15",
"dueDate": "2024-02-15",
"vendor": {
"name": "Acme Supplies Inc.",
"address": "123 Business Ave, Suite 100, San Francisco, CA 94107"
},
"lineItems": [
{
"description": "Widget A - Premium",
"quantity": 10,
"unitPrice": 25.0,
"amount": 250.0
},
{
"description": "Widget B - Standard",
"quantity": 5,
"unitPrice": 15.0,
"amount": 75.0
}
],
"subtotal": 325.0,
"taxAmount": 29.25,
"totalAmount": 354.25
}
}
]
}
}
Option B: Webhooks (Subscriptions)
For production use, set up a webhook subscription to receive results automatically when processing completes.
Create a subscription:
curl -X POST https://api.bem.ai/v1-alpha/subscriptions \
-H "Content-Type: application/json" \
-H "x-api-key: YOUR_API_KEY" \
-d '{
"name": "invoice-results",
"type": "transform",
"functionName": "invoice-extractor",
"webhookURL": "https://your-server.com/webhooks/bem"
}'
Response:
{
"subscriptionID": "sub_4pqr345stu",
"name": "invoice-results",
"type": "transform",
"functionName": "invoice-extractor",
"webhookURL": "https://your-server.com/webhooks/bem",
"disabled": false
}
When processing completes, bem sends a POST request to your webhook URL with the transformed data:
{
"functionCallID": "fc_3jkl012mno",
"functionName": "invoice-extractor",
"referenceID": "invoice-001",
"status": "completed",
"transformedContent": {
"invoiceNumber": "INV-2024-0042",
"invoiceDate": "2024-01-15",
"vendor": {
"name": "Acme Supplies Inc.",
"address": "123 Business Ave, Suite 100, San Francisco, CA 94107"
},
"lineItems": [
{
"description": "Widget A - Premium",
"quantity": 10,
"unitPrice": 25.0,
"amount": 250.0
}
],
"subtotal": 325.0,
"taxAmount": 29.25,
"totalAmount": 354.25
}
}
Verifying Webhook Signatures
To verify that webhook requests are from bem, check the bem-signature header. See Webhook Authentication for details.
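Once a request has passed the signature check, the body can be handled like any other JSON payload. The sketch below assumes the payload shape shown above. The function name summarize_webhook and the keys it returns are hypothetical, chosen only for illustration, and it does not perform signature verification itself.

```python
import json


def summarize_webhook(raw_body: bytes) -> dict:
    """Parse a transform webhook body and return a few fields of interest.

    Assumes the payload shape shown in this guide; signature verification
    is expected to have happened before this function is called.
    """
    payload = json.loads(raw_body)
    if payload.get("status") != "completed":
        raise ValueError(f"unexpected status: {payload.get('status')!r}")
    content = payload["transformedContent"]
    return {
        "referenceID": payload["referenceID"],
        "invoiceNumber": content["invoiceNumber"],
        "totalAmount": content["totalAmount"],
        "lineItemCount": len(content.get("lineItems", [])),
    }
```

Keying downstream records on referenceID, which echoes the callReferenceID sent in Step 3, keeps webhook deliveries traceable to the original request.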
Error Handling
Create an error subscription to be notified when processing fails:
curl -X POST https://api.bem.ai/v1-alpha/subscriptions \
-H "Content-Type: application/json" \
-H "x-api-key: YOUR_API_KEY" \
-d '{
"name": "invoice-errors",
"type": "error",
"functionName": "invoice-extractor",
"webhookURL": "https://your-server.com/webhooks/bem-errors"
}'
Expanding Your Workflow (Optional)
One of bem's strengths is the ability to iterate on workflows incrementally. Let's extend our invoice processing workflow to automatically match line item descriptions to SKUs from a product catalog using an Enrich function.
This demonstrates how you can chain multiple functions together to build sophisticated data pipelines.
Step 5a: Create a Product Catalog Collection
First, create a collection to store your product catalog data:
curl -X POST https://api.bem.ai/v2/collections \
-H "Content-Type: application/json" \
-H "x-api-key: YOUR_API_KEY" \
-d '{
"collectionName": "product_catalog"
}'
Response:
{
"collectionID": "cl_5abc123def",
"collectionName": "product_catalog",
"itemCount": 0,
"createdAt": "2024-01-15T11:00:00Z"
}
Step 5b: Add Products to the Collection
Populate the collection with your product data. Each item includes a data field that bem uses for semantic search:
curl -X POST https://api.bem.ai/v2/collections/items \
-H "Content-Type: application/json" \
-H "x-api-key: YOUR_API_KEY" \
-d '{
"collectionName": "product_catalog",
"items": [
{
"data": {
"sku": "WGT-A-PRM-001",
"name": "Widget A - Premium Edition",
"category": "Widgets",
"unitCost": 18.50
}
},
{
"data": {
"sku": "WGT-B-STD-002",
"name": "Widget B - Standard Model",
"category": "Widgets",
"unitCost": 9.25
}
},
{
"data": {
"sku": "WGT-C-ECO-003",
"name": "Widget C - Economy Line",
"category": "Widgets",
"unitCost": 5.00
}
}
]
}'
Response:
{
"status": "pending",
"message": "Collection items are being processed asynchronously",
"eventID": "evt_6def456ghi"
}
Step 5c: Create an Enrich Function
Create an enrich function that searches your product catalog to match line item descriptions to SKUs:
curl -X POST https://api.bem.ai/v2/functions \
-H "Content-Type: application/json" \
-H "x-api-key: YOUR_API_KEY" \
-d '{
"functionName": "sku-matcher",
"type": "enrich",
"displayName": "SKU Matcher",
"tags": ["products", "sku-lookup"],
"config": {
"steps": [
{
"sourceField": "lineItems[*].description",
"collectionName": "product_catalog",
"targetField": "matchedProducts",
"topK": 1,
"searchMode": "semantic"
}
]
}
}'
This configuration tells the enrich function to:
- Extract all description values from the lineItems array
- Search the product_catalog collection semantically
- Store the best match for each item in matchedProducts
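To make the lineItems[*].description selector concrete, the Python sketch below mimics how such a path fans out over an array. bem resolves sourceField on its side, so this is only an illustration of the selection semantics under the assumption that [*] means "every element". The helper select_values is hypothetical.

```python
def select_values(document: dict, path: str) -> list:
    """Resolve a dotted path in which `name[*]` fans out over a list.

    Illustrative only: it mirrors what a selector such as
    "lineItems[*].description" picks out of a transform result.
    """
    current = [document]
    for part in path.split("."):
        fan_out = part.endswith("[*]")
        key = part[:-3] if fan_out else part
        next_level = []
        for node in current:
            value = node[key]
            if fan_out:
                next_level.extend(value)  # one entry per array element
            else:
                next_level.append(value)
        current = next_level
    return current


invoice = {
    "lineItems": [
        {"description": "Widget A - Premium", "quantity": 10},
        {"description": "Widget B - Standard", "quantity": 5},
    ]
}
print(select_values(invoice, "lineItems[*].description"))
# → ['Widget A - Premium', 'Widget B - Standard']
```

Each selected description is what gets searched against the collection. With topK set to 1, only the closest catalog entry per description is kept.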
Response:
{
"functionID": "fn_7ghi789jkl",
"functionName": "sku-matcher",
"displayName": "SKU Matcher",
"type": "enrich",
"currentVersionNum": 1
}
Step 5d: Update the Workflow
Now update your workflow to chain the transform and enrich functions together. The output from invoice-extractor will flow into sku-matcher:
curl -X PATCH https://api.bem.ai/v2/workflows/invoice-processing \
-H "Content-Type: application/json" \
-H "x-api-key: YOUR_API_KEY" \
-d '{
"mainFunction": {
"name": "invoice-extractor"
},
"relationships": [
{
"sourceFunction": {
"name": "invoice-extractor"
},
"destinationFunction": {
"name": "sku-matcher"
}
}
]
}'
Response:
{
"workflow": {
"workflowID": "w_2def456abc",
"name": "invoice-processing",
"displayName": "Invoice Processing Workflow",
"currentVersionNum": 2,
"mainFunction": {
"functionID": "fn_2abc123xyz",
"functionName": "invoice-extractor",
"versionNum": 1
},
"relationships": [
{
"sourceFunction": {
"functionName": "invoice-extractor",
"versionNum": 1
},
"destinationFunction": {
"functionName": "sku-matcher",
"versionNum": 1
}
}
]
}
}
Enriched Results
When you call the updated workflow, the output now includes matched SKU data:
{
"call": {
"callID": "wc_8mno012pqr",
"status": "completed",
"workflowName": "invoice-processing",
"functionCalls": [
{
"functionCallID": "fc_9stu345vwx",
"functionName": "invoice-extractor",
"type": "transform",
"status": "completed",
"transformedContent": {
"invoiceNumber": "INV-2024-0042",
"lineItems": [
{
"description": "Widget A - Premium",
"quantity": 10,
"unitPrice": 25.0,
"amount": 250.0
}
]
}
},
{
"functionCallID": "fc_0xyz678abc",
"functionName": "sku-matcher",
"type": "enrich",
"status": "completed",
"transformedContent": {
"invoiceNumber": "INV-2024-0042",
"lineItems": [
{
"description": "Widget A - Premium",
"quantity": 10,
"unitPrice": 25.0,
"amount": 250.0
}
],
"matchedProducts": [
{
"data": {
"sku": "WGT-A-PRM-001",
"name": "Widget A - Premium Edition",
"category": "Widgets",
"unitCost": 18.5
},
"distance": 0.0823
}
]
}
}
]
}
}
The matchedProducts array contains the best semantic matches from your product catalog, with a distance score indicating match quality (lower is better).
This pattern of chaining functions allows you to build sophisticated data pipelines that extract, enrich, route, and transform your documents in any combination you need.
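Because a lower distance means a closer match, downstream code will often want to discard weak matches before trusting a SKU. The Python sketch below assumes the enriched response shape shown above. The function name confident_skus and the 0.2 cutoff are hypothetical; a sensible threshold depends on your catalog and should be tuned against real data.

```python
def confident_skus(call: dict, max_distance: float = 0.2,
                   function_name: str = "sku-matcher") -> list:
    """Return SKUs from the enrich step whose distance is within the cutoff.

    `call` is the value under the "call" key of the response. The default
    cutoff of 0.2 is an arbitrary placeholder, not a bem recommendation.
    """
    for function_call in call["functionCalls"]:
        if function_call["functionName"] == function_name:
            matches = function_call["transformedContent"].get("matchedProducts", [])
            return [
                match["data"]["sku"]
                for match in matches
                if match["distance"] <= max_distance
            ]
    return []  # the enrich function did not run in this call
```

Applied to the example response above, this returns the single SKU WGT-A-PRM-001, since its distance of 0.0823 is below the cutoff.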