Quickstart

Set up your first workflow with bem

This guide walks you through setting up a simple workflow that extracts structured data from invoices using bem's API. You'll learn how to:

  1. Create and configure a transform function with a data schema you can customize
  2. Create and configure a workflow that uses the function
  3. Call the workflow with a file as input
  4. Retrieve your results via polling or webhooks

Prerequisites

  • A bem account (you can sign up for free!)
  • An API key (you can generate one in Settings > API Keys through our UI)

Step 1: Create a Transform Function

First, create a transform function that defines the structure you want to extract from invoices.

curl -X POST https://api.bem.ai/v2/functions \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_API_KEY" \
  -d '{
    "functionName": "invoice-extractor",
    "type": "transform",
    "displayName": "Invoice Extractor",
    "outputSchemaName": "Invoice",
    "outputSchema": {
      "type": "object",
      "required": ["invoiceNumber", "vendor", "totalAmount"],
      "properties": {
        "invoiceNumber": {
          "type": "string",
          "description": "The unique invoice identifier"
        },
        "invoiceDate": {
          "type": "string",
          "description": "Date of the invoice in YYYY-MM-DD format"
        },
        "dueDate": {
          "type": "string",
          "description": "Payment due date in YYYY-MM-DD format"
        },
        "vendor": {
          "type": "object",
          "properties": {
            "name": {
              "type": "string",
              "description": "Company name of the vendor"
            },
            "address": {
              "type": "string",
              "description": "Full address of the vendor"
            }
          }
        },
        "lineItems": {
          "type": "array",
          "description": "Individual items on the invoice",
          "items": {
            "type": "object",
            "properties": {
              "description": {
                "type": "string",
                "description": "Description of the item or service"
              },
              "quantity": {
                "type": "number",
                "description": "Number of units"
              },
              "unitPrice": {
                "type": "number",
                "description": "Price per unit"
              },
              "amount": {
                "type": "number",
                "description": "Total amount for this line item"
              }
            }
          }
        },
        "subtotal": {
          "type": "number",
          "description": "Sum of all line items before tax"
        },
        "taxAmount": {
          "type": "number",
          "description": "Total tax amount"
        },
        "totalAmount": {
          "type": "number",
          "description": "Final total amount due"
        }
      }
    }
  }'

Response:

{
  "functionID": "fn_2abc123xyz",
  "functionName": "invoice-extractor",
  "displayName": "Invoice Extractor",
  "type": "transform",
  "currentVersionNum": 1
}
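
Before relying on extracted data downstream, it can help to confirm locally that the fields listed under required in the schema actually came back. The following is a minimal sketch, not part of bem's API: the helper name missing_required is hypothetical, and it only inspects top-level fields.

```python
from typing import Any


def missing_required(schema: dict[str, Any], result: dict[str, Any]) -> list[str]:
    """Return top-level required field names that are absent or null in result."""
    return [name for name in schema.get("required", []) if result.get(name) is None]


# Trimmed copy of the schema from Step 1 (only the part this check needs).
schema = {"type": "object", "required": ["invoiceNumber", "vendor", "totalAmount"]}

# Hypothetical extraction result that lacks totalAmount.
result = {"invoiceNumber": "INV-2024-0042", "vendor": {"name": "Acme Supplies Inc."}}

print(missing_required(schema, result))
```

An empty list means every required top-level field is present.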

Step 2: Create a Workflow

Create a workflow that uses your transform function. Workflows provide a stable entry point for processing and can be extended later with additional functions.

curl -X POST https://api.bem.ai/v2/workflows \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_API_KEY" \
  -d '{
    "name": "invoice-processing",
    "displayName": "Invoice Processing Workflow",
    "tags": ["invoices", "financial-data"],
    "mainFunction": {
      "name": "invoice-extractor"
    }
  }'

Response:

{
  "workflow": {
    "workflowID": "w_2def456abc",
    "name": "invoice-processing",
    "displayName": "Invoice Processing Workflow",
    "currentVersionNum": 1,
    "tags": ["invoices", "financial-data"],
    "mainFunction": {
      "functionID": "fn_2abc123xyz",
      "functionName": "invoice-extractor",
      "versionNum": 1
    }
  }
}

Step 3: Call the Workflow

Now you can send invoices to your workflow for processing using the Calls API. Provide your file as base64-encoded content.

curl -X POST https://api.bem.ai/v2/calls \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_API_KEY" \
  -d '{
    "calls": [
      {
        "workflowName": "invoice-processing",
        "callReferenceID": "invoice-001",
        "input": {
          "singleFile": {
            "inputType": "pdf",
            "inputContent": "JVBERi0xLjQKJeLjz9..."
          }
        }
      }
    ]
  }'

Response:

{
  "calls": [
    {
      "callID": "wc_2ghi789def",
      "callType": "workflow",
      "status": "pending",
      "workflowID": "w_2def456abc",
      "workflowName": "invoice-processing",
      "callReferenceID": "invoice-001",
      "createdAt": "2024-01-15T10:30:00Z"
    }
  ]
}
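
The inputContent value in the Step 3 request is the file's bytes encoded as base64. As a rough sketch, the request body could be assembled in Python like this; the helper name build_call_payload is hypothetical, and the nine sample bytes stand in for a real PDF read from disk.

```python
import base64
import json


def build_call_payload(pdf_bytes: bytes, workflow_name: str, reference_id: str) -> str:
    """Build the JSON body for POST /v2/calls with one base64-encoded PDF."""
    encoded = base64.b64encode(pdf_bytes).decode("ascii")
    payload = {
        "calls": [
            {
                "workflowName": workflow_name,
                "callReferenceID": reference_id,
                "input": {
                    "singleFile": {"inputType": "pdf", "inputContent": encoded}
                },
            }
        ]
    }
    return json.dumps(payload)


# Placeholder bytes: just the header of a PDF file. In practice you would use
# something like open("invoice.pdf", "rb").read() (hypothetical file name).
body = build_call_payload(b"%PDF-1.4\n", "invoice-processing", "invoice-001")
print(body)
```

The resulting string can be passed as the -d argument of the curl command, or sent with any HTTP client.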

Step 4: Get Results

All processing is asynchronous. You can retrieve results either by polling or by setting up webhooks.

Option A: Polling

Poll the call endpoint until the status is completed:

curl -X GET https://api.bem.ai/v2/calls/wc_2ghi789def \
  -H "x-api-key: YOUR_API_KEY"

Response while the call is still processing:

{
  "call": {
    "callID": "wc_2ghi789def",
    "callType": "workflow",
    "status": "running",
    "workflowName": "invoice-processing",
    "callReferenceID": "invoice-001",
    "createdAt": "2024-01-15T10:30:00Z"
  }
}

Response when complete:

{
  "call": {
    "callID": "wc_2ghi789def",
    "callType": "workflow",
    "status": "completed",
    "workflowName": "invoice-processing",
    "callReferenceID": "invoice-001",
    "createdAt": "2024-01-15T10:30:00Z",
    "finishedAt": "2024-01-15T10:30:15Z",
    "functionCalls": [
      {
        "functionCallID": "fc_3jkl012mno",
        "functionName": "invoice-extractor",
        "type": "transform",
        "referenceID": "invoice-001",
        "status": "completed",
        "transformedContent": {
          "invoiceNumber": "INV-2024-0042",
          "invoiceDate": "2024-01-15",
          "dueDate": "2024-02-15",
          "vendor": {
            "name": "Acme Supplies Inc.",
            "address": "123 Business Ave, Suite 100, San Francisco, CA 94107"
          },
          "lineItems": [
            {
              "description": "Widget A - Premium",
              "quantity": 10,
              "unitPrice": 25.0,
              "amount": 250.0
            },
            {
              "description": "Widget B - Standard",
              "quantity": 5,
              "unitPrice": 15.0,
              "amount": 75.0
            }
          ],
          "subtotal": 325.0,
          "taxAmount": 29.25,
          "totalAmount": 354.25
        }
      }
    ]
  }
}
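
A polling loop usually needs a delay between requests and an upper bound on total wait time. The sketch below shows one way to structure it. It takes the HTTP request as a function argument (fetch_call, a hypothetical name) so the loop can be run without network access; in practice that function would wrap GET /v2/calls/{callID}. Only the completed status appears in this guide; treating "failed" as a terminal status is an assumption.

```python
import time
from typing import Any, Callable

# "completed" comes from this guide; "failed" is an assumed terminal status.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_call(
    fetch_call: Callable[[str], dict[str, Any]],
    call_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Poll until the call reaches a terminal status or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        call = fetch_call(call_id)["call"]
        if call["status"] in TERMINAL_STATUSES:
            return call
        if time.monotonic() >= deadline:
            raise TimeoutError(f"call {call_id} still {call['status']!r} after {timeout_s}s")
        sleep(interval_s)


# Simulated responses standing in for real API replies.
responses = iter([
    {"call": {"callID": "wc_2ghi789def", "status": "running"}},
    {"call": {"callID": "wc_2ghi789def", "status": "completed"}},
])
call = wait_for_call(lambda _id: next(responses), "wc_2ghi789def", sleep=lambda _s: None)
print(call["status"])
```

Injecting sleep and fetch_call keeps the loop easy to test; the default interval and timeout are arbitrary starting points, not recommended values.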

Option B: Webhooks (Subscriptions)

For production use, set up a webhook subscription to receive results automatically when processing completes.

Create a subscription:

curl -X POST https://api.bem.ai/v1-alpha/subscriptions \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_API_KEY" \
  -d '{
    "name": "invoice-results",
    "type": "transform",
    "functionName": "invoice-extractor",
    "webhookURL": "https://your-server.com/webhooks/bem"
  }'

Response:

{
  "subscriptionID": "sub_4pqr345stu",
  "name": "invoice-results",
  "type": "transform",
  "functionName": "invoice-extractor",
  "webhookURL": "https://your-server.com/webhooks/bem",
  "disabled": false
}

When processing completes, bem sends a POST request to your webhook URL with the transformed data:

{
  "functionCallID": "fc_3jkl012mno",
  "functionName": "invoice-extractor",
  "referenceID": "invoice-001",
  "status": "completed",
  "transformedContent": {
    "invoiceNumber": "INV-2024-0042",
    "invoiceDate": "2024-01-15",
    "dueDate": "2024-02-15",
    "vendor": {
      "name": "Acme Supplies Inc.",
      "address": "123 Business Ave, Suite 100, San Francisco, CA 94107"
    },
    "lineItems": [
      {
        "description": "Widget A - Premium",
        "quantity": 10,
        "unitPrice": 25.0,
        "amount": 250.0
      },
      {
        "description": "Widget B - Standard",
        "quantity": 5,
        "unitPrice": 15.0,
        "amount": 75.0
      }
    ],
    "subtotal": 325.0,
    "taxAmount": 29.25,
    "totalAmount": 354.25
  }
}
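
Once a webhook body arrives, a receiver will typically parse it and sanity-check the extracted figures before storing them. The sketch below uses the invoice figures from this guide and checks that the line items add up to the subtotal and that subtotal plus tax equals the total. The function name check_totals and the half-cent tolerance are illustrative choices, not part of bem's API.

```python
import json
import math


def check_totals(content: dict) -> dict[str, bool]:
    """Compare extracted invoice totals against values recomputed from the parts."""
    line_sum = sum(item["amount"] for item in content.get("lineItems", []))
    return {
        "lineItemsMatchSubtotal": math.isclose(line_sum, content["subtotal"], abs_tol=0.005),
        "totalMatches": math.isclose(
            content["subtotal"] + content["taxAmount"], content["totalAmount"], abs_tol=0.005
        ),
    }


# Sample body shaped like the webhook payload, trimmed to the fields used here.
raw_body = """
{
  "functionCallID": "fc_3jkl012mno",
  "referenceID": "invoice-001",
  "status": "completed",
  "transformedContent": {
    "lineItems": [
      {"description": "Widget A - Premium", "amount": 250.0},
      {"description": "Widget B - Standard", "amount": 75.0}
    ],
    "subtotal": 325.0,
    "taxAmount": 29.25,
    "totalAmount": 354.25
  }
}
"""

event = json.loads(raw_body)
checks = check_totals(event["transformedContent"])
print(event["referenceID"], checks)
```

A failed check does not necessarily mean the extraction is wrong (invoices can include discounts or shipping), so it is better suited to flagging documents for review than to rejecting them.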

Verifying Webhook Signatures

To verify that webhook requests are from bem, check the bem-signature header. See Webhook Authentication for details.

Error Handling

Create an error subscription to be notified when processing fails:

curl -X POST https://api.bem.ai/v1-alpha/subscriptions \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_API_KEY" \
  -d '{
    "name": "invoice-errors",
    "type": "error",
    "functionName": "invoice-extractor",
    "webhookURL": "https://your-server.com/webhooks/bem-errors"
  }'

Expanding Your Workflow (Optional)

One of bem's strengths is the ability to iterate on workflows incrementally. Let's extend our invoice processing workflow to automatically match line item descriptions to SKUs from a product catalog using an Enrich function.

This demonstrates how you can chain multiple functions together to build sophisticated data pipelines.

Step 5a: Create a Product Catalog Collection

First, create a collection to store your product catalog data:

curl -X POST https://api.bem.ai/v2/collections \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_API_KEY" \
  -d '{
    "collectionName": "product_catalog"
  }'

Response:

{
  "collectionID": "cl_5abc123def",
  "collectionName": "product_catalog",
  "itemCount": 0,
  "createdAt": "2024-01-15T11:00:00Z"
}

Step 5b: Add Products to the Collection

Populate the collection with your product data. Each item includes a data field that bem uses for semantic search:

curl -X POST https://api.bem.ai/v2/collections/items \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_API_KEY" \
  -d '{
    "collectionName": "product_catalog",
    "items": [
      {
        "data": {
          "sku": "WGT-A-PRM-001",
          "name": "Widget A - Premium Edition",
          "category": "Widgets",
          "unitCost": 18.50
        }
      },
      {
        "data": {
          "sku": "WGT-B-STD-002",
          "name": "Widget B - Standard Model",
          "category": "Widgets",
          "unitCost": 9.25
        }
      },
      {
        "data": {
          "sku": "WGT-C-ECO-003",
          "name": "Widget C - Economy Line",
          "category": "Widgets",
          "unitCost": 5.00
        }
      }
    ]
  }'

Response:

{
  "status": "pending",
  "message": "Collection items are being processed asynchronously",
  "eventID": "evt_6def456ghi"
}
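
If your catalog lives in a spreadsheet or CSV export, the items array for Step 5b can be generated rather than written by hand. This is a minimal sketch that assumes a CSV with the hypothetical column names sku, name, category, and unitCost; it wraps each row in the data field shown in the request above.

```python
import csv
import io
import json

# Hypothetical CSV export of a product catalog.
CATALOG_CSV = """sku,name,category,unitCost
WGT-A-PRM-001,Widget A - Premium Edition,Widgets,18.50
WGT-B-STD-002,Widget B - Standard Model,Widgets,9.25
"""


def items_from_csv(text: str) -> list[dict]:
    """Turn CSV rows into collection items, converting unitCost to a number."""
    items = []
    for row in csv.DictReader(io.StringIO(text)):
        record = dict(row)
        record["unitCost"] = float(record["unitCost"])
        items.append({"data": record})
    return items


payload = {"collectionName": "product_catalog", "items": items_from_csv(CATALOG_CSV)}
print(json.dumps(payload, indent=2))
```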

Step 5c: Create an Enrich Function

Create an enrich function that searches your product catalog to match line item descriptions to SKUs:

curl -X POST https://api.bem.ai/v2/functions \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_API_KEY" \
  -d '{
    "functionName": "sku-matcher",
    "type": "enrich",
    "displayName": "SKU Matcher",
    "tags": ["products", "sku-lookup"],
    "config": {
      "steps": [
        {
          "sourceField": "lineItems[*].description",
          "collectionName": "product_catalog",
          "targetField": "matchedProducts",
          "topK": 1,
          "searchMode": "semantic"
        }
      ]
    }
  }'

This configuration tells the enrich function to:

  • Extract all description values from the lineItems array
  • Search the product_catalog collection semantically
  • Store the best match for each item in matchedProducts

Response:

{
  "functionID": "fn_7ghi789jkl",
  "functionName": "sku-matcher",
  "displayName": "SKU Matcher",
  "type": "enrich",
  "currentVersionNum": 1
}
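
To make the sourceField path from Step 5c concrete, the sketch below shows what a path such as lineItems[*].description selects from a transform result. It is an illustration of the path notation only, written under the assumption that [*] means "every element of the array"; it is not bem's implementation, and expand_path is a hypothetical helper.

```python
from typing import Any


def expand_path(document: dict[str, Any], path: str) -> list[Any]:
    """Resolve a dotted path; a segment ending in [*] fans out over a list."""
    current: list[Any] = [document]
    for segment in path.split("."):
        wildcard = segment.endswith("[*]")
        key = segment[:-3] if wildcard else segment
        next_values: list[Any] = []
        for value in current:
            if not isinstance(value, dict) or key not in value:
                continue  # Skip values that do not contain this key.
            child = value[key]
            if wildcard:
                if isinstance(child, list):
                    next_values.extend(child)
            else:
                next_values.append(child)
        current = next_values
    return current


invoice = {
    "invoiceNumber": "INV-2024-0042",
    "lineItems": [
        {"description": "Widget A - Premium", "quantity": 10},
        {"description": "Widget B - Standard", "quantity": 5},
    ],
}
print(expand_path(invoice, "lineItems[*].description"))
```

Each selected description is what would be used as a search query against the product_catalog collection.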

Step 5d: Update the Workflow

Now update your workflow to chain the transform and enrich functions together. The output from invoice-extractor will flow into sku-matcher:

curl -X PATCH https://api.bem.ai/v2/workflows/invoice-processing \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_API_KEY" \
  -d '{
    "mainFunction": {
      "name": "invoice-extractor"
    },
    "relationships": [
      {
        "sourceFunction": {
          "name": "invoice-extractor"
        },
        "destinationFunction": {
          "name": "sku-matcher"
        }
      }
    ]
  }'

Response:

{
  "workflow": {
    "workflowID": "w_2def456abc",
    "name": "invoice-processing",
    "displayName": "Invoice Processing Workflow",
    "currentVersionNum": 2,
    "mainFunction": {
      "functionID": "fn_2abc123xyz",
      "functionName": "invoice-extractor",
      "versionNum": 1
    },
    "relationships": [
      {
        "sourceFunction": {
          "functionName": "invoice-extractor",
          "versionNum": 1
        },
        "destinationFunction": {
          "functionName": "sku-matcher",
          "versionNum": 1
        }
      }
    ]
  }
}
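
For a simple linear pipeline, the relationships array in the Step 5d request follows directly from the order of the functions: each function is the source for the one after it. A small sketch of generating that body from a list of names is shown below; build_workflow_update is a hypothetical helper, and it only covers straight chains, not branching workflows.

```python
import json


def build_workflow_update(function_names: list[str]) -> dict:
    """Build a PATCH body that chains the named functions in the given order."""
    if not function_names:
        raise ValueError("at least one function name is required")
    relationships = [
        {"sourceFunction": {"name": source}, "destinationFunction": {"name": destination}}
        for source, destination in zip(function_names, function_names[1:])
    ]
    return {"mainFunction": {"name": function_names[0]}, "relationships": relationships}


update = build_workflow_update(["invoice-extractor", "sku-matcher"])
print(json.dumps(update, indent=2))
```

With the two function names from this guide, the output matches the request body used in Step 5d.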

Enriched Results

When you call the updated workflow, the output now includes matched SKU data:

{
  "call": {
    "callID": "wc_8mno012pqr",
    "status": "completed",
    "workflowName": "invoice-processing",
    "functionCalls": [
      {
        "functionCallID": "fc_9stu345vwx",
        "functionName": "invoice-extractor",
        "type": "transform",
        "status": "completed",
        "transformedContent": {
          "invoiceNumber": "INV-2024-0042",
          "lineItems": [
            {
              "description": "Widget A - Premium",
              "quantity": 10,
              "unitPrice": 25.0,
              "amount": 250.0
            }
          ]
        }
      },
      {
        "functionCallID": "fc_0xyz678abc",
        "functionName": "sku-matcher",
        "type": "enrich",
        "status": "completed",
        "transformedContent": {
          "invoiceNumber": "INV-2024-0042",
          "lineItems": [
            {
              "description": "Widget A - Premium",
              "quantity": 10,
              "unitPrice": 25.0,
              "amount": 250.0
            }
          ],
          "matchedProducts": [
            {
              "data": {
                "sku": "WGT-A-PRM-001",
                "name": "Widget A - Premium Edition",
                "category": "Widgets",
                "unitCost": 18.5
              },
              "distance": 0.0823
            }
          ]
        }
      }
    ]
  }
}

The matchedProducts array contains the best semantic matches from your product catalog, with a distance score indicating match quality (lower is better).
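
Because a lower distance means a closer match, a consumer may want to accept only matches below some cutoff and send the rest for manual review. The sketch below assumes the matchedProducts shape shown in the sample response; the 0.2 cutoff is an arbitrary illustrative value that you would tune against your own catalog.

```python
def confident_skus(matched_products: list[dict], max_distance: float = 0.2) -> list[str]:
    """Return SKUs whose match distance is at or below the cutoff."""
    return [
        match["data"]["sku"]
        for match in matched_products
        if match["distance"] <= max_distance
    ]


# matchedProducts as it appears in the sample enriched result.
matched_products = [
    {
        "data": {
            "sku": "WGT-A-PRM-001",
            "name": "Widget A - Premium Edition",
            "category": "Widgets",
            "unitCost": 18.5,
        },
        "distance": 0.0823,
    }
]
print(confident_skus(matched_products))
```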

This pattern of chaining functions allows you to build sophisticated data pipelines that extract, enrich, route, and transform your documents in any combination you need.
