Functions

Create a Function

Create a new function of any type (transform, analyze, route, join, split, payload_shaping, or enrich).

Each function type serves different purposes:

  • Transform: Extract structured data from unstructured documents
  • Analyze: Analyze the visual context of documents to extract information
  • Route: Route data to different functions based on conditions
  • Join: Combine data from multiple sources
  • Split: Split data into multiple outputs
  • Payload Shaping: Customize and transform input payloads using JMESPath expressions
  • Enrich: Enhance data with semantic search against collections

POST /v2/functions

x-api-key <token>

Authenticate using API Key in request header

In: header
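As a minimal sketch of the authentication described above, the Python below builds (but does not send) a POST request carrying the API key in the `x-api-key` header. The base URL is taken from the curl example on this page. The helper name `build_create_function_request`, the placeholder key, and the sample payload values are assumptions for illustration only.

```python
import json
import urllib.request

# Base URL as it appears in the curl example on this page.
API_URL = "https://api.bem.ai/v2/functions"


def build_create_function_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build, without sending, a POST /v2/functions request (hypothetical helper)."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "x-api-key": api_key,  # the API key travels in the request header
        },
    )


request = build_create_function_request(
    api_key="YOUR_API_KEY",  # placeholder, not a real key
    payload={"functionName": "invoice-extractor", "type": "transform"},
)
# To actually send the request: urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect the headers and body before any network call is made.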

functionName string

Name of function. Must be UNIQUE on a per-environment basis.

displayName? string

Display name of function. Human-readable name to help you identify the function.

type FunctionType

The type of the function.

Value in "transform" | "route" | "split" | "join" | "analyze" | "payload_shaping" | "enrich"

tags? array<string>

Array of tags to categorize and organize functions.

outputSchemaName? string

Name of output schema object.

outputSchema? object

Desired output structure defined in standard JSON Schema convention.

Empty Object

tabularChunkingEnabled? boolean

Whether tabular chunking is enabled on the pipeline. When enabled, tables in CSV/Excel files are processed in batches of rows rather than all rows at once.

independentDocumentProcessingEnabled? boolean

Whether independent transformations are enabled. For PDFs sent through the pipeline, each page is transformed independently. For CSVs, chunks of rows are transformed independently.
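The fields above can be assembled into a request body as in the following sketch. It uses only field names documented on this page. The helper `make_transform_payload`, the invoice schema, and all sample values are hypothetical, and the client-side type check is a convenience rather than something the API is known to require.

```python
# Allowed values for "type", as listed in the FunctionType enum on this page.
FUNCTION_TYPES = {
    "transform", "route", "split", "join", "analyze", "payload_shaping", "enrich",
}


def make_transform_payload(function_name, output_schema_name, output_schema, *,
                           display_name=None, tags=None,
                           tabular_chunking=False, independent_processing=False):
    """Assemble a request body for a transform function (hypothetical helper)."""
    payload = {
        "functionName": function_name,
        "type": "transform",
        "outputSchemaName": output_schema_name,
        "outputSchema": output_schema,  # a standard JSON Schema object
        "tabularChunkingEnabled": tabular_chunking,
        "independentDocumentProcessingEnabled": independent_processing,
    }
    # Optional fields are only included when the caller supplies them.
    if display_name is not None:
        payload["displayName"] = display_name
    if tags:
        payload["tags"] = list(tags)
    assert payload["type"] in FUNCTION_TYPES
    return payload


# Invented example schema: two required invoice fields.
invoice_schema = {
    "type": "object",
    "required": ["invoiceNumber", "total"],
    "properties": {
        "invoiceNumber": {"type": "string", "description": "Invoice identifier."},
        "total": {"type": "number", "description": "Invoice total amount."},
    },
}

transform_payload = make_transform_payload(
    "invoice-extractor",
    "Invoice Schema",
    invoice_schema,
    tags=["billing", "finance"],
    tabular_chunking=True,
)
```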

functionName string

Name of function. Must be UNIQUE on a per-environment basis.

displayName? string

Display name of function. Human-readable name to help you identify the function.

type FunctionType

The type of the function.

Value in "transform" | "route" | "split" | "join" | "analyze" | "payload_shaping" | "enrich"

tags? array<string>

Array of tags to categorize and organize functions.

outputSchemaName? string

Name of output schema object.

outputSchema? object

Desired output structure defined in standard JSON Schema convention.

Empty Object

functionName string

Name of function. Must be UNIQUE on a per-environment basis.

displayName? string

Display name of function. Human-readable name to help you identify the function.

type FunctionType

The type of the function.

Value in "transform" | "route" | "split" | "join" | "analyze" | "payload_shaping" | "enrich"

tags? array<string>

Array of tags to categorize and organize functions.

description? string

Description of the router. Use it to provide additional context on the router's purpose and expected inputs.

routes? RouteList

List of routes.

functionName string

Name of function. Must be UNIQUE on a per-environment basis.

displayName? string

Display name of function. Human-readable name to help you identify the function.

type FunctionType

The type of the function.

Value in "transform" | "route" | "split" | "join" | "analyze" | "payload_shaping" | "enrich"

tags? array<string>

Array of tags to categorize and organize functions.

splitType? string

Value in "print_page" | "semantic_page"

printPageSplitConfig? object

semanticPageSplitConfig? object

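A small sketch of a split function request body follows, restricted to the two `splitType` values listed above. The contents of `printPageSplitConfig` and `semanticPageSplitConfig` are not documented in this section, so the sketch omits them. The helper name and the sample function name are hypothetical.

```python
# The two splitType values listed on this page.
SPLIT_TYPES = ("print_page", "semantic_page")


def make_split_payload(function_name: str, split_type: str) -> dict:
    """Assemble a request body for a split function (hypothetical helper)."""
    if split_type not in SPLIT_TYPES:
        raise ValueError(f"splitType must be one of {SPLIT_TYPES}, got {split_type!r}")
    return {
        "functionName": function_name,
        "type": "split",
        "splitType": split_type,
    }


split_payload = make_split_payload("split-by-page", "print_page")
```

Validating the enum value locally surfaces a typo before the request is sent.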
functionName string

Name of function. Must be UNIQUE on a per-environment basis.

displayName? string

Display name of function. Human-readable name to help you identify the function.

type FunctionType

The type of the function.

Value in "transform" | "route" | "split" | "join" | "analyze" | "payload_shaping" | "enrich"

tags? array<string>

Array of tags to categorize and organize functions.

description? string

Description of join function.

joinType? string

The type of join to perform.

Value in "standard"

outputSchemaName? string

Name of output schema object.

outputSchema? object

Desired output structure defined in standard JSON Schema convention.

Empty Object

functionName string

Name of function. Must be UNIQUE on a per-environment basis.

displayName? string

Display name of function. Human-readable name to help you identify the function.

type FunctionType

The type of the function.

Value in "transform" | "route" | "split" | "join" | "analyze" | "payload_shaping" | "enrich"

tags? array<string>

Array of tags to categorize and organize functions.

shapingSchema? string

JMESPath expression that defines how to transform and customize the input payload structure. Payload shaping allows you to extract, reshape, and reorganize data from complex input payloads into a simplified, standardized output format. Use JMESPath syntax to select specific fields, perform calculations, and create new data structures tailored to your needs.
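To make the effect of a shaping expression concrete, the sketch below reproduces in plain Python what the freight-tender expression from this page's curl example appears intended to compute. It is not a JMESPath evaluator, and the sample input is invented. Note that `unique()` does not appear in the core JMESPath built-in function list, so its behavior here (order-preserving de-duplication) is an assumption.

```python
def shape_tender_summary(payload: dict) -> dict:
    """Plain-Python equivalent of the example shaping expression (illustrative only)."""
    tenders = payload["tenders"]

    def unique(values):
        # Assumed semantics: drop duplicates, keep first-seen order.
        return list(dict.fromkeys(values))

    return {
        # tenders[0].loadReference
        "load_reference": tenders[0]["loadReference"],
        # tenders[].weightTons | sum(@)
        "total_weight_tons": sum(t["weightTons"] for t in tenders),
        # tenders[].origin | unique(@)
        "origins": unique(t["origin"] for t in tenders),
        # tenders[].submitter.name | unique(@)
        "submitters": unique(t["submitter"]["name"] for t in tenders),
    }


# Invented sample input with three tenders.
sample = {
    "tenders": [
        {"loadReference": "LT-001", "origin": "Chicago", "weightTons": 12.5,
         "submitter": {"name": "Ana"}},
        {"loadReference": "LT-002", "origin": "Dallas", "weightTons": 7.5,
         "submitter": {"name": "Ana"}},
        {"loadReference": "LT-003", "origin": "Chicago", "weightTons": 10,
         "submitter": {"name": "Raj"}},
    ]
}

summary = shape_tender_summary(sample)
```

With this sample, the summary keeps the first load reference, totals the three weights, and lists each origin and submitter once.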

functionName string

Name of function. Must be UNIQUE on a per-environment basis.

displayName? string

Display name of function. Human-readable name to help you identify the function.

type FunctionType

The type of the function.

Value in "transform" | "route" | "split" | "join" | "analyze" | "payload_shaping" | "enrich"

tags? array<string>

Array of tags to categorize and organize functions.

config? enrichConfig

Configuration for enrich function with semantic search steps.

How Enrich Functions Work:

Enrich functions use semantic search to augment JSON data with relevant information from collections. They take JSON input (typically from a transform function), extract specified fields, perform vector-based semantic search against collections, and inject the results back into the data.

Input Requirements:

  • Must receive JSON input (typically uploaded to S3 from a previous function)
  • Can be chained after transform or other functions that produce JSON output

Example Use Cases:

  • Match product descriptions to SKU codes from a product catalog
  • Enrich customer data with account information
  • Link order line items to inventory records

Configuration:

  • Define one or more enrichment steps
  • Each step extracts values, searches a collection, and injects results
  • Steps are executed sequentially
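The extract, search, and inject flow described above can be illustrated with a toy, in-memory sketch. Everything in it is an assumption made for illustration: the step keys (`source_field`, `collection`, `target_field`, `top_k`) are hypothetical and do not describe the real `enrichConfig` schema, and word-overlap (Jaccard) similarity stands in for the vector-based semantic search the service performs.

```python
def tokenize(text: str) -> set:
    return set(text.lower().split())


def similarity(a: str, b: str) -> float:
    """Jaccard word overlap: a crude stand-in for vector similarity."""
    ta, tb = tokenize(a), tokenize(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def search(collection: list, query: str, top_k: int = 1) -> list:
    """Return the top_k collection items most similar to the query."""
    ranked = sorted(collection, key=lambda item: similarity(query, item["text"]), reverse=True)
    return ranked[:top_k]


def run_enrichment(record: dict, steps: list, collections: dict) -> dict:
    """Apply enrichment steps one after another (hypothetical step format)."""
    enriched = dict(record)  # shallow copy so the input record is not modified
    for step in steps:  # steps run sequentially
        query = enriched[step["source_field"]]              # 1. extract a value
        matches = search(collections[step["collection"]],   # 2. search a collection
                         query, step.get("top_k", 1))
        enriched[step["target_field"]] = matches            # 3. inject the results
    return enriched


# Invented product catalog and input record.
collections = {
    "catalog": [
        {"sku": "SKU-100", "text": "blue cotton t-shirt"},
        {"sku": "SKU-200", "text": "red wool sweater"},
    ]
}
record = {"description": "cotton t-shirt blue large"}
steps = [{"source_field": "description", "collection": "catalog",
          "target_field": "catalog_matches"}]

result = run_enrichment(record, steps, collections)
```

Because each step reads from the record produced by the previous step, a later step could search on a field that an earlier step injected.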

Example Request

curl -X POST "https://api.bem.ai/v2/functions" \
  -H "Content-Type: application/json" \
  -H "x-api-key: <token>" \
  -d '{
    "functionName": "extract-freight-tender-summary",
    "type": "payload_shaping",
    "description": "Transform complex freight load tender payloads into simplified summary data by extracting load reference, calculating total weight, and consolidating unique origins and submitters",
    "shapingSchema": "{ \"load_reference\": tenders[0].loadReference, \"total_weight_tons\": tenders[].weightTons | sum(@), \"origins\": tenders[].origin | unique(@), \"submitters\": tenders[].submitter.name | unique(@) }"
  }'

Response Body

{
  "functionID": "string",
  "functionName": "string",
  "versionNum": 0,
  "type": "transform",
  "usedInWorkflows": [
    [
      {
        "workflowID": "w_1234567890",
        "workflowName": "My Workflow",
        "currentVersionNum": 1,
        "usedInWorkflowVersionNums": [
          1,
          2,
          3
        ]
      }
    ]
  ],
  "displayName": "string",
  "tags": [
    "billing",
    "finance",
    "automated"
  ],
  "outputSchemaName": "Freight Load Schema",
  "outputSchema": {
    "type": "object",
    "required": [
      "tenders"
    ],
    "properties": {
      "tenders": {
        "type": "array",
        "items": {
          "type": "object",
          "required": [
            "loadReference",
            "origin",
            "destination",
            "weightTons",
            "loadType",
            "desiredDeliveryDate",
            "bidSubmissionDeadline",
            "submitter"
          ],
          "properties": {
            "origin": {
              "type": "string",
              "description": "The starting point of the shipment."
            },
            "loadType": {
              "type": "string",
              "description": "The type of goods being shipped."
            },
            "submitter": {
              "type": "object",
              "required": [
                "name",
                "position",
                "contactInfo"
              ],
              "properties": {
                "name": {
                  "type": "string",
                  "description": "Name of the person submitting the tender."
                },
                "position": {
                  "type": "string",
                  "description": "Position of the submitter within their company."
                },
                "contactInfo": {
                  "type": "object",
                  "required": [
                    "email"
                  ],
                  "properties": {
                    "email": {
                      "type": "string",
                      "format": "email",
                      "description": "Email address of the submitter."
                    },
                    "phone": {
                      "type": "string",
                      "description": "Phone number of the submitter."
                    }
                  }
                }
              }
            },
            "weightTons": {
              "type": "number",
              "description": "The weight of the load in tons."
            },
            "destination": {
              "type": "string",
              "description": "The endpoint of the shipment."
            },
            "loadReference": {
              "type": "string",
              "description": "Unique identifier for the load tender."
            },
            "desiredDeliveryDate": {
              "type": "string",
              "format": "date",
              "description": "The preferred date for the shipment to be delivered."
            },
            "bidSubmissionDeadline": {
              "type": "string",
              "format": "date",
              "description": "The deadline for submitting bids."
            }
          }
        }
      }
    }
  },
  "emailAddress": "eml_2c9AXFXHwiaL4vPXDTOS171OJ8T@pipeline.bem.ai",
  "tabularChunkingEnabled": false,
  "independentDocumentProcessingEnabled": false
}