Score Function Against (input, expected) Pairs

Score a function against a list of (input, expected) pairs.

Submits a batch of (input, expected) pairs, runs the named function over each input, and returns per-pair and aggregate accuracy metrics that compare the function's actual output with the expected JSON you provide.

Scoring runs asynchronously. The response carries a scoreRunID; poll GET /v3/eval/score/{scoreRunID} until status is one of completed, error, or cancelled.
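
The polling step can be sketched as follows. This is a minimal illustration, not an official client: poll_score_run and make_fetcher are hypothetical helper names, the interval and timeout values are arbitrary, and it assumes the GET response is JSON with a status field like the one shown in the POST response.

```python
import json
import time
import urllib.request
from typing import Callable

# Terminal states named in the text above.
TERMINAL_STATUSES = {"completed", "error", "cancelled"}


def make_fetcher(score_run_id: str, api_key: str,
                 base_url: str = "https://api.bem.ai") -> Callable[[], dict]:
    """Hypothetical helper: returns a function that GETs the score run once."""
    def fetch() -> dict:
        req = urllib.request.Request(
            f"{base_url}/v3/eval/score/{score_run_id}",
            headers={"x-api-key": api_key},
        )
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)
    return fetch


def poll_score_run(fetch_run: Callable[[], dict],
                   interval_s: float = 2.0,
                   timeout_s: float = 300.0,
                   sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch_run() until the run is in a terminal state or time runs out."""
    deadline = time.monotonic() + timeout_s
    while True:
        run = fetch_run()
        # Assumption: the GET response exposes the same "status" field as the POST.
        if run.get("status") in TERMINAL_STATUSES:
            return run
        if time.monotonic() >= deadline:
            raise TimeoutError(f"score run still {run.get('status')!r} after {timeout_s}s")
        sleep(interval_s)
```

Passing the fetch function in as an argument keeps the loop independent of the HTTP layer, so it can be exercised with canned responses before pointing it at the live endpoint.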

matchConfig controls comparator behavior:

  • numericTolerance: relative tolerance for numeric fields (0 = exact)
  • stringMatch: exact (default) or fuzzy (Levenshtein ratio)
  • arrayMatch: by-index (default; only mode in P0)
  • ignorePaths: JSON Pointer paths to skip, supports * wildcards
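
To make the options above concrete, here is a hypothetical matchConfig value together with one plausible reading of two of the comparators. The option names come from the list; everything else is an assumption for illustration: the example values, the helper names within_tolerance and is_ignored, the choice to measure tolerance relative to the expected value, and the rule that * matches exactly one path segment. The service's actual comparator may behave differently.

```python
import re
from typing import Sequence

# Hypothetical matchConfig value; where this object sits in the request
# body is not shown on this page.
match_config = {
    "numericTolerance": 0.01,      # 1% relative tolerance; 0 would mean exact
    "stringMatch": "fuzzy",        # or "exact" (the default)
    "arrayMatch": "by-index",      # the default
    "ignorePaths": ["/metadata/*", "/items/*/id"],
}


def within_tolerance(actual: float, expected: float, tolerance: float) -> bool:
    """Relative check against the expected value; tolerance 0 requires equality."""
    return abs(actual - expected) <= tolerance * abs(expected)


def is_ignored(pointer: str, patterns: Sequence[str]) -> bool:
    """True if a JSON Pointer matches any pattern.

    Assumption: * stands for exactly one path segment.
    """
    for pattern in patterns:
        regex = "/".join(
            "[^/]*" if segment == "*" else re.escape(segment)
            for segment in pattern.split("/")
        )
        if re.fullmatch(regex, pointer):
            return True
    return False
```

Under these assumptions, an actual value of 100.5 passes against an expected 100.0 at a tolerance of 0.01, and the pointer /items/0/id is skipped while /items/0/name is still compared.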

POST /v3/eval/score

x-api-key: <token>

Authenticate using an API key in the request header.

In: header
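
A minimal sketch of attaching the key to a request, using only Python's standard library. build_score_request is a hypothetical helper name, and the body carries only the functionName and pairs fields that appear in the request example on this page; any other fields are left out because their placement is not documented here.

```python
import json
import urllib.request

API_URL = "https://api.bem.ai/v3/eval/score"


def build_score_request(api_key: str, function_name: str,
                        pairs: list) -> urllib.request.Request:
    """Build (but do not send) the POST request with the x-api-key header."""
    body = {"functionName": function_name, "pairs": pairs}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

The request can then be sent with urllib.request.urlopen(req); the JSON response carries the scoreRunID used for polling.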

Request Body

application/json

Response Body

application/json

Example Request

curl -X POST "https://api.bem.ai/v3/eval/score" \
  -H "x-api-key: <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "functionName": "string",
    "pairs": [
      {
        "input": {
          "inputType": "csv",
          "inputContent": "string"
        },
        "expected": null
      }
    ]
  }'

Example Response

{
  "scoreRunID": "evalrun_2a8f...",
  "status": "pending"
}

Example Error Response

{
  "message": "string",
  "code": 0,
  "details": {}
}

See also