Function Review

Estimate human review requirements for a function.

Combines confusion-matrix metrics with the per-transformation evaluation scores (confidence / hallucination / relevance produced by the eval service) to compute:

A confidence-bucketed distribution of the function's outputs.
Sample-size estimates at configurable margin-of-error and confidence levels (Wald or Wilson intervals).
A precision-recall AUC and a per-threshold matrix you can use to pick a review cutoff.

Supported for every function type that produces transformations and feeds the auto-evaluation pipeline: extract, transform, analyze, join. Extract works on both vision (PDF/PNG/JPEG/HEIC/HEIF/WebP) and OCR-routed inputs.

Pass isRegression: true to scope the review to transformations created by a previous regression run (see POST /v3/functions/regression).

Authorization

API Key

x-api-key<token>

Authenticate using API Key in request header

In: header

Request Body

application/json

TypeScript Definitions

Use the request body type in TypeScript.

functionName*string

Name of the function to analyze

functionVersionNum?integer

Optional function version number to analyze. If not provided, uses the latest/current version of the function.

Range1 <= value

evaluationVersion?string

Optional evaluation version to filter evaluations by. Must be one of the supported versions. If not provided, defaults to "0.1.0-gemini".

Default"0.1.0-gemini"

Value in"0.1.0-gemini"

marginOfError?number

Margin of error for statistical calculations

Default0.05

Formatfloat

Range0.001 <= value <= 0.1

thresholdMin?number

Minimum confidence threshold to analyze

Default0.5

Formatfloat

Range0 <= value <= 1

thresholdMax?number

Maximum confidence threshold to analyze

Default1

Formatfloat

Range0 <= value <= 1

thresholdStep?number

Step size for threshold analysis (smaller = more granular)

Default0.01

Formatfloat

Range0.001 <= value <= 0.1

confidenceLevels?array<>

Confidence levels for statistical analysis as integers representing percentages (e.g., [90, 95, 99] for 90%, 95%, 99%). IMPORTANT: Only integers are accepted, floats like 0.95 will be rejected.

Default[95]

confidenceMethod?string

Confidence interval calculation method (default "wald").

"wald": Normal approximation method (faster, standard)
"wilson": Wilson score interval (more robust for extreme rates)

Default"wald"

Value in"wald" | "wilson"

isRegression?boolean

Internal flag indicating if the request is from a regression test

Response Body

`application/json`

curl -X POST "https://api.bem.ai/v3/functions/review" \  -H "Content-Type: application/json" \  -d '{    "functionName": "invoice-extractor",    "functionVersionNum": 2,    "isRegression": true,    "marginOfError": 0.05  }'

{
  "functionName": "invoice-extractor",
  "functionVersionNum": 3,
  "estimate": {
    "totalTransformations": 1000,
    "labeledTransformations": 200,
    "unlabeledTransformations": 800,
    "missingEvaluations": 50,
    "confidenceDistribution": {
      "high": 500,
      "medium": 350,
      "low": 150
    },
    "thresholdMatrix": [
      {
        "threshold": 0.8,
        "tp": 85,
        "fp": 12,
        "fn": 15,
        "tn": 88,
        "accuracyAboveThreshold": {
          "95": {
            "ciLower": 0.8456,
            "mid": 0.875,
            "ciUpper": 0.9044,
            "currentSample": 120,
            "sampleNeeded": 30
          }
        }
      }
    ]
  },
  "metrics": {
    "fieldMetrics": [
      {
        "fieldPath": "/invoice/number",
        "metrics": {
          "accuracy": 0.95,
          "precision": 0.98,
          "recall": 0.92,
          "f1Score": 0.95,
          "tp": 92,
          "fp": 2,
          "tn": 0,
          "fn": 8
        }
      }
    ],
    "precisionRecallAuc": 0.8542,
    "aggregateMetrics": {
      "accuracy": 0.7407,
      "precision": 0.9524,
      "recall": 0.7692,
      "f1Score": 0.8511,
      "tp": 40,
      "fp": 2,
      "tn": 0,
      "fn": 12
    }
  }
}

{
  "message": "string",
  "code": 0,
  "details": {}
}

{
  "message": "string",
  "code": 0,
  "details": {}
}

Function Review

Authorization

Request Body

Response Body

200application/json

400application/json

500application/json

See also

`application/json`

`application/json`

`application/json`