Function Review
Estimate human review requirements for a function.
Combines confusion-matrix metrics with the per-transformation evaluation scores (confidence / hallucination / relevance produced by the eval service) to compute:
- A confidence-bucketed distribution of the function's outputs.
- Sample-size estimates at configurable margin-of-error and confidence levels (Wald or Wilson intervals).
- A precision-recall AUC and a per-threshold matrix you can use to pick a review cutoff.
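As a rough illustration of the interval and sample-size bullets above, the sketch below uses the textbook Wald and Wilson formulas for a binomial proportion. The function names are hypothetical, and this page does not state the exact formulas the service uses, so treat the numbers as approximations rather than a reproduction of the API's output.

```python
import math
from statistics import NormalDist


def z_score(confidence: float) -> float:
    """Two-sided critical value, e.g. about 1.96 for confidence=0.95."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)


def wald_interval(successes: int, n: int, confidence: float = 0.95):
    """Normal-approximation interval: p +/- z * sqrt(p(1-p)/n), clipped to [0, 1]."""
    p = successes / n
    half = z_score(confidence) * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half), min(1.0, p + half)


def wilson_interval(successes: int, n: int, confidence: float = 0.95):
    """Wilson score interval; better behaved than Wald for small n or p near 0 or 1."""
    p = successes / n
    z = z_score(confidence)
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


def wald_sample_size(margin_of_error: float, confidence: float = 0.95,
                     expected_p: float = 0.5) -> int:
    """Smallest n whose Wald half-width is at most margin_of_error.

    expected_p=0.5 is the worst case (largest variance).
    """
    z = z_score(confidence)
    return math.ceil(z * z * expected_p * (1 - expected_p) / margin_of_error ** 2)


if __name__ == "__main__":
    print(wald_sample_size(0.05, 0.95))   # 385
    print(wald_interval(105, 120))        # interval around 0.875
    print(wilson_interval(105, 120))      # slightly shifted toward 0.5
```

With a 0.05 margin of error at the 95% level and no prior estimate of accuracy, the worst-case Wald calculation asks for 385 labeled samples; a higher expected accuracy lowers that figure.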
Supported for every function type that produces transformations and feeds
the auto-evaluation pipeline: extract, transform, analyze, join.
Extract works on both vision (PDF/PNG/JPEG/HEIC/HEIF/WebP) and OCR-routed
inputs.
Pass isRegression: true to scope the review to transformations created
by a previous regression run (see POST /v3/functions/regression).
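A minimal sketch of building such a request in Python, using only the standard library and the body fields shown in this page's curl example. The helper name `build_review_request` is hypothetical, and the name of the API-key header is not given on this page, so it is passed in as a parameter; the `"x-api-key"` value in the usage lines is a placeholder, not a confirmed header name.

```python
import json
import urllib.request

REVIEW_URL = "https://api.bem.ai/v3/functions/review"


def build_review_request(api_key: str, api_key_header: str, function_name: str,
                         function_version_num: int, margin_of_error: float = 0.05,
                         is_regression: bool = False) -> urllib.request.Request:
    """Build (but do not send) a POST request for the review endpoint."""
    payload = {
        "functionName": function_name,
        "functionVersionNum": function_version_num,
        # True scopes the review to transformations from a previous regression run.
        "isRegression": is_regression,
        "marginOfError": margin_of_error,
    }
    return urllib.request.Request(
        REVIEW_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", api_key_header: api_key},
        method="POST",
    )


# "x-api-key" is an assumed placeholder; substitute the header name from your account docs.
req = build_review_request("YOUR_API_KEY", "x-api-key", "invoice-extractor", 2,
                           is_regression=True)
# To send it: urllib.request.urlopen(req) and json.load() the response body.
```

Building the request separately from sending it makes the payload easy to inspect or log before any network call is made.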
Authorization
API Key: authenticate by sending your API key in a request header.
Request Body
application/json
Response Body
application/json
curl -X POST "https://api.bem.ai/v3/functions/review" \
  -H "Content-Type: application/json" \
  -d '{
    "functionName": "invoice-extractor",
    "functionVersionNum": 2,
    "isRegression": true,
    "marginOfError": 0.05
  }'

{
"functionName": "invoice-extractor",
"functionVersionNum": 2,
"estimate": {
"totalTransformations": 1000,
"labeledTransformations": 200,
"unlabeledTransformations": 800,
"missingEvaluations": 50,
"confidenceDistribution": {
"high": 500,
"medium": 350,
"low": 150
},
"thresholdMatrix": [
{
"threshold": 0.8,
"tp": 85,
"fp": 12,
"fn": 15,
"tn": 88,
"accuracyAboveThreshold": {
"95": {
"ciLower": 0.8456,
"mid": 0.875,
"ciUpper": 0.9044,
"currentSample": 120,
"sampleNeeded": 30
}
}
}
]
},
"metrics": {
"fieldMetrics": [
{
"fieldPath": "/invoice/number",
"metrics": {
"accuracy": 0.95,
"precision": 0.98,
"recall": 0.92,
"f1Score": 0.95,
"tp": 92,
"fp": 2,
"tn": 0,
"fn": 8
}
}
],
"precisionRecallAuc": 0.8542,
"aggregateMetrics": {
"accuracy": 0.7407,
"precision": 0.9524,
"recall": 0.7692,
"f1Score": 0.8511,
"tp": 40,
"fp": 2,
"tn": 0,
"fn": 12
}
}
}

{
"message": "string",
"code": 0,
"details": {}
}

See also
- System overview — evaluating extraction quality
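
The ratios in the example response's aggregateMetrics can be recomputed from its tp/fp/tn/fn counts. The sketch below assumes the standard confusion-matrix definitions; the `Counts` class and `derive_metrics` helper are hypothetical names, and the service's own formulas are not documented on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    tp: int
    fp: int
    tn: int
    fn: int


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator/denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def derive_metrics(c: Counts) -> dict:
    """Standard accuracy, precision, recall, and F1 from confusion-matrix counts."""
    return {
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "precision": _ratio(c.tp, c.tp + c.fp),
        "recall": _ratio(c.tp, c.tp + c.fn),
        "f1Score": _ratio(2 * c.tp, 2 * c.tp + c.fp + c.fn),
    }


# Counts taken from the aggregateMetrics object in the example response.
aggregate = derive_metrics(Counts(tp=40, fp=2, tn=0, fn=12))
rounded = {name: round(value, 4) for name, value in aggregate.items()}
print(rounded)
# {'accuracy': 0.7407, 'precision': 0.9524, 'recall': 0.7692, 'f1Score': 0.8511}
```

Recomputing the ratios this way is a quick sanity check when comparing two function versions, since the raw counts show how many labeled transformations sit behind each percentage.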