Infer Schema from File
Analyze a file and infer a JSON Schema from its contents.
Accepts a file via multipart form upload and uses Gemini to analyze the document, returning a description of its contents, an inferred JSON Schema capturing all extractable fields, and document classification metadata.
The returned schema is designed to be reusable across many similar documents of the same type, not just the specific file uploaded. It can be used directly as the outputSchema when creating a Transform function.
The endpoint also detects whether the file contains multiple bundled documents and classifies the content nature (textual, visual, audio, video, or mixed).
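As a rough illustration of how a client might read these fields, the sketch below decodes a response body and pulls out the description, the inferred schema, and the classification metadata. The field names follow the response sample in this reference. The dataclasses, the parse_analysis helper, and the sample payload are hypothetical and exist only for this example.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class DocumentType:
    name: str
    count: int
    description: str


@dataclass
class Analysis:
    description: str
    schema: Dict[str, Any]
    is_multi_document: bool
    content_nature: str
    document_types: List[DocumentType]


def parse_analysis(payload: Dict[str, Any]) -> Analysis:
    """Pick the fields of interest out of a decoded response body."""
    analysis = payload["analysis"]
    return Analysis(
        description=analysis.get("description", ""),
        schema=analysis.get("schema") or {},
        is_multi_document=bool(analysis.get("isMultiDocument", False)),
        content_nature=analysis.get("contentNature", ""),
        document_types=[
            DocumentType(d.get("name", ""), int(d.get("count", 0)), d.get("description", ""))
            for d in analysis.get("documentTypes") or []
        ],
    )


# Made-up payload for illustration only; real values come from the API.
SAMPLE_RESPONSE = json.loads("""
{
  "filename": "invoice.pdf",
  "analysis": {
    "description": "A single-page invoice.",
    "schema": {"type": "object", "properties": {"invoiceNumber": {"type": "string"}}},
    "isMultiDocument": false,
    "documentTypes": [{"name": "invoice", "count": 1, "description": "Vendor invoice"}],
    "contentNature": "textual"
  }
}
""")

result = parse_analysis(SAMPLE_RESPONSE)
print(result.content_nature, result.is_multi_document, sorted(result.schema["properties"]))
```

Keeping the schema as a plain dictionary means it can be serialized back to JSON unchanged if it is later passed along as an outputSchema.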
Supported file types
PDF, PNG, JPEG, HEIC, HEIF, WebP, CSV, XLS, XLSX, DOCX, JSON, HTML, XML, EML, plain text, WAV, MP3, M4A, MP4.
File size limit
Maximum file size is 20 MB.
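A client may want to reject unsupported or oversized files before uploading. The sketch below is one possible pre-flight check, not part of the API. It assumes that "20 MB" means 20 * 1024 * 1024 bytes and that the listed file types map to the extensions shown (for example .jpg and .jpeg for JPEG, .txt for plain text). The check_upload helper is hypothetical, and the server may apply different rules.

```python
import os
from typing import List

# Assumption: the documented 20 MB limit is interpreted as 20 MiB.
MAX_FILE_SIZE_BYTES = 20 * 1024 * 1024

# Assumption: extension mapping for the supported file types listed above.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".png", ".jpg", ".jpeg", ".heic", ".heif", ".webp",
    ".csv", ".xls", ".xlsx", ".docx", ".json", ".html", ".xml",
    ".eml", ".txt", ".wav", ".mp3", ".m4a", ".mp4",
}


def check_upload(filename: str, size_bytes: int) -> List[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    extension = os.path.splitext(filename)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append("unsupported file type: " + (extension or "(none)"))
    if size_bytes > MAX_FILE_SIZE_BYTES:
        problems.append("file exceeds the 20 MB limit")
    return problems


print(check_upload("invoice.pdf", 1024))
print(check_upload("notes.odt", 1024))
```

For a file on disk, os.path.getsize(path) supplies the size argument.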
Examples
Using curl:
curl -X POST https://api.bem.ai/v3/infer-schema \
  -H "x-api-key: YOUR_API_KEY" \
  -F "file=@invoice.pdf"
Using the Bem CLI:
bem infer-schema create --file @invoice.pdf
Authorization
API Key (x-api-key): Authenticate using your API key in the request header.
In: header
Request Body
multipart/form-data
file: The file to analyze and infer a JSON Schema from.
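For readers not using curl, the sketch below shows one way the same multipart request could be assembled with the Python standard library only. The URL, the x-api-key header, and the file form field come from the curl example in this reference. The build_infer_schema_request helper is hypothetical, and sending the part as application/octet-stream is an assumption, since the API may expect the file's real MIME type.

```python
import urllib.request
import uuid

API_URL = "https://api.bem.ai/v3/infer-schema"


def build_infer_schema_request(filename: str, content: bytes, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST with a single 'file' part."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        b"--" + boundary.encode("ascii") + b"\r\n",
        b'Content-Disposition: form-data; name="file"; filename="'
        + filename.encode("utf-8") + b'"\r\n',
        # Assumption: a generic content type is accepted for the uploaded part.
        b"Content-Type: application/octet-stream\r\n\r\n",
        content,
        b"\r\n--" + boundary.encode("ascii") + b"--\r\n",
    ])
    headers = {
        "x-api-key": api_key,
        "Content-Type": "multipart/form-data; boundary=" + boundary,
    }
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")


# Placeholder bytes and key for illustration; read the real file from disk instead.
request = build_infer_schema_request("invoice.pdf", b"%PDF-1.4 placeholder", "YOUR_API_KEY")
print(request.get_method(), request.full_url, len(request.data))

# To actually send it (requires a valid key and network access):
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Building the request separately from sending it keeps the example runnable offline and makes the headers and body easy to inspect.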
Response Body
Success response (application/json):
{
  "filename": "string",
  "analysis": {
    "fileName": "string",
    "contentType": "string",
    "sizeBytes": 0,
    "fileType": "string",
    "description": "string",
    "schema": {},
    "isMultiDocument": true,
    "documentTypes": [
      {
        "name": "string",
        "count": 0,
        "description": "string"
      }
    ],
    "contentNature": "string"
  }
}
Error response (application/json):
{
  "message": "string",
  "code": 0,
  "details": {}
}
See also
- Schema building guide — from inferred to production-ready