Collections

Count tokens for texts

Count the number of tokens in the provided texts using the BGE M3 tokenizer. This is useful for checking if texts will fit within the embedding model's token limit (8,192 tokens per text) before sending them for embedding.

POST /v2/collections/token-count

x-api-key: <token>

Authenticate using an API key in the request header.

In: header

Request Body

application/json

texts (required): array<string>
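
As a rough illustration of the request described above, the following Python sketch builds the POST with only the standard library. The helper names `build_token_count_request` and `count_tokens` and the `timeout` default are hypothetical and not part of the API. The URL, the `x-api-key` header, and the `texts` field are taken from this page.

```python
import json
import urllib.request

# Endpoint URL as shown in this page's curl example.
API_URL = "https://api.bem.ai/v2/collections/token-count"


def build_token_count_request(texts, api_key):
    """Build (but do not send) a POST request for the token-count endpoint.

    Hypothetical helper: the name and signature are illustrative only.
    """
    body = json.dumps({"texts": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "x-api-key": api_key,  # API key header described on this page
        },
    )


def count_tokens(texts, api_key, timeout=30):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_token_count_request(texts, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes it possible to inspect the payload without network access. `count_tokens(["some text"], "<token>")` would perform the actual call.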

Response Body

application/json

curl -X POST "https://api.bem.ai/v2/collections/token-count" \
  -H "Content-Type: application/json" \
  -H "x-api-key: <token>" \
  -d '{
    "texts": [
      "string"
    ]
  }'
{
  "token_counts": [
    {
      "index": 0,
      "token_count": 0,
      "exceeds_limit": true,
      "char_count": 0
    }
  ],
  "total_tokens": 0,
  "max_token_limit": 0,
  "texts_exceeding_limit": 0
}
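
One way to use this response for the pre-embedding check described at the top of the page is sketched below. The function name `texts_over_limit` is hypothetical, the sample values are invented for illustration and are not real API output, and the sketch assumes that `index` refers to the text's position in the submitted `texts` array.

```python
def texts_over_limit(response):
    """Return the `index` of every entry flagged with `exceeds_limit`.

    Hypothetical helper; assumes `index` is the position of the text
    in the `texts` array that was sent.
    """
    return [
        entry["index"]
        for entry in response["token_counts"]
        if entry["exceeds_limit"]
    ]


# Illustrative values only, shaped like the response schema above.
sample_response = {
    "token_counts": [
        {"index": 0, "token_count": 12, "exceeds_limit": False, "char_count": 48},
        {"index": 1, "token_count": 9000, "exceeds_limit": True, "char_count": 36000},
    ],
    "total_tokens": 9012,
    "max_token_limit": 8192,
    "texts_exceeding_limit": 1,
}

print(texts_over_limit(sample_response))  # prints [1]
```

Texts returned by such a check could then be split or truncated before being sent for embedding.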