Count tokens for texts

Count the number of tokens in the provided texts using the BGE M3 tokenizer. This is useful for checking if texts will fit within the embedding model's token limit (8,192 tokens per text) before sending them for embedding.

Authorization

API Key

x-api-key: <token>

Authenticate using API Key in request header

In: header

Request Body

application/json

texts (required): array<string>

List of texts to count tokens for
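As a rough illustration of the request shape described above, the following Python sketch builds (but does not send) the POST request using only the standard library. The URL, the `texts` field, and the `x-api-key` header come from this page; the function name `build_token_count_request` and the constant `TOKEN_COUNT_URL` are hypothetical names chosen for this example.

```python
import json
import urllib.request

# Endpoint taken from the curl example on this page.
TOKEN_COUNT_URL = "https://api.bem.ai/v2/collections/token-count"


def build_token_count_request(texts, api_key):
    """Return an unsent POST request for the token-count endpoint.

    `texts` must be a list of strings, matching the required
    `texts` field of the request body.
    """
    if not isinstance(texts, list) or not all(isinstance(t, str) for t in texts):
        raise TypeError("texts must be a list of strings")
    body = json.dumps({"texts": texts}).encode("utf-8")
    return urllib.request.Request(
        TOKEN_COUNT_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "x-api-key": api_key,  # API key sent in the request header
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request; error handling for the 400 and 500 responses is left out of this sketch.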

Response Body

`application/json`

curl -X POST "https://api.bem.ai/v2/collections/token-count" \
  -H "Content-Type: application/json" \
  -H "x-api-key: <token>" \
  -d '{
    "texts": [
      "Hello, world!",
      "This is another text to count tokens for."
    ]
  }'


{
  "token_counts": [
    {
      "index": 0,
      "token_count": 4,
      "exceeds_limit": false,
      "char_count": 13
    },
    {
      "index": 1,
      "token_count": 10,
      "exceeds_limit": false,
      "char_count": 43
    }
  ],
  "total_tokens": 14,
  "max_token_limit": 8192,
  "texts_exceeding_limit": 0
}
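A response of this shape can be checked before embedding. The sketch below is one possible way to do that in Python, assuming the response has already been parsed into a dictionary with the fields shown in the sample above; the helper names `indices_over_limit` and `all_texts_fit` are hypothetical and not part of the API.

```python
def indices_over_limit(response):
    """Return the `index` of every entry flagged with `exceeds_limit`."""
    return [
        item["index"]
        for item in response["token_counts"]
        if item["exceeds_limit"]
    ]


def all_texts_fit(response):
    """True when the response reports no texts over the token limit."""
    return response["texts_exceeding_limit"] == 0


# Sample response copied from this page.
sample = {
    "token_counts": [
        {"index": 0, "token_count": 4, "exceeds_limit": False, "char_count": 13},
        {"index": 1, "token_count": 10, "exceeds_limit": False, "char_count": 43},
    ],
    "total_tokens": 14,
    "max_token_limit": 8192,
    "texts_exceeding_limit": 0,
}

print(indices_over_limit(sample))  # prints []
print(all_texts_fit(sample))       # prints True
```

Texts whose index is returned by `indices_over_limit` would need to be shortened or split before being sent for embedding.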


Response status codes: 200 (`application/json`), 400, 500