Polling and Retries
Synchronous waits, polling cadence, idempotency, and how to retry safely
Workflow calls run asynchronously by default. There are three ways to find out when one finishes:
- Synchronous wait: pass wait=true and bem holds the response open for up to 30 seconds.
- Polling: call GET /v3/calls/{callID} until status is terminal.
- Webhooks: subscribe a function to a URL and receive event deliveries (see Webhooks).
Use whichever fits your workload. Synchronous waits are the simplest for interactive flows; webhooks are the right choice for high-volume backends; polling is the universal fallback.
Synchronous waits (wait=true)
wait=true on POST /v3/workflows/{workflowName}/call holds the response open for up to 30 seconds. If the call finishes inside that window you get 200 OK (or 500 on failure) with the final result; if it doesn't, you get 202 Accepted with the in-progress call object and you fall back to polling or wait for the webhook.
For the full contract — request shape, latency expectations, language-by-language access patterns, HTTP-client timeout configuration, and the production patterns that combine sync mode with webhooks — see Synchronous Mode.
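The three outcomes described above reduce to a small dispatch on the HTTP status code. The sketch below is illustrative only: `next_step` and its return labels are hypothetical names, not part of the bem SDK, and it assumes your HTTP client exposes the raw status code.

```python
def next_step(status_code: int) -> str:
    """Map the HTTP status of a wait=true call to what the caller does next.

    Hypothetical helper, not part of the bem SDK.
    """
    if status_code == 200:
        return "done"    # finished inside the 30s window; body holds the final result
    if status_code == 202:
        return "poll"    # still in progress; poll GET /v3/calls/{callID} or await the webhook
    if status_code == 500:
        return "failed"  # finished inside the window, but the call failed
    return "error"       # anything else is a request-level error, not a call outcome
```

Keeping the decision in one place makes the 202 fallback hard to forget when a workflow occasionally outlives the 30-second window.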
Polling
GET /v3/calls/{callID} returns the current state of any call. The status field is the one to switch on:
| Status | Terminal? | What it means |
|---|---|---|
| pending | No | Queued, not yet picked up by a worker |
| running | No | At least one node is executing |
| completed | Yes | Every terminal node finished without an error event |
| failed | Yes | One or more terminal nodes produced an error event |
Recommended cadence:
- Initial wait: 500ms–1s. Most simple workflows finish well under 5s.
- Backoff: double after each unsuccessful poll, with jitter, capping at ~10s. A capped exponential of 0.5, 1, 2, 4, 8, 10, 10, 10, … is a reasonable default.
- Deadline: pick one based on your workflow's expected runtime. Multi-step workflows with split/extract chains can run for tens of seconds; OCR-heavy pages can take minutes. If you don't have a deadline, fall back to webhooks.
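The capped schedule in the list above can be produced by a small generator. This is a minimal sketch: `backoff_delays` and its `jitter` parameter are hypothetical names chosen for illustration, and the symmetric jitter shown is one reasonable choice among several.

```python
import itertools
import random
from typing import Iterator


def backoff_delays(initial: float = 0.5, cap: float = 10.0,
                   jitter: float = 0.0) -> Iterator[float]:
    """Yield poll delays in seconds: doubling from `initial`, capped at `cap`.

    `jitter` is a fraction in [0, 1): each delay is scaled by a random factor
    in [1 - jitter, 1 + jitter]. With jitter=0 the schedule is deterministic.
    """
    delay = initial
    while True:
        factor = 1.0 + random.uniform(-jitter, jitter) if jitter else 1.0
        yield min(delay * factor, cap)   # never exceed the cap, even with jitter
        delay = min(delay * 2, cap)      # double for the next round


print(list(itertools.islice(backoff_delays(), 8)))
# [0.5, 1.0, 2.0, 4.0, 8.0, 10.0, 10.0, 10.0]
```

In a polling loop you would pass something like `jitter=0.2` so that many clients started at the same moment do not poll in lockstep.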
    import time

    from bem import Bem

    client = Bem()
    call_id = "wc_abc123"

    delay = 0.5                          # initial wait, in seconds
    deadline = time.monotonic() + 120    # 2-minute deadline; monotonic clock ignores wall-clock jumps

    while True:
        call = client.calls.retrieve(call_id).call
        if call.status in ("completed", "failed"):   # terminal states
            break
        if time.monotonic() > deadline:
            raise TimeoutError(f"call {call_id} did not finish in time")
        time.sleep(delay)
        delay = min(delay * 2, 10)       # double, capped at 10s (jitter omitted for brevity)

For per-node visibility (which node ran, which event it emitted, why a particular node failed), fetch the trace at GET /v3/calls/{callID}/trace. The trace is incremental: it grows as the call progresses, so you can also poll it mid-execution.
Idempotency via callReferenceID
callReferenceID is your client-side deduplication key. When you submit the same callReferenceID against the same workflow within a short retention window, bem returns the existing call instead of creating a new one, so you can retry after a network failure without producing duplicates.
    import Bem from "bem-ai-sdk";

    const client = new Bem();

    // invoiceID and inputContent come from your application.
    // Retrying this exact request with the same callReferenceID is safe.
    const { call } = await client.workflows.call("invoice-processing", {
      callReferenceID: `invoice:${invoiceID}`,
      input: { singleFile: { inputContent, inputType: "pdf" } },
      wait: true,
    });

Pick a callReferenceID that is derived deterministically from your domain (the invoice ID, the document UUID, a user-and-email-and-timestamp tuple), not a random string. Random IDs defeat the deduplication.
If you don't pass a callReferenceID, every retry creates a new call. The call objects are cheap, but you'll process the same input multiple times and your downstream systems will see duplicate events.
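For a single natural key, a plain prefixed form such as invoice:<id> is enough. For a composite key, one option is to hash the parts into a fixed-length string. The sketch below assumes nothing about bem beyond what is stated here: `call_reference_id` is a hypothetical helper, and the 32-character truncation is an arbitrary choice, since any length or character limits on callReferenceID are not covered on this page.

```python
import hashlib


def call_reference_id(kind: str, *parts: str) -> str:
    """Build a deterministic deduplication key from stable domain values.

    Hypothetical helper. The same inputs always give the same key, so a
    retried submission reuses the key instead of minting a new one.
    """
    # Join with a unit separator so ("a", "bc") and ("ab", "c") hash differently.
    joined = "\x1f".join(parts)
    digest = hashlib.sha256(joined.encode("utf-8")).hexdigest()
    return f"{kind}:{digest[:32]}"


# Example: a user-and-email-and-timestamp tuple, using the message's own timestamp.
key = call_reference_id("email", "user-42", "a@example.com", "2024-05-01T12:00:00Z")
```

The timestamp here must come from the input itself (for example, when the email was received), not from the clock at submission time; otherwise every retry would produce a different key.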
Network and server-side retries
| Status | Retry? | How |
|---|---|---|
| 429 Too Many Requests | Yes | Honour Retry-After if set; otherwise exponential backoff. |
| 500/502/503/504 | Yes | Exponential backoff with jitter, max 5 attempts. |
| 408 Request Timeout | Yes | Same as 5xx. |
| 400/401/404/422 | No | The request itself needs to change. |
The official SDKs implement these defaults — you only need to add explicit retry logic if you're using fetch or requests directly. See Errors and status codes for the full breakdown.
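If you are calling the API without an SDK, the table above can be encoded as one decision function. This is a sketch under stated assumptions, not the SDKs' actual implementation: `retry_delay`, the 0.5s base, and the 10s cap are hypothetical, the 5-attempt limit is applied to every retryable status for simplicity, and only the delta-seconds form of Retry-After is parsed.

```python
import random
from typing import Optional

RETRYABLE = {408, 429, 500, 502, 503, 504}
MAX_ATTEMPTS = 5


def retry_delay(status: int, attempt: int,
                retry_after: Optional[str] = None,
                base: float = 0.5, cap: float = 10.0) -> Optional[float]:
    """Return seconds to wait before the next attempt, or None to stop.

    `attempt` is 1 for the first failed try. `retry_after` is the raw
    Retry-After header value, if the response carried one.
    """
    if status not in RETRYABLE or attempt >= MAX_ATTEMPTS:
        return None  # non-retryable status, or attempts exhausted
    if status == 429 and retry_after is not None:
        try:
            return max(0.0, float(retry_after))  # delta-seconds form only
        except ValueError:
            pass  # HTTP-date form is not handled in this sketch; fall through
    # Exponential backoff with full jitter: uniform in [0, min(cap, base * 2^(attempt-1))].
    return random.uniform(0.0, min(cap, base * 2 ** (attempt - 1)))
```

A caller would loop over attempts, sleep for the returned value, and give up (surfacing the last error) as soon as the function returns None.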
When to use which
| Pattern | Use when | Watch out for |
|---|---|---|
| wait=true | Interactive UIs, scripts, single-shot extracts that finish in seconds | The 30s ceiling; fall back to polling on 202 |
| Polling | Batch jobs, CI workflows, simple long-running scripts | Don't poll faster than ~2 calls/sec; honour rate limits |
| Webhooks | Production backends, multi-tenant systems, anything where you'd otherwise burn polling traffic | Set up signature verification before going live (see Webhooks) |