Rate limits
Rikaii applies distributed rate limits to POST /v1/chat/completions so shared infrastructure stays reliable for everyone.
Limits (requests per minute)
Limits are enforced per workspace (identified by your API key) using a one-minute sliding window:
| Plan | Requests per minute (RPM) |
|---|---|
| Standard (pay-as-you-go, non‑Pro) | 60 |
| Pro | 120 |
These values match the deployed rate-limit configuration. If you need sustained higher throughput, contact support.
HTTP behavior
When the limit is exceeded, the API returns 429 Too Many Requests with a JSON error body. Rikaii does not send a Retry-After header, so clients must choose their own wait time: use exponential backoff when you receive 429.
If rate-limit checks are temporarily unavailable, the API may return 503 Service Unavailable instead of allowing unlimited traffic (fail-closed).
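Because no Retry-After header is sent, the client has to pick its own delays. The following is a minimal sketch of exponential backoff with full jitter, not an official Rikaii client. The names `call_with_backoff`, `backoff_delay`, and `send` are hypothetical, and so are the default delays and attempt count. Retrying on 503 in the same way as 429 is also an assumption; the text above only prescribes backoff for 429.

```python
import random
import time
from typing import Callable, Tuple

# Assumption: treat 503 (rate-limit checks unavailable) like 429 for retries.
RETRYABLE_STATUSES = {429, 503}


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0,
                  rng: Callable[[], float] = random.random) -> float:
    """Full-jitter delay: a random value in [0, min(cap, base * 2**attempt))."""
    return rng() * min(cap, base * (2 ** attempt))


def call_with_backoff(send: Callable[[], Tuple[int, str]],
                      max_attempts: int = 5,
                      base: float = 1.0,
                      cap: float = 30.0,
                      sleep: Callable[[float], None] = time.sleep) -> Tuple[int, str]:
    """Call send() until it returns a non-retryable status or attempts run out.

    `send` is a hypothetical callable that performs one request to
    POST /v1/chat/completions and returns (status_code, body).
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE_STATUSES:
            return status, body
        # Sleep only if another attempt will follow.
        if attempt < max_attempts - 1:
            sleep(backoff_delay(attempt, base, cap))
    # Out of attempts: hand the last retryable response back to the caller.
    return status, body
```

Keeping `send` as a plain callable means the retry logic works with whichever HTTP library you already use. The jitter spreads retries from many clients over time, so they do not all return at the same instant and trip the limit again.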
Best practices
- Prefer backoff and jitter on 429 rather than tight retry loops.
- Serialize bursty traffic within your tier limits.
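One way to keep bursts inside a tier limit is to pace requests on the client before they are sent. The sketch below is an illustrative client-side sliding-window pacer. The class name `SlidingWindowPacer` and its parameters are hypothetical, and it does not claim to mirror how the server counts requests, so setting `max_requests` somewhat below your plan's RPM is a reasonable precaution.

```python
import threading
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowPacer:
    """Blocks callers so that at most `max_requests` start per window."""

    def __init__(self, max_requests: int, window_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.max_requests = max_requests
        self.window = window_seconds
        self._clock = clock
        self._sleep = sleep
        self._sent: Deque[float] = deque()  # start times of recent requests
        self._lock = threading.Lock()

    def acquire(self) -> None:
        """Wait until one more request fits in the window, then record it."""
        # Holding the lock while waiting serializes callers on purpose.
        with self._lock:
            while True:
                now = self._clock()
                # Drop timestamps that have aged out of the window.
                while self._sent and now - self._sent[0] >= self.window:
                    self._sent.popleft()
                if len(self._sent) < self.max_requests:
                    self._sent.append(now)
                    return
                # Window is full: wait until the oldest request ages out.
                self._sleep(self.window - (now - self._sent[0]))
```

A typical use would be to create one pacer per API key, for example `SlidingWindowPacer(max_requests=50)` on the Standard plan, and call `acquire()` immediately before each request. Pacing reduces how often you hit 429 but does not replace backoff, since other clients sharing the same workspace key also count toward the limit.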