Rate limits
YumKiosk rate-limits API requests to protect the platform from runaway clients and to maintain fair shared usage across all tenants. Limits are enforced per-client and vary by endpoint class. This page explains the current limits, how to detect that you're approaching one, and how to handle the response when you've hit one.
Limit buckets
The platform has five distinct rate-limit buckets:
| Bucket | Scope | Limit |
|---|---|---|
| Public kiosk API | Per device | 120 req/min |
| Public pairing | Per IP | 10 req/min |
| Agent dashboard API | Per authenticated agent | 300 req/min |
| Login / auth | Per IP | 5 req/min |
| Webhooks (outbound) | Per endpoint URL | 60 req/min delivered |
The bucket you'll interact with most often is the Agent dashboard API bucket, which is generous enough that a normal agent workflow won't come close to the limit. The Login / auth bucket is the most restrictive because login endpoints are the primary target of brute-force attacks.
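As a rough illustration of what these numbers mean for client pacing, the sketch below converts each bucket's per-minute limit from the table into the average spacing between requests that stays within it. The dictionary keys and the function name are hypothetical labels chosen for this example, not identifiers from the YumKiosk API.

```python
# Per-minute limits copied from the table above.
# The key names are illustrative only, not official identifiers.
LIMITS_PER_MINUTE = {
    "public_kiosk": 120,
    "public_pairing": 10,
    "agent_dashboard": 300,
    "login_auth": 5,
    "webhooks_outbound": 60,
}


def min_spacing_seconds(bucket):
    """Average gap between requests that stays within the bucket's limit."""
    return 60.0 / LIMITS_PER_MINUTE[bucket]


if __name__ == "__main__":
    for name in LIMITS_PER_MINUTE:
        print(f"{name}: one request every {min_spacing_seconds(name):.1f} s")
```

A client that sends bursts can exceed these averages briefly, as long as the total within one window stays under the limit.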
Response headers
Every response from a rate-limited endpoint includes these headers:
X-RateLimit-Limit: 300
X-RateLimit-Remaining: 287
X-RateLimit-Reset: 1712760000
- X-RateLimit-Limit is the maximum number of requests per minute for this bucket.
- X-RateLimit-Remaining is how many requests you have left in the current window.
- X-RateLimit-Reset is the Unix timestamp when the window resets.
Well-behaved clients should monitor X-RateLimit-Remaining and back off preemptively when it gets low, rather than waiting until they see a 429.
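One way to act on these headers is sketched below. It assumes the response headers are available as a plain dictionary with the exact header names shown above; the `low_water` threshold and the function name are assumptions made for this example, not part of the YumKiosk API.

```python
import time


def seconds_to_pause(headers, now=None, low_water=10):
    """Return how long to wait before sending the next request.

    `low_water` is an assumed threshold, not a value published by
    YumKiosk; tune it for your client.
    """
    if now is None:
        now = time.time()
    remaining = int(headers["X-RateLimit-Remaining"])
    reset_at = int(headers["X-RateLimit-Reset"])
    if remaining > low_water:
        return 0.0
    # The budget is nearly spent: wait for the window to reset.
    return max(0.0, reset_at - now)


example = {
    "X-RateLimit-Limit": "300",
    "X-RateLimit-Remaining": "3",
    "X-RateLimit-Reset": "1712760000",
}
print(seconds_to_pause(example, now=1712759990))  # 10 seconds until reset
```

A gentler alternative is to spread the remaining requests evenly over the time left in the window instead of pausing outright.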
429 response
When you exceed a limit, the server returns 429 Too Many Requests with a body:
{
  "error": {
    "code": "RATE_LIMITED",
    "message": "Too many requests. Retry after 42 seconds.",
    "retry_after": 42
  }
}
The response also carries a Retry-After header matching the retry_after value. Respect it: sleep for at least that many seconds before trying again. Hammering a rate-limited endpoint can get your device token or account temporarily banned.
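A minimal sketch of extracting the wait time from a 429 follows. It assumes you already have the status code, a dictionary of headers, and the parsed JSON body; the function name is hypothetical. It also assumes Retry-After holds a number of seconds, as described above, and does not handle the HTTP-date form that the header can take in general.

```python
def retry_delay(status, headers, body):
    """Return seconds to wait after a response, or None if it is not a 429."""
    if status != 429:
        return None
    header_value = headers.get("Retry-After")
    if header_value is not None:
        return int(header_value)
    # Fall back to the JSON body if the header is missing.
    return int(body["error"]["retry_after"])


body = {
    "error": {
        "code": "RATE_LIMITED",
        "message": "Too many requests. Retry after 42 seconds.",
        "retry_after": 42,
    }
}
print(retry_delay(429, {"Retry-After": "42"}, body))  # 42
```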
Exponential backoff
For transient failures (not just rate limits, but any 5xx or network error), use exponential backoff with jitter:
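import random
import time

# Placeholder exceptions: your HTTP layer is assumed to raise these.
class RateLimitError(Exception):
    def __init__(self, retry_after):
        super().__init__(f"rate limited, retry after {retry_after}s")
        self.retry_after = retry_after

class TransientError(Exception):
    """Any 5xx response or network error."""

def call_with_backoff(fn, max_retries=5):
    for attempt in range(max_retries):
        try:
            return fn()
        except RateLimitError as e:
            # Never sleep less than the server asked for.
            delay = max(e.retry_after, 2 ** attempt + random.random())
        except TransientError:
            delay = 2 ** attempt + random.random()
        # Skip the sleep after the final attempt; there is nothing left to retry.
        if attempt < max_retries - 1:
            time.sleep(delay)
    raise RuntimeError("Max retries exceeded")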
The jitter is important — if you don't add randomness, every client retrying at the same intervals creates thundering herds that make congestion worse.
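To make the effect of jitter concrete, the sketch below computes the delay that five clients would each choose for the same retry attempt, first without and then with a random component. The helper name `backoff_delay` is made up for this illustration, and the fixed seed is there only to make the demo repeatable.

```python
import random


def backoff_delay(attempt, rng=None):
    """Delay for a retry attempt: 2**attempt seconds, plus up to 1 s of jitter.

    Pass rng=None to disable jitter.
    """
    base = 2 ** attempt
    return base if rng is None else base + rng.random()


rng = random.Random(0)  # seeded only to make the demo repeatable
without_jitter = [backoff_delay(2) for _ in range(5)]
with_jitter = [backoff_delay(2, rng) for _ in range(5)]
print(without_jitter)  # five identical delays: every client retries together
print(with_jitter)     # delays spread between 4 and 5 seconds
```

Without jitter, all five clients come back at the same instant; with it, their retries are spread across a one-second span.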
Asking for a higher limit
If your use case genuinely needs a higher limit — say, a POS integration syncing hundreds of orders per minute — email developers@yumkiosk.com explaining what you're doing and how much throughput you need. We'll raise the limit for your tenant without changing the platform defaults.
Distributed deployment gotcha
If you run multiple copies of the same client with shared credentials (e.g., several worker processes using the same agent account), they all draw from the same rate-limit bucket. Plan capacity accordingly. If you're hitting limits, consider using one account per worker or coalescing requests through a single gateway.
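As one possible shape for the gateway approach, here is a sketch of a client-side limiter that several worker threads in one process could share. The class and its parameters are hypothetical. It uses a sliding window, which may not match how the server counts its windows, and it does not coordinate across separate processes or machines; that would need a shared store.

```python
import threading
import time
from collections import deque


class SharedRateLimiter:
    """Sliding-window limiter shared by all workers in one process."""

    def __init__(self, limit, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self._sent = deque()  # timestamps of recently admitted requests
        self._lock = threading.Lock()

    def try_acquire(self):
        """Return True and record a request if the budget allows one now."""
        with self._lock:
            now = self.clock()
            # Drop timestamps that have aged out of the window.
            while self._sent and now - self._sent[0] >= self.window:
                self._sent.popleft()
            if len(self._sent) < self.limit:
                self._sent.append(now)
                return True
            return False


# One instance shared by every worker that uses the same agent account.
agent_limiter = SharedRateLimiter(limit=300)
```

Each worker calls `agent_limiter.try_acquire()` before sending a request and waits briefly when it returns False, so the combined traffic stays under the shared bucket's limit.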
Our commitment
We will give at least 30 days' notice before tightening any published rate limit. We may raise limits at any time without notice. This page is the source of truth for limits: if you see conflicting numbers elsewhere, this page wins.