Rate limits

YumKiosk rate-limits API requests to protect the platform from runaway clients and to maintain fair shared usage across all tenants. Limits are enforced per-client and vary by endpoint class. This page explains the current limits, how to detect that you're approaching one, and how to handle the response when you've hit one.

Limit buckets

The platform has five distinct rate-limit buckets:

Bucket	Scope	Limit
Public kiosk API	Per device	120 req/min
Public pairing	Per IP	10 req/min
Agent dashboard API	Per authenticated agent	300 req/min
Login / auth	Per IP	5 req/min
Webhooks (outbound)	Per endpoint URL	60 req/min delivered

The most common bucket you'll interact with is the Agent dashboard API bucket, which is generous enough that a normal agent workflow won't come close. The Login / auth bucket is the most restrictive because it's the primary vector for brute-force attacks.

Response headers

Every response from a rate-limited endpoint includes these headers:

X-RateLimit-Limit: 300
X-RateLimit-Remaining: 287
X-RateLimit-Reset: 1712760000

X-RateLimit-Limit — the maximum number of requests per minute for this bucket.
X-RateLimit-Remaining — how many requests you have left in the current window.
X-RateLimit-Reset — the Unix timestamp when the window resets.

Well-behaved clients should monitor Remaining and back off preemptively when it gets low, not wait until they see a 429.
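As a minimal sketch of that preemptive back-off, the helper below reads the three headers and suggests a pause once the remaining budget drops below a threshold. The function name preemptive_delay and the 10% low_water threshold are illustrative assumptions, not part of the YumKiosk API, and headers is assumed to be a plain dict keyed exactly as shown above.

```python
import time


def preemptive_delay(headers, low_water=0.1, now=None):
    """Return how many seconds to pause before the next request (0.0 = none).

    Hypothetical helper: `low_water` is the fraction of the limit below
    which we start slowing down; it is an assumption, not a platform value.
    """
    now = time.time() if now is None else now
    try:
        limit = int(headers["X-RateLimit-Limit"])
        remaining = int(headers["X-RateLimit-Remaining"])
        reset_at = int(headers["X-RateLimit-Reset"])
    except (KeyError, ValueError):
        return 0.0  # headers missing or malformed: nothing to act on

    if remaining > limit * low_water:
        return 0.0  # plenty of budget left

    # Spread the remaining requests evenly over what is left of the window.
    window_left = max(0.0, reset_at - now)
    return window_left / (remaining + 1)
```

A client could call this after every response and sleep for the returned value before sending the next request. Spreading requests over the rest of the window is one possible policy; pausing until the reset timestamp is another.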

429 response

When you exceed a limit, the server returns 429 Too Many Requests with a body:

{
  "error": {
    "code": "RATE_LIMITED",
    "message": "Too many requests. Retry after 42 seconds.",
    "retry_after": 42
  }
}

The response also carries a Retry-After header matching the retry_after value. Respect it: sleep for at least that many seconds before trying again. Hammering a rate-limited endpoint can get your device token or account temporarily banned.
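The following sketch shows one way to extract the wait time from a 429, preferring the Retry-After header and falling back to retry_after in the JSON body. The function name retry_delay and the 1-second default are assumptions for illustration. The inputs are a status code, a header dict, and the raw body text, so it does not depend on any particular HTTP library.

```python
import json


def retry_delay(status, headers, body_text, default=1.0):
    """Return seconds to wait before retrying, or None if not rate limited.

    Hypothetical helper; `default` is an assumed fallback used only when
    neither the header nor the body yields a usable number.
    """
    if status != 429:
        return None

    # Prefer the header, which is available without parsing the body.
    header_value = headers.get("Retry-After")
    if header_value is not None:
        try:
            return max(0.0, float(header_value))
        except ValueError:
            pass  # not a number of seconds; fall through to the body

    # Fall back to the retry_after field of the documented error body.
    try:
        return max(0.0, float(json.loads(body_text)["error"]["retry_after"]))
    except (ValueError, KeyError, TypeError):
        return default
```

For the example response above, retry_delay(429, {"Retry-After": "42"}, body) returns 42.0.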

Exponential backoff

For transient failures (not just rate limits, but any 5xx or network error), use exponential backoff with jitter:

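import random
import time

class RateLimitError(Exception):
    """Placeholder: raise this on a 429, carrying retry_after in seconds."""
    def __init__(self, retry_after):
        super().__init__(f"rate limited, retry after {retry_after}s")
        self.retry_after = retry_after

class TransientError(Exception):
    """Placeholder: raise this on a 5xx response or a network error."""

def call_with_backoff(fn, max_retries=5):
    for attempt in range(max_retries):
        # Exponential delay (1s, 2s, 4s, ...) plus up to 1s of random jitter
        delay = 2 ** attempt + random.random()
        try:
            return fn()
        except RateLimitError as e:
            # Never retry sooner than the server asked
            delay = max(e.retry_after, delay)
        except TransientError:
            pass
        if attempt < max_retries - 1:  # no point sleeping after the last try
            time.sleep(delay)
    raise RuntimeError("Max retries exceeded")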

The jitter is important — if you don't add randomness, every client retrying at the same intervals creates thundering herds that make congestion worse.

Asking for a higher limit

If your use case genuinely needs a higher limit — say, a POS integration syncing hundreds of orders per minute — email developers@yumkiosk.com explaining what you're doing and how much throughput you need. We'll raise the limit for your tenant without changing the platform defaults.

Distributed deployment gotcha

If you run multiple copies of the same client with shared credentials (e.g., multiple worker processes using the same agent account), they all draw from the same rate-limit bucket. Plan capacity accordingly. If you're hitting limits, consider using one account per worker or coalescing requests through a single gateway.
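One way to coalesce requests is to route every worker through a shared client-side limiter. The sketch below is a simple token bucket, a standard technique rather than anything YumKiosk prescribes. The class name SharedLimiter is hypothetical, and it only coordinates threads inside one process, so workers in separate processes would need a gateway process in front of them. Because a token bucket allows an initial burst, it is safer to configure it somewhat below the published limit.

```python
import threading
import time


class SharedLimiter:
    """Token bucket shared by all worker threads in one process (a sketch)."""

    def __init__(self, rate_per_minute, clock=time.monotonic, sleep=time.sleep):
        self.capacity = float(rate_per_minute)
        self.refill_per_sec = rate_per_minute / 60.0
        self.tokens = self.capacity
        self.clock = clock      # injectable so the limiter can be tested
        self.sleep = sleep
        self.last = clock()
        self.lock = threading.Lock()

    def acquire(self):
        """Block until one request slot is available, then consume it."""
        while True:
            with self.lock:
                now = self.clock()
                # Refill in proportion to elapsed time, capped at capacity.
                refill = (now - self.last) * self.refill_per_sec
                self.tokens = min(self.capacity, self.tokens + refill)
                self.last = now
                if self.tokens >= 1.0:
                    self.tokens -= 1.0
                    return
                wait = (1.0 - self.tokens) / self.refill_per_sec
            # Sleep outside the lock so other threads are not blocked.
            self.sleep(wait)
```

Each worker would call limiter.acquire() immediately before sending a request, for example with limiter = SharedLimiter(270) to stay under the 300 req/min Agent dashboard API bucket.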

Our commitment

We will give at least 30 days' notice before tightening any published rate limit. We may raise limits at any time without notice. This page is the source of truth for limits: if you see conflicting numbers elsewhere, this page wins.
