Unkey Deploy is currently in private beta. To get access, reach out on Discord or email support@unkey.com.
The rate limiting policy enforces request limits on any route that matches the policy’s match expressions. Requests that exceed a configured limit receive a 429 response, protecting your app from traffic spikes and abuse. Each rate limit policy specifies a maximum number of requests within a time window (for example, 100 requests per 60 seconds) and a subject that identifies the entity being limited.
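To make the "maximum number of requests within a time window" idea concrete, here is a minimal sketch of a fixed-window counter. The page does not say which algorithm the Sentinel uses, so the fixed-window approach, the class name `FixedWindowLimiter`, and its fields are all assumptions chosen for illustration.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FixedWindowLimiter:
    """Illustrative fixed-window counter; not the Sentinel's actual algorithm."""

    limit: int           # maximum requests allowed per window
    window_seconds: int  # window length in seconds
    # Maps (subject, window index) to the number of requests counted so far.
    # Old windows are never pruned here, which a real limiter would need to do.
    _counts: dict = field(default_factory=dict)

    def allow(self, subject: str, now: Optional[float] = None) -> bool:
        """Return True if the request fits in the current window."""
        now = time.time() if now is None else now
        window_index = int(now // self.window_seconds)
        key = (subject, window_index)
        count = self._counts.get(key, 0)
        if count >= self.limit:
            return False  # a gateway would answer this request with 429
        self._counts[key] = count + 1
        return True


# 100 requests per 60 seconds, counted per client IP (hypothetical address).
limiter = FixedWindowLimiter(limit=100, window_seconds=60)
results = [limiter.allow("203.0.113.7", now=1_000.0) for _ in range(101)]
```

In this sketch the first 100 calls in a window are allowed and the 101st is rejected; a call that lands in the next window starts from zero again.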

Rate limit subjects

The subject determines how the Sentinel groups requests for counting:
| Subject | Description |
| --- | --- |
| Remote IP | Limit by client IP address |
| Header value | Limit by a specific request header (for example, X-Tenant-Id) |
| Authenticated subject | Limit by the authenticated Principal’s subject field |
| URL path | Create separate limits per endpoint |
| Source field | Limit by a field from the Principal’s source (for example, source.key.meta.org_id for per-organization limits) |
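The subject types above can be pictured as a function that turns a request into a counting key. The sketch below is only an illustration: the `Request` shape, the `kind` strings, and the helper names are hypothetical and are not the Sentinel's configuration format or API.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Request:
    """Hypothetical request shape used only for this example."""

    remote_ip: str
    path: str
    headers: dict = field(default_factory=dict)
    principal: Optional[dict] = None  # e.g. {"subject": ..., "source": {...}}


def resolve_path(data: Any, dotted: str) -> Optional[Any]:
    """Walk a dotted path such as 'source.key.meta.org_id' through nested dicts."""
    for part in dotted.split("."):
        if not isinstance(data, dict) or part not in data:
            return None
        data = data[part]
    return data


def subject_key(request: Request, kind: str, param: Optional[str] = None) -> Optional[Any]:
    """Return the value requests are grouped by, or None if it is missing."""
    if kind == "remote_ip":
        return request.remote_ip
    if kind == "header":
        # Header names are case-insensitive, so compare in lowercase.
        wanted = (param or "").lower()
        return next((v for k, v in request.headers.items() if k.lower() == wanted), None)
    if kind == "authenticated_subject":
        return (request.principal or {}).get("subject")
    if kind == "url_path":
        return request.path
    if kind == "source_field":
        return resolve_path(request.principal, param or "")
    raise ValueError(f"unknown subject kind: {kind}")
```

For example, with a Principal whose source contains `key.meta.org_id`, calling `subject_key(request, "source_field", "source.key.meta.org_id")` yields the organization ID, so every key in that organization shares one counter. What the Sentinel does when the value is missing is not covered on this page; the sketch simply returns `None`.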

Response headers

When the Sentinel evaluates a rate limit, it includes the rate limit state in the response headers:
| Header | Description |
| --- | --- |
| X-RateLimit-Limit | Maximum number of requests allowed in the window |
| X-RateLimit-Remaining | Requests remaining in the current window |
| X-RateLimit-Reset | Unix timestamp (seconds) when the window resets |
| Retry-After | Seconds until the client can retry (only present on 429) |
The X-RateLimit-* headers appear on both successful and rate-limited responses, so your clients can monitor their usage proactively. Retry-After is added only to 429 responses.
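A client can read these headers to track how close it is to the limit. The following is a minimal sketch that assumes the headers arrive as a plain string-to-string mapping; `RateLimitState` and `parse_rate_limit_headers` are hypothetical names, not part of any Unkey SDK.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitState:
    limit: int                  # X-RateLimit-Limit
    remaining: int              # X-RateLimit-Remaining
    reset_at: int               # X-RateLimit-Reset, Unix timestamp in seconds
    retry_after: Optional[int]  # Retry-After in seconds, only set on 429


def parse_rate_limit_headers(headers: Mapping[str, str]) -> Optional[RateLimitState]:
    """Return the rate limit state, or None if headers are missing or malformed."""
    # Header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        retry = lowered.get("retry-after")
        return RateLimitState(
            limit=int(lowered["x-ratelimit-limit"]),
            remaining=int(lowered["x-ratelimit-remaining"]),
            reset_at=int(lowered["x-ratelimit-reset"]),
            retry_after=int(retry) if retry is not None else None,
        )
    except (KeyError, ValueError):
        return None


def seconds_until_reset(state: RateLimitState, now: float) -> float:
    """Time left in the current window, never negative."""
    return max(0.0, state.reset_at - now)
```

A client could, for instance, slow down once `remaining` drops below a threshold of its choosing and resume at full speed after `seconds_until_reset` has elapsed.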

Exceeded rate limit behavior

When a rate limit is exceeded, the Sentinel returns HTTP status 429 Too Many Requests with the Retry-After header and a JSON error body:
{
  "meta": { "requestId": "req_abc123" },
  "error": {
    "title": "Rate Limited",
    "detail": "Rate limit exceeded. Please try again later.",
    "status": 429,
    "type": "https://unkey.com/docs/errors/sentinel/rate-limited"
  }
}
Your app never sees rate-limited requests.
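On the client side, a 429 can be handled by waiting for the Retry-After interval and trying again. The sketch below is one possible approach, not a recommended or official client: `send` is a hypothetical callable that performs the request and returns `(status, headers, body)`, headers are assumed to be a dict keyed by the exact name `Retry-After`, and the one-second fallback delay is an arbitrary choice.

```python
import json
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], str]  # (status, headers, body text)


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send(), waiting Retry-After seconds between attempts on 429."""
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429 or attempt == max_attempts:
            return status, headers, body
        # Retry-After is documented as a number of seconds on 429 responses.
        sleep(float(headers.get("Retry-After", "1")))
    raise ValueError("max_attempts must be at least 1")


def describe_error(body: str) -> str:
    """Summarize the JSON error body shown above for logging."""
    payload = json.loads(body)
    error = payload["error"]
    request_id = payload.get("meta", {}).get("requestId", "unknown")
    return f'{error["status"]} {error["title"]}: {error["detail"]} (request {request_id})'
```

Passing `sleep` as a parameter keeps the function easy to test without real delays. Logging the `requestId` from the error body gives you a value to quote when investigating a rejected request.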
Last modified on March 30, 2026