Rate limiting

Unkey Deploy is in public beta. To try it, open the product switcher in the top-left of the dashboard and select Deploy. During beta, deployed resources are free. We’re eager for feedback, so let us know what you think on Discord, X, or email support@unkey.com.

The rate limiting policy enforces request limits on any route that matches the policy’s match expressions. Requests that exceed a configured limit receive a 429 response, protecting your app from traffic spikes and abuse. Each rate limit policy specifies a maximum number of requests within a time window (for example, 100 requests per 60 seconds) and an identifier that determines how the Sentinel groups requests for counting. Rate limit state is managed by Unkey’s distributed rate limiting service, so limits are consistent across multiple Sentinel replicas.
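To make the counting model concrete, here is a minimal in-memory sketch of a limit applied per identifier within a time window. It assumes a fixed-window algorithm, which this page does not confirm; the class and method names are hypothetical, and the real state lives in Unkey's distributed rate limiting service rather than in a local dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class FixedWindowLimiter:
    """Illustrative in-memory limiter; not Unkey's distributed service."""

    limit: int           # maximum requests allowed per window
    window_seconds: int  # window length in seconds
    # Maps (identifier, window index) -> requests counted so far.
    # Old windows are never pruned here, to keep the sketch short.
    counts: dict = field(default_factory=dict)

    def check(self, identifier: str, now: float) -> tuple[int, int, int]:
        """Count one request and return (status, remaining, reset_at)."""
        window_index = int(now // self.window_seconds)
        reset_at = (window_index + 1) * self.window_seconds
        key = (identifier, window_index)
        used = self.counts.get(key, 0)
        if used >= self.limit:
            # Over the limit: reject without counting the request.
            return 429, 0, reset_at
        self.counts[key] = used + 1
        return 200, self.limit - used - 1, reset_at


# Two requests per 60 seconds for one client IP (hypothetical values).
limiter = FixedWindowLimiter(limit=2, window_seconds=60)
results = [limiter.check("203.0.113.7", now=t)[0] for t in (0, 10, 20, 61)]
```

In this sketch the third request in the first window is rejected, and the request at second 61 succeeds because it falls into the next window.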

Configuration

You can create and manage rate limit policies from the Sentinel Policies page in your project dashboard. Each policy has the following settings:

Limit: the maximum number of requests allowed in the window
Window: the time window in which the limit applies (for example, 60 seconds)
Identifier: how the Sentinel determines which requests share a rate limit bucket
Match conditions: which requests the policy applies to (optional; an empty match list applies the policy to all requests)

Place authentication policies before rate limit policies in your policy list if you want to use an authenticated identifier (such as the authenticated subject or a principal field).
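The ordering rule can be sketched as a small check over a policy list. This is an illustration only: the `Policy` type, its `kind` values, and the identifier names are hypothetical and do not reflect how the dashboard stores policies.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical names for the identifiers that read from the Principal.
NEEDS_PRINCIPAL = {"authenticated_subject", "principal_field"}


@dataclass
class Policy:
    name: str
    kind: str                         # "auth" or "rate_limit" in this sketch
    identifier: Optional[str] = None  # only meaningful for rate limit policies


def misordered(policies: list[Policy]) -> list[str]:
    """Return names of rate limit policies that need a Principal
    but have no authentication policy earlier in the list."""
    seen_auth = False
    problems = []
    for policy in policies:
        if policy.kind == "auth":
            seen_auth = True
        elif (
            policy.kind == "rate_limit"
            and policy.identifier in NEEDS_PRINCIPAL
            and not seen_auth
        ):
            problems.append(policy.name)
    return problems


bad_order = [
    Policy("per-user", "rate_limit", "authenticated_subject"),
    Policy("api-keys", "auth"),
]
good_order = list(reversed(bad_order))
```

Here `misordered(bad_order)` flags `per-user`, because no Principal exists yet when that policy runs, while `misordered(good_order)` returns an empty list.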

Identifiers

The identifier determines how the Sentinel groups requests for counting:

Identifier	Description
Remote IP	Limit by client IP address. Effective for anonymous traffic but can over-limit users behind shared NATs or proxies.
Header value	Limit by a specific request header (for example, `X-Tenant-Id`). Use only when the header is set by a trusted upstream.
Authenticated subject	Limit by the authenticated Principal’s `subject` field. Requires an authentication policy earlier in the list.
URL path	Create separate limits per endpoint, useful for protecting expensive routes.
Principal field	Limit by a dotted-path field from the Principal (for example, `source.key.meta.org_id` for per-organization limits). Requires an authentication policy earlier in the list.
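As a rough illustration of how each identifier could map a request to a bucket key, the sketch below resolves all five identifier types, including the dotted-path lookup on the Principal. The request shape, the `kind` names, and the key format are assumptions made for this example; what the Sentinel does when a value is missing is not described on this page, so the sketch simply returns `None`.

```python
from typing import Any, Optional


def lookup_dotted(data: dict, path: str) -> Optional[Any]:
    """Follow a dotted path such as 'source.key.meta.org_id' through nested dicts."""
    current: Any = data
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def bucket_key(kind: str, request: dict, arg: str = "") -> Optional[str]:
    """Return the bucket key for a request, or None if the value is missing."""
    principal = request.get("principal") or {}
    if kind == "remote_ip":
        value = request.get("remote_ip")
    elif kind == "header":
        # Header names are case-insensitive, so compare in lowercase.
        headers = {k.lower(): v for k, v in request.get("headers", {}).items()}
        value = headers.get(arg.lower())
    elif kind == "subject":
        value = principal.get("subject")
    elif kind == "path":
        value = request.get("path")
    elif kind == "principal_field":
        value = lookup_dotted(principal, arg)
    else:
        raise ValueError(f"unknown identifier kind: {kind}")
    return None if value is None else f"{kind}:{value}"


# Hypothetical request, already authenticated by an earlier policy.
request = {
    "remote_ip": "203.0.113.7",
    "path": "/v1/reports",
    "headers": {"X-Tenant-Id": "acme"},
    "principal": {
        "subject": "user_123",
        "source": {"key": {"meta": {"org_id": "org_42"}}},
    },
}
org_bucket = bucket_key("principal_field", request, "source.key.meta.org_id")
```

Requests that resolve to the same key share one counter, which is why a per-organization key such as `org_42` limits every user in that organization together.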

Response headers

When the Sentinel evaluates a rate limit, it includes the rate limit state in the response headers:

Header	Description
`X-RateLimit-Limit`	Maximum number of requests allowed in the window
`X-RateLimit-Remaining`	Requests remaining in the current window
`X-RateLimit-Reset`	Unix timestamp (seconds) when the window resets
`Retry-After`	Seconds until the client can retry (only present on `429`)

These headers appear on both successful and rate-limited responses, so your clients can monitor their usage proactively. When multiple policies write rate limit headers (for example, a per-key limit from API key authentication and a standalone rate limit policy), the Sentinel keeps the most restrictive values.
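A client could read these headers along the following lines to track its own usage. This is a minimal sketch, not an official client: `RateLimitState` and `parse_rate_limit_headers` are hypothetical names, and the header values in the example are made up.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitState:
    limit: int                  # X-RateLimit-Limit
    remaining: int              # X-RateLimit-Remaining
    reset_at: int               # X-RateLimit-Reset (Unix seconds)
    retry_after: Optional[int]  # Retry-After, only present on 429

    def used_fraction(self) -> float:
        """Share of the window's budget already consumed (0.0 to 1.0)."""
        if self.limit <= 0:
            return 1.0
        return (self.limit - self.remaining) / self.limit


def parse_rate_limit_headers(headers: Mapping[str, str]) -> Optional[RateLimitState]:
    """Parse the rate limit headers; return None if any are missing or malformed."""
    h = {k.lower(): v for k, v in headers.items()}
    try:
        limit = int(h["x-ratelimit-limit"])
        remaining = int(h["x-ratelimit-remaining"])
        reset_at = int(h["x-ratelimit-reset"])
    except (KeyError, ValueError):
        return None
    retry = h.get("retry-after", "")
    return RateLimitState(
        limit, remaining, reset_at, int(retry) if retry.isdigit() else None
    )


state = parse_rate_limit_headers(
    {
        "X-RateLimit-Limit": "100",
        "X-RateLimit-Remaining": "25",
        "X-RateLimit-Reset": "1700000060",
    }
)
```

A client might slow down once `used_fraction()` passes a threshold of its choosing, rather than waiting for a `429`.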

Exceeded rate limit behavior

When a rate limit is exceeded, the Sentinel returns HTTP status `429 Too Many Requests` with the `Retry-After` header and a JSON error body:

{
  "meta": { "requestId": "req_abc123" },
  "error": {
    "title": "Rate Limited",
    "detail": "Rate limit exceeded. Please try again later.",
    "status": 429,
    "type": "https://unkey.com/docs/errors/sentinel/rate-limited"
  }
}

The Sentinel rejects rate-limited requests itself, so they never reach your app.
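On the client side, a `429` can be handled roughly as follows. The sketch assumes `Retry-After` carries whole seconds, as described above, and falls back to `X-RateLimit-Reset` if it is absent; the function names and the one-second last-resort delay are choices made for this example, not documented behavior.

```python
import json
from typing import Mapping


def retry_delay_seconds(status: int, headers: Mapping[str, str], now: float) -> float:
    """Return how long to wait before retrying; 0.0 if the request was not rate limited."""
    if status != 429:
        return 0.0
    h = {k.lower(): v for k, v in headers.items()}
    retry_after = h.get("retry-after", "")
    if retry_after.isdigit():
        return float(retry_after)
    reset = h.get("x-ratelimit-reset", "")
    if reset.isdigit():
        # Fall back to the window reset time, never returning a negative wait.
        return max(0.0, int(reset) - now)
    return 1.0  # arbitrary last-resort delay chosen for this sketch


def error_summary(body: str) -> str:
    """Build a short log line from the JSON error body shown above."""
    payload = json.loads(body)
    error = payload["error"]
    return f'{error["status"]} {error["title"]} (request {payload["meta"]["requestId"]})'


sample_body = """
{
  "meta": { "requestId": "req_abc123" },
  "error": {
    "title": "Rate Limited",
    "detail": "Rate limit exceeded. Please try again later.",
    "status": 429,
    "type": "https://unkey.com/docs/errors/sentinel/rate-limited"
  }
}
"""
delay = retry_delay_seconds(429, {"Retry-After": "12"}, now=0.0)
summary = error_summary(sample_body)
```

Logging the `requestId` alongside the delay makes it easier to correlate a throttled call with support requests later.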
