Configuration Reference
The Traffic Policy configuration reference for this action.
Supported Phases
on_http_request
Type
rate-limit
Configuration Fields
A name for this rate limit configuration. Must be fewer than 1024 characters.
Controls whether the rate limit is actively applied at runtime. When enabled, requests exceeding the limit are blocked or throttled as configured. When disabled, the system evaluates the rate limit but does not enforce it, allowing you to test configurations, gather metrics, or return a custom response.
The default value is true.
The rate limit algorithm to be used.
The maximum number of requests allowed to reach your upstream server.
The minimum capacity is 1 and the maximum capacity is 2,000,000,000.
The duration in which events may be limited based on the current capacity. Must be specified as a time duration that is a multiple of ten seconds (e.g., "90s", "10m").
The minimum value is "60s" and the maximum value is "24h".
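The capacity and duration bounds above can be checked before a policy is applied. The following is a minimal sketch, not part of the product: the function names are hypothetical, and it assumes durations are written as a single integer followed by one unit (s, m, or h), which covers the examples shown here but may not cover every accepted format.

```python
import re

# Bounds taken from the field descriptions above.
MIN_CAPACITY = 1
MAX_CAPACITY = 2_000_000_000
MIN_DURATION_SECONDS = 60          # "60s"
MAX_DURATION_SECONDS = 24 * 3600   # "24h"

# Assumption: a single integer followed by one unit (s, m, or h).
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}
_DURATION_RE = re.compile(r"^(\d+)([smh])$")


def parse_duration_seconds(text: str) -> int:
    """Convert a duration such as "90s" or "10m" to whole seconds."""
    match = _DURATION_RE.match(text)
    if match is None:
        raise ValueError(f"unsupported duration format: {text!r}")
    value, unit = match.groups()
    return int(value) * _UNIT_SECONDS[unit]


def check_limits(capacity: int, duration: str) -> list[str]:
    """Return a list of problems; an empty list means the values are in range."""
    problems = []
    if not MIN_CAPACITY <= capacity <= MAX_CAPACITY:
        problems.append("capacity out of range")
    seconds = parse_duration_seconds(duration)
    if seconds % 10 != 0:
        problems.append("duration is not a multiple of ten seconds")
    if not MIN_DURATION_SECONDS <= seconds <= MAX_DURATION_SECONDS:
        problems.append("duration out of range")
    return problems
```

For instance, `check_limits(30, "60s")` returns an empty list, while `check_limits(30, "65s")` reports that the duration is not a multiple of ten seconds.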
The elements of this collection define the unique key of a request to track the rate at which the capacity is being met.
Each bucket key is a CEL expression which includes all valid traffic policy variables and macros.
Up to ten bucket keys can be specified. For multiple buckets, the action will rate limit by each unique combination of buckets.
Behavior
Determining the Rate Limit Bucket
When this action is executed, information from the incoming HTTP request is used to determine which rate limit bucket the request falls into. Each bucket is defined by specific criteria through the bucket_key configuration field, such as client IP, request host, or a header value.
If the bucket has not exceeded its capacity, the request proceeds to the next
action in your policy configuration.
Multiple Buckets
If multiple bucket_key values are specified, the action will create a
unique rate limit bucket for each combination of the specified keys. For
example, if you have two bucket_key values, such as req.host and conn.client_ip,
all incoming requests that have the exact same combination of Host header and client IP
will be grouped into the same rate limit bucket. To rate limit separately with two different
buckets, you can create multiple rate-limit actions instead.
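The grouping described above can be illustrated with a small sketch. It is only a model of the behavior, not the action's implementation: real bucket keys are CEL expressions, whereas here they are assumed to be plain dictionary lookups, and `bucket_id` is a hypothetical helper.

```python
from collections import Counter


def bucket_id(request: dict, bucket_keys: list[str]) -> tuple:
    """Build one bucket identifier from every configured key, in order."""
    return tuple(request[key] for key in bucket_keys)


# Four made-up requests, described only by the two values used as keys.
requests = [
    {"req.host": "a.example", "conn.client_ip": "203.0.113.1"},
    {"req.host": "a.example", "conn.client_ip": "203.0.113.1"},
    {"req.host": "a.example", "conn.client_ip": "203.0.113.2"},
    {"req.host": "b.example", "conn.client_ip": "203.0.113.1"},
]

# Two keys in one action: one bucket per (host, client IP) combination.
combined = Counter(bucket_id(r, ["req.host", "conn.client_ip"]) for r in requests)

# One key only: every request for the same host shares a bucket.
by_host = Counter(bucket_id(r, ["req.host"]) for r in requests)
```

With these sample requests, `combined` holds three buckets and `by_host` holds two, which is why limiting by host and by client IP independently calls for two separate actions rather than two keys in one action.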
Rate Limit Exceeded
If the identified bucket has received more events than its capacity over the specified duration:
- The request is rejected with an HTTP 429 Too Many Requests status code.
- The retry-after header is included in the response, indicating the number of seconds after which the request may be retried.
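To make the rejection behavior concrete, here is a toy limiter. It is a sketch under stated assumptions rather than the action's actual algorithm: it uses a sliding-window log, it does not count rejected requests against the bucket, and it derives the retry-after value from the oldest admitted request. The class and method names are hypothetical.

```python
import math
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Toy sliding-window log: at most `capacity` events per `window` seconds."""

    def __init__(self, capacity: int, window: float):
        self.capacity = capacity
        self.window = window
        # bucket -> timestamps of admitted events, oldest first
        self._events = defaultdict(deque)

    def check(self, bucket, now: float):
        """Return (status, headers) for a request arriving at time `now`."""
        events = self._events[bucket]
        # Drop timestamps that have aged out of the window.
        while events and events[0] <= now - self.window:
            events.popleft()
        if len(events) < self.capacity:
            events.append(now)
            return 200, {}
        # The oldest admitted event leaves the window at events[0] + window.
        wait = math.ceil(events[0] + self.window - now)
        return 429, {"retry-after": str(max(wait, 1))}
```

With a capacity of 2 and a 60 second window, requests at t=0 and t=10 are admitted, and a third at t=20 is rejected with a retry-after of 40 seconds, the point at which the first request ages out.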
Capacity per Ingress Server
Currently, the capacity for each rate limit bucket is applied per ingress
server. This means that each server independently tracks the number of requests
and enforces the rate limits accordingly.
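One consequence of independent tracking is that the total number of requests admitted for a bucket depends on how traffic is spread across servers. The arithmetic can be sketched as follows; the function is hypothetical and assumes each server simply admits up to the capacity within one window.

```python
def admitted(requests_per_server: list[int], capacity: int) -> int:
    """Total requests admitted in one window when each server counts alone."""
    return sum(min(count, capacity) for count in requests_per_server)
```

With a capacity of 30, three servers that each receive 30 requests admit 90 in total, while a single server receiving all 100 requests admits only 30.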
Examples
Rate Limit by Host Header
The following Traffic Policy configuration demonstrates how to use the rate-limit
action to rate limit all incoming requests by the Host header.
Example Traffic Policy Document
Example Request
We make a request to httpbin.ngrok.app using the curl command and get back a 429
status code with a retry-after header telling us
the number of seconds we must wait before retrying the request.
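A client can use that header to decide how long to pause before retrying. The helper below is a hypothetical sketch: it assumes the header value is a whole number of seconds, as described above, and falls back to an arbitrary one-second wait (`DEFAULT_WAIT`, an illustrative choice) when the header is missing or not numeric.

```python
DEFAULT_WAIT = 1  # hypothetical fallback, in seconds


def seconds_to_wait(status: int, headers: dict) -> int:
    """Return how long to pause before retrying; 0 means no wait is needed."""
    if status != 429:
        return 0
    # Header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    value = lowered.get("retry-after", "")
    # Assumption: the value is a whole number of seconds.
    return int(value) if value.isdigit() else DEFAULT_WAIT
```

For a 429 response carrying `retry-after: 17`, the helper returns 17; for any other status it returns 0.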
Action Result Variables
The following variables are made available for use in subsequent expressions and CEL interpolations after the action has run. Variable values apply only to the last action execution; results are not concatenated.
The key used for bucketing requests. This is the key used to group and track requests in the rate-limiting process, ensuring that the same bucket is subject to the rate limit across multiple requests.
Indicates whether the request was limited by the rate limit. If true, the request was rate-limited based on the configured limits for the specified bucket.
A machine-readable code describing an error that occurred during the action’s execution.
A human-readable message providing details about an error that occurred during the action’s execution.