Rate limits + quotas
Every accepted ingest request flows through two gates: a per-second rate limit (token bucket) and a monthly event quota. Both are per-workspace. They are independent — a workspace can be rate-limited without exhausting its quota, and vice versa.
Per-second rate limit
The limiter is a token bucket with refill rate RATE_LIMIT_RPS and bucket capacity
RATE_LIMIT_BURST. Defaults: 100 requests per second (rps), burst of 200.
When the bucket is empty, the request returns:
    HTTP/1.1 429 Too Many Requests
    Retry-After: 1
    X-Ratelimit-Reason: per_second_rate_limit
    Content-Type: application/json

    { "error": { "code": "rate_limit_exceeded", "message": "per-second rate limit exceeded" } }

The SDK retry orchestrator treats 429 as transient and backs off (see
Retry). The Retry-After header is always 1 for this gate, because one
second of refill at the configured rate puts tokens back in the bucket.
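The token-bucket behaviour described above can be sketched as follows. This is a minimal illustration, not the service's implementation: the class name, the injectable clock, and the choice to start with a full bucket are assumptions made for the example. Only the refill rate and capacity parameters come from this page.

```python
import time


class TokenBucket:
    """Token bucket: refills at `rate` tokens per second, holds at most `burst`."""

    def __init__(self, rate: float, burst: float, clock=time.monotonic):
        self.rate = rate        # plays the role of RATE_LIMIT_RPS
        self.burst = burst      # plays the role of RATE_LIMIT_BURST
        self.clock = clock      # injectable so the bucket can be tested
        self.tokens = burst     # assumption: a new bucket starts full
        self.last = clock()

    def allow(self) -> bool:
        """Take one token if available; False corresponds to the 429 response."""
        now = self.clock()
        # Credit tokens for the elapsed time, capped at the bucket capacity.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

With the defaults (rate 100, burst 200), a full bucket in this sketch admits 200 back-to-back requests and rejects the next one. One second later it admits 100 more, which is consistent with a Retry-After of 1.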
Monthly quota
A per-workspace monthly event ceiling. Default: 10_000_000 events per
calendar month. Two thresholds:
- Soft ceiling (QUOTA_SOFT_PCT, default 80%): the request still succeeds, but a warning header is attached so operators can set up alerts. Header: X-Ratelimit-Reason: monthly_quota_soft.
- Hard ceiling (100%): subsequent requests return:
    HTTP/1.1 402 Payment Required
    X-Ratelimit-Reason: monthly_quota_exceeded
    Content-Type: application/json

    { "error": { "code": "monthly_quota_exceeded", "message": "workspace monthly event quota exhausted" } }

The SDK retry orchestrator treats 402 as permanent for the rest of
the month: there is no point retrying when the cause is a quota that
won’t reset until midnight on the 1st. This is a deliberate choice to
prevent retry storms from sites whose plan is over-limit.
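The two sides of the quota gate can be sketched in a few lines. This is a hedged illustration: the function names, the return labels, and the exact boundary comparisons (`>=`) are assumptions, and the real SDK's backoff policy is described on the Retry page, not here. `quota_state` shows the server-side threshold check, and `retry_decision` shows how a client might tell the transient 429 apart from the permanent 402.

```python
from typing import Mapping, Optional, Tuple


def quota_state(used: int, limit: int, soft_pct: int = 80) -> str:
    """Classify monthly usage as "ok", "soft", or "hard"."""
    if used >= limit:
        return "hard"
    # Integer arithmetic avoids float rounding at the soft boundary.
    if used * 100 >= limit * soft_pct:
        return "soft"
    return "ok"


def retry_decision(status: int, headers: Mapping[str, str]) -> Tuple[str, Optional[float]]:
    """Return (action, delay_seconds) for an ingest response."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    h = {k.lower(): v for k, v in headers.items()}
    if status == 429:
        # Transient: wait at least Retry-After seconds (1 for the per-second gate).
        return "retry", float(h.get("retry-after", "1"))
    if status == 402:
        # Permanent until the quota resets; retrying would only add load.
        return "give_up", None
    if 200 <= status < 300:
        # Accepted; surface the soft-ceiling warning if the header is present.
        if h.get("x-ratelimit-reason") == "monthly_quota_soft":
            return "warn", None
        return "ok", None
    # Statuses not covered on this page are left to the caller.
    return "unhandled", None
```

With the default ceiling of 10_000_000 events and a soft percentage of 80, `quota_state` in this sketch switches to "soft" at 8_000_000 events and to "hard" at 10_000_000.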
Per-workspace overrides
Defaults set via env are platform-wide. Per-workspace overrides live in
the workspaces row:
| Column | Purpose |
|---|---|
| rate_limit_rps | Per-second refill rate. NULL falls back to env default. |
| rate_limit_burst | Bucket capacity. NULL falls back to env default. |
| quota_monthly_events | Monthly hard ceiling. NULL falls back to env default. |
Operators set these via direct SQL.
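The NULL fallback in the table can be illustrated with a short resolver. This is a sketch under assumptions: the function name and the `PLATFORM_DEFAULTS` dictionary are hypothetical, and the defaults are hard-coded from the values quoted on this page instead of being read from the environment.

```python
from typing import Mapping, Optional

# Defaults quoted on this page; in the service they are set via env.
PLATFORM_DEFAULTS = {
    "rate_limit_rps": 100,
    "rate_limit_burst": 200,
    "quota_monthly_events": 10_000_000,
}


def effective_limits(
    row: Mapping[str, Optional[int]],
    defaults: Mapping[str, int] = PLATFORM_DEFAULTS,
) -> dict:
    """Resolve per-workspace limits: a NULL (None) column uses the default."""
    resolved = {}
    for column, default in defaults.items():
        value = row.get(column)
        # Compare against None explicitly so an override of 0 is not discarded.
        resolved[column] = default if value is None else value
    return resolved
```

For example, a workspaces row with rate_limit_rps set to 500 and the other two columns NULL resolves to 500 rps, a burst of 200, and a quota of 10_000_000 events.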
What gets counted
Every accepted event counts against the monthly quota, including events
that are subsequently dropped by the bot or internal-traffic filter (in
Drop mode they were “accepted” by the rate limiter and then dropped by
the filter, so they do count). DLQ rows also count.
What does NOT count:
- Requests rejected at auth / consent (4xx never reaches the counter).
- Health checks (/healthz, /metrics): they bypass the limiter entirely.
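The counting rules above reduce to one question: did the event get past the rate limiter? A minimal sketch, where the `Outcome` enum and its member names are hypothetical labels invented for this example and are not identifiers from the service:

```python
from enum import Enum


class Outcome(Enum):
    STORED = "stored"
    DROPPED_BY_FILTER = "dropped_by_filter"  # bot or internal-traffic filter in Drop mode
    DEAD_LETTERED = "dead_lettered"          # written as a DLQ row
    REJECTED_AUTH = "rejected_auth"          # 4xx before the counter
    REJECTED_CONSENT = "rejected_consent"    # 4xx before the counter
    HEALTH_CHECK = "health_check"            # /healthz, /metrics


# Outcomes that occur after the limiter has accepted the event.
_COUNTED = {Outcome.STORED, Outcome.DROPPED_BY_FILTER, Outcome.DEAD_LETTERED}


def counts_against_quota(outcome: Outcome) -> bool:
    """True if an event with this outcome is charged to the monthly quota."""
    return outcome in _COUNTED
```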
Performance
The token-bucket check is benchmarked at p99 < 1 µs, well inside the 5 ms ingest budget. The quota counter is incremented once per accepted event inside the same database transaction as the storage write, so there is no extra round trip.
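To illustrate what "same transaction" buys, here is a small sketch that uses SQLite from the Python standard library as a stand-in database. The table names, column names, and `store_event` function are hypothetical. The point is only that the counter increment and the storage write commit or roll back together.

```python
import sqlite3

# Hypothetical schema, for illustration only.
SCHEMA = """
CREATE TABLE events (workspace_id TEXT NOT NULL, payload TEXT NOT NULL);
CREATE TABLE quota_usage (workspace_id TEXT PRIMARY KEY, events INTEGER NOT NULL);
"""


def store_event(conn: sqlite3.Connection, workspace_id: str, payload: str) -> None:
    """Increment the quota counter and store the event in one transaction."""
    # `with conn` commits if the block succeeds and rolls back if it raises,
    # so the counter and the stored rows cannot drift apart.
    with conn:
        conn.execute(
            "UPDATE quota_usage SET events = events + 1 WHERE workspace_id = ?",
            (workspace_id,),
        )
        conn.execute(
            "INSERT INTO events (workspace_id, payload) VALUES (?, ?)",
            (workspace_id, payload),
        )
```

If the storage write fails (for instance on a constraint violation), the increment is rolled back with it, so a failed write is never charged to the quota in this sketch.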
Monitoring
Alerts to consider:
- Rate-limit drops sustained for more than 5 min for a single workspace: usually a rogue client or an under-tuned default. The syntarie_events_dropped_total{reason="rate_limit"} metric carries no per-workspace label by design (cardinality), so you must cross-check against operator logs to identify the workspace.
- Soft ceiling exceeded: alert per-workspace and contact the customer. Catching this before the hard ceiling avoids end-of-month surprises.
- Hard ceiling exceeded: alert per-workspace; the customer’s events are not flowing.
The structured-log line that fires on each rate-limit decision carries the workspace id at debug level for forensic analysis.
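Because the metric has no workspace label, identifying the offending workspace means aggregating those debug log lines. The sketch below assumes the logs are one JSON object per line. The field names `decision` and `workspace_id` and the value `rate_limited` are hypothetical, so adjust them to the actual log schema.

```python
import json
from collections import Counter
from typing import Iterable


def rate_limited_by_workspace(lines: Iterable[str]) -> Counter:
    """Count rate-limit rejections per workspace from JSON log lines."""
    counts: Counter = Counter()
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON lines such as stack traces
        if not isinstance(record, dict):
            continue
        # Hypothetical field names; the real log schema may differ.
        if record.get("decision") == "rate_limited" and "workspace_id" in record:
            counts[record["workspace_id"]] += 1
    return counts
```

Calling `most_common(1)` on the returned counter gives the workspace with the most rejections in the sampled window.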