Rate Limit

A throttling modifier you attach to any element to cap how fast it can be called — the cascade middleware enforces the limit on every incoming request, returns HTTP 429 when a key runs over, and resolves the strictest limit when inherited and local rules conflict.

Working with it

Selecting a Rate Limit reveals its settings in the properties panel; it has no dedicated full-screen workbench.

How it appears

The same element type rendered as a definition, a circle instance, and a live workspace card.

Rl (type): Rate Limit
Control request rates and protect against abuse
Tags: modifiers, modifier, definition

When to use / not

When to use

  • Protecting an element from abuse or traffic spikes — cap requests-per-window so a runaway caller can't overwhelm it.
  • Guarding expensive operations (LLM calls, file uploads, image generation) where each request carries real cost.
  • Enforcing fair-usage or tier-based quotas per user, org, or API key across everything attached below it.
  • Showing callers their remaining quota before they hit a 429, via the live `status` operation keyed on user/IP.

When not to use

  • Deciding *who* may call an element at all — that is authentication/authorization; use auth-policy, which evaluates first (order 50) before rate-limit (order 100).
  • Metering spend or enforcing a wallet balance — rate-limit caps request frequency, not cost; billing is handled by the AU ledger, not this modifier.
  • Smoothing or retrying work that overflows — rate-limit rejects with 429; queue the overflow with a queue element instead of expecting the limiter to buffer it.

Topology

Attaches to another element as a modifier, shaping that element's behaviour rather than running on its own.

Properties

limit (integer)
Simple format: max requests per window (use with window_seconds)
window_seconds (integer)
Simple format: time window in seconds (use with limit)
limits (array)
Rate limit rules (array format). Prefer simple format (limit + window_seconds) for single rules.
key (object)
Rate limit key configuration
scope (object)
Rate limit scope
response (object)
Response when rate limited
exemptions (array)
Exempted identities

Capabilities

Defined for this element
  • Evaluate
  • Storage
  • Observe

Operations

  • attach (POST)
  • delete (DELETE)
  • detach (POST)
  • disable (POST)
  • enable (POST)
  • evaluate (POST)
  • get (GET)
  • get_attached_modifiers (GET)
  • intention (GET)
  • list_attachments (GET)
  • readme_update (POST)
  • reset (POST)
  • schema (GET)
  • status (GET)
  • update (PATCH)

Ports

Inputs

  • limits (config)
  • burst (config)
  • response (config)

Errors / when it fails

  • Rate limit by header requires key.header_name
  • At least one rate limit rule must be defined (fails unless len(limits) > 0)

Validation rules

  • IP-based rate limiting may affect users behind NAT/proxies
  • No exemptions configured - all traffic is rate limited

Rate Limit (rate-limit)

Category: modifiers | Form: | Symbol: Rl

Control request rates and protect against abuse

Limits request throughput on attached elements. Cascade strategy: restrictive. When inherited and local rate limits conflict, the lower (more restrictive) requests_per_minute wins. Evaluation order is 100, after auth-policy at 50. Applies to all element types. Fails with HTTP 429.

The spec accepts two formats: the array format {limits: [{requests: N, window_seconds: N, burst: N}]} or the simple format {limit: N, window_seconds: N}. Both are normalized to requests_per_minute during cascade resolution.

Use the reset operation to clear counters, and status to check the remaining quota for a key. Note that direct evaluate calls are stateless preview checks; actual enforcement happens at the cascade middleware layer, on incoming requests to attached elements.

Common mistake: setting window_seconds to 0. The per-minute normalization cannot divide by a zero window, so it falls back to multiplying limit by 60.
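
The normalization and restrictive cascade described above can be sketched roughly as follows. This is an illustration, not the platform's resolver: the function names, the formula requests * 60 / window_seconds, and the choice to let the strictest rule govern a multi-rule spec are all assumptions. Only the limit * 60 fallback for a zero window comes from the note above.

```python
from typing import Any


def to_requests_per_minute(requests: int, window_seconds: int) -> float:
    """Normalize one rule to requests per minute (assumed formula)."""
    if window_seconds <= 0:
        # Per the note above, a zero window falls back to limit * 60.
        return float(requests * 60)
    return requests * 60 / window_seconds


def normalize_spec(spec: dict[str, Any]) -> float:
    """Accept the simple or the array format and return one per-minute rate."""
    if "limits" in spec:
        rules = [(r["requests"], r["window_seconds"]) for r in spec["limits"]]
    else:
        rules = [(spec["limit"], spec["window_seconds"])]
    # Assumption: when a spec holds several rules, the strictest one governs.
    return min(to_requests_per_minute(n, w) for n, w in rules)


def resolve_restrictive(inherited: dict[str, Any], local: dict[str, Any]) -> float:
    """Restrictive cascade: the lower requests_per_minute wins."""
    return min(normalize_spec(inherited), normalize_spec(local))


inherited = {"limit": 1000, "window_seconds": 60}
local = {"limits": [{"requests": 50, "window_seconds": 10, "burst": 20}]}
print(resolve_restrictive(inherited, local))  # local rule is stricter: 300.0
```

Here the local rule of 50 requests per 10 seconds normalizes to 300 per minute, which is lower than the inherited 1000 per minute, so it wins.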

Guide

Overview

A Rate Limit resource defines request throttling rules to protect services from abuse, ensure fair usage, and enforce billing tier limits. Rate limits can be applied globally, per-endpoint, per-user, or per-organization.

Why Rate Limits Exist

  • Protection: Prevent DDoS and abuse
  • Fair Usage: Ensure equitable resource distribution
  • Billing Enforcement: Enforce tier-based usage limits
  • Cost Control: Limit expensive operations (LLM calls, etc.)
  • Stability: Prevent cascading failures from traffic spikes

Directory Structure

api-limits/
├── README.md
├── .triform/
│   ├── triform.yaml       # kind: config/rate-limit
│   ├── contract.yaml
│   ├── spec.yaml
│   └── schema.json
└── examples/
    └── tiered-limits.yaml

Creating a Rate Limit

$ triform create config/rate-limit api-limits

Configuration

Basic Rate Limit

kind: config/rate-limit
slug: api-limits

spec:
  # Default limit for all endpoints
  default:
    requests: 1000
    window: "1m"
    burst: 50

Per-Endpoint Limits

spec:
  default:
    requests: 1000
    window: "1m"

  endpoints:
    # Expensive LLM endpoints
    "/api/v1/llm/*":
      requests: 100
      window: "1m"
      cost_weight: 10  # Counts as 10 requests for quota

    # File uploads
    "/api/v1/files/upload":
      requests: 50
      window: "1h"
      max_size_mb: 100

    # Public read endpoints (more generous)
    "/api/v1/public/*":
      requests: 10000
      window: "1m"

Tier-Based Limits

spec:
  default:
    requests: 100
    window: "1m"

  tiers:
    free:
      requests: 100
      window: "1m"
      daily_cap: 1000

    pro:
      requests: 1000
      window: "1m"
      daily_cap: 50000

    enterprise:
      requests: 10000
      window: "1m"
      daily_cap: null  # Unlimited

User and Org Limits

spec:
  # Per-user limits
  per_user:
    requests: 100
    window: "1m"

  # Per-organization limits (shared across all org members)
  per_org:
    requests: 5000
    window: "1m"

  # Per-API-key limits
  per_api_key:
    requests: 1000
    window: "1m"

Burst Handling

spec:
  default:
    requests: 100
    window: "1m"

    # Allow short bursts above limit
    burst:
      size: 50          # Extra requests allowed
      refill_rate: 10   # Tokens/second added back

    # Token bucket algorithm
    algorithm: "token-bucket"

Cost-Weighted Limits

Some operations should count more toward the limit:

spec:
  default:
    requests: 1000
    window: "1m"

  cost_weights:
    # LLM completions cost more
    "/api/v1/llm/complete":
      weight: 10
      # Also consider token count
      dynamic_weight:
        header: "X-Token-Count"
        multiplier: 0.001  # 1000 tokens = 1 extra unit

    # Image generation is expensive
    "/api/v1/images/generate":
      weight: 50
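
As a rough sketch of how such weights could translate into quota consumption, the function below assumes an additive model: the static weight plus the token count scaled by the multiplier. The function name and the additive combination are assumptions; the source only states that 1000 tokens add one extra unit at a multiplier of 0.001.

```python
def request_cost(weight: int = 1, token_count: int = 0, multiplier: float = 0.0) -> float:
    """Units one request consumes from the quota (assumed additive model)."""
    return weight + token_count * multiplier


# An LLM completion with weight 10 and 1000 tokens reported in X-Token-Count.
llm_cost = request_cost(weight=10, token_count=1000, multiplier=0.001)

# An image generation request with a static weight only.
image_cost = request_cost(weight=50)

print(llm_cost, image_cost)
```

Under this model, a 1000-unit window would cover far fewer LLM or image requests than plain ones, which is the point of weighting.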

Response Headers

Rate limit info included in responses:

X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 847
X-RateLimit-Reset: 1704631200
X-RateLimit-Policy: "1000;w=60"
Retry-After: 42
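
A client can use these headers to decide how long to pause. The helper below is a minimal sketch with a hypothetical name; it assumes X-RateLimit-Reset is a Unix timestamp in seconds and Retry-After is a number of seconds, which matches the sample values above but is not confirmed by the source.

```python
import time
from typing import Optional


def seconds_to_wait(headers: dict[str, str], now: Optional[float] = None) -> float:
    """Return how long a client should pause before its next request."""
    # Quota left: no need to wait. A missing header is treated as "quota left".
    if int(headers.get("X-RateLimit-Remaining", "1")) > 0:
        return 0.0
    # Prefer the explicit hint when the server sends one.
    if "Retry-After" in headers:
        return float(headers["Retry-After"])
    # Otherwise derive the wait from the reset timestamp (assumed Unix seconds).
    now = time.time() if now is None else now
    reset_at = float(headers.get("X-RateLimit-Reset", now))
    return max(0.0, reset_at - now)


sample = {
    "X-RateLimit-Limit": "1000",
    "X-RateLimit-Remaining": "847",
    "X-RateLimit-Reset": "1704631200",
}
print(seconds_to_wait(sample))
```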

Exceeded Behavior

spec:
  default:
    requests: 100
    window: "1m"

  on_exceed:
    # HTTP status code
    status: 429

    # Response body
    body:
      error: "rate_limit_exceeded"
      message: "Too many requests. Please retry after {retry_after} seconds."
      retry_after: "{seconds_until_reset}"

    # Optional: queue instead of reject
    queue:
      enabled: false
      max_wait_seconds: 30
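
On the caller's side, a 429 with Retry-After is usually handled by pausing and retrying. The sketch below shows one way to do that; call_with_retry, the Response alias, and the injected send and sleep callables are hypothetical stand-ins so the example runs without a network, not part of any SDK.

```python
import time
from typing import Callable

# (status code, headers): a stand-in for a real HTTP response object.
Response = tuple[int, dict[str, str]]


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send(), pausing for Retry-After seconds whenever it returns 429."""
    response = send()
    for _ in range(max_attempts - 1):
        status, headers = response
        if status != 429:
            break
        # Fall back to one second if the server sent no Retry-After header.
        sleep(float(headers.get("Retry-After", "1")))
        response = send()
    return response
```

If every attempt is rejected, the last 429 response is returned so the caller can surface the error rather than loop forever.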

Rate Limit Strategies

Sliding Window (Default)

More accurate but slightly more expensive:

spec:
  algorithm: "sliding-window"
  # Considers requests in rolling time window
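
For intuition, a sliding-window limiter can be sketched as a log of request timestamps. This is a generic textbook form with a hypothetical class name, not the platform's implementation; the per-request log is what makes this strategy more expensive than a fixed window.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests in any rolling `window_seconds` span."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._timestamps: deque = deque()

    def allow(self, now: float) -> bool:
        # Drop timestamps that have rolled out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.limit:
            return False
        self._timestamps.append(now)
        return True


limiter = SlidingWindowLimiter(limit=2, window_seconds=60)
print([limiter.allow(t) for t in (0.0, 1.0, 30.0, 60.0)])
```

The third request is rejected because two requests already fall inside the last 60 seconds; the fourth is accepted once the first has aged out.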

Fixed Window

Simpler, resets at window boundary:

spec:
  algorithm: "fixed-window"
  # Resets every minute on the minute
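
A fixed-window counter can be sketched as below. Again this is a generic illustration with a hypothetical class name. It also shows the known trade-off of this strategy: requests clustered around a window boundary can briefly exceed the nominal rate.

```python
class FixedWindowLimiter:
    """Count requests per aligned window; the count resets at each boundary."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._window_index = None
        self._count = 0

    def allow(self, now: float) -> bool:
        # Windows are aligned to multiples of window_seconds.
        index = int(now // self.window_seconds)
        if index != self._window_index:
            self._window_index = index
            self._count = 0
        if self._count >= self.limit:
            return False
        self._count += 1
        return True


limiter = FixedWindowLimiter(limit=2, window_seconds=60)
# Two requests just before the boundary and two just after are all accepted.
print([limiter.allow(t) for t in (58.0, 59.0, 59.5, 60.0, 61.0)])
```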

Token Bucket

Best for bursty traffic:

spec:
  algorithm: "token-bucket"
  bucket:
    capacity: 100
    refill_rate: 10  # tokens/second
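
The token-bucket idea behind capacity and refill_rate can be sketched as follows. The class is a generic illustration, not the platform's code; the assumptions that the bucket starts full and that each request costs one token are mine.

```python
class TokenBucket:
    """Bucket holds up to `capacity` tokens and refills at `refill_rate` per second."""

    def __init__(self, capacity: float, refill_rate: float, now: float = 0.0) -> None:
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = capacity  # assumption: the bucket starts full
        self._last = now

    def allow(self, now: float, cost: float = 1.0) -> bool:
        # Add the tokens earned since the last call, capped at capacity.
        elapsed = max(0.0, now - self._last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self._last = max(self._last, now)
        if self.tokens < cost:
            return False
        self.tokens -= cost
        return True


bucket = TokenBucket(capacity=100, refill_rate=10)
burst = sum(bucket.allow(0.0) for _ in range(120))
print(burst)  # only the first 100 of the 120 burst requests fit
```

A full bucket absorbs a burst of up to capacity requests at once, after which throughput settles at the refill rate.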

Leaky Bucket

Smooth output rate:

spec:
  algorithm: "leaky-bucket"
  bucket:
    capacity: 100
    leak_rate: 10  # requests/second processed
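
One common way to model a leaky bucket is as a meter: each request raises a level that drains at leak_rate, and a request is rejected when it would overflow capacity. The sketch below uses that meter form with a hypothetical class name. Whether the platform rejects overflow or holds requests and releases them at the leak rate is not stated in the source.

```python
class LeakyBucket:
    """Meter-style leaky bucket: the level drains at `leak_rate` per second."""

    def __init__(self, capacity: float, leak_rate: float, now: float = 0.0) -> None:
        self.capacity = capacity
        self.leak_rate = leak_rate
        self.level = 0.0
        self._last = now

    def allow(self, now: float) -> bool:
        # Drain whatever has leaked out since the last call.
        elapsed = max(0.0, now - self._last)
        self.level = max(0.0, self.level - elapsed * self.leak_rate)
        self._last = max(self._last, now)
        if self.level + 1 > self.capacity:
            return False
        self.level += 1
        return True


bucket = LeakyBucket(capacity=2, leak_rate=1.0)
print([bucket.allow(t) for t in (0.0, 0.0, 0.0, 1.0, 1.0)])
```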

SDK Usage

Checking Limits Programmatically

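import time

from triform import rate_limits

# Check remaining quota before making the request.
# current_user and make_request() are placeholders supplied by the caller.
limit_info = rate_limits.check("api-limits", user=current_user)
if limit_info.remaining > 0:
    make_request()
else:
    # Back off for the suggested number of seconds before trying again
    time.sleep(limit_info.retry_after)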

Custom Keys

# Rate limit by custom key (e.g., IP address)
rate_limits.check("api-limits", key=f"ip:{request.ip}")

# Rate limit by resource
rate_limits.check("api-limits", key=f"resource:{resource_id}")

Monitoring

Rate limits emit metrics:

  • rate_limit_requests_total - Total requests checked
  • rate_limit_exceeded_total - Requests that exceeded limit
  • rate_limit_remaining - Current remaining quota
  • rate_limit_queue_size - Requests waiting in queue

Runtime Behavior

Property | Value
Cascade | restrictive: strictest limit wins across scopes
Eval Order | 100
Phase | request
Fail Action | HTTP 429 (Too Many Requests) with Retry-After header
Applies To | actors, frontend

Files in This Resource

  • README.md - Documentation
  • .triform/triform.yaml - Metadata
  • .triform/contract.yaml - Capabilities
  • .triform/spec.yaml - Configuration

Capabilities

  • sliding-window: Sliding window rate limiting
  • multi-key: Rate limit by IP, user, API key
  • burst: Burst allowance
  • headers: Rate limit headers (X-RateLimit-*)

Properties

Property | Type | Default | Description
limit | integer | | Simple format: max requests per window (use with window_seconds)
window_seconds | integer | | Simple format: time window in seconds (use with limit)
limits | array | [{"burst":20,"requests":100,"window_seconds":60}] | Rate limit rules (array format). Prefer simple format (limit + window_seconds) for single rules.
key | object | | Rate limit key configuration
scope | object | | Rate limit scope
response | object | | Response when rate limited
exemptions | array | [] | Exempted identities

Operations

attach

POST /ops/attach | Auth: Read

Attach this modifier to a target element

Attaches this modifier to a target element. The target_id must be a UUID of an existing element that supports this modifier type (check applies_to in definition.yaml). Priority controls evaluation order when multiple modifiers of the same type are attached — lower priority runs first. The attachment is stored in element_modifiers table. Cascade resolution runs at bond-time to merge this modifier into the target’s resolved config. Common mistake: attaching to an incompatible element type — check topology rules first.
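
To illustrate how ordering might play out, the sketch below sorts attachments first by the type-level evaluation order (auth-policy at 50, rate-limit at 100) and then by attachment priority, lower first. The Attachment class and the combined sort key are assumptions made for illustration; the source states the two ordering rules separately and does not describe the actual resolver.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attachment:
    modifier_type: str
    eval_order: int  # type-level order, e.g. auth-policy 50, rate-limit 100
    priority: int    # per-attachment priority; lower runs first


def evaluation_sequence(attachments: list) -> list:
    """Order by type-level eval order, then by priority within the same type."""
    return sorted(attachments, key=lambda a: (a.eval_order, a.priority))


attached = [
    Attachment("rate-limit", 100, 20),
    Attachment("auth-policy", 50, 10),
    Attachment("rate-limit", 100, 5),
]
for item in evaluation_sequence(attached):
    print(item.modifier_type, item.priority)
```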

delete

DELETE /ops/delete | Auth: Admin

Delete element (soft delete)

Soft delete — sets state to ‘deleted’ but retains the record. Cannot delete elements that have children (has_no_bond precondition) or active runs. Requires admin auth and confirmation.

detach

POST /ops/detach | Auth: Read

Detach this modifier from a target element

Removes this modifier from a target element. Requires the target_id. Pervasive modifiers (audit, policy) can only be detached at the level they were originally attached — inherited pervasive modifiers cannot be detached by child elements. After detach, cascade resolution re-runs to remove this modifier’s effect from the resolved config.

disable

POST /ops/disable | Auth: Admin

Disable element (hides and prevents use)

Idempotent — safe to call on already-disabled elements. Optionally pass a reason string. Disabled elements cannot be invoked or executed. Inverse of enable.

enable

POST /ops/enable | Auth: Admin

Enable element (makes usable and visible)

Idempotent — safe to call on already-enabled elements. Transitions element to ready/enabled state. Cannot enable deleted elements. Inverse of disable.

evaluate

POST /ops/evaluate | Auth: Read

Preview rate-limit evaluation (stateless — does NOT affect actual counters)

IMPORTANT: This is a PREVIEW-ONLY operation. It performs a stateless check against the current rate-limit config (given the supplied context) and returns what the decision WOULD be if this request hit the real middleware. It does NOT increment counters, consume quota, or affect the live rate-limit state in any way, so two requests in quick succession both see the same remaining value. Actual enforcement happens in the cascade middleware layer on incoming requests to attached elements, NOT here.

If this rate-limit is not yet attached to any target, calling evaluate still succeeds (stateless preview), but nothing is actually being throttled in production. The response includes a _note field when delivered_to == 0, warning that the modifier is unattached.

To test real enforcement end-to-end:

  1. POST /ops/attach with target_id → actually wire the modifier to an element
  2. Deploy / reload the target element
  3. Send real requests to the target element’s operations
  4. Observe 429 responses when limits are exceeded
  5. Use status to inspect live counters, reset to clear them

Common confusion: users think evaluate consumes their quota. It does not. Users also think evaluate reflects live counter state. It does not — it previews the decision based on the provided context, not the current counter.
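
The difference between the stateless preview and live enforcement can be made concrete with a toy in-memory model. Everything here is a sketch for illustration: the class, its method bodies, and the request_count context field are hypothetical and do not reflect the platform's request or response shapes.

```python
from typing import Optional


class RateLimitModel:
    """Toy model: evaluate previews a decision, enforce consumes quota."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self._counters: dict = {}

    def evaluate(self, context: dict) -> dict:
        # Stateless: decides from the supplied context only, never touches counters.
        used = context.get("request_count", 0)  # hypothetical context field
        remaining = max(0, self.limit - used)
        return {"allowed": remaining > 0, "remaining": remaining}

    def enforce(self, key: str) -> int:
        # What the middleware does on a real request: check, then increment.
        used = self._counters.get(key, 0)
        if used >= self.limit:
            return 429
        self._counters[key] = used + 1
        return 200

    def status(self, key: str) -> dict:
        # Live view of the same counter that enforce increments.
        return {"limit": self.limit, "remaining": self.limit - self._counters.get(key, 0)}

    def reset(self, key: Optional[str] = None) -> None:
        if key is None:
            self._counters.clear()
        else:
            self._counters.pop(key, None)


model = RateLimitModel(limit=2)
print([model.enforce("user-1") for _ in range(3)])  # third call is rejected
print(model.evaluate({}))       # preview is unaffected by the live counter
print(model.status("user-1"))   # live counter shows the quota is used up
```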

get

GET /ops/get | Auth: Read

Get element details

Element is already resolved by the routing layer — this returns the cached element, not a fresh DB query. Use the path /api/{circle}/{slug} to address elements.

get_attached_modifiers

GET /ops/attached/{target_id} | Auth: Read

Get all modifiers attached to a target element

Lists all modifiers attached to a specific target element, including modifier_id, type, subcategory, and priority. Useful for debugging cascade resolution or understanding which policies apply to an element before invoking it.

intention

GET /ops/intention | Auth: Read

Get element intention with full inheritance chain

Returns three levels: direct (this element’s intention), inherited (from category and root), and resolved (final merged intention). Useful for understanding an element’s purpose in context of its hierarchy.

list_attachments

GET /ops/targets | Auth: Read

List all elements this modifier is attached to

Returns all target elements where this modifier is currently applied. Shows target_id, target_type, priority, and cascade_policy.

readme_update

POST /ops/readme_update | Auth: Write

Update element README.md content

Creates or overwrites README.md in the element’s git repo. Commits to the draft branch. Content must be provided as a markdown string.

reset

POST /ops/reset | Auth: Admin

Reset rate limit counter for a key

Clears rate limit counters. Pass key to reset a specific user/IP, or omit to reset all. Requires admin auth. Counters are ephemeral (in-memory/Redis) so this is non-destructive. Use when a user was incorrectly rate-limited or during testing.

schema

GET /ops/schema | Auth: Read

Get element input/output schema (MCP tools/list compatible)

Returns type-level port schemas from the TypeRegistry — not instance-specific overrides. Includes direction (input/output), required flag, and JSON schema per port. Useful for understanding what data an element accepts and produces.

status

GET /ops/status | Auth: Read

Get current rate limit status for a key (LIVE counter state)

Returns LIVE remaining requests, limit, window duration, and reset time for a given key. Pass ?key=<user_id|ip> as query param. Unlike evaluate (which is stateless preview), status reflects the actual current counter state stored in the rate-limit backend — the same counter that the cascade middleware increments on incoming requests. Limit rules use ‘requests’ (not ‘max_requests’) and ‘window_seconds’ fields. Useful for showing users their remaining quota before they hit 429, or for debugging rate-limit state in production. Counters are ephemeral (in-memory/NATS KV) and reset when the window expires. Use reset to manually clear.
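
Building the status request with a properly encoded key might look like the sketch below. It assumes the operation path /ops/status is appended to the element path /api/{circle}/{slug}, which the source does not state explicitly; the base URL, circle name, and function name are placeholders.

```python
from urllib.parse import quote, urlencode


def status_url(base_url: str, circle: str, slug: str, key: str) -> str:
    """Build the status URL for one rate-limit key (assumed path layout)."""
    path = f"/api/{quote(circle)}/{quote(slug)}/ops/status"
    # urlencode escapes characters such as ':' in keys like "ip:203.0.113.7".
    return f"{base_url.rstrip('/')}{path}?{urlencode({'key': key})}"


print(status_url("https://example.invalid", "acme", "api-limits", "ip:203.0.113.7"))
```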

update

PATCH /ops/update | Auth: Write

Update element

Partial update — send only the fields you want to change. spec, name, and intention are all independently optional. spec MUST be a JSON object when present; deep-merged into the existing spec by default. Empty {"spec":{}} preserves existing spec content but still records a new version (no-op for content, not for version state). To clear/replace the entire spec wholesale send {"spec":{...},"deep":false}. List-typed spec fields use replace semantics (the patch list replaces the existing list, no array merging). Coordinates Git + DB writes. Slug cannot be changed after creation.
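
The merge rules above (objects deep-merged, lists replaced, wholesale replacement with deep set to false) can be sketched as a small function. This is an illustration of the described semantics, not the server's code; the function name and the response fields in the sample spec are hypothetical.

```python
from copy import deepcopy
from typing import Any


def apply_spec_patch(existing: dict[str, Any], patch: dict[str, Any], deep: bool = True) -> dict[str, Any]:
    """Return the spec that results from applying a patch to an existing spec."""
    if not deep:
        # deep=false: the patch replaces the entire spec.
        return deepcopy(patch)
    merged = deepcopy(existing)
    for field, value in patch.items():
        if isinstance(value, dict) and isinstance(merged.get(field), dict):
            merged[field] = apply_spec_patch(merged[field], value)
        else:
            # Scalars and lists use replace semantics (no array merging).
            merged[field] = deepcopy(value)
    return merged


existing = {
    "limits": [{"requests": 100, "window_seconds": 60, "burst": 20}],
    "response": {"status": 429, "message": "slow down"},  # hypothetical fields
}
patch = {
    "limits": [{"requests": 10, "window_seconds": 60}],
    "response": {"message": "retry later"},
}
print(apply_spec_patch(existing, patch))
```

Note that the patched limits list loses the burst value entirely, because the patch list replaces the stored list rather than merging into it.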

Error Codes

Code | Class | Retryable | Description
RATE_LIMIT_EXCEEDED | limit | yes | Rate limit exceeded
RATE_LIMIT_CONFIG_INVALID | validation | no | Invalid rate limit configuration

Lifecycle / runtime

Defined for this element

Execution model: sync

Observability

Defined for this element

Metrics

  • evaluation_count
  • rejection_count

Events

  • rate-limit.evaluated
  • rate-limit.rejected

Pricing / cost

Platform default

Operation costs

  • create: free
  • update: free
  • delete: free
  • get: free
  • list: free
  • invoke: 10000 micro-AU
  • tool_use: free

Set it up

Rate Limits (string)
Add rate limit rules