A throttling modifier you attach to any element to cap how fast it can be called. The cascade middleware enforces the limit on every incoming request, returns HTTP 429 when a key exceeds its limit, and applies the strictest limit when inherited and local rules conflict.

Rate Limit (rate-limit)

Category: modifiers | Form: | Symbol: Rl

Control request rates and protect against abuse

Limits request throughput on attached elements.

Cascade strategy: restrictive. When inherited and local rate limits conflict, the lower (more restrictive) requests_per_minute wins. Evaluation order is 100 (after auth-policy at 50). Applies to all element types. Fails with HTTP 429.

The spec accepts two formats: the array format {limits: [{requests: N, window_seconds: N, burst: N}]} or the simple format {limit: N, window_seconds: N}. Both are normalized to requests_per_minute during cascade resolution.

Use the reset operation to clear counters and the status operation to check the remaining quota for a key. Note: direct evaluate calls are stateless preview checks; actual rate limiting enforcement happens at the cascade middleware layer on incoming requests to attached elements.

Common mistake: setting window_seconds to 0. Normalization divides by window_seconds, so a zero window cannot be converted normally and the rate defaults to limit multiplied by 60.
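The normalization and restrictive cascade described above can be sketched in a few lines. This is an illustration only, not the platform's code: the function names are hypothetical, the formula requests × 60 / window_seconds is an assumed reading of "normalized to requests_per_minute", and taking the strictest rule of an array-format spec is also an assumption.

```python
from typing import Any, Mapping


def to_requests_per_minute(spec: Mapping[str, Any]) -> float:
    """Normalize either spec format to requests per minute (assumed formula)."""
    if "limits" in spec:
        # Array format: {"limits": [{"requests": N, "window_seconds": N, "burst": N}]}
        rules = [(r["requests"], r["window_seconds"]) for r in spec["limits"]]
    else:
        # Simple format: {"limit": N, "window_seconds": N}
        rules = [(spec["limit"], spec["window_seconds"])]

    rates = []
    for requests, window_seconds in rules:
        if window_seconds <= 0:
            # Zero window: fall back to limit * 60, as described in the text.
            rates.append(float(requests * 60))
        else:
            rates.append(requests * 60 / window_seconds)
    # Assumption: when a spec has several rules, the strictest one represents it.
    return min(rates)


def resolve_restrictive(inherited: Mapping[str, Any], local: Mapping[str, Any]) -> float:
    """Restrictive cascade: the lower requests_per_minute wins."""
    return min(to_requests_per_minute(inherited), to_requests_per_minute(local))
```

For example, an inherited {limit: 1000, window_seconds: 60} combined with a local rule of 10 requests per second resolves to the local rule, because 600 requests per minute is the lower rate.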

Guide

Overview

A Rate Limit resource defines request throttling rules to protect services from abuse, ensure fair usage, and enforce billing tier limits. Rate limits can be applied globally, per-endpoint, per-user, or per-organization.

Why Rate Limits Exist

Protection: Prevent DDoS and abuse
Fair Usage: Ensure equitable resource distribution
Billing Enforcement: Enforce tier-based usage limits
Cost Control: Limit expensive operations (LLM calls, etc.)
Stability: Prevent cascading failures from traffic spikes

Directory Structure

api-limits/
├── README.md
├── .triform/
│   ├── triform.yaml       # kind: config/rate-limit
│   ├── contract.yaml
│   ├── spec.yaml
│   └── schema.json
└── examples/
    └── tiered-limits.yaml

Creating a Rate Limit

$ triform create config/rate-limit api-limits

Configuration

Basic Rate Limit

kind: config/rate-limit
slug: api-limits

spec:
  # Default limit for all endpoints
  default:
    requests: 1000
    window: "1m"
    burst: 50

Per-Endpoint Limits

spec:
  default:
    requests: 1000
    window: "1m"

  endpoints:
    # Expensive LLM endpoints
    "/api/v1/llm/*":
      requests: 100
      window: "1m"
      cost_weight: 10  # Counts as 10 requests for quota

    # File uploads
    "/api/v1/files/upload":
      requests: 50
      window: "1h"
      max_size_mb: 100

    # Public read endpoints (more generous)
    "/api/v1/public/*":
      requests: 10000
      window: "1m"
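To make the pattern matching concrete, here is a rough sketch of how a request path could be mapped to one of the rules above. It assumes shell-style wildcards and that the longest matching pattern is the most specific one; the actual matching and precedence rules are not stated in this guide, and `rule_for` is a hypothetical helper.

```python
from fnmatch import fnmatchcase

DEFAULT = {"requests": 1000, "window": "1m"}
ENDPOINTS = {
    "/api/v1/llm/*": {"requests": 100, "window": "1m", "cost_weight": 10},
    "/api/v1/files/upload": {"requests": 50, "window": "1h", "max_size_mb": 100},
    "/api/v1/public/*": {"requests": 10000, "window": "1m"},
}


def rule_for(path: str) -> dict:
    """Return the endpoint rule matching `path`, or the default rule."""
    matches = [pattern for pattern in ENDPOINTS if fnmatchcase(path, pattern)]
    if not matches:
        return DEFAULT
    # Assumption: the longest matching pattern is treated as the most specific.
    return ENDPOINTS[max(matches, key=len)]
```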

Tier-Based Limits

spec:
  default:
    requests: 100
    window: "1m"

  tiers:
    free:
      requests: 100
      window: "1m"
      daily_cap: 1000

    pro:
      requests: 1000
      window: "1m"
      daily_cap: 50000

    enterprise:
      requests: 10000
      window: "1m"
      daily_cap: null  # Unlimited

User and Org Limits

spec:
  # Per-user limits
  per_user:
    requests: 100
    window: "1m"

  # Per-organization limits (shared across all org members)
  per_org:
    requests: 5000
    window: "1m"

  # Per-API-key limits
  per_api_key:
    requests: 1000
    window: "1m"

Burst Handling

spec:
  default:
    requests: 100
    window: "1m"

    # Allow short bursts above limit
    burst:
      size: 50          # Extra requests allowed
      refill_rate: 10   # Tokens/second added back

    # Token bucket algorithm
    algorithm: "token-bucket"

Cost-Weighted Limits

Some operations should count more toward the limit:

spec:
  default:
    requests: 1000
    window: "1m"

  cost_weights:
    # LLM completions cost more
    "/api/v1/llm/complete":
      weight: 10
      # Also consider token count
      dynamic_weight:
        header: "X-Token-Count"
        multiplier: 0.001  # 1000 tokens = 1 extra unit

    # Image generation is expensive
    "/api/v1/images/generate":
      weight: 50
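As a worked illustration of the weights above, the sketch below computes how many quota units a single request would consume. It assumes the dynamic weight is added on top of the static weight (which is what "1000 tokens = 1 extra unit" suggests) and that unlisted endpoints cost one unit; `request_cost` is a hypothetical helper, not an SDK function.

```python
from typing import Mapping

COST_WEIGHTS = {
    "/api/v1/llm/complete": {
        "weight": 10,
        "dynamic_weight": {"header": "X-Token-Count", "multiplier": 0.001},
    },
    "/api/v1/images/generate": {"weight": 50},
}


def request_cost(path: str, headers: Mapping[str, str]) -> float:
    """Return the number of quota units one request to `path` consumes."""
    rule = COST_WEIGHTS.get(path)
    if rule is None:
        return 1.0  # assumption: unlisted endpoints count as one request
    cost = float(rule["weight"])
    dynamic = rule.get("dynamic_weight")
    if dynamic:
        # Extra units scale with the value reported in the configured header.
        cost += float(headers.get(dynamic["header"], "0")) * dynamic["multiplier"]
    return cost
```

Under these assumptions, an LLM completion that reports 2000 tokens costs 10 + 2000 × 0.001 = 12 units.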

Response Headers

Rate limit information is included in responses:

X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 847
X-RateLimit-Reset: 1704631200
X-RateLimit-Policy: "1000;w=60"
Retry-After: 42
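A client can use these headers to decide how long to pause before its next request. The following is a minimal sketch, assuming the headers arrive as a plain mapping with the exact names shown above, that Retry-After is given in seconds, and that X-RateLimit-Reset is a Unix timestamp in seconds; `seconds_to_wait` is a hypothetical helper.

```python
import time
from typing import Mapping, Optional


def seconds_to_wait(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return how long a client should pause before sending another request."""
    # Quota left: no need to wait.
    if int(headers.get("X-RateLimit-Remaining", "1")) > 0:
        return 0.0
    # Prefer Retry-After when the server sent it (assumed to be in seconds).
    if "Retry-After" in headers:
        return float(headers["Retry-After"])
    # Otherwise fall back to the reset timestamp (assumed Unix epoch seconds).
    now = time.time() if now is None else now
    reset = float(headers.get("X-RateLimit-Reset", now))
    return max(0.0, reset - now)
```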

Exceeded Behavior

spec:
  default:
    requests: 100
    window: "1m"

  on_exceed:
    # HTTP status code
    status: 429

    # Response body
    body:
      error: "rate_limit_exceeded"
      message: "Too many requests. Please retry after {retry_after} seconds."
      retry_after: "{seconds_until_reset}"

    # Optional: queue instead of reject
    queue:
      enabled: false
      max_wait_seconds: 30

Rate Limit Strategies

Sliding Window (Default)

More accurate than a fixed window, but slightly more expensive to compute:

spec:
  algorithm: "sliding-window"
  # Considers requests in rolling time window
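For intuition, a sliding window can be modeled as a log of recent request timestamps. This is a generic sketch of the technique, not the platform's implementation; the class name is hypothetical and the caller supplies the clock value so the behavior is easy to follow.

```python
from collections import deque


class SlidingWindowLimiter:
    """Sliding-window log: count requests in the last `window_seconds`."""

    def __init__(self, requests: int, window_seconds: float) -> None:
        self.requests = requests
        self.window_seconds = window_seconds
        self._hits = deque()  # timestamps of accepted requests, oldest first

    def allow(self, now: float) -> bool:
        # Drop timestamps that have left the rolling window.
        while self._hits and now - self._hits[0] >= self.window_seconds:
            self._hits.popleft()
        if len(self._hits) < self.requests:
            self._hits.append(now)
            return True
        return False
```

The extra cost comes from storing one timestamp per accepted request, where a fixed window needs only a single counter.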

Fixed Window

Simpler, resets at window boundary:

spec:
  algorithm: "fixed-window"
  # Resets every minute on the minute
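A fixed window needs only one counter per key, which is why it is the simpler option. The sketch below is a generic illustration of the technique with a hypothetical class name and a caller-supplied clock, not the platform's code.

```python
class FixedWindowLimiter:
    """Fixed window: one counter that resets at each window boundary."""

    def __init__(self, requests: int, window_seconds: float) -> None:
        self.requests = requests
        self.window_seconds = window_seconds
        self._window_index = None  # which window the counter belongs to
        self._count = 0

    def allow(self, now: float) -> bool:
        index = int(now // self.window_seconds)
        if index != self._window_index:
            # A new window has started: reset the counter.
            self._window_index = index
            self._count = 0
        if self._count < self.requests:
            self._count += 1
            return True
        return False
```

One known trade-off: a client can send a full quota just before a boundary and another full quota just after it, so up to twice the limit can pass within a short span.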

Token Bucket

Best for bursty traffic:

spec:
  algorithm: "token-bucket"
  bucket:
    capacity: 100
    refill_rate: 10  # tokens/second
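The capacity and refill_rate settings map directly onto the classic token bucket algorithm. Here is a minimal sketch of that algorithm, assuming the bucket starts full and the caller passes the current time; the class is hypothetical and only illustrates the technique.

```python
class TokenBucket:
    """Token bucket: spend one token per request, refill at a steady rate."""

    def __init__(self, capacity: float, refill_rate: float, now: float = 0.0) -> None:
        self.capacity = capacity
        self.refill_rate = refill_rate  # tokens added per second
        self.tokens = capacity          # assumption: the bucket starts full
        self.updated = now

    def allow(self, now: float, cost: float = 1.0) -> bool:
        # Add the tokens earned since the last call, capped at capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

With capacity 100 and refill_rate 10, a client can burst 100 requests at once and then sustain 10 requests per second.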

Leaky Bucket

Smooth output rate:

spec:
  algorithm: "leaky-bucket"
  bucket:
    capacity: 100
    leak_rate: 10  # requests/second processed
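One way to picture the leaky bucket is as a bounded queue that drains at a fixed rate. The sketch below models the queue variant of the algorithm: accepted requests are given a processing time spaced at least 1 / leak_rate apart, and requests are rejected while the bucket already holds capacity pending requests. The class and method names are hypothetical, and whether the platform queues or simply meters requests is not stated here.

```python
from collections import deque
from typing import Optional


class LeakyBucket:
    """Leaky bucket (queue variant): requests leave at a steady rate."""

    def __init__(self, capacity: int, leak_rate: float) -> None:
        self.capacity = capacity
        self.interval = 1.0 / leak_rate  # seconds between processed requests
        self._pending = deque()          # processing times still in the bucket
        self._last = None                # processing time of the last accepted request

    def submit(self, now: float) -> Optional[float]:
        """Return the time the request will be processed, or None if rejected."""
        # Requests whose processing time has passed have leaked out.
        while self._pending and self._pending[0] <= now:
            self._pending.popleft()
        if len(self._pending) >= self.capacity:
            return None  # bucket is full
        start = now if self._last is None else max(now, self._last + self.interval)
        self._last = start
        self._pending.append(start)
        return start
```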

SDK Usage

Checking Limits Programmatically

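import time

from triform import rate_limits

# Check the remaining quota before making the request.
# `current_user` and `make_request` stand in for your application's own objects.
limit_info = rate_limits.check("api-limits", user=current_user)
if limit_info.remaining > 0:
    make_request()
else:
    # Pause until the limiter reports that quota is available again.
    time.sleep(limit_info.retry_after)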

Custom Keys

# Rate limit by custom key (e.g., IP address)
rate_limits.check("api-limits", key=f"ip:{request.ip}")

# Rate limit by resource
rate_limits.check("api-limits", key=f"resource:{resource_id}")

Monitoring

Rate limits emit metrics:

rate_limit_requests_total - Total requests checked
rate_limit_exceeded_total - Requests that exceeded limit
rate_limit_remaining - Current remaining quota
rate_limit_queue_size - Requests waiting in queue

Runtime Behavior

Property	Value
Cascade	restrictive — strictest limit wins across scopes
Eval Order	100
Phase	request
Fail Action	HTTP 429 (Too Many Requests) with `Retry-After` header
Applies To	actors, frontend

Files in This Resource

README.md - Documentation
.triform/triform.yaml - Metadata
.triform/contract.yaml - Capabilities
.triform/spec.yaml - Configuration

Capabilities

sliding-window: Sliding window rate limiting
multi-key: Rate limit by IP, user, API key
burst: Burst allowance
headers: Rate limit headers (X-RateLimit-*)

Properties

Property	Type	Default	Description
`limit`	integer	—	Simple format: max requests per window (use with window_seconds)
`window_seconds`	integer	—	Simple format: time window in seconds (use with limit)
`limits`	array	`[{"burst":20,"requests":100,"window_seconds":60}]`	Rate limit rules (array format). Prefer simple format (limit + window_seconds) for single rules.
`key`	object	—	Rate limit key configuration
`scope`	object	—	Rate limit scope
`response`	object	—	Response when rate limited
`exemptions`	array	`[]`	Exempted identities

Operations

`attach`

POST /ops/attach | Auth: Read

Attach this modifier to a target element

Attaches this modifier to a target element. The target_id must be a UUID of an existing element that supports this modifier type (check applies_to in definition.yaml). Priority controls evaluation order when multiple modifiers of the same type are attached — lower priority runs first. The attachment is stored in element_modifiers table. Cascade resolution runs at bond-time to merge this modifier into the target’s resolved config. Common mistake: attaching to an incompatible element type — check topology rules first.

`delete`

DELETE /ops/delete | Auth: Admin

Delete element (soft delete)

Soft delete — sets state to ‘deleted’ but retains the record. Cannot delete elements that have children (has_no_bond precondition) or active runs. Requires admin auth and confirmation.

`detach`

POST /ops/detach | Auth: Read

Detach this modifier from a target element

Removes this modifier from a target element. Requires the target_id. Pervasive modifiers (audit, policy) can only be detached at the level they were originally attached — inherited pervasive modifiers cannot be detached by child elements. After detach, cascade resolution re-runs to remove this modifier’s effect from the resolved config.

`disable`

POST /ops/disable | Auth: Admin

Disable element (hides and prevents use)

Idempotent — safe to call on already-disabled elements. Optionally pass a reason string. Disabled elements cannot be invoked or executed. Inverse of enable.

`enable`

POST /ops/enable | Auth: Admin

Enable element (makes usable and visible)

Idempotent — safe to call on already-enabled elements. Transitions element to ready/enabled state. Cannot enable deleted elements. Inverse of disable.

`evaluate`

POST /ops/evaluate | Auth: Read

Preview rate-limit evaluation (stateless — does NOT affect actual counters)

IMPORTANT: This is a PREVIEW-ONLY operation. It performs a stateless check against the current rate-limit config (given the supplied context) and returns what the decision WOULD be if this request hit the real middleware. It does NOT increment counters, consume quota, or affect the live rate-limit state in any way. Two requests in quick succession will both see the same remaining value.

Actual enforcement happens in the cascade middleware layer on incoming requests to attached elements, NOT here. If this rate-limit is not yet attached to any target, calling evaluate will still succeed (stateless preview), but nothing is actually being throttled in production. The response will include a _note field when delivered_to == 0 warning that the modifier is unattached.

To test real enforcement end-to-end:

1. POST /ops/attach with target_id to wire the modifier to an element
2. Deploy / reload the target element
3. Send real requests to the target element’s operations
4. Observe 429 responses when limits are exceeded
5. Use status to inspect live counters and reset to clear them

Common confusion: users think evaluate consumes their quota. It does not. Users also think evaluate reflects live counter state. It does not — it previews the decision based on the provided context, not the current counter.
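The difference between evaluate, status, and real enforcement can be shown with a toy in-memory model. Everything here is hypothetical: the class, its method names, and the `assumed_used` context field are invented for illustration and do not reflect the actual request or response schema.

```python
from typing import Optional


class RateLimitModel:
    """Toy in-memory model of the three behaviors; not the platform's code."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.counters = {}  # key -> requests counted in the current window

    def evaluate(self, assumed_used: int) -> dict:
        # Stateless preview: decides from the supplied context only.
        return {
            "allowed": assumed_used < self.limit,
            "remaining": max(0, self.limit - assumed_used),
        }

    def handle_request(self, key: str) -> int:
        # Stands in for the cascade middleware: the only place counters change.
        used = self.counters.get(key, 0)
        if used >= self.limit:
            return 429
        self.counters[key] = used + 1
        return 200

    def status(self, key: str) -> dict:
        # Live view of the same counter the middleware increments.
        return {"limit": self.limit, "remaining": self.limit - self.counters.get(key, 0)}

    def reset(self, key: Optional[str] = None) -> None:
        # Clear one key, or every counter when no key is given.
        if key is None:
            self.counters.clear()
        else:
            self.counters.pop(key, None)
```

In this model, calling evaluate any number of times leaves status unchanged, while each call to handle_request lowers the remaining value until it returns 429.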

`get`

GET /ops/get | Auth: Read

Get element details

Element is already resolved by the routing layer — this returns the cached element, not a fresh DB query. Use the path /api/{circle}/{slug} to address elements.

`get_attached_modifiers`

GET /ops/attached/{target_id} | Auth: Read

Get all modifiers attached to a target element

Lists all modifiers attached to a specific target element, including modifier_id, type, subcategory, and priority. Useful for debugging cascade resolution or understanding which policies apply to an element before invoking it.

`intention`

GET /ops/intention | Auth: Read

Get element intention with full inheritance chain

Returns three levels: direct (this element’s intention), inherited (from category and root), and resolved (final merged intention). Useful for understanding an element’s purpose in context of its hierarchy.

`list_attachments`

GET /ops/targets | Auth: Read

List all elements this modifier is attached to

Returns all target elements where this modifier is currently applied. Shows target_id, target_type, priority, and cascade_policy.

`readme_update`

POST /ops/readme_update | Auth: Write

Update element README.md content

Creates or overwrites README.md in the element’s git repo. Commits to the draft branch. Content must be provided as a markdown string.

`reset`

POST /ops/reset | Auth: Admin

Reset rate limit counter for a key

Clears rate limit counters. Pass key to reset a specific user/IP, or omit to reset all. Requires admin auth. Counters are ephemeral (in-memory/Redis) so this is non-destructive. Use when a user was incorrectly rate-limited or during testing.

`schema`

GET /ops/schema | Auth: Read

Get element input/output schema (MCP tools/list compatible)

Returns type-level port schemas from the TypeRegistry — not instance-specific overrides. Includes direction (input/output), required flag, and JSON schema per port. Useful for understanding what data an element accepts and produces.

`status`

GET /ops/status | Auth: Read

Get current rate limit status for a key (LIVE counter state)

Returns LIVE remaining requests, limit, window duration, and reset time for a given key. Pass ?key=<user_id|ip> as query param. Unlike evaluate (which is stateless preview), status reflects the actual current counter state stored in the rate-limit backend — the same counter that the cascade middleware increments on incoming requests. Limit rules use ‘requests’ (not ‘max_requests’) and ‘window_seconds’ fields. Useful for showing users their remaining quota before they hit 429, or for debugging rate-limit state in production. Counters are ephemeral (in-memory/NATS KV) and reset when the window expires. Use reset to manually clear.

`update`

PATCH /ops/update | Auth: Write

Update element

Partial update — send only the fields you want to change. spec, name, and intention are all independently optional. spec MUST be a JSON object when present; deep-merged into the existing spec by default. Empty {"spec":{}} preserves existing spec content but still records a new version (no-op for content, not for version state). To clear/replace the entire spec wholesale send {"spec":{...},"deep":false}. List-typed spec fields use replace semantics (the patch list replaces the existing list, no array merging). Coordinates Git + DB writes. Slug cannot be changed after creation.
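The merge semantics described above (deep merge for objects, replace for lists, wholesale replace with "deep": false) can be sketched as follows. This is an approximation for illustration: `apply_spec_patch` is a hypothetical helper, and the fields inside `response` in the example are made up.

```python
from copy import deepcopy


def apply_spec_patch(existing: dict, patch: dict, deep: bool = True) -> dict:
    """Return the spec that results from applying `patch` to `existing`."""
    if not deep:
        # "deep": false replaces the entire spec with the patch.
        return deepcopy(patch)
    merged = deepcopy(existing)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Objects are merged recursively.
            merged[key] = apply_spec_patch(merged[key], value)
        else:
            # Scalars and lists replace the existing value (no array merging).
            merged[key] = deepcopy(value)
    return merged


existing = {
    "limits": [{"requests": 100, "window_seconds": 60, "burst": 20}],
    "response": {"status": 429, "message": "slow down"},  # illustrative fields
}
patch = {
    "limits": [{"requests": 10, "window_seconds": 60}],
    "response": {"message": "retry later"},
}
updated = apply_spec_patch(existing, patch)
```

Here `updated["response"]` keeps its status and gains the new message, while `updated["limits"]` contains only the single rule from the patch.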

Error Codes

Code	Class	Retryable	Description
`RATE_LIMIT_EXCEEDED`	limit	yes	Rate limit exceeded
`RATE_LIMIT_CONFIG_INVALID`	validation	no	Invalid rate limit configuration
