A response-phase guard that scans what your actors say on the way out, redacting forbidden terms and anonymizing sensitive ones, so an agent's tool output is scrubbed before it reaches a caller. By default, matches are redacted rather than failing the request.

Filter Words (filter-words)

Category: modifiers | Form: | Symbol: Fw

Filter forbidden and anonymize sensitive words in agent tool outputs

Filters forbidden words and anonymizes sensitive terms in agent tool outputs. Phase: response (processes output after execution). Evaluation order 50. Applies to actors only. Cascade behavior: union — inherited and local filter lists are merged (all forbidden words from all levels apply). Fail action: redact (replaces matched words rather than rejecting the entire response). Spec defines forbidden (blocked term rules), anonymized (words replaced with placeholders), and case_sensitive flag. Use filter-words for content safety in agent outputs; use validation for input schema checking. Common mistake: expecting filter-words to work on request input — it only runs on response output (phase: response).

Guide

Overview

A Filter Words modifier defines lists of prohibited or sensitive words and phrases. Actor responses are scanned for these terms before they are returned to callers, and matched content is redacted or replaced rather than causing a hard failure.

Why Filter Words Exists

Content Safety: Prevent sensitive or offensive terms from appearing in responses
Compliance: Redact regulated or inappropriate content (e.g., PII, profanity) before delivery
Brand Protection: Block competitor names or restricted terminology
Layered Defense: Works alongside auth-policy and protection for defense in depth

Configuration

Basic Example

element_type: filter-words
slug: content-filter
name: Content Filter

spec:
  # Exact words or phrases to redact
  words:
    - "internal-codename"
    - "confidential"

  # Regex patterns to redact
  patterns:
    - "\\b\\d{4}-\\d{4}-\\d{4}-\\d{4}\\b"  # Credit card numbers
    - "\\b[A-Z]{2}\\d{6}\\b"                  # Passport-style IDs

  # Replacement string (the Properties table lists the default as "[FILTERED]")
  replacement: "[REDACTED]"

  # Case-sensitive matching (default: false, so matching ignores case)
  case_sensitive: false
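
As a rough illustration of what a spec like the one above asks for, the following Python sketch applies exact words and regex patterns in a single pass. It is not the platform's implementation: the function names `build_matcher` and `redact` are hypothetical, and it assumes exact words match as substrings, which the source does not specify.

```python
import re


def build_matcher(words, patterns, case_sensitive=False):
    """Compile exact phrases and regex patterns into one alternation."""
    flags = 0 if case_sensitive else re.IGNORECASE
    # Exact words are escaped so they match literally; patterns are used as-is.
    parts = [re.escape(w) for w in words] + [f"(?:{p})" for p in patterns]
    return re.compile("|".join(parts), flags) if parts else None


def redact(text, words, patterns, replacement, case_sensitive=False):
    """Return text with every match replaced by the replacement string."""
    matcher = build_matcher(words, patterns, case_sensitive)
    if matcher is None:
        return text
    # A callable keeps the replacement literal (no backslash templates).
    return matcher.sub(lambda _match: replacement, text)


# Mirrors the Basic Example; YAML "\\b" becomes \b in a Python raw string.
spec = {
    "words": ["internal-codename", "confidential"],
    "patterns": [r"\b\d{4}-\d{4}-\d{4}-\d{4}\b", r"\b[A-Z]{2}\d{6}\b"],
    "replacement": "[REDACTED]",
    "case_sensitive": False,
}

print(redact("Confidential: card 1234-5678-9012-3456", **spec))
# prints "[REDACTED]: card [REDACTED]"
```

Note that in this sketch case-insensitive matching also applies to the regex patterns, so `[A-Z]` would match lowercase letters while `case_sensitive` is false.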

Category Presets

spec:
  # Enable built-in category presets
  presets:
    - pii           # Names, emails, phone numbers, SSNs
    - credit_cards  # Card number patterns
    - profanity     # Common profanity list

  replacement: "***"

Allow List

spec:
  words:
    - "password"

  # Never redact these even if they match a pattern
  allowlist:
    - "reset-password-guide"
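
One plausible reading of the allow list is that a match is left alone when it falls entirely inside an allowlisted phrase. The Python sketch below shows that interpretation; the function name `redact_with_allowlist` and the span-containment rule are assumptions made for illustration, not documented behavior.

```python
import re


def redact_with_allowlist(text, words, allowlist, replacement="[REDACTED]"):
    """Redact words unless the match sits inside an allowlisted phrase."""
    if not words:
        return text

    # Character spans covered by allowlisted phrases are protected.
    protected = [
        match.span()
        for phrase in allowlist
        for match in re.finditer(re.escape(phrase), text, re.IGNORECASE)
    ]

    def substitute(match):
        start, end = match.span()
        inside = any(ps <= start and end <= pe for ps, pe in protected)
        return match.group(0) if inside else replacement

    matcher = re.compile("|".join(re.escape(w) for w in words), re.IGNORECASE)
    return matcher.sub(substitute, text)


print(redact_with_allowlist(
    "See reset-password-guide; never share your password.",
    words=["password"],
    allowlist=["reset-password-guide"],
))
# prints "See reset-password-guide; never share your [REDACTED]."
```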

Cascade Behavior

Filter word lists from all attached scopes are unioned — every word and pattern from every scope applies. There is no way for a child element to remove a word added by a parent scope.
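
The union rule can be pictured with a small Python sketch. The helper `union_cascade` and the dictionary shape are hypothetical; the only behavior taken from the text above is that every scope's entries survive and a child cannot remove a parent's entry.

```python
def union_cascade(*scopes):
    """Merge filter lists from outermost to innermost scope.

    Duplicates are dropped and first-seen order is kept, so a child scope
    can only add entries, never remove one inherited from a parent.
    """
    merged = {"words": [], "patterns": []}
    for scope in scopes:
        for key in merged:
            for item in scope.get(key, []):
                if item not in merged[key]:
                    merged[key].append(item)
    return merged


parent = {"words": ["confidential"], "patterns": [r"\b\d{4}-\d{4}-\d{4}-\d{4}\b"]}
child = {"words": ["internal-codename", "confidential"]}

resolved = union_cascade(parent, child)
print(resolved["words"])
# prints ['confidential', 'internal-codename']
```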

Response Transformation

When matches are found, the response body is rewritten with replacements before delivery. The HTTP status code is not changed — the response succeeds but with redacted content. This is different from middleware modifiers that can abort a request.
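
To make the distinction concrete, here is a minimal Python sketch in which only the body changes and the status code passes through untouched. The `Response` type and `apply_filter` function are invented for this example and do not reflect the platform's internal types.

```python
import re
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Response:
    status: int
    body: str


def apply_filter(response, words, replacement="[REDACTED]"):
    """Rewrite the body with replacements; leave the status code as it was."""
    if not words:
        return response
    matcher = re.compile("|".join(re.escape(w) for w in words), re.IGNORECASE)
    new_body = matcher.sub(lambda _match: replacement, response.body)
    return replace(response, body=new_body)


filtered = apply_filter(Response(200, "the confidential plan"), ["confidential"])
print(filtered.status, filtered.body)
# prints "200 the [REDACTED] plan"
```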

Files

README.md - Documentation
.triform/definition.yaml - Element type definition
.triform/properties.yaml - Configurable properties
.triform/contract.yaml - Bonds and capabilities
.triform/ops.yaml - Operations

Runtime Behavior

Property	Value
Cascade	union — all word lists from all scopes are combined
Eval Order	50
Phase	response
Fail Action	redact matched content (no HTTP error — response succeeds with replacements)
Applies To	actors

Relationships

Attaches to: circle

Capabilities

word-blocking: Block tool output containing forbidden words
word-redaction: Redact forbidden words from tool output
anonymization: Replace sensitive words with consistent anonymous placeholders

Properties

Property	Type	Default	Description
`forbidden`	array	`[]`	Forbidden words with block or redact actions
`anonymized`	array	`[]`	Words to anonymize with consistent placeholders ([ANON-1], [ANON-2], etc.)
`case_sensitive`	boolean	`false`	Whether word matching is case-sensitive
`words`	array	`[]`	Words or phrases to filter (used with action and replacement)
`action`	string	`"redact"`	Action when word is found: redact (replace), reject (block), or warn (log and continue)
`replacement`	string	`"[FILTERED]"`	Replacement string when masking words
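
The "consistent placeholders" wording for `anonymized` suggests that the same term always maps to the same [ANON-N] token. The Python sketch below shows one way that could work; numbering placeholders by position in the configured list is an assumption, and the function name `anonymize` is hypothetical.

```python
import re


def anonymize(text, anonymized, case_sensitive=False):
    """Replace each configured term with a stable [ANON-N] placeholder."""
    terms = list(dict.fromkeys(anonymized))  # de-duplicate, keep list order
    if not terms:
        return text, {}
    # Assumption: N is the term's 1-based position in the configured list.
    placeholders = {term: f"[ANON-{n}]" for n, term in enumerate(terms, start=1)}

    # Longest terms first, so a short term never shadows a longer one.
    ordered = sorted(terms, key=len, reverse=True)
    flags = 0 if case_sensitive else re.IGNORECASE
    matcher = re.compile(
        "|".join(f"(?P<t{i}>{re.escape(t)})" for i, t in enumerate(ordered)),
        flags,
    )

    def substitute(match):
        # The named group that matched identifies the configured term.
        return placeholders[ordered[int(match.lastgroup[1:])]]

    return matcher.sub(substitute, text), placeholders


scrubbed, mapping = anonymize("Alice emailed Bob, then alice called Bob.", ["Alice", "Bob"])
print(scrubbed)
# prints "[ANON-1] emailed [ANON-2], then [ANON-1] called [ANON-2]."
```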

Operations

`attach`

Post /ops/attach | Auth: Read

Attach this modifier to a target element

Attaches this modifier to a target element. The target_id must be the UUID of an existing element that supports this modifier type (check applies_to in definition.yaml). Priority controls evaluation order when multiple modifiers of the same type are attached: lower priority runs first. The attachment is stored in the element_modifiers table. Cascade resolution runs at bond-time to merge this modifier into the target’s resolved config. Common mistake: attaching to an incompatible element type. Check topology rules first.
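
The two rules above (target_id must be a UUID, lower priority runs first) can be sketched in Python. The JSON body shape and the helper names `build_attach_payload` and `evaluation_sequence` are assumptions for illustration; only the field names target_id, priority, and modifier_id come from this document.

```python
import json
import uuid


def build_attach_payload(target_id, priority=0):
    """Build a request body for attach, rejecting malformed UUIDs early."""
    uuid.UUID(target_id)  # raises ValueError when target_id is not a UUID
    return json.dumps({"target_id": target_id, "priority": priority})


def evaluation_sequence(attachments):
    """Order same-type modifiers the way the text describes: lowest first."""
    ordered = sorted(attachments, key=lambda a: a["priority"])
    return [a["modifier_id"] for a in ordered]


payload = build_attach_payload("123e4567-e89b-12d3-a456-426614174000", priority=10)
order = evaluation_sequence([
    {"modifier_id": "strict-filter", "priority": 20},
    {"modifier_id": "base-filter", "priority": 5},
])
print(order)
# prints ['base-filter', 'strict-filter']
```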

`delete`

Delete /ops/delete | Auth: Admin

Delete element (soft delete)

Soft delete — sets state to ‘deleted’ but retains the record. Cannot delete elements that have children (has_no_bond precondition) or active runs. Requires admin auth and confirmation.

`detach`

Post /ops/detach | Auth: Read

Detach this modifier from a target element

Removes this modifier from a target element. Requires the target_id. Pervasive modifiers (audit, policy) can only be detached at the level they were originally attached — inherited pervasive modifiers cannot be detached by child elements. After detach, cascade resolution re-runs to remove this modifier’s effect from the resolved config.

`disable`

Post /ops/disable | Auth: Admin

Disable element (hides and prevents use)

Idempotent — safe to call on already-disabled elements. Optionally pass a reason string. Disabled elements cannot be invoked or executed. Inverse of enable.

`enable`

Post /ops/enable | Auth: Admin

Enable element (makes usable and visible)

Idempotent — safe to call on already-enabled elements. Transitions element to ready/enabled state. Cannot enable deleted elements. Inverse of disable.

`evaluate`

Post /ops/evaluate | Auth: Read

Evaluate text against the filter — return matches, blocked status, and sanitized text

Pass {text: "…"} to check the text against the configured forbidden and anonymized word lists. Returns evaluation ("pass" or "fail"; fail when any forbidden word with action "block"/"reject" matched), the matched words, and a sanitized version of the text with forbidden words redacted and anonymized terms replaced by [ANON-N] placeholders. Used by automation pipelines to gate or scrub agent tool outputs before they reach the LLM.
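
The evaluate contract described above can be approximated locally with a short Python sketch. Several details are assumptions rather than documented facts: the shape of each forbidden entry (`{"word": ..., "action": ...}`), the result keys `matched` and `sanitized`, and numbering [ANON-N] by list position.

```python
import re


def evaluate(text, forbidden, anonymized=(), replacement="[FILTERED]"):
    """Return evaluation status, matched terms, and sanitized text."""
    # Each entry: (term, substitute text, whether a match fails evaluation).
    entries = [
        (rule["word"], replacement, rule.get("action", "redact") in ("block", "reject"))
        for rule in forbidden
    ]
    entries += [(term, f"[ANON-{n}]", False) for n, term in enumerate(anonymized, start=1)]
    entries.sort(key=lambda entry: len(entry[0]), reverse=True)  # longest first

    result = {"evaluation": "pass", "matched": [], "sanitized": text}
    if not entries:
        return result

    matcher = re.compile(
        "|".join(f"(?P<t{i}>{re.escape(e[0])})" for i, e in enumerate(entries)),
        re.IGNORECASE,
    )

    def substitute(match):
        term, substitute_text, blocks = entries[int(match.lastgroup[1:])]
        if term not in result["matched"]:
            result["matched"].append(term)
        if blocks:
            result["evaluation"] = "fail"
        return substitute_text

    result["sanitized"] = matcher.sub(substitute, text)
    return result


forbidden = [
    {"word": "secret-key", "action": "block"},
    {"word": "draft", "action": "redact"},
]
print(evaluate("Draft for Acme Corp", forbidden, anonymized=["Acme Corp"]))
# prints {'evaluation': 'pass', 'matched': ['draft', 'Acme Corp'], 'sanitized': '[FILTERED] for [ANON-1]'}
```

In this sketch a "block" or "reject" match flips the evaluation to "fail" but the sanitized text is still produced, which matches the description of evaluate returning both a status and a scrubbed version.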

`get`

Get /ops/get | Auth: Read

Get element details

Element is already resolved by the routing layer — this returns the cached element, not a fresh DB query. Use the path /api/{circle}/{slug} to address elements.

`get_attached_modifiers`

Get /ops/attached/{target_id} | Auth: Read

Get all modifiers attached to a target element

Lists all modifiers attached to a specific target element, including modifier_id, type, subcategory, and priority. Useful for debugging cascade resolution or understanding which policies apply to an element before invoking it.

`intention`

Get /ops/intention | Auth: Read

Get element intention with full inheritance chain

Returns three levels: direct (this element’s intention), inherited (from category and root), and resolved (final merged intention). Useful for understanding an element’s purpose in context of its hierarchy.

`list_attachments`

Get /ops/targets | Auth: Read

List all elements this modifier is attached to

Returns all target elements where this modifier is currently applied. Shows target_id, target_type, priority, and cascade_policy.

`readme_update`

Post /ops/readme_update | Auth: Write

Update element README.md content

Creates or overwrites README.md in the element’s git repo. Commits to the draft branch. Content must be provided as a markdown string.

`schema`

Get /ops/schema | Auth: Read

Get element input/output schema (MCP tools/list compatible)

Returns type-level port schemas from the TypeRegistry — not instance-specific overrides. Includes direction (input/output), required flag, and JSON schema per port. Useful for understanding what data an element accepts and produces.

`status`

Get /ops/status | Auth: Read

Get current filter words configuration summary

Returns a summary of the filter configuration: forbidden_count (blocked words), anonymized_count (words replaced with placeholders), and case_sensitive flag. Use to verify the filter is configured correctly before attaching to an actor.

`update`

Patch /ops/update | Auth: Write

Update element

Partial update — send only the fields you want to change. spec, name, and intention are all independently optional. spec MUST be a JSON object when present; deep-merged into the existing spec by default. Empty {"spec":{}} preserves existing spec content but still records a new version (no-op for content, not for version state). To clear/replace the entire spec wholesale send {"spec":{...},"deep":false}. List-typed spec fields use replace semantics (the patch list replaces the existing list, no array merging). Coordinates Git + DB writes. Slug cannot be changed after creation.
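
The merge rules above (deep merge for objects, replace for lists, wholesale replace when deep is false) can be checked against a small Python model. The functions `deep_merge` and `apply_update` are hypothetical stand-ins for the server-side logic, and the sketch does not model version recording or the Git and DB writes.

```python
def deep_merge(existing, patch):
    """Merge patch into existing: dicts recurse, everything else replaces."""
    merged = dict(existing)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars and lists use replace semantics (no array merging).
            merged[key] = value
    return merged


def apply_update(spec, patch_spec, deep=True):
    """Model {"spec": ...} with the optional "deep" flag from the text."""
    return deep_merge(spec, patch_spec) if deep else dict(patch_spec)


spec = {"forbidden": ["alpha", "beta"], "case_sensitive": False}

print(apply_update(spec, {"forbidden": ["gamma"]}))
# prints {'forbidden': ['gamma'], 'case_sensitive': False}
print(apply_update(spec, {"forbidden": ["gamma"]}, deep=False))
# prints {'forbidden': ['gamma']}
```

Because lists are replaced rather than merged, adding one forbidden word through update means sending the full list including the existing entries.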

Error Codes

Code	Class	Retryable	Description
`FILTER_WORDS_BLOCKED`	limit	no	Tool output blocked due to forbidden word
`FILTER_WORDS_CONFIG_INVALID`	validation	no	Invalid filter words configuration
