The voice an agent speaks with — a text-to-speech model paired with a library of voices, living inside an Intelligence Lab so the lab's provider connection turns written agent output into spoken audio.

Working with it

Selecting a Mouth reveals its settings in the properties panel; it has no dedicated full-screen workbench.

How it appears

The same element type rendered as a definition, a circle instance, and a live workspace card.

Type: Mouth (intelligence, atom, definition). A text-to-speech voice within an Intelligence Lab.

When to use / not

When to use

Giving an agent or automation a spoken voice — turning generated text into audio a caller or listener actually hears.
Driving the speaking side of a phone or voice conversation, where conversation_pacing tunes turn-taking against the telephony bridge.
Building and holding a library of cloned, preset, or instruction-based voices that agents select per line by (mouth, voice_id).

When not to use

Turning incoming speech into text — that is the input half; use the sibling ears element for transcription.
Plain text generation or chat completion with no audio — that is the brain element inside the same lab.
Storing the provider endpoint and credentials yourself — a mouth nests in a lab and inherits the lab's connection; create the lab first.

Topology

Lives nested inside a parent element rather than standing alone — it is created in the context of its container.

Properties

provider (string): TTS provider
voice_mode (string): How consumers (agents) pick a specific voice from this mouth. The default follows the provider (cloning for mistral/elevenlabs/custom, preset for openai/google) but can be overridden for custom mouths.
- cloning: choose from this mouth's voices[] array (Mistral Voxtral, ElevenLabs)
- preset: name a provider preset (OpenAI "alloy", "echo", …)
- instruction: free-text voice description
model_id (string): Model identifier sent to the provider API (e.g. voxtral-mini-tts-2603)
display_name (string): Human label for this mouth (e.g. "Primary voice bank")
credential_ref (string): Reference to a secret element holding the provider API key. Optional; falls back to the platform MISTRAL_API_KEY (or provider equivalent) when unset.
response_format (string): Audio encoding returned by synthesis. WAV is the default because the browser's decodeAudioData parses it reliably, whereas raw PCM is headerless and silently fails to decode in Web Audio. Switch to PCM only for server-side / RTP pipelines that know the sample rate and channel count out of band.
sample_rate (integer): Sample rate in Hz for PCM output. Ignored for compressed formats.
streaming (boolean): Stream audio chunks as they're synthesized. Recommended for conversational UX.
conversation_pacing (object): Per-mouth turn-taking rhythm for telephony calls. Each silence threshold is how long the bridge waits after voice activity stops before declaring end-of-turn. Punctuation in the live STT transcript determines which threshold applies, so the agent can react quickly to completed thoughts and stay patient during mid-sentence pauses.
pricing (object): Cost per million tokens (USD) for billing reference
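
The provider-dependent default for voice_mode described above can be sketched as a small lookup. This is only an illustration of the stated mapping; the names `DEFAULT_VOICE_MODE` and `default_voice_mode` are hypothetical and not part of the platform API.

```python
# Hypothetical helper: the default voice_mode per provider, as stated in the
# voice_mode property description (not the platform's actual implementation).
DEFAULT_VOICE_MODE = {
    "mistral": "cloning",
    "elevenlabs": "cloning",
    "custom": "cloning",
    "openai": "preset",
    "google": "preset",
}


def default_voice_mode(provider: str) -> str:
    """Return the voice_mode a mouth would default to for the given provider."""
    try:
        return DEFAULT_VOICE_MODE[provider]
    except KeyError:
        raise ValueError(f"unknown provider: {provider!r}") from None
```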

Capabilities

Defined for this element

Observe

Operations

activity GET
attachments GET
batch_stats GET
clone-voice POST
compose POST
context GET
create POST
delete DELETE
delete-voice POST
disable POST
enable POST
export_bundle GET
get GET
import_bundle POST
info GET
intention GET
list-voices GET
promote POST
readme GET
readme_update POST
remove-modifier POST
restore POST
schema GET
source GET
source_branches GET
source_promote POST
source_repair POST
source_status GET
source_validate POST
stats GET
synthesize POST
test POST
tree GET
update PATCH
update_meta PATCH
version GET

Ports

Inputs

request (request)
info (request)
result (event)

Composition

Attaches

Referenced by

Validation rules

Mouth model id required

Mouth (mouth)

Category: intelligence | Form: atom | Symbol: Mo

A text-to-speech voice within an Intelligence Lab

A Mouth is a specific TTS model plus the voices it speaks with (e.g. Mistral Voxtral speaking in a cloned voice). Voice data (sample audio, provider-returned voice_id) lives on the mouth itself because it’s model-specific. To give an agent a different voice, add another voice to the mouth or create another mouth element. Agents select a mouth by ID; the runtime finds the parent lab and handles synthesis.

Guide

A text-to-speech voice within an Intelligence Lab — a TTS model bound to a library of voices that agents speak through

What It Does

A Mouth is the output / synthesis half of an Intelligence Lab. It represents a specific TTS model (e.g. Mistral Voxtral) plus a library of voices usable through that model. Where a sibling ears element turns speech into text (input / transcription), a mouth turns an agent’s text output into spoken audio. An agent selects a mouth by ID; the runtime resolves which lab the mouth belongs to and uses that lab’s connection details to synthesize speech.

One mouth can hold many voices: cloned, preset, or instruction-based, depending on the provider. Voice data (sample audio and the provider_voice_id returned by the provider) lives on the mouth itself because it is model-specific and cannot be reused across providers. Agents reference a specific voice via the pair (mouth_slug, voice_id). To give an agent a different voice, add another voice to the mouth (or create another mouth element).

Mouths are atoms with residence: nested — they live inside a Lab element, which supplies the API endpoint and credentials, and sit alongside brain (chat completions) and ears (transcription) under the same lab. Because cloned voice data is biometric-adjacent, mouth visibility is restricted to collaborators (allowed_visibility: [collaborator]).

Element Definition

Property	Value
Type	`mouth`
Category	`intelligence`
Form	`atom`
Residence	`nested`
Symbol	`Mo` / `#F97316`
Activity type	`resource`
Streaming	supported (`supports_streaming: true`)
Visibility	`collaborator` only

Properties

Field	Type	Default	Description
`provider`	string (enum: `mistral`, `openai`, `elevenlabs`, `google`, `custom`)	`mistral`	TTS provider
`voice_mode`	string (enum: `cloning`, `preset`, `instruction`)	`cloning`	How agents pick a voice: choose from `voices[]` (cloning), name a provider preset (preset), or free-text description (instruction)
`model_id`	string	—	Model identifier sent to the provider API (e.g. `voxtral-mini-tts-2603`)
`display_name`	string	—	Human label for this mouth (e.g. “Primary voice bank”)
`voices`	array	`[]`	Voices this mouth can speak with. Managed via the clone-voice / delete-voice ops
`credential_ref`	string	—	Reference to a secret element holding the provider API key. Optional — falls back to a platform key (e.g. `MISTRAL_API_KEY`) when unset
`response_format`	string (enum: `pcm`, `mp3`, `opus`, `wav`, `flac`)	`wav`	Audio encoding returned by synthesis. WAV is the default because the browser’s `decodeAudioData` can parse it reliably
`sample_rate`	integer (enum: `8000`, `16000`, `22050`, `24000`, `44100`, `48000`)	`24000`	Sample rate in Hz for PCM output. Ignored for compressed formats
`streaming`	boolean	`true`	Stream audio chunks as they’re synthesized
`conversation_pacing`	object	—	Per-mouth turn-taking rhythm consumed by the telephony bridge (silence thresholds + speculative generation)
`pricing`	object	—	`input_per_mtok` / `output_per_mtok` cost reference (USD per million tokens)

Each entry in voices[] carries: id (required, stable UUID referenced by agents), name (required), provider_voice_id, sample_file_ref, sample_content_hash, sample_content_type, instruction, languages, gender, tags, speed (0.5–2.0, default 1.0 — a post-synthesis playback multiplier, not sent upstream), and created_at.

The conversation_pacing object holds silence_after_period_ms (default 1500), silence_after_question_ms (default 1000), silence_default_ms (default 2500), speculative_generation (default true), and speculative_prefill_words (0–10, default 3).
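
How punctuation might select a silence threshold can be sketched as follows. The defaults come from the field list above, but the mapping of "?" to the question threshold and of "." or "!" to the period threshold is an assumption for illustration; the telephony bridge's actual rules are not documented here, and `ConversationPacing` and `end_of_turn_silence_ms` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class ConversationPacing:
    # Defaults taken from the documented conversation_pacing fields.
    silence_after_period_ms: int = 1500
    silence_after_question_ms: int = 1000
    silence_default_ms: int = 2500
    speculative_generation: bool = True
    speculative_prefill_words: int = 3


def end_of_turn_silence_ms(transcript: str, pacing: ConversationPacing) -> int:
    """Pick a silence threshold from the last character of the live transcript.

    Assumption: "?" selects the question threshold, "." or "!" the period
    threshold, and anything else (a mid-sentence pause) the default.
    """
    text = transcript.rstrip()
    if text.endswith("?"):
        return pacing.silence_after_question_ms
    if text.endswith((".", "!")):
        return pacing.silence_after_period_ms
    return pacing.silence_default_ms
```

With the default values, a finished question ends the turn after 1000 ms of silence, while an unfinished phrase such as "I would like to" waits the full 2500 ms.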

States

draft → cloning → active → error. Initial state is draft.

Capabilities

text-to-speech
voice-cloning — clone voices from audio samples (cloning providers only)
voice-library — hold multiple voices on one mouth
streaming

Ports

Direction	Port	Schema	Description
Input	`request`	`SpeechRequest`	Text + voice selection to synthesize
Output	`info`	`MouthInfo`	Mouth metadata (required)
Output	`result`	`SpeechResponse`	Synthesized audio (event port)

Modifiers

Attaches rate-limit.

Error Codes

Code	Class	Retryable	Meaning
`MOUTH_UNAVAILABLE`	internal	yes	Mouth could not be reached
`MOUTH_CREDENTIAL_MISSING`	auth	no	No usable provider credential
`MOUTH_VOICE_NOT_FOUND`	validation	no	Requested `voice_id` isn’t in this mouth’s voices array
`MOUTH_VOICE_AMBIGUOUS`	validation	no	Mouth has multiple voices but no `voice_id` was specified
`MOUTH_VOICE_NOT_CLONED`	validation	no	Voice exists locally but `provider_voice_id` is missing — re-run clone-voice
`MOUTH_SAMPLE_UNUSABLE`	validation	no	Provided sample could not be used
`MOUTH_SAMPLE_TOO_LARGE`	validation	no	Sample exceeds the 10 MB cap
`MOUTH_TEXT_TOO_LONG`	validation	no	Input text exceeds the synthesis limit
`MOUTH_SYNTHESIS_FAILED`	internal	yes	Synthesis failed downstream
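
A caller can use the Retryable column to decide when a second attempt is worthwhile. The sketch below assumes a hypothetical client-side exception `MouthError` that carries the error code; the backoff values are illustrative and not prescribed by the platform.

```python
import time

# Codes marked retryable in the table above.
RETRYABLE_CODES = {"MOUTH_UNAVAILABLE", "MOUTH_SYNTHESIS_FAILED"}


class MouthError(Exception):
    """Hypothetical client-side exception carrying a mouth error code."""

    def __init__(self, code: str):
        super().__init__(code)
        self.code = code


def call_with_retry(op, attempts: int = 3, base_delay_s: float = 0.5, sleep=time.sleep):
    """Run op(), retrying only retryable codes with exponential backoff."""
    for attempt in range(attempts):
        try:
            return op()
        except MouthError as err:
            last_attempt = attempt == attempts - 1
            if err.code not in RETRYABLE_CODES or last_attempt:
                raise  # validation and auth errors will not succeed on retry
            sleep(base_delay_s * 2 ** attempt)
```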

Operations

`info`

GET info · auth: read

Returns mouth metadata — provider, model_id, display_name, voice_count, default_voice_id, streaming, response_format, sample_rate. Used by the mouth picker UI.

`list-voices`

GET list-voices · auth: read

Lists every voice configured on this mouth, each with id, name, provider_voice_id, languages, gender, tags, and created_at. Used by the agent voice picker.

`synthesize`

POST synthesize · auth: execute

Speaks a line of text with a chosen voice. Input requires text; optional voice_id (must match an entry in the mouth’s voices array — when omitted and only one voice exists, that voice is used automatically, otherwise an error is returned) and response_format (overrides the mouth default; one of pcm, mp3, opus, wav, flac). Returns audio_data_b64, content_type, duration_ms, sample_rate, voice_id, and cost_au.
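
The voice selection rules above can be sketched as a small resolver. This is an illustration under stated assumptions, not the runtime's code: `MouthError` and `resolve_voice` are hypothetical names, the behavior for a mouth with no voices at all is assumed (the documentation does not name a code for it), and the provider_voice_id check is only meaningful for cloning providers.

```python
from typing import Dict, List, Optional


class MouthError(Exception):
    """Hypothetical exception carrying one of the documented error codes."""

    def __init__(self, code: str):
        super().__init__(code)
        self.code = code


def resolve_voice(voices: List[Dict], voice_id: Optional[str] = None) -> Dict:
    """Pick the voice entry that synthesize would use, or raise a MouthError."""
    if voice_id is None:
        if len(voices) > 1:
            raise MouthError("MOUTH_VOICE_AMBIGUOUS")
        if not voices:
            # Assumption: an empty library is reported as "not found".
            raise MouthError("MOUTH_VOICE_NOT_FOUND")
        voice = voices[0]  # exactly one voice: used automatically
    else:
        voice = next((v for v in voices if v["id"] == voice_id), None)
        if voice is None:
            raise MouthError("MOUTH_VOICE_NOT_FOUND")
    if not voice.get("provider_voice_id"):
        # The entry exists locally but was never cloned upstream.
        raise MouthError("MOUTH_VOICE_NOT_CLONED")
    return voice
```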

`clone-voice`

POST clone-voice · auth: write

Adds a new voice to this mouth’s voices array. Input requires name; accepts either a files-element reference (sample_file_ref) or inline base64 audio (sample_data_b64 with sample_content_type, 10 MB cap). For cloning providers (Mistral Voxtral), the sample is uploaded to the provider and the returned voice id is stored; for preset providers (OpenAI), set provider_voice_id to the preset name directly without a sample. Also accepts instruction, languages, gender, and tags. Returns the new voice_id, provider_voice_id, and name.
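
A client might build the inline-sample request body and enforce the size cap before sending. In this sketch the cap is assumed to apply to the decoded audio and to mean 10 × 1024 × 1024 bytes; the documentation states only "10 MB". `build_clone_voice_payload` is a hypothetical helper.

```python
import base64

# Assumption: the documented "10 MB" cap is 10 MiB of decoded audio.
MAX_INLINE_SAMPLE_BYTES = 10 * 1024 * 1024


def build_clone_voice_payload(name: str, sample: bytes, content_type: str, **extra) -> dict:
    """Build a clone-voice request body from raw audio bytes.

    `extra` carries the optional documented fields (instruction, languages,
    gender, tags).
    """
    if len(sample) > MAX_INLINE_SAMPLE_BYTES:
        raise ValueError(
            "sample exceeds the inline cap; upload it and pass sample_file_ref instead"
        )
    payload = {
        "name": name,
        "sample_data_b64": base64.b64encode(sample).decode("ascii"),
        "sample_content_type": content_type,
    }
    payload.update(extra)
    return payload
```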

`delete-voice`

POST delete-voice · auth: write

Removes a voice (by voice_id) from the mouth’s array and asks the provider to delete it upstream for cloning providers. Returns deleted and voice_id. Agents still referencing the removed voice_id will fail to synthesize until reconfigured.

`test`

POST test · auth: execute

Synthesizes a short test phrase with a chosen voice (defaults text to “Hello, this is a test of the text to speech connection.” and falls back to the first voice if voice_id is omitted). Returns success, latency_ms, duration_ms, audio_data_b64, voice_id, and error — useful for verifying connectivity before assigning the mouth to an agent.

Quick Start

Create a mouth (inside a lab)

POST /api/{circle}/{lab-element}/
Content-Type: application/json

{
  "element_type": "mouth",
  "slug": "support-voice",
  "name": "Support Voice",
  "spec": {
    "provider": "mistral",
    "voice_mode": "cloning",
    "model_id": "voxtral-mini-tts-2603",
    "display_name": "Primary voice bank",
    "response_format": "wav",
    "sample_rate": 24000,
    "streaming": true
  }
}

Add a voice, then speak

POST /api/{circle}/{lab}/{mouth}/ops/clone-voice
{
  "name": "Narrator",
  "sample_data_b64": "<base64 audio, max 10 MB>",
  "sample_content_type": "audio/webm",
  "gender": "neutral"
}

POST /api/{circle}/{lab}/{mouth}/ops/synthesize
{
  "text": "Welcome back. Your report is ready.",
  "voice_id": "<id returned by clone-voice>"
}

The response carries audio_data_b64 plus content_type, duration_ms, sample_rate, and cost_au.
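
On the client side, the audio arrives base64-encoded and must be decoded before playback or storage. A minimal sketch, assuming the response body has already been parsed from JSON into a dict; `decode_synthesis_response` and `save_audio` are hypothetical helper names.

```python
import base64
from typing import Tuple


def decode_synthesis_response(response: dict) -> Tuple[bytes, str]:
    """Return (audio_bytes, content_type) from a parsed synthesize response."""
    audio = base64.b64decode(response["audio_data_b64"])
    return audio, response["content_type"]


def save_audio(response: dict, path: str) -> int:
    """Write the decoded audio to `path` and return the number of bytes written."""
    audio, _ = decode_synthesis_response(response)
    with open(path, "wb") as fh:
        return fh.write(audio)
```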

Common Mistakes

No model_id. A mouth without a model_id can’t synthesize (validation flags this). Set the exact model identifier the provider expects (e.g. voxtral-mini-tts-2603).

No voices yet. A fresh mouth has an empty voices[] array — add a voice via clone-voice before any agent can speak through it.

Omitting voice_id with multiple voices. When a mouth holds more than one voice, synthesize requires an explicit voice_id; omitting it returns MOUTH_VOICE_AMBIGUOUS. The omit-and-default shortcut only works when exactly one voice is configured.

Referencing an uncloned voice. If a voice entry exists locally but has no provider_voice_id (cloning provider), synthesis fails with MOUTH_VOICE_NOT_CLONED — re-run clone-voice for that voice.

Oversized samples. Inline sample_data_b64 is capped at 10 MB; larger samples return MOUTH_SAMPLE_TOO_LARGE. Use a sample_file_ref (files-element upload) for large clone sources.

Expecting raw PCM to play in the browser. wav is the default response_format because raw pcm is headerless and silently fails to decode in Web Audio. Only switch to pcm for server-side / RTP pipelines that know the sample rate and channel count out of band.
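
If a pipeline already holds raw PCM and later needs a browser-playable file, the missing header can be added with Python's standard `wave` module. The sketch assumes 16-bit mono samples at the mouth's default 24000 Hz; the actual sample width and channel count of the PCM output are not stated in this document and must be known out of band.

```python
import io
import wave


def pcm_to_wav(pcm: bytes, sample_rate: int = 24000, channels: int = 1,
               sample_width: int = 2) -> bytes:
    """Wrap headerless PCM in a WAV container.

    Assumption: 16-bit (2-byte) mono samples unless the caller says otherwise.
    """
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)  # the header sizes are patched when the writer closes
    return buf.getvalue()
```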

Relationships

Attaches to: rate-limit

Capabilities

text-to-speech
voice-cloning: Clone voices from audio samples (cloning providers only)
voice-library: Hold multiple voices on one mouth
streaming

Properties

Property	Type	Default	Description
`provider`	string	`"mistral"`	TTS provider
`voice_mode`	string	`"cloning"`	How consumers (agents) pick a specific voice from this mouth: cloning (choose from this mouth’s voices[] array; Mistral Voxtral, ElevenLabs), preset (name a provider preset: OpenAI “alloy”, “echo”, …), or instruction (free-text voice description). The default follows the provider (cloning for mistral/elevenlabs/custom, preset for openai/google) but can be overridden for custom mouths.
`model_id`	string	—	Model identifier sent to the provider API (e.g. voxtral-mini-tts-2603)
`display_name`	string	—	Human label for this mouth (e.g. “Primary voice bank”)
`voices`	array	`[]`	Voices this mouth can speak with. Managed via clone-voice / delete-voice ops.
`credential_ref`	string	—	Reference to secret element with provider API key. Optional — falls back to platform MISTRAL_API_KEY (or provider-equivalent) when unset.
`response_format`	string	`"wav"`	Audio encoding returned by synthesis. WAV is the default because that’s what the browser’s decodeAudioData can parse reliably — raw PCM is headerless and silently fails to decode in Web Audio. Switch to PCM only for server-side / RTP pipelines that know the sample rate and channel count out of band.
`sample_rate`	integer	`24000`	Sample rate in Hz for PCM output. Ignored for compressed formats.
`streaming`	boolean	`true`	Stream audio chunks as they’re synthesized. Recommended for conversational UX.
`conversation_pacing`	object	—	Per-mouth turn-taking rhythm for telephony calls. Each silence threshold is how long the bridge waits after voice activity stops before declaring end-of-turn — punctuation in the live STT transcript drives which threshold applies, so the agent can react fast on completed thoughts and stay patient on mid-sentence pauses.
`pricing`	object	—	Cost per million tokens (USD) for billing reference

Operations

`activity`

Get /ops/activity | Auth: Read

Get activity events for this element

Scope depends on element capabilities: individual elements query by element_id, project-form elements with activity-scope-members include member activities, circle-level elements with activity-scope-all query the entire circle. Gracefully returns empty list if activities table is missing (old circles).

`attachments`

Get /ops/attachments | Auth: Read

List all modifiers and resources attached to this element

Returns both modifiers (policy enforcement) and resources (data injection) with is_modifier flag to distinguish. Items in the generated MODIFIER_TYPES list are modifiers; everything else is a resource. Includes cascade_policy and version pin info.

`batch_stats`

Get /ops/batch_stats | Auth: Read

Get per-element statistics for all children of this element

Returns per-child stats plus an aggregate. Most meaningful on compound or manifest form elements (repositories, circles, projects); atoms have no children so the result is an empty children array with a zeroed aggregate. Uses efficient GROUP BY SQL. Weighted averages for eval scores.

`clone-voice`

Post /ops/clone-voice | Auth: Write

Add a new voice to this mouth from a recording or uploaded file

Creates a new entry in this mouth’s voices array. Accepts either a files-element reference (upload path) or inline base64 audio (direct recording path, 10 MB cap). For cloning providers (Mistral Voxtral), the sample is uploaded to the provider and the returned voice_id is stored. For preset providers (OpenAI), set provider_voice_id directly without a sample. Returns the new voice’s id.

`compose`

Post /ops/compose | Auth: Execute

Batch add and remove modifiers on this element in a single call

Declarative composition: add modifiers by ref path (slug or path@version) and remove by attachment ID, all in one atomic call on the target element. Each ‘add’ entry resolves the source element, validates topology, and attaches with optional priority and cascade policy. Each ‘remove’ entry deletes the attachment row. Returns a summary of what was added and removed. Example: compose({ add: [{ref: "my-prompt"}, {ref: "rate-limit/api@v2", priority: 50}], remove: [{attachment_id: "uuid"}] })

`context`

Get /ops/context | Auth: Read

Get connected elements (graph traversal)

Graph traversal showing all connected elements with their relationship type (contains, contained_by, references, referenced_by, attaches, etc.). Use ?depth=N to control traversal depth (default 1) and ?types=actor,data to filter by element types.

`create`

Post /ops/create | Auth: Write

Create child element

POST to the parent path — element_type goes in the request body, NOT the URL. Both element_type and slug are required and must be non-empty. Name is derived from slug if omitted. Writes to both Git and PostgreSQL. All elements are stored flat under the circle — no intermediate library wrapper rows.

`delete`

Delete /ops/delete | Auth: Admin

Delete element (soft delete)

Soft delete — sets state to ‘deleted’ but retains the record. Cannot delete elements that have children (has_no_bond precondition) or active runs. Requires admin auth and confirmation.

`delete-voice`

Post /ops/delete-voice | Auth: Write

Remove a voice from this mouth

Removes the voice from the mouth’s array and asks the provider to delete it upstream (cloning providers). Agents still referencing this voice_id will fail to synthesize until reconfigured.

`disable`

Post /ops/disable | Auth: Admin

Disable element (hides and prevents use)

Idempotent — safe to call on already-disabled elements. Optionally pass a reason string. Disabled elements cannot be invoked or executed. Inverse of enable.

`enable`

Post /ops/enable | Auth: Admin

Enable element (makes usable and visible)

Idempotent — safe to call on already-enabled elements. Transitions element to ready/enabled state. Cannot enable deleted elements. Inverse of disable.

`export_bundle`

Get /ops/export/bundle | Auth: Read

Export element as downloadable git bundle

On non-root-namespace elements, returns a binary git bundle. On root-namespace (circle) elements, dispatch hands off to the circle’s own export_bundle op, which returns a multi-element JSON envelope with one base64 bundle per child element — this is intentional, not an error.

`get`

Get /ops/get | Auth: Read

Get element details

Element is already resolved by the routing layer — this returns the cached element, not a fresh DB query. Use the path /api/{circle}/{slug} to address elements.

`import_bundle`

Post /ops/import/bundle | Auth: Write

Import git bundle into element

Accepts a base64-encoded git bundle in the JSON bundle_base64 field. Use overwrite=true to replace existing elements with same slug (default skips duplicates). Imported elements get new UUIDs. Returns counts of imported/skipped elements and any errors.

`info`

Get /ops/info | Auth: Read

Get mouth metadata (provider, model, voices count)

Returns provider, model_id, number of voices, and defaults — used by the mouth picker UI.

`intention`

Get /ops/intention | Auth: Read

Get element intention with full inheritance chain

Returns three levels: direct (this element’s intention), inherited (from category and root), and resolved (final merged intention). Useful for understanding an element’s purpose in context of its hierarchy.

`list-voices`

Get /ops/list-voices | Auth: Read

List all voices configured on this mouth

Returns every voice in this mouth’s library with name, id, and metadata. Used by the agent voice picker.

`promote`

Post /ops/promote | Auth: Admin

Promote element configuration to a target environment

Only for manifest-form elements (projects). Environments advance: dev → demo → live. dev→demo requires member+ role, demo→live requires admin. Freezes member versions at promotion time (creates snapshot). Persists environment config to spec.environments.

`readme`

Get /ops/readme | Auth: Read

Get element README.md content

Reads README.md from the element’s git repository. Returns empty content (not an error) if no README exists. Always returns markdown format.

`readme_update`

Post /ops/readme_update | Auth: Write

Update element README.md content

Creates or overwrites README.md in the element’s git repo. Commits to the draft branch. Content must be provided as a markdown string.

`remove-modifier`

Post /ops/remove-modifier | Auth: Execute

Remove an attached modifier from this element by attachment ID

Removes a modifier/resource attachment by its row ID. The ID comes from the attachments or context API. This is the reverse of attach — called on the target element, not the source.

`restore`

Post /ops/restore | Auth: Admin

Restore element to a specific version

Automatically snapshots the current state before restoring (creates a ‘Before restore to vN’ version entry). Writes restored spec to git as .triform/spec.yaml. Git failures warn but don’t fail the operation — DB state is authoritative. Cannot restore deleted elements.

`schema`

Get /ops/schema | Auth: Read

Get element input/output schema (MCP tools/list compatible)

Returns type-level port schemas from the TypeRegistry — not instance-specific overrides. Includes direction (input/output), required flag, and JSON schema per port. Useful for understanding what data an element accepts and produces.

`source`

Get /ops/source | Auth: Read

Get any file’s content from the element’s git repository

Reads an arbitrary file from the element’s CAS-backed git tree by its relative path. Same store as readme, just generalized. Path safety: rejects .. traversal, leading /, and null bytes. Use this to view main.py for action elements, asset files for SPAs, etc. Returns empty content (not an error) if the file doesn’t exist.
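
The three path-safety rules can be sketched as a client-side pre-check. This is an approximation, not the server's validator: it assumes that only whole `..` path segments count as traversal, and `is_safe_source_path` is a hypothetical name.

```python
def is_safe_source_path(path: str) -> bool:
    """Approximate the documented checks: no null bytes, no leading "/", no ".." segment."""
    if "\x00" in path:
        return False
    if path.startswith("/"):
        return False
    # Assumption: ".." is rejected only as a whole segment, so a file name
    # such as "app..js" is still allowed.
    return ".." not in path.split("/")
```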

`source_branches`

Get /ops/source/branches | Auth: Read

List Source branches for this element

Returns the standard draft/demo/live Source branches, their current commits, and promotion relationships. Use GET /api/{element_path}/ops/source/branches.

`source_promote`

Post /ops/source/promote | Auth: Write

Promote Source branch forward

Promotes draft to demo or demo to live through the generated element op path. Direct Git pushes to demo/live are blocked by Source policy.

`source_repair`

Post /ops/source/repair | Auth: Write

Inspect or repair the element Source index

Runs Source repair through the element operation path. Defaults to dry_run=true; set dry_run=false only after reviewing a dry-run report.

`source_status`

Get /ops/source/status | Auth: Read

Get Source control status for this element

Returns the branch-aware clone URL, checkout commands, current draft commit, child source-link count, portable export summary, Source health, warnings, and auth hints for the addressed element. Use the element-first path: GET /api/{element_path}/ops/source/status.

`source_validate`

Post /ops/source/validate | Auth: Read

Validate Source branch contents

Validates a Source branch before accepting local Git workflow changes or promotion. Defaults to branch=draft and rejects runtime data, generated output, secret material, and unreadable CAS refs.

`stats`

Get /ops/stats | Auth: Read

Get aggregate statistics for this element

Health status is computed: error if errors_per_day > 5 or success_rate < 0.8, warning if errors_per_day > 0 or success_rate < 0.95. Firing alerts escalate health to error/warning. Default period is ‘day’. Returns runs_per_day, success_rate, avg_duration_ms, and more.
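
The threshold rules above translate into a short function. The sketch leaves out the alert escalation, and the label "healthy" for the remaining case is an assumption, since the document names only the error and warning states.

```python
def compute_health(errors_per_day: float, success_rate: float) -> str:
    """Classify health from the documented thresholds (alert escalation omitted)."""
    if errors_per_day > 5 or success_rate < 0.8:
        return "error"
    if errors_per_day > 0 or success_rate < 0.95:
        return "warning"
    return "healthy"  # assumed label for "no issue"
```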

`synthesize`

Post /ops/synthesize | Auth: Execute

Speak a line of text with a chosen voice

Synthesize speech from text using one of this mouth’s voices. The voice_id must match an entry in the mouth’s voices array; if omitted and only one voice exists it’s used automatically, otherwise an error is returned. Use PCM for the lowest latency (~0.7s time-to-first-audio).

`test`

Post /ops/test | Auth: Execute

Synthesize a short test phrase with a chosen voice

Round-trips text → audio and returns latency + audio data. Defaults to the first voice if voice_id is omitted.

`tree`

Get /ops/tree | Auth: Read

Get the element’s position in the graph — ancestors, children, references, and subtree statistics

Uses per-circle ElementGraph cache for O(1) lookups. Returns ancestors (containment chain), children (direct), members (references), referenced_by (reverse refs), attachments, and subtree stats. Default depth is 3, max is 10. Pass ?include_metadata=true for name/state on each node.

`update`

Patch /ops/update | Auth: Write

Update element

Partial update — send only the fields you want to change. spec, name, and intention are all independently optional. spec MUST be a JSON object when present; deep-merged into the existing spec by default. Empty {"spec":{}} preserves existing spec content but still records a new version (no-op for content, not for version state). To clear/replace the entire spec wholesale send {"spec":{...},"deep":false}. List-typed spec fields use replace semantics (the patch list replaces the existing list, no array merging). Coordinates Git + DB writes. Slug cannot be changed after creation.
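
The merge semantics described above (objects merged recursively, lists replaced wholesale, and `deep: false` swapping the entire spec) can be sketched like this. It illustrates the documented behavior under those stated rules rather than the server's implementation; `deep_merge_spec` and `apply_spec_patch` are hypothetical names.

```python
import copy


def deep_merge_spec(existing: dict, patch: dict) -> dict:
    """Merge `patch` into `existing`: dicts merge recursively, everything else replaces."""
    merged = copy.deepcopy(existing)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge_spec(merged[key], value)
        else:
            # Lists and scalars use replace semantics (no array merging).
            merged[key] = copy.deepcopy(value)
    return merged


def apply_spec_patch(existing: dict, patch: dict, deep: bool = True) -> dict:
    """With deep=False the patch replaces the whole spec."""
    return deep_merge_spec(existing, patch) if deep else copy.deepcopy(patch)
```

Patching only `conversation_pacing.silence_default_ms` therefore keeps the other pacing fields, whereas a patch containing a list-typed field replaces that list entirely.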

`update_meta`

Patch /ops/update_meta | Auth: Write

Update element metadata (lightweight merge — does NOT bump version or snapshot spec)

Shallow JSONB merge into element.meta. Top-level keys in the provided value replace existing meta values; other keys are preserved. Used for UI metadata like canvas positions, panel state, viewer preferences. Wire-shape op_name is update_meta (distinct from update) so SSE subscribers + the cache auto-invalidator can distinguish lightweight metadata changes from spec edits without inspecting the payload. The MutatingElementStore wrapper stamps this op_name on the lifecycle event emitted by update_element_meta storage calls.

`version`

Get /ops/version | Auth: Read

Get current version or full history

Returns current version by default. Pass ?history=true for full version history (up to ?limit=N, default 50). Versions are backed by the element_versions table. Every spec update creates a new version entry.

Error Codes

Code	Class	Retryable	Description
`MOUTH_UNAVAILABLE`	internal	yes	Mouth could not be reached
`MOUTH_CREDENTIAL_MISSING`	auth	no	No usable provider credential
`MOUTH_VOICE_NOT_FOUND`	validation	no	Requested voice_id isn’t in this mouth’s voices array
`MOUTH_VOICE_AMBIGUOUS`	validation	no	Mouth has multiple voices but no voice_id was specified
`MOUTH_VOICE_NOT_CLONED`	validation	no	Voice exists locally but provider_voice_id is missing; re-run clone-voice
`MOUTH_SAMPLE_UNUSABLE`	validation	no	Provided sample could not be used
`MOUTH_SAMPLE_TOO_LARGE`	validation	no	Sample exceeds the 10 MB cap
`MOUTH_TEXT_TOO_LONG`	validation	no	Input text exceeds the synthesis limit
`MOUTH_SYNTHESIS_FAILED`	internal	yes	Synthesis failed downstream

Observability

Defined for this element

Metrics

mouth_synthesize_total
mouth_synthesize_latency_ms
mouth_synthesize_audio_ms

Pricing / cost

Inherited from intelligence

Operation costs

invoke: 10000 micro-AU

Set it up

Name (string): A label for this mouth (e.g. "Team voices", "Primary bank")
Provider (string): TTS provider
Model (string): Model ID (e.g. voxtral-mini-tts-2603)
Audio format (string)