Mouth
The voice an agent speaks with — a text-to-speech model paired with a library of voices, living inside an Intelligence Lab so the lab's provider connection turns written agent output into spoken audio.
Working with it
Selecting a Mouth reveals its settings in the properties panel; it has no dedicated full-screen workbench.
How it appears
The same element type rendered as a definition, a circle instance, and a live workspace card.
When to use / not
When to use
- Giving an agent or automation a spoken voice — turning generated text into audio a caller or listener actually hears.
- Driving the speaking side of a phone or voice conversation, where conversation_pacing tunes turn-taking against the telephony bridge.
- Building and holding a library of cloned, preset, or instruction-based voices that agents select per line by (mouth, voice_id).
When not to use
- Turning incoming speech into text — that is the input half; use the sibling ears element for transcription.
- Plain text generation or chat completion with no audio — that is the brain element inside the same lab.
- Storing the provider endpoint and credentials yourself — a mouth nests in a lab and inherits the lab's connection; create the lab first.
Topology
Lives nested inside a parent element rather than standing alone — it is created in the context of its container.
Properties
- provider (string): TTS provider
- voice_mode (string): How consumers (agents) pick a specific voice from this mouth. The default follows the provider (cloning for mistral/elevenlabs/custom, preset for openai/google) but can be overridden for custom mouths.
  - cloning: choose from this mouth's voices[] array (Mistral Voxtral, ElevenLabs)
  - preset: name a provider preset (OpenAI "alloy", "echo", …)
  - instruction: free-text voice description
- model_id (string): Model identifier sent to the provider API (e.g. voxtral-mini-tts-2603)
- display_name (string): Human label for this mouth (e.g. "Primary voice bank")
- credential_ref (string): Reference to a secret element holding the provider API key. Optional; falls back to the platform MISTRAL_API_KEY (or provider-equivalent) when unset.
- response_format (string): Audio encoding returned by synthesis. WAV is the default because the browser's decodeAudioData can parse it reliably; raw PCM is headerless and silently fails to decode in Web Audio. Switch to PCM only for server-side / RTP pipelines that know the sample rate and channel count out of band.
- sample_rate (integer): Sample rate in Hz for PCM output. Ignored for compressed formats.
- streaming (boolean): Stream audio chunks as they're synthesized. Recommended for conversational UX.
- conversation_pacing (object): Per-mouth turn-taking rhythm for telephony calls. Each silence threshold is how long the bridge waits after voice activity stops before declaring end-of-turn. Punctuation in the live STT transcript drives which threshold applies, so the agent can react fast on completed thoughts and stay patient on mid-sentence pauses.
- pricing (object): Cost per million tokens (USD) for billing reference
Capabilities
Defined for this element
- Observe
Operations
- activity (GET)
- attachments (GET)
- batch_stats (GET)
- clone-voice (POST)
- compose (POST)
- context (GET)
- create (POST)
- delete (DELETE)
- delete-voice (POST)
- disable (POST)
- enable (POST)
- export_bundle (GET)
- get (GET)
- import_bundle (POST)
- info (GET)
- intention (GET)
- list-voices (GET)
- promote (POST)
- readme (GET)
- readme_update (POST)
- remove-modifier (POST)
- restore (POST)
- schema (GET)
- source (GET)
- source_branches (GET)
- source_promote (POST)
- source_repair (POST)
- source_status (GET)
- source_validate (POST)
- stats (GET)
- synthesize (POST)
- test (POST)
- tree (GET)
- update (PATCH)
- update_meta (PATCH)
- version (GET)
Ports
Inputs
- request (request)
Outputs
- info (request)
- result (event)
Composition
Validation rules
- Mouth model id required
Mouth (mouth)
Category: intelligence | Form: atom | Symbol: Mo
A text-to-speech voice within an Intelligence Lab
A Mouth is a specific TTS model + voice combination (e.g. Mistral Voxtral speaking in a cloned voice). Voice data (sample audio, the provider-returned provider_voice_id) lives on the mouth itself because it’s model-specific. To give an agent a different voice, add another voice to the mouth or create another mouth element. Agents select a mouth by ID; the runtime finds the parent lab and handles synthesis.
Guide
A text-to-speech voice within an Intelligence Lab — a TTS model bound to a library of voices that agents speak through
What It Does
A Mouth is the output / synthesis half of an Intelligence Lab. It represents a specific TTS model (e.g. Mistral Voxtral) plus a library of voices usable through that model. Where a sibling ears element turns speech into text (input / transcription), a mouth turns an agent’s text output into spoken audio. An agent selects a mouth by ID; the runtime resolves which lab the mouth belongs to and uses that lab’s connection details to synthesize speech.
One mouth can hold many voices — cloned, preset, or instruction-based depending on the provider. Voice data (sample audio, the provider-returned provider_voice_id) lives on the mouth itself because it is model-specific and cannot be reused across providers. Agents reference a specific voice via the pair (mouth_slug, voice_id). To give an agent a different voice, add another voice to the mouth (or create another mouth element).
Mouths are atoms with residence: nested — they live inside a Lab element, which supplies the API endpoint and credentials, and sit alongside brain (chat completions) and ears (transcription) under the same lab. Because cloned voice data is biometric-adjacent, mouth visibility is restricted to collaborators (allowed_visibility: [collaborator]).
Element Definition
| Property | Value |
|---|---|
| Type | mouth |
| Category | intelligence |
| Form | atom |
| Residence | nested |
| Symbol | Mo / #F97316 |
| Activity type | resource |
| Streaming | supported (supports_streaming: true) |
| Visibility | collaborator only |
Properties
| Field | Type | Default | Description |
|---|---|---|---|
| provider | string (enum: mistral, openai, elevenlabs, google, custom) | mistral | TTS provider |
| voice_mode | string (enum: cloning, preset, instruction) | cloning | How agents pick a voice: choose from voices[] (cloning), name a provider preset (preset), or free-text description (instruction) |
| model_id | string | - | Model identifier sent to the provider API (e.g. voxtral-mini-tts-2603) |
| display_name | string | - | Human label for this mouth (e.g. “Primary voice bank”) |
| voices | array | [] | Voices this mouth can speak with. Managed via the clone-voice / delete-voice ops |
| credential_ref | string | - | Reference to a secret element holding the provider API key. Optional; falls back to a platform key (e.g. MISTRAL_API_KEY) when unset |
| response_format | string (enum: pcm, mp3, opus, wav, flac) | wav | Audio encoding returned by synthesis. WAV is the default because the browser’s decodeAudioData can parse it reliably |
| sample_rate | integer (enum: 8000, 16000, 22050, 24000, 44100, 48000) | 24000 | Sample rate in Hz for PCM output. Ignored for compressed formats |
| streaming | boolean | true | Stream audio chunks as they’re synthesized |
| conversation_pacing | object | - | Per-mouth turn-taking rhythm consumed by the telephony bridge (silence thresholds + speculative generation) |
| pricing | object | - | input_per_mtok / output_per_mtok cost reference (USD per million tokens) |
Each entry in voices[] carries: id (required, stable UUID referenced by agents), name (required), provider_voice_id, sample_file_ref, sample_content_hash, sample_content_type, instruction, languages, gender, tags, speed (0.5–2.0, default 1.0 — a post-synthesis playback multiplier, not sent upstream), and created_at.
The conversation_pacing object holds silence_after_period_ms (default 1500), silence_after_question_ms (default 1000), silence_default_ms (default 2500), speculative_generation (default true), and speculative_prefill_words (0–10, default 3).
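As a rough illustration of how these thresholds could be applied, the sketch below picks a silence window from the way the live transcript currently ends. The `ConversationPacing` class and `silence_threshold_ms` function are hypothetical names, and the rule that only a trailing `.` or `?` selects a shorter window is an assumption; the bridge's actual logic is not documented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConversationPacing:
    # Field defaults mirror the documented conversation_pacing defaults.
    silence_after_period_ms: int = 1500
    silence_after_question_ms: int = 1000
    silence_default_ms: int = 2500


def silence_threshold_ms(transcript: str, pacing: ConversationPacing) -> int:
    """Pick the end-of-turn wait from how the live transcript currently ends."""
    text = transcript.rstrip()
    if text.endswith("?"):
        return pacing.silence_after_question_ms
    if text.endswith("."):
        return pacing.silence_after_period_ms
    # No closing punctuation: assume a mid-sentence pause and wait longer.
    return pacing.silence_default_ms
```

With the defaults, a transcript ending in a question mark would end the turn after 1000 ms of silence, while an unfinished sentence would get the full 2500 ms.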
States
draft → cloning → active → error. Initial state is draft.
Capabilities
- text-to-speech
- voice-cloning: clone voices from audio samples (cloning providers only)
- voice-library: hold multiple voices on one mouth
- streaming
Ports
| Direction | Port | Schema | Description |
|---|---|---|---|
| Input | request | SpeechRequest | Text + voice selection to synthesize |
| Output | info | MouthInfo | Mouth metadata (required) |
| Output | result | SpeechResponse | Synthesized audio (event port) |
Modifiers
Attaches rate-limit.
Error Codes
| Code | Class | Retryable | Meaning |
|---|---|---|---|
| MOUTH_UNAVAILABLE | internal | yes | Mouth could not be reached |
| MOUTH_CREDENTIAL_MISSING | auth | no | No usable provider credential |
| MOUTH_VOICE_NOT_FOUND | validation | no | Requested voice_id isn’t in this mouth’s voices array |
| MOUTH_VOICE_AMBIGUOUS | validation | no | Mouth has multiple voices but no voice_id was specified |
| MOUTH_VOICE_NOT_CLONED | validation | no | Voice exists locally but provider_voice_id is missing; re-run clone-voice |
| MOUTH_SAMPLE_UNUSABLE | validation | no | Provided sample could not be used |
| MOUTH_SAMPLE_TOO_LARGE | validation | no | Sample exceeds the 10 MB cap |
| MOUTH_TEXT_TOO_LONG | validation | no | Input text exceeds the synthesis limit |
| MOUTH_SYNTHESIS_FAILED | internal | yes | Synthesis failed downstream |
Operations
info
GET info · auth: read
Returns mouth metadata — provider, model_id, display_name, voice_count, default_voice_id, streaming, response_format, sample_rate. Used by the mouth picker UI.
list-voices
GET list-voices · auth: read
Lists every voice configured on this mouth, each with id, name, provider_voice_id, languages, gender, tags, and created_at. Used by the agent voice picker.
synthesize
POST synthesize · auth: execute
Speaks a line of text with a chosen voice. Input requires text; optional voice_id (must match an entry in the mouth’s voices array — when omitted and only one voice exists, that voice is used automatically, otherwise an error is returned) and response_format (overrides the mouth default; one of pcm, mp3, opus, wav, flac). Returns audio_data_b64, content_type, duration_ms, sample_rate, voice_id, and cost_au.
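The voice selection rule described for synthesize can be sketched as a small lookup. This is only an illustration: `MouthError` and `resolve_voice` are hypothetical names, and raising MOUTH_VOICE_AMBIGUOUS for a mouth with no voices at all is an assumption, since the documented case covers multiple voices only.

```python
from typing import Optional


class MouthError(Exception):
    """Hypothetical exception carrying one of the documented error codes."""

    def __init__(self, code: str) -> None:
        super().__init__(code)
        self.code = code


def resolve_voice(voices: list, voice_id: Optional[str] = None) -> dict:
    """Return the voice entry synthesize would use, or raise a documented error code."""
    if voice_id is None:
        if len(voices) == 1:
            # Exactly one voice configured: it is used automatically.
            return voices[0]
        # Documented for multiple voices; reusing it for an empty list is an assumption.
        raise MouthError("MOUTH_VOICE_AMBIGUOUS")
    for voice in voices:
        if voice["id"] == voice_id:
            return voice
    raise MouthError("MOUTH_VOICE_NOT_FOUND")
```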
clone-voice
POST clone-voice · auth: write
Adds a new voice to this mouth’s voices array. Input requires name; accepts either a files-element reference (sample_file_ref) or inline base64 audio (sample_data_b64 with sample_content_type, 10 MB cap). For cloning providers (Mistral Voxtral), the sample is uploaded to the provider and the returned voice id is stored; for preset providers (OpenAI), set provider_voice_id to the preset name directly without a sample. Also accepts instruction, languages, gender, and tags. Returns the new voice_id, provider_voice_id, and name.
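A client can check the inline size cap before sending a sample. The sketch below builds a clone-voice request body from raw audio bytes; `build_clone_voice_payload` is a hypothetical helper, and it assumes the 10 MB cap applies to the decoded audio and is counted in binary megabytes, neither of which is stated above.

```python
import base64

# Documented 10 MB cap; measuring it on raw bytes in binary MB is an assumption.
MAX_INLINE_SAMPLE_BYTES = 10 * 1024 * 1024


def build_clone_voice_payload(name: str, audio: bytes, content_type: str) -> dict:
    """Build an inline clone-voice request body from raw audio bytes."""
    if not name:
        raise ValueError("name is required")
    if len(audio) > MAX_INLINE_SAMPLE_BYTES:
        # The service reports this case as MOUTH_SAMPLE_TOO_LARGE.
        raise ValueError("sample exceeds the 10 MB inline cap; use sample_file_ref")
    return {
        "name": name,
        "sample_data_b64": base64.b64encode(audio).decode("ascii"),
        "sample_content_type": content_type,
    }
```

Larger clone sources would go through a files-element upload and `sample_file_ref` instead of the inline field.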
delete-voice
POST delete-voice · auth: write
Removes a voice (by voice_id) from the mouth’s array and asks the provider to delete it upstream for cloning providers. Returns deleted and voice_id. Agents still referencing the removed voice_id will fail to synthesize until reconfigured.
test
POST test · auth: execute
Synthesizes a short test phrase with a chosen voice (defaults text to “Hello, this is a test of the text to speech connection.” and falls back to the first voice if voice_id is omitted). Returns success, latency_ms, duration_ms, audio_data_b64, voice_id, and error — useful for verifying connectivity before assigning the mouth to an agent.
Quick Start
Create a mouth (inside a lab)
POST /api/{circle}/{lab-element}/
Content-Type: application/json

{
  "element_type": "mouth",
  "slug": "support-voice",
  "name": "Support Voice",
  "spec": {
    "provider": "mistral",
    "voice_mode": "cloning",
    "model_id": "voxtral-mini-tts-2603",
    "display_name": "Primary voice bank",
    "response_format": "wav",
    "sample_rate": 24000,
    "streaming": true
  }
}
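Before posting a spec like this, a client could check it against the documented enums. The following is a minimal sketch, assuming client-side validation is wanted at all; `build_mouth_spec` is a hypothetical helper, and the allowed values are copied from the properties table.

```python
PROVIDERS = {"mistral", "openai", "elevenlabs", "google", "custom"}
RESPONSE_FORMATS = {"pcm", "mp3", "opus", "wav", "flac"}
SAMPLE_RATES = {8000, 16000, 22050, 24000, 44100, 48000}


def build_mouth_spec(model_id: str, provider: str = "mistral",
                     response_format: str = "wav", sample_rate: int = 24000,
                     streaming: bool = True) -> dict:
    """Return a mouth spec dict, rejecting values outside the documented enums."""
    if not model_id:
        # Mirrors the validation rule "Mouth model id required".
        raise ValueError("model_id is required")
    if provider not in PROVIDERS:
        raise ValueError(f"unknown provider: {provider}")
    if response_format not in RESPONSE_FORMATS:
        raise ValueError(f"unknown response_format: {response_format}")
    if sample_rate not in SAMPLE_RATES:
        raise ValueError(f"unsupported sample_rate: {sample_rate}")
    return {
        "provider": provider,
        "model_id": model_id,
        "response_format": response_format,
        "sample_rate": sample_rate,
        "streaming": streaming,
    }
```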
Add a voice, then speak
POST /api/{circle}/{lab}/{mouth}/ops/clone-voice

{
  "name": "Narrator",
  "sample_data_b64": "<base64 audio, max 10 MB>",
  "sample_content_type": "audio/webm",
  "gender": "neutral"
}

POST /api/{circle}/{lab}/{mouth}/ops/synthesize

{
  "text": "Welcome back. Your report is ready.",
  "voice_id": "<id returned by clone-voice>"
}
The response carries audio_data_b64 plus content_type, duration_ms, sample_rate, and cost_au.
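Turning that response into a playable file is a matter of base64-decoding the audio field. The sketch below assumes the response has already been parsed into a dict; `save_audio` is a hypothetical helper, and the mapping from content_type to file extension is a guess, since the exact content_type strings returned are not listed here.

```python
import base64
from pathlib import Path

# Assumed content_type values; the strings actually returned are not documented here.
EXTENSIONS = {"audio/wav": ".wav", "audio/mpeg": ".mp3", "audio/flac": ".flac"}


def save_audio(response: dict, stem: str) -> Path:
    """Decode audio_data_b64 from a synthesize response and write it to disk."""
    audio = base64.b64decode(response["audio_data_b64"])
    # Fall back to a neutral extension when the content type is not recognized.
    suffix = EXTENSIONS.get(response.get("content_type", ""), ".bin")
    path = Path(stem).with_suffix(suffix)
    path.write_bytes(audio)
    return path
```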
Common Mistakes
No model_id. A mouth without a model_id can’t synthesize (validation flags this). Set the exact model identifier the provider expects (e.g. voxtral-mini-tts-2603).
No voices yet. A fresh mouth has an empty voices[] array — add a voice via clone-voice before any agent can speak through it.
Omitting voice_id with multiple voices. When a mouth holds more than one voice, synthesize requires an explicit voice_id; omitting it returns MOUTH_VOICE_AMBIGUOUS. The omit-and-default shortcut only works when exactly one voice is configured.
Referencing an uncloned voice. If a voice entry exists locally but has no provider_voice_id (cloning provider), synthesis fails with MOUTH_VOICE_NOT_CLONED — re-run clone-voice for that voice.
Oversized samples. Inline sample_data_b64 is capped at 10 MB; larger samples return MOUTH_SAMPLE_TOO_LARGE. Use a sample_file_ref (files-element upload) for large clone sources.
Expecting raw PCM to play in the browser. wav is the default response_format because raw pcm is headerless and silently fails to decode in Web Audio. Only switch to pcm for server-side / RTP pipelines that know the sample rate and channel count out of band.
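If a pipeline does receive raw PCM and later needs something a browser can decode, the missing header can be added with Python's standard `wave` module. This is a sketch under assumptions: `pcm_to_wav` is a hypothetical helper, and the 16-bit mono layout is assumed, because the PCM sample width and channel count are not specified above and must be known out of band.

```python
import io
import wave


def pcm_to_wav(pcm: bytes, sample_rate: int = 24000, channels: int = 1,
               sample_width: int = 2) -> bytes:
    """Wrap headerless PCM samples in a WAV container."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)  # bytes per sample; 2 means 16-bit (assumed)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    # wave leaves a caller-supplied file object open, so the bytes are still readable.
    return buffer.getvalue()
```

The sample rate passed in has to match the mouth's `sample_rate` setting, otherwise playback runs at the wrong speed.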
Relationships
- Attaches to: rate-limit
Capabilities
- text-to-speech
- voice-cloning: Clone voices from audio samples (cloning providers only)
- voice-library: Hold multiple voices on one mouth
- streaming
Properties
| Property | Type | Default | Description |
|---|---|---|---|
| provider | string | "mistral" | TTS provider |
| voice_mode | string | "cloning" | How consumers (agents) pick a specific voice from this mouth. cloning: choose from this mouth’s voices[] array (Mistral Voxtral, ElevenLabs); preset: name a provider preset (OpenAI “alloy”, “echo”, …); instruction: free-text voice description. Default follows the provider (cloning for mistral/elevenlabs/custom, preset for openai/google) but can be overridden for custom mouths. |
| model_id | string | - | Model identifier sent to the provider API (e.g. voxtral-mini-tts-2603) |
| display_name | string | - | Human label for this mouth (e.g. “Primary voice bank”) |
| voices | array | [] | Voices this mouth can speak with. Managed via clone-voice / delete-voice ops. |
| credential_ref | string | - | Reference to a secret element holding the provider API key. Optional; falls back to the platform MISTRAL_API_KEY (or provider-equivalent) when unset. |
| response_format | string | "wav" | Audio encoding returned by synthesis. WAV is the default because the browser’s decodeAudioData can parse it reliably; raw PCM is headerless and silently fails to decode in Web Audio. Switch to PCM only for server-side / RTP pipelines that know the sample rate and channel count out of band. |
| sample_rate | integer | 24000 | Sample rate in Hz for PCM output. Ignored for compressed formats. |
| streaming | boolean | true | Stream audio chunks as they’re synthesized. Recommended for conversational UX. |
| conversation_pacing | object | - | Per-mouth turn-taking rhythm for telephony calls. Each silence threshold is how long the bridge waits after voice activity stops before declaring end-of-turn. Punctuation in the live STT transcript drives which threshold applies, so the agent can react fast on completed thoughts and stay patient on mid-sentence pauses. |
| pricing | object | - | Cost per million tokens (USD) for billing reference |
Operations
activity
Get /ops/activity | Auth: Read
Get activity events for this element
Scope depends on element capabilities: individual elements query by element_id, project-form elements with activity-scope-members include member activities, circle-level elements with activity-scope-all query the entire circle. Gracefully returns empty list if activities table is missing (old circles).
attachments
Get /ops/attachments | Auth: Read
List all modifiers and resources attached to this element
Returns both modifiers (policy enforcement) and resources (data injection) with is_modifier flag to distinguish. Items in the generated MODIFIER_TYPES list are modifiers; everything else is a resource. Includes cascade_policy and version pin info.
batch_stats
Get /ops/batch_stats | Auth: Read
Get per-element statistics for all children of this element
Returns per-child stats plus an aggregate. Most meaningful on compound or manifest form elements (repositories, circles, projects); atoms have no children so the result is an empty children array with a zeroed aggregate. Uses efficient GROUP BY SQL. Weighted averages for eval scores.
clone-voice
Post /ops/clone-voice | Auth: Write
Add a new voice to this mouth from a recording or uploaded file
Creates a new entry in this mouth’s voices array. Accepts either a files-element reference (upload path) or inline base64 audio (direct recording path, 10 MB cap). For cloning providers (Mistral Voxtral), the sample is uploaded to the provider and the returned voice_id is stored. For preset providers (OpenAI), set provider_voice_id directly without a sample. Returns the new voice’s id.
compose
Post /ops/compose | Auth: Execute
Batch add and remove modifiers on this element in a single call
Declarative composition: add modifiers by ref path (slug or path@version) and remove by attachment ID, all in one atomic call on the target element. Each ‘add’ entry resolves the source element, validates topology, and attaches with optional priority and cascade policy. Each ‘remove’ entry deletes the attachment row. Returns a summary of what was added and removed. Example: compose({ add: [{ref: "my-prompt"}, {ref: "rate-limit/api@v2", priority: 50}], remove: [{attachment_id: "uuid"}] })
context
Get /ops/context | Auth: Read
Get connected elements (graph traversal)
Graph traversal showing all connected elements with their relationship type (contains, contained_by, references, referenced_by, attaches, etc.). Use ?depth=N to control traversal depth (default 1) and ?types=actor,data to filter by element types.
create
Post /ops/create | Auth: Write
Create child element
POST to the parent path — element_type goes in the request body, NOT the URL. Both element_type and slug are required and must be non-empty. Name is derived from slug if omitted. Writes to both Git and PostgreSQL. All elements are stored flat under the circle — no intermediate library wrapper rows.
delete
Delete /ops/delete | Auth: Admin
Delete element (soft delete)
Soft delete — sets state to ‘deleted’ but retains the record. Cannot delete elements that have children (has_no_bond precondition) or active runs. Requires admin auth and confirmation.
delete-voice
Post /ops/delete-voice | Auth: Write
Remove a voice from this mouth
Removes the voice from the mouth’s array and asks the provider to delete it upstream (cloning providers). Agents still referencing this voice_id will fail to synthesize until reconfigured.
disable
Post /ops/disable | Auth: Admin
Disable element (hides and prevents use)
Idempotent — safe to call on already-disabled elements. Optionally pass a reason string. Disabled elements cannot be invoked or executed. Inverse of enable.
enable
Post /ops/enable | Auth: Admin
Enable element (makes usable and visible)
Idempotent — safe to call on already-enabled elements. Transitions element to ready/enabled state. Cannot enable deleted elements. Inverse of disable.
export_bundle
Get /ops/export/bundle | Auth: Read
Export element as downloadable git bundle
On non-root-namespace elements, returns a binary git bundle. On root-namespace (circle) elements, dispatch hands off to the circle’s own export_bundle op, which returns a multi-element JSON envelope with one base64 bundle per child element — this is intentional, not an error.
get
Get /ops/get | Auth: Read
Get element details
Element is already resolved by the routing layer — this returns the cached element, not a fresh DB query. Use the path /api/{circle}/{slug} to address elements.
import_bundle
Post /ops/import/bundle | Auth: Write
Import git bundle into element
Accepts a base64-encoded git bundle in the JSON bundle_base64 field. Use overwrite=true to replace existing elements with same slug (default skips duplicates). Imported elements get new UUIDs. Returns counts of imported/skipped elements and any errors.
info
Get /ops/info | Auth: Read
Get mouth metadata (provider, model, voices count)
Returns provider, model_id, number of voices, and defaults — used by the mouth picker UI.
intention
Get /ops/intention | Auth: Read
Get element intention with full inheritance chain
Returns three levels: direct (this element’s intention), inherited (from category and root), and resolved (final merged intention). Useful for understanding an element’s purpose in context of its hierarchy.
list-voices
Get /ops/list-voices | Auth: Read
List all voices configured on this mouth
Returns every voice in this mouth’s library with name, id, and metadata. Used by the agent voice picker.
promote
Post /ops/promote | Auth: Admin
Promote element configuration to a target environment
Only for manifest-form elements (projects). Environments advance: dev → demo → live. dev→demo requires member+ role, demo→live requires admin. Freezes member versions at promotion time (creates snapshot). Persists environment config to spec.environments.
readme
Get /ops/readme | Auth: Read
Get element README.md content
Reads README.md from the element’s git repository. Returns empty content (not an error) if no README exists. Always returns markdown format.
readme_update
Post /ops/readme_update | Auth: Write
Update element README.md content
Creates or overwrites README.md in the element’s git repo. Commits to the draft branch. Content must be provided as a markdown string.
remove-modifier
Post /ops/remove-modifier | Auth: Execute
Remove an attached modifier from this element by attachment ID
Removes a modifier/resource attachment by its row ID. The ID comes from the attachments or context API. This is the reverse of attach — called on the target element, not the source.
restore
Post /ops/restore | Auth: Admin
Restore element to a specific version
Automatically snapshots the current state before restoring (creates a ‘Before restore to vN’ version entry). Writes restored spec to git as .triform/spec.yaml. Git failures warn but don’t fail the operation — DB state is authoritative. Cannot restore deleted elements.
schema
Get /ops/schema | Auth: Read
Get element input/output schema (MCP tools/list compatible)
Returns type-level port schemas from the TypeRegistry — not instance-specific overrides. Includes direction (input/output), required flag, and JSON schema per port. Useful for understanding what data an element accepts and produces.
source
Get /ops/source | Auth: Read
Get any file’s content from the element’s git repository
Reads an arbitrary file from the element’s CAS-backed git tree by its relative path. Same store as readme, just generalized. Path safety: rejects .. traversal, leading /, and null bytes. Use this to view main.py for action elements, asset files for SPAs, etc. Returns empty content (not an error) if the file doesn’t exist.
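The three path-safety rules can be approximated client-side before calling the op. In this sketch, `is_safe_source_path` is a hypothetical helper, and matching `..` as a whole path segment (rather than as any substring) is an assumption about how the server applies the rule.

```python
def is_safe_source_path(path: str) -> bool:
    """Return True when a relative path passes the three documented checks."""
    if "\x00" in path:
        return False  # null bytes are rejected
    if path.startswith("/"):
        return False  # leading slash is rejected
    # Treat any ".." segment as traversal; whole-segment matching is an assumption.
    return ".." not in path.split("/")
```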
source_branches
Get /ops/source/branches | Auth: Read
List Source branches for this element
Returns the standard draft/demo/live Source branches, their current commits, and promotion relationships. Use GET /api/{element_path}/ops/source/branches.
source_promote
Post /ops/source/promote | Auth: Write
Promote Source branch forward
Promotes draft to demo or demo to live through the generated element op path. Direct Git pushes to demo/live are blocked by Source policy.
source_repair
Post /ops/source/repair | Auth: Write
Inspect or repair the element Source index
Runs Source repair through the element operation path. Defaults to dry_run=true; set dry_run=false only after reviewing a dry-run report.
source_status
Get /ops/source/status | Auth: Read
Get Source control status for this element
Returns the branch-aware clone URL, checkout commands, current draft commit, child source-link count, portable export summary, Source health, warnings, and auth hints for the addressed element. Use the element-first path: GET /api/{element_path}/ops/source/status.
source_validate
Post /ops/source/validate | Auth: Read
Validate Source branch contents
Validates a Source branch before accepting local Git workflow changes or promotion. Defaults to branch=draft and rejects runtime data, generated output, secret material, and unreadable CAS refs.
stats
Get /ops/stats | Auth: Read
Get aggregate statistics for this element
Health status is computed: error if errors_per_day > 5 or success_rate < 0.8, warning if errors_per_day > 0 or success_rate < 0.95. Firing alerts escalate health to error/warning. Default period is ‘day’. Returns runs_per_day, success_rate, avg_duration_ms, and more.
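The health thresholds quoted above reduce to two ordered checks. The sketch below leaves out the alert escalation step, and both the function name `health_status` and the `"healthy"` label for the passing case are assumptions, since only the error and warning conditions are stated.

```python
def health_status(errors_per_day: float, success_rate: float) -> str:
    """Classify element health from the two documented inputs."""
    if errors_per_day > 5 or success_rate < 0.8:
        return "error"
    if errors_per_day > 0 or success_rate < 0.95:
        return "warning"
    # Label for the passing case is assumed; the source names only error and warning.
    return "healthy"
```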
synthesize
Post /ops/synthesize | Auth: Execute
Speak a line of text with a chosen voice
Synthesize speech from text using one of this mouth’s voices. The voice_id must match an entry in the mouth’s voices array; if omitted and only one voice exists it’s used automatically, otherwise an error is returned. Use PCM for the lowest latency (~0.7s time-to-first-audio).
test
Post /ops/test | Auth: Execute
Synthesize a short test phrase with a chosen voice
Round-trips text → audio and returns latency + audio data. Defaults to the first voice if voice_id is omitted.
tree
Get /ops/tree | Auth: Read
Get the element’s position in the graph — ancestors, children, references, and subtree statistics
Uses per-circle ElementGraph cache for O(1) lookups. Returns ancestors (containment chain), children (direct), members (references), referenced_by (reverse refs), attachments, and subtree stats. Default depth is 3, max is 10. Pass ?include_metadata=true for name/state on each node.
update
Patch /ops/update | Auth: Write
Update element
Partial update: send only the fields you want to change. spec, name, and intention are all independently optional. spec MUST be a JSON object when present; it is deep-merged into the existing spec by default. An empty {"spec":{}} preserves existing spec content but still records a new version (no-op for content, not for version state). To clear or replace the entire spec wholesale, send {"spec":{...},"deep":false}. List-typed spec fields use replace semantics (the patch list replaces the existing list, no array merging). Coordinates Git + DB writes. Slug cannot be changed after creation.
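The default merge behavior described here (nested objects merged key by key, lists replaced outright) can be illustrated with a short recursive function. `deep_merge` is a hypothetical name, and this is only a model of the documented semantics, not the platform's implementation.

```python
import copy


def deep_merge(existing: dict, patch: dict) -> dict:
    """Merge patch into existing: dicts merge recursively, everything else is replaced."""
    merged = copy.deepcopy(existing)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            # Lists and scalars use replace semantics; no array merging.
            merged[key] = copy.deepcopy(value)
    return merged
```

Under this model, patching `conversation_pacing` with a single key leaves its other keys intact, while patching `voices` with a shorter list drops the entries that are not in the patch.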
update_meta
Patch /ops/update_meta | Auth: Write
Update element metadata (lightweight merge — does NOT bump version or snapshot spec)
Shallow JSONB merge into element.meta. Top-level keys in the provided value replace existing meta values; other keys are preserved. Used for UI metadata like canvas positions, panel state, viewer preferences. Wire-shape op_name is update_meta (distinct from update) so SSE subscribers + the cache auto-invalidator can distinguish lightweight metadata changes from spec edits without inspecting the payload. The MutatingElementStore wrapper stamps this op_name on the lifecycle event emitted by update_element_meta storage calls.
version
Get /ops/version | Auth: Read
Get current version or full history
Returns current version by default. Pass ?history=true for full version history (up to ?limit=N, default 50). Versions are backed by the element_versions table. Every spec update creates a new version entry.
Error Codes
| Code | Class | Retryable | Description |
|---|---|---|---|
| MOUTH_UNAVAILABLE | internal | yes | Mouth could not be reached |
| MOUTH_CREDENTIAL_MISSING | auth | no | No usable provider credential |
| MOUTH_VOICE_NOT_FOUND | validation | no | Requested voice_id isn’t in this mouth’s voices array |
| MOUTH_VOICE_AMBIGUOUS | validation | no | Mouth has multiple voices but no voice_id specified |
| MOUTH_VOICE_NOT_CLONED | validation | no | Voice exists locally but provider_voice_id is missing; re-run clone-voice |
| MOUTH_SAMPLE_UNUSABLE | validation | no | Provided sample could not be used |
| MOUTH_SAMPLE_TOO_LARGE | validation | no | Sample exceeds 10 MB cap |
| MOUTH_TEXT_TOO_LONG | validation | no | Input text exceeds the synthesis limit |
| MOUTH_SYNTHESIS_FAILED | internal | yes | Synthesis failed downstream |
Observability
Defined for this element
Metrics
- mouth_synthesize_total
- mouth_synthesize_latency_ms
- mouth_synthesize_audio_ms
Pricing / cost
Inherited from intelligence
Operation costs
- invoke: 10000 micro-AU
Set it up
- Name (string): A label for this mouth (e.g. "Team voices", "Primary bank")
- Provider (string): TTS provider
- Model (string): Model ID (e.g. voxtral-mini-tts-2603)
- Audio format (string)