
Mouth

The voice an agent speaks with — a text-to-speech model paired with a library of voices, living inside an Intelligence Lab so the lab's provider connection turns written agent output into spoken audio.

Working with it

Selecting a Mouth reveals its settings in the properties panel; it has no dedicated full-screen workbench.

How it appears

The same element type rendered as a definition, a circle instance, and a live workspace card.

Mo (type)

Mouth

A text-to-speech voice within an Intelligence Lab

intelligence · atom · definition

When to use / not

When to use

  • Giving an agent or automation a spoken voice — turning generated text into audio a caller or listener actually hears.
  • Driving the speaking side of a phone or voice conversation, where conversation_pacing tunes turn-taking against the telephony bridge.
  • Building and holding a library of cloned, preset, or instruction-based voices that agents select per line by (mouth, voice_id).

When not to use

  • Turning incoming speech into text — that is the input half; use the sibling ears element for transcription.
  • Plain text generation or chat completion with no audio — that is the brain element inside the same lab.
  • Storing the provider endpoint and credentials yourself — a mouth nests in a lab and inherits the lab's connection; create the lab first.

Topology

Lives nested inside a parent element rather than standing alone — it is created in the context of its container.

Properties

provider (string)
TTS provider
voice_mode (string)
How consumers (agents) pick a specific voice from this mouth:
  • cloning: choose from this mouth's voices[] array (Mistral Voxtral, ElevenLabs)
  • preset: name a provider preset (OpenAI "alloy", "echo", …)
  • instruction: free-text voice description
Default follows the provider (cloning for mistral/elevenlabs/custom, preset for openai/google) but can be overridden for custom mouths.
model_id (string)
Model identifier sent to the provider API (e.g. voxtral-mini-tts-2603)
display_name (string)
Human label for this mouth (e.g. "Primary voice bank")
credential_ref (string)
Reference to secret element with provider API key. Optional; falls back to platform MISTRAL_API_KEY (or provider-equivalent) when unset.
response_format (string)
Audio encoding returned by synthesis. WAV is the default because that's what the browser's decodeAudioData can parse reliably; raw PCM is headerless and silently fails to decode in Web Audio. Switch to PCM only for server-side / RTP pipelines that know the sample rate and channel count out of band.
sample_rate (integer)
Sample rate in Hz for PCM output. Ignored for compressed formats.
streaming (boolean)
Stream audio chunks as they're synthesized. Recommended for conversational UX.
conversation_pacing (object)
Per-mouth turn-taking rhythm for telephony calls. Each silence threshold is how long the bridge waits after voice activity stops before declaring end-of-turn. Punctuation in the live STT transcript drives which threshold applies, so the agent can react fast on completed thoughts and stay patient on mid-sentence pauses.
pricing (object)
Cost per million tokens (USD) for billing reference

Capabilities

Defined for this element
  • Observe

Operations

  • activity (GET)
  • attachments (GET)
  • batch_stats (GET)
  • clone-voice (POST)
  • compose (POST)
  • context (GET)
  • create (POST)
  • delete (DELETE)
  • delete-voice (POST)
  • disable (POST)
  • enable (POST)
  • export_bundle (GET)
  • get (GET)
  • import_bundle (POST)
  • info (GET)
  • intention (GET)
  • list-voices (GET)
  • promote (POST)
  • readme (GET)
  • readme_update (POST)
  • remove-modifier (POST)
  • restore (POST)
  • schema (GET)
  • source (GET)
  • source_branches (GET)
  • source_promote (POST)
  • source_repair (POST)
  • source_status (GET)
  • source_validate (POST)
  • stats (GET)
  • synthesize (POST)
  • test (POST)
  • tree (GET)
  • update (PATCH)
  • update_meta (PATCH)
  • version (GET)

Ports

Inputs

  • request (request)

Outputs

  • info (request)
  • result (event)

Composition

Attaches
Referenced by

Validation rules

  • Mouth model id required

Mouth (mouth)

Category: intelligence | Form: atom | Symbol: Mo

A text-to-speech voice within an Intelligence Lab

A Mouth pairs a specific TTS model with the voices usable through it (e.g. Mistral Voxtral speaking in a cloned voice). Voice data (sample audio, provider-returned voice_id) lives on the mouth itself because it’s model-specific. To give an agent a different voice, add another voice to the mouth or create another mouth element. Agents select a mouth by ID; the runtime finds the parent lab and handles synthesis.

Guide

A text-to-speech voice within an Intelligence Lab — a TTS model bound to a library of voices that agents speak through

What It Does

A Mouth is the output / synthesis half of an Intelligence Lab. It represents a specific TTS model (e.g. Mistral Voxtral) plus a library of voices usable through that model. Where a sibling ears element turns speech into text (input / transcription), a mouth turns an agent’s text output into spoken audio. An agent selects a mouth by ID; the runtime resolves which lab the mouth belongs to and uses that lab’s connection details to synthesize speech.

One mouth can hold many voices — cloned, preset, or instruction-based depending on the provider. Voice data (sample audio, the provider-returned provider_voice_id) lives on the mouth itself because it is model-specific and cannot be reused across providers. Agents reference a specific voice via the pair (mouth_slug, voice_id). To give an agent a different voice, add another voice to the mouth (or create another mouth element).

Mouths are atoms with residence: nested — they live inside a Lab element, which supplies the API endpoint and credentials, and sit alongside brain (chat completions) and ears (transcription) under the same lab. Because cloned voice data is biometric-adjacent, mouth visibility is restricted to collaborators (allowed_visibility: [collaborator]).

Element Definition

Property | Value
Type | mouth
Category | intelligence
Form | atom
Residence | nested
Symbol | Mo / #F97316
Activity type | resource
Streaming | supported (supports_streaming: true)
Visibility | collaborator only

Properties

Field | Type | Default | Description
provider | string (enum: mistral, openai, elevenlabs, google, custom) | mistral | TTS provider
voice_mode | string (enum: cloning, preset, instruction) | cloning | How agents pick a voice: choose from voices[] (cloning), name a provider preset (preset), or free-text description (instruction)
model_id | string | | Model identifier sent to the provider API (e.g. voxtral-mini-tts-2603)
display_name | string | | Human label for this mouth (e.g. “Primary voice bank”)
voices | array | [] | Voices this mouth can speak with. Managed via the clone-voice / delete-voice ops
credential_ref | string | | Reference to a secret element holding the provider API key. Optional; falls back to a platform key (e.g. MISTRAL_API_KEY) when unset
response_format | string (enum: pcm, mp3, opus, wav, flac) | wav | Audio encoding returned by synthesis. WAV is the default because the browser’s decodeAudioData can parse it reliably
sample_rate | integer (enum: 8000, 16000, 22050, 24000, 44100, 48000) | 24000 | Sample rate in Hz for PCM output. Ignored for compressed formats
streaming | boolean | true | Stream audio chunks as they’re synthesized
conversation_pacing | object | | Per-mouth turn-taking rhythm consumed by the telephony bridge (silence thresholds + speculative generation)
pricing | object | | input_per_mtok / output_per_mtok cost reference (USD per million tokens)

Each entry in voices[] carries: id (required, stable UUID referenced by agents), name (required), provider_voice_id, sample_file_ref, sample_content_hash, sample_content_type, instruction, languages, gender, tags, speed (0.5–2.0, default 1.0 — a post-synthesis playback multiplier, not sent upstream), and created_at.

The conversation_pacing object holds silence_after_period_ms (default 1500), silence_after_question_ms (default 1000), silence_default_ms (default 2500), speculative_generation (default true), and speculative_prefill_words (0–10, default 3).
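How punctuation might select one of these thresholds can be sketched as follows. This is an illustration only, not the telephony bridge's actual logic: the helper name `pick_silence_threshold_ms` and the rule that the last character of the trimmed transcript decides are assumptions. Only the field names and default values come from the description above.

```python
# Defaults taken from the conversation_pacing description above.
DEFAULT_PACING = {
    "silence_after_period_ms": 1500,
    "silence_after_question_ms": 1000,
    "silence_default_ms": 2500,
}


def pick_silence_threshold_ms(transcript, pacing=None):
    """Hypothetical helper: choose the end-of-turn wait from trailing punctuation."""
    cfg = {**DEFAULT_PACING, **(pacing or {})}
    text = transcript.rstrip()
    if text.endswith("?"):
        return cfg["silence_after_question_ms"]
    if text.endswith("."):
        return cfg["silence_after_period_ms"]
    # No sentence-final punctuation: assume the speaker is mid-sentence.
    return cfg["silence_default_ms"]
```

With the defaults, a transcript ending in a question mark waits 1000 ms, one ending in a period waits 1500 ms, and anything else waits 2500 ms.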

States

draft, cloning, active, error. Initial state is draft.

Capabilities

  • text-to-speech
  • voice-cloning — clone voices from audio samples (cloning providers only)
  • voice-library — hold multiple voices on one mouth
  • streaming

Ports

Direction | Port | Schema | Description
Input | request | SpeechRequest | Text + voice selection to synthesize
Output | info | MouthInfo | Mouth metadata (required)
Output | result | SpeechResponse | Synthesized audio (event port)

Modifiers

Attaches rate-limit.

Error Codes

Code | Class | Retryable | Meaning
MOUTH_UNAVAILABLE | internal | yes | Mouth could not be reached
MOUTH_CREDENTIAL_MISSING | auth | no | No usable provider credential
MOUTH_VOICE_NOT_FOUND | validation | no | Requested voice_id isn’t in this mouth’s voices array
MOUTH_VOICE_AMBIGUOUS | validation | no | Mouth has multiple voices but no voice_id was specified
MOUTH_VOICE_NOT_CLONED | validation | no | Voice exists locally but provider_voice_id is missing; re-run clone-voice
MOUTH_SAMPLE_UNUSABLE | validation | no | Provided sample could not be used
MOUTH_SAMPLE_TOO_LARGE | validation | no | Sample exceeds the 10 MB cap
MOUTH_TEXT_TOO_LONG | validation | no | Input text exceeds the synthesis limit
MOUTH_SYNTHESIS_FAILED | internal | yes | Synthesis failed downstream

Operations

info

GET info · auth: read

Returns mouth metadata — provider, model_id, display_name, voice_count, default_voice_id, streaming, response_format, sample_rate. Used by the mouth picker UI.

list-voices

GET list-voices · auth: read

Lists every voice configured on this mouth, each with id, name, provider_voice_id, languages, gender, tags, and created_at. Used by the agent voice picker.

synthesize

POST synthesize · auth: execute

Speaks a line of text with a chosen voice. Input requires text; optional voice_id (must match an entry in the mouth’s voices array — when omitted and only one voice exists, that voice is used automatically, otherwise an error is returned) and response_format (overrides the mouth default; one of pcm, mp3, opus, wav, flac). Returns audio_data_b64, content_type, duration_ms, sample_rate, voice_id, and cost_au.
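The voice-selection rule can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the runtime's code: `MouthError` and `resolve_voice` are hypothetical names, and the source does not say which code is returned when voice_id is omitted on a mouth with no voices, so the sketch assumes MOUTH_VOICE_NOT_FOUND for that case.

```python
class MouthError(Exception):
    """Hypothetical exception carrying one of the documented error codes."""

    def __init__(self, code):
        super().__init__(code)
        self.code = code


def resolve_voice(voices, voice_id=None, cloning_provider=True):
    """Pick the voice entry to synthesize with, or raise a MouthError."""
    if voice_id is None:
        if len(voices) == 1:
            # Exactly one voice: it is used automatically.
            voice = voices[0]
        elif len(voices) > 1:
            raise MouthError("MOUTH_VOICE_AMBIGUOUS")
        else:
            # Assumption: the code for an empty voice library is not documented.
            raise MouthError("MOUTH_VOICE_NOT_FOUND")
    else:
        voice = next((v for v in voices if v["id"] == voice_id), None)
        if voice is None:
            raise MouthError("MOUTH_VOICE_NOT_FOUND")
    # Cloning providers need the provider-side id before they can speak.
    if cloning_provider and not voice.get("provider_voice_id"):
        raise MouthError("MOUTH_VOICE_NOT_CLONED")
    return voice
```

Checking this on the client before calling synthesize avoids a round trip for the three validation errors, which are all marked non-retryable.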

clone-voice

POST clone-voice · auth: write

Adds a new voice to this mouth’s voices array. Input requires name; accepts either a files-element reference (sample_file_ref) or inline base64 audio (sample_data_b64 with sample_content_type, 10 MB cap). For cloning providers (Mistral Voxtral), the sample is uploaded to the provider and the returned voice id is stored; for preset providers (OpenAI), set provider_voice_id to the preset name directly without a sample. Also accepts instruction, languages, gender, and tags. Returns the new voice_id, provider_voice_id, and name.
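Building the inline-sample request body might look like the sketch below. The helper name `build_clone_voice_body` is hypothetical, and the sketch assumes the 10 MB cap is measured on the raw audio bytes in binary megabytes; the source states only "10 MB".

```python
import base64

# Assumption: the cap applies to raw bytes and 1 MB means 1024 * 1024 bytes.
MAX_INLINE_SAMPLE_BYTES = 10 * 1024 * 1024


def build_clone_voice_body(name, audio, content_type, **extra):
    """Hypothetical helper: build the JSON body for an inline-sample clone-voice call."""
    if len(audio) > MAX_INLINE_SAMPLE_BYTES:
        raise ValueError("sample too large for inline upload; use sample_file_ref instead")
    body = {
        "name": name,
        "sample_data_b64": base64.b64encode(audio).decode("ascii"),
        "sample_content_type": content_type,
    }
    # Optional fields named in the op description: instruction, languages, gender, tags.
    body.update(extra)
    return body
```

Rejecting an oversized sample locally saves an upload that would otherwise come back as MOUTH_SAMPLE_TOO_LARGE.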

delete-voice

POST delete-voice · auth: write

Removes a voice (by voice_id) from the mouth’s array and asks the provider to delete it upstream for cloning providers. Returns deleted and voice_id. Agents still referencing the removed voice_id will fail to synthesize until reconfigured.

test

POST test · auth: execute

Synthesizes a short test phrase with a chosen voice (defaults text to “Hello, this is a test of the text to speech connection.” and falls back to the first voice if voice_id is omitted). Returns success, latency_ms, duration_ms, audio_data_b64, voice_id, and error — useful for verifying connectivity before assigning the mouth to an agent.

Quick Start

Create a mouth (inside a lab)

POST /api/{circle}/{lab-element}/
Content-Type: application/json

{
  "element_type": "mouth",
  "slug": "support-voice",
  "name": "Support Voice",
  "spec": {
    "provider": "mistral",
    "voice_mode": "cloning",
    "model_id": "voxtral-mini-tts-2603",
    "display_name": "Primary voice bank",
    "response_format": "wav",
    "sample_rate": 24000,
    "streaming": true
  }
}

Add a voice, then speak

POST /api/{circle}/{lab}/{mouth}/ops/clone-voice
{
  "name": "Narrator",
  "sample_data_b64": "<base64 audio, max 10 MB>",
  "sample_content_type": "audio/webm",
  "gender": "neutral"
}

POST /api/{circle}/{lab}/{mouth}/ops/synthesize
{
  "text": "Welcome back. Your report is ready.",
  "voice_id": "<id returned by clone-voice>"
}

The response carries audio_data_b64 plus content_type, duration_ms, sample_rate, and cost_au.
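Turning that response into playable audio is a base64 decode. The sketch below assumes the response has already been parsed into a Python dict; the helper names are hypothetical and the HTTP call itself is left out.

```python
import base64


def decode_synthesize_response(response):
    """Return (audio_bytes, content_type) from a parsed synthesize response dict."""
    audio = base64.b64decode(response["audio_data_b64"])
    return audio, response["content_type"]


def save_audio(response, path):
    """Write the decoded audio to disk and return the number of bytes written."""
    audio, _ = decode_synthesize_response(response)
    with open(path, "wb") as fh:
        return fh.write(audio)
```

The content_type field tells the caller which container was returned, so the file extension can be chosen from it rather than guessed.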

Common Mistakes

No model_id. A mouth without a model_id can’t synthesize (validation flags this). Set the exact model identifier the provider expects (e.g. voxtral-mini-tts-2603).

No voices yet. A fresh mouth has an empty voices[] array — add a voice via clone-voice before any agent can speak through it.

Omitting voice_id with multiple voices. When a mouth holds more than one voice, synthesize requires an explicit voice_id; omitting it returns MOUTH_VOICE_AMBIGUOUS. The omit-and-default shortcut only works when exactly one voice is configured.

Referencing an uncloned voice. If a voice entry exists locally but has no provider_voice_id (cloning provider), synthesis fails with MOUTH_VOICE_NOT_CLONED — re-run clone-voice for that voice.

Oversized samples. Inline sample_data_b64 is capped at 10 MB; larger samples return MOUTH_SAMPLE_TOO_LARGE. Use a sample_file_ref (files-element upload) for large clone sources.

Expecting raw PCM to play in the browser. wav is the default response_format because raw pcm is headerless and silently fails to decode in Web Audio. Only switch to pcm for server-side / RTP pipelines that know the sample rate and channel count out of band.
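When a pipeline already holds raw PCM and needs something a header-dependent decoder can read, the missing header can be added with Python's standard wave module. This is a sketch under assumptions: 16-bit mono samples are assumed because the source specifies only the default sample rate (24000 Hz), and `wrap_pcm_as_wav` is a hypothetical helper name.

```python
import io
import wave


def wrap_pcm_as_wav(pcm, sample_rate=24000, channels=1, sample_width=2):
    """Prepend a WAV header to raw PCM so header-dependent decoders can read it.

    Assumptions: 16-bit (2-byte) little-endian samples and mono audio; the
    source only states the default sample rate of 24000 Hz.
    """
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()
```

The three setter calls are exactly the out-of-band facts a PCM consumer must know in advance, which is why wav is the safer default for browsers.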

Relationships

  • Attaches to: rate-limit

Capabilities

  • text-to-speech
  • voice-cloning: Clone voices from audio samples (cloning providers only)
  • voice-library: Hold multiple voices on one mouth
  • streaming

Properties

Property | Type | Default | Description
provider | string | "mistral" | TTS provider
voice_mode | string | "cloning" | How consumers (agents) pick a specific voice from this mouth. cloning: choose from this mouth’s voices[] array (Mistral Voxtral, ElevenLabs); preset: name a provider preset (OpenAI “alloy”, “echo”, …); instruction: free-text voice description. Default follows the provider (cloning for mistral/elevenlabs/custom, preset for openai/google) but can be overridden for custom mouths.
model_id | string | | Model identifier sent to the provider API (e.g. voxtral-mini-tts-2603)
display_name | string | | Human label for this mouth (e.g. “Primary voice bank”)
voices | array | [] | Voices this mouth can speak with. Managed via clone-voice / delete-voice ops.
credential_ref | string | | Reference to secret element with provider API key. Optional; falls back to platform MISTRAL_API_KEY (or provider-equivalent) when unset.
response_format | string | "wav" | Audio encoding returned by synthesis. WAV is the default because that’s what the browser’s decodeAudioData can parse reliably; raw PCM is headerless and silently fails to decode in Web Audio. Switch to PCM only for server-side / RTP pipelines that know the sample rate and channel count out of band.
sample_rate | integer | 24000 | Sample rate in Hz for PCM output. Ignored for compressed formats.
streaming | boolean | true | Stream audio chunks as they’re synthesized. Recommended for conversational UX.
conversation_pacing | object | | Per-mouth turn-taking rhythm for telephony calls. Each silence threshold is how long the bridge waits after voice activity stops before declaring end-of-turn. Punctuation in the live STT transcript drives which threshold applies, so the agent can react fast on completed thoughts and stay patient on mid-sentence pauses.
pricing | object | | Cost per million tokens (USD) for billing reference

Operations

activity

Get /ops/activity | Auth: Read

Get activity events for this element

Scope depends on element capabilities: individual elements query by element_id, project-form elements with activity-scope-members include member activities, circle-level elements with activity-scope-all query the entire circle. Gracefully returns empty list if activities table is missing (old circles).

attachments

Get /ops/attachments | Auth: Read

List all modifiers and resources attached to this element

Returns both modifiers (policy enforcement) and resources (data injection) with is_modifier flag to distinguish. Items in the generated MODIFIER_TYPES list are modifiers; everything else is a resource. Includes cascade_policy and version pin info.

batch_stats

Get /ops/batch_stats | Auth: Read

Get per-element statistics for all children of this element

Returns per-child stats plus an aggregate. Most meaningful on compound or manifest form elements (repositories, circles, projects); atoms have no children so the result is an empty children array with a zeroed aggregate. Uses efficient GROUP BY SQL. Weighted averages for eval scores.

clone-voice

Post /ops/clone-voice | Auth: Write

Add a new voice to this mouth from a recording or uploaded file

Creates a new entry in this mouth’s voices array. Accepts either a files-element reference (upload path) or inline base64 audio (direct recording path, 10 MB cap). For cloning providers (Mistral Voxtral), the sample is uploaded to the provider and the returned voice_id is stored. For preset providers (OpenAI), set provider_voice_id directly without a sample. Returns the new voice’s id.

compose

Post /ops/compose | Auth: Execute

Batch add and remove modifiers on this element in a single call

Declarative composition: add modifiers by ref path (slug or path@version) and remove by attachment ID, all in one atomic call on the target element. Each ‘add’ entry resolves the source element, validates topology, attaches with optional priority and cascade policy. Each ‘remove’ entry deletes the attachment row. Returns a summary of what was added and removed. Example: compose({ add: [{ref: "my-prompt"}, {ref: "rate-limit/api@v2", priority: 50}], remove: [{attachment_id: "uuid"}] })

context

Get /ops/context | Auth: Read

Get connected elements (graph traversal)

Graph traversal showing all connected elements with their relationship type (contains, contained_by, references, referenced_by, attaches, etc.). Use ?depth=N to control traversal depth (default 1) and ?types=actor,data to filter by element types.

create

Post /ops/create | Auth: Write

Create child element

POST to the parent path — element_type goes in the request body, NOT the URL. Both element_type and slug are required and must be non-empty. Name is derived from slug if omitted. Writes to both Git and PostgreSQL. All elements are stored flat under the circle — no intermediate library wrapper rows.

delete

Delete /ops/delete | Auth: Admin

Delete element (soft delete)

Soft delete — sets state to ‘deleted’ but retains the record. Cannot delete elements that have children (has_no_bond precondition) or active runs. Requires admin auth and confirmation.

delete-voice

Post /ops/delete-voice | Auth: Write

Remove a voice from this mouth

Removes the voice from the mouth’s array and asks the provider to delete it upstream (cloning providers). Agents still referencing this voice_id will fail to synthesize until reconfigured.

disable

Post /ops/disable | Auth: Admin

Disable element (hides and prevents use)

Idempotent — safe to call on already-disabled elements. Optionally pass a reason string. Disabled elements cannot be invoked or executed. Inverse of enable.

enable

Post /ops/enable | Auth: Admin

Enable element (makes usable and visible)

Idempotent — safe to call on already-enabled elements. Transitions element to ready/enabled state. Cannot enable deleted elements. Inverse of disable.

export_bundle

Get /ops/export/bundle | Auth: Read

Export element as downloadable git bundle

On non-root-namespace elements, returns a binary git bundle. On root-namespace (circle) elements, dispatch hands off to the circle’s own export_bundle op, which returns a multi-element JSON envelope with one base64 bundle per child element — this is intentional, not an error.

get

Get /ops/get | Auth: Read

Get element details

Element is already resolved by the routing layer — this returns the cached element, not a fresh DB query. Use the path /api/{circle}/{slug} to address elements.

import_bundle

Post /ops/import/bundle | Auth: Write

Import git bundle into element

Accepts a base64-encoded git bundle in the JSON bundle_base64 field. Use overwrite=true to replace existing elements with same slug (default skips duplicates). Imported elements get new UUIDs. Returns counts of imported/skipped elements and any errors.

info

Get /ops/info | Auth: Read

Get mouth metadata (provider, model, voices count)

Returns provider, model_id, number of voices, and defaults — used by the mouth picker UI.

intention

Get /ops/intention | Auth: Read

Get element intention with full inheritance chain

Returns three levels: direct (this element’s intention), inherited (from category and root), and resolved (final merged intention). Useful for understanding an element’s purpose in context of its hierarchy.

list-voices

Get /ops/list-voices | Auth: Read

List all voices configured on this mouth

Returns every voice in this mouth’s library with name, id, and metadata. Used by the agent voice picker.

promote

Post /ops/promote | Auth: Admin

Promote element configuration to a target environment

Only for manifest-form elements (projects). Environments advance: dev → demo → live. dev→demo requires member+ role, demo→live requires admin. Freezes member versions at promotion time (creates snapshot). Persists environment config to spec.environments.

readme

Get /ops/readme | Auth: Read

Get element README.md content

Reads README.md from the element’s git repository. Returns empty content (not an error) if no README exists. Always returns markdown format.

readme_update

Post /ops/readme_update | Auth: Write

Update element README.md content

Creates or overwrites README.md in the element’s git repo. Commits to the draft branch. Content must be provided as a markdown string.

remove-modifier

Post /ops/remove-modifier | Auth: Execute

Remove an attached modifier from this element by attachment ID

Removes a modifier/resource attachment by its row ID. The ID comes from the attachments or context API. This is the reverse of attach — called on the target element, not the source.

restore

Post /ops/restore | Auth: Admin

Restore element to a specific version

Automatically snapshots the current state before restoring (creates a ‘Before restore to vN’ version entry). Writes restored spec to git as .triform/spec.yaml. Git failures warn but don’t fail the operation — DB state is authoritative. Cannot restore deleted elements.

schema

Get /ops/schema | Auth: Read

Get element input/output schema (MCP tools/list compatible)

Returns type-level port schemas from the TypeRegistry — not instance-specific overrides. Includes direction (input/output), required flag, and JSON schema per port. Useful for understanding what data an element accepts and produces.

source

Get /ops/source | Auth: Read

Get any file’s content from the element’s git repository

Reads an arbitrary file from the element’s CAS-backed git tree by its relative path. Same store as readme, just generalized. Path safety: rejects .. traversal, leading /, and null bytes. Use this to view main.py for action elements, asset files for SPAs, etc. Returns empty content (not an error) if the file doesn’t exist.
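The three path-safety rules can be expressed as a small client-side check. This is a sketch, not the server's validator: `is_safe_source_path` is a hypothetical name, and treating ".." as unsafe only when it forms a whole path segment is an assumption about how "traversal" is interpreted.

```python
def is_safe_source_path(path):
    """Hypothetical check mirroring the three rejection rules listed above."""
    # Rule: no null bytes anywhere in the path.
    if "\x00" in path:
        return False
    # Rule: no leading slash (paths are relative to the element's tree).
    if path.startswith("/"):
        return False
    # Rule: no ".." traversal. Assumption: only whole segments count.
    return ".." not in path.split("/")
```

Under this reading, `main.py` and `assets/logo.svg` pass, while `../secrets`, `a/../../b`, and `/etc/passwd` are rejected.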

source_branches

Get /ops/source/branches | Auth: Read

List Source branches for this element

Returns the standard draft/demo/live Source branches, their current commits, and promotion relationships. Use GET /api/{element_path}/ops/source/branches.

source_promote

Post /ops/source/promote | Auth: Write

Promote Source branch forward

Promotes draft to demo or demo to live through the generated element op path. Direct Git pushes to demo/live are blocked by Source policy.

source_repair

Post /ops/source/repair | Auth: Write

Inspect or repair the element Source index

Runs Source repair through the element operation path. Defaults to dry_run=true; set dry_run=false only after reviewing a dry-run report.

source_status

Get /ops/source/status | Auth: Read

Get Source control status for this element

Returns the branch-aware clone URL, checkout commands, current draft commit, child source-link count, portable export summary, Source health, warnings, and auth hints for the addressed element. Use the element-first path: GET /api/{element_path}/ops/source/status.

source_validate

Post /ops/source/validate | Auth: Read

Validate Source branch contents

Validates a Source branch before accepting local Git workflow changes or promotion. Defaults to branch=draft and rejects runtime data, generated output, secret material, and unreadable CAS refs.

stats

Get /ops/stats | Auth: Read

Get aggregate statistics for this element

Health status is computed: error if errors_per_day > 5 or success_rate < 0.8, warning if errors_per_day > 0 or success_rate < 0.95. Firing alerts escalate health to error/warning. Default period is ‘day’. Returns runs_per_day, success_rate, avg_duration_ms, and more.
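The threshold rules read as a two-step classification, sketched below. The function name is hypothetical, alert escalation is not modelled, and the label returned when neither rule fires ("ok") is an assumption because the source does not name it.

```python
def compute_health(errors_per_day, success_rate):
    """Classify health from the two documented thresholds (alerts not modelled)."""
    # The stricter rule is checked first so it wins when both apply.
    if errors_per_day > 5 or success_rate < 0.8:
        return "error"
    if errors_per_day > 0 or success_rate < 0.95:
        return "warning"
    # Assumption: the label for the remaining case is not named in the source.
    return "ok"
```

Note the boundaries are strict: exactly 5 errors per day is a warning rather than an error, and a success rate of exactly 0.95 with no errors raises no warning.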

synthesize

Post /ops/synthesize | Auth: Execute

Speak a line of text with a chosen voice

Synthesize speech from text using one of this mouth’s voices. The voice_id must match an entry in the mouth’s voices array; if omitted and only one voice exists it’s used automatically, otherwise an error is returned. Use PCM for the lowest latency (~0.7s time-to-first-audio).

test

Post /ops/test | Auth: Execute

Synthesize a short test phrase with a chosen voice

Round-trips text → audio and returns latency + audio data. Defaults to the first voice if voice_id is omitted.

tree

Get /ops/tree | Auth: Read

Get the element’s position in the graph — ancestors, children, references, and subtree statistics

Uses per-circle ElementGraph cache for O(1) lookups. Returns ancestors (containment chain), children (direct), members (references), referenced_by (reverse refs), attachments, and subtree stats. Default depth is 3, max is 10. Pass ?include_metadata=true for name/state on each node.

update

Patch /ops/update | Auth: Write

Update element

Partial update — send only the fields you want to change. spec, name, and intention are all independently optional. spec MUST be a JSON object when present; deep-merged into the existing spec by default. Empty {"spec":{}} preserves existing spec content but still records a new version (no-op for content, not for version state). To clear/replace the entire spec wholesale send {"spec":{...},"deep":false}. List-typed spec fields use replace semantics (the patch list replaces the existing list, no array merging). Coordinates Git + DB writes. Slug cannot be changed after creation.
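The merge semantics can be illustrated with a small sketch. It models only what the description states (recursive merge of objects, replace semantics for lists, wholesale replacement when deep is false); the function names are hypothetical, and versioning and the Git + DB writes are not modelled.

```python
import copy


def deep_merge_spec(existing, patch):
    """Sketch of the default merge: dicts merge recursively, everything else replaces."""
    merged = copy.deepcopy(existing)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge_spec(merged[key], value)
        else:
            # Lists and scalars use replace semantics: no array merging.
            merged[key] = copy.deepcopy(value)
    return merged


def apply_update(existing, patch, deep=True):
    """With deep=False the patch replaces the spec wholesale."""
    return deep_merge_spec(existing, patch) if deep else copy.deepcopy(patch)
```

For example, patching `{"conversation_pacing": {"silence_default_ms": 2000}}` changes that one threshold and leaves the other pacing fields in place, whereas the same patch with deep set to false would drop every other spec field.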

update_meta

Patch /ops/update_meta | Auth: Write

Update element metadata (lightweight merge — does NOT bump version or snapshot spec)

Shallow JSONB merge into element.meta. Top-level keys in the provided value replace existing meta values; other keys are preserved. Used for UI metadata like canvas positions, panel state, viewer preferences. Wire-shape op_name is update_meta (distinct from update) so SSE subscribers + the cache auto-invalidator can distinguish lightweight metadata changes from spec edits without inspecting the payload. The MutatingElementStore wrapper stamps this op_name on the lifecycle event emitted by update_element_meta storage calls.

version

Get /ops/version | Auth: Read

Get current version or full history

Returns current version by default. Pass ?history=true for full version history (up to ?limit=N, default 50). Versions are backed by the element_versions table. Every spec update creates a new version entry.

Error Codes

Code | Class | Retryable | Description
MOUTH_UNAVAILABLE | internal | yes | Mouth could not be reached
MOUTH_CREDENTIAL_MISSING | auth | no | No usable provider credential
MOUTH_VOICE_NOT_FOUND | validation | no | Requested voice_id isn’t in this mouth’s voices array
MOUTH_VOICE_AMBIGUOUS | validation | no | Mouth has multiple voices but no voice_id specified
MOUTH_VOICE_NOT_CLONED | validation | no | Voice exists locally but provider_voice_id is missing; re-run clone-voice
MOUTH_SAMPLE_UNUSABLE | validation | no | Provided sample could not be used
MOUTH_SAMPLE_TOO_LARGE | validation | no | Sample exceeds 10 MB cap
MOUTH_TEXT_TOO_LONG | validation | no | Input text exceeds the synthesis limit
MOUTH_SYNTHESIS_FAILED | internal | yes | Synthesis failed downstream

Observability

Defined for this element

Metrics

  • mouth_synthesize_total
  • mouth_synthesize_latency_ms
  • mouth_synthesize_audio_ms

Pricing / cost

Inherited from intelligence

Operation costs

  • invoke: 10000 micro-AU

Set it up

Name (string)
A label for this mouth (e.g. "Team voices", "Primary bank")
Provider (string)
TTS provider
Model (string)
Model ID (e.g. voxtral-mini-tts-2603)
Audio format (string)