Experimental Feature: This feature is currently in testing, so stability is not guaranteed.
How It Works
- Enable Thought Detection by sending a JSON start message on the WebSocket that includes detect_thoughts: true. You can also tune end_thought_eagerness and force_complete_time here.
- As you stream audio, the server transcribes it internally but does not immediately send back a transcript after every pause. Instead, it buffers these transcripts and keeps them as one longer string.
- Only when the model determines a thought is complete does the server send a single message of type complete_thought containing the full text of that idea (see the sketch after this list).
- Use end_thought_eagerness to make the detector more or less willing to close a thought, and force_complete_time to set a drop-dead timeout (in seconds) that forces emission of the current buffered thought after silence. This is helpful if a finished turn is incorrectly identified as an unfinished turn.
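A minimal sketch of consuming that message, assuming the third-party websockets package, a placeholder URL, and an assumed envelope in which the thought text arrives in a text field; only the complete_thought type is taken from this page:

```python
# Minimal receive loop for Thought Detection (sketch, not the shipped sample).
import asyncio
import json

import websockets

WS_URL = "wss://example.com/v1/listen"  # hypothetical endpoint


async def main() -> None:
    async with websockets.connect(WS_URL) as ws:
        # The start message enabling the feature would be sent here first.
        async for raw in ws:
            msg = json.loads(raw)
            if msg.get("type") == "complete_thought":
                # One message per finished idea, not one per pause.
                print("THOUGHT:", msg.get("text"))  # "text" field name is assumed


if __name__ == "__main__":
    asyncio.run(main())
```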
Enabling the Feature
Start Message Fields
These fields belong in the initial start message you send after opening the WebSocket.
Tuning When a Thought Ends
end_thought_eagerness (string, default: “medium”)
Controls how aggressively the model closes a thought.
Allowed values: “low”, “medium”, “high”.
force_complete_time (number, default: 2.0)
A drop-dead timer in seconds. If the model has not marked the current thought complete and the user has gone silent, the server will force-emit the buffered text once this many seconds have elapsed.
Range: 1.0–60.0 seconds. Values outside this range are rejected.
Note: You still enable the feature with detect_thoughts: true. These extra fields only tune when the thought ends.
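For illustration, a start message with these fields might look like the sketch below; detect_thoughts, end_thought_eagerness, and force_complete_time are the documented fields, while the "type": "start" envelope is an assumption:

```python
import json

# Sketch of the first frame sent after the WebSocket opens.
start_message = {
    "type": "start",                  # envelope field name is an assumption
    "detect_thoughts": True,          # enables Thought Detection
    "end_thought_eagerness": "high",  # "low" | "medium" | "high" (default "medium")
    "force_complete_time": 5.0,       # forced-emit timeout in seconds (1.0-60.0)
}

payload = json.dumps(start_message)
# e.g. await ws.send(payload) once the connection is open
```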
Getting Started
Install the SDK mic addon
Python SDK Example (mic_ws_continuous_thought_detection_sdk.py)
Code Samples
A Full Example (mic_ws_thought_detection.py)
This client script includes a simple ENABLE_THOUGHT_DETECTION flag. When set to true, it automatically adjusts the WebSocket URL and the VAD settings, then listens for the specific complete_thought message from the server.
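A condensed sketch of that flag-driven pattern follows. It is not the shipped mic_ws_thought_detection.py: the endpoint URL, message envelope, and non-documented field names are assumptions, and it relies on the third-party websockets and sounddevice packages for transport and microphone capture.

```python
import asyncio
import json

import sounddevice as sd
import websockets

ENABLE_THOUGHT_DETECTION = True  # flip to fall back to plain pause-based transcripts

BASE_URL = "wss://example.com/v1/listen"          # hypothetical endpoint
THOUGHT_URL = BASE_URL + "?detect_thoughts=true"  # hypothetical URL variant
SAMPLE_RATE = 16_000
BLOCK_SIZE = 1_600  # 100 ms of 16 kHz mono int16 audio


def build_start_message() -> str:
    msg = {"type": "start", "sample_rate": SAMPLE_RATE}  # field names assumed
    if ENABLE_THOUGHT_DETECTION:
        msg.update({
            "detect_thoughts": True,
            "end_thought_eagerness": "medium",
            "force_complete_time": 5.0,
            # The full sample also adjusts the VAD settings here; those field
            # names are not documented in this section, so they are omitted.
        })
    return json.dumps(msg)


async def stream_microphone(ws) -> None:
    """Forward raw microphone blocks to the server as binary frames."""
    loop = asyncio.get_running_loop()
    audio_q: asyncio.Queue = asyncio.Queue()

    def on_audio(indata, frames, time_info, status) -> None:
        loop.call_soon_threadsafe(audio_q.put_nowait, bytes(indata))

    with sd.RawInputStream(samplerate=SAMPLE_RATE, blocksize=BLOCK_SIZE,
                           channels=1, dtype="int16", callback=on_audio):
        while True:
            await ws.send(await audio_q.get())


async def print_transcripts(ws) -> None:
    """Branch on message type: one complete thought vs. pause-based pieces."""
    async for raw in ws:
        msg = json.loads(raw)
        if msg.get("type") == "complete_thought":
            print("[thought]", msg.get("text"))    # "text" field name is assumed
        elif msg.get("type") == "transcript":      # assumed non-thought type
            print("[partial]", msg.get("text"))


async def main() -> None:
    url = THOUGHT_URL if ENABLE_THOUGHT_DETECTION else BASE_URL
    async with websockets.connect(url) as ws:
        await ws.send(build_start_message())
        await asyncio.gather(stream_microphone(ws), print_transcripts(ws))


if __name__ == "__main__":
    asyncio.run(main())
```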
Example Interaction
Here is how the experience differs when speaking the same sentence: “I was thinking about the quarterly report… and it seems like the numbers for Q3 are a bit lower than we expected.”
Without Thought Detection
The application receives multiple, fragmented transcripts based purely on pauses.
Received Transcripts:
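The exact split depends on where the speaker pauses; one illustrative breakdown of the sentence above might be:
- “I was thinking about the quarterly report…”
- “and it seems like the numbers for Q3”
- “are a bit lower than we expected.”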
With Thought Detection
The application receives a single, semantically complete transcript after the user finishes their entire point.
Received Transcript:
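- “I was thinking about the quarterly report… and it seems like the numbers for Q3 are a bit lower than we expected.”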