Experimental Feature: This feature is currently in testing, so stability is not guaranteed.
How It Works
- Enable Thought Detection by adding a simple query parameter to your WebSocket URL.
- As you stream audio, the server transcribes it internally but does not immediately send back a transcript after every pause. Instead, it buffers these transcripts and concatenates them into one longer string.
- Only when the model determines a thought is complete does the server send a single message of type `complete_thought` containing the full text of that idea (see the sketch after this list).
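For illustration, a minimal handler for such a message might look like the sketch below. The field names (`type`, `text`) are assumptions, since the exact wire format is not reproduced on this page.

```python
# Minimal sketch of handling a complete-thought message. The field
# names ("type", "text") are assumptions about the wire format.
import json

def handle_message(raw: str) -> None:
    msg = json.loads(raw)
    if msg.get("type") == "complete_thought":
        # One message per finished idea, carrying the full buffered text.
        print("Complete thought:", msg["text"])
```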
Enabling the Feature
Activation is controlled by a single URL parameter. Set this to `true` in the WebSocket connection URL to enable server-side thought detection.
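A minimal sketch of the connection URL, assuming a placeholder endpoint and parameter name; substitute the actual values from your API reference.

```python
# Placeholder endpoint and parameter name; use the values from your
# API reference.
BASE_URL = "wss://api.example.com/v1/listen"
WS_URL = f"{BASE_URL}?thought_detection=true"  # hypothetical parameter
```

Install the SDK mic addon.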
Python SDK Example
An SDK Example (mic_ws_continuous_thought_detection_sdk.py)
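The linked script itself is not reproduced on this page. As a rough stand-in, the sketch below streams microphone audio over a WebSocket using the `websockets` and `sounddevice` libraries and prints each `complete_thought` message; the endpoint, query parameter, audio format, and message fields are all assumptions.

```python
# Stand-in sketch, not the linked SDK example: stream raw microphone
# audio to the server and print complete thoughts as they arrive.
# Endpoint, query parameter, audio format, and message fields are
# placeholders.
import asyncio
import json
import queue

import sounddevice as sd
import websockets

URL = "wss://api.example.com/v1/listen?thought_detection=true"  # hypothetical
SAMPLE_RATE = 16_000  # assumed 16 kHz, 16-bit mono PCM

async def main() -> None:
    audio_q: queue.Queue = queue.Queue()

    def on_audio(indata, frames, time_info, status) -> None:
        # Runs on the PortAudio thread; hand raw bytes to asyncio.
        audio_q.put(bytes(indata))

    async with websockets.connect(URL) as ws:

        async def send_audio() -> None:
            loop = asyncio.get_running_loop()
            while True:
                chunk = await loop.run_in_executor(None, audio_q.get)
                await ws.send(chunk)

        async def receive_thoughts() -> None:
            async for raw in ws:
                msg = json.loads(raw)
                if msg.get("type") == "complete_thought":
                    print("Thought:", msg["text"])

        with sd.RawInputStream(
            samplerate=SAMPLE_RATE,
            blocksize=SAMPLE_RATE // 10,  # ~100 ms chunks
            channels=1,
            dtype="int16",
            callback=on_audio,
        ):
            await asyncio.gather(send_audio(), receive_thoughts())

asyncio.run(main())
```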
Code Samples
This client script includes a simple `ENABLE_THOUGHT_DETECTION` flag. When set to `True`, it automatically adjusts the WebSocket URL and the VAD settings, then listens for the specific `complete_thought` message from the server.
A Full Example (mic_ws_thought_detection.py)
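A condensed sketch of that flag logic, assuming placeholder names for the endpoint, query parameter, and VAD fields; the script's actual identifiers may differ.

```python
# Condensed sketch of the flag logic; all names are placeholders.
ENABLE_THOUGHT_DETECTION = True

BASE_URL = "wss://api.example.com/v1/listen"  # placeholder endpoint

if ENABLE_THOUGHT_DETECTION:
    ws_url = f"{BASE_URL}?thought_detection=true"  # hypothetical parameter
    # One plausible VAD adjustment: tolerate longer pauses locally and
    # let the server decide when a thought is complete.
    vad_settings = {"min_silence_duration_ms": 2000}
else:
    ws_url = BASE_URL
    vad_settings = {"min_silence_duration_ms": 500}
```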
Example Interaction
Here is how the experience differs when speaking the same sentence: “I was thinking about the quarterly report… and it seems like the numbers for Q3 are a bit lower than we expected.”
Without Thought Detection
The application receives multiple, fragmented transcripts based purely on pauses.
Received Transcripts:
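```
"I was thinking about the quarterly report"
"and it seems like the numbers for Q3 are a bit lower than we expected."
```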
With Thought Detection
The application receives a single, semantically complete transcript after the user finishes their entire point.
Received Transcript:
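```
"I was thinking about the quarterly report and it seems like the numbers for Q3 are a bit lower than we expected."
```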