Diarization is the process of partitioning an audio stream into segments according to speaker identity. In simple terms, it answers the question, “Who spoke when?” Fennec’s diarization not only separates speakers but can also use AI to assign specific names to them, transforming a raw transcript into a structured, easy-to-read dialogue.

How It Works

Use these parameters in your /transcribe request. Supplying known speaker names helps the model map voices to real people.
  • diarize (boolean): Set this to true to enable speaker diarization. The transcript will be returned with speaker labels (e.g., [SPEAKER_00], [SPEAKER_01]).
  • speaker_recognition_context (string): Provide a short sentence listing the speakers (e.g., “The two speakers are Marv Esserman and the host, Ally Holt”). When diarize is enabled, the AI uses this text plus voice cues to replace generic labels like [SPEAKER_00] with the actual names.
Formatting & Performance: Enabling diarization increases processing time and cost. The formatting parameter is ignored when diarize is true because diarization dictates the output format.
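The parameter rules above can be sketched as a small helper that assembles the request parameters before they are sent. This is only an illustration: build_transcribe_params is a hypothetical name, not part of the Fennec SDK, and the type of the formatting value is not documented here, so it is treated as opaque. Rejecting a context string when diarize is off is a choice made in this sketch; the service itself may simply ignore it.

```python
from typing import Any, Dict, Optional


def build_transcribe_params(
    diarize: bool = False,
    speaker_recognition_context: Optional[str] = None,
    formatting: Any = None,
) -> Dict[str, Any]:
    """Assemble /transcribe parameters (hypothetical helper, not an SDK function)."""
    params: Dict[str, Any] = {"diarize": diarize}

    if speaker_recognition_context:
        if not diarize:
            # The context is only used when diarize is enabled, so this
            # sketch flags the mismatch instead of sending it silently.
            raise ValueError("speaker_recognition_context requires diarize=True")
        params["speaker_recognition_context"] = speaker_recognition_context.strip()

    # formatting is ignored when diarize is true, so send it only otherwise.
    if formatting is not None and not diarize:
        params["formatting"] = formatting

    return params
```

How the resulting dictionary is encoded in the request (JSON body, form fields, or SDK keyword arguments) depends on the client you use and is not assumed here.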

How to Use It

Adding diarize and (optionally) speaker_recognition_context is all you need.

Python SDK Example

quickstart_sdk.py
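from fennec_asr import FennecASRClient

# Replace the placeholder with your own API key.
asr_client = FennecASRClient(api_key="YOUR_API_KEY")

# Enable diarization and name the speakers so the labels use real names.
transcription = asr_client.transcribe_file(
    file_path="interview_session.mp3",
    diarize=True,
    speaker_recognition_context="The two speakers are Marv Esserman, the guest, and the host, Ally Holt."
)

# Print the diarized transcript.
print(transcription)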

Example Result

Without diarization, a conversation is a wall of text. With diarization and speaker context, it becomes a readable script.

Before Diarization

Transcript:
What's up, everybody, and welcome to the podcast. Today, we've got a great show, because we'll be sitting down with Marv Esserman, the mastermind behind the new hit single Decorations. Marv, first off, congrats on the success of Decorations. That track's been everywhere lately. What inspired it? Thanks, Allie. It actually came from a pretty raw place. I was staring at these old holiday lights I never took down, and it just hit me. They were like metaphors for all the stuff I never let go of. So the song kind of wrote itself after that. That's wild. And the bridge in the song? Pure emotion? You layer this subtle synth hum beneath the vocals that feels like... like a memory almost. Yeah, that hum is a field recording of my childhood homes heater. I wanted the song to feel like nostalgia, not just talk about it. That's art. You've got listeners crying in their cars and they don't even know why. What can we expect next? I'm going darker next. More stripped down, less perfection, more feeling. It's about chasing the ghosts in your own house, you know. You heard it here first, folks? Marv is just getting started. Until next time, stay curious and keep listening.

After Diarization (diarize=true)

Transcript:
[SPEAKER_00]: What's up, everybody, and welcome to the podcast. Today, we've got a great show, because we'll be sitting down with Marv Esserman, the mastermind behind the new hit single Decorations. Marv, first off, congrats on the success of Decorations. That track's been everywhere lately. What inspired it?
[SPEAKER_01]: Thanks, Allie. It actually came from a pretty raw place. I was staring at these old holiday lights I never took down, and it just hit me. They were like metaphors for all the stuff I never let go of. So the song kind of wrote itself after that.
[SPEAKER_00]: That's wild. And the bridge in the song? Pure emotion? You layer this subtle synth hum beneath the vocals that feels like... like a memory almost.
[SPEAKER_01]: Yeah, that hum is a field recording of my childhood homes heater. I wanted the song to feel like nostalgia, not just talk about it.
[SPEAKER_00]: That's art. You've got listeners crying in their cars and they don't even know why. What can we expect next?
[SPEAKER_01]: I'm going darker next. More stripped down, less perfection, more feeling. It's about chasing the ghosts in your own house, you know.
[SPEAKER_00]: You heard it here first, folks? Marv is just getting started. Until next time, stay curious and keep listening.

After Diarization with Speaker Context

Context Provided: The two speakers are Marv Esserman, the guest, and the host, Ally Holt.

Transcript:
[Ally Holt]: What's up, everybody, and welcome to the podcast. Today, we've got a great show, because we'll be sitting down with Marv Esserman, the mastermind behind the new hit single Decorations. Marv, first off, congrats on the success of Decorations. That track's been everywhere lately. What inspired it?
[Marv Esserman]: Thanks, Allie. It actually came from a pretty raw place. I was staring at these old holiday lights I never took down, and it just hit me. They were like metaphors for all the stuff I never let go of. So the song kind of wrote itself after that.
[Ally Holt]: That's wild. And the bridge in the song? Pure emotion? You layer this subtle synth hum beneath the vocals that feels like... like a memory almost.
[Marv Esserman]: Yeah, that hum is a field recording of my childhood homes heater. I wanted the song to feel like nostalgia, not just talk about it.
[Ally Holt]: That's art. You've got listeners crying in their cars and they don't even know why. What can we expect next?
[Marv Esserman]: I'm going darker next. More stripped down, less perfection, more feeling. It's about chasing the ghosts in your own house, you know.
[Ally Holt]: You heard it here first, folks? Marv is just getting started. Until next time, stay curious and keep listening.
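If you need the dialogue as structured data and not only as text, the labeled lines shown above can be split into (speaker, text) pairs. The following is a minimal sketch, assuming the transcript is available as a plain string in exactly the "[LABEL]: text" layout of these examples; the actual return type of transcribe_file is not documented here, and parse_turns is a hypothetical helper, not an SDK function.

```python
import re
from typing import List, Tuple

# Matches "[LABEL]: text" at the start of a line,
# e.g. "[SPEAKER_00]: Hi" or "[Ally Holt]: Hi".
TURN_PATTERN = re.compile(r"^\[(?P<speaker>[^\]]+)\]:\s*(?P<text>.*)$")


def parse_turns(transcript: str) -> List[Tuple[str, str]]:
    """Split a diarized transcript string into (speaker, text) turns."""
    turns: List[Tuple[str, str]] = []
    for raw_line in transcript.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        match = TURN_PATTERN.match(line)
        if match:
            turns.append((match.group("speaker"), match.group("text")))
        elif turns:
            # Unlabeled line: treat it as a continuation of the previous turn.
            speaker, text = turns[-1]
            turns[-1] = (speaker, f"{text} {line}")
        # Unlabeled lines before the first label are skipped.
    return turns
```

The same pattern handles both generic labels such as [SPEAKER_00] and named labels such as [Ally Holt], so it works with or without speaker context.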

Tips for Writing Effective Speaker Context

  • Be specific: Provide full names and roles if possible. For example: “The interviewer is Dr. Anya Sharma, and the patient's name is Ben Carter.”
  • List all speakers: Name every known speaker to give the AI the best chance of correctly identifying everyone.
  • Clarity is key: The AI uses this text to make an intelligent assignment. The clearer and more descriptive your context, the more accurate the final named speaker labels will be.
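When speaker names come from your own data (a calendar invite or a guest list, for example), the tips above can be applied programmatically. The sketch below builds a context sentence from name and role pairs. build_speaker_context is a hypothetical helper, and the sentence wording is only one reasonable phrasing, not a format required by the API.

```python
from typing import List, Optional, Tuple


def build_speaker_context(speakers: List[Tuple[str, Optional[str]]]) -> str:
    """Build a context sentence from (name, role) pairs; role may be None."""
    if not speakers:
        raise ValueError("list at least one speaker")

    # Include the role when it is known: "Ally Holt, the host".
    parts = [f"{name}, the {role}" if role else name for name, role in speakers]

    if len(parts) == 1:
        return f"The speaker is {parts[0]}."

    # Semicolons keep each "name, the role" pair unambiguous in the list.
    listing = "; ".join(parts[:-1]) + "; and " + parts[-1]
    return f"The {len(parts)} speakers are {listing}."
```

For example, build_speaker_context([("Marv Esserman", "guest"), ("Ally Holt", "host")]) returns "The 2 speakers are Marv Esserman, the guest; and Ally Holt, the host.", which can be passed as speaker_recognition_context.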