Speaker Identification

Audio files very often contain more than one speaker. In such cases an STT (speech-to-text) API alone is not enough: you want to transcribe each speaker's speech separately, and you want to know in which parts of the audio each speaker is present. This is critical when transcribing phone calls. Whether you are working with call center calls, recorded meetings, or interviews, speaker-based transcription is a must.
That is why we developed a language-agnostic speaker identification service that does everything described above.
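To make the idea of speaker-based transcription concrete, here is a minimal sketch of how speaker segments might be combined with word-level STT output. It does not show this service's API: the `Segment` and `Word` types, the `assign_speakers` function, and the `speaker_0`-style labels are all hypothetical, and the sketch assumes you already have speaker segments and word timestamps in seconds.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """A stretch of audio attributed to one speaker (hypothetical shape)."""
    speaker: str
    start: float  # seconds
    end: float    # seconds


@dataclass
class Word:
    """A transcribed word with timestamps (hypothetical shape)."""
    text: str
    start: float  # seconds
    end: float    # seconds


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the overlap between two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(words, segments):
    """Attach each word to the speaker segment it overlaps most,
    then merge consecutive words from the same speaker into turns."""
    turns = []
    for word in words:
        best_speaker, best_overlap = "unknown", 0.0
        for seg in segments:
            shared = overlap(word.start, word.end, seg.start, seg.end)
            if shared > best_overlap:
                best_speaker, best_overlap = seg.speaker, shared
        if turns and turns[-1][0] == best_speaker:
            turns[-1][1].append(word.text)
        else:
            turns.append((best_speaker, [word.text]))
    return [(speaker, " ".join(texts)) for speaker, texts in turns]
```

Words that fall outside every segment are labeled `unknown` rather than guessed, which keeps gaps in the speaker segments visible in the final transcript.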


  • Out-of-the-box Models: No need to train a model. Our pre-trained models can be used off-the-shelf to identify speakers in audio files.

  • No need to specify the number of speakers: In most cases you won't know how many speakers there are in an audio file. Our service handles such cases seamlessly.

  • Language-agnostic: Our speaker identification service does not depend on the language being spoken, so it works for audio in any language.

Upcoming Features

  • Speaker Profiling: By uploading just 30 seconds of audio for each speaker, you will be able to enable speaker identification based on voice signatures (biometrics). Think of this as a voice fingerprint.


Meeting Transcription

If you record all your meetings and want to archive them so that you can search them later, or you want to send automated meeting notes to all attendees, you can use this speaker identification service. Furthermore, you can use the per-speaker transcripts to programmatically create action items for each meeting attendee. You can also combine speaker identification with Language Understanding to classify what each speaker said into categories.
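As a rough illustration of creating action items from per-speaker transcripts, the sketch below scans each speaker turn for a few commitment phrases. It is a deliberately naive keyword heuristic, not the Language Understanding service mentioned above; the `ACTION_CUES` list, the `action_items_by_speaker` function, and the `(speaker, text)` input shape are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical cue phrases that often signal a commitment; tune for your data.
ACTION_CUES = ("i will", "i'll", "let me")


def action_items_by_speaker(turns):
    """Group turns that look like commitments under the speaker who said them.

    `turns` is a list of (speaker, text) pairs, e.g. the output of a
    speaker-based transcription step.
    """
    items = defaultdict(list)
    for speaker, text in turns:
        lowered = text.lower()
        if any(cue in lowered for cue in ACTION_CUES):
            items[speaker].append(text)
    return dict(items)


if __name__ == "__main__":
    meeting = [
        ("speaker_0", "I'll send the report on Friday."),
        ("speaker_1", "Sounds good."),
    ]
    for speaker, tasks in action_items_by_speaker(meeting).items():
        print(speaker, tasks)
```

In practice a trained intent classifier would likely replace the keyword match, but the grouping by speaker would stay the same.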

Phone Call Transcription

Phone call transcription is especially important for call centers, where thousands of hours of calls are analysed and reported on every day. Manual transcription and analysis are practically impossible to scale. Speaker identification can help you accelerate this process.

Interview Transcription

Many market surveys and much research are conducted by interviewing people. Speaker identification can be used to transcribe these interviews automatically.


