Identify Speakers in an Audio File

Here we will show you how to use our speaker identification API.

Prerequisites

Make sure to follow Getting Started to log in. If you are using the APIs, save your authorization token in a variable called AUTHORIZATION_TOKEN before moving ahead.

Identify Speakers

STEP 1 - Upload a file

To identify speakers in an audio file, you first have to upload your file and get the corresponding file ID. Follow the steps on this page to do so.

STEP 2 - Create a speaker identification task

Call this API to start the speaker identification process:

curl --location --request POST 'https://platform.neuralspace.ai/api/speaker-identification/v1/indentify/speakers' \
--header 'Authorization: <YOUR-API-KEY>' \
--header 'Content-Type: application/json' \
--data-raw '{
"fileId": "<YOUR-FILE-ID>"
}'

This will return a taskId, which you can use to check the status of this task later.

Here is a sample response:

{
  "success": true,
  "message": "Audio has been queued for speaker identification. Check task status using speaker identification status API by providing the taskId",
  "data": {
    "taskId": "<YOUR-SPEAKER-IDENTIFICATION-TASK-ID>"
  }
}
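In code, you would keep the taskId from this response for the status check in the next step. For example, in Python (the message text is abbreviated here; only the data.taskId field matters for the next call):

```python
import json

# Parsed from the create-task response shown above.
response = json.loads("""
{
  "success": true,
  "message": "Audio has been queued for speaker identification.",
  "data": {"taskId": "<YOUR-SPEAKER-IDENTIFICATION-TASK-ID>"}
}
""")

# Keep this id around: the status endpoint takes it as a query parameter.
task_id = response["data"]["taskId"]
```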

STEP 3 - Fetch speaker identification results

curl --location --request GET 'https://platform.neuralspace.ai/api/speaker-identification/v1/task/status?taskId=<YOUR-SPEAKER-IDENTIFICATION-TASK-ID>' \
--header 'Authorization: <YOUR-API-KEY>'

This returns an object like the following, showing the current speaker identification status along with some related metadata.

{
  "success": true,
  "message": "Data fetched successfully",
  "data": {
    "apikey": "...",
    "taskId": "<YOUR-SPEAKER-IDENTIFICATION-TASK-ID>",
    "fileId": "<YOUR-FILE-ID>",
    "jobStatus": "Completed",
    "jobProgress": [
      "Queued",
      "Identification Job Started",
      "Diarizing Audio",
      "Audio Diarized",
      "Speakers identified",
      "Completed"
    ],
    "fileDuration": 2115,
    "results": [
      {
        "start": 0.4,
        "stop": 11,
        "user": "2"
      },
      {
        "start": 11.5,
        "stop": 13.2,
        "user": "1"
      }
    ],
    "message": "Speakers have been identified in audio file successfully",
    "jobTime": 1,
    "createdAt": "2022-10-31T08:57:33.547Z"
  }
}
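Because the job runs asynchronously, a client typically polls this status endpoint until jobStatus reaches Completed (or Failed). Here is a minimal Python sketch of such a polling loop. The function name and parameters are ours, not part of any NeuralSpace SDK; the HTTP call itself is abstracted behind a fetch_status callable that is assumed to return the parsed JSON response shown above:

```python
import time

def wait_for_identification(fetch_status, task_id, poll_interval=5, timeout=600):
    """Poll fetch_status(task_id) until the job finishes or the timeout elapses.

    fetch_status is assumed to return the parsed status response shown
    above, i.e. a dict whose data.jobStatus is "Queued", "Completed",
    "Failed", etc.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        response = fetch_status(task_id)
        status = response["data"]["jobStatus"]
        if status == "Completed":
            return response["data"]["results"]  # the speaker segments
        if status == "Failed":
            raise RuntimeError(response["data"].get("message", "job failed"))
        time.sleep(poll_interval)
    raise TimeoutError(f"task {task_id} did not finish within {timeout}s")
```

Injecting the HTTP call this way keeps the retry logic testable without hitting the network; in production, fetch_status would issue the GET request from the curl example above.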

Speaker Identification Status Codes

Since files are processed asynchronously, every step is logged in the jobProgress attribute. All of the status messages are shown in the example above, but if a speaker identification job fails, a Failed status is set instead of Completed. These are all the status messages:

  • Queued
  • Identification Job Started
  • Diarizing Audio
  • Audio Diarized
  • Speakers identified
  • Completed

What are timestamps?

When jobStatus becomes Completed, you will see a list of objects in the results field. These objects represent chunks of audio belonging to a certain speaker, along with their respective start and end times in the audio file. user is an identifier given to the speakers in the audio file. The first speaker is assigned the id 0, the second speaker 1, and so on.
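As a quick illustration of how these timestamps can be used, the following Python sketch totals the speaking time per user from the results list (the helper name is ours, not part of the API):

```python
from collections import defaultdict

def talk_time_per_speaker(results):
    """Sum (stop - start) per user id over the results list."""
    totals = defaultdict(float)
    for segment in results:
        totals[segment["user"]] += segment["stop"] - segment["start"]
    return dict(totals)

# The two segments from the sample response above.
segments = [
    {"start": 0.4, "stop": 11, "user": "2"},
    {"start": 11.5, "stop": 13.2, "user": "1"},
]
# user "2" spoke for about 10.6 s in total, user "1" for about 1.7 s
```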

Get Segments For Transcription

If you have used the same audio file for speaker identification that you used for transcription, you can merge the results by calling the following API. It takes the speaker segments from the speaker identification API and the transcripts from the transcription API and merges them for you, so you get what each speaker has said.

E.g., if an audio file contains 3 speakers, speaker identification will determine which parts of the audio file belong to each of the three speakers, while the transcription API extracts what was spoken in the audio file as text. Combining the two tells you what each of the three speakers said, and when, in the audio file.
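The get_segments API does this merge for you, but conceptually the combination looks like the following Python sketch, which assigns each transcript chunk to the speaker segment it overlaps most. The data shapes are assumptions based on the examples on this page, not the exact format of the segments file:

```python
def overlap(a_start, a_stop, b_start, b_stop):
    """Length of the intersection of two time intervals, in seconds."""
    return max(0.0, min(a_stop, b_stop) - max(a_start, b_start))

def merge_segments(speaker_segments, transcript_segments):
    """Attach a speaker id to each transcript chunk.

    speaker_segments: [{"start", "stop", "user"}, ...] as returned by the
        speaker identification status API.
    transcript_segments: [{"start", "stop", "text"}, ...] -- an assumed
        shape for timed transcription output.
    """
    merged = []
    for t in transcript_segments:
        best = max(
            speaker_segments,
            key=lambda s: overlap(t["start"], t["stop"], s["start"], s["stop"]),
        )
        merged.append({**t, "user": best["user"]})
    return merged
```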

curl --location --request POST 'https://platform.neuralspace.ai/api/transcription/v1/get_segments' \
--header 'Authorization: <YOUR-API-KEY>' \
--header 'Content-Type: application/json' \
--data-raw '{
"transcribeId": "<YOUR-TRANSCRIBE-ID>",
"speakerIdentificationTaskId": "<YOUR-SPEAKER-IDENTIFICATION-TASK-ID>"
}'

In the response you get a segmentsFileId, which points to a file containing all the speaker segments, text, and start and end timestamps.

{
  "success": true,
  "message": "Data fetched successfully",
  "data": {
    "segmentsFileId": "<YOUR-SEGMENTS-FILE-ID>"
  }
}

For this API to give a success response, both transcribeId and speakerIdentificationTaskId have to be valid and in Completed status.

You can fetch the segments file using the segmentsFileId by following the instructions here.