
Which request parameters are correct for my audio?

The correct request parameters for your audio depend on a few different factors, including:

  • Audio format and encoding

  • Transcription features (like punctuation and numerals)

  • Business use case

  • Desired speaker organization (and number of channels)

No single set of parameters fits every case, but the recommendations below can help you choose request parameters that suit your audio and your use case.

Audio Format and Encoding

For batch transcription, the audio format (and related encoding) can be inferred from the audio headers in the file itself, so it does not need to be directly specified in your query parameters.

For streaming transcription, you only need to specify the audio format when you send raw, headerless audio packets to the streaming service. In that case, you need to know which codec the audio uses, and you specify it with the encoding query parameter.
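As a rough illustration of the streaming case, the sketch below adds format parameters to a WebSocket URL only when the audio is raw. The endpoint URL, the sample_rate parameter, and the linear16 value are assumptions made for this example; the text above only names the encoding parameter, so check the API reference for the values that match your audio.

```python
from urllib.parse import urlencode

# Assumed streaming endpoint, used here for illustration only.
STREAMING_ENDPOINT = "wss://api.deepgram.com/v1/listen"


def build_streaming_url(raw_encoding=None, sample_rate=None):
    """Return a streaming URL, adding format parameters only for raw audio."""
    params = {}
    if raw_encoding is not None:
        # Headerless audio: the service cannot infer the codec, so name it.
        params["encoding"] = raw_encoding
        if sample_rate is not None:
            # Assumption: raw audio usually also needs its sample rate stated.
            params["sample_rate"] = sample_rate
    query = urlencode(params)
    return f"{STREAMING_ENDPOINT}?{query}" if query else STREAMING_ENDPOINT


# Containerized audio (headers present): no format parameters needed.
containerized_url = build_streaming_url()

# Raw 16 kHz PCM packets: the codec is stated explicitly.
raw_url = build_streaming_url("linear16", 16000)
```

Keeping the format parameters optional mirrors the rule above: they are left off unless the stream has no headers to inspect.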

For reference, Deepgram’s documentation provides a complete list of audio content types as well as the supported audio formats.

Transcription Features

If you want to use a specific set of the transcription or post-processing features that we offer, you can append them to your HTTP or WebSocket request URL as query parameters.

For example, to enable punctuation and numerals on a request with multichannel audio while searching for the word “yes”, you would add query parameters such as punctuate=true, numerals=true, multichannel=true, and search=yes.
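A minimal sketch of the example above builds the request URL from a dictionary of features. The base URL is an assumption for illustration, and the parameter names (punctuate, numerals, multichannel, search) should be confirmed against the API reference before use.

```python
from urllib.parse import urlencode

# Assumed pre-recorded transcription endpoint, for illustration only.
BASE_URL = "https://api.deepgram.com/v1/listen"

# One entry per feature described in the example above.
features = {
    "punctuate": "true",     # add punctuation to the transcript
    "numerals": "true",      # format spoken numbers as digits
    "multichannel": "true",  # transcribe each channel separately
    "search": "yes",         # look for the word "yes" in the audio
}

# urlencode joins the pairs with "&" and escapes any special characters.
request_url = f"{BASE_URL}?{urlencode(features)}"
print(request_url)
```

Building the query string from a dictionary keeps each feature on its own line, which makes it easier to toggle features while testing.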

Business Use Case

Your business use case can affect the query parameters you choose in several ways. One of the most significant is the model that you select to transcribe your audio.

At Deepgram, we develop our models based on use case. While the general model may provide good results for a call center use case, consider testing with the phonecall model to see whether your results improve.
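One way to run that comparison is to send the same audio with each candidate model and review the transcripts side by side. The sketch below only builds the request URLs; the base URL, the model query parameter name, and the helper urls_for_models are assumptions introduced for this example.

```python
from urllib.parse import urlencode

# Assumed endpoint, for illustration only.
BASE_URL = "https://api.deepgram.com/v1/listen"


def urls_for_models(models, **shared_params):
    """Map each model name to a request URL that shares all other parameters."""
    return {
        model: f"{BASE_URL}?{urlencode({'model': model, **shared_params})}"
        for model in models
    }


# Keep every other parameter identical so only the model differs.
candidates = urls_for_models(["general", "phonecall"], punctuate="true")

for model, url in candidates.items():
    print(model, url)
```

Holding the remaining parameters constant means any difference in the transcripts can be attributed to the model rather than to the feature settings.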

Desired Speaker Organization

If your audio contains multiple speakers that you want to distinguish from one another, you can use diarization (diarize=true) or specify multichannel audio (multichannel=true) in your requests.

Diarization is appropriate when all speakers are on the same audio channel (mono audio). Specifying multichannel is best when each speaker has their own channel. If speakers are “stereo’d” across multiple channels (where every speaker is present on every channel), more analysis may be required to determine the best option.
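The decision above can be sketched as a small helper that reads the channel count from a WAV file with Python's standard wave module. The function name speaker_params and the one_speaker_per_channel flag are hypothetical; whether speakers are actually separated by channel is something you have to know about your recordings, since it cannot be read from the file header.

```python
import io
import wave


def speaker_params(wav_file, one_speaker_per_channel):
    """Suggest speaker-separation parameters for a WAV file, or None if unclear."""
    with wave.open(wav_file, "rb") as wav:
        channels = wav.getnchannels()
    if channels == 1:
        # All speakers share one channel: diarization separates them.
        return {"diarize": "true"}
    if one_speaker_per_channel:
        # Each speaker has a dedicated channel: transcribe channels separately.
        return {"multichannel": "true"}
    # Speakers are mixed across channels: needs further analysis.
    return None


def make_silent_wav(channels):
    """Create a short silent 16-bit, 8 kHz WAV in memory for the demo below."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)
        wav.setframerate(8000)
        wav.writeframes(b"\x00\x00" * channels * 80)
    buffer.seek(0)
    return buffer


mono_choice = speaker_params(make_silent_wav(1), one_speaker_per_channel=False)
split_choice = speaker_params(make_silent_wav(2), one_speaker_per_channel=True)
mixed_choice = speaker_params(make_silent_wav(2), one_speaker_per_channel=False)
```

Returning None for the mixed case is deliberate: it flags audio that needs a closer look instead of guessing between the two options.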
