ASRParameters
ASR parameters. Can be passed as arguments to the [VoxEngine.createASR] method.
Add the following line to your scenario code to use the interface:
require(Modules.ASR);
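A minimal sketch of how a parameters object might be assembled and handed to [VoxEngine.createASR]. The helper name `buildAsrParameters` is hypothetical, and `ASRProfileList.Google.en_US` is assumed to be a profile value available inside the scenario. The `typeof` guard exists only so that the snippet can also be run outside the platform.

```javascript
// Hypothetical helper: builds a plain ASRParameters object.
// `profile` is the only property in this reference not marked as optional.
function buildAsrParameters(profile) {
  return {
    profile: profile,
    interimResults: true,
    singleUtterance: true,
  };
}

// Inside a VoxEngine scenario the object is passed to VoxEngine.createASR.
// Outside the platform VoxEngine is undefined, so this branch is skipped.
if (typeof VoxEngine !== "undefined") {
  require(Modules.ASR);
  // Assumption: ASRProfileList.Google.en_US exists in the scenario environment.
  const asr = VoxEngine.createASR(buildAsrParameters(ASRProfileList.Google.en_US));
}
```

Keeping the parameters in a plain object makes them easy to inspect or log before the ASR instance is created.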
Props
adaptation
Optional. Speech adaptation configuration.
Available for providers: Google.
alternativeLanguageCodes
v1p1beta1 Speech API feature.
Optional. A list of up to 3 additional BCP-47 language tags, listing possible alternative languages of the supplied audio. See Language Support for a list of the currently supported language codes.
Requires the beta parameter set to true.
Available for providers: Google.
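The two constraints above (at most 3 tags, and beta set to true) can be sketched as a small helper. The function name `withAlternativeLanguages` and the sample language tags are illustrative assumptions, not part of the API.

```javascript
// Hypothetical helper: returns a copy of `params` with alternative languages set.
function withAlternativeLanguages(params, codes) {
  if (!Array.isArray(codes) || codes.length > 3) {
    throw new RangeError("alternativeLanguageCodes accepts at most 3 language tags");
  }
  // The feature is documented as requiring beta: true, so both are set together.
  return { ...params, beta: true, alternativeLanguageCodes: [...codes] };
}

// Example: the sample tags below are illustrative BCP-47 codes.
const multilingual = withAlternativeLanguages({ interimResults: true }, ["es-ES", "fr-FR"]);
```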
beta
Optional. Whether to use Google v1p1beta1 Speech API features, such as enableSeparateRecognitionPerChannel, alternativeLanguageCodes, and enableWordTimeOffsets.
Available for providers: Google.
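Several properties in this reference are marked as requiring beta set to true. A pre-flight check for that rule might look like the sketch below. The helper `betaOnlyWithoutBeta` is hypothetical, and the list is copied from the properties on this page that carry the "Requires the beta parameter" note.

```javascript
// Properties documented on this page as v1p1beta1 features.
const BETA_ONLY = [
  "alternativeLanguageCodes",
  "diarizationConfig",
  "enableAutomaticPunctuation",
  "enableSeparateRecognitionPerChannel",
  "enableWordConfidence",
  "enableWordTimeOffsets",
  "metadata",
];

// Hypothetical helper: lists beta-only properties that are set while beta is not true.
function betaOnlyWithoutBeta(params) {
  if (params.beta === true) {
    return [];
  }
  return BETA_ONLY.filter((name) => params[name] !== undefined);
}
```

An empty array means the object is consistent with the beta requirement. Any returned names point at properties that need beta: true alongside them.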
diarizationConfig
v1p1beta1 Speech API feature.
Optional. Config to enable speaker diarization and set additional parameters to make diarization better suited for your application.
See the full list of available fields here.
Requires the beta parameter set to true.
Available for providers: Google.
enableAutomaticPunctuation
v1p1beta1 Speech API feature.
Optional. If set to true, adds punctuation to recognition result hypotheses. This feature is available only in select languages; setting it for requests in other languages has no effect. If set to false or omitted, no punctuation is added. The default value is false.
Requires the beta parameter set to true.
Available for providers: Google.
enableSeparateRecognitionPerChannel
v1p1beta1 Speech API feature.
Optional. If set to true, each channel is recognized separately, and the recognition result contains a [_ASRResultEvent.channelTag] field stating which channel the result belongs to. If set to false or omitted, only the first channel is recognized.
Requires the beta parameter set to true.
Available for providers: Google.
enableSpokenEmojis
Optional. Whether to enable the spoken emoji behavior for the call.
Available for providers: Google.
enableSpokenPunctuation
Optional. Whether to enable the spoken punctuation behavior for the call.
Available for providers: Google.
enableWordConfidence
v1p1beta1 Speech API feature.
Optional. If set to true, the top result includes a list of words and the confidence for those words. If set to false or omitted, no word-level confidence information is returned. The default value is false.
Requires the beta parameter set to true.
Available for providers: Google.
enableWordTimeOffsets
v1p1beta1 Speech API feature.
Optional. If set to true, the top result includes a list of words and the start and end time offsets (timestamps) for those words. If set to false or omitted, no word-level time offset information is returned. The default value is false.
Requires the beta parameter set to true.
Available for providers: Google.
headers
Optional. Request headers, e.g., {'x-data-logging-enabled': true}.
Available for providers: Amazon, Deepgram, Google, Microsoft, SaluteSpeech, T-Bank, Yandex, YandexV3.
interimResults
Optional. Whether to enable interim ASR results. If set to true, the [ASREvents.InterimResult] event is triggered repeatedly while speech is being recognized.
Available for providers: Amazon, Deepgram, Google, SaluteSpeech, T-Bank, Yandex.
maxAlternatives
Optional. Maximum number of recognition hypotheses to be returned.
Available for providers: Google.
metadata
v1p1beta1 Speech API feature.
Optional. Metadata regarding this request.
See the full list of available fields here.
Requires the beta parameter set to true.
Available for providers: Google.
model
Optional. Recognition model. Select the model best suited to your domain to get the best results. If it is not specified, the default model is used.
Available for providers: Amazon, Deepgram, Google, Microsoft, SaluteSpeech, T-Bank, Yandex, YandexV3.
phraseHints
Optional. Preferred words to recognize. Note that phraseHints do not limit the recognition to the specified list. Instead, words in the list have a higher chance of being selected.
Available for providers: Google.
profanityFilter
Optional. Whether to enable profanity filter. The default value is false.
If set to true, the server attempts to filter out profanities, replacing all but the initial character in each filtered word with asterisks, e.g., "f***". If set to false or omitted, profanities are not filtered out.
Available for providers: Amazon, Deepgram, Google, Microsoft, SaluteSpeech, T-Bank, Yandex, YandexV3.
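To make the masking format concrete, here is an illustrative sketch of the "all but the initial character" rule. The filtering itself is done by the provider, so the function `maskProfanity` is purely a hypothetical illustration of the output shape, not something a scenario needs to call.

```javascript
// Hypothetical illustration of the documented masking format:
// keep the first character, replace every following character with "*".
function maskProfanity(word) {
  if (word.length <= 1) {
    return word;
  }
  return word[0] + "*".repeat(word.length - 1);
}

const masked = maskProfanity("darn"); // "d***"
```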
profile
Profile that specifies an ASR provider and a language to use.
Available for providers: Amazon, Deepgram, Google, Microsoft, SaluteSpeech, T-Bank, Yandex, YandexV3.
request
Optional. Provide the ASR parameters directly to the provider in this parameter. Find more information in the documentation.
Available for providers: Deepgram, Google, SaluteSpeech, T-Bank, Yandex, YandexV3.
singleUtterance
Optional. Whether to enable single utterance mode. The default value is false, in which case:
1) if the speech is shorter than 60 seconds, [ASREvents.Result] is triggered at an unpredictable time. Muting the microphone when the speech is over increases the probability of catching [ASREvents.Result];
2) if the speech is longer than 60 seconds, [ASREvents.Result] is triggered every 60 seconds.
If set to true, [ASREvents.Result] is triggered after every utterance.
Available for providers: Amazon, Google, Microsoft, SaluteSpeech, T-Bank, Yandex.
Note: for the SaluteSpeech provider the default value is true.
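Because the default depends on the provider, it can help to resolve the effective value explicitly. The sketch below assumes the provider is known to the caller as a plain string. In practice the provider is chosen through profile, so the `providerName` argument and the helper name are hypothetical.

```javascript
// Hypothetical helper: resolves the singleUtterance value that applies
// when the property may have been omitted from `params`.
function effectiveSingleUtterance(params, providerName) {
  if (typeof params.singleUtterance === "boolean") {
    return params.singleUtterance; // an explicit value always wins
  }
  // Documented defaults: true for SaluteSpeech, false otherwise.
  return providerName === "SaluteSpeech";
}
```

Setting singleUtterance explicitly in the parameters object avoids relying on the per-provider default at all.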
speechContexts
Optional. Increases the recognition model bias by assigning more weight to some phrases than others. In each entry, phrases is an array of words and boost is the weight in the range 1 to 20.
Available for providers: Google.
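A sketch of how speechContexts entries might be built with the boost range checked up front. The helper `makeSpeechContext` and the sample phrases are hypothetical, and the range is assumed to be inclusive at both ends.

```javascript
// Hypothetical helper: builds one speechContexts entry.
function makeSpeechContext(phrases, boost) {
  // Assumption: the documented 1..20 range is inclusive.
  if (!Number.isFinite(boost) || boost < 1 || boost > 20) {
    throw new RangeError("boost must be in the range 1 to 20");
  }
  return { phrases: [...phrases], boost: boost };
}

// Example: product names get a stronger bias than generic terms.
const speechContexts = [
  makeSpeechContext(["Voximplant", "VoxEngine"], 15),
  makeSpeechContext(["invoice", "refund"], 5),
];
```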
transcriptNormalization
Optional. Transcription normalization configuration. Use transcription normalization to automatically replace parts of the transcript with phrases of your choosing.
Available for providers: Google.
useEnhanced
Optional. Whether to use the enhanced models for speech recognition.
Available for providers: Google.
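The "Available for providers" lines in this reference can be turned into a lookup that flags properties a given provider does not support. This is a sketch only: the helper `unsupportedParameters` is hypothetical, and provider names are plain strings copied from this page rather than platform identifiers.

```javascript
const ALL_PROVIDERS = [
  "Amazon", "Deepgram", "Google", "Microsoft",
  "SaluteSpeech", "T-Bank", "Yandex", "YandexV3",
];

// Availability as listed on this page. Properties not listed here
// are documented as Google-only.
const AVAILABILITY = {
  headers: ALL_PROVIDERS,
  model: ALL_PROVIDERS,
  profanityFilter: ALL_PROVIDERS,
  profile: ALL_PROVIDERS,
  interimResults: ["Amazon", "Deepgram", "Google", "SaluteSpeech", "T-Bank", "Yandex"],
  request: ["Deepgram", "Google", "SaluteSpeech", "T-Bank", "Yandex", "YandexV3"],
  singleUtterance: ["Amazon", "Google", "Microsoft", "SaluteSpeech", "T-Bank", "Yandex"],
};

// Hypothetical helper: returns the property names in `params`
// that are not listed as available for `provider`.
function unsupportedParameters(params, provider) {
  return Object.keys(params).filter((name) => {
    // Caveat: unknown or misspelled names also fall back to Google-only.
    const providers = AVAILABILITY[name] || ["Google"];
    return !providers.includes(provider);
  });
}
```

For example, checking { profile, interimResults, phraseHints } against "Microsoft" flags interimResults and phraseHints, while the same object checked against "Google" yields an empty list.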