ASRParameters
ASR parameters. Can be passed as arguments to the [VoxEngine.createASR] method.
Add the following line to your scenario code to use the interface:
require(Modules.ASR);
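For example, a minimal scenario could create an ASR instance for an incoming call and log final recognition results. This is a sketch: the profile constant (ASRProfileList.Google.en_US) and the result event's text field are assumptions; check the ASRProfileList and [ASREvents.Result] references for the exact identifiers.
VoxEngine.addEventListener(AppEvents.CallAlerting, (e) => {
  const call = e.call;
  call.answer();
  call.addEventListener(CallEvents.Connected, () => {
    // Create an ASR instance; the profile value below is an assumption, see ASRProfileList
    const asr = VoxEngine.createASR({
      profile: ASRProfileList.Google.en_US,
      singleUtterance: true,
    });
    asr.addEventListener(ASREvents.Result, (ev) => {
      // ev.text is assumed to hold the recognized phrase
      Logger.write('Recognized: ' + ev.text);
    });
    // Route the call's audio to the recognizer
    call.sendMediaTo(asr);
  });
});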
Props
adaptation
adaptation: Object
Optional. Speech adaptation configuration.
Available for providers: Google.
alternativeLanguageCodes
alternativeLanguageCodes: string[]
v1p1beta1 Speech API feature.
Optional. A list of up to 3 additional BCP-47 language tags, listing possible alternative languages of the supplied audio. See Language Support for a list of the currently supported language codes.
Requires the beta parameter set to true.
Available for providers: Google.
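A possible sketch (the profile constant is an assumption; note that beta must be set to true for this feature):
const asr = VoxEngine.createASR({
  profile: ASRProfileList.Google.en_US, // assumed profile constant
  beta: true, // required for v1p1beta1 features
  alternativeLanguageCodes: ['es-ES', 'de-DE'], // up to 3 alternative BCP-47 tags
});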
beta
beta: boolean | undefined
Optional. Whether to use the Google v1p1beta1 Speech API, which provides additional features such as enableSeparateRecognitionPerChannel, alternativeLanguageCodes, enableWordTimeOffsets, etc.
Available for providers: Google.
diarizationConfig
diarizationConfig: {enableSpeakerDiarization: boolean} | undefined
v1p1beta1 Speech API feature.
Optional. Config to enable speaker diarization and set additional parameters to make diarization better suited for your application.
See the full list of available fields here.
Requires the beta parameter set to true.
Available for providers: Google.
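A possible sketch (the profile constant is an assumption):
const asr = VoxEngine.createASR({
  profile: ASRProfileList.Google.en_US, // assumed profile constant
  beta: true, // diarizationConfig is a v1p1beta1 feature
  diarizationConfig: { enableSpeakerDiarization: true },
});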
enableAutomaticPunctuation
enableAutomaticPunctuation: boolean | undefined
v1p1beta1 Speech API feature.
Optional. If set to true, adds punctuation to recognition result hypotheses. This feature is only available in select languages; setting it for requests in other languages has no effect. If set to false or omitted, punctuation is not added. The default value is false.
Requires the beta parameter set to true.
Available for providers: Google.
enableSeparateRecognitionPerChannel
enableSeparateRecognitionPerChannel: boolean | undefined
v1p1beta1 Speech API feature.
Optional. If set to true, each audio channel is recognized separately, and the recognition result contains a [_ASRResultEvent.channelTag] field indicating which channel the result belongs to. If set to false or omitted, only the first channel is recognized.
Requires the beta parameter set to true.
Available for providers: Google.
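A possible sketch that reads the channel of each result (the profile constant and the text field are assumptions):
const asr = VoxEngine.createASR({
  profile: ASRProfileList.Google.en_US, // assumed profile constant
  beta: true, // this is a v1p1beta1 feature
  enableSeparateRecognitionPerChannel: true,
});
asr.addEventListener(ASREvents.Result, (ev) => {
  // channelTag identifies which audio channel produced this result
  Logger.write('Channel ' + ev.channelTag + ': ' + ev.text);
});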
enableSpokenEmojis
enableSpokenEmojis: boolean | undefined
Optional. Whether to enable the spoken emoji behavior for the call.
Available for providers: Google.
enableSpokenPunctuation
enableSpokenPunctuation: boolean | undefined
Optional. Whether to enable the spoken punctuation behavior for the call.
Available for providers: Google.
enableWordConfidence
enableWordConfidence: boolean | undefined
v1p1beta1 Speech API feature.
Optional. If set to true, the top result includes a list of words and the confidence for those words. If set to false or omitted, no word-level confidence information is returned. The default value is false.
Requires the beta parameter set to true.
Available for providers: Google.
enableWordTimeOffsets
enableWordTimeOffsets: boolean | undefined
v1p1beta1 Speech API feature.
Optional. If set to true, the top result includes a list of words and the start and end time offsets (timestamps) for those words. If set to false or omitted, no word-level time offset information is returned. The default value is false.
Requires the beta parameter set to true.
Available for providers: Google.
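A possible sketch enabling both word-level confidence and word time offsets (the profile constant is an assumption):
const asr = VoxEngine.createASR({
  profile: ASRProfileList.Google.en_US, // assumed profile constant
  beta: true, // both flags below are v1p1beta1 features
  enableWordConfidence: true,
  enableWordTimeOffsets: true,
});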
headers
headers: {[key: string]: any} | undefined
Optional. Custom request headers, e.g., {'x-data-logging-enabled': true}.
Available for providers: Amazon, Deepgram, Google, Microsoft, SaluteSpeech, T-Bank, Yandex, YandexV3.
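A possible sketch (the profile constant is an assumption):
const asr = VoxEngine.createASR({
  profile: ASRProfileList.Google.en_US, // assumed profile constant
  headers: { 'x-data-logging-enabled': true },
});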
interimResults
interimResults: boolean | undefined
Optional. Whether to enable interim ASR results. If set to true, the [ASREvents.InterimResult] event triggers repeatedly as the speech is being recognized.
Available for providers: Amazon, Deepgram, Google, SaluteSpeech, T-Bank, Yandex.
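A possible sketch that handles both interim and final results (the profile constant and the text field are assumptions):
const asr = VoxEngine.createASR({
  profile: ASRProfileList.Google.en_US, // assumed profile constant
  interimResults: true,
});
asr.addEventListener(ASREvents.InterimResult, (ev) => {
  // Partial hypotheses arrive while the caller is still speaking
  Logger.write('Interim: ' + ev.text); // ev.text is assumed
});
asr.addEventListener(ASREvents.Result, (ev) => {
  Logger.write('Final: ' + ev.text);
});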
maxAlternatives
maxAlternatives: number | undefined
Optional. Maximum number of recognition hypotheses to be returned.
Available for providers: Google.
metadata
metadata: {microphoneDistance: string} | undefined
v1p1beta1 Speech API feature.
Optional. Metadata regarding this request.
See the full list of available fields here.
Requires the beta parameter set to true.
Available for providers: Google.
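A possible sketch; the profile constant is an assumption, and the microphoneDistance value is an assumption based on Google's RecognitionMetadata options:
const asr = VoxEngine.createASR({
  profile: ASRProfileList.Google.en_US, // assumed profile constant
  beta: true, // metadata is a v1p1beta1 feature
  metadata: { microphoneDistance: 'NEARFIELD' }, // value assumed from Google's RecognitionMetadata
});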
model
model: ASRModelList
Optional. Recognition model. Select the model best suited to your domain to get the best results. If it is not specified, the default model is used.
Available for providers: Amazon, Deepgram, Google, Microsoft, SaluteSpeech, T-Bank, Yandex, YandexV3.
phraseHints
phraseHints: string[]
Optional. Preferred words and phrases to recognize. Note that phraseHints does not limit recognition to the specified list; instead, words from the list have a higher chance of being selected.
Available for providers: Google.
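A possible sketch (the profile constant is an assumption):
const asr = VoxEngine.createASR({
  profile: ASRProfileList.Google.en_US, // assumed profile constant
  phraseHints: ['Voximplant', 'VoxEngine', 'scenario'],
});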
profanityFilter
profanityFilter: boolean | undefined
Optional. Whether to enable the profanity filter. The default value is false.
If set to true, the server attempts to filter out profanities, replacing all but the initial character in each filtered word with asterisks, e.g., "f***". If set to false or omitted, profanities are not filtered out.
Available for providers: Amazon, Deepgram, Google, Microsoft, SaluteSpeech, T-Bank, Yandex, YandexV3.
profile
profile: ASRProfileList
Profile that specifies an ASR provider and a language to use.
Available for providers: Amazon, Deepgram, Google, Microsoft, SaluteSpeech, T-Bank, Yandex, YandexV3.
request
request: Object
Optional. Use this parameter to pass ASR parameters directly to the provider. Find more information in the documentation.
Available for providers: Deepgram, Google, SaluteSpeech, T-Bank, Yandex, YandexV3.
singleUtterance
singleUtterance: boolean | undefined
Optional. Whether to enable single utterance mode. The default value is false, so:
1) if the speech is shorter than 60 seconds, [ASREvents.Result] triggers at an unpredictable time. You can mute the microphone when the speech is over; this increases the probability of catching [ASREvents.Result];
2) if the speech is longer than 60 seconds, [ASREvents.Result] triggers every 60 seconds.
If set to true, [ASREvents.Result] triggers after each utterance.
Available for providers: Amazon, Google, Microsoft, SaluteSpeech, T-Bank, Yandex.
Note: for the SaluteSpeech provider the default value is true.
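A possible sketch (the profile constant and the text field are assumptions):
const asr = VoxEngine.createASR({
  profile: ASRProfileList.Google.en_US, // assumed profile constant
  singleUtterance: true, // fire ASREvents.Result after each utterance
});
asr.addEventListener(ASREvents.Result, (ev) => {
  Logger.write('Utterance: ' + ev.text); // ev.text is assumed
});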
speechContexts
speechContexts: {phrases: string[], boost: number}[] | undefined
Optional. Increases the recognition model bias by assigning more weight to some phrases than others. phrases is the array of words or phrases to boost; boost is the weight in the range of 1..20.
Available for providers: Google.
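A possible sketch, assuming speechContexts accepts an array of objects with phrases and boost fields (the profile constant is also an assumption):
const asr = VoxEngine.createASR({
  profile: ASRProfileList.Google.en_US, // assumed profile constant
  speechContexts: [
    { phrases: ['support', 'billing', 'operator'], boost: 15 }, // boost range is 1..20
  ],
});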
transcriptNormalization
transcriptNormalization: {entries: Object[]} | undefined
Optional. Transcript normalization configuration. Use transcript normalization to automatically replace parts of the transcript with phrases of your choosing.
Available for providers: Google.
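A possible sketch; the entry fields (search, replace) are assumptions based on Google's TranscriptNormalization, and the profile constant is also an assumption:
const asr = VoxEngine.createASR({
  profile: ASRProfileList.Google.en_US, // assumed profile constant
  transcriptNormalization: {
    // entry fields are assumed; check the provider documentation for the exact shape
    entries: [{ search: 'vox implant', replace: 'Voximplant' }],
  },
});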
useEnhanced
useEnhanced: boolean | undefined
Optional. Whether to use the enhanced models for speech recognition.
Available for providers: Google.
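A possible sketch; the profile and model constants are assumptions, check ASRProfileList and the model parameter for the actual values:
const asr = VoxEngine.createASR({
  profile: ASRProfileList.Google.en_US, // assumed profile constant
  useEnhanced: true, // request an enhanced recognition model
  model: ASRModelList.Google.phone_call, // assumed model constant; see the model parameter
});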