ElevenLabs
ElevenLabs is an AI voice generator and a text-to-speech provider that you can use with Voximplant services. This article explains how to integrate ElevenLabs TTS into Voximplant Avatar.
Implement a basic avatar
Refer to the Avatar quickstart guide to implement a basic avatar. The guide answers the most common questions and provides a ready-to-use example.
Deploy a backend server for ElevenLabs TTS
The backend server works as follows:
- A customer talks to the avatar
- The avatar identifies the intent
- JS code in the avatar scenario sends the text to the backend and waits for the URL of the generated audio
- Once the URL is received, the avatar plays the audio at that URL back to the customer
Implement the following code to run the backend:
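A minimal sketch of such a backend, assuming Node.js 18 or later (for the built-in `fetch`) and the ElevenLabs REST endpoint `POST /v1/text-to-speech/{voice_id}` authenticated with the `xi-api-key` header. The `/tts` and `/audio/` routes, the `{ "text": ... }` request and `{ "url": ... }` response shapes, the port, and the environment variable names are illustrative choices rather than requirements of Voximplant or ElevenLabs.

```javascript
// Sketch of a TTS backend; assumes Node.js 18+ (global fetch, CommonJS).
const http = require('node:http');
const fs = require('node:fs/promises');
const path = require('node:path');
const crypto = require('node:crypto');

// Assumed configuration; every name and default below is a placeholder.
const PORT = Number(process.env.PORT) || 3000;
const PUBLIC_URL = process.env.PUBLIC_URL || `http://localhost:${PORT}`;
const AUDIO_DIR = path.resolve('audio');
const VOICE_ID = process.env.ELEVENLABS_VOICE_ID || 'YOUR_VOICE_ID';
const API_KEY = process.env.ELEVENLABS_API_KEY;

// Send the text to ElevenLabs and return the MP3 bytes as a Buffer.
async function synthesize(text, fetchImpl = fetch) {
  const res = await fetchImpl(
    `https://api.elevenlabs.io/v1/text-to-speech/${VOICE_ID}`,
    {
      method: 'POST',
      headers: {
        'xi-api-key': API_KEY,
        'Content-Type': 'application/json',
        Accept: 'audio/mpeg',
      },
      body: JSON.stringify({ text }),
    },
  );
  if (!res.ok) throw new Error(`ElevenLabs returned HTTP ${res.status}`);
  return Buffer.from(await res.arrayBuffer());
}

// Store the audio locally under a random name and return its public URL.
async function saveAudio(audio, dir = AUDIO_DIR) {
  const name = `${crypto.randomUUID()}.mp3`;
  await fs.mkdir(dir, { recursive: true });
  await fs.writeFile(path.join(dir, name), audio);
  return `${PUBLIC_URL}/audio/${name}`;
}

// Collect the request body and parse it as JSON.
async function readJson(req) {
  const chunks = [];
  for await (const chunk of req) chunks.push(chunk);
  return JSON.parse(Buffer.concat(chunks).toString('utf8') || '{}');
}

const server = http.createServer(async (req, res) => {
  try {
    if (req.method === 'POST' && req.url === '/tts') {
      const { text } = await readJson(req);
      if (typeof text !== 'string' || !text.trim()) {
        res.writeHead(400, { 'Content-Type': 'application/json' });
        return res.end(JSON.stringify({ error: 'Field "text" is required' }));
      }
      const url = await saveAudio(await synthesize(text));
      res.writeHead(200, { 'Content-Type': 'application/json' });
      return res.end(JSON.stringify({ url }));
    }
    if (req.method === 'GET' && req.url.startsWith('/audio/')) {
      // basename() drops any directory part, which blocks path traversal.
      const data = await fs.readFile(path.join(AUDIO_DIR, path.basename(req.url)));
      res.writeHead(200, { 'Content-Type': 'audio/mpeg' });
      return res.end(data);
    }
    res.writeHead(404).end();
  } catch (err) {
    res.writeHead(500, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify({ error: err.message }));
  }
});

// Start only when an API key is configured.
if (API_KEY) {
  server.listen(PORT, () => console.log(`TTS backend listening on ${PUBLIC_URL}`));
} else {
  console.error('Set ELEVENLABS_API_KEY to start the server.');
}
```

In this sketch, `PUBLIC_URL` should be set to the publicly reachable base URL of the server so that the returned links can be opened from outside your machine.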
This code example does the following:
- Receives the text to be converted to speech
- Sends the text to ElevenLabs
- Receives the audio stream
- Saves the stream as an MP3 file and stores it locally
- Returns the MP3 file URL as the response
You can use any tool to allow public access to the server you create; for testing purposes, we recommend ngrok. Once it is running, ngrok provides a public URL for accessing the backend server.
Modify the avatar to use ElevenLabs TTS
In the avatar, create a JS function that calls the backend. For example:
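A minimal sketch of such a function, under several assumptions: the backend accepts a `POST` request with a JSON body `{ "text": ... }` and replies with `{ "url": ... }`; `TTS_BACKEND_URL` is a hypothetical address that you replace with your own public URL; and the default transport uses `fetch`, which may not be available in the avatar scripting environment. If it is not, pass the HTTP helper your avatar runtime provides as the second argument. How the avatar then plays the returned URL depends on your scenario and is not shown here.

```javascript
// Hypothetical backend address; replace it with your own public URL.
const TTS_BACKEND_URL = 'https://your-public-url.example/tts';

// Default transport built on fetch (an assumption about the runtime).
// Swap in the HTTP helper of your avatar environment if fetch is missing.
async function postJson(url, payload) {
  const res = await fetch(url, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(payload),
  });
  if (!res.ok) throw new Error(`TTS backend returned HTTP ${res.status}`);
  return res.json();
}

// Send the utterance text to the backend and resolve with the audio URL.
async function getTtsUrl(text, post = postJson) {
  const data = await post(TTS_BACKEND_URL, { text });
  if (!data || typeof data.url !== 'string') {
    throw new Error('TTS backend response has no "url" field');
  }
  return data.url;
}
```

Keeping the transport as a parameter means the same function can be exercised outside the avatar, for example with a stub that returns a fixed URL.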
After that, you can use your Avatar with a voice from ElevenLabs.