Realtime API client
Voximplant's OpenAI realtime API client connects voice calls on the Voximplant platform to OpenAI's realtime models, so developers can add conversational AI to their voice applications.
This guide shows how to connect incoming and outgoing calls to an OpenAI agent, how to use a third-party TTS engine in place of OpenAI's voices, and how to handle function calling. Any Voximplant call can be connected to an OpenAI agent in a VoxEngine scenario; the RealtimeAPIClient class exists for that purpose. An OpenAI API key is required to create an instance:
realtimeAPIClient = await OpenAI.Beta.createRealtimeAPIClient(OPENAI_API_KEY, onWebSocketClose);
The second parameter is a callback function that is called if the WebSocket connection to the OpenAI Realtime API endpoint closes for any reason.
Connecting incoming calls to the OpenAI realtime API
The basic version of a VoxEngine scenario that connects an incoming call to an OpenAI agent looks as follows:
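A minimal sketch of such a scenario, not the official listing. It assumes that the client object exposes a close() method and that VoxEngine.sendMediaBetween can bridge the call and the client. The makeIncomingCallHandler helper and its platform argument are hypothetical; they only bundle the VoxEngine globals so that the handler can also be run with stubs.

```javascript
// Sketch of an incoming-call handler. `platform` bundles the globals that
// VoxEngine provides (VoxEngine, CallEvents, OpenAI); passing them in is an
// illustrative choice, not a platform requirement.
function makeIncomingCallHandler({ VoxEngine, CallEvents, OpenAI }, apiKey) {
  return async function onCallAlerting({ call }) {
    let realtimeAPIClient;

    // Close the OpenAI connection (close() is an assumption) and end the
    // session when the call ends or fails.
    const cleanup = () => {
      if (realtimeAPIClient) realtimeAPIClient.close();
      VoxEngine.terminate();
    };
    call.addEventListener(CallEvents.Disconnected, cleanup);
    call.addEventListener(CallEvents.Failed, cleanup);

    call.answer();

    // Invoked if the WebSocket connection to OpenAI closes.
    const onWebSocketClose = () => VoxEngine.terminate();

    try {
      realtimeAPIClient = await OpenAI.Beta.createRealtimeAPIClient(apiKey, onWebSocketClose);
      // Send caller audio to the agent and agent audio back to the caller.
      VoxEngine.sendMediaBetween(call, realtimeAPIClient);
    } catch (error) {
      // Client creation failed, so there is nothing to bridge.
      VoxEngine.terminate();
    }
  };
}

// In a scenario, registration would look roughly like this:
// require(Modules.OpenAI);
// VoxEngine.addEventListener(
//   AppEvents.CallAlerting,
//   makeIncomingCallHandler({ VoxEngine, CallEvents, OpenAI }, "YOUR_OPENAI_API_KEY")
// );
```

The handler registers its cleanup listeners before answering, so a call that drops while the client is still being created also ends the session.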
An outgoing call can be connected to an OpenAI agent in the same way; the only difference is that the scenario creates the call itself with the VoxEngine.callPSTN, VoxEngine.callSIP, or VoxEngine.callUser function.
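For the outgoing case, a hedged sketch using VoxEngine.callPSTN: the agent is attached only after the callee answers. The startOutgoingAgentCall helper, its platform argument, and the client's close() method are assumptions made for illustration; the phone numbers in the usage comment are placeholders.

```javascript
// Sketch: place a PSTN call and connect it to an OpenAI agent once answered.
// `platform` bundles the VoxEngine globals so the function can run with stubs.
function startOutgoingAgentCall({ VoxEngine, CallEvents, OpenAI }, apiKey, number, callerId) {
  const call = VoxEngine.callPSTN(number, callerId);
  let realtimeAPIClient;

  // End the session if the call fails or is hung up.
  const cleanup = () => {
    if (realtimeAPIClient) realtimeAPIClient.close(); // close() is an assumption
    VoxEngine.terminate();
  };
  call.addEventListener(CallEvents.Failed, cleanup);
  call.addEventListener(CallEvents.Disconnected, cleanup);

  // Create the client only after the callee has answered.
  call.addEventListener(CallEvents.Connected, async () => {
    try {
      realtimeAPIClient = await OpenAI.Beta.createRealtimeAPIClient(apiKey, () => VoxEngine.terminate());
      VoxEngine.sendMediaBetween(call, realtimeAPIClient);
    } catch (error) {
      VoxEngine.terminate();
    }
  });

  return call;
}

// In a scenario this could be started from the AppEvents.Started handler:
// startOutgoingAgentCall({ VoxEngine, CallEvents, OpenAI }, OPENAI_API_KEY, "<callee>", "<caller id>");
```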
Using a 3rd party TTS with the OpenAI realtime API
If the audio generated by the OpenAI model does not suit your use case (for example, you do not like any of the available voices), you can disable audio generation on the OpenAI side and use one of the many Voximplant TTS integrations instead. We have created a code example that shows how to use ElevenLabs for speech synthesis together with the realtime API client, which then generates text responses only.
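The idea can be sketched at the protocol level, without the ElevenLabs specifics. In the beta OpenAI Realtime API, a session.update event with modalities set to ["text"] turns off audio output, and finished text arrives in response.text.done events. The two helpers below and the speak callback are hypothetical; how the payload is sent through the client and how the TTS player is created depend on the Voximplant modules in use.

```javascript
// Sketch: build a session.update payload that requests text-only responses
// (beta Realtime API shape; treat field names as assumptions for other versions).
function buildTextOnlySessionUpdate(instructions) {
  return {
    type: "session.update",
    session: {
      modalities: ["text"], // no audio generated on the OpenAI side
      instructions,
    },
  };
}

// Sketch: hand each completed text response to a TTS function.
// `speak` is a hypothetical callback that would wrap the TTS integration.
// Returns true when the event was consumed.
function routeTextToTTS(event, speak) {
  if (event.type === "response.text.done" && event.text) {
    speak(event.text);
    return true;
  }
  return false;
}
```

Waiting for response.text.done keeps the sketch simple; synthesizing from response.text.delta chunks would lower latency at the cost of handling partial sentences.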
Function calling
A major benefit of Voximplant’s serverless architecture is that you can handle function calls directly in the VoxEngine scenario and return the results to the model from the same place. This simplifies and speeds up development, which most developers will appreciate.
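A rough sketch of that round trip at the protocol level, assuming the beta OpenAI Realtime API event shapes: the model emits response.function_call_arguments.done with a call_id, a name, and a JSON string of arguments; the scenario replies with a function_call_output item and then asks for a new response. The get_weather tool, its handler, and the handleFunctionCall helper are hypothetical, and the calls that send these payloads through the client are left out.

```javascript
// Hypothetical tool definition, as it would appear in the session's `tools` list.
const weatherTool = {
  type: "function",
  name: "get_weather",
  description: "Return the current temperature for a city.",
  parameters: {
    type: "object",
    properties: { city: { type: "string" } },
    required: ["city"],
  },
};

// Local implementations, keyed by tool name. The return value is placeholder data.
const functionHandlers = {
  get_weather: ({ city }) => ({ city, temperature_c: 21 }),
};

// Turn a finished function call into the events to send back to the model.
function handleFunctionCall(event, handlers) {
  let output;
  try {
    if (!Object.prototype.hasOwnProperty.call(handlers, event.name)) {
      throw new Error(`Unknown function: ${event.name}`);
    }
    // Arguments arrive as a JSON string and must be parsed first.
    output = handlers[event.name](JSON.parse(event.arguments));
  } catch (error) {
    // Report failures to the model instead of dropping the call.
    output = { error: error.message };
  }
  return [
    {
      type: "conversation.item.create",
      item: {
        type: "function_call_output",
        call_id: event.call_id,
        output: JSON.stringify(output),
      },
    },
    { type: "response.create" }, // ask the model to continue with the result
  ];
}
```

Because the handler runs inside the scenario, it could just as well look up a CRM record or transfer the call; the model only ever sees the serialized output.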