openvidu-speech-to-text

Check it on GitHub

An OpenVidu application built with plain JavaScript, HTML and CSS that demonstrates OpenVidu's Speech to Text capabilities. Reading the Speech to Text documentation before running the tutorial is highly recommended.

Speech to Text is part of OpenVidu PRO and ENTERPRISE editions.

Running this tutorial

To run the tutorial you need the three components described in OpenVidu application architecture: an OpenVidu deployment, your server application and your client application. Set them up in this order:

1. Run OpenVidu deployment

You will need an OpenVidu Pro or OpenVidu Enterprise deployment to test Speech to Text capabilities. See the Deployment documentation.

2. Run your preferred server application sample

For more information visit Application server.

IMPORTANT: Whichever server application you choose, make sure to set the configuration variables OPENVIDU_URL and OPENVIDU_SECRET to the values of your OpenVidu Pro/Enterprise deployment.
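
As a rough illustration only, assuming a server application that runs on Node.js and reads both values from environment variables (your chosen sample may instead use a properties file or command-line arguments), a startup check could look like the sketch below. The `loadOpenViduConfig` helper is hypothetical and is not part of any OpenVidu sample or SDK.

```javascript
// Hypothetical helper: not part of any OpenVidu sample or SDK.
// Reads the two variables from an environment-like object and fails fast
// if either one is missing or the URL is malformed.
function loadOpenViduConfig(env) {
  const url = env.OPENVIDU_URL;
  const secret = env.OPENVIDU_SECRET;
  if (!url || !secret) {
    throw new Error('OPENVIDU_URL and OPENVIDU_SECRET must both be set');
  }
  // new URL() throws a TypeError if the value is not a valid absolute URL.
  const parsed = new URL(url);
  if (parsed.protocol !== 'https:' && parsed.protocol !== 'http:') {
    throw new Error('OPENVIDU_URL must be an http(s) URL');
  }
  // Return the origin only, so a trailing slash or path does not matter.
  return { url: parsed.origin, secret };
}
```

In a Node.js server this could be called once at startup as `loadOpenViduConfig(process.env)`, so that a misconfigured deployment is reported before any session is created.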

3. Run the client application tutorial

You will need an HTTP web server installed on your development computer to serve the tutorial. If you have Node.js installed, you can use http-server. It can be installed with:

npm install --location=global http-server

To serve the tutorial:

# Using the same repository openvidu-tutorials from step 2
http-server openvidu-tutorials/openvidu-speech-to-text/web

Go to http://localhost:8080 to test the app once the server is running.

To test the application with other devices in your network, visit this FAQ. Skip step 1, as you already need a real OpenVidu deployment to use Speech to Text.

Understanding the code

This tutorial is the same as openvidu-js, with Speech to Text capabilities added. Let's focus on the usage of this feature.

The web/app.js file contains a section with all the methods that make use of Speech to Text.

In this tutorial each participant's video has a button to start or stop the transcription of its audio. The transcription is displayed at the bottom of the page.
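
Displaying the transcription relies on the `speechToTextMessage` event that the session dispatches while a Speech to Text subscription is active. As a minimal sketch (not the tutorial's exact code), the listener below turns each event into a caption line. The `formatCaption` and `registerCaptionListener` helpers and the `render` callback are hypothetical names introduced for illustration; the event fields used (`reason`, `text`, `connection`) are assumed to match those described in the Speech to Text documentation.

```javascript
// Hypothetical helper: builds one caption line from a speechToTextMessage event.
// "recognizing" events carry partial text that may still change;
// "recognized" events carry the final text for a phrase.
function formatCaption(event) {
  const speaker = event.connection.connectionId;
  const suffix = event.reason === 'recognizing' ? ' ...' : '';
  return speaker + ': ' + event.text + suffix;
}

// Hypothetical helper: `session` is an OpenVidu Session object and `render`
// is any function that displays a string (for example, by updating a DOM node).
function registerCaptionListener(session, render) {
  session.on('speechToTextMessage', (event) => {
    render(formatCaption(event));
  });
}
```

In the browser, `render` could simply write the line into an element at the bottom of the page, for example `(line) => { captionsElement.textContent = line; }`, where `captionsElement` is whatever node your page uses for captions.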

When the button is clicked, the following code will be executed:

if (text === 'Enable captions') {
  // ...
  await this.session.subscribeToSpeechToText(stream, 'en-US');
} else {
  // ...
  await this.session.unsubscribeFromSpeechToText(stream);
}

The first time the button is clicked (while it reads 'Enable captions'), the subscribeToSpeechToText method is called. This method starts the transcription of the stream's audio. The second argument is the language the audio is expected to be in: here it is 'en-US', so OpenVidu transcribes the audio as English and returns the transcription in that language.

The second time the button is clicked, the unsubscribeFromSpeechToText method is called. This method stops the transcription of the stream's audio.
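
Both calls are awaited, so a failed request surfaces as a rejected promise. The sketch below, which is not the tutorial's exact code, wraps the toggle in a hypothetical `toggleCaptions` helper that only flips the button label once the call has succeeded. The 'Disable captions' label and the `lang` default are assumptions made for illustration.

```javascript
// Hypothetical helper, not the tutorial's exact code.
// `session` is an OpenVidu Session, `stream` the Stream to transcribe,
// and `button` any object with a textContent property (such as a <button>).
async function toggleCaptions(session, stream, button, lang = 'en-US') {
  const enabling = button.textContent === 'Enable captions';
  try {
    if (enabling) {
      await session.subscribeToSpeechToText(stream, lang);
    } else {
      await session.unsubscribeFromSpeechToText(stream);
    }
    // Flip the label only after the call has succeeded.
    button.textContent = enabling ? 'Disable captions' : 'Enable captions';
    return true;
  } catch (error) {
    // Leave the label untouched so the user can retry the same action.
    console.error('Speech to Text request failed:', error);
    return false;
  }
}
```

Keeping the label unchanged on failure means the button always reflects the real subscription state, rather than the state the user asked for.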