Scalability



OpenVidu Pro architecture 🔗

OpenVidu Pro consists of different nodes that work together to offer OpenVidu services in a distributed and scalable way. Currently, OpenVidu Pro has two types of nodes, following a Master-Worker model:

  • Master Node: takes care of the signaling plane. Manages OpenVidu sessions, forwarding events and messages to clients and distributing the load across the available Media Nodes.

  • Media Nodes: these are the worker nodes, in charge of managing the media plane. For that reason, Media Nodes are the actual bottleneck of the OpenVidu cluster and the ones that determine its capacity: more Media Nodes mean more concurrent OpenVidu sessions. Two important aspects of Media Nodes:
    • Each OpenVidu Session is currently hosted in one Media Node.
    • Each Media Node can host multiple OpenVidu Sessions.

How OpenVidu Pro sessions are distributed 🔗

There are two different ways to distribute OpenVidu sessions among the different Media Nodes of your cluster.

Automatic distribution 🔗

This is the default method to allocate sessions in your OpenVidu Pro cluster. OpenVidu periodically gathers the CPU load of all Media Nodes, and each new session will be initialized in the least loaded one. The session is allocated in the least loaded Media Node at the exact moment its first user connects to it.

When is this method recommended?

  • When your sessions are relatively small in number of participants. That is: when each session does not take a significant share of the CPU capacity of its Media Node.
  • When your sessions are not expected to grow in size over time. If your sessions start small but keep adding participants, at some point they can overload their automatically assigned Media Node.
  • When you expect a fairly even spread of sessions over time. That is: when your system does not proactively initialize many sessions at once. Doing so may allocate all of them to the same Media Node (the least loaded at that time), without taking into account the load it will face once all those sessions begin to stream media.

If these conditions are sufficiently met, then the automatic distribution process will ensure that sessions are distributed evenly across all available Media Nodes, without having to worry about manual session allocation.
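The selection rule described above (allocate the session to the least loaded Media Node at the moment the first user connects) can be sketched as a simple minimum over the reported CPU loads. This is an illustrative simplification, not OpenVidu's actual implementation; the class, method and node names are invented:

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LeastLoadedSelector {
    // Pick the Media Node with the lowest reported CPU load (0-100).
    // Ties are resolved by iteration order, as a simplification.
    static String pickMediaNode(Map<String, Double> cpuLoadByNode) {
        String best = null;
        double bestLoad = Double.MAX_VALUE;
        for (Map.Entry<String, Double> e : cpuLoadByNode.entrySet()) {
            if (e.getValue() < bestLoad) {
                bestLoad = e.getValue();
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, Double> loads = new LinkedHashMap<>();
        loads.put("media_node_1", 42.0);
        loads.put("media_node_2", 17.5);
        loads.put("media_node_3", 63.2);
        System.out.println(pickMediaNode(loads)); // media_node_2
    }
}
```

Note that this snapshot-at-connection-time behavior is exactly what makes the method unsuitable for bursts of session creations, as discussed below.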

Manual distribution 🔗

You can force the Media Node where a session must be allocated.

OpenVidu openvidu = new OpenVidu(OPENVIDU_URL, OPENVIDU_SECRET);
SessionProperties sessionProperties = new SessionProperties.Builder()
    .mediaNode("media_i-1234567890abcdef0") // This string being the identifier of an available Media Node
    .build();
Session session = openvidu.createSession(sessionProperties);

See JavaDoc


When is this method recommended?

  • When your sessions are very big in number of participants and may require a significant share of their Media Node's capacity. In other words: if you expect a particular session to need 50% of the Media Node's CPU power, then manually controlling which other sessions are initialized in that Media Node becomes essential to avoid overloading it.
  • When your sessions keep growing over time. If more and more participants keep being added to your sessions, then having full control over where new sessions are started is important.
  • When you expect lots of sessions to be initialized in a very short amount of time. OpenVidu doesn't know how much CPU capacity each session will consume, and by default will initialize them in the least loaded Media Node. This can cause lots of sessions to be allocated in the same Media Node (the least loaded one at that time), and when they begin streaming media and adding participants the load on that specific node can increase to a dangerous point, even with idle Media Nodes available in the cluster. Manual session allocation is the only solution in this case.



How to deploy your OpenVidu Pro cluster 🔗

Different environments are supported when deploying an OpenVidu Pro cluster. Visit your specific OpenVidu Pro cluster deployment instructions to learn about the currently available options.

We are currently working to natively support other cloud providers such as Azure, Google Cloud and Digital Ocean in the same way we support Amazon Web Services. But remember that you can still deploy OpenVidu Pro wherever you want by following the on-premises deployment guide.



Set the number of Media Nodes on startup 🔗

When deploying your OpenVidu Pro cluster, you can set the initial desired number of Media Nodes. Each type of deployment has its own way of setting this number. Visit your specific OpenVidu Pro cluster deployment instructions to learn more.



Change the number of Media Nodes on the fly 🔗

You can launch and drop Media Nodes dynamically in two different ways:

From OpenVidu Inspector 🔗

In the Cluster page you can launch and drop Media Nodes just by pressing buttons.

With OpenVidu Pro REST API 🔗

You can programmatically launch and drop Media Nodes from your application by consuming the OpenVidu Pro REST API.

WARNING: depending on the environment where your OpenVidu Pro cluster is deployed, you must take into account some important aspects regarding the launch and drop of Media Nodes. Visit the specific documentation page for your environment.



OpenVidu Pro cluster events 🔗

OpenVidu Pro provides a specific server-side event that will inform you every time there is a change in the status of the cluster. You can listen to this event by using OpenVidu Webhook (it will also be registered in OpenVidu CDR).

This event is mediaNodeStatusChanged. By listening to it you will have a complete record of your OpenVidu Pro cluster's behavior in real time. And of course you can always use the OpenVidu Pro Media Node REST API to retrieve or modify the status of a Media Node at any time.

Media Node statuses 🔗

Here are all the possible statuses of a Media Node within an OpenVidu Pro cluster.

  • launching: the Media Node is launching. This is the entry status and can also be reached from canceled status.
  • canceled: the Media Node will immediately enter terminating status after the launching process succeeds. This status can be reached from launching status.
  • failed: the Media Node failed to launch. This status can be reached from launching status.
  • running: the Media Node is up and running. New sessions can now be established in this Media Node. This status can be reached from launching and waiting-idle-to-terminate statuses.
  • waiting-idle-to-terminate: the Media Node is waiting until the last of its sessions is closed. Once this happens, it will automatically enter terminating status. The Media Node won't accept new sessions during this status. This status can be reached from running status.
  • terminating: the Media Node is shutting down. This status can be reached from running, waiting-idle-to-terminate and canceled statuses.
  • terminated: the Media Node is shut down. This status can be reached from terminating status. For On Premises OpenVidu Pro clusters, this status means that you can safely shut down the Media Node instance.
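The transitions listed above form a small state machine. The following sketch encodes them for illustration; the status names come directly from the list, but the class itself is invented and is not part of any OpenVidu API:

```java
import java.util.Map;
import java.util.Set;

class MediaNodeStatusGraph {
    // Allowed status transitions, as described in the list above.
    static final Map<String, Set<String>> TRANSITIONS = Map.of(
        "launching", Set.of("canceled", "failed", "running"),
        "canceled", Set.of("launching", "terminating"),
        "failed", Set.of(),
        "running", Set.of("waiting-idle-to-terminate", "terminating"),
        "waiting-idle-to-terminate", Set.of("running", "terminating"),
        "terminating", Set.of("terminated"),
        "terminated", Set.of()
    );

    static boolean canTransition(String from, String to) {
        return TRANSITIONS.getOrDefault(from, Set.of()).contains(to);
    }
}
```

For example, a running node can enter waiting-idle-to-terminate, but a terminated node is final: no transition leaves it.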



How many users an OpenVidu Pro cluster can handle 🔗

This is probably one of the most important questions when using OpenVidu Pro. The number of Media Nodes you need and the size of each Media Node depends on the answer. Therefore, the price of your OpenVidu Pro cluster also depends on the answer.

That being said, there is no single answer to this question. The load each Media Node can handle depends on many factors:

  • The topology of each OpenVidu Session (1:1, 1:N, N:M)
  • The type of media streams being published to the Session (only audio, only video, audio + video)
  • Whether your Sessions are using advanced features such as recording or audio/video filters

You will need to perform some tests for your specific use case, and adapt the size of your cluster accordingly. The OpenVidu team can perform these tests for you as part of its commercial services (contact us through the Commercial page to ask for an estimate).

For a quick reference, these are the results of some load tests performed in an OpenVidu Pro cluster deployed on Amazon Web Services with just 1 Media Node. This particular scenario tests 7-to-7 sessions where every participant sends one audio-video stream (540x360, 30 fps) and receives 6 remote streams (same video). The table states the maximum number of entities that can be established before the Media Node CPU reaches 100% use. Take into account, from a pricing point of view, that the number of cores in each column header is not the total number of cores of the cluster (Master Node cores should also be counted).

Here you can find the full article presenting these results.



Scalable recording 🔗

In OpenVidu there are two types of recordings: INDIVIDUAL recording and COMPOSED recording. Besides, each of them can be performed recording only the audio tracks, only the video tracks or both of them. Composed recordings that include video are performed using a special module. An instance of this module must be launched for each Session being recorded.

In the monolithic setup of OpenVidu CE, this module must be launched in the single available node, which can overload the server to a dangerous point. To avoid this, in OpenVidu Pro clusters the recording module is not launched in the Master Node, but in a Media Node. The default behavior is to launch it in the same Media Node hosting the Session being recorded: this way the media streams don't need to travel from the Media Node hosting the Session to the Media Node hosting the recording, reducing network traffic in the cluster. But you can also force the Media Node where the composed video recording of any Session is initialized. See Scalable composed recording section to learn how.



Autoscaling 🔗

OpenVidu Pro's autoscaling feature allows you to forget about monitoring the status and load of your cluster, letting the cluster itself decide when to automatically increase or decrease the number of Media Nodes. This provides a number of important advantages:

  • The real CPU load of your existing Media Nodes determines the optimal size of the cluster at every moment. This concrete, conclusive measure is the one used to decide whether your cluster should grow or shrink.
  • The cost of your OpenVidu Pro cluster will always be dynamically adjusted to what is necessary to support the existing load. If the cluster needs to double its capacity for only 10 minutes, OpenVidu Pro will itself take care of doubling the number of Media Nodes during that time and dropping them once user load has returned to normal. So you will be charged as little as possible while being guaranteed enough capacity for all your sessions.
  • You can customize the limits of your cluster so that OpenVidu Pro doesn't launch an unbounded number of Media Nodes and always keeps a minimum by default. And you can also set the load thresholds that let OpenVidu Pro know when the cluster is loaded or idle enough to launch or drop Media Nodes. To sum up, you have total control over the autoscaling behavior.

Enable autoscaling 🔗

Configure the following property in the .env file at the Master Node installation path (defaults to /opt/openvidu):

OPENVIDU_PRO_CLUSTER_AUTOSCALING=true

The following properties allow you to configure the autoscaling behavior: the upper and lower limits on the number of Media Nodes and the average load thresholds. You have a complete description of them in the OpenVidu Pro configuration section.

OPENVIDU_PRO_CLUSTER_AUTOSCALING_MAX_NODES=8
OPENVIDU_PRO_CLUSTER_AUTOSCALING_MIN_NODES=2
OPENVIDU_PRO_CLUSTER_AUTOSCALING_MAX_LOAD=70
OPENVIDU_PRO_CLUSTER_AUTOSCALING_MIN_LOAD=30
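As an illustration of how these four values interact, the sketch below computes a scaling decision from an average CPU load figure. This is a deliberately simplified, hypothetical model (class, enum and method names are invented), not the actual algorithm, which is described in detail further down:

```java
class AutoscalingDecision {
    enum Action { SCALE_UP, SCALE_DOWN, NONE }

    // Hypothetical simplification: compare the average CPU load (0-100)
    // of the cluster against the configured thresholds, while always
    // respecting the configured minimum and maximum number of nodes.
    static Action decide(double avgLoad, int nodes,
                         int minNodes, int maxNodes,
                         double minLoad, double maxLoad) {
        if (avgLoad > maxLoad && nodes < maxNodes) return Action.SCALE_UP;
        if (avgLoad < minLoad && nodes > minNodes) return Action.SCALE_DOWN;
        return Action.NONE;
    }
}
```

With the values above (MIN_LOAD=30, MAX_LOAD=70), a cluster of 2 nodes averaging 80% load would scale up, one averaging 20% would scale down, and anything in between leaves the cluster untouched.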

Scope of autoscaling depending on the environment 🔗

The scope of autoscaling differs depending on the environment OpenVidu Pro is deployed in:

  • For any deployment environment other than On Premises, OpenVidu Pro will automatically manage the complete lifecycle of all Media Nodes, being able to launch and drop instances on its own. In this case the user doesn't need to do anything regarding instance management.

  • For On Premises deployments, OpenVidu Pro won't be able to launch and drop instances from the cluster on its own. It will only be able to transition Media Nodes from one status to another. That includes disconnecting Media Nodes from the cluster when required (so that you are no longer charged for them), but you will still be responsible for launching and adding new Media Nodes to the cluster when indicated, and for terminating the instances of disconnected Media Nodes (if that's what you want). In order to accomplish this you must listen to:
    • Event autoscaling: to know when to launch and/or add to the cluster new Media Nodes (property mediaNodes.launch.newNodes). You must launch the Media Node on your own and then you can add it to the cluster programmatically with OpenVidu Pro REST API.
    • Event mediaNodeStatusChanged: to know when to terminate the instance of a Media Node, if that's what you want. Wait for terminated status to know when you can safely terminate the Media Node instance without losing any data.

How does the autoscaling algorithm behave? 🔗

Let's take a look at how OpenVidu Pro autoscaling works. First of all, everything starts with the values given to the autoscaling configuration properties. You can set the maximum and minimum number of Media Nodes that the cluster should always respect, regardless of the cluster load. And you can also set the thresholds indicating the "low load" and "high load" values, so that when they are exceeded the autoscaling algorithm will resize the cluster.

OpenVidu Pro constantly monitors the load of each Media Node of the cluster. When their average load is higher or lower than the configured limits, the autoscaling algorithm will launch new Media Nodes or drop existing ones respectively. The power of the autoscaling feature lies in the ability of the algorithm to determine the optimal Media Node(s) to modify at any given time in order to reach the new desired number of Media Nodes in the least possible amount of time. All of this is determined by the Media Node statuses.

  • When adding Media Nodes to the cluster:

    • Those with waiting-idle-to-terminate status will have priority transitioning to running status. This is because this transition is instantaneous: the Media Node will be available again to host new sessions immediately.
    • If there are not enough new Media Nodes yet, then those with canceled status will transition to launching status. This is because Media Nodes that are already in the process of launching will require less time to be ready than completely new ones. Besides, the oldest launching Media Nodes will have higher priority, as they will require less time to finally be available.
    • If there are not enough new Media Nodes yet, only then completely new Media Nodes will be launched and added to the cluster.

  • When removing Media Nodes from the cluster:

    • Those with launching status will have priority transitioning to canceled status. This is because Media Nodes in the process of launching won't ever have any session inside of them, and their shutdown will be immediately effective after the launching process completes.
    • If there are still too many Media Nodes, only then Media Nodes with running status will be terminated. Media Nodes with the lowest load will be terminated first, as they will usually take less time to be empty of sessions. If the Media Node is not hosting any session at all, then it will immediately transition to terminating status. If it is hosting sessions, then it will transition to waiting-idle-to-terminate status.
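The prioritization rules above, for the scale-up case, amount to an ordering over node statuses: reactivating a waiting-idle-to-terminate node is instantaneous, resuming a canceled launch is faster than starting from scratch, and launching a brand-new node is the slowest option. The following sketch illustrates that ordering; the class and method names are invented and this is not OpenVidu's implementation:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class ScaleUpPriority {
    // Lower rank = chosen first when the cluster needs more capacity.
    static int rank(String status) {
        switch (status) {
            case "waiting-idle-to-terminate": return 0; // reactivates instantly
            case "canceled": return 1;                  // launch already in progress
            default: return 2;                          // launch a completely new node
        }
    }

    // Order candidate node statuses by how quickly each yields capacity.
    static List<String> orderForScaleUp(List<String> statuses) {
        List<String> out = new ArrayList<>(statuses);
        out.sort(Comparator.comparingInt(ScaleUpPriority::rank));
        return out;
    }
}
```

The scale-down case mirrors this ordering: cancel launching nodes first (they never host sessions), then terminate running nodes starting with the least loaded.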

Examples of OpenVidu Pro autoscalable clusters 🔗

The best way to understand how OpenVidu Pro autoscaling works is by analyzing some real-world scenarios, and seeing how the cluster behaves.

Scenario 1: big sessions cause a simple growth and decline of the load 🔗

Let's suppose we configure our cluster with the following values:

OPENVIDU_PRO_CLUSTER_AUTOSCALING_MAX_NODES=8
OPENVIDU_PRO_CLUSTER_AUTOSCALING_MIN_NODES=1
OPENVIDU_PRO_CLUSTER_AUTOSCALING_MAX_LOAD=70
OPENVIDU_PRO_CLUSTER_AUTOSCALING_MIN_LOAD=30

Now let's take a 1-hour time window in which 2 identical big OpenVidu sessions will be created. Each session represents a 10-to-10 video-audio conference, totalling 100 streams and increasing the CPU load of a 2 CPU - 4GB RAM server by around 36%. This situation is represented in the graph below:




  1. The cluster starts at 00:00 with 1 Media Node (the minimum forced by Min Media Nodes) and 0 load.
  2. At 05:00 the first session is created, and at 10:00 the total load has increased up to 36.59%. The average load is still between the limits, so no action is taken.
  3. At 20:00 the second session is created, and at 25:00 the total load has increased up to 73.18%. The average load now exceeds the upper limit, so a new Media Node is added by the algorithm, entering "launching" status and immediately decreasing the average load back to a safe 36.59%.
  4. The new Media Node enters "running" status at 40:00, at which point one of the sessions ends. At 45:00 the total load has decreased to 36.59% and the average load down to 18.29%, because our cluster still has 2 running Media Nodes.
  5. At 45:00 one Media Node instantly enters "terminating" status. The Media Node added second is guaranteed to be empty, as the only ongoing session is the first one, hosted by the first Media Node from the very beginning. So the termination process immediately removes the second Media Node from the cluster. From 50:00 to the end, the cluster load remains at a comfortable 36.59% with the first Media Node still up and running, hosting the first session.
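The load figures in this walkthrough follow from simple averaging: each session contributes roughly 36.59 percentage points of CPU load, and the cluster average is taken over all Media Nodes, including ones still launching. A quick check of the numbers (the class and method are invented for this illustration):

```java
class Scenario1Check {
    // Average cluster load, given the total load contributed by all
    // sessions (in CPU percentage points of one node) and the node count.
    static double averageLoad(double totalLoad, int nodes) {
        return totalLoad / nodes;
    }

    public static void main(String[] args) {
        double perSession = 36.59;
        // Two sessions on a single Media Node exceed the 70% threshold:
        System.out.println(averageLoad(2 * perSession, 1)); // 73.18
        // Adding a second Media Node halves the average:
        System.out.println(averageLoad(2 * perSession, 2)); // 36.59
    }
}
```

This is why launching the second node at 25:00 immediately brings the average back to 36.59%, even before that node finishes launching.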



Scenario 2: small sessions cause a continuous growth until Media Node limit is reached 🔗

Let's suppose we configure our cluster with the following values:

OPENVIDU_PRO_CLUSTER_AUTOSCALING_MAX_NODES=2
OPENVIDU_PRO_CLUSTER_AUTOSCALING_MIN_NODES=1
OPENVIDU_PRO_CLUSTER_AUTOSCALING_MAX_LOAD=70
OPENVIDU_PRO_CLUSTER_AUTOSCALING_MIN_LOAD=30

Now let's take a 1-hour time window in which 6 identical small sessions will be created gradually and evenly distributed over time. Each session represents a 7-to-7 video-audio conference, totalling 49 streams and increasing the CPU load of a 2 CPU - 4GB RAM server by around 25%. This situation is represented in the graph below: