Add SIP Calls to WebRTC Video Sessions

By Vonage Dev. Published in Level Up Coding, Feb 23, 2021 (5 min read).

We’re living in a time of video conferencing. From school to work to family events, video conferencing has become a way of life for many, but there are times when joining from a computer isn’t possible. In this tutorial, we’ll cover how to allow participants to join your Vonage Video API sessions via phone.

Want to skip to the end? You can find all the source code for this tutorial on GitHub.

How Does It Work?

From the Video API session, we’ll make a call to the Voice API. This call will trigger the answer webhook in our application that will create a voice conversation. That conversation will join the video session as another stream.

When users dial into the conference number, they will be prompted for a PIN. If the user provides the correct PIN, they’ll join the voice conversation. At that point, the caller will be able to hear all participants in the video session, and the video participants will, in turn, be able to hear the caller.

Once the session is over, the call should be hung up to avoid additional Voice or Video API charges.

Prerequisites

To follow along with this tutorial, you will need:

  • A Vonage Video API account, which you can create for free.
  • Optional: Ngrok for testing locally

Vonage API Account

To complete this tutorial, you will need a Vonage API account. If you don’t have one already, you can sign up today and start building with free credit. Once you have an account, you can find your API Key and API Secret at the top of the Vonage API Dashboard.

This tutorial also uses a virtual phone number. To purchase one, go to Numbers > Buy Numbers and search for one that meets your needs.

Building the Frontend

Our frontend will use Express with an EJS template. For this article, we won’t cover how to create a Video API session, but you can review the code within the repository to see how we’re doing that. We’ll focus solely on how to add SIP calling to an existing session.

In the template for the video session, add the two JavaScript functions below:

Both of these functions call routes on our Express backend. The dialOut function initiates the voice conference and adds it as a stream in the video session. The hangUp function is used at the end of the session to disconnect the voice conference from the session. Within our HTML, we'll add two buttons to call these functions.
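As a minimal sketch of what these two functions might look like, the snippet below uses the browser's fetch API. The route paths (/dial-out and /hang-up), the roomId query parameter, and the callBackend helper are assumptions made for illustration; the repository may name and structure them differently.

```javascript
// Hypothetical helper: call a backend route for the given room and parse the JSON reply.
async function callBackend(path, roomId) {
  const response = await fetch(`${path}?roomId=${encodeURIComponent(roomId)}`);
  if (!response.ok) {
    throw new Error(`${path} failed with status ${response.status}`);
  }
  return response.json();
}

// Start the voice conference and attach it to the video session.
const dialOut = (roomId) => callBackend('/dial-out', roomId);

// Disconnect the voice conference when the video session ends.
const hangUp = (roomId) => callBackend('/hang-up', roomId);

// Example wiring in the template (assumes a roomId variable is rendered into the page):
//   <button onclick="dialOut(roomId)">Dial Out</button>
//   <button onclick="hangUp(roomId)">Hang Up</button>
```

Keeping the shared fetch logic in one helper means both buttons report failures the same way.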

Express Lane to the Backend

With our frontend ready to go, let’s set up our backend to handle connecting to the Vonage Voice API via SIP.

Helper Functions

We’ll need some helper functions before we handle dial-outs and hang-ups.

The generatePin function generates a random 4-digit PIN for each video session. Callers into the session will be prompted for that PIN before being allowed to join the session.
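One way to sketch such a PIN generator is shown below. It uses Node's built-in crypto.randomInt, which may differ from what the repository does. The generateUniquePin helper is a hypothetical addition: a random PIN is not guaranteed to be unique on its own, so this version retries until the PIN is not already in use.

```javascript
const crypto = require('crypto');

// Returns a zero-padded 4-digit PIN as a string (for example "0042").
const generatePin = () => crypto.randomInt(0, 10000).toString().padStart(4, '0');

// Hypothetical helper: retry until the PIN is not used by another active session.
// `activePins` is assumed to be a Set of PIN strings; with 10,000 possible PINs
// this loop only terminates while some PINs are still free.
const generateUniquePin = (activePins) => {
  let pin = generatePin();
  while (activePins.has(pin)) {
    pin = generatePin();
  }
  return pin;
};
```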

The generateToken function creates the Video API token that the SIP connection uses to join the session.
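A token helper along these lines could be sketched as follows. It assumes OT is a client created with the opentok Node package (new OpenTok(videoApiKey, videoApiSecret)) and passes it in as a parameter so the function is easy to test. The sipTokenData parameter and the publisher role are illustrative choices, not confirmed details of the original code.

```javascript
// `OT` is assumed to be an OpenTok client instance.
// `sipTokenData` is optional connection data used to mark the connection as SIP.
const generateToken = (OT, sessionId, sipTokenData = '') =>
  OT.generateToken(sessionId, {
    data: sipTokenData,
    role: 'publisher', // the SIP leg publishes audio into the session
  });
```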

The setSipOptions function creates an object that we'll use when dialing the SIP connection. It contains authentication information needed to join the voice conference.
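Here is a rough sketch of such an options object, assuming the Voice API key and secret serve as the SIP username and password. The parameter names follow the settings used later in this tutorial. The secure flag is an assumption: it controls whether media is encrypted, so adjust it to your needs.

```javascript
// Builds the options object passed to the Video API when dialing the SIP endpoint.
const setSipOptions = ({ voiceApiKey, voiceApiSecret }) => ({
  auth: {
    username: voiceApiKey,
    password: voiceApiSecret,
  },
  secure: false, // assumption: unencrypted media; set to true to require encryption
});
```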

Answer the Frontend

With those functions in place, let’s add routes to respond to our frontend. The dial-out route below will use the Video API to connect to a SIP conference. Later, we'll set up the Voice API to know how to respond to these calls.
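The dial-out route might be sketched as the handler factory below. It relies on the opentok package's dial method. Several things here are assumptions for illustration: the rooms Map, the roomId query parameter, the X-Video-Session-Id custom header, and the sip.nexmo.com URI format. Check the repository and the current SIP documentation before relying on any of them.

```javascript
// `OT` is assumed to be an OpenTok client; `rooms` is a hypothetical Map of
// roomId -> { sessionId, pin, sipConnectionId }.
function createDialOutHandler({ OT, rooms, conferenceNumber, sipAuth }) {
  return (req, res) => {
    const room = rooms.get(req.query.roomId);
    if (!room) {
      res.status(404).json({ error: 'Unknown room' });
      return;
    }

    const token = OT.generateToken(room.sessionId, { role: 'publisher', data: 'sip=true' });
    const sipUri = `sip:${conferenceNumber}@sip.nexmo.com;transport=tls`;
    const options = {
      auth: sipAuth,
      // Hypothetical custom header so the answer webhook can identify the session.
      headers: { 'X-Video-Session-Id': room.sessionId },
      secure: false,
    };

    OT.dial(room.sessionId, token, sipUri, options, (error, sipCall) => {
      if (error) {
        res.status(500).json({ error: error.message });
        return;
      }
      // Remember the SIP connection so the hang-up route can disconnect it later.
      room.sipConnectionId = sipCall.connectionId;
      res.json({ conferenceNumber, pin: room.pin });
    });
  };
}

// Usage with Express: app.get('/dial-out', createDialOutHandler({ OT, rooms, conferenceNumber, sipAuth }));
```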

The hang-up route disconnects the voice conference from the Video API session. Hanging up the call at the end of a meeting is critically important. Otherwise, the voice conference will stay open and connected to the video session, and both will continue to accrue charges.
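A hang-up handler could be sketched like this, using the opentok package's forceDisconnect method to drop the SIP connection. As before, the rooms Map and its sipConnectionId field are hypothetical bookkeeping. The sketch assumes the connection ID was stored when the call was dialed.

```javascript
// `OT` is assumed to be an OpenTok client; `rooms` is a hypothetical Map of
// roomId -> { sessionId, sipConnectionId }.
function createHangUpHandler({ OT, rooms }) {
  return (req, res) => {
    const room = rooms.get(req.query.roomId);
    if (!room || !room.sipConnectionId) {
      res.status(404).json({ error: 'No active SIP connection for this room' });
      return;
    }

    OT.forceDisconnect(room.sessionId, room.sipConnectionId, (error) => {
      if (error) {
        res.status(500).json({ error: error.message });
        return;
      }
      room.sipConnectionId = null; // the voice conference is no longer attached
      res.json({ disconnected: true });
    });
  };
}

// Usage with Express: app.get('/hang-up', createHangUpHandler({ OT, rooms }));
```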

Voice API Webhooks

When creating a voice application, you’ll need to provide an Answer URL and an Event URL. If you’re running the application locally, use ngrok to expose an external endpoint. Append /voice-answer to your ngrok or Heroku URL for the Answer URL, and /voice-events for the Event URL.

The /voice-answer route will create a conversation when fired due to our dialing out. When other participants call in, they will be prompted to provide the 4-digit PIN for the session. The digits the caller enters are forwarded to the /voice-dtmf route, which lets the caller join the session if the PIN is correct.
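The call-control objects (NCCOs) these two webhooks return could be sketched as the pure functions below. They use the Voice API's conversation, talk, and input actions. The sipSessionId value is assumed to come from a custom SIP header set during dial-out. To our understanding, Vonage forwards custom X- headers to the answer webhook as query parameters with a SipHeader_ prefix, but verify that against the current documentation. The pinToSessionId Map is hypothetical.

```javascript
// Ask the caller for a PIN and send the digits to /voice-dtmf.
const promptForPin = (serverUrl, text) => [
  { action: 'talk', text },
  {
    action: 'input',
    type: ['dtmf'],
    dtmf: { maxDigits: 4, timeOut: 10 },
    eventUrl: [`${serverUrl}/voice-dtmf`],
  },
];

// Answer webhook: the SIP dial-out joins the conversation directly,
// while phone callers are prompted for a PIN first.
function buildAnswerNcco({ sipSessionId, serverUrl }) {
  if (sipSessionId) {
    return [{ action: 'conversation', name: sipSessionId }];
  }
  return promptForPin(serverUrl, 'Please enter the four digit PIN for your session.');
}

// DTMF webhook: a matching PIN joins the conversation; otherwise prompt again.
function buildDtmfNcco({ digits, pinToSessionId, serverUrl }) {
  const sessionId = pinToSessionId.get(digits);
  if (sessionId) {
    return [{ action: 'conversation', name: sessionId }];
  }
  return promptForPin(serverUrl, 'That PIN was not recognized. Please try again.');
}

// Usage with Express (header name is an assumption):
//   app.get('/voice-answer', (req, res) => res.json(buildAnswerNcco({
//     sipSessionId: req.query['SipHeader_X-Video-Session-Id'], serverUrl })));
//   app.post('/voice-dtmf', (req, res) => res.json(buildDtmfNcco({
//     digits: req.body.dtmf && req.body.dtmf.digits, pinToSessionId, serverUrl })));
```

Naming the conversation after the video session ID keeps the dial-out leg and the phone callers in the same conference without extra lookups.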

Configuring Settings

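Let’s begin by creating a .env file. You can use the .env-sample file within the repo as a template. It should define the following settings: videoApiKey, videoApiSecret, voiceApiKey, voiceApiSecret, conferenceNumber, and serverUrl.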

To set videoApiKey and videoApiSecret, create a new project from the Video API dashboard.

Once it’s created, copy the API Key and Secret and paste them into your .env file as the videoApiKey and videoApiSecret respectively.

Now create a voice application and use your Vonage API Key and Secret as the voiceApiKey and voiceApiSecret. You'll need to purchase a number and associate it with your voice application. Use that number as the conferenceNumber variable.

Finally, enter the ngrok or Heroku URL as the serverUrl.
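To catch a missing setting early, the backend could validate these values at startup. The sketch below assumes the .env values have already been loaded into process.env, for example with the dotenv package. The loadConfig function and REQUIRED_SETTINGS list are hypothetical names, not part of the repository.

```javascript
// Setting names taken from the configuration steps in this section.
const REQUIRED_SETTINGS = [
  'videoApiKey',
  'videoApiSecret',
  'voiceApiKey',
  'voiceApiSecret',
  'conferenceNumber',
  'serverUrl',
];

// Returns an object with every required setting, or throws listing what is missing.
// Assumes something like require('dotenv').config() has already populated process.env.
function loadConfig(env = process.env) {
  const missing = REQUIRED_SETTINGS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing settings in .env: ${missing.join(', ')}`);
  }
  return Object.fromEntries(REQUIRED_SETTINGS.map((name) => [name, env[name]]));
}
```

Failing fast here is cheaper than discovering a blank conferenceNumber when the first caller tries to dial in.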

Now you can join a video session, and others can dial your number and enter the PIN to join the session. It’s important to stress that you need to hang up the call at the end of the video session to avoid continued usage charges on both the Video and Voice accounts.

Further Reading

Want to learn more about the SIP Interconnect feature of the Video API? Below are some links you might find useful.

Developer content from the team at Vonage, including posts on our Java, Node.js, Python, DotNet, Ruby and Go SDKs. https://developer.vonage.com