What are WebSockets? — The Technology that Connects Clients and Servers Together

Sunny Singh
Published in Level Up Coding
12 min read · Oct 20, 2020

When you type in a URL and hit Enter, your browser essentially sends an HTTP request to the server that is associated with the URL. The server then responds with an HTTP response and may send back something like an HTML file.

Once the response has been delivered, the transaction is done. The server has no way to send anything more to your browser until the browser makes a new request.

What if we wanted to build a chat application where we know when others have sent a new message? With regular HTTP, you’d have to keep refreshing the page to see the new messages. That’s not user friendly though. We can do better.

The History of Realtime Web Communication

When this problem of real-time communication between the client and server first arose in the early 2000s, the answer was AJAX, or Asynchronous JavaScript And XML. It allowed web developers to make requests to the server without having to reload the entire page. You could now send a request for specific information and have that information updated on the page without a reload.

The downside of AJAX, however, was that it was uni-directional. Your browser could send a request to the server to see if there was new data, but the server could not initiate a connection with your browser and push data updates to it.

So developers got creative with AJAX and created something called polling. Essentially, your browser would either send requests on a set time interval or keep a request open until new data was passed back by the server.

Regular old polling is when you send AJAX requests on a set time interval. For example, with a chat application, you could send a request every 5 seconds to the server to get the newest messages.
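Here is a rough sketch of regular polling. The names startPolling, fetchLatest, and onMessages are hypothetical placeholders for your own code, not part of any library. The sketch assumes fetchLatest returns a promise that resolves to an array of new messages.

```javascript
// Regular polling: ask the server for new messages on a fixed interval.
// fetchLatest and onMessages are hypothetical callbacks supplied by the app.
function startPolling(fetchLatest, onMessages, intervalMs = 5000) {
  const timer = setInterval(async () => {
    try {
      const messages = await fetchLatest(); // e.g. wraps an AJAX request
      if (messages.length > 0) onMessages(messages);
    } catch (err) {
      console.error("Polling request failed:", err);
    }
  }, intervalMs);

  // Return a function that stops the polling loop.
  return () => clearInterval(timer);
}
```

Most of these requests will come back empty, which is exactly the waste that WebSockets avoid.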

Long-polling is when the connection between the server and the client is kept alive long enough for the server to respond with new data. Once the server has delivered the new data, the connection is closed. However, you can immediately send a new long-polling request to the server to emulate a real-time exchange of data.
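A long-polling loop could be sketched like this. The names waitForMessages, onMessages, and shouldContinue are hypothetical callbacks. The sketch assumes waitForMessages returns a promise that only resolves once the server actually has something to say.

```javascript
// Long polling: each request stays open until the server has something new,
// then the client immediately asks again.
// waitForMessages is a hypothetical function that resolves when the server replies.
async function longPoll(waitForMessages, onMessages, shouldContinue) {
  while (shouldContinue()) {
    try {
      const messages = await waitForMessages(); // held open by the server
      onMessages(messages);
    } catch (err) {
      // Back off briefly so a failing server is not hammered with requests.
      await new Promise((resolve) => setTimeout(resolve, 1000));
    }
  }
}
```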

WebSockets to Save the Day

AJAX only allowed us to send a request to the server whenever we wanted. It did not allow the server to send data whenever it needed to.

WebSockets save the day. They allow for bi-directional (full-duplex) communication. This means that you can have your client and server talk to each other without having the connection drop. No longer do we have to rely on HTTP request/response cycles for real-time communication. Rather, we “upgrade” the HTTP connection so that we can maintain communication between the client and the server.

When a WebSocket first starts, it sends a simple HTTP request to a URL that you specify. From there, the connection “upgrades”: the underlying TCP socket stays open and becomes a persistent tunnel for data to flow through after the HTTP handshake occurs (an encrypted one when you use wss). The handshake can be thought of as the “agreement” between the client and the server to maintain a connection.

When to Use WebSockets?

As great as it sounds, WebSockets is not a full-on replacement for the HTTP protocol. The best time to use WebSockets is when you need low-latency real-time communication between your server and client.

So for example, if you are building a chat application, you’d like for your user’s app to stay connected to your application’s server so that it may get notifications when users text back. WebSockets can help with the continuous data flow.

Essentially, WebSockets is a good idea when you’d like to “stream” data over the internet, whether it be client-to-server or server-to-client.

It’s also helpful to note that WebSockets do not work in every browser. Current versions of the major browsers support them, but some older browsers don’t. A great resource for checking browser support is CanIUse.com.

Did You Know?
Google Drive does not use WebSockets to enable the collaborative features of Drive. They use their own standard that is slightly different than WebSockets. This answer explains it well.

Google may have decided to do this because WebSockets are not yet supported on all browsers. Many people depend on Google products internationally and so older technologies have to be supported as well in order to provide a more unified user experience no matter the quality of the technology.

This does not mean that you should ditch WebSockets, however. Just because Google does not use them does not mean that WebSockets shouldn’t be used. WebSockets have their own time and place, especially when broad browser support is not a concern for your business.

How WebSockets Work

WebSockets made it possible to keep a single connection open between the client and the server so that they can both communicate. This was a major development in web communications. The following are some of the important distinctions of WebSockets:

  • WebSocket is an independent TCP-based protocol, just as HTTP is an independent TCP-based protocol for sending and receiving data. Its only relationship to HTTP is the opening handshake, which is an HTTP Upgrade request, so you can use HTTP and WebSockets side by side in the same application.
  • WebSockets act as a transport that other protocols can run on top of. The WebSocket API lets the client request sub-protocols when it connects. A sub-protocol is an application-level convention for what the messages mean. Think of sub-protocols as “plugins” for the WebSockets technology.
  • Examples of other protocols, or “plugins”, you can run are XMPP, STOMP, SOAP, WAMP, and many others.
  • On the web, the only requirement is a client that can perform the “handshake” and maintain the WebSockets connection. Modern browsers provide this through the built-in JavaScript WebSocket API, so a separate library is optional.
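To make the sub-protocol idea concrete, here is a small sketch using the second argument of the browser's WebSocket constructor. The function name, the example URL, and the injectable WebSocketImpl parameter are my own additions for illustration. The protocol names in the usage comment are examples of registered sub-protocol identifiers, and your server has to support whichever ones you offer.

```javascript
// Open a connection and offer the server a list of sub-protocols.
// The server picks at most one; the choice shows up in socket.protocol.
// WebSocketImpl is injectable so the sketch can also run outside a browser.
function connectWithSubprotocols(url, protocols, WebSocketImpl = WebSocket) {
  const socket = new WebSocketImpl(url, protocols);
  socket.onopen = () => {
    console.log("Server selected sub-protocol:", socket.protocol || "(none)");
  };
  return socket;
}

// Usage in a browser (URL and protocol names are illustrative):
// connectWithSubprotocols("wss://example.com/socket", ["wamp.2.json", "v12.stomp"]);
```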

The Beginning of the Journey

WebSocket connections start off as simple HTTP requests. Your browser sends the request to the webserver, and the webserver sees that you want to “upgrade” your connection. This kicks off the “handshake” process. Once that is done, a WebSocket connection is created.

Exactly what’s happening is this:

  1. Your browser sends off a regular HTTP request that carries additional Upgrade and Connection headers.
  2. The webserver gets the HTTP request and notices the Upgrade header. This lets the webserver know that we are requesting a WebSockets connection.
  3. If the server supports the WebSockets protocol, it agrees to the upgrade and responds with a 101 Switching Protocols status, confirming that the communication protocol has switched.
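Seen from the server, steps 2 and 3 could be sketched roughly as follows. This is a simplified illustration, not a complete server. The function name is hypothetical, and the headers object is assumed to use lower-cased header names (the way Node's http module exposes them).

```javascript
// Step 2 from the server's point of view: does this request ask for a
// WebSocket upgrade? headers is a plain object with lower-cased header names.
function isWebSocketUpgrade(method, headers) {
  const upgrade = (headers["upgrade"] || "").toLowerCase();
  // Connection may be a comma-separated list, e.g. "keep-alive, Upgrade".
  const connection = (headers["connection"] || "")
    .toLowerCase()
    .split(",")
    .map((token) => token.trim());
  return method === "GET" && upgrade === "websocket" && connection.includes("upgrade");
}

// Step 3: the status line and headers a server uses to confirm the switch.
// A real response must also include a Sec-WebSocket-Accept header derived
// from the client's Sec-WebSocket-Key; that part is omitted here.
const SWITCHING_PROTOCOLS =
  "HTTP/1.1 101 Switching Protocols\r\n" +
  "Upgrade: websocket\r\n" +
  "Connection: Upgrade\r\n";
```

In practice a WebSocket server library handles all of this for you.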

Something to note is that WebSocket URLs use the ws or wss scheme. You’re no longer going to use http or https to communicate.
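The mapping between the schemes is simple: http corresponds to ws and https corresponds to wss, the TLS-encrypted variant. A tiny helper (the name toWebSocketUrl is my own) shows the idea:

```javascript
// Derive the WebSocket URL that corresponds to an HTTP(S) URL:
// http:// becomes ws:// and https:// becomes wss://.
function toWebSocketUrl(httpUrl) {
  return httpUrl.replace(/^http/i, "ws");
}

// toWebSocketUrl("https://example.com/chat") gives "wss://example.com/chat"
```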

Additional Information
If you are using some sort of session cookie to identify the user that initiates an HTTP request, you will want to know how to protect against CSRF (Cross-Site Request Forgery) attacks. Here’s an amazing article that covers achieving a “firm handshake” for your WebSockets based applications.

An example of the regular HTTP request launched in the first step may look like the following. The Upgrade and Connection headers are what let the webserver know that we are attempting to upgrade the connection (a real request also carries Sec-WebSocket-Key and Sec-WebSocket-Version headers):

GET wss://websocket.sunnychopper.com/ HTTP/1.1
Origin: http://sunnychopper.com
Connection: Upgrade
Host: websocket.sunnychopper.com
Upgrade: websocket

Let There Be Communication

Once the HTTP request has been upgraded into a WebSockets connection, your browser and the server can now communicate.

Something to note is that all key players in the mobile devices industry provide WebSocket APIs for their native apps. This means that you can use WebSockets outside of just a web browser. You can enable WebSockets in your mobile applications using Swift and Java/Kotlin.

What about apps built with something like React Native or Flutter?
If you are using React Native, Flutter, Ionic, Xamarin or whatever hybrid method for developing apps, you can always check if the library supports WebSockets.

For example, React Native has an NPM package called react-native-websocket that helps with establishing a WebSockets connection despite not developing directly in native languages.

Implementing WebSockets

In order to get started with WebSockets, you can use the following WebSockets URL to test things out: wss://echo.websocket.org

That’s a public address that the nice people behind WebSockets provide us so that we can get familiar with the technology without having to set up a server ourselves.
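Connecting to that address from the browser can be sketched in a few lines, assuming the endpoint is still reachable when you try it. The function name connectToEcho and the injectable WebSocketImpl parameter are my own. In a browser you would normally just call new WebSocket(url) directly.

```javascript
// Open a connection to the public echo endpoint mentioned above.
// WebSocketImpl defaults to the browser's built-in WebSocket constructor.
function connectToEcho(WebSocketImpl = WebSocket) {
  const socket = new WebSocketImpl("wss://echo.websocket.org");
  // Say hello as soon as the connection is ready.
  socket.onopen = () => socket.send("Hello, WebSocket!");
  // An echo server sends back whatever it receives.
  socket.onmessage = (event) => console.log("Echoed back:", event.data);
  return socket;
}
```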

You can view all of the code with this link here. It will take you to CodePen where you can analyze the code in realtime and play around with it.

Want to set up the backend server yourself?
Depending on the backend technology that you are using (Node.js, PHP, Python, etc.), there will be different ways to implement WebSockets into your webserver.

Simply Google something like “node expressjs websockets” or “php laravel websockets” and you will find a plethora of information.

WebSocket Events

There are four main types of events that happen with the WebSocket API: Open, Message, Close, and Error.

Open
Once a connection has been established between the client and the server, the open event is fired. This happens right after the “initial handshake” between the client and the server has completed.

If you’d like to do some action when the connection is established, you want to define an onopen function.

Changing a button’s text when the WebSockets connection has been established.
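A minimal sketch of such an onopen handler might look like this. The function name and the element id in the usage comment are hypothetical. The button can be any object with textContent and disabled properties, such as a DOM button element.

```javascript
// Update a button once the WebSocket connection is open.
function showConnectedState(socket, button) {
  socket.onopen = () => {
    button.textContent = "Connected";
    button.disabled = true; // nothing left to click once we are connected
  };
}

// Browser usage (the element id is hypothetical):
// showConnectedState(socket, document.querySelector("#connect-button"));
```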

Message
Whenever the server wants to send some data, it is called a “message” since it’s literally a new message that your browser will get from the server. Messages can be plain text, binary data, JSON, image data, and much more.

If you’d like to perform some action whenever your browser gets a message from the server, you want to define an onmessage function.

Updating a table based on the messages received from the WebSockets connection.
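An onmessage handler that appends a row per message could be sketched as below. The function name is hypothetical. The doc parameter defaults to the browser's document and only exists so the sketch can be exercised without a real page. It assumes text messages; binary messages arrive as a Blob or ArrayBuffer instead.

```javascript
// Append one table row for every message received from the server.
function logMessagesToTable(socket, tableBody, doc = document) {
  socket.onmessage = (event) => {
    const row = doc.createElement("tr");
    const cell = doc.createElement("td");
    cell.textContent = String(event.data); // event.data holds the payload
    row.appendChild(cell);
    tableBody.appendChild(row);
  };
}

// Browser usage (the selector is hypothetical):
// logMessagesToTable(socket, document.querySelector("#messages tbody"));
```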

Now, whenever I send a message to the test WebSockets connection, it immediately sends a message back that just “echoes” what I said.

Sending requests with data and receiving messages as responses.

Close
Whenever the connection between the client and the server is closed, the Close event gets triggered. A connection can be closed due to poor connectivity or because either side closes it deliberately, for example by calling close().

After the connection is closed, no further messages can be exchanged by the client and the server. By defining an onclose function, you can add some other functionality such as saving, caching, or whatever you’d like to do once the connection has closed.

Running some code whenever the WebSockets connection closes.
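Such an onclose handler might be sketched like this. The function name and the onClosed callback are hypothetical. The code, reason, and wasClean fields are the properties the browser puts on the close event.

```javascript
// Run some code whenever the WebSocket connection closes.
function watchForClose(socket, onClosed) {
  socket.onclose = (event) => {
    // event.code is the numeric close code (1000 means a normal closure),
    // event.reason is an optional string, event.wasClean is a boolean.
    const summary = event.wasClean
      ? `Closed cleanly (code ${event.code})`
      : `Connection lost (code ${event.code})`;
    onClosed(summary, event.reason);
  };
}
```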

Error
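Whenever an error occurs during the communication between the client and the server, the Error event gets triggered. You can react to it by defining the onerror function. Note that browsers deliberately expose very little detail in the error event itself. The close event that usually follows carries a close code that is more informative.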

Adding code to append an error entry into my table whenever an error occurs.
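As a sketch, an onerror handler that records an error entry could look like the following. The addRow callback is a hypothetical stand-in for whatever adds a row to your table.

```javascript
// Record an entry whenever the WebSocket reports an error.
// addRow is a hypothetical callback that appends a row to the page's table.
function logErrorsToTable(socket, addRow) {
  socket.onerror = () => {
    // The browser's error event carries almost no detail, so log a generic entry.
    addRow("error", "WebSocket error");
  };
}
```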

WebSocket Actions

Events are triggered when something happens. Events are reactive: they are your way to deal with the data that’s being thrown at your application by the backend server.

On the other hand, there are actions. Actions are proactive and have intent. Actions are what gets run when the user wants to make something happen. In the browser, actions are initiated by the client through explicit calls to WebSocket functions.

There are two main actions: the send and the close.

Send
When you want your client’s application to send some data to the backend server, you are going to want to use the send() function. For example, if you are developing a live chat application, whenever the user wants to send a message to the chat, you want the user’s browser to send that message to the backend.

Sending the input’s value to the backend server through the WebSockets connection.
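A sketch of that send step is below. The function name is hypothetical, and input can be any object with a value property, such as a text input element. Checking readyState first is my own precaution, since calling send() before the connection is open throws an error.

```javascript
const WS_OPEN = 1; // same value as WebSocket.OPEN

// Send the current value of a text input over the socket.
// Returns true if something was sent, false otherwise.
function sendInputValue(socket, input) {
  const text = input.value.trim();
  if (text === "" || socket.readyState !== WS_OPEN) return false;
  socket.send(text);
  input.value = ""; // clear the field for the next message
  return true;
}
```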

Close
If you’d ever like to close the WebSockets connection, you can do so with the close() function. This method is essentially the “goodbye handshake.” Once the connection has been closed, it must be re-established if any further communication is to happen.
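For illustration, close() optionally takes a numeric close code and a short reason string. A tiny wrapper (the name and default reason are my own) could look like this:

```javascript
// Close the connection deliberately.
function closePolitely(socket, reason = "Client is done") {
  // 1000 is the close code for a normal closure.
  socket.close(1000, reason);
}
```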

Build out Your Application

Those are the basics of WebSockets: events and actions. Once you understand the four events and the two actions, everything else should start to fall into place.

You use send() to send data to the server. It accepts strings and binary data such as array buffers or blobs. Structured data like JSON or XML has to be serialized to a string first.

For example, imagine you are building a live team code editor. You want everyone to be able to see where everyone’s cursor is in the code. Whenever anyone on the team launches the code editor, you have their browser connect to your backend server through WebSockets.

You could read the position where the user’s cursor is in the code using some Javascript and use the send() function to send specific cursor position information. For example, the object you send could be structured as follows:

{
  line: 192,
  column: 56,
  userId: 424
}

The backend server can receive this information, read it, and send your new position to everyone else connected to the server. Once your teammates have received your new cursor’s position through an onmessage function, their browsers can update the position of your cursor on their screens.
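Here is a rough sketch of the whole round trip. All three function names, the extra type field, and the moveCursor callback are assumptions made for illustration. The server-side part only assumes that each connected client is an object with a send() method.

```javascript
// Client side: serialize the cursor position before sending it.
function sendCursor(socket, cursor) {
  socket.send(JSON.stringify({ type: "cursor", ...cursor }));
}

// Server side (sketch): relay a message to every other connected client.
// clients is any iterable of socket-like objects with a send() method.
function broadcast(clients, sender, rawMessage) {
  for (const client of clients) {
    if (client !== sender) client.send(rawMessage);
  }
}

// Client side: apply a teammate's cursor update when it arrives.
// moveCursor is a hypothetical function that redraws a cursor in the editor.
function handleCursorMessage(event, moveCursor) {
  const message = JSON.parse(event.data);
  if (message.type === "cursor") {
    moveCursor(message.userId, message.line, message.column);
  }
}
```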

TL;DR — Conclusion

In my honest opinion, WebSockets is a technology worth looking into if you’re looking to add a “multiplayer” feel to your product.

There are many benefits to using a WebSockets connection over just a traditional RESTful API and HTTP. Here’s a summary of all the benefits:

  1. Bidirectional
    Traditional HTTP requests are unidirectional. They can only be made from the client to the server. The server cannot initiate a connection to the client and send data to it. With WebSockets, both the client and the server can send and receive messages from each other.
  2. Full Duplex
    Essentially all this means is that a client and server can be sending messages to each other at the same time. The client does not have to wait for a response nor does the server.
  3. Single TCP Connection
    Whenever you’re working with a simple RESTful API, every piece of data you send travels in its own HTTP request with its own headers, and unless the connection is kept alive and reused, each request also needs a new TCP connection that is terminated after the response arrives. WebSockets, on the other hand, upgrade the HTTP connection and keep it alive so that the client and the server can communicate over the same TCP connection. This leads to huge gains in the time it takes to process a large number of messages.
WebSockets smokes a RESTful API in terms of performance when streaming large amounts of data.

There are four main events to keep in mind for WebSockets: Open, Message, Error and Close.

Open (socket.onopen) gets triggered when a WebSockets connection has been established. Message (socket.onmessage) gets triggered when the client receives a message from the server. Error (socket.onerror) gets triggered when there is an error in data transmission through the WebSockets connection. Close (socket.onclose) gets triggered when the WebSockets connection closes.

There are two main action types that a user can take: Send or Close.

Send (socket.send(data)) allows the client to send data to the backend server through the WebSockets connection. Close (socket.close()) allows the client to close the WebSockets connection with the server.

Got any Questions?

If you have any questions about WebSockets, feel free to reach out to me on social media. The best way to reach me is through Twitter (sunnychopper) and I’m happy to answer any questions or even just have a conversation about technology and business.

If this guide helped you get a basic understanding of what WebSockets are and what they can do for you and your business, please kindly leave some claps as it helps me determine what type of content to make next! I’m always listening to the feedback. 💯

Backend developer passionate about leveraging practical solutions. Sharing insights on using software development and AI to solve problems. Connect for more!