WhatsApp Analytics: Spatial Mapping of Users of WhatsApp Groups

Vonage Dev
Published in Level Up Coding
7 min read, Apr 22, 2021

WhatsApp groups have served as an environment to establish collective conversations with others around the world.

In this tutorial, we’ll generate and plot analytics based on the participants of a WhatsApp Group. We’ll geocode the users’ location and generate a country-level distribution. This interface will be built with Python using Selenium, Plotly, Vonage Number Insight API, Google Maps API, and Mapbox API.

Prerequisites

To follow and fully understand this tutorial, you’ll also need to have:

Vonage API Account

To complete this tutorial, you will need a Vonage API account. If you don’t have one already, you can sign up today and start building with free credit. Once you have an account, you can find your API Key and API Secret at the top of the Vonage API Dashboard.

Below are the results of the final interface you’ll build:

File Structure

See an overview of the file directory for this project below:

The content of all files listed in the directory tree above will be created through this tutorial’s subsequent steps.

Set up a Python Virtual Environment

You’ll need an isolated environment to manage the Python dependencies unique to this project.

First, create a new development folder. In your terminal, run:

Next, create a new Python virtual environment. If you are using Anaconda, you can run the following command:

Then you can activate the environment using:

If you are using a standard distribution of Python, create a new virtual environment by running the command below:

To activate the new environment on a Mac or Linux computer, run:

If you are using a Windows computer, activate the environment as follows:

Regardless of the method you used to create and activate the virtual environment, your prompt should look like the following:

Requirements File

Next, with the virtual environment active, install the project dependencies at the specific versions shown below:

You can install these pinned versions from the requirements file in your terminal by running pip install -r requirements.txt, or conda install --file requirements.txt if you are on Anaconda. All of the program's dependencies will then be downloaded, installed, and ready to use.

Alternatively, you can install all the packages directly as follows:

  • Using Pip:
  • Using Conda:

Setting up APIs and Credentials

Next, you’ll need to set up some accounts and get the required API credentials.

Google Maps API

The Google Maps API will enable the geocoding function, which is crucial to this project. The API is readily available on Google Cloud Console.
First, you need to set up a Google Cloud free tier account, which comes with $300 in free credits to explore the Google Cloud Platform and its products. Next, with your Google Cloud Console set up, create an API key to connect the Google Maps Platform to the application.
Finally, activate the Google Maps Geocoding API to enable it for the project.

Plotly API and Mapbox Credentials

To create the data visualizations, you’ll use Plotly for Python and enhance the aesthetics with Mapbox.

The Plotly plots are hosted online on Chart Studio (part of Plotly Enterprise); you need to sign up, generate and save your custom Plotly API key.

To achieve the desired plot enhancement, you also need to sign up for Mapbox and create a Mapbox authorization token.

Separation of Settings Parameters and Source Code

In the previous section, you’ve generated various API credentials.
It is best practice to store these credentials as environment variables instead of having them in your source code.

An environment file can easily be set up by creating a new file and naming it .env, or via the terminal as follows:

The environment file consists of key-value pair variables. For example:

You can access these environment variables in the source code using the Python Decouple package, a third-party module installed with the rest of the project dependencies.

It’s also good practice to add the .env file to the .gitignore file. Doing so prevents sensitive information such as API credentials from becoming public.
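As a rough illustration of what a loader such as Python Decouple does with this file, the standard-library sketch below parses KEY=value lines into a dictionary. The helper name load_env and the variable names VONAGE_API_KEY and VONAGE_API_SECRET are assumptions made for illustration, not names taken from the project. With Python Decouple installed, the usual equivalent is config("VONAGE_API_KEY") after from decouple import config.

```python
from pathlib import Path


def load_env(path=".env"):
    """Parse KEY=value lines from an environment file into a dict.

    Hypothetical helper: a stand-in for a real loader, not the project's code.
    """
    settings = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without a key-value separator.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        settings[key.strip()] = value.strip().strip("'\"")
    return settings
```

Calling load_env() from the project folder would return a dictionary such as {"VONAGE_API_KEY": "..."}, which keeps the credentials out of the source files.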

The scripts follow the Object-Oriented Programming paradigm. The following are high-level explanations for each script.

automate.py

The first step to this project workflow is WhatsApp automation using Selenium.
Selenium is an open-source web-based automation tool that requires a driver to control the browser. Different drivers exist due to various browser configurations; some of the popular browsers’ drivers are listed below:

This tutorial uses the Chrome driver. To make it quick and easy to access, move the downloaded driver file to the same directory as the script utilizing it. See the file structure above.

This script comprises a WhatsappAutomation class that loads the web driver via its path, maximizes the browser window, and loads the WhatsApp Web application. The 30-second delay gives you time to scan the QR code and access your WhatsApp account on the web.

Once you scan the QR code with your phone, your WhatsApp account opens on the web.

The WhatsappAutomation class has two methods:

  • get_contacts()
  • quit()

The browser will display the notice “Chrome is being controlled by automated test software” to indicate that Selenium is driving the browser.

Next, you need to access the desired group and contacts, as shown below.

The automation step involves locating the WhatsApp Web page element that contains the phone numbers, as seen in the image above. There are numerous ways to select these elements, as highlighted in the Selenium documentation. This project uses XPath.
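A minimal sketch of the XPath lookup, assuming a Selenium 4 style driver and assuming the located element exposes the participants as one comma-separated string. The function name and the XPath passed to it are hypothetical; the real expression has to be copied from the inspected WhatsApp Web page.

```python
def get_group_members(driver, xpath):
    """Return the comma-separated entries shown in the element at `xpath`.

    `driver` is expected to behave like a Selenium WebDriver (assumption).
    """
    # Selenium 4 style lookup; "xpath" is the value of selenium's By.XPATH.
    element = driver.find_element("xpath", xpath)
    # Split the visible text on commas and drop empty fragments.
    return [entry.strip() for entry in element.text.split(",") if entry.strip()]
```

Keeping the driver as a parameter means the same function works with a real Chrome session or with a small stand-in object while testing.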

To find these element selectors, you need to inspect the WhatsApp Web page.

Next, the contact entries obtained via the XPath need to be cleaned up and saved as a CSV file. You’ll use regular expressions to remove the ‘+’ character and any whitespace from the phone numbers.
To free the resources held by the Selenium-powered browser, quit it upon completion of a session.
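The clean-up step can be sketched as follows. The function names, the phone_number column header, and the choice to skip entries that are not purely digits after cleaning (for example saved contact names) are assumptions for illustration rather than the article's original code.

```python
import csv
import re


def clean_numbers(entries):
    """Strip '+' signs and whitespace, keeping only all-digit results."""
    cleaned = []
    for entry in entries:
        number = re.sub(r"[+\s]", "", entry)
        # Assumption: entries such as saved contact names are not wanted.
        if number.isdigit():
            cleaned.append(number)
    return cleaned


def save_numbers(numbers, path):
    """Write one phone number per row under a single header column."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["phone_number"])  # hypothetical column name
        writer.writerows([number] for number in numbers)
```

For example, clean_numbers(["+1 202 555 0143", "You"]) returns ["12025550143"].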

analytics.py

Next, you’ll use the Vonage Number Insight API to generate insights from the saved CSV file. This API provides information about the validity, reachability, and roaming status of a phone number.

The script is made up of a WhatsappAnalytics class that first loads the Vonage credentials stored in the .env file using the Python decouple module. Next, it has a get_insight() method that takes the contact list and initiates an Advanced Number Insight to get the countries associated with the phone numbers. Finally, the list of countries is saved as a CSV file.
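A rough sketch of the lookup, written against the REST interface with the standard library rather than the Vonage SDK. The endpoint URL, the api_key, api_secret and number query parameters, and the status and country_name response fields are assumptions based on the public Number Insight documentation and should be checked against it. The function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed REST endpoint for Advanced Number Insight; verify in the Vonage docs.
INSIGHT_URL = "https://api.nexmo.com/ni/advanced/json"


def build_insight_url(number, api_key, api_secret):
    """Build the request URL for one phone number."""
    query = urlencode({"api_key": api_key, "api_secret": api_secret, "number": number})
    return f"{INSIGHT_URL}?{query}"


def fetch_json(url):
    """Perform the HTTP GET and decode the JSON body."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def get_countries(numbers, api_key, api_secret, fetch=fetch_json):
    """Return one country name per number, or None when a lookup fails."""
    countries = []
    for number in numbers:
        payload = fetch(build_insight_url(number, api_key, api_secret))
        # Assumption: a status of 0 signals a successful lookup.
        succeeded = payload.get("status") == 0
        countries.append(payload.get("country_name") if succeeded else None)
    return countries
```

The fetch parameter exists so the network call can be swapped out, for instance to try the function without spending API credit.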

geocoding.py

Next, the string description of the various locations (country names) will be geocoded to create the respective geographic coordinates (latitude/longitude pairs).

This script is made up of a GoogleGeocoding class that first loads the Google Maps API key. This class has a geocode_df method that takes a dataframe argument: the phone numbers and countries previously saved. The method aggregates the dataframe by country and returns the latitude and longitude pair for each one.
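The aggregation and geocoding logic can be sketched with plain Python lists instead of a dataframe. The function name, the row keys, and the geocode callable are assumptions for illustration. With the googlemaps package, that callable could wrap googlemaps.Client(key=...).geocode(name) and read the lat and lng values from the first result's geometry location.

```python
from collections import Counter


def geocode_countries(countries, geocode):
    """Count members per country and attach a (lat, lng) pair to each.

    `geocode` is any callable mapping a country name to (lat, lng);
    it stands in for the Google Maps Geocoding call (assumption).
    """
    # Ignore failed lookups (None or empty strings) when counting.
    counts = Counter(country for country in countries if country)
    rows = []
    # most_common() orders countries from most to fewest members.
    for country, members in counts.most_common():
        lat, lng = geocode(country)
        rows.append({"country": country, "members": members, "lat": lat, "lng": lng})
    return rows
```

Geocoding once per country rather than once per phone number keeps the number of billable API requests small.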

plotting.py

Next, you will need to map the geospatial data created (latitude and longitude pairs).
Mapmaking is an art; to make the project results aesthetically pleasing, use the Plotly library and Mapbox maps.

This script comprises the SpatialMapping class that loads the Mapbox token and chart_studio credentials. This class has two methods, plot_map and plot_bar, that plot the distribution of the WhatsApp group's users as a map and a bar chart.
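As a hedged sketch of the map step, the function below builds a Plotly figure specification as a plain dictionary, assuming each input row is a dict with country, members, lat and lng keys. The function name, the marker sizing, the title, and the "light" Mapbox style are illustrative choices, not the article's settings. With Plotly installed, plotly.graph_objects.Figure(spec).show() should render it.

```python
def build_map_figure(rows, mapbox_token):
    """Return a Plotly figure spec (plain dict) with one marker per country."""
    trace = {
        "type": "scattermapbox",
        "lat": [row["lat"] for row in rows],
        "lon": [row["lng"] for row in rows],
        "mode": "markers",
        # Hover text such as "Nigeria: 2".
        "text": [f'{row["country"]}: {row["members"]}' for row in rows],
        # Scale marker size with the member count so larger groups stand out.
        "marker": {"size": [8 + 4 * row["members"] for row in rows]},
    }
    layout = {
        "title": {"text": "WhatsApp group members by country"},
        # Token-based Mapbox styles need the access token in the layout.
        "mapbox": {"accesstoken": mapbox_token, "style": "light", "zoom": 1},
    }
    return {"data": [trace], "layout": layout}
```

Building the specification as data keeps this step independent of the plotting library until the moment the figure is rendered or uploaded.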

main.py

main.py is the entry point of the program. Here, all the script classes are imported, and the required parameters are passed in the main() function.

Try it out

In your terminal, run the main script file as follows:

This will import the various scripts and execute the main() function to yield the desired results.

Results

I’m sure you can already think of all the possibilities and use cases for this new piece of knowledge. The possibilities are endless.

Thanks for taking the time to read this article!

Happy Learning!

References

Vonage Number Insight API

Developer content from the team at Vonage, including posts on our Java, Node.js, Python, DotNet, Ruby and Go SDKs. https://developer.vonage.com