Launch Your ML App in 4 Easy Steps

From model training to deployment in AWS

Albert Jimenez
Level Up Coding


Machine learning and data science practitioners often carry out projects and experiments in Jupyter Notebooks, a workflow that lets them iterate fast and create visualizations easily. However, it is not as impactful, scalable, or easy to distribute to the world as deploying a model in the cloud. Most of the time, a good distribution network matters even more than the quality of the product itself.

This tutorial shows how to set up a machine learning project and integrate it with MLOps practices such as packaging it in a Docker container and deploying it on Amazon Web Services (AWS). It is suitable both for beginners as a learning exercise and for seasoned practitioners who want to make the most of any project.

It does not matter how good your product is if no one can see it — Photo by Josh Calabrese on Unsplash

Our main objective is to create a machine learning application that classifies different audio sounds and to deploy it in the cloud. During this tutorial we will perform data exploration and analysis, train and evaluate a machine learning model, build our app using Flask, and finally deploy it to the cloud for free using Amazon Elastic Beanstalk and Docker.

Introduction

In this repository you will find all the Python code required to follow the tutorial. It contains:

  1. A Colab notebook to perform Data Exploration and Analysis
  2. A Colab notebook to perform Training and Evaluation of a Machine Learning Model
  3. Instructions to create your own Flask app
  4. Instructions to create a Docker image and upload it to AWS Elastic Beanstalk to share your app with the world
Deploy your Sound Classifier — Photo by Jesman fabio on Unsplash

I recommend running the Colab notebooks alongside reading, as (for brevity) some steps are not shown in the post.

1. Data Exploration

We are going to use the well-known UrbanSound8K dataset, which contains the following 10 sounds: air conditioner, car horn, children playing, dog bark, drilling, engine idling, gun shot, jackhammer, siren, and street music. In this section we will go through a set of data analysis steps to observe patterns in the data and choose the best modeling approach accordingly.

We can observe in the notebook that the data was recorded and digitized in different ways:

  • Almost all samples were recorded using 2 channels (stereo).
  • Sample rates range from 8 kHz to 192 kHz (mostly 44.1 kHz and 48 kHz).
  • Audio lengths range from 0.0008 s to 4 s (mostly 4 s).
  • Bit depths range from 4 to 32 bits per sample (mostly 24 bits).

Due to this variability, the data will need to be standardized before being fed to a machine learning model. We will use the Librosa library to load, standardize, plot, and process the audio.
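As a minimal illustration (the file path below is a placeholder, not a real dataset file), loading and standardizing a clip with Librosa looks like this:

import librosa

# hypothetical path inside the UrbanSound8K folder structure
path = 'UrbanSound8K/audio/fold1/example.wav'

# librosa.load resamples to the target rate and downmixes to mono by default,
# standardizing sample rate and channel count in a single call
signal, sr = librosa.load(path, sr=22050, mono=True)
print(signal.shape, sr)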

We can observe the different signal waveforms (amplitude over time) in the image below.

The audio waveform for a random sample of each dataset class

The Short-Time Fourier Transform (STFT), Mel-Spectrograms, and Mel-Frequency Cepstral Coefficients (MFCCs) are all popular ways to process audio signals and generate discriminative features that we can use as input for machine learning algorithms. For a review of audio signal processing, I really recommend the great series of YouTube videos by Valerio Velardo.

We are going to visualize the Mel-Spectrogram, which represents the magnitude of the different frequencies at each timestep. The frequency axis is transformed to the Mel scale, which accounts for how humans perceive and process audio signals.

Mel-Spectrogram of the previous audio samples
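For reference, a plot like the one above can be produced with Librosa along these lines (the file path is again a placeholder):

import numpy as np
import librosa
import librosa.display
import matplotlib.pyplot as plt

signal, sr = librosa.load('UrbanSound8K/audio/fold1/example.wav')
# power spectrogram on the Mel scale, converted to decibels for plotting
mel = librosa.feature.melspectrogram(y=signal, sr=sr, n_mels=128)
mel_db = librosa.power_to_db(mel, ref=np.max)
librosa.display.specshow(mel_db, sr=sr, x_axis='time', y_axis='mel')
plt.colorbar(format='%+2.0f dB')
plt.show()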

After looking at the graphs for several different samples, we can observe that the signals of the different classes look distinct enough to be classified correctly by a machine learning algorithm.

2. Machine Learning

In this part of the project, we are going to train a machine learning model able to categorize the 10 sounds of our dataset. Based on the data exploration, and after reading about the state of the art in audio signal classification, I made the following design choices:

  • We will train a Convolutional Neural Network (CNN) using either MFCCs, the STFT, or Mel-Spectrograms as input features.
  • As audio durations range from 0 to 4 s, we will pad the generated spectrograms so that all inputs have equal length.

My first choice would have been STFT features, since in theory CNNs could take more advantage of the time-frequency structure and learn filters from a representation closer to the raw signal rather than one designed by humans. However, to save computational resources and make things faster, we will use MFCCs, as they are more memory efficient.

To compute MFCCs as features, it is common to either (see the sketch after this list):

  • Compute the first 13 MFCCs plus their first and second derivatives, or
  • Use the first 40 MFCCs (a common choice with Librosa).
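A sketch of the second option, including the padding mentioned earlier (the target frame count is an illustrative value, not necessarily the one used in the notebook):

import numpy as np
import librosa

N_MFCC = 40        # number of coefficients used in this post
MAX_FRAMES = 174   # hypothetical frame count for a 4 s clip

def extract_mfcc(path):
    signal, sr = librosa.load(path, sr=22050, mono=True)
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=N_MFCC)
    # zero-pad shorter clips (or truncate longer ones) along the time axis,
    # so every clip yields an input of identical shape
    if mfcc.shape[1] < MAX_FRAMES:
        mfcc = np.pad(mfcc, ((0, 0), (0, MAX_FRAMES - mfcc.shape[1])))
    else:
        mfcc = mfcc[:, :MAX_FRAMES]
    return mfcc

features = extract_mfcc('UrbanSound8K/audio/fold1/example.wav')  # shape: (40, 174)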

Our machine learning model is going to be a Fully Convolutional Network (FCN). As our inputs are rectangular (the y axis is the MFCC index, the x axis is time), instead of the usual square filters we will use rectangular ones, so they can better capture the correlation of the MFCCs along the temporal dimension.

The full model definition is in the notebook; a sketch is shown below.
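This is a minimal Keras sketch, including compilation; filter counts and kernel sizes are illustrative rather than the exact trained configuration. Note the rectangular 5×3 kernels and the fully convolutional head:

from tensorflow.keras import layers, models

def build_model(n_mfcc=40, n_frames=174, n_classes=10):
    return models.Sequential([
        # rectangular kernels: taller along the MFCC axis than along time
        layers.Conv2D(32, (5, 3), padding='same', activation='relu',
                      input_shape=(n_mfcc, n_frames, 1)),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (5, 3), padding='same', activation='relu'),
        layers.MaxPooling2D((2, 2)),
        # fully convolutional head: 1x1 conv + global pooling, no Dense layers
        layers.Conv2D(n_classes, (1, 1)),
        layers.GlobalAveragePooling2D(),
        layers.Activation('softmax'),
    ])

model = build_model()
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])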

We are facing a classic multi-class classification problem, so we will use the categorical cross-entropy loss function. As optimizer, we will use the Keras implementation of Adam with its default hyperparameter values.

The notebook contains all the training and evaluation code, and you can try different hyperparameters and network configurations. Once you have trained a model, you can inspect the results and plot them in a confusion matrix to compare performance across models.
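As a sketch (assuming x_test and one-hot y_test arrays from the notebook's split, plus a class_names list with the 10 labels), the confusion matrix can be plotted with scikit-learn:

import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

# collapse one-hot targets and softmax outputs to integer class ids
y_pred = model.predict(x_test).argmax(axis=1)
y_true = y_test.argmax(axis=1)

cm = confusion_matrix(y_true, y_pred)
ConfusionMatrixDisplay(cm, display_labels=class_names).plot(xticks_rotation=45)
plt.show()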

Confusion Matrix showing our model predictions

3. Creating the Flask App

Once we are happy with the model we have trained, we are going to build a small app and deploy it in the cloud. We will use Flask to build the app.

Flask is a lightweight WSGI web application framework. It is designed to make getting started quick and easy, with the ability to scale up to complex applications. It began as a simple wrapper around Werkzeug and Jinja and has become one of the most popular Python web application frameworks.
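To give a feel for it, here is a bare-bones Flask app; the route names and placeholder response are illustrative, while the real app.py in the repository serves an upload page and runs the trained model:

from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route('/')
def home():
    return 'Sound classifier is up!'

@app.route('/predict', methods=['POST'])
def predict():
    audio = request.files['file']  # uploaded audio clip
    # the real app would compute MFCCs here and call model.predict
    return jsonify({'filename': audio.filename, 'class': 'placeholder'})

if __name__ == '__main__':
    # port 5000 is Flask's default and matches the localhost:5000 step below
    app.run(host='0.0.0.0', port=5000)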

I will provide the app.py file containing all the code needed to run it. You will need a Python installation and pip to be able to run the following code.

1. Clone this repository:
git clone git@github.com:jsalbert/sound_classification_ml_production.git

2. Create a virtual environment using virtualenv and install library requirements:

pip install virtualenv
virtualenv .venv
source .venv/bin/activate
pip install -r requirements.txt

3. Go to the folder flask_app and run the app locally in your computer:

python app.py

4. Access it via localhost:5000

You should be able to see this screen:

At this point you should be able to run the app locally on your computer.

4. Creating a Docker Image and Uploading to AWS Elastic Beanstalk

And finally… let’s deploy our app in the cloud using Docker and AWS Elastic Beanstalk!

For this part of the tutorial, you will need to install Docker and create accounts on Amazon Web Services (AWS) and Docker Hub. I recommend going through Docker-Curriculum for a more extensive introduction to Docker.

What is Docker and what are its benefits?

Docker is a tool that allows developers, sysadmins, and others to easily deploy their applications in a sandbox (called a container) that runs on the host operating system, typically Linux. The key benefit of Docker is that it allows users to package an application with all of its dependencies into a standardized unit for software development. Unlike virtual machines, containers have little overhead and hence enable more efficient usage of the underlying system and resources.

Containers offer a logical packaging mechanism in which applications can be abstracted from the environment in which they actually run. This decoupling allows container-based applications to be deployed easily and consistently, regardless of whether the target environment is a private data center, the public cloud, or even a developer’s personal laptop.

We will use Docker to create a container for our app. I will provide the necessary Dockerfile and Dockerrun.aws.json files to be able to run your app locally and deploy it in AWS.

First, verify that Docker is properly installed by running:

docker run hello-world

and you should see:

Hello from Docker!
This message shows that your installation appears to be working correctly.

After that, we will build our own Docker image via a Dockerfile.

A Dockerfile is a text document that contains all the commands a user could call on the command line to assemble an image. Using docker build users can create an automated build that executes several command-line instructions in succession.
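The repository ships the actual Dockerfile; a minimal one for a Flask app like ours could look roughly like this (base image and package choices are illustrative):

FROM python:3.8-slim

# libsndfile is commonly needed by librosa's soundfile backend
RUN apt-get update && apt-get install -y libsndfile1 && rm -rf /var/lib/apt/lists/*

WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .

EXPOSE 5000
CMD ["python", "app.py"]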

You should be in the same directory as the Dockerfile, then run:

docker build -t yourusername/sound_classification .

Now you should be able to see your image in the list when you run:

docker images

And you should be able to run the app locally:

docker run -p 8888:5000 yourusername/sound_classification

To deploy our image, we will publish it on a registry that AWS can access, in our case Docker Hub. If this is the first time you are pushing an image, the client will ask you to log in. Provide the same credentials that you used for Docker Hub.

docker login

After that you can push your image to the registry by running:

docker push yourusername/sound_classification

Once that is done, you should be able to view your image on Docker Hub. And now that your image is online, anyone who has Docker installed can access and use your app by typing just a single command!

docker run -p 8888:5000 yourusername/sound_classification

Now we will see how to carry out the deployment on AWS Elastic Beanstalk.

AWS Elastic Beanstalk is an easy-to-use service for deploying and scaling web applications and services developed with Java, .NET, PHP, Node.js, Python, Ruby, Go, and Docker on familiar servers such as Apache, Nginx, Passenger, and IIS. You can simply upload your code and Elastic Beanstalk automatically handles the deployment, from capacity provisioning, load balancing, auto-scaling to application health monitoring. At the same time, you retain full control over the AWS resources powering your application and can access the underlying resources at any time.

You will need to modify the Dockerrun.aws.json file to replace the username with yours. If you leave mine, it should also work, as I am hosting a public image.
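For reference, a single-container Dockerrun.aws.json (version 1) pointing at the Docker Hub image looks roughly like this:

{
  "AWSEBDockerrunVersion": "1",
  "Image": {
    "Name": "yourusername/sound_classification",
    "Update": "true"
  },
  "Ports": [
    {
      "ContainerPort": "5000"
    }
  ]
}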

Follow the next steps:

  1. Login to your AWS console.
  2. Search Elastic Beanstalk on the search bar or the menu and click.
  3. Select Create new environment.
  4. Enter a name for your application under Application name.
  5. Choose Docker as a platform. Select Upload your code and upload the Dockerrun.aws.json file after making your changes.

Creating the environment will take a few minutes, and after that you should be able to access the website where your app is hosted.

If you click on the link you will be directed to the website where the app is deployed. You can play with your own audios or the example ones.

And… don’t forget to shut down the environment in AWS when you are finished, so you don’t get charged for unused resources!

For any feedback, comment, typo or error correction, please let me know in the Issues section.

Hope you enjoyed the tutorial and thanks for reading.
