How to Transcribe Audio Files to Text

Ng Wai Foong
Published in Level Up Coding
5 min read · Nov 30, 2021

An alternative API for Speech Recognition

Photo by Kelly Sikkema on Unsplash

This tutorial covers transcribing audio recordings to text programmatically. Advances in machine learning have made it practical to process human speech and convert it into text, and Automatic Speech Recognition (ASR) has become one of the most active fields in recent years.

ASR is commonly used by companies for the following downstream tasks:

  • Virtual conferences: transcribe the presenter's speech to convey the message more clearly
  • Telephony: transcribe audio recordings from telecommunication services to automate responses or gain better insight into the underlying conversation
  • Video platforms: generate subtitles or captions for videos

In this tutorial, you will learn to transcribe local audio files using the Speech-to-Text API provided by AssemblyAI. I have specifically chosen this API for the following reasons:

  • There already exist numerous articles on Google Speech-to-Text and AWS Transcribe.
  • It provides a number of useful features, such as speaker diarization, custom vocabulary and topic detection.
  • The official documentation contains example code for Node.js, Python, PHP, Ruby and C#. This tutorial focuses on Python.

Let’s proceed to the next section and start installing the necessary modules.

Setup

It is highly recommended to create a new virtual environment before you continue. This tutorial relies on the requests package, which is not part of the Python standard library, so a fresh virtual environment will not include it. Run the following command to install it:

pip install requests
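
To confirm that the package is visible inside the active environment, a quick check along these lines can help. This is only a minimal sketch; the helper name is_installed is my own and not part of any library.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the named top-level module can be found for import."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("requests"):
    print("requests is available")
else:
    print("requests is missing; run: pip install requests")
```

If the second message appears, the most likely cause is that pip installed the package into a different environment than the one running the script.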

Register a new free trial account on the AssemblyAI website and obtain the corresponding API key. The trial account comes with 3 hours of transcription per month.

Image by the author

Copy the API key, as you will need it later when calling the API.
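
Rather than pasting the key directly into a script, one option is to keep it in an environment variable and read it at runtime. The sketch below makes two assumptions: the variable name ASSEMBLYAI_API_KEY is a hypothetical choice of mine, and the key is sent in an "authorization" header, which matches my understanding of the AssemblyAI documentation but should be confirmed there.

```python
import os

# Hypothetical environment variable name chosen for this sketch.
API_KEY_ENV_VAR = "ASSEMBLYAI_API_KEY"


def build_headers(api_key=None):
    """Return HTTP headers carrying the API key.

    Assumption: the API expects the key in an "authorization" header;
    confirm this against the official documentation.
    """
    # Prefer an explicitly passed key, otherwise fall back to the environment.
    key = api_key or os.environ.get(API_KEY_ENV_VAR)
    if not key:
        raise RuntimeError(
            f"Set the {API_KEY_ENV_VAR} environment variable first."
        )
    return {"authorization": key}
```

The resulting dictionary can then be passed as the headers argument of requests.get or requests.post, which keeps the key out of source control.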

Senior AI Engineer@Yoozoo | Content Writer #NLP #datascience #programming #machinelearning | Linkedin: https://www.linkedin.com/in/wai-foong-ng-694619185/