Here's what I've learned from reversing the Udemy rankings algorithm

Mykhailo Kushnir
Published in Level Up Coding
5 min read · Jul 3, 2022


*reversing — reverse engineering, or at least pretending to.

Photo by Danil Shostak on Unsplash

Problem description: Having your content near the top of the ranking tables leads to more views. Period. That's true on Google, and it's undoubtedly true on Udemy. But how do you get to the top?

Features

The following set of features can characterize every Udemy course:

  • Course Title
  • Course SubTitle
  • Course Description
  • Number of Students
  • Number of Reviews
  • Rating
  • Length of the course
  • Price of the course
  • Lecture Names

There may also be other features that are hidden from users but visible to the site's developers. For simplicity's sake, let us assume that features like the referrer URL or the user's country are not that relevant, and examine how only those above influence the final position.
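To keep the feature list concrete, here is a minimal sketch of how one course could be represented in Python. The class name, field names and sample values are my own assumptions for illustration and do not come from any Udemy API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Course:
    """User-visible course features (field names are illustrative)."""
    title: str
    subtitle: str
    description: str
    num_students: int
    num_reviews: int
    rating: float          # average star rating
    length_hours: float
    price: float
    lecture_names: List[str] = field(default_factory=list)

    def text_fields(self) -> List[str]:
        """Return every free-text field a keyword search could match."""
        return [self.title, self.subtitle, self.description, *self.lecture_names]


# Made-up sample values, not real course data.
course = Course(
    title="Web scraping with Python",
    subtitle="Collect data from websites",
    description="Learn how to scrape pages step by step.",
    num_students=1200,
    num_reviews=85,
    rating=4.5,
    length_hours=6.5,
    price=19.99,
    lecture_names=["Introduction", "Your first scraper"],
)
print(len(course.text_fields()))
```

Splitting the text fields from the numeric ones mirrors the rest of the post: the first group is what you can edit today, the second needs traffic first.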

Udemy Knows Lemmatization

Lemmatization is a process of grouping together different inflected forms of a word so they can be analyzed as a single item. For example, the term "walk" can be grouped with its inflected forms, "walks", "walked", and "walking". This process is standard for text analysis because it reduces the number of unique words in a document, making it easier to identify patterns and trends. Lemmatization is also helpful for tasks such as information retrieval, and Udemy devs seem to agree with it.

Go to the Udemy search and type "web scraping". In the TOP3, you'll see only courses with the exact phrase ("web scraping"), but in the middle of the page you'll see that the algorithm highlights words like "scrape" in the description.

Notice bold "scrape" in the description.
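The behaviour above can be imitated with a toy lemma lookup. This is only a sketch: the `LEMMAS` table is hand-written for the example, and I have no idea how Udemy actually implements the matching. A real system would use a full lemmatizer with a dictionary.

```python
import re

# Hand-written lemma table, for illustration only.
LEMMAS = {
    "walks": "walk", "walked": "walk", "walking": "walk",
    "scrapes": "scrape", "scraped": "scrape", "scraping": "scrape",
}


def lemma(word: str) -> str:
    """Map a word to its base form, falling back to the lowercased word."""
    lowered = word.lower()
    return LEMMAS.get(lowered, lowered)


def highlight(query: str, text: str) -> str:
    """Wrap in ** every word of `text` whose lemma matches a query lemma."""
    query_lemmas = {lemma(w) for w in re.findall(r"\w+", query)}

    def mark(match: re.Match) -> str:
        word = match.group(0)
        return f"**{word}**" if lemma(word) in query_lemmas else word

    return re.sub(r"\w+", mark, text)


print(highlight("web scraping", "Learn to scrape any web page"))
# prints: Learn to **scrape** any **web** page
```

The query never contains "scrape", yet the word is still highlighted because both forms reduce to the same lemma.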

This can be useful for optimizing your title and description, as you can capture more traffic from low-frequency requests. You can find keywords relevant to your topic with either Google Keyword Planner or the Marketing Insights Tool from Udemy:

Shares of traffic for "Web Scraping" from Marketing Insights Tool

Whales Get More

Some courses on Udemy are so good that they cover more than they were designed to cover. For instance, there are cases where the learning material doesn't fit the search request, but Udemy still offers it because many students are taking it and they have interests similar to those of people who typically make that request.

A great course by Dr Angela Yu regarding Python is #5 in “Web Scraping”

Good-Old SEO Works

In 2010, SEO on Google was all about optimizing your website for the search engines by creating keyword-rich titles and meta tags. While these methods could still be effective, they didn't consider the user experience. As a result, many people began to focus on creating more informative and engaging content rather than on keywords. Now SEO is more about creating a great overall experience for users than about optimizing for the search engines. On Udemy, the old tricks are still gold.

Count how many times you see “SEO” in the TOP3 SEO courses

It seems like the algorithm has little to no understanding of keyword spamming. The more you mention the key terms, the better. Just switching from "Web Scraping" to "Web scraping with Python" moved my course from the second to the first page of the ranking for the request "python web scraping".
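If you want to check how often a key phrase shows up in your own metadata, a few lines of Python are enough. This is a minimal sketch: the function names and the sample course fields are made up for the example, and a raw count is only a rough proxy for whatever Udemy really computes.

```python
import re


def count_phrase(phrase: str, text: str) -> int:
    """Count case-insensitive, whole-word occurrences of `phrase` in `text`."""
    pattern = r"\b" + re.escape(phrase) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def keyword_mentions(phrase: str, fields: dict) -> dict:
    """Return the number of mentions of `phrase` per metadata field."""
    return {name: count_phrase(phrase, text) for name, text in fields.items()}


# Hypothetical course metadata, made up for this example.
fields = {
    "title": "Web scraping with Python",
    "subtitle": "Python web scraping from scratch",
    "description": "Learn web scraping and build a scraper in Python.",
}

mentions = keyword_mentions("web scraping", fields)
print(mentions)                # per-field counts
print(sum(mentions.values()))  # total mentions across all fields
```

Running the same count against the top results for your target request gives a quick baseline to compare your own course with.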

Disclaimer: Google, honestly, I'm not trying to optimize for specific keywords here.

If you also play around with other requests, you'll see that sometimes Udemy offers courses with a non-relevant title but with a few lectures on the subject. Probably, Udemy thinks that something is better than nothing.

No Signs Of Deep Learning

The previous observation suggests that the algorithm involves no retraining and is purely statistical. While Google takes time to update its ranking after crawling something new on your site, Udemy seems to apply changes immediately. This lets you optimize your course metadata faster, but I'm not aware of the downsides of changing the content frequently. So there may be a penalty.

Size Matters

Scroll up to the list of features, take a long, appraising look and get back down here. Which of those attributes is the easiest to make a difference with? We've already learned that spamming titles, subtitles, descriptions and lecture names does the job, but there are other smart ladies and gentlemen on Udemy. People have already figured that out, and the TOP 10 for most topics will be heavily optimized. You can't do much about rating, student numbers or reviews, as you first need traffic to get all of those. Besides the price, which arguably has no influence on ranking positions, there's only one feature left: the length of the course.

A quick glance at Udemy’s website reveals a startling truth: the vast majority of their top-rated courses are 10 hours or longer. In a world where time is increasingly precious, it seems odd that Udemy would choose to feature courses that require such a significant time commitment. Further, many of these courses are on subjects that are rapidly changing, meaning that the material could be obsolete by the time the course is completed. Even worse, some of the courses seem to be little more than an attempt to cash in on a current trend. In an era where people are looking for quality over quantity, it’s disappointing to see that Udemy is still favouring length over substance.

Conclusion

This post wasn't conceived as a scientific one, so most conclusions here are vague and not backed by data. I'm happy to fix that in the next part, but I'd love to get some feedback first. Let me know if you need a more detailed reversing of the Udemy rankings algorithm ✌️
