Can data scraping get you sued?

Mykhailo Kushnir
Published in Level Up Coding
4 min read · Jul 7, 2022


This week, Meta (Facebook) sued two web scraping companies, alleging they violated its terms of service. Meta says the companies are “scraping data from Facebook and Instagram” and is seeking damages.

Photo by Tingey Injury Law Firm on Unsplash

Meta’s lawsuit will likely send a chill through the web scraping industry. If the company prevails, it could set a precedent that makes it much easier for other sites to sue web scrapers. That could close off a vital data source for many startups and businesses.

If you want to learn the legal way of scraping in no time, I have a course that gives a basic toolkit for students in less than 1 hour.

What is data scraping, and why do people use it?

Web scraping is the process of extracting data from web pages. It can be used to gather information about a particular topic or to collect data for research purposes. While it is often used for commercial purposes, scraping can also serve noble causes and even graphics creation.

For example, I made my first NFT by crawling the most viewed wiki titles of 2021. The idea was to highlight some trends during that gloomy year. As you can see below, the data speaks for itself.

Most popular wiki titles in 2021 scraped by me
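
To make "extracting data from web pages" a bit more concrete, here is a minimal sketch in Python using only the standard library. It is not the code behind the NFT above: the sample HTML, the `LinkTextParser` class, and the `extract_link_titles` function are all made up for illustration, and a real page would need its own parsing rules.

```python
from html.parser import HTMLParser


class LinkTextParser(HTMLParser):
    """Collects the visible text of every <a> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self._in_link = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_link = True
            self.titles.append("")  # start a new title

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_link = False

    def handle_data(self, data):
        if self._in_link:
            self.titles[-1] += data  # text may arrive in several chunks


def extract_link_titles(html):
    """Return the non-empty link texts found in an HTML string."""
    parser = LinkTextParser()
    parser.feed(html)
    parser.close()
    return [title.strip() for title in parser.titles if title.strip()]


# Invented sample markup standing in for a downloaded page.
SAMPLE_HTML = """
<ul>
  <li><a href="/wiki/Example_A">Example A</a></li>
  <li><a href="/wiki/Example_B">Example B</a></li>
</ul>
"""

print(extract_link_titles(SAMPLE_HTML))
```

The sketch works on a string that is already in memory, so it says nothing about whether you are allowed to download a given page in the first place.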

Is this the first time?

No, there have been a few earlier cases.

One example was when Facebook sued Power Ventures for $3 million. Power Ventures offered an app that let customers manage their social networks from a single place. Of course, even in 2008, that included Facebook.

Facebook sent cease-and-desist letters to Power Ventures, but the company continued to scrape data. Facebook then sued, claiming that Power Ventures had violated the Computer Fraud and Abuse Act (CFAA) and the CAN-SPAM Act, a federal law that prohibits sending commercial emails with materially misleading information. The case was not settled quietly: in 2016, the Ninth Circuit held that Power Ventures had violated the CFAA by continuing to access Facebook after receiving the cease-and-desist letter, though it rejected the CAN-SPAM claim. The aftermath of the battle was such that Power Ventures ceased to exist.

The other prominent opponent of web scrapers is LinkedIn. In 2017, the company sent a cease-and-desist letter to hiQ Labs, alleging that hiQ had scraped data from its site without authorization, and the dispute went to court. hiQ argued that the data was public information and, therefore, fair game. In 2019, an appeals court sided with hiQ, although the litigation continued afterwards.

These lawsuits illustrate the legal uncertainties surrounding web scraping. While some companies view it as a valuable tool for gathering data, others consider it a form of theft. As more companies come to rely on data, we will likely see more lawsuits over web scraping in the years to come.

What methods can you use to scrape legally?

Web scraping can be a useful tool for extracting data from websites. However, it is important to scrape websites in a way that does not violate any terms of service or copyright laws. There are a few different methods you should know of:

  • Publicly available APIs: Many websites make their data available through an API, which can be used to retrieve the data without violating any terms of service.
  • Screen scraping: Screen scraping refers to extracting data from web pages that are publicly available. This is generally considered to be legal, as long as the web pages being scraped are not behind a paywall or login page.
  • Bot-permitted content: Scrape only the data that the site allows bots to access through robots.txt and its Terms and Conditions. The general rule of thumb is that if Google can scrape it, so can you.
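
The robots.txt part of that last point can be checked programmatically. Below is a minimal sketch using Python's standard `urllib.robotparser` module. The robots.txt content, the `my-research-bot` user agent, the example.com URLs, and the `build_parser` and `is_allowed` helpers are assumptions made up for illustration. Passing this check does not cover the Terms and Conditions, which still have to be read by a human.

```python
from urllib.robotparser import RobotFileParser

# Invented robots.txt content; a real one lives at https://<site>/robots.txt.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt):
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser, user_agent, url):
    """Return True if the rules permit this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(SAMPLE_ROBOTS_TXT)

# Hypothetical bot name and URLs, used only to exercise the rules above.
print(is_allowed(parser, "my-research-bot", "https://example.com/articles/1"))
print(is_allowed(parser, "my-research-bot", "https://example.com/private/data"))
```

Against a live site, `RobotFileParser` can also load the file itself via `set_url()` followed by `read()`. The sketch parses an in-memory string instead so that it runs without network access.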

Overall, be ready to comply with demands to stop when they are issued to you, even if it hurts your business. We’re still living on a centralized internet, so it is important to respect the data ownership rights of the site that you’re scraping.

Conclusion

There’s no doubt that scraping private data can get you in trouble. Even if you manage to avoid legal prosecution, you’ll still have to deal with public opinion. The fact is that most people don’t like having their personal information collected without their knowledge or consent. If they find out that you’ve been scraping their data, they’re likely to be angry and upset. In some cases, this can lead to public shaming and even boycotts of your business. So it’s important to be very careful about what data you collect and how you use it. Otherwise, you could find yourself in hot water very quickly.

Have you ever scraped data? Let me know in the comments below. Promise, I won’t tell anybody!
