Principal Component Analysis for Dimensionality Reduction in Python

Misha Sv

Published in Level Up Coding

9 min read · Mar 29, 2020

This article walks through principal component analysis in Python.

Table of Contents:

Introduction
Principal component analysis (Overview)
Principal component analysis in Python
Conclusion

Introduction

One of the main reasons for writing this article was my obsession with understanding the details, logic, and mathematics behind Principal Component Analysis (PCA). Most online tutorials and articles about principal component analysis in Python today focus on showing learners how to apply the technique and visualize the results, rather than starting from the very beginning: why do we even need it in the first place? What is it about our data that makes us want to shrink the number of features or group them?

Let’s start from the beginning. What are you going to do with your dataset, even if you don’t do any dimensionality reduction? I guess you are trying to feed it to a machine learning algorithm, right?

So that’s our first step. Our goal is to have an algorithm-friendly dataset. What do we mean by that?

When you have a lot of features, there are a few potential drawbacks:

Your model will have a high degree of complexity
The extra features may introduce a significant amount of noise
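To make these drawbacks a bit more concrete, here is a minimal sketch. It is not taken from the rest of this article, and it rests on assumptions: the data is synthetic, the names (`factors`, `mixing`, `X_reduced`) are made up for illustration, and PCA is computed directly with NumPy's SVD rather than with a dedicated library. It builds a dataset with 10 features that are really driven by only 2 hidden factors, then checks how much of the variance the first 2 principal components capture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 samples, 10 features, all driven by 2 hidden factors
n_samples, n_features, n_factors = 200, 10, 2
factors = rng.normal(size=(n_samples, n_factors))
mixing = rng.normal(size=(n_factors, n_features))
noise = 0.05 * rng.normal(size=(n_samples, n_features))
X = factors @ mixing + noise

# PCA via SVD: center the data, then decompose it
X_centered = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Share of the total variance carried by each principal component
explained_variance_ratio = S**2 / np.sum(S**2)

# Project onto the first 2 principal components (rows of Vt)
X_reduced = X_centered @ Vt[:n_factors].T

print(X.shape, "->", X_reduced.shape)
print(explained_variance_ratio.round(3))
```

In a setup like this, almost all of the variance should land in the first two components, which is the kind of redundancy that makes a dataset with many features less "algorithm-friendly" than it needs to be.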
