Python prerequisites to kick start your Machine Learning journey

Harsh Panchal
Published in Level Up Coding
11 min read · Jun 1, 2021


Photo by Joshua Reddekopp on Unsplash

Welcome to an illustrated guide to Python prerequisites. This walkthrough provides an overview of the Python language features you will need to start a basic machine learning journey. It covers variables, lists, tuples, dictionaries, for-loops, if-statements, list comprehensions, NumPy, and Matplotlib.

1. Variables

In Python, all pieces of data (numbers, characters, strings, everything) are stored as objects, and we refer to these objects using variables. In the simplest case, we give a variable a value using the assignment operator, which is the equals sign (=):

x = 1
y = 2
z = x + y
print(z)

Note that we can assign literal values, such as assigning 1 to the variable x and 2 to the variable y, and we can also assign a value computed from other variables, such as assigning the sum of x and y to z.
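As a small illustration of how variables refer to objects, here is a minimal sketch (it reuses x, y, and z from above). It suggests that rebinding x later does not change a value that was already computed from it, and it uses the built-in type() function to inspect what kind of object a name refers to.

```python
# Variables are names bound to objects.
x = 1
y = 2
z = x + y       # z is bound to the result of x + y at this moment

x = 10          # rebinding x does not retroactively change z
print(z)        # z still holds the earlier sum

# type() reports what kind of object a name currently refers to
print(type(z))
print(type("hello"))
```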

2. Collections

Python collections let us place multiple data objects in a single object that we can access or iterate over. There are a variety of built-in collections, and we will discuss three of them: lists, tuples, and dictionaries. Later, we will introduce another way of collecting data, using arrays from a library called NumPy.

A. Lists

A list contains an ordered sequence of objects. It is written with square brackets, with commas between the objects, as shown below:

list = ['A', 'B', 'C', 'D']

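The list above contains a sequence of characters. (Note that naming a variable list hides Python's built-in list type; the name is kept here for simplicity, but a different name is safer in real code.) A list can also hold objects of different types: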

varied_list = ['a', 2, 'b', 3.14]  # a list with elements of string, integer, and float types
nested_list = ['hello', 'Harsh', [1.618, 42]]  # a list within a list!

Lists support indexing, which retrieves a specific item from the list by its position. For example, say you wanted to grab the second element of varied_list above.

second_element = varied_list[1]
print(second_element)

Python is a zero-indexed programming language. This simply means that the “first” item in a list or other collection is accessed using the index 0 (zero) rather than 1. That is why, above, we retrieve the second item of varied_list using the integer index 1 instead of 2, as some might expect from a one-indexed language (such as MATLAB).

Another feature of Python indexing is support for negative indices. As discussed above, the “first” item of a Python list has index 0; counting backward from there, the last item in the list has index -1. See the following examples of negative indexing:

last_element = list[-1] #to print the last element of list
print(last_element)
last_element_2 = list[len(list)-1] #to print also the last element of list
print(last_element_2)
second_to_last_element = list[-2] #to print second to last element of list
print(second_to_last_element)
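The relationship between negative and non-negative indices can be checked directly. The sketch below uses an illustrative list named letters (a name chosen here, not taken from the examples above) and also shows what happens when an index falls outside the list.

```python
letters = ['A', 'B', 'C', 'D']  # illustrative list

# For every valid k, letters[-k] is the same item as letters[len(letters) - k]
for k in range(1, len(letters) + 1):
    assert letters[-k] == letters[len(letters) - k]

first = letters[0]    # zero-based: index 0 is the first item
last = letters[-1]    # index -1 is the last item

# Indexing past the end of the list raises IndexError
try:
    letters[4]
    out_of_range = False
except IndexError:
    out_of_range = True

print(first, last, out_of_range)
```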

Similar to indexing is list slicing, which retrieves a contiguous section of a list. A colon (:) is used to make a slice, with the numbers on either side giving the positions where the slice starts and ends. Below, we show that the start or end value can be omitted when slicing from the beginning or to the end of a list. Also note that the start index of a slice is included in the result, but the end index is not.

NFL_list = ["Charger", "Bronco", "Raider", "Chief", "Panther", "Falcon", "Cowboy", "Eagle"]
AFC_west_list = NFL_list[:4] # Slice to grab list indices 0, 1, 2, 3 -- "Charger", "Bronco", "Raider", "Chief"
print(AFC_west_list)
NFC_south_list = NFL_list[4:6] # Slice list indices 4, 5 -- "Panther", "Falcon"
print(NFC_south_list)
NFC_east_list = NFL_list[6:] # Slice list indices 6, 7 -- "Cowboy", "Eagle"
print(NFC_east_list)
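A few more slicing patterns may be useful in practice. The following sketch uses an illustrative list named teams with the same contents as above; the optional third number in a slice is the step, a standard feature of Python slicing that the examples above do not use.

```python
teams = ["Charger", "Bronco", "Raider", "Chief", "Panther", "Falcon", "Cowboy", "Eagle"]

middle = teams[2:5]        # indices 2, 3, 4 (start included, end excluded)
every_other = teams[::2]   # an optional third number sets the step
last_two = teams[-2:]      # negative indices also work inside slices
copy_all = teams[:]        # omitting both bounds copies the whole list

# When both bounds are within range, a slice holds (end - start) items
assert len(middle) == 5 - 2

print(middle)
print(every_other)
print(last_two)
```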

B. Tuples

A tuple is a Python collection that is very similar to a list, with some subtle differences. For starters, tuples are written with parentheses instead of square brackets:

x = 2
y = 3
coordinates = (x, y)

The variable coordinates above is a tuple containing the values of x and y. This example was chosen to illustrate how tuples are typically used differently from lists. While lists are often used to hold items of the same kind, tuples are often used to hold the related attributes of a single unit. For example, as above, it makes sense to treat the coordinates of a point as a single unit. As another example, consider the following tuples and list built from dates:

year1 = 2022
month1 = "May"
day1 = 15
date1 = (month1, day1, year1)
year2 = 2021
month2 = "June"
day2 = 17
date2 = (month2, day2, year2)
years_list = [year1, year2]

Note above that we have collected the attributes of one date into one tuple: those pieces of information together define one “unit”. In contrast, in the list of years, we have collected values from different dates: the list items are of the same kind (they are both years), but they do not belong to the same unit.

The distinction I draw between tuples and lists here is a convention that most Python programmers follow, but it is not strictly enforced (i.e., you won't get an error if you break this convention!). Another subtle difference between tuples and lists involves mutability: a list can be changed after it is created, while a tuple cannot. Mutability is a deeper topic than we need for this guide, but interested readers are encouraged to read further!
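For readers curious about the mutability difference mentioned above, here is a minimal sketch. The names point_list and point_tuple are illustrative. It shows only the most basic behavior: an item of a list can be replaced in place, while attempting the same on a tuple raises a TypeError. It also shows tuple unpacking, which fits the "single unit" usage described earlier.

```python
point_list = [2, 3]
point_tuple = (2, 3)

point_list[0] = 5           # lists can be changed in place

try:
    point_tuple[0] = 5      # tuples cannot be changed after creation
    tuple_changed = True
except TypeError:
    tuple_changed = False

# A tuple unpacks neatly into separate variables
x, y = point_tuple

print(point_list, point_tuple, tuple_changed)
```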

C. Dictionaries

Now that you've seen parentheses (for tuples) and square brackets (for lists), you may be wondering what curly brackets are used for. Answer: Python dictionaries. The defining feature of a Python dictionary is that it holds keys and their associated values. When defining a dictionary, each key is paired with its value using a colon (:), as done below:

book_dictionary = {"Title": "Ulysses", "Author": "James Joyce", "Year": 1922}
print(book_dictionary["Author"])

Above, the keys of book_dictionary are “Title”, “Author”, and “Year”, and each of these keys has a corresponding value. Note that key-value pairs are separated by commas. Keys let us access a dictionary entry by name, rather than needing to know the position of the entry we want, as is the case with lists and tuples. For example, above we look up the author of Ulysses using the “Author” key instead of an index. In fact, unlike a list or a tuple, a dictionary is not accessed by position: an integer inside the square brackets is treated as a key, not as an index. We see this below when we try to reach the second item in the dictionary using an integer:

print(book_dictionary[1])  # raises KeyError: 1, because 1 is not a key in the dictionary
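The failed lookup above can be handled in a couple of ways. The sketch below uses a smaller illustrative dictionary named book; the "Publisher" and "Language" keys are hypothetical additions for demonstration. It relies only on standard dictionary features: the get() method, assignment to a new key, and items() for looping over key-value pairs.

```python
book = {"Title": "Ulysses", "Author": "James Joyce"}

# Looking up a key that does not exist raises KeyError
try:
    book[1]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# get() returns a default value instead of raising an error
publisher = book.get("Publisher", "unknown")

# Assigning to a new key adds a key-value pair
book["Language"] = "English"

# items() yields each key together with its value
for key, value in book.items():
    print(key, "->", value)
```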

3. Control Flow

Like many popular programming languages, Python offers several ways to control the flow of execution within a program. Here, we will present looping and conditional statements.

First, it's important to know about a distinctive feature of the Python language: indentation. While languages like C use curly braces to group the statements inside loops or conditionals, Python marks these groups by indentation. This feature lends readability to Python code, as you'll see in the examples below.
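To make the indentation rule concrete, here is a minimal sketch. The names values and doubled are illustrative, and it uses a for-loop, which the next subsection introduces. The indented lines belong to the loop and run once per item; the line that returns to the left margin runs once, after the loop has finished.

```python
values = [1, 2, 3]
doubled = []

for value in values:
    doubled.append(value * 2)      # indented: part of the loop body
    print("inside loop:", value)   # indented: also part of the loop body

print("after loop:", doubled)      # not indented: runs once, after the loop ends
```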

A. For-Loops

Looping statements allow for repeated execution of code. For example, suppose we want to add all the integers from zero (0) up to, but not including, ten (10). Yes, we can do this in one line, but we can also use a loop to add one value at a time. Below is a simple loop that does this:

total = 0  # named total to avoid hiding Python's built-in sum()
for i in range(10):
    total = total + i
print(total)
alternative_sum = 0+1+2+3+4+5+6+7+8+9
print(alternative_sum == total)

The built-in range() function produces a sequence of values that we loop over; note that range(10) does not include 10 itself. In addition to looping through numbers using range(), we can also loop over the items of a list, as shown below:

ingredients = ["flour", "sugar", "eggs", "oil", "soda"]
for ingredient in ingredients:
    print(ingredient)

Above, the for-loop iterates over the list of ingredients, and within each iteration the current item is called ingredient. Using a plural noun for the collection and a singular noun for each item is a common Python idiom, but you do not need to follow it in your programs.
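Two variations on these loops come up often. The sketch below is illustrative: range() also accepts a start value and a step, and the built-in enumerate() function yields each item together with its position, which helps when you need both. The names evens, countdown, and numbered are chosen here for demonstration.

```python
# range(start, stop, step): stop is still excluded
evens = list(range(0, 10, 2))
countdown = list(range(5, 0, -1))

ingredients = ["flour", "sugar", "eggs"]
numbered = []
# enumerate() pairs each item with its zero-based position
for position, ingredient in enumerate(ingredients):
    numbered.append("{}: {}".format(position, ingredient))

print(evens)
print(countdown)
print(numbered)
```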

B. Conditionals

Often when writing a program, you will want to execute certain parts of the code only when certain conditions are met, for example, when a variable has a certain value. This is accomplished using conditional statements: if, elif, else.

for i in range(10):
    if i % 2 == 0:  # % -- modulus operator -- returns the remainder after division
        print("{} is even".format(i))
    else:
        print("{} is odd".format(i))

# Example of elif
# Print the meteorological season for each month
print("In the Northern Hemisphere: \n")
month_integer = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]  # i.e., January is 1, February is 2, etc...
for month in month_integer:
    if month < 3:
        print("Month {} is in Winter".format(month))
    elif month < 6:
        print("Month {} is in Spring".format(month))
    elif month < 9:
        print("Month {} is in Summer".format(month))
    elif month < 12:
        print("Month {} is in Fall".format(month))
    else:  # This will put 12 (i.e., December) into Winter
        print("Month {} is in Winter".format(month))
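The same if/elif/else chain can be wrapped in a function so that it is reusable. The following is a sketch only: season_for_month is a hypothetical helper name, and the input check is an added assumption, not part of the example above.

```python
def season_for_month(month):
    """Return the Northern Hemisphere meteorological season for a month number 1..12.

    Hypothetical helper written for illustration.
    """
    if month < 1 or month > 12:
        raise ValueError("month must be between 1 and 12")
    if month < 3 or month == 12:   # January, February, December
        return "Winter"
    elif month < 6:                # March, April, May
        return "Spring"
    elif month < 9:                # June, July, August
        return "Summer"
    else:                          # September, October, November
        return "Fall"


for month in [1, 4, 7, 10, 12]:
    print("Month {} is in {}".format(month, season_for_month(month)))
```

Conditions are tested from top to bottom and only the first matching branch runs, which is why each elif needs only an upper bound.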

4. List Comprehension

Python supports list comprehensions, which build a list in a single line of code. Say, for example, that you wanted to add 1 to each item in a list of integers. You can do this using a list comprehension as follows:

even_list = [2, 4, 6, 8]
odd_list = [even+1 for even in even_list]
print(odd_list)

Note the similarity above between a list comprehension and a for-loop; Python offers list comprehensions as a compact, “Pythonic” way to perform tasks that could otherwise be done within a loop.
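To make the correspondence with a for-loop explicit, the sketch below builds the same list both ways and then adds an optional if clause, a standard feature of list comprehensions that filters items. The names odd_loop, odd_comp, and multiples_of_four are illustrative.

```python
even_list = [2, 4, 6, 8]

# The loop version: start with an empty list and append one item at a time
odd_loop = []
for even in even_list:
    odd_loop.append(even + 1)

# The equivalent list comprehension
odd_comp = [even + 1 for even in even_list]

# An optional trailing "if" keeps only the items that pass the test
multiples_of_four = [n for n in even_list if n % 4 == 0]

print(odd_loop == odd_comp)
print(multiples_of_four)
```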

5. NumPy Library

The NumPy library gives Python a wide range of scientific computing capabilities. Central to this is the array object, which offers a different way of arranging values of the same type. NumPy arrays allow slicing and indexing similar to lists. Most importantly, NumPy has a tremendous number of mathematical functions that can transform arrays and perform calculations between arrays. For those familiar with MATLAB, these operations should be reminiscent of its matrix functions.

import numpy as np
x = np.array([2, 4, 6]) # create a rank 1 array
A = np.array([[1, 2, 3], [4, 5, 6]]) # create a rank 2 array
B = np.array([[1, 2, 3], [4, 5, 6]])
print("Matrix A: \n")
print(A)
print("\nMatrix B: \n")
print(B)
# Indexing/Slicing examples
print(A[0, :]) # index the first "row" and all columns
print(A[1, 2]) # index the second row, third column entry
print(A[:, 1]) # index entire second column
# Arithmetic Examples
C = A * 2 # multiplies every element of A by two
D = A * B # elementwise multiplication rather than matrix multiplication
E = np.transpose(B)
F = np.matmul(A, E) # performs matrix multiplication -- could also use np.dot()
G = np.matmul(A, x) # performs matrix-vector multiplication -- again could also use np.dot()
print("\n Matrix E (the transpose of B): \n")
print(E)
print("\n Matrix F (result of matrix multiplication A x E): \n")
print(F)
print("\n Matrix G (result of matrix-vector multiplication A*x): \n")
print(G)
# Broadcasting Examples
H = A * x # "broadcasts" x for element-wise multiplication with the rows of A
print(H)
J = B + x # broadcasts for addition, again across rows
print(J)
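One way to picture the broadcasting above is that NumPy behaves as if the smaller array were copied until the shapes match. The sketch below is illustrative: it makes that copy explicit with np.tile and compares the results, and the (2, 1) column array named col is a hypothetical addition showing broadcasting in the other direction.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)
x = np.array([2, 4, 6])                # shape (3,)

H = A * x                              # x is broadcast across both rows of A
H_explicit = A * np.tile(x, (2, 1))    # same result with x copied into a (2, 3) array

col = np.array([[10], [20]])           # shape (2, 1)
K = A + col                            # col is broadcast across the columns of A

print(H)
print(np.array_equal(H, H_explicit))
print(K)
```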
# max operation examples
X = np.array([[3, 9, 4], [10, 2, 7], [5, 11, 8]])
all_max = np.max(X) # gets the maximum value of matrix X
column_max = np.max(X, axis=0) # gets the maximum in each column -- returns a rank-1 array [10, 11, 8]
row_max = np.max(X, axis=1) # gets the maximum in each row -- returns a rank-1 array [9, 10, 11]
# In addition to max, min works the same way. NumPy also has argmax to return the indices of maximal values
column_argmax = np.argmax(X, axis=0) # note that each "index" here is the row in which that column's maximum occurs
print("Matrix X: \n")
print(X)
print("\n Maximum value in X: \n")
print(all_max)
print("\n Column-wise max of X: \n")
print(column_max)
print("\n Indices of column max: \n")
print(column_argmax)
print("\n Row-wise max of X: \n")
print(row_max)
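The indices returned by argmax can be used to look the maximal values back up, which is a quick way to check what they mean. The sketch below is illustrative: row_argmax and recovered are names introduced here, and the lookup pairs each row index from column_argmax with its column number.

```python
import numpy as np

X = np.array([[3, 9, 4], [10, 2, 7], [5, 11, 8]])

column_argmax = np.argmax(X, axis=0)   # for each column, the row holding its maximum
row_argmax = np.argmax(X, axis=1)      # for each row, the column holding its maximum

# Pair each row index with its column number to recover the column maxima
recovered = X[column_argmax, np.arange(X.shape[1])]

print(column_argmax)
print(row_argmax)
print(np.array_equal(recovered, np.max(X, axis=0)))
```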
# Sum operation examples
# These work similarly to the max operations -- use the axis argument to denote if summing over rows or columns
total_sum = np.sum(X)
column_sum = np.sum(X, axis=0)
row_sum = np.sum(X, axis=1)
print("Matrix X: \n")
print(X)
print("\n Sum over all elements of X: \n")
print(total_sum)
print("\n Column-wise sum of X: \n")
print(column_sum)
print("\n Row-wise sum of X: \n")
print(row_sum)
# Matrix reshaping
X = np.arange(16) # makes a rank-1 array of integers from 0 to 15
X_square = np.reshape(X, (4, 4)) # reshape X into a 4 x 4 matrix
X_rank_3 = np.reshape(X, (2, 2, 4)) # reshape X to be 2 x 2 x 4 --a rank-3 array
# consider as two rank-2 arrays with 2 rows and 4 columns
print("Rank-1 array X: \n")
print(X)
print("\n Reshaped into a square matrix: \n")
print(X_square)
print("\n Reshaped into a rank-3 array with dimensions 2 x 2 x 4: \n")
print(X_rank_3)
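Two details of reshaping are worth a short sketch. A dimension given as -1 asks NumPy to infer that size from the total number of elements, and a reshape whose sizes do not multiply to the element count raises a ValueError. The names X_auto, X_flat, and mismatch_error are illustrative.

```python
import numpy as np

X = np.arange(16)                    # 16 elements

X_auto = np.reshape(X, (4, -1))      # -1 is inferred: 16 / 4 = 4 columns
X_flat = X_auto.reshape(-1)          # flatten back to a rank-1 array

# The new shape must account for every element: 3 * 5 = 15 is not 16
try:
    np.reshape(X, (3, 5))
    mismatch_error = False
except ValueError:
    mismatch_error = True

print(X_auto.shape)
print(np.array_equal(X_flat, X))
print(mismatch_error)
```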

6. Plotting

Much of the plotting you will do will use the Matplotlib library, especially its pyplot module. Aptly named, the plot function is used to plot 2-D data, as shown below:

import numpy as np
import matplotlib.pyplot as plt
# let's start with a parabola
# Compute the parabola's x and y coordinates
x = np.arange(-4, 4, 0.1)
y = np.square(x)
# matplotlib for the plot
plt.plot(x, y, 'b') # color blue for the line
plt.xlabel('X-Axis Values')
plt.ylabel('Y-Axis Values')
plt.title('First Plot: A Parabola')
plt.show() # display the plot
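Several curves can be drawn on the same axes by calling plot more than once. The sketch below is illustrative: the second curve, the labels, and the legend are additions chosen here for demonstration. It stops short of calling plt.show() so that it also runs in environments without a display; add that call to view the figure.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(-4, 4, 0.1)

# Each call to plot adds another line to the current axes
plt.plot(x, np.square(x), 'b', label='y = x^2')    # solid blue line
plt.plot(x, np.abs(x), 'r--', label='y = |x|')     # dashed red line
plt.xlabel('X-Axis Values')
plt.ylabel('Y-Axis Values')
plt.title('Two Curves on One Plot')
plt.legend()                                       # uses the label of each line

plotted_lines = plt.gca().get_lines()              # the lines drawn so far
print(len(plotted_lines))
# plt.show() would display the figure in an interactive session
```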

Another Matplotlib function you will encounter is imshow, which is used to display images. Remember that an image can be treated as an array whose elements are the image's pixel values. As a simple example, here is the identity matrix:

import numpy as np
import matplotlib.pyplot as plt
X = np.identity(5)
identity_matrix_image = plt.imshow(X, cmap="Greys_r")
plt.show()
# plotting a random matrix, with a different colormap
A = np.random.randn(5, 5)
random_matrix_image = plt.imshow(A)
plt.show()

If you have reached this sentence, you have gone through all the Python prerequisites for your exciting machine learning journey.

Here is a summary of what you covered today:

  1. Variables
  2. Collections: Lists, Tuples, Dictionaries
  3. Control flow: For-loops, if-elif-else conditionals
  4. List comprehension
  5. NumPy
  6. Plotting
