How To Get All Stock Symbols

Without Really Trying…

ZennDogg
Level Up Coding

Now that I’ve retired, I have a lot of time on my hands. An interest in the stock market is occupying some of my time right now. One of the first steps in studying stocks is knowing what stocks are out there. Lists of stock symbols are available in many places as a .csv download, but I like quick and efficient. I wrote this script to do just that.

First, the import statements.

import pandas as pd
from yahoo_fin import stock_info as si

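yahoo_fin provides ticker lists in four categories. Two of them are indices, the Dow Jones Industrial Average and the S&P 500, and one is an exchange, the NASDAQ. The fourth category is called “others” and, as far as I can tell, covers symbols listed on other exchanges such as the NYSE. We download each list into a separate pandas DataFrame.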

df1 = pd.DataFrame( si.tickers_sp500() )
df2 = pd.DataFrame( si.tickers_nasdaq() )
df3 = pd.DataFrame( si.tickers_dow() )
df4 = pd.DataFrame( si.tickers_other() )

Next, we convert each dataframe to a list, then to a set.

sym1 = set( symbol for symbol in df1[0].values.tolist() )
sym2 = set( symbol for symbol in df2[0].values.tolist() )
sym3 = set( symbol for symbol in df3[0].values.tolist() )
sym4 = set( symbol for symbol in df4[0].values.tolist() )
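The DataFrame step is not strictly required if the download functions hand back plain Python lists, which is my understanding of yahoo_fin but worth checking on your installed version. Here is a minimal sketch of going straight from a list to a set. The two loader functions are hypothetical stand-ins for si.tickers_dow() and si.tickers_nasdaq() so the example runs offline, and the whitespace and empty-string cleanup is an assumption about messy input, not something the library is known to need.

```python
def to_symbol_set(loader):
    """Call a loader that returns an iterable of tickers and return a clean set."""
    # Keep only non-empty strings, with surrounding whitespace removed.
    return {s.strip() for s in loader() if isinstance(s, str) and s.strip()}


# Hypothetical stand-ins for si.tickers_dow() and si.tickers_nasdaq()
def fake_dow():
    return ['AAPL', 'MSFT', 'MSFT', ' IBM ']


def fake_nasdaq():
    return ['AAPL', 'MSFT', '', 'GOOG']


dow = to_symbol_set(fake_dow)
nasdaq = to_symbol_set(fake_nasdaq)

print(sorted(dow))
print(sorted(nasdaq))
```

With the real library you would pass si.tickers_dow and friends (without parentheses) in place of the fake loaders.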

A symbol can appear in more than one list. Every Dow and S&P 500 component, for example, also trades on an exchange. We join the four sets into one. Because the result is a set, it contains no duplicate symbols.

symbols = set.union( sym1, sym2, sym3, sym4 )
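To see the de-duplication at work, here is a small sketch with made-up sets standing in for the downloaded lists. The variable names and the sample symbols are only for illustration.

```python
# Toy stand-ins for the downloaded symbol sets
sp500 = {'AAPL', 'MSFT', 'JPM'}
nasdaq = {'AAPL', 'MSFT', 'GOOG'}
dow = {'AAPL', 'JPM'}

# Union keeps one copy of each symbol
merged = set.union(sp500, nasdaq, dow)

# Intersection shows which symbols two lists share
overlap = sp500 & nasdaq

total_listed = len(sp500) + len(nasdaq) + len(dow)
print(f'{total_listed} entries collapse to {len(merged)} unique symbols')
print(f'Shared by S&P 500 and NASDAQ: {sorted(overlap)}')
```

The same comparison on the real sets gives a quick feel for how much the four lists overlap.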

Most symbols are up to four letters long, e.g. MSFT for Microsoft. Some symbols carry a fifth letter. On the NASDAQ, that fifth letter flags a special security type or status, such as warrants, rights, preferred shares or bankruptcy proceedings. We’ll identify four of those suffixes for deletion.

my_list = ['W', 'R', 'P', 'Q']

W means there are outstanding warrants. We don’t want those.

R means there is some kind of “rights” issue. Again, not wanted.

P means “First Preferred Issue”. Preferred stocks are a separate entity.

Q means bankruptcy. We don’t want those, either.

I wanted to eliminate these symbols because they take a very long time to process and then return errors when downloading subsequent data.

We need to initialize two sets: a save set and a delete set.

del_set = set()
sav_set = set()

Next, we find the symbols that are more than four characters long AND end with a letter in my_list. Those are added to del_set. All other symbols are added to sav_set.

for symbol in symbols:
    if len( symbol ) > 4 and symbol[-1] in my_list:
        del_set.add( symbol )
    else:
        sav_set.add( symbol )
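The same filter can be wrapped in a small function, which makes it easy to try on a handful of symbols before running it on the full set. This is a sketch rather than part of the script above. The function name split_symbols and the sample symbols are made up for illustration.

```python
def split_symbols(symbols, suffixes=('W', 'R', 'P', 'Q')):
    """Return (keep, drop): drop holds symbols longer than four
    characters whose last letter is one of the given suffixes."""
    drop = {s for s in symbols if len(s) > 4 and s[-1] in suffixes}
    keep = set(symbols) - drop
    return keep, drop


# Made-up symbols: two should be dropped, three kept
sample = {'MSFT', 'ABCDW', 'XYZQ', 'ABCDE', 'LONGP'}
keep, drop = split_symbols(sample)

print(sorted(keep))
print(sorted(drop))
```

Note that XYZQ is kept because it is only four characters long, which matches the length check in the loop.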

Lastly, we print the results.

print( f'Removed {len( del_set )} unqualified stock symbols...' )
print( f'There are {len( sav_set )} qualified stock symbols...' )

This is the output.

Removed 445 unqualified stock symbols…
There are 9193 qualified stock symbols…
Process finished with exit code 0

Now we have a complete list of stock symbols to start our data processing journey. You may want to save sav_set to your favorite database or storage format.
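One simple storage option is a plain text file with one symbol per line. The sketch below uses only the standard library. The helper names save_symbols and load_symbols and the file name symbols.txt are my own choices for illustration, and the three-symbol sav_set is a stand-in for the real one.

```python
import tempfile
from pathlib import Path


def save_symbols(symbols, path):
    """Write the symbols to a text file, sorted, one per line."""
    path = Path(path)
    path.write_text('\n'.join(sorted(symbols)) + '\n', encoding='utf-8')
    return path


def load_symbols(path):
    """Read the file back into a set, skipping blank lines."""
    lines = Path(path).read_text(encoding='utf-8').splitlines()
    return {line.strip() for line in lines if line.strip()}


# Stand-in for the real sav_set built earlier
sav_set = {'MSFT', 'AAPL', 'GOOG'}

# Written to the system temp folder here; pick any path you like
out_file = Path(tempfile.gettempdir()) / 'symbols.txt'
save_symbols(sav_set, out_file)

restored = load_symbols(out_file)
print(restored == sav_set)
```

Sorting before writing keeps the file stable between runs, which makes it easier to spot symbols that were added or removed.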

The entire script is provided below. Happy exploring.

import pandas as pd
from yahoo_fin import stock_info as si


# gather stock symbols from major US exchanges
df1 = pd.DataFrame( si.tickers_sp500() )
df2 = pd.DataFrame( si.tickers_nasdaq() )
df3 = pd.DataFrame( si.tickers_dow() )
df4 = pd.DataFrame( si.tickers_other() )

# convert DataFrame to list, then to sets
sym1 = set( symbol for symbol in df1[0].values.tolist() )
sym2 = set( symbol for symbol in df2[0].values.tolist() )
sym3 = set( symbol for symbol in df3[0].values.tolist() )
sym4 = set( symbol for symbol in df4[0].values.tolist() )

# join the 4 sets into one. Because it's a set, there will be no duplicate symbols
symbols = set.union( sym1, sym2, sym3, sym4 )

# Some stocks are 5 characters. Those stocks with the suffixes listed below are not of interest.
my_list = ['W', 'R', 'P', 'Q']
del_set = set()
sav_set = set()

for symbol in symbols:
    if len( symbol ) > 4 and symbol[-1] in my_list:
        del_set.add( symbol )
    else:
        sav_set.add( symbol )

print( f'Removed {len( del_set )} unqualified stock symbols...' )
print( f'There are {len( sav_set )} qualified stock symbols...' )

Retired military, Retired US Postal Service, Defender of the US Constitution from all enemies, foreign and domestic, Self-taught in Python