Python Geolocation

Simple Geocoding in Python

Aaron Lee
Published in Level Up Coding
5 min read · Apr 1, 2021

Photo by Waldemar Brandt on Unsplash

Nominatim is Latin for ‘by name’. It is also a tool for searching OpenStreetMap data by address or location (geocoding). The GeoPy Python library includes a client for Nominatim. It has saved me on a couple of recent projects where I had datasets with addresses but none of the latitude/longitude data required for map plots and pretty much any other location-based task.

Lookup a Single Address

We will start with the base code, similar to what is provided in the excellent documentation at https://geopy.readthedocs.io/en/stable/.

This code creates the Nominatim object
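A minimal sketch of this setup, wrapped in a function so that no request is sent until you call it. The function name geocode_address and the default user_agent string are illustrative assumptions, not the article's original values.

```python
def geocode_address(query, user_agent="my-geocoding-tutorial"):
    """Look up one address with Nominatim and return a geopy Location (or None)."""
    # Imported inside the function so the helper can be defined before geopy is installed.
    from geopy.geocoders import Nominatim

    # user_agent is required; the default above is a placeholder for your own app name.
    geolocator = Nominatim(user_agent=user_agent)
    # geocode returns None when nothing matches the query.
    return geolocator.geocode(query)
```

Calling geocode_address("1034 North Wells Street, Chicago, IL") (an assumed query) sends one request to the public Nominatim server and returns the best match, or None if nothing is found.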

The code above instantiates a Nominatim object called geolocator. Note that the user_agent argument is required and should identify your application (an app name or URL, for example). If you plan to do a lot of lookups, you should also supply a contact email (the Nominatim API accepts an email parameter, e.g. email=youremail@me.com) so you can be contacted if there are problems. Nominatim is a free service with a low request limit: the public server's usage policy allows at most one request per second. If you need to look up many addresses quickly, you may need to go with a paid service.

The geocode method then does its magic and returns a geopy.location.Location object with all of the geo data we need for our address.

The Location Object

We have managed to fetch a location object. Now, what can we do with it? The code below extracts the most useful information.

The Location object (saved as location) has stored attributes containing the data we need.
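As a rough sketch of that extraction, the helper below reads the attributes discussed in this section. The name summarize is hypothetical, and the sample object is only a stand-in that mimics the attribute names of a real Location so the snippet runs without a network call.

```python
from types import SimpleNamespace


def summarize(location):
    """Collect the commonly used fields of a Location-like object."""
    return {
        "address": location.address,
        "point": (location.latitude, location.longitude),
        "type": location.raw.get("type"),
    }


# Stand-in with the same attribute names as a geopy Location (values from this article).
sample = SimpleNamespace(
    address=(
        "Walter Payton College Prep, 1034, North Wells Street, "
        "Chicago River North, Near North Side, Chicago, Cook County, "
        "Illinois, 60610, United States"
    ),
    latitude=41.90124575,
    longitude=-87.63555162319162,
    raw={"type": "school"},
)

summary = summarize(sample)
print(summary["address"])
print(summary["point"])
```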

The raw attribute contains all of the data stored in the object. It is a dictionary object that looks like this:

{'place_id': 99356088, 'licence': 'Data © OpenStreetMap contributors, ODbL 1.0. https://osm.org/copyright', 'osm_type': 'way', 'osm_id': 31065707, 'boundingbox': ['41.900608', '41.9020128', '-87.6366191', '-87.634464'], 'lat': '41.90124575', 'lon': '-87.63555162319162', 'display_name': 'Walter Payton College Prep, 1034, North Wells Street, Chicago River North, Near North Side, Chicago, Cook County, Illinois, 60610, United States', 'class': 'amenity', 'type': 'school', 'importance': 0.6210731310973492, 'icon': 'https://nominatim.openstreetmap.org/ui/mapicons//education_school.p.20.png'}

There is some interesting data in there that can be really useful for some projects (type, importance, icon, boundingbox).
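One detail worth knowing when working with raw: the coordinates arrive as strings. The sketch below uses a subset of the dictionary shown above and converts the values to floats. The variable names are my own.

```python
# Subset of the raw dictionary shown above; coordinates are strings, not floats.
raw = {
    "boundingbox": ["41.900608", "41.9020128", "-87.6366191", "-87.634464"],
    "lat": "41.90124575",
    "lon": "-87.63555162319162",
    "type": "school",
    "importance": 0.6210731310973492,
}

lat, lon = float(raw["lat"]), float(raw["lon"])
# Nominatim orders the bounding box as [south, north, west, east].
south, north, west, east = (float(value) for value in raw["boundingbox"])

# Sanity check: the reported point should fall inside its own bounding box.
inside = south <= lat <= north and west <= lon <= east
print(raw["type"], inside)  # prints: school True
```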

The other attributes from location (address, latitude, longitude) are more straightforward and will give us the data we need for this example:

Walter Payton College Prep, 1034, North Wells Street, Chicago River North, Near North Side, Chicago, Cook County, Illinois, 60610, United States
(41.90124575, -87.63555162319162)

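Note that the address is a single comma-separated string, ordered from the most specific component to the least specific, although the number of components can vary from one result to the next. For simple cases, the components can be extracted as individual fields by splitting and indexing the string.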
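Here is a small sketch of that splitting approach, using the address string from this example. The layout it relies on (state, postcode, and country as the last three components) is an assumption taken from this one result, so treat the indexes as illustrative.

```python
address = (
    "Walter Payton College Prep, 1034, North Wells Street, "
    "Chicago River North, Near North Side, Chicago, Cook County, "
    "Illinois, 60610, United States"
)

# Split on commas and trim the surrounding whitespace from each component.
parts = [part.strip() for part in address.split(",")]

# Assumed layout (from this one result): ..., state, postcode, country.
name = parts[0]
state, postcode, country = parts[-3], parts[-2], parts[-1]
print(name, state, postcode, country, sep=" | ")
```

If you need dependable fields, the geocode method also accepts addressdetails=True, which asks Nominatim to include a structured address in the raw data.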

Lookup a Series of Locations

Now that we have a basic idea of how to do a geocode lookup, we can automate it for a larger data set. Again, the request rate and volume allowed by Nominatim (a free and open service) might be limiting for larger projects.
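To stay under the rate limit when looping over many queries, the lookup function can be wrapped so that calls are spaced out. GeoPy ships a helper for this (geopy.extra.rate_limiter.RateLimiter). The sketch below is a standard-library stand-in that illustrates the same idea, and the name throttled is hypothetical.

```python
import time


def throttled(func, min_delay_seconds=1.0):
    """Wrap func so consecutive calls start at least min_delay_seconds apart."""
    last_call = None

    def wrapper(*args, **kwargs):
        nonlocal last_call
        if last_call is not None:
            # Sleep for whatever remains of the minimum delay since the last call.
            wait = min_delay_seconds - (time.monotonic() - last_call)
            if wait > 0:
                time.sleep(wait)
        last_call = time.monotonic()
        return func(*args, **kwargs)

    return wrapper
```

With a geolocator in hand, geocode = throttled(geolocator.geocode) would give a drop-in replacement that waits about a second between requests.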

I am a former high school teacher in Chicago Public Schools, so I will go with something I know. We will look up the addresses and locations for the top 5 rated high schools in Chicago Public Schools.

In the code below, I make a Pandas DataFrame of the schools, and create location objects for all five of them using the Pandas Series apply method.

This code creates the dataframe with school and location

This gives me a df with two columns (school and location). The location column is a Series of geopy.location.Location objects returned by the lookup. We are using the same geolocator object and method that we used previously in the single address lookup section. You’ll notice that I didn’t use any addresses this time; I just used the school names in the query (similar to any Google Maps search you might do). Also, as with a Google Maps search, you can get the wrong results when you aren’t specific enough.
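A sketch of that step, assuming pandas is installed. Only the two schools named in this article are listed (the original used five), and the helper name add_locations is my own. The geocode argument is any callable that maps a query string to a result, such as geolocator.geocode.

```python
import pandas as pd


def add_locations(df, geocode):
    """Return a copy of df with a 'location' column built from the 'school' column."""
    out = df.copy()
    # Series.apply calls geocode once per school name.
    out["location"] = out["school"].apply(geocode)
    return out


# Only the two schools named in this article; the full list had five.
schools = pd.DataFrame(
    {"school": ["Walter Payton College Prep", "Lane Tech High School"]}
)
```

Calling df = add_locations(schools, geolocator.geocode) would then send one request per row.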

df after the geocode lookup

Now, we go a bit further and extract the address, latitude, and longitude from our Location objects. I only added the bottom three lines to the code. Just like the single location geocode we did, we are using Location object attributes to get the data we want. This time we are creating new DataFrame columns by extracting the location attributes.
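The column extraction might look something like the sketch below. The column names (address, lat, lon) and the helper name add_geo_columns are assumptions, and the demo frame uses a stand-in object instead of a real Location so it runs offline. The None row shows what happens when a lookup finds nothing.

```python
from types import SimpleNamespace

import pandas as pd


def add_geo_columns(df):
    """Return a copy of df with address, lat, and lon columns taken from 'location'."""
    out = df.copy()
    # geocode returns None for a failed lookup, so guard each attribute access.
    out["address"] = out["location"].apply(
        lambda loc: loc.address if loc is not None else None
    )
    out["lat"] = out["location"].apply(
        lambda loc: loc.latitude if loc is not None else None
    )
    out["lon"] = out["location"].apply(
        lambda loc: loc.longitude if loc is not None else None
    )
    return out


# Stand-in for a geopy Location, using the coordinates shown earlier.
payton = SimpleNamespace(
    address="Walter Payton College Prep, 1034, North Wells Street, Chicago",
    latitude=41.90124575,
    longitude=-87.63555162319162,
)
demo = add_geo_columns(
    pd.DataFrame({"school": ["Payton", "Unknown"], "location": [payton, None]})
)
print(demo[["school", "lat", "lon"]])
```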

The resulting DataFrame is below.

Results after running last code

We notice something in the row for Lane Tech High School. The address and lat/long aren’t in Chicago. Just like your Google Maps searches, sometimes a search can go awry if you don’t provide enough specificity. It’s always better to use a complete address if you have it. Let’s try the whole code again, but this time we will include Chicago, IL in the query.
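One simple way to add that specificity is to append the city to each name before geocoding. This is only a sketch, and the helper name qualify is hypothetical.

```python
def qualify(name, city="Chicago, IL"):
    """Append a city/state hint unless the query already contains it."""
    if city.lower() in name.lower():
        return name
    return f"{name}, {city}"


queries = [
    qualify(name)
    for name in ["Walter Payton College Prep", "Lane Tech High School"]
]
print(queries)
```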

Fixed with specificity

Note: Even when adding Chicago to the query, it still only returned the Lane Tech bus stop next to the school. We will go with it, but be aware that this is a common occurrence. The official name is ‘Lane Technical College Prep High School’, but nobody uses that.

Now we have the correct address and geodata for my schools. I love making map plots, so let’s wrap this up by plotting the five schools using Plotly. I prefer Plotly because the maps are beautiful, but sometimes I also like to use Folium, which is currently a bit more versatile. (Check out my other stories.)

We were able to take a simple list of five school names, use GeoPy to look up each address/location, and plot them on a map. We could easily do the same with any list of addresses, cities, building names, etc. The map below is what it looks like when you plot all Chicago Public Schools and color them by school type (elementary or secondary).

All Chicago Public Schools

Thanks for letting me share this with you. I hope it helps you on a future project. Good luck!

If you have any questions or trouble running the code, feel free to reach out and I will try to help as best I can.
