In this post we explain how to work efficiently with geodata in Python.

Geodata can represent different kinds of objects. The three most important are:

Points (e.g. individual addresses or measuring points)
Lines (e.g. routes)
Polygons (e.g. countries or postcode areas)
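
As a rough illustration, the three geometry types above can be written as GeoJSON-style dictionaries in plain Python. This is only a sketch: the coordinates are made-up example values, and the helper `is_closed_ring` is a hypothetical name introduced here. Note that GeoJSON orders each position as (longitude, latitude).

```python
# GeoJSON-style geometries as plain dicts (illustrative coordinates only).
# GeoJSON orders each position as (longitude, latitude).

point = {"type": "Point", "coordinates": (13.3978, 52.5172)}

line = {
    "type": "LineString",
    "coordinates": [(13.3978, 52.5172), (13.3889, 52.5170), (13.3777, 52.5163)],
}

polygon = {
    "type": "Polygon",
    # A list of rings; here a single outer ring whose first and last
    # positions are identical, which closes the shape.
    "coordinates": [[
        (13.37, 52.51),
        (13.40, 52.51),
        (13.40, 52.53),
        (13.37, 52.53),
        (13.37, 52.51),
    ]],
}


def is_closed_ring(ring):
    """Return True if a polygon ring starts and ends at the same position."""
    return len(ring) >= 4 and ring[0] == ring[-1]
```

A point is a single position, a line is an ordered list of positions, and a polygon is a closed ring of positions (plus optional inner rings for holes).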

In addition to coordinate formats, geodata can also be stored as addresses.

We’ll explain in this post:

how addresses can be converted to coordinates (latitude and longitude) (geocoding),
how points and polygons can be displayed on an interactive map (visualization),
and how distances can be calculated and a K-Nearest Neighbors regression can be fitted on geodata (distance calculation and regression).

Geocoding

Prerequisites

The libraries required for this section can be installed from PyPI with pip as follows (here for Python 3):

pip3 install geopy
pip3 install ratelimit
pip3 install tqdm # for progress bars

Geocoding with GeoPy and Nominatim

Geocoding refers to the conversion of addresses into coordinates and, vice versa, the conversion of coordinates into the corresponding address (reverse geocoding).

There are a number of freely available geocoding APIs that are suitable for smaller use cases, e.g. geocoding fewer than 10,000 points per day.

In most cases it is not necessary to call the APIs manually. geopy is an excellent Python library for geocoding and reverse geocoding (among other things) that supports many APIs. In this example we use the Nominatim API, which is based on OpenStreetMap (OSM) data. The OSM data is subject to the Open Database License (ODbL).

A single address is geocoded in geopy with minimal effort:

from geopy.geocoders import Nominatim
gc = Nominatim(user_agent="fintu-blog-geocoding-python")
gc.geocode("Unter den Linden 1, Berlin")
# Location(Kommandantenhaus, 1, Unter den Linden, Spandauer Vorstadt, Mitte, Berlin,
# 10117, Deutschland, (52.51720765, 13.3978343993255, 0.0))
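
In practice you usually want the plain coordinates rather than the whole Location object, and you need to handle addresses that cannot be resolved. The following is a minimal sketch, not a definitive implementation: `geocode_to_latlon` and `reverse_to_address` are hypothetical helper names introduced here. They rely only on geopy's `geocode` and `reverse` methods and on the `latitude`, `longitude` and `address` attributes of the returned Location. Both methods return None when nothing is found.

```python
def geocode_to_latlon(geocoder, address):
    """Return (latitude, longitude) for an address, or None if nothing was found."""
    location = geocoder.geocode(address)
    if location is None:  # no match for this address
        return None
    return (location.latitude, location.longitude)


def reverse_to_address(geocoder, latitude, longitude):
    """Return the address string for a coordinate pair, or None if nothing was found."""
    location = geocoder.reverse((latitude, longitude))
    if location is None:  # no match for these coordinates
        return None
    return location.address


# Usage with the Nominatim geocoder from above (requires network access):
# from geopy.geocoders import Nominatim
# gc = Nominatim(user_agent="fintu-blog-geocoding-python")
# geocode_to_latlon(gc, "Unter den Linden 1, Berlin")
# reverse_to_address(gc, 52.5172, 13.3978)
```

Passing the geocoder in as an argument keeps the helpers independent of a specific API, so Nominatim can be swapped for another geopy geocoder without changing them.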