Collecting Twitter user location and automatically geocoding it via Tweepy and Geopy

Jean Dinco
3 min read · Oct 17, 2022

In this short blog post, I’ll explain how to use the new Twitter API v2 (via Tweepy) together with the Python libraries geopy and geopandas to get a Twitter user’s location and automatically convert it into a place on a map.

Here is a map of Central Europe, which I added only so Medium would use it as the display photo for this post instead of the boring code and tables.

First, let me address the elephant in the room. What do I mean when I talk about the user’s location? Twitter calls it profile location or, as Twitter puts it:

“…profile location is part of your public account profile, but you don’t have to share it. Your profile is the place where you can say what you want and show the world who you are.”

Some profile locations are named after cities, and some are made-up places like the Misty Mountains, the Room of Requirement, or Hightower, where Alicent is from. For Twitter users whose locations aren’t public, you can try to guess where they are based on what they tweet, for example with a random forest classifier (but that’s not what I am teaching today).

For the sake of this tutorial, let’s use my Twitter account to figure out where I am. Let’s start by bringing in the libraries we need.
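A minimal sketch of this step, assuming Tweepy 4.x and a bearer token from the Twitter developer portal. The helper names `profile_location` and `fetch_profile_location` are made up for illustration, and the demo at the bottom uses a stand-in object rather than a live API call, so treat it as a starting point rather than the exact code behind this post.

```python
from types import SimpleNamespace


def profile_location(user):
    """Return the profile location of a user object, or None if it is not set."""
    location = getattr(user, "location", None)
    if location is None:
        return None
    location = location.strip()
    return location or None


def fetch_profile_location(username, bearer_token):
    """Look up one user with the Twitter API v2 and return their profile location."""
    # Imported here so the helper above can be tried without Tweepy installed.
    import tweepy

    client = tweepy.Client(bearer_token=bearer_token)
    # "location" is not returned by default; it has to be requested via user_fields.
    response = client.get_user(username=username, user_fields=["location"])
    return profile_location(response.data)


# Offline demonstration with a stand-in for the user object Tweepy returns.
demo_user = SimpleNamespace(username="imajeanpeace", location="-37.8136, 144.9631")
print(profile_location(demo_user))
```

With real credentials, the call would be `fetch_profile_location("imajeanpeace", "YOUR BEARER TOKEN")`. Users who left the field empty come back as `None`.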

As we can see here, I (@imajeanpeace) am located at -37.8136, 144.9631. Let’s check my Twitter profile to see if it matches.

It does! But wait, where exactly is -37.8136, 144.9631, I hear you ask? That’s where geopy and geopandas come in.
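Since a profile location is free text, it may be worth checking that a string really is a latitude, longitude pair before treating it as one. Here is a small sketch of such a check; `parse_lat_lon` is a hypothetical helper and not part of Tweepy or geopy.

```python
def parse_lat_lon(text):
    """Return (lat, lon) as floats if text looks like 'lat, lon', else None."""
    parts = text.split(",")
    if len(parts) != 2:
        return None
    try:
        lat, lon = float(parts[0]), float(parts[1])
    except ValueError:
        return None
    # Reject numbers that cannot be real coordinates.
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        return None
    return lat, lon


print(parse_lat_lon("-37.8136, 144.9631"))  # (-37.8136, 144.9631)
print(parse_lat_lon("Misty Mountains"))     # None
```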

Just so we have more data to play with, I will add -37.8136, 144.9631 to a group of geographic coordinates that I collected a while ago.
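One way to do this is sketched below with the standard library only. The `location` column name matches what the geocoding loop further down reads; the two starting coordinates are borrowed from the example output in this post, and the in-memory buffer stands in for a real file.

```python
import csv
import io

# Coordinates collected earlier, plus the one that came from the Twitter profile.
collected = [
    "-6.7052935,108.5407783",
    "19.0405882,72.9263868",
]
collected.append("-37.8136, 144.9631")


def write_locations(rows, handle):
    """Write one location string per row under a 'location' header."""
    writer = csv.writer(handle)
    writer.writerow(["location"])
    for row in rows:
        # Each value contains a comma, so the csv module quotes it for us.
        writer.writerow([row])


# Swap the buffer for open("YOURCSVNAME.csv", "w", newline="") to save a file.
buffer = io.StringIO()
write_locations(collected, buffer)
print(buffer.getvalue())
```

Because each coordinate pair is quoted, it is read back as a single `location` value rather than being split into two columns.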

My dataset now looks like this. Now, let’s import geopandas and geopy to help us find where these coordinates are located on the world map.

import ssl

import certifi
import geopandas as gpd
import geopy
import pandas as pd
from geopy.extra.rate_limiter import RateLimiter
from geopy.geocoders import Nominatim

# Use certifi's CA bundle so HTTPS requests to Nominatim verify correctly
ctx = ssl.create_default_context(cafile=certifi.where())
geopy.geocoders.options.default_ssl_context = ctx

locator = Nominatim(user_agent='USE YOUR EMAIL HERE')
# Wait at least 5 seconds between requests so we do not hammer the service
geocode = RateLimiter(locator.geocode, min_delay_seconds=5)

df = pd.read_csv(r'YOURCSVNAME.csv', engine='python')
# see the pic above on what df looks like

for i in df['location']:
    try:
        if i.strip() == 'Not Found':
            print('None')
        else:
            # go through the rate-limited wrapper, not locator.geocode directly
            location = geocode(i)
            # the address is a comma-separated string; the last part is the country
            temp = str(location).split(",")
            loc = temp[-1]
            # strip() removes whitespace on both sides
            loc_formatted = loc.strip()
            print(i + " : " + loc_formatted)
    except Exception as e:
        print(f"Could not geocode {i}: {e}")

The printed results should look like this:

-37.8136,  144.9631 : Australia
-6.7052935,108.5407783 : Indonesia
-6.512312,106.843396 : Indonesia
19.0405882,72.9263868 : India
48.72874,9.093811 : Deutschland
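Notice that Germany comes back as Deutschland, since Nominatim tends to answer in the local language unless told otherwise. If you would rather have English names, one option is geopy's `reverse` method with `language="en"`, which also takes the coordinates as numbers rather than as a string. The sketch below is hedged: `country_for` is a made-up helper, it receives the `reverse` callable as an argument so it can be tried offline, and `fake_reverse` only imitates the shape of a geopy result.

```python
from types import SimpleNamespace


def country_for(lat, lon, reverse):
    """Return the country name for a coordinate pair, or None if nothing is found.

    `reverse` is any callable with the signature of geopy's Nominatim.reverse,
    for example RateLimiter(locator.reverse, min_delay_seconds=1).
    """
    location = reverse((lat, lon), language="en")
    if location is None:
        return None
    # With reverse geocoding, the raw response carries a structured "address" dict.
    return location.raw.get("address", {}).get("country")


def fake_reverse(query, language="en"):
    """Offline stand-in that mimics the shape of a geopy Location (made-up data)."""
    return SimpleNamespace(raw={"address": {"country": "Germany", "country_code": "de"}})


print(country_for(48.72874, 9.093811, fake_reverse))
```

Reading the country from the structured address also avoids relying on where it happens to sit in the comma-separated string.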

For my research, my only concern is the country name. The geocoded address is a comma-separated string whose last element is the country, so to keep a different part of the address, change the index in this line of code under the else:

loc = temp[-1]
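To make the indexing concrete, here is a small sketch with a hypothetical helper, `address_part`. The sample address is illustrative and not actual Nominatim output; real addresses vary in length from country to country, so positions other than the last one should be used with care.

```python
def address_part(address, index=-1):
    """Return one comma-separated component of a geocoded address string."""
    parts = [part.strip() for part in str(address).split(",")]
    return parts[index]


# Illustrative address string, not actual Nominatim output.
sample = "Melbourne, City of Melbourne, Victoria, Australia"
print(address_part(sample, -1))  # Australia
print(address_part(sample, -2))  # Victoria
print(address_part(sample, 0))   # Melbourne
```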

That’s all I have for you today.

Jean Dinco

Jean is a PhD candidate working on data, media & conflict forecasting.