Where in the World are You? Geolocation and Language Identification in Twitter

Professional Geographer, Forthcoming

17 Pages. Posted: 6 Aug 2013

Mark Graham

University of Oxford - Oxford Internet Institute

Scott A. Hale

Oxford Internet Institute, University of Oxford

Devin Gaffney

University of Oxford - Oxford Internet Institute

Date Written: April 25, 2013

Abstract

The movement of ideas and content between locations and languages is a crucial concern for researchers of the information age, and Twitter has emerged as a central, global platform on which hundreds of millions of people share knowledge and information. A variety of research has attempted to harvest locational and linguistic metadata from tweets in order to answer important questions about the 300 million tweets that flow through the platform each day. However, much of this work is carried out with only a limited understanding of how best to work with the spatial and linguistic contexts in which the information was produced. Furthermore, standard, well-accepted practices have yet to emerge. This paper therefore studies the reliability of key methods used to determine the language and location of content in Twitter. It compares three automated language identification packages to Twitter’s user interface language setting and to a human coding of languages in order to identify common sources of disagreement. The paper also demonstrates that in many cases user-entered profile locations differ from the physical locations users are actually tweeting from. As such, these open-ended, user-generated profile locations are not useful proxies for the physical locations from which information is published to Twitter.

Keywords: Geography, Language, Twitter

Suggested Citation

Graham, Mark and Hale, Scott A. and Gaffney, Devin, Where in the World are You? Geolocation and Language Identification in Twitter (April 25, 2013). Professional Geographer, Forthcoming, Available at SSRN: https://ssrn.com/abstract=2224233

Mark Graham (Contact Author)

University of Oxford - Oxford Internet Institute

1 St. Giles
University of Oxford
Oxford, Oxfordshire OX1 3JS
United Kingdom

HOME PAGE: http://www.geospace.co.uk

Scott A. Hale

Oxford Internet Institute, University of Oxford

1 St. Giles
University of Oxford
Oxford, Oxfordshire OX1 3JS
United Kingdom

HOME PAGE: http://www.scotthale.net/

Devin Gaffney

University of Oxford - Oxford Internet Institute

1 St. Giles
University of Oxford
Oxford, Oxfordshire OX1 3JS
United Kingdom

HOME PAGE: http://www.devingaffney.com/
