Doing Computational Social Science with Python: An Introduction
144 Pages. Posted: 27 Feb 2016. Last revised: 23 Jan 2019
Date Written: January 21, 2018
Social scientists are increasingly confronted with the analysis of large-scale datasets. Often, these data come from online sources, and often they contain some form of text: think of social media data, but also of large archives. This development is often referred to as a move towards “computational social science” (Kitchin, 2014; Lazer et al., 2009).
There is a certain overlap with the term “Big Data”. But although the latter sounds sexy, it is a somewhat problematic term, because people use it for all kinds of data. As scientists, we want a clear definition, but it is hard to tell what the term Big Data actually entails. This book is not about really Big Data requiring a whole server farm, but about data that is too big to handle manually (think of one million tweets, for example). It is about datasets that are small enough to be handled by an ordinary laptop, but often too big to be processed by ordinary programs. Excel does not have an unlimited number of rows, and SPSS and Stata start complaining (or simply stop working) once you have too many cases and variables. If you know R, you are better off, but for some of the tasks we will discuss in this book, Python has a bit more to offer.
This book introduces the reader to automated content analysis of data that typically come in amounts too voluminous for manual coding and for traditional point-and-click applications: tweets, blog posts, articles from RSS feeds, etc. It introduces the programming language Python to social scientists, because Python is very flexible and highly suitable for this end. It also scales very nicely, meaning that it can be used for smaller projects as well as on immense datasets.
Keywords: computational social science, Python, automated content analysis
JEL Classification: C80