Scraping EDGAR With Python
Journal of Education for Business, 2017, 92:4, 179-185
21 Pages. Posted: 13 Dec 2019
Date Written: June 1, 2017
This paper presents Python code that can be used to extract data from SEC filings. The program crawls the web to obtain URL paths for company filings of required reports, such as the 10-K. It then performs a textual analysis, counting the occurrences of words in the filing that reflect, for example, uncertainty (or any other quality specified by the researcher). The program can be easily modified to conduct other searches by changing the word list, company names, or SEC filings. It could be used in an introductory graduate data analytics course in finance that has a web crawling or textual analysis component.
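The word-counting step described in the abstract can be illustrated with a minimal sketch. This is not the paper's code: the word list `UNCERTAINTY_WORDS`, the function `count_word_occurrences`, and the sample text are all hypothetical, chosen only for illustration. The sketch assumes the filing text has already been downloaded, so the crawling step is omitted.

```python
import re
from collections import Counter

# Hypothetical word list for illustration; a researcher would supply
# their own list for uncertainty or any other quality of interest.
UNCERTAINTY_WORDS = {"uncertain", "uncertainty", "may", "might", "risk"}


def count_word_occurrences(text, word_list):
    """Count how often each word in word_list appears in text.

    Matching is case-insensitive and on whole alphabetic tokens only.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(token for token in tokens if token in word_list)


# Stand-in for the text of a downloaded filing (assumed, not real data).
sample_text = (
    "Our results may differ materially. Uncertainty about demand "
    "may increase risk, and outcomes remain uncertain."
)

counts = count_word_occurrences(sample_text, UNCERTAINTY_WORDS)
total = sum(counts.values())

for word, n in sorted(counts.items()):
    print(f"{word}: {n}")
print(f"total: {total}")
```

Changing the search, as the abstract suggests, amounts to passing a different word list or a different filing's text to the same function.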
Keywords: Education, Higher Education, Data Collection, Computer Programs
JEL Classification: I20, I23, C80