OpenEDGAR: Open Source Software for SEC EDGAR Analysis
12 Pages
Posted: 27 Jun 2018
Date Written: June 12, 2018
Abstract
OpenEDGAR is an open source Python framework designed to rapidly construct research databases based on the Electronic Data Gathering, Analysis, and Retrieval (EDGAR) system operated by the US Securities and Exchange Commission (SEC). OpenEDGAR is built on the Django application framework, supports distributed computation across one or more servers, and includes functionality to (i) retrieve and parse index and filing data from EDGAR, (ii) build tables for key metadata such as form type and filer, (iii) retrieve, parse, and update Central Index Key (CIK) to ticker and industry mappings, (iv) extract content and metadata from filing documents, and (v) search filing document contents. OpenEDGAR is designed for use in both academic research and industrial applications, and is distributed under the MIT License.
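To make items (i) and (ii) concrete, the following is a minimal sketch of parsing an EDGAR master index and tabulating form types. It is not OpenEDGAR's own code or API: the names `IndexEntry`, `parse_master_index`, and `SAMPLE_INDEX` are hypothetical, and the sample rows are invented placeholders rather than real filings. The only assumption taken from EDGAR itself is the pipe-delimited master index layout (CIK, company name, form type, date filed, file name).

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class IndexEntry:
    """One row of an EDGAR master index (hypothetical container, not OpenEDGAR's model)."""
    cik: int
    company_name: str
    form_type: str
    date_filed: date
    file_name: str


def parse_master_index(text: str) -> list:
    """Parse pipe-delimited master index text into IndexEntry records.

    Header and separator lines are skipped: only rows with exactly five
    fields and a numeric CIK in the first field are treated as data.
    """
    entries = []
    for line in text.splitlines():
        parts = line.strip().split("|")
        if len(parts) != 5 or not parts[0].isdigit():
            continue
        cik, name, form_type, filed, file_name = parts
        entries.append(
            IndexEntry(int(cik), name, form_type, date.fromisoformat(filed), file_name)
        )
    return entries


# Invented sample in the master index layout; companies and paths are placeholders.
SAMPLE_INDEX = """\
Description:           Master Index of EDGAR Dissemination Feed
CIK|Company Name|Form Type|Date Filed|Filename
--------------------------------------------------------------------------------
1000001|EXAMPLE HOLDINGS INC|10-K|2018-03-01|edgar/data/1000001/0001000001-18-000001.txt
1000001|EXAMPLE HOLDINGS INC|8-K|2018-03-15|edgar/data/1000001/0001000001-18-000002.txt
1000002|SAMPLE INDUSTRIES CORP|10-K|2018-02-20|edgar/data/1000002/0001000002-18-000001.txt
"""

entries = parse_master_index(SAMPLE_INDEX)

# A simple stand-in for a "form type" metadata table: filings counted per form type.
form_counts = Counter(entry.form_type for entry in entries)
```

In a database-backed framework such as the one described, records like these would presumably be stored as table rows rather than held in memory; the in-memory list and counter here only illustrate the parsing and grouping steps.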
Keywords: SEC, EDGAR, Legal, Regulatory, Finance, Accounting, Data, Open Source, Corpora, Python, Natural Language Processing, Machine Learning
JEL Classification: C19, C53, C55, C38, C45, C63, C88