Emerging the U.S. Firm Size Distribution Using 4.2 Billion Individual Tax Records

9 Pages Posted: 15 May 2019

Date Written: April 25, 2019


The firm size distribution describes important economic and labor properties of any economy. Government entities must expend enormous resources on data collection, cleaning, and analysis in order to construct this and other important distributions describing the aggregate properties of large economies. In the U.S., this process can be cumbersome: it relies on querying multiple databases and using significant computational resources. I show that construction of the U.S. firm size distribution is plausible using only individual income tax records (W2s) drawn directly from Internal Revenue Service tax records (micro data) and that the emergent distribution is statistically identical to what is reported by the United States Census Bureau. The methodology represents an incremental advance for population-scale studies in economic analysis, specifically firm and labor analysis. Finally, this paper acts as a re-validation of earlier work in fitting the firm size distribution.
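
The abstract does not spell out the aggregation procedure, so the following is only a minimal sketch of how a firm size distribution could emerge from individual wage records. It assumes each W2 can be reduced to an (employer identifier, employee identifier) pair, counts distinct employees per employer, and tallies firms into employment-size bins. The function name, the record layout, and the bin edges are hypothetical illustrations, not the paper's actual method or the Census Bureau's size classes.

```python
from bisect import bisect_right
from collections import Counter


def firm_size_distribution(w2_records, bin_edges):
    """Count firms per employment-size bin.

    w2_records: iterable of (employer_id, employee_id) pairs (assumed layout).
    bin_edges:  sorted lower bounds of the size bins (illustrative choice).
    """
    # A worker with several W2s from the same employer is counted once.
    unique_pairs = set(w2_records)

    # Firm size = number of distinct employees attached to each employer.
    firm_sizes = Counter(employer for employer, _ in unique_pairs)

    # Tally each firm into the bin whose lower bound is the largest edge <= size.
    counts = [0] * len(bin_edges)
    for size in firm_sizes.values():
        idx = bisect_right(bin_edges, size) - 1
        if idx >= 0:
            counts[idx] += 1
    return dict(zip(bin_edges, counts))


# Toy data: firm A has 2 workers (one duplicate record), B has 1, C has 5.
records = [
    ("A", "w1"), ("A", "w2"), ("A", "w1"),
    ("B", "w3"),
    ("C", "w4"), ("C", "w5"), ("C", "w6"), ("C", "w7"), ("C", "w8"),
]
distribution = firm_size_distribution(records, [1, 5, 10])
```

At population scale the same grouping would have to run out of core or in a distributed setting; the in-memory set and counter here are for illustration only.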

Keywords: firm size, labor, taxation, data policy, economic analysis, data science

Suggested Citation

Shaheen, Joseph A.E., Emerging the U.S. Firm Size Distribution Using 4.2 Billion Individual Tax Records (April 25, 2019). RAIS Conference Proceedings - The 12th International RAIS Conference on Social Sciences & Humanities, Available at SSRN: https://ssrn.com/abstract=3387584 or http://dx.doi.org/10.2139/ssrn.3387584

Joseph A.E. Shaheen (Contact Author)

George Mason University

4400 University Drive
Fairfax, VA 22030
United States
