Transforming Naturally Occurring Text Data into Economic Statistics: The Case of Online Job Vacancy Postings

36 Pages. Posted: 22 May 2019. Last revised: 5 Feb 2025.

Arthur Turrell

Office for National Statistics

Bradley Speigner

Bank of England

Jyldyz Djumalieva

NESTA

David Copple

Bank of England

James Thurgood

Bank of England

Date Written: May 2019

Abstract

Using a dataset of 15 million UK job adverts from a recruitment website, we construct new economic statistics measuring labour market demand. These data are ‘naturally occurring’, having originally been posted online by firms. They offer information on two dimensions of vacancies (region and occupation) that firm-based surveys do not usually, and cannot easily, collect. These data do not come with official classification labels, so we develop an algorithm that maps the free-form text of job descriptions into standard occupational classification codes. The resulting vacancy statistics give a plausible, granular picture of UK labour demand and permit the analysis of Beveridge curves and mismatch unemployment at the occupational level.

Suggested Citation

Turrell, Arthur and Speigner, Bradley and Djumalieva, Jyldyz and Copple, David and Thurgood, James, Transforming Naturally Occurring Text Data into Economic Statistics: The Case of Online Job Vacancy Postings (May 2019). NBER Working Paper No. w25837, Available at SSRN: https://ssrn.com/abstract=3390984

Arthur Turrell (Contact Author)

Office for National Statistics

London, SW1A 2AA
United Kingdom

Bradley Speigner

Bank of England

Threadneedle Street
London, EC2R 8AH
United Kingdom

Jyldyz Djumalieva

NESTA

1 Plough Place
London, EC4 1DE
United Kingdom

David Copple

Bank of England

Threadneedle Street
London, EC2R 8AH
United Kingdom

James Thurgood

Bank of England

Threadneedle Street
London, EC2R 8AH
United Kingdom
