A novel data fusion method to leverage passively-collected mobility data in generating spatially-heterogeneous synthetic population
30 Pages. Posted: 21 Nov 2023
Date Written: August 6, 2023
Abstract
Conventional population synthesis methods rely on household travel survey (HTS) data. Because they generate sociodemographic and spatial attributes sequentially, they produce many infeasible attribute values, and because the HTS sampling rate is low, the resulting populations have low spatial heterogeneity. Passively collected mobility (PCM) data (e.g., cellular traces) provide extensive spatial coverage but are difficult to integrate with HTS data because the two sources differ in spatial resolution and attributes. This study introduces a novel cluster-based data fusion method that addresses these limitations and simultaneously generates synthetic populations with accurate sociodemographics and home and work locations at high spatial heterogeneity. Spatial clustering is used to align the spatial resolution of the HTS and PCM data, which makes the two datasets compatible for integration. The data fusion process is reformulated as a set of cluster-specific, low-dimensional optimization subproblems to keep it computationally tractable. Analytical properties are derived showing that the fused distribution retains essential distributional characteristics of both datasets. The spatial clustering itself is optimized to preserve this distributional consistency while balancing the feasibility and the heterogeneity of the synthetic population. The data fusion properties are validated using HTS data and LTE/5G cellular signaling data from Seoul, South Korea. Validation against census data confirms that the method maintains distributional consistency while increasing spatial heterogeneity, with 97% of the generated population unobserved in the HTS data. This research advances population synthesis by leveraging the complementary strengths of HTS and PCM data, providing a robust framework for generating the spatially diverse synthetic populations that urban planning requires.
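As a rough illustration of the cluster-based idea only, and not the optimization formulation developed in the paper, the sketch below fuses a toy HTS sample with toy PCM zone counts. It assumes the simplest possible rule: within each spatial cluster, the HTS attribute distribution is spread over zones in proportion to the PCM counts. The function names (`fuse`, `unobserved_share`), the zone and cluster labels, and all numbers are hypothetical.

```python
from collections import Counter, defaultdict


def fuse(hts_records, pcm_zone_counts, zone_to_cluster):
    """Toy fusion rule (an assumption, not the paper's method):
    P(attr, zone) = P_PCM(zone) * P_HTS(attr | cluster of zone).

    hts_records      : list of (zone, attr) pairs from the survey
    pcm_zone_counts  : dict zone -> passively observed count
    zone_to_cluster  : dict zone -> cluster label
    """
    # Pool the HTS attribute counts over all zones in the same cluster.
    cluster_attr = defaultdict(Counter)
    for zone, attr in hts_records:
        cluster_attr[zone_to_cluster[zone]][attr] += 1

    total_pcm = sum(pcm_zone_counts.values())
    fused = {}
    for zone, count in pcm_zone_counts.items():
        attrs = cluster_attr.get(zone_to_cluster[zone])
        if not attrs:
            continue  # no HTS sample in this cluster; its mass is dropped
        n = sum(attrs.values())
        for attr, k in attrs.items():
            fused[(attr, zone)] = (count / total_pcm) * (k / n)
    return fused


def unobserved_share(fused, hts_records):
    """Share of fused probability mass on (attr, zone) pairs absent from HTS."""
    observed = {(attr, zone) for zone, attr in hts_records}
    total = sum(fused.values())
    return sum(p for key, p in fused.items() if key not in observed) / total


if __name__ == "__main__":
    # Hypothetical toy data: three zones in a single cluster "A".
    hts = [("z1", "young"), ("z1", "old"), ("z2", "young")]
    pcm = {"z1": 50, "z2": 30, "z3": 20}
    clusters = {"z1": "A", "z2": "A", "z3": "A"}

    fused = fuse(hts, pcm, clusters)
    print(fused)
    print(unobserved_share(fused, hts))
```

In this toy setting the zone marginal of the fused distribution follows the PCM counts, the attribute mix within the cluster follows the HTS sample, and zone `z3`, which never appears in the survey, still receives synthetic individuals. That last effect is the kind of gain in spatial heterogeneity the abstract describes.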
Keywords: Population synthesis, Data fusion, Spatial heterogeneity, Passively collected mobility data, Cellphone data