A Multimodal Machine Learning Fused Global 0.1° Daily Evapotranspiration Dataset from 1950-2022
53 Pages Posted: 17 Apr 2024
Abstract
Evapotranspiration (ET) is the second largest hydrological flux over the land surface and connects the water, energy, and carbon cycles. However, current ET products carry large uncertainties because of their coarse spatial resolution, short temporal coverage, and reliance on assumptions. This study introduces a multimodal machine learning framework that generates a high-resolution (0.1°, daily), long-term (1950-2022) global ET dataset by fusing 13 ET products spanning remote sensing, machine learning, land surface model, and reanalysis sources, using observations from 462 flux tower sites. The framework first reconstructs each ET product to a consistent spatiotemporal resolution and time range using Light Gradient Boosting Machine (LightGBM) models. An Automated Machine Learning (AutoML) technique then fuses the 13 reconstructed ET products, with ERA5-Land atmospheric forcings and ancillary data as additional predictors. In situ observations are used for model training and validation. The fused product improves on existing datasets, achieving the highest accuracy against in situ measurements across ecosystems and regions (Kling-Gupta efficiency, KGE = 0.857; root-mean-square error, RMSE = 0.726 mm/day). It realistically captures spatiotemporal variability and corrects the systematic underestimation bias prevalent in other datasets, particularly in wet regions. This high spatiotemporal resolution ET dataset enables more robust assessments of the water, energy, and carbon cycles in regional hydrology and ecology applications. The data integration methodology also provides a framework for fusing multiple geoscience datasets with disparate properties.
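For readers unfamiliar with the two accuracy metrics reported above, the following is a minimal sketch of how KGE and RMSE can be computed for a pair of observed and estimated ET series. It assumes the original Kling-Gupta efficiency formulation (correlation, variability ratio, and bias ratio combined as a Euclidean distance from the ideal point); the abstract does not state which KGE variant the authors used, and the function names and sample values below are illustrative, not taken from the paper.

```python
import math


def rmse(obs, sim):
    """Root-mean-square error between observed and simulated values."""
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / len(obs))


def kge(obs, sim):
    """Kling-Gupta efficiency (original formulation; assumed, not confirmed by the abstract).

    KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2), where
    r is the Pearson correlation, alpha the ratio of standard deviations
    (sim / obs), and beta the ratio of means (sim / obs).
    """
    n = len(obs)
    mean_o = sum(obs) / n
    mean_s = sum(sim) / n
    std_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs) / n)
    std_s = math.sqrt(sum((s - mean_s) ** 2 for s in sim) / n)
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(obs, sim)) / n
    r = cov / (std_o * std_s)   # linear correlation
    alpha = std_s / std_o       # variability ratio
    beta = mean_s / mean_o      # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


if __name__ == "__main__":
    # Made-up daily ET values in mm/day, for illustration only.
    tower_et = [1.2, 2.5, 3.1, 4.0, 2.2]
    fused_et = [1.0, 2.7, 3.0, 3.6, 2.4]
    print(f"KGE  = {kge(tower_et, fused_et):.3f}")
    print(f"RMSE = {rmse(tower_et, fused_et):.3f} mm/day")
```

A KGE of 1 indicates perfect agreement in correlation, variability, and mean, so values closer to 1 are better; RMSE is in the units of the data (here mm/day), and lower is better.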
Keywords: evapotranspiration, automated machine learning, multimodal, prolonged data reconstruction