A Multimodal Machine Learning Fused Global 0.1° Daily Evapotranspiration Dataset from 1950-2022

53 Pages Posted: 17 Apr 2024

Qingchen Xu

affiliation not provided to SSRN

Lu Li

Southern Marine Science and Engineering Guangdong Laboratory - Zhuhai

Zhongwang Wei

Southern Marine Science and Engineering Guangdong Laboratory - Zhuhai

Xuhui Lee

Yale University - School of the Environment

Yongjiu Dai

Southern Marine Science and Engineering Guangdong Laboratory - Zhuhai

Abstract

Evapotranspiration (ET) is the second largest hydrological flux over the land surface and connects the water, energy, and carbon cycles. However, large uncertainties exist among current ET products because of their coarse spatial resolutions, short temporal coverage, and reliance on assumptions. This study introduces a multimodal machine learning framework that generates a high-resolution (0.1°, daily), long-term (1950-2022) global ET dataset by fusing 13 ET products spanning remote sensing, machine learning, land surface models, and reanalysis data, relying on extensive flux tower observations (462 sites). The framework first reconstructs the individual ET products to consistent spatiotemporal resolutions and time ranges using Light Gradient Boosting Machine (LightGBM) models. An Automated Machine Learning (AutoML) technique then fuses ET, using the 13 reconstructed ET products, ERA5-Land atmospheric forcings, and ancillary data as predictors. In situ observations are used for model training and validation. Results demonstrate significant improvements over existing datasets, with our product achieving the highest accuracy (Kling-Gupta efficiency, KGE = 0.857; root-mean-square error, RMSE = 0.726 mm/day) against in situ measurements across ecosystems and regions. The fused ET dataset realistically captures spatiotemporal variability and corrects the systematic underestimation bias prevalent in other datasets, particularly in wet regions. This novel ET dataset, with its high spatiotemporal resolution, enables more robust assessments for water, energy, and carbon cycle applications in regional hydrology and ecology. The data integration methodology also provides a valuable framework for fusing multiple geoscience datasets with disparate properties.
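
The two accuracy scores quoted in the abstract can be illustrated with a minimal sketch. The abstract does not say which KGE variant the authors use, so the code below assumes the original 2009 Kling-Gupta formulation (correlation, variability ratio, and bias ratio). The function names and the sample values are hypothetical and are not taken from the paper.

```python
import math


def rmse(sim, obs):
    """Root-mean-square error between simulated and observed values."""
    return math.sqrt(sum((s - o) ** 2 for s, o in zip(sim, obs)) / len(obs))


def kge(sim, obs):
    """Kling-Gupta efficiency, 2009 formulation (assumed variant).

    KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2), where
    r is the linear correlation, alpha the ratio of standard deviations,
    and beta the ratio of means (simulated over observed).
    """
    n = len(obs)
    mu_s = sum(sim) / n
    mu_o = sum(obs) / n
    sd_s = math.sqrt(sum((s - mu_s) ** 2 for s in sim) / n)
    sd_o = math.sqrt(sum((o - mu_o) ** 2 for o in obs) / n)
    cov = sum((s - mu_s) * (o - mu_o) for s, o in zip(sim, obs)) / n
    r = cov / (sd_s * sd_o)
    alpha = sd_s / sd_o
    beta = mu_s / mu_o
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Hypothetical daily ET values in mm/day, for illustration only.
obs = [1.0, 2.0, 3.0, 4.0]
sim = [1.5, 2.5, 3.5, 4.5]

print(f"RMSE = {rmse(sim, obs):.3f} mm/day, KGE = {kge(sim, obs):.3f}")
```

In this toy case the simulated series carries a constant +0.5 mm/day bias, so RMSE is 0.5 mm/day and only the bias term lowers KGE, to 0.8. A KGE of 1 indicates perfect agreement in correlation, variability, and mean.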

Keywords: evapotranspiration, automated machine learning, multimodal, prolonged data reconstruction

Suggested Citation

Xu, Qingchen and Li, Lu and Wei, Zhongwang and Lee, Xuhui and Dai, Yongjiu, A Multimodal Machine Learning Fused Global 0.1° Daily Evapotranspiration Dataset from 1950-2022. Available at SSRN: https://ssrn.com/abstract=4797287 or http://dx.doi.org/10.2139/ssrn.4797287

Qingchen Xu

affiliation not provided to SSRN

Lu Li

Southern Marine Science and Engineering Guangdong Laboratory - Zhuhai

Zhongwang Wei (Contact Author)

Southern Marine Science and Engineering Guangdong Laboratory - Zhuhai

Xuhui Lee

Yale University - School of the Environment

United States

Yongjiu Dai

Southern Marine Science and Engineering Guangdong Laboratory - Zhuhai
