Lake by Lake, Globally: Enhancing Water Quality Remote Sensing with Multi-Task Learning Models

14 Pages. Posted: 17 Mar 2024

Olivier Graffeuille

affiliation not provided to SSRN

Moritz K. Lehmann

The University of Waikato

Mathew Allan

Waikato Regional Council

Jörg Wicker

affiliation not provided to SSRN

Yun Sing Koh

affiliation not provided to SSRN

Abstract

The estimation of water quality from satellite remote sensing data in inland and coastal waters is an important yet challenging problem. Recent collaborative efforts have produced large global datasets with sufficient data to train machine learning models with high accuracy. In this work, we investigate global water quality remote sensing models at the granularity of individual water bodies. We introduce Multi-Task Learning (MTL), a machine learning technique that learns a distinct model for each water body in the dataset from few data points by sharing knowledge between models. This approach allows MTL to learn water body differences, leading to more accurate predictions. We train and validate our model on the GLORIA dataset of in situ measured remote sensing reflectance and three water quality indicators: chlorophyll-a, total suspended solids (TSS) and coloured dissolved organic matter (CDOM). MTL outperforms other machine learning models by 8-31% in Root Mean Squared Error (RMSE) and 12-34% in Mean Absolute Percentage Error (MAPE). Training on a smaller dataset of chlorophyll-a measurements from New Zealand lakes with simultaneous Sentinel-3 OLCI remote sensing reflectance further demonstrates the effectiveness of our model when applied regionally. Additionally, we investigate how well machine learning models estimate the variation in water quality indicators within individual water bodies. Our results reveal that, for models trained on a large number of water bodies, overall performance metrics overestimate the quality of model fit because water quality indicators vary so widely between water bodies. In our experiments, when estimating TSS or CDOM, all models except MTL fail to learn within-water body variability and fail to outperform a naive baseline approach, suggesting that these models may be of limited usefulness to practitioners monitoring water quality. Overall, our research highlights the importance of considering water body differences in water quality remote sensing research for both model design and evaluation.
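The point about overall metrics overstating model fit can be illustrated with a small sketch. The code below is not the authors' evaluation code. The function names, the toy numbers, and the choice of a per-lake mean as the naive baseline are all assumptions made for illustration, since the abstract does not specify the baseline. The toy model predicts each lake's mean value, so it captures the difference between lakes but none of the variation within a lake.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root Mean Squared Error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def pooled_r2(y_true, y_pred):
    """R^2 computed over all samples, ignoring which lake they come from."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def within_lake_rmse_ratio(y_true, y_pred, lake_ids):
    """Per lake: model RMSE divided by the RMSE of a per-lake-mean baseline.

    The per-lake mean is an assumed, illustrative baseline. A ratio near or
    above 1 means the model explains no within-lake variation. Assumes every
    lake has at least two distinct measurements (otherwise the baseline RMSE
    is zero and the division fails).
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    lake_ids = np.asarray(lake_ids)
    ratios = {}
    for lake in np.unique(lake_ids):
        mask = lake_ids == lake
        baseline = np.full(int(mask.sum()), y_true[mask].mean())
        ratios[str(lake)] = rmse(y_true[mask], y_pred[mask]) / rmse(y_true[mask], baseline)
    return ratios

# Hypothetical toy data: two lakes with very different indicator levels.
lake_ids = ["A", "A", "A", "B", "B", "B"]
y_true = [1.0, 2.0, 3.0, 101.0, 102.0, 103.0]
# A model that only ever predicts each lake's typical level.
y_pred = [2.0, 2.0, 2.0, 102.0, 102.0, 102.0]

overall = pooled_r2(y_true, y_pred)
per_lake = within_lake_rmse_ratio(y_true, y_pred, lake_ids)
print(f"pooled R^2: {overall:.4f}")
print(f"within-lake RMSE ratio vs. baseline: {per_lake}")
```

In this toy case the pooled R^2 is close to 1 because the between-lake difference dominates the total variance, while the per-lake ratio of 1 shows the model does no better than the baseline inside either lake.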

Keywords: water quality, remote sensing, inland and coastal waters, machine learning, multi-task learning

Suggested Citation

Graffeuille, Olivier and Lehmann, Moritz K. and Allan, Mathew and Wicker, Jörg and Koh, Yun Sing, Lake by Lake, Globally: Enhancing Water Quality Remote Sensing with Multi-Task Learning Models. Available at SSRN: https://ssrn.com/abstract=4762429 or http://dx.doi.org/10.2139/ssrn.4762429

Olivier Graffeuille (Contact Author)

affiliation not provided to SSRN

Moritz K. Lehmann

The University of Waikato

Mathew Allan

Waikato Regional Council

401, Gray Street
Hamilton East
Hamilton, 3210
New Zealand

Jörg Wicker

affiliation not provided to SSRN

Yun Sing Koh

affiliation not provided to SSRN
