Leveraging Landsat and Google Earth Engine for long-term chlorophyll-a monitoring : A case study of Lake Balaton's water quality
(2025) In Ecological Informatics 90.
- Abstract
Chlorophyll-a (Chl-a) is a critical water quality indicator that reflects the eutrophication status of aquatic ecosystems. As the largest lake and a well-known attraction in Central Europe, Lake Balaton contributes 70 % or more of the local economy through tourism, while also maintaining a unique biodiversity. Therefore, long-term monitoring of its water quality is essential for effective management. With the longest global environmental record and a suitable spatial resolution, the Landsat satellite series is used to retrieve Chl-a in this study. However, the typically infrequent in-situ sampling and the ∼16-day revisit time of Landsat limit both the quality and the applicability of Landsat-based Chl-a retrieval. We first trained linear and several machine learning models using matchups between in-situ measurements and satellite data from the Landsat 4–9 missions between 1984 and 2023. To address the imbalanced-data problem, in which high-concentration samples are scarce because bloom events are rare, we extend the time tolerance, incorporate temporal information that conveys phenology, and apply an oversampling technique during training. Validated on Lake Balaton, where Chl-a concentrations have ranged from 5 to 260 μg/L since the 1980s, the Random Forest model achieves the best accuracy, with an R² of 0.86 and an RMSE of 8.16 μg/L. Oversampling improves accuracy by 9.5 % relative to the non-oversampled model, and combining all strategies improves overall accuracy by 21 %. The results also show a reasonable trade-off: extending the time tolerance from the same day to 3 days increases the number of matchups 3- to 8-fold, despite the high variability of Chl-a caused by the sinking and floating movement of algae. The enhancement framework can be applied to other lakes, especially those with sparse sampling and wide Chl-a fluctuations.
We present an open-source online tool for historical and real-time Chl-a mapping, designed for both experts and the public. The code is customizable for lakes worldwide, and results are continuously showcased on the HUN-REN Balaton Limnological Research Institute's website and social media.
- author
- Li, Huan ; Somogyi, Boglárka ; Chen, Xiaona ; Luo, Zengliang ; Blix, Katalin ; Wu, Sirui ; Duan, Zheng (LU) and Tóth, Viktor R.
- publishing date
- 2025-12
- type
- Contribution to journal
- publication status
- published
- subject
- keywords
- Applicability of Landsat, Integrating temporal information, Machine learning, Oversampling imbalanced dataset, Remote sensing, Water quality
- in
- Ecological Informatics
- volume
- 90
- article number
- 103245
- publisher
- Elsevier
- external identifiers
- scopus:105007031555
- ISSN
- 1574-9541
- DOI
- 10.1016/j.ecoinf.2025.103245
- language
- English
- LU publication?
- yes
- id
- 41335b03-b18d-47b1-8b19-8773b7cffca2
- date added to LUP
- 2025-07-14 10:27:49
- date last changed
- 2025-07-14 10:28:43
@article{41335b03-b18d-47b1-8b19-8773b7cffca2,
  abstract = {{Chlorophyll-a (Chl-a) is a critical water quality indicator that reflects the eutrophication status of aquatic ecosystems. As the largest lake and a well-known attraction in Central Europe, Lake Balaton contributes 70 % or more of the local economy through tourism, while also maintaining a unique biodiversity. Therefore, long-term monitoring of its water quality is essential for effective management. With the longest global environmental record and a suitable spatial resolution, the Landsat satellite series is used to retrieve Chl-a in this study. However, the typically infrequent in-situ sampling and the ∼16-day revisit time of Landsat limit both the quality and the applicability of Landsat-based Chl-a retrieval. We first trained linear and several machine learning models using matchups between in-situ measurements and satellite data from the Landsat 4–9 missions between 1984 and 2023. To address the imbalanced-data problem, in which high-concentration samples are scarce because bloom events are rare, we extend the time tolerance, incorporate temporal information that conveys phenology, and apply an oversampling technique during training. Validated on Lake Balaton, where Chl-a concentrations have ranged from 5 to 260 μg/L since the 1980s, the Random Forest model achieves the best accuracy, with an R² of 0.86 and an RMSE of 8.16 μg/L. Oversampling improves accuracy by 9.5 % relative to the non-oversampled model, and combining all strategies improves overall accuracy by 21 %. The results also show a reasonable trade-off: extending the time tolerance from the same day to 3 days increases the number of matchups 3- to 8-fold, despite the high variability of Chl-a caused by the sinking and floating movement of algae. The enhancement framework can be applied to other lakes, especially those with sparse sampling and wide Chl-a fluctuations. We present an open-source online tool for historical and real-time Chl-a mapping, designed for both experts and the public. The code is customizable for lakes worldwide, and results are continuously showcased on the HUN-REN Balaton Limnological Research Institute's website and social media.}},
  author = {{Li, Huan and Somogyi, Boglárka and Chen, Xiaona and Luo, Zengliang and Blix, Katalin and Wu, Sirui and Duan, Zheng and Tóth, Viktor R.}},
  issn = {{1574-9541}},
  keywords = {{Applicability of Landsat; Integrating temporal information; Machine learning; Oversampling imbalanced dataset; Remote sensing; Water quality}},
  language = {{eng}},
  publisher = {{Elsevier}},
  journal = {{Ecological Informatics}},
  title = {{Leveraging Landsat and Google Earth Engine for long-term chlorophyll-a monitoring : A case study of Lake Balaton's water quality}},
  url = {{http://dx.doi.org/10.1016/j.ecoinf.2025.103245}},
  doi = {{10.1016/j.ecoinf.2025.103245}},
  volume = {{90}},
  year = {{2025}},
}