Prediction of chlorophyll a concentration and analyzing its relationship with environmental factors in lake Vänern using random forest algorithm
(2026) In Limnology 27(2). p.235-250
- Abstract
Chlorophyll a concentration is a crucial indicator of marine primary productivity and its accurate prediction is vital for the early warning systems in marine ecosystem. Present work focused to examine the relationship between chlorophyll a concentration and environmental factors in Lake Vänern and to predict the chlorophyll a concentration using Random Forest model and Generalized linear model during the period 2003 to 2023. The random forest model considers the optical (diffuse attenuation coefficient at 490 nm, Normalized Fluorescence Line Height) and meteorological parameters (Precipitation, surface air temperature, wind speed etc.) of Lake Vänern as inputs and further used feature importance ranking and partial dependence plots to identify the dominant drivers of chlorophyll a concentration. Ranking of feature importance indicates the close relationship between response variable (chlorophyll a concentration) and high importance feature in random forest model. From results it is observed that the random forest model achieved a high prediction accuracy, with a coefficient of determination (R²) of 0.98 and a root mean square error (RMSE) of 0.005 mg m⁻³. From Random Forest model and Generalized linear model it is observed that particulate inorganic carbon and water temperature are the dominant drivers in limiting the chlorophyll a concentration in Lake Vänern. This study offers a valuable approach for accurately predicting chlorophyll a concentration, aiding in the control and prevention of harmful lake blooms thereby reducing their adverse impacts on the marine ecosystem and the surrounding environment.
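The abstract reports model accuracy as a coefficient of determination (R²) and a root mean square error (RMSE). As a reading aid, the sketch below shows the standard definitions of both metrics in plain Python. The function names and the sample values are hypothetical, invented for illustration; they are not data or code from the paper.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean square error, in the same units as the inputs."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical chlorophyll a values in mg m^-3 (invented, not from the study).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]

print("R2  :", r_squared(observed, predicted))
print("RMSE:", rmse(observed, predicted))
```

RMSE carries the units of the predicted quantity (here mg m⁻³), while R² is unitless, which is why the abstract quotes a unit for only one of the two figures.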
- author
- Budakoti, Sachin LU and Pal, Mahendra LU
- organization
- publishing date
- 2026
- type
- Contribution to journal
- publication status
- published
- subject
- keywords
- Chlorophyll a, Feature important ranking, Generalized linear model, Lake blooms, Random forest model
- in
- Limnology
- volume
- 27
- issue
- 2
- pages
- 235 - 250
- publisher
- Springer
- external identifiers
- scopus:105022833135
- ISSN
- 1439-8621
- DOI
- 10.1007/s10201-025-00822-8
- language
- English
- LU publication?
- yes
- id
- 9f59bcdc-7f13-4e49-8386-9cb853ad7a8c
- date added to LUP
- 2026-02-05 13:42:15
- date last changed
- 2026-06-10 09:10:59
@article{9f59bcdc-7f13-4e49-8386-9cb853ad7a8c,
abstract = {{Chlorophyll a concentration is a crucial indicator of marine primary productivity and its accurate prediction is vital for the early warning systems in marine ecosystem. Present work focused to examine the relationship between chlorophyll a concentration and environmental factors in Lake Vänern and to predict the chlorophyll a concentration using Random Forest model and Generalized linear model during the period 2003 to 2023. The random forest model considers the optical (diffuse attenuation coefficient at 490 nm, Normalized Fluorescence Line Height) and meteorological parameters (Precipitation, surface air temperature, wind speed etc.) of Lake Vänern as inputs and further used feature importance ranking and partial dependence plots to identify the dominant drivers of chlorophyll a concentration. Ranking of feature importance indicates the close relationship between response variable (chlorophyll a concentration) and high importance feature in random forest model. From results it is observed that the random forest model achieved a high prediction accuracy, with a coefficient of determination (R$^{2}$) of 0.98 and a root mean square error (RMSE) of 0.005 mg m$^{-3}$. From Random Forest model and Generalized linear model it is observed that particulate inorganic carbon and water temperature are the dominant drivers in limiting the chlorophyll a concentration in Lake Vänern. This study offers a valuable approach for accurately predicting chlorophyll a concentration, aiding in the control and prevention of harmful lake blooms thereby reducing their adverse impacts on the marine ecosystem and the surrounding environment.}},
author = {{Budakoti, Sachin and Pal, Mahendra}},
issn = {{1439-8621}},
keywords = {{Chlorophyll a; Feature important ranking; Generalized linear model; Lake blooms; Random forest model}},
language = {{eng}},
number = {{2}},
pages = {{235--250}},
publisher = {{Springer}},
journal = {{Limnology}},
title = {{Prediction of chlorophyll a concentration and analyzing its relationship with environmental factors in lake Vänern using random forest algorithm}},
url = {{http://dx.doi.org/10.1007/s10201-025-00822-8}},
doi = {{10.1007/s10201-025-00822-8}},
volume = {{27}},
year = {{2026}},
}