Improved TDS forecasting in data-scarce regions using CEEMDAN and AI-driven hydro-climatic analysis
(2025) In Environmental Modelling and Software 192.
- Abstract
Total dissolved solids (TDS) are a key water quality parameter, reflecting the concentration of dissolved salts in aquatic systems. Accurate TDS forecasting is essential for sustainable water resource management, particularly in data-scarce regions. This study proposes a novel and generalized AI-based framework to forecast TDS up to six months ahead using a limited set of hydro-climatic input variables. The methodology combines Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) for signal denoising and pattern extraction with advanced machine learning models, including Random Forest (RF) and a hybrid Grey Wolf Optimization–Support Vector Machine (GWO-SVM). To enhance model transferability, only four widely available input variables (precipitation, evaporation, discharge, and chloride concentration) were used. Historical data from 1975 to 2016 were collected from three hydrometric stations representing distinct climatic conditions. Forecasting was conducted both with and without the inclusion of lagged TDS values. The CEEMDAN-GWO-Linear SVM model achieved high accuracy (R² = 0.70–0.96) across different forecast horizons. Additionally, CEEMDAN significantly improved the predictive performance of both SVM and RF models. Feature importance analysis using RF ranked chloride concentration, discharge, precipitation, and evaporation as the most influential variables in TDS prediction. The proposed framework offers a robust, data-efficient solution for mid-term water quality forecasting.
- author
- Sayadi, Maryam (LU); Hessari, Behzad; Montaseri, Majid and Naghibi, Amir (LU)
- publishing date
- 2025-08
- type
- Contribution to journal
- publication status
- published
- subject
- keywords
- CEEMDAN, Random forests, Support vector machine, Time series modeling, Water quality
- in
- Environmental Modelling and Software
- volume
- 192
- article number
- 106560
- publisher
- Elsevier
- external identifiers
- scopus:105009020796
- ISSN
- 1364-8152
- DOI
- 10.1016/j.envsoft.2025.106560
- language
- English
- LU publication?
- yes
- id
- 972227fb-0286-456e-bcaa-4781ba541fce
- date added to LUP
- 2025-11-05 09:43:22
- date last changed
- 2025-11-05 09:43:44
@article{972227fb-0286-456e-bcaa-4781ba541fce,
abstract = {{Total dissolved solids (TDS) are a key water quality parameter, reflecting the concentration of dissolved salts in aquatic systems. Accurate TDS forecasting is essential for sustainable water resource management, particularly in data-scarce regions. This study proposes a novel and generalized AI-based framework to forecast TDS up to six months ahead using a limited set of hydro-climatic input variables. The methodology combines Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) for signal denoising and pattern extraction with advanced machine learning models, including Random Forest (RF) and a hybrid Grey Wolf Optimization–Support Vector Machine (GWO-SVM). To enhance model transferability, only four widely available input variables (precipitation, evaporation, discharge, and chloride concentration) were used. Historical data from 1975 to 2016 were collected from three hydrometric stations representing distinct climatic conditions. Forecasting was conducted both with and without the inclusion of lagged TDS values. The CEEMDAN-GWO-Linear SVM model achieved high accuracy ($R^2$ = 0.70–0.96) across different forecast horizons. Additionally, CEEMDAN significantly improved the predictive performance of both SVM and RF models. Feature importance analysis using RF ranked chloride concentration, discharge, precipitation, and evaporation as the most influential variables in TDS prediction. The proposed framework offers a robust, data-efficient solution for mid-term water quality forecasting.}},
author = {{Sayadi, Maryam and Hessari, Behzad and Montaseri, Majid and Naghibi, Amir}},
issn = {{1364-8152}},
keywords = {{CEEMDAN; Random forests; Support vector machine; Time series modeling; Water quality}},
language = {{eng}},
publisher = {{Elsevier}},
journal = {{Environmental Modelling and Software}},
title = {{Improved TDS forecasting in data-scarce regions using CEEMDAN and AI-driven hydro-climatic analysis}},
url = {{https://doi.org/10.1016/j.envsoft.2025.106560}},
doi = {{10.1016/j.envsoft.2025.106560}},
volume = {{192}},
year = {{2025}},
}