Estimation of Water Quality Parameters Using an Ensemble Learning Model Optimized with Levy Flight and Sparrow Search Algorithms
Author: LI Aimin, KANG Xuan, YUAN Zheng, WANG Hailong, YAN Xiangyu, XU Youcheng
Affiliation:

1. School of Geo-Science and Technology, Zhengzhou University, Zhengzhou 450001, China; 2. School of Water Conservancy and Transportation, Zhengzhou University, Zhengzhou 450001, China

CLC Number:

TP751.1;TP79

    Abstract:

    Because water bodies are optically complex and water quality parameters interact with one another, ensemble machine learning methods are well suited to estimating those parameters. Selecting hyperparameters during modeling, however, remains challenging. The sparrow search algorithm (SSA) can rapidly search for optimal parameters of ensemble machine learning models, while the Levy flight algorithm prevents SSA from becoming trapped in local optima, thereby improving model accuracy and efficiency. In this paper, the Levy flight algorithm and SSA were used together to optimize three ensemble learning models: random forest (RF), AdaBoost regression (ABR), and CatBoost regression (CBR). Taking the Dongfeng Canal and Xiong'er River in Zhengzhou as the study area, three estimation models (LSSA-RF, LSSA-ABR, and LSSA-CBR) were developed from measured chlorophyll-a and total suspended solids concentrations. The experimental results show that the evaluation indicators improved to varying degrees after optimization, with the LSSA-CBR model performing best. CBR, which is built on the gradient boosting framework, shows a higher learning capability than RF and ABR. For chlorophyll-a, the LSSA-CBR model achieves a root mean square error (RMSE) of 2.325 μg·L⁻¹ and a coefficient of determination (R²) of 0.896. For total suspended solids, it achieves an RMSE of 1.598 mg·L⁻¹ and an R² of 0.882. Finally, the LSSA-CBR model was applied to Planet images to evaluate the spatial distribution of chlorophyll-a and total suspended solids in the rivers, providing a reference for quickly understanding the distribution of urban river water quality and for water quality assessment and management.
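
The two ingredients named in the abstract, a Levy flight step and the RMSE and R² metrics, can be illustrated with a minimal sketch. This is not the authors' implementation. The step sampler uses Mantegna's algorithm, a common way to draw Levy-distributed steps, and the abstract does not say which sampler was used. The function names, the stability index `beta=1.5`, the `scale` factor, and the bound-clipping rule are all assumptions made for illustration.

```python
import math
import random


def levy_step(beta=1.5, rng=random):
    """Draw one heavy-tailed step using Mantegna's algorithm (assumed sampler)."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma_u = (num / den) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)  # numerator sample, N(0, sigma_u^2)
    v = rng.gauss(0.0, 1.0)      # denominator sample, N(0, 1)
    return u / abs(v) ** (1 / beta)


def levy_perturb(position, lower, upper, scale=0.01, beta=1.5, rng=random):
    """Perturb a candidate hyperparameter vector with Levy steps.

    Hypothetical update rule: each coordinate moves by a Levy step scaled
    to the width of its search range, then is clipped back into bounds.
    """
    moved = []
    for x, lo, hi in zip(position, lower, upper):
        candidate = x + scale * (hi - lo) * levy_step(beta, rng)
        moved.append(min(max(candidate, lo), hi))
    return moved


def rmse(y_true, y_pred):
    """Root mean square error between measured and estimated values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2(y_true, y_pred):
    """Coefficient of determination; assumes y_true is not constant."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

In a search loop of this kind, occasional large Levy steps let a candidate jump away from the current neighborhood, which is the usual argument for why such steps help a swarm optimizer escape local optima. `rmse` or `r2` on held-out samples would then serve as the fitness value for each candidate hyperparameter vector.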

Get Citation

LI Aimin, KANG Xuan, YUAN Zheng, WANG Hailong, YAN Xiangyu, XU Youcheng. Estimation of Water Quality Parameters Using an Ensemble Learning Model Optimized with Levy Flight and Sparrow Search Algorithms[J]. Journal of Tongji University (Natural Science), 2025, 53(3): 450-461

History
  • Received: August 09, 2023
  • Online: April 02, 2025