
Risk assessment of groundwater arsenic in Hetao Basin based on ensemble learning optimization
Yu FU, Wengeng CAO, Chunju ZHANG, Wenhua ZHAI, Yu REN, Tian NAN, Zeyan LI
Arsenic concentrations in the shallow groundwater of the Hetao Basin far exceed the standard, and the resulting pollution risk poses a serious health threat to local residents. At present, the risk distribution of high-arsenic groundwater is still poorly understood at the macroscopic scale. Using 605 shallow groundwater samples together with environmental factors describing the sedimentary environment, climate, human activities, soil physical and chemical characteristics, and hydrogeological conditions, a Stacking ensemble learning model for high-arsenic groundwater was constructed, with Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Support Vector Machine (SVM) as the base learners and Linear Discriminant Analysis (LDA) as the meta-learner. The ensemble model was used to predict the risk distribution of high-arsenic groundwater and to identify the key environmental factors that control this distribution in the region. The results show that 49.59% of the groundwater samples exceeded the arsenic standard (>10 μg/L), and that these samples were concentrated mainly in the paleochannel zone and flood fans of the Yellow River. The Stacking ensemble model was more reliable than RF, the best-performing single model: the Area Under the ROC Curve (AUC) and the accuracy increased by 1.1% and 3.2%, respectively. The high-risk area covers 5257 km², or 38.44% of the study area. The sedimentary environment is the key environmental factor affecting the risk distribution of high-arsenic groundwater, contributing up to 25.06% to model accuracy. These results provide a method and a reference for mapping the spatial distribution of high-arsenic groundwater pollution, and they have important implications for drinking-water safety and human health in the region.
Stacking ensemble learning / groundwater / high arsenic / risk distribution / Hetao Basin
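The abstract reports three kinds of quantities: an exceedance rate against the 10 μg/L threshold, and the AUC and accuracy used to compare models. The following is a minimal sketch of how such quantities can be computed, not the authors' code. The function names, the 0.5 classification cutoff, and all sample values are hypothetical and chosen only for illustration.

```python
from typing import Sequence

ARSENIC_LIMIT_UG_L = 10.0  # threshold quoted in the abstract (>10 μg/L)


def exceedance_rate(concentrations: Sequence[float],
                    limit: float = ARSENIC_LIMIT_UG_L) -> float:
    """Fraction of samples whose concentration is strictly above the limit."""
    return sum(c > limit for c in concentrations) / len(concentrations)


def accuracy(y_true: Sequence[int], y_score: Sequence[float],
             cutoff: float = 0.5) -> float:
    """Share of correct labels when scores >= cutoff are classed as high arsenic.

    The 0.5 cutoff is an assumption; the abstract does not state one.
    """
    hits = sum((s >= cutoff) == bool(y) for y, s in zip(y_true, y_score))
    return hits / len(y_true)


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random positive sample outranks a
    random negative one, with ties counted as one half."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical arsenic concentrations in μg/L (not data from the study).
samples = [3.2, 12.5, 48.0, 9.9, 10.0, 151.0]
labels = [int(c > ARSENIC_LIMIT_UG_L) for c in samples]

# Hypothetical model scores: predicted probability of exceeding the limit.
scores = [0.2, 0.6, 0.9, 0.4, 0.7, 0.8]

rate = exceedance_rate(samples)
acc = accuracy(labels, scores)
auc = roc_auc(labels, scores)
print(f"exceedance rate = {rate:.2%}, accuracy = {acc:.3f}, AUC = {auc:.3f}")
```

Note that a sample at exactly 10.0 μg/L does not count as exceeding the limit, because the abstract uses a strict ">" comparison. If the reported rate is computed over all 605 samples, 49.59% is consistent with 300 exceeding samples (300/605 ≈ 0.4959).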