Predicting the spatial distribution of soil organic matter using the model consisting of the Boruta algorithm and the optimized GA combined with the geostatistical method
GAO Peng-Li1, REN Da-Lu2, LI Chao-Hui3, FENG Zhi-Qiang1,4, MIAO Hong-Yun2, QIAO Lin2, WANG Jian-Wu4, YANG Yong-Liang4, ZHANG Li-Ming4, LI Guang-Hui5
1. Shanxi Province Key Laboratory of Metallogeny and Assessment of Strategic Mineral Resources, Department of Earth Science and Engineering, Taiyuan University of Technology, Taiyuan 030024, China
2. No. 213 Geology Team of Shanxi Provincial Geological Prospecting Bureau, Linfen 041000, China
3. The Third Geological Exploration Institute, General Administration of Metallurgical Geology of China, Taiyuan 030006, China
4. Shanxi Institute of Geological Survey Co., Ltd., Taiyuan 030006, China
5. College of Physics and Electronic Engineering, Shanxi University, Taiyuan 030006, China
Abstract A spatial prediction model for soil organic matter (SOM) makes it possible to predict the spatial distribution of SOM content accurately, which plays a significant role in scientific soil management and the enhancement of ecosystem services. Focusing on the soils of Yonghe County, Linfen City, Shanxi Province, this study extracted topographic factors and vegetation indices from a digital elevation model (DEM) and vegetation remote sensing data. Together with soil attributes, these served as candidate variables, from which the Boruta algorithm selected the characteristic variables that correlate strongly with SOM as auxiliary variables. The auxiliary variables were used as model input and the measured SOM values as model output. The SOM content was predicted from the training-set samples using four methods separately: ordinary Kriging (OK), the back propagation neural network (BPNN), the genetic algorithm-optimized BPNN (GA-BPNN), and the GA-BPNN combined with the geostatistical method (GA-BPNN-OK). The prediction accuracy of the four methods was then compared using the samples in the validation set. The results show that: (1) The Boruta algorithm ranked the selected characteristic variables by importance as total nitrogen > topographic wetness index (TWI) > elevation > slope > normalized difference vegetation index (NDVI) > enhanced vegetation index (EVI); (2) Despite local differences, the SOM predictions obtained with the four methods exhibited roughly the same overall spatial distribution: low in the western and southwestern portions of the study area but high in the eastern and southeastern portions; (3) Compared to the other three models, the GA-BPNN-OK model showed more distinct low- and high-value areas in the predicted SOM distribution; (4) Among the prediction accuracy indices, the GA-BPNN-OK method yielded the lowest mean square error (MSE, 0.059), the lowest root mean square error (RMSE, 0.240), the lowest mean absolute error (MAE, 0.165), and the highest fitting coefficient (R2, 0.78). To verify the effect of the Boruta algorithm on model accuracy, the GA-BPNN method was also run with all candidate variables as input and compared with the run that used only the selected variables. The comparison indicates that the Boruta algorithm reduced the model error (RMSE of 0.251 with the selected variables versus 0.287 with all variables). Therefore, the Boruta algorithm combined with the GA-BPNN-OK method constitutes the optimal prediction model for the spatial distribution of SOM content.
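The abstract does not spell out how the GA-BPNN output and the geostatistical step are combined. A minimal sketch, assuming the common residual-kriging formulation (this form and all symbols below are illustrative assumptions, not taken from the paper):

```latex
\hat{Z}_{\text{GA-BPNN-OK}}(s_0)
  = \hat{f}_{\text{GA-BPNN}}\bigl(\mathbf{x}(s_0)\bigr) + \hat{e}_{\text{OK}}(s_0),
\qquad
\hat{e}_{\text{OK}}(s_0) = \sum_{i=1}^{n} \lambda_i \, e(s_i),
\quad \sum_{i=1}^{n} \lambda_i = 1
```

Here $\mathbf{x}(s_0)$ denotes the auxiliary variables at location $s_0$, $e(s_i) = Z(s_i) - \hat{f}_{\text{GA-BPNN}}(\mathbf{x}(s_i))$ is the network residual at training sample $s_i$, and $\lambda_i$ are the ordinary Kriging weights obtained from the residual semi-variogram.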
Received: 14 March 2023
Published: 27 June 2024
General situation of the study area and distribution of soil samples
GA-BPNN (a) and BPNN (b) flow chart
Technology roadmap
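| Dataset | Number of samples | Maximum/10⁻³ | Minimum/10⁻³ | Mean/10⁻³ | Standard deviation/10⁻³ | Skewness | K-S test | Coefficient of variation/% |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| All data | 439 | 16.36 | 2.24 | 8.34 | 2.97 | 0.35 | 0.012 | 35.61 |
| Training set | 351 | 16.26 | 2.24 | 8.42 | 3.03 | 0.32 | 0.014 | 35.99 |
| Validation set | 88 | 15.29 | 2.83 | 8.02 | 2.68 | 0.40 | 0.200 | 33.42 |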
Descriptive statistics of SOM content
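The coefficient of variation in the table above is the standard deviation divided by the mean, expressed as a percentage. A minimal check of that relation, using the mean and standard deviation values copied from the table (the function and variable names are illustrative):

```python
def coefficient_of_variation(mean_value, std_dev):
    """Return the coefficient of variation in percent (SD / mean * 100)."""
    return std_dev / mean_value * 100.0


# (mean, standard deviation) pairs copied from the descriptive statistics table
stats = {
    "all": (8.34, 2.97),
    "training": (8.42, 3.03),
    "validation": (8.02, 2.68),
}

cv = {name: round(coefficient_of_variation(m, s), 2) for name, (m, s) in stats.items()}
print(cv)  # {'all': 35.61, 'training': 35.99, 'validation': 33.42}
```

The recomputed values match the tabulated coefficients of variation for all three datasets.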
Feature selection results of Boruta algorithm
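The core idea behind Boruta is to compare each real variable against shuffled "shadow" copies that cannot carry any real signal. The sketch below illustrates only that idea and is not the authors' implementation: it swaps the random-forest importance used by Boruta for a plain absolute Pearson correlation, replaces the statistical test with a simple hit count, and runs on made-up data (all names and values are hypothetical).

```python
import random
from statistics import mean


def abs_pearson(xs, ys):
    """Absolute Pearson correlation; stands in for random-forest importance."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5


def shadow_hits(features, target, n_iter=50, seed=0):
    """Count how often each feature beats the best shuffled (shadow) feature."""
    rng = random.Random(seed)
    hits = {name: 0 for name in features}
    for _ in range(n_iter):
        shadow_scores = []
        for values in features.values():
            shadow = list(values)
            rng.shuffle(shadow)  # shuffling breaks any link to the target
            shadow_scores.append(abs_pearson(shadow, target))
        threshold = max(shadow_scores)
        for name, values in features.items():
            if abs_pearson(values, target) > threshold:
                hits[name] += 1
    return hits


# Hypothetical data: "tn" is linearly related to the target, "noise" is not.
target = [float(i) for i in range(30)]
features = {
    "tn": [2.0 * t + 1.0 for t in target],
    "noise": [float((7 * i) % 11) for i in range(30)],
}
hits = shadow_hits(features, target)
print(hits)
```

A variable that beats the best shadow in nearly every iteration would be kept; one that rarely does would be rejected. The real algorithm decides this with a statistical test over the hit counts rather than a fixed cut-off.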
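| Data item | Theoretical model | Nugget (C0) | Sill (C) | Nugget-to-sill ratio [C0/(C0+C)]/% | Range/m | Coefficient of determination (r) |
| --- | --- | --- | --- | --- | --- | --- |
| Measured SOM | Exponential | 0.11 | 0.28 | 61.90 | 6985.92 | 0.23 |
| GA-BPNN residuals (measured values) | Spherical | 6.73 | 14.36 | 83.10 | 1110.00 | 0.74 |
| GA-BPNN residuals | Spherical | 4.66 | 12.27 | 62.00 | 1570.00 | 0.60 |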
SOM, GA-BPNN, and residual semi-variogram parameters
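The two theoretical models named in the table, exponential and spherical, can be written as short functions. This is a sketch under stated assumptions: the tabulated C is treated as the partial sill (following the C0/(C0+C) header), the exponential model uses the range parameter directly rather than the "practical range" convention with a factor of 3, and the parameter names are illustrative.

```python
import math


def spherical(h, nugget, partial_sill, range_m):
    """Spherical semi-variogram: rises to nugget + partial_sill at the range."""
    if h <= 0:
        return 0.0
    if h >= range_m:
        return nugget + partial_sill
    r = h / range_m
    return nugget + partial_sill * (1.5 * r - 0.5 * r ** 3)


def exponential(h, nugget, partial_sill, range_m):
    """Exponential semi-variogram: approaches nugget + partial_sill asymptotically."""
    if h <= 0:
        return 0.0
    return nugget + partial_sill * (1.0 - math.exp(-h / range_m))


# Parameters copied from the table (GA-BPNN residual row and measured-SOM row)
gamma_half_range = spherical(785.0, 4.66, 12.27, 1570.0)
gamma_at_range = spherical(1570.0, 4.66, 12.27, 1570.0)
gamma_som = exponential(6985.92, 0.11, 0.28, 6985.92)
print(gamma_half_range, gamma_at_range, gamma_som)
```

Beyond the range, the spherical model stays flat at the sill, which is why samples farther apart than that distance are treated as spatially uncorrelated.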
| Prediction method | Maximum/10⁻³ | Minimum/10⁻³ | Mean/10⁻³ |
| --- | --- | --- | --- |
| OK | 13.38 | 4.89 | 8.21 |
| BPNN | 15.74 | 3.24 | 8.40 |
| GA-BPNN | 16.76 | 2.62 | 8.31 |
| GA-BPNN-OK | 15.95 | 2.28 | 8.26 |
Statistical characteristics of prediction results of soil organic matter content
Spatial distribution of soil organic matter content in Yonghe County, Shanxi Province
| Prediction method | Mean square error (MSE) | Root mean square error (RMSE) | Mean absolute error (MAE) | Fitting coefficient (R2) |
| --- | --- | --- | --- | --- |
| OK | 0.560 | 1.016 | 0.551 | 0.29 |
| BPNN | 0.104 | 0.281 | 0.182 | 0.69 |
| GA-BPNN | 0.063 | 0.251 | 0.170 | 0.74 |
| GA-BPNN-OK | 0.059 | 0.240 | 0.165 | 0.78 |
| GA-BPNN with all variables | 0.082 | 0.287 | 0.206 | 0.65 |
Statistical analysis of accuracy of each prediction model
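For reference, the accuracy indices in the table can be computed from paired observed and predicted values as sketched below. The sample values are hypothetical, and R2 is computed here as 1 minus the ratio of residual to total sum of squares, which is an assumption: the paper's "fitting coefficient" may instead be a squared correlation.

```python
import math


def accuracy_metrics(observed, predicted):
    """Return MSE, RMSE, MAE and R2 for paired observed/predicted values."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    ss_res = sum(e * e for e in errors)
    mse = ss_res / n
    mae = sum(abs(e) for e in errors) / n
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return {
        "MSE": mse,
        "RMSE": math.sqrt(mse),  # RMSE is the square root of MSE
        "MAE": mae,
        "R2": 1.0 - ss_res / ss_tot,
    }


# Hypothetical validation values, not data from the study
observed = [8.0, 9.0, 10.0, 11.0]
predicted = [8.5, 8.5, 10.5, 10.5]
metrics = accuracy_metrics(observed, predicted)
print(metrics)
```

Because RMSE is the square root of MSE, the two columns can be cross-checked against each other: for GA-BPNN-OK, the square root of 0.059 is about 0.243, close to the tabulated 0.240.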