Prediction and comparison of organic carbon content in topsoils based on geostatistics and machine learning models: A case study of Baoqing County
LIU Hong-Bo1,2, SHI Jia-Hui3, WANG Si-Yin3, PEI Jiu-Bo3
1. Mudanjiang Natural Resources Comprehensive Survey Center, China Geological Survey, Mudanjiang 157000, China 2. Hulunbuir Black Soil Critical Zone Scientific Observation and Research Station, Hulunbuir 021599, China 3. College of Land and Environment, Shenyang Agricultural University, Shenyang 110866, China
This study aims to accurately predict the organic carbon content of black soils at the county level, thereby supporting county-level agricultural production and carbon peak and carbon neutrality goals. A total of 427 soil samples from a surface substrate survey of the black soil area in Baoqing County were analyzed. Three approaches were used to build models of topsoil organic carbon content in Baoqing County and to compare their prediction accuracy and performance: deterministic interpolation (inverse distance weighting, IDW), geostatistics (ordinary Kriging, OK), and machine learning (random forest, RF). The IDW, OK, and RF models yielded average organic carbon contents of 27.21×10⁻³, 26.33×10⁻³, and 32.05×10⁻³, respectively. The RF model outperformed the other two models in root mean square error (RMSE), mean absolute error (MAE), and coefficient of determination (R²), achieving R² values of 0.73 and 0.53 on the training and validation sets, respectively. This performance suggests that the RF model captures potential patterns in the data more fully through the nonlinear interaction of environmental variables. Overall, the RF model, which incorporates multiple environmental variables, was the best of the three approaches for predicting topsoil organic carbon content in Baoqing County. The study provides theoretical and methodological insights for assessing spatial variation in soil organic matter relevant to county-level agricultural production and regional differences in carbon peak and carbon neutrality goals within black soil areas.
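The comparison described above rests on two ingredients: an interpolator that estimates a value at an unsampled location, and the three accuracy metrics (RMSE, MAE, R²). The following is a minimal sketch of IDW with leave-one-out evaluation, not the authors' implementation. The function names, the distance power of 2, the leave-one-out scheme, and the four sample points are all assumptions chosen for illustration; the study's own parameters and validation split are not given in the abstract.

```python
import math


def idw_predict(known, target, power=2.0):
    """Inverse distance weighting estimate at `target` from (x, y, value) samples.

    `power` is an assumed default; the abstract does not state the value used.
    """
    num = 0.0
    den = 0.0
    for x, y, value in known:
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return value  # target coincides with a sample location
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den


def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def r2(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


# Hypothetical (x, y, organic carbon) samples, not data from the study.
samples = [(0.0, 0.0, 20.0), (1.0, 0.0, 30.0), (0.0, 1.0, 25.0), (1.0, 1.0, 35.0)]

# Leave-one-out: predict each sample from all the others, then score.
observed = [v for _, _, v in samples]
predicted = [
    idw_predict(samples[:i] + samples[i + 1:], (x, y))
    for i, (x, y, _) in enumerate(samples)
]

print("RMSE:", rmse(observed, predicted))
print("MAE:", mae(observed, predicted))
print("R2:", r2(observed, predicted))
```

The same three metric functions can score any of the models, which is what makes a like-for-like comparison of IDW, OK, and RF possible. Note that R² can be negative when a model predicts worse than the sample mean, so it is not bounded below by zero.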
LIU Hong-Bo, SHI Jia-Hui, WANG Si-Yin, PEI Jiu-Bo. Prediction and comparison of organic carbon content in topsoils based on geostatistics and machine learning models: A case study of Baoqing County. Geophysical and Geochemical Exploration, 2025, 49(5): 1243-1250.