Abstract:
This study aims to accurately predict the organic carbon content of black soils at the county level, thereby supporting county-level agricultural production and the carbon peak and neutrality goals. It examined 427 soil samples obtained from a surface substrate survey of the black soil area in Baoqing County. Using deterministic interpolation (inverse distance weighting, IDW), geostatistics (ordinary Kriging, OK), and machine learning (random forest, RF), the study constructed models to predict the organic carbon content of topsoils in Baoqing County and compared their prediction accuracy and performance. The IDW, OK, and RF models yielded average organic carbon contents of 27.21×10⁻³, 26.33×10⁻³, and 32.05×10⁻³, respectively. The RF model outperformed the other two models in terms of root mean square error (RMSE), mean absolute error (MAE), and the coefficient of determination (R²). Specifically, the RF model achieved R² values of 0.73 and 0.53 on the training and validation sets, respectively, indicating higher accuracy than the IDW and OK models. This superior performance suggests that the RF model can more fully capture latent patterns in the data through nonlinear interactions among environmental variables. Overall, the RF model, which incorporates multiple environmental variables, proved to be the best of the three approaches for predicting the organic carbon content of topsoils in Baoqing County. This study provides theoretical and methodological insights for assessing the spatial variations in soil organic matter relevant to county-level agricultural production and to regional differences in carbon peak and neutrality goals within black soil areas.
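
For readers unfamiliar with the methods compared above, the following is a minimal sketch of the IDW predictor and the three accuracy metrics (RMSE, MAE, R²). It is an illustration under stated assumptions, not the study's implementation: the function names, the distance power of 2, and the choice of R² as 1 − SS_res/SS_tot are all assumptions, since the abstract does not specify them.

```python
import math

def idw_predict(samples, x, y, power=2.0):
    """Inverse distance weighting at point (x, y).

    samples: list of (xi, yi, value) tuples for the observed points.
    power:   distance exponent; 2.0 is a common default (an assumption here).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for xi, yi, value in samples:
        dist = math.hypot(x - xi, y - yi)
        if dist == 0.0:
            # The query coincides with a sample: return the observed value.
            return value
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

def rmse(observed, predicted):
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def mae(observed, predicted):
    """Mean absolute error."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n

def r2(observed, predicted):
    """Coefficient of determination, defined as 1 - SS_res / SS_tot (assumed)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

In this sketch, a model would be scored by predicting at held-out sample locations and passing the observed and predicted values to `rmse`, `mae`, and `r2`; lower RMSE and MAE and higher R² indicate a better fit.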