Abstract:
Establishing a spatial prediction model for soil organic matter (SOM) makes it possible to predict the spatial distribution of SOM content accurately, which is important for scientific soil management and for enhancing ecosystem services. Focusing on the soils of Yonghe County, Linfen City, Shanxi Province, this study extracted topographic factors and vegetation indices from a digital elevation model (DEM) and vegetation remote sensing data. These factors, together with soil attributes, formed the candidate variables, from which the Boruta algorithm selected the characteristic variables that correlate strongly with SOM as auxiliary variables. The auxiliary variables were used as model inputs and the measured SOM values as model outputs. The SOM content of the training-set samples was predicted separately using the ordinary Kriging (OK) method, the back propagation neural network (BPNN), the genetic algorithm-optimized BPNN (GA-BPNN), and the improved BPNN combined with the geostatistical method (the GA-BPNN-OK method). Prediction accuracy was then compared using the validation-set samples. The results show that: (1) The Boruta algorithm ranked the selected characteristic variables by importance in the order total nitrogen > topographic wetness index (TWI) > elevation > slope > normalized difference vegetation index (NDVI) > enhanced vegetation index (EVI); (2) Despite local differences, the SOM predictions obtained using the four methods exhibited roughly the same overall spatial distribution: low in the western and southwestern portions of the study area but high in the eastern and southeastern portions; (3) Compared to the other three models, the GA-BPNN-OK model produced more distinct low- and high-value areas in the predicted SOM distribution.
(4) A comparison of the prediction accuracy indices shows that the GA-BPNN-OK method yielded the lowest root mean square error (RMSE) of 0.059, the lowest mean absolute error (MAE) of 0.240, the lowest mean relative error (MRE) of 0.165, and the highest coefficient of determination (R²) of 0.78. To verify the effect of the Boruta algorithm on model accuracy, the full set of candidate variables and the subset determined through characteristic selection were each used as inputs to the GA-BPNN method. The comparison of the prediction results indicates that the Boruta algorithm reduced the model error. Therefore, among the approaches compared, the combination of the Boruta algorithm and the GA-BPNN-OK method gave the best prediction of the spatial distribution of SOM content.
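
The four accuracy indices named above can be computed from validation-set observations and predictions roughly as follows. This is a minimal sketch rather than the study's own code: the function name `accuracy_indices` and the sample values are hypothetical, MRE is assumed to be the mean of absolute errors divided by the observed values, and R² is assumed to be 1 minus the ratio of residual to total sum of squares (the abstract does not state either definition).

```python
import numpy as np


def accuracy_indices(observed, predicted):
    """Return RMSE, MAE, MRE and R2 for paired observed/predicted values.

    Assumed definitions (not given in the abstract):
      MRE = mean(|pred - obs| / |obs|)
      R2  = 1 - SS_res / SS_tot
    """
    obs = np.asarray(observed, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    err = pred - obs

    rmse = float(np.sqrt(np.mean(err ** 2)))         # root mean square error
    mae = float(np.mean(np.abs(err)))                # mean absolute error
    mre = float(np.mean(np.abs(err) / np.abs(obs)))  # mean relative error

    ss_res = float(np.sum(err ** 2))                 # residual sum of squares
    ss_tot = float(np.sum((obs - obs.mean()) ** 2))  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot

    return {"RMSE": rmse, "MAE": mae, "MRE": mre, "R2": r2}


# Hypothetical validation values, for illustration only (not data from the study).
observed = [10.0, 12.0, 8.0, 15.0]
predicted = [11.0, 11.0, 9.0, 14.0]
indices = accuracy_indices(observed, predicted)
```

Lower RMSE, MAE, and MRE and a higher R² indicate a better fit, which is the sense in which the four methods are compared on the validation set.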