Log-based lithology identification using the SMOTE-LSTM hybrid model
HUANG Liang1, CHEN Xuan-Yi2, JIANG Zhen-Jiao1, WANG Jin-Xin1, ZHANG Chen-Yu1, SONG Gen-Fa1
1. College of New Energy and Environment, Jilin University, Changchun 130021, China
2. China Yangtze Power Co., Ltd., Wuhan 100032, China
Abstract Artificial intelligence algorithms have been developed to automatically identify the spatial structure of formation lithologies from multivariate log data. They offer a promising way to reduce lithology logging costs and to mitigate the subjectivity inherent in lithology identification. To address the imbalanced distribution of lithology samples and the spatiotemporal variability in the relationships between log attributes and lithologies, this study constructed a hybrid model that combines the synthetic minority oversampling technique (SMOTE) with a long short-term memory (LSTM) network. The SMOTE algorithm balances the sample distributions of the different lithologies, while the LSTM algorithm, using its deep learning architecture, extracts lithological characteristics from the log sequence data. Trained on borehole log data and lithology records from a sandstone uranium deposit, the SMOTE-LSTM hybrid model achieved a prediction accuracy exceeding 85% in lithology classification. Compared with several other machine learning methods, the SMOTE-LSTM hybrid model showed significantly improved accuracy and reliability in lithology identification.
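The oversampling step described in the abstract can be illustrated with a minimal sketch of the general SMOTE idea: each synthetic sample is placed on the line segment between a minority-class sample and one of its nearest minority-class neighbours. This is not the authors' implementation. The function name `smote`, the parameter values, and the two-attribute log readings are assumptions chosen only for illustration.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Generate n_synthetic points by interpolating between minority samples.

    minority: list of feature vectors (lists of floats) from one class.
    Each synthetic point lies on the segment between a randomly chosen
    sample and one of its k nearest minority-class neighbours.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical, already-normalised readings of two log attributes
# for an under-represented lithology (values invented for illustration).
minority_logs = [[0.20, 0.80], [0.25, 0.75], [0.30, 0.85], [0.22, 0.90]]
new_samples = smote(minority_logs, n_synthetic=6, k=2)
```

Because every synthetic point is an interpolation between two real minority samples, it stays inside the range spanned by that class. Oversampling is normally applied to the training data only, so that evaluation wells keep their original class proportions.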
|
|
Received: 26 December 2024
Published: 30 December 2025
Corresponding Authors:
JIANG Zhen-Jiao
E-mail: h_liangn@163.com;zjjiang@jlu.edu.cn
Distribution of logging positions in the study area (a), frequency distribution of the logging data (b), and correlation coefficients between log attributes (c)
|
|
Flow chart of the SMOTE-LSTM algorithm compared with the main machine learning algorithms
|
|
Training-set and validation-set accuracy of the SMOTE-LSTM model as a function of (a) the number of LSTM network layers, (b) the type of optimizer, (c) the number of neural units in the hidden layer, and (d) the batch size
|
|
Accuracy trends of the model on the training and validation sets (a) and confusion matrix of the model's classification results (b)
|
|
Comparison of test-well accuracy before and after SMOTE processing
|
|
Comparison of precision and accuracy of the lithology identification models (a) and ten-fold cross-validation results of the comparison models (b)
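The ten-fold cross-validation named in this caption can be sketched generically as follows. The sketch only shows how samples are partitioned so that each one is used for validation exactly once. The function name `k_fold_indices` and the sample count of 100 are hypothetical, and the paper's actual splitting scheme is not stated here.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, val_idx) pairs; each sample is validated exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold f takes every k-th shuffled index starting at position f.
    folds = [indices[f::k] for f in range(k)]
    for f in range(k):
        val_idx = folds[f]
        train_idx = [i for g in range(k) if g != f for i in folds[g]]
        yield train_idx, val_idx


# Hypothetical data set of 100 samples, split into ten folds.
splits = list(k_fold_indices(n_samples=100, k=10))
```

A model would be trained on `train_idx` and scored on `val_idx` for each of the ten pairs, and the ten scores would then be summarised. For sequence models, the indexed units would more plausibly be whole depth windows or wells than single depth points, which is an assumption of this sketch rather than something the source confirms.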
|
|
Comparison between predicted and actual lithology for test well BC-10
|
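| Lithology | Precision/% | Accuracy/% | Count |
|---|---|---|---|
| Mudstone, silty mudstone | 88 | 87 | 267 |
| Pebbly fine sandstone | 62 | 77 | 167 |
| Sandy conglomerate, pebbly medium sandstone | 96 | 88 | 425 |
| Weighted average | 87 | 85 | 859 |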
|
Performance metrics of the prediction results of the SMOTE-LSTM hybrid model
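The weighted-average row of the table can be checked with a short calculation. The sketch below assumes the averages are weighted by the per-class sample counts, which the table does not state explicitly. The English class names are translations and the variable names are illustrative.

```python
# Per-class values as listed in the table: (precision %, accuracy %, count).
rows = {
    "Mudstone, silty mudstone": (88, 87, 267),
    "Pebbly fine sandstone": (62, 77, 167),
    "Sandy conglomerate, pebbly medium sandstone": (96, 88, 425),
}

total = sum(count for _, _, count in rows.values())

# Assumed weighting: each class contributes in proportion to its sample count.
weighted_precision = sum(p * c for p, _, c in rows.values()) / total
weighted_accuracy = sum(a * c for _, a, c in rows.values()) / total

print(total, round(weighted_precision, 2), round(weighted_accuracy, 2))
```

Under this assumption the counts sum to 859 and the weighted precision comes to about 86.9, which matches the reported 87. The second column gives about 85.6 when computed from the rounded per-class values, so the reported 85 is consistent only if the unrounded per-class values were slightly lower.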
|