Title: Lithology identification technology based on the stacking fusion model
Authors: Chong Hu; Rui Deng; Yuanpeng Zhang; Jie Chen; Ming Li; Lixia Dang; Feng Zhou; Huibing Shi; Hongyan Lu; Kun Xiong
Addresses: Yangtze University, No. 111 Daxue Road, Caidian District, Wuhan City, Hubei Province, China; Yangtze University, No. 111 Daxue Road, Caidian District, Wuhan City, Hubei Province, China; PetroChina Tuha Oilfield Company, No. 1 Chuangye Road, Yizhou District, Hami City, Xinjiang, China; PetroChina Tuha Oilfield Company, No. 1 Chuangye Road, Yizhou District, Hami City, Xinjiang, China; PetroChina Qinghai Oilfield Company, No. 9 Kunlun Middle Road, Qili Town, Dunhuang City, Gansu Province, China; Yangtze University, No. 111 Daxue Road, Caidian District, Wuhan City, Hubei Province, China; PetroChina Qinghai Oilfield Company, No. 9 Kunlun Middle Road, Qili Town, Dunhuang City, Gansu Province, China; PetroChina Qinghai Oilfield Company, No. 9 Kunlun Middle Road, Qili Town, Dunhuang City, Gansu Province, China; PetroChina Qinghai Oilfield Company, No. 9 Kunlun Middle Road, Qili Town, Dunhuang City, Gansu Province, China; PetroChina Qinghai Oilfield Company, No. 9 Kunlun Middle Road, Qili Town, Dunhuang City, Gansu Province, China
Abstract: A lithology identification method based on stacking multi-model fusion was studied to address the poor recognition performance of traditional single machine learning models. In the experiment, the logging data were preprocessed using outlier analysis and linear analysis, and nine data features were screened to identify the valid ones. Classification and regression tree, K-nearest neighbour algorithm, random forest, and extreme gradient boosting were used as base models. Principal component analysis was used to calculate a weight for each base model, and these weights were applied to the light gradient boosting machine meta-model in the second layer, yielding a multi-layer ensemble learning model. The fusion model improved the F1-score by 1.63 percentage points compared to random forest. For siltstone, the lithology with the best average recognition performance, the improvement over the K-nearest neighbour algorithm was 9.24 percentage points. These results show that the fusion model achieves higher accuracy and a higher F1-score than traditional single algorithms, demonstrating the effectiveness of the fusion approach. [Received: April 28, 2023; Accepted: July 28, 2023]
Keywords: lithology identification; stacking algorithm; principal component analysis; fusion model.
DOI: 10.1504/IJOGCT.2023.135057
International Journal of Oil, Gas and Coal Technology, 2023 Vol.34 No.4, pp.337 - 358
Received: 25 Apr 2023
Accepted: 28 Jul 2023
Published online: 29 Nov 2023
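
The two-layer stacking scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, and it rests on several assumptions: scikit-learn's `GradientBoostingClassifier` stands in for both extreme gradient boosting and the light gradient boosting machine so that the example needs only one library; the principal component analysis weighting of the base models is omitted because the abstract does not specify how the weights are derived; and the data are synthetic (nine features, four classes) rather than real logging curves. The helper name `build_stacking_model` and all hyperparameters are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier


def build_stacking_model(seed=0):
    """Two-layer stacking: four base models feed one boosted meta-model."""
    base_models = [
        ("cart", DecisionTreeClassifier(random_state=seed)),
        # KNN is distance-based, so the features are standardised first.
        ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))),
        ("rf", RandomForestClassifier(n_estimators=50, random_state=seed)),
        # Assumption: stand-in for extreme gradient boosting (XGBoost).
        ("gb", GradientBoostingClassifier(n_estimators=50, random_state=seed)),
    ]
    # Assumption: stand-in for the LightGBM meta-model in the second layer.
    meta_model = GradientBoostingClassifier(n_estimators=50, random_state=seed)
    # cv=3: the meta-model is trained on out-of-fold class probabilities,
    # so it never sees base-model predictions made on their own training data.
    return StackingClassifier(
        estimators=base_models,
        final_estimator=meta_model,
        cv=3,
        stack_method="predict_proba",
    )


# Synthetic stand-in for logging data: 9 features, 4 lithology classes.
X, y = make_classification(
    n_samples=300,
    n_features=9,
    n_informative=6,
    n_redundant=0,
    n_classes=4,
    n_clusters_per_class=1,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

fusion = build_stacking_model().fit(X_train, y_train)
rf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)

# Macro-averaged F1 treats every lithology class equally.
fusion_f1 = f1_score(y_test, fusion.predict(X_test), average="macro")
rf_f1 = f1_score(y_test, rf.predict(X_test), average="macro")

# Difference expressed in percentage points, as in the abstract.
gain_pp = 100.0 * (fusion_f1 - rf_f1)
print(f"fusion F1 = {fusion_f1:.4f}, random forest F1 = {rf_f1:.4f}, "
      f"gain = {gain_pp:+.2f} percentage points")
```

On synthetic data the gain over random forest may be small or even negative; the sketch only shows how the layers are wired together and how a percentage-point difference in F1-score is computed, not the results reported in the paper.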