Title: Lithology identification technology based on the stacking fusion model

Authors: Chong Hu; Rui Deng; Yuanpeng Zhang; Jie Chen; Ming Li; Lixia Dang; Feng Zhou; Huibing Shi; Hongyan Lu; Kun Xiong

Addresses: Yangtze University, No. 111 Daxue Road, Caidian District, Wuhan City, Hubei Province, China ' Yangtze University, No. 111 Daxue Road, Caidian District, Wuhan City, Hubei Province, China ' PetroChina Tuha Oilfield Company, No. 1 Chuangye Road, Yizhou District, Hami City, Xinjiang, China ' PetroChina Tuha Oilfield Company, No. 1 Chuangye Road, Yizhou District, Hami City, Xinjiang, China ' PetroChina Qinghai Oilfield Company, No. 9 Kunlun Middle Road, Qili Town, Dunhuang City, Gansu Province, China ' Yangtze University, No. 111 Daxue Road, Caidian District, Wuhan, Hubei Province, China ' PetroChina Qinghai Oilfield Company, No. 9 Kunlun Middle Road, Qili Town, Dunhuang City, Gansu Province, China ' PetroChina Qinghai Oilfield Company, No. 9 Kunlun Middle Road, Qili Town, Dunhuang City, Gansu Province, China ' PetroChina Qinghai Oilfield Company, No. 9 Kunlun Middle Road, Qili Town, Dunhuang City, Gansu Province, China ' PetroChina Qinghai Oilfield Company, No. 9 Kunlun Middle Road, Qili Town, Dunhuang City, Gansu Province, China

Abstract: This study presents a lithology identification method based on stacking multi-model fusion, intended to overcome the poor recognition performance of traditional single machine learning models. Logging data were preprocessed using outlier analysis and linear analysis, and nine data features were screened to identify the valid ones. Classification and regression tree, K-nearest neighbour, random forest, and extreme gradient boosting served as base models. Principal component analysis was used to calculate a weight for each base model, and these weights were applied to the light gradient boosting machine meta-model in the second layer, yielding a multi-layer ensemble learning model. The fusion model improved the F1-score by 1.63 percentage points compared with random forest. For siltstone, the lithology with the best average recognition performance, the improvement over the K-nearest neighbour algorithm was 9.24 percentage points. These results show that the fusion model achieves higher accuracy and F1-score than traditional single algorithms, demonstrating its effectiveness. [Received: April 28, 2023; Accepted: July 28, 2023]
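The abstract reports gains in F1-score expressed in percentage points. As a reminder of how such figures are typically computed, here is a minimal sketch in plain Python. The lithology labels and predictions are hypothetical toy values, not data from the paper, and the macro average is an assumption, since the abstract does not state which averaging scheme the authors used.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1}, computed one-vs-rest for every label seen."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[t] += 1          # t was the truth but was missed
    scores = {}
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging)."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def gain_in_percentage_points(f1_new, f1_old):
    """Difference between two F1 scores in [0, 1], in percentage points."""
    return 100.0 * (f1_new - f1_old)


# Hypothetical toy labels, for illustration only.
y_true = ["siltstone", "siltstone", "mudstone", "mudstone"]
y_single = ["siltstone", "mudstone", "mudstone", "mudstone"]   # a single model
y_fusion = ["siltstone", "siltstone", "mudstone", "mudstone"]  # a fused model

gain = gain_in_percentage_points(
    macro_f1(y_true, y_fusion), macro_f1(y_true, y_single)
)
print(f"macro-F1 gain: {gain:.2f} percentage points")
```

A percentage-point gain is an absolute difference between two scores, not a relative improvement. Per-class values, such as the siltstone figure quoted above, correspond to a single entry of `per_class_f1`.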

Keywords: lithology identification; stacking algorithm; principal component analysis; fusion model.

DOI: 10.1504/IJOGCT.2023.135057

International Journal of Oil, Gas and Coal Technology, 2023 Vol.34 No.4, pp.337 - 358

Received: 25 Apr 2023
Accepted: 28 Jul 2023

Published online: 29 Nov 2023
