Title: Application of data mining techniques in building predictive models for oil and gas problems: a case study on casing corrosion prediction
Authors: Mazda Irani; Rick Chalaturnyk; Mohsen Hajiloo
Addresses: Department of Civil and Environmental Engineering, University of Alberta, Edmonton, AB, Canada; Subsurface Engineering Group (SEG), Suncor Energy, Calgary, AB, Canada ' Department of Civil and Environmental Engineering, University of Alberta, Edmonton, AB, Canada ' Department of Computer Science, University of Alberta, Edmonton, AB, Canada; Department of Computer Science, University of Toronto, Toronto, ON, Canada
Abstract: This paper describes the use of (supervised) data mining to predict casing corrosion in carbon geological storage projects. The study discusses four successive steps of supervised learning: 1) data pre-processing, such as missing value handling and discretisation; 2) feature selection methods, such as the correlation coefficient, signal-to-noise ratio, information gain, Gini index, and the k-nearest neighbour (KNN) approach; 3) classification techniques, including decision trees (C4.5 and CART) and Bayesian networks; 4) evaluation methods, such as cross-validation. An experimental analysis of the casing corrosion problem within this supervised learning framework shows that data mining techniques are effective in finding features relevant to the problem under study and in building models to predict and identify casing corrosion. [Received: June 19, 2013; Accepted: December 24, 2013]
Keywords: casing corrosion; classification; data mining; feature selection; predictive modelling; oil and gas industry; case study; corrosion prediction; carbon geological storage; carbon storage; data pre-processing; correlation coefficient; signal-to-noise ratio; SNR; information gain; Gini index; k-nearest neighbour; KNN; decision trees; Bayesian networks; supervised learning.
DOI: 10.1504/IJOGCT.2014.066304
International Journal of Oil, Gas and Coal Technology, 2014 Vol.8 No.4, pp.369 - 398
Received: 22 Jun 2013
Accepted: 24 Dec 2013
Published online: 23 Dec 2014