Title: XGBoost-PCA-BPNN prediction model and its application on predicting the effectiveness of non-surgical periodontal treatment
Authors: Jinxiang Chen; Dong Shi; Jinqi Fan; Huanxin Meng; Jian Jiao; Ruifang Lu
Addresses: State Key Laboratory of Hybrid Process Industry Automation Systems and Equipment Technology, Automation Research and Design Institute of Metallurgical Industry, China Iron & Steel Research Institute Group, Beijing, 100081, China ' Department of Periodontology, National Engineering Laboratory for Digital and Material Technology of Stomatology, Beijing Key Laboratory of Digital Stomatology, Peking University School and Hospital of Stomatology, Beijing, 100081, China ' State Key Laboratory of Hybrid Process Industry Automation Systems and Equipment Technology, Automation Research and Design Institute of Metallurgical Industry, China Iron & Steel Research Institute Group, Beijing, 100081, China ' Department of Periodontology, National Engineering Laboratory for Digital and Material Technology of Stomatology, Beijing Key Laboratory of Digital Stomatology, Peking University School and Hospital of Stomatology, Beijing, 100081, China ' Department of Periodontology, National Engineering Laboratory for Digital and Material Technology of Stomatology, Beijing Key Laboratory of Digital Stomatology, Peking University School and Hospital of Stomatology, Beijing, 100081, China ' Department of Periodontology, National Engineering Laboratory for Digital and Material Technology of Stomatology, Beijing Key Laboratory of Digital Stomatology, Peking University School and Hospital of Stomatology, Beijing, 100081, China
Abstract: An XGBoost-PCA-BPNN classification and prediction model is presented for big data sets with uncertain, coupled multidimensional characteristics, on which existing machine learning algorithms fail to achieve high prediction performance. The clinical data set of non-surgical periodontal treatment (NSPT) for the Chinese population with chronic periodontitis (CP) is a typical example of such a data set. In this paper, the XGBoost-PCA-BPNN model is constructed to predict the effectiveness of NSPT for the Chinese population with CP. The model is verified on a data set of 45,000 clinical sites with eight characteristics. The prediction results show that the XGBoost-PCA-BPNN model achieves an R2-score of 0.943, higher than the scores obtained with logistic regression, XGBoost, and XGBoost-logistic regression. In addition, the effectiveness of NSPT is found to be mainly influenced by PD, BI, sites and age.
Keywords: classification prediction; XGBoost; BPNN; non-surgical periodontal treatment; big data analysis; chronic periodontitis; effectiveness.
DOI: 10.1504/IJMIC.2020.111627
International Journal of Modelling, Identification and Control, 2020 Vol.34 No.3, pp.284 - 291
Received: 24 Feb 2020
Accepted: 18 Mar 2020
Published online: 04 Dec 2020
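
The abstract reports model quality as an R2-score. As a reading aid, the sketch below computes the standard coefficient of determination, R2 = 1 - SS_res / SS_tot, in plain Python. It assumes the paper uses this standard definition, which the abstract does not confirm. The function name `r2_score` and the sample values are hypothetical and are not taken from the paper's data.

```python
def r2_score(y_true, y_pred):
    """Standard coefficient of determination: 1 - SS_res / SS_tot.

    Assumption: this is the usual textbook definition; the abstract does
    not state how the reported R2-score was computed.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: squared error of the predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: squared deviation of the targets from their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot


# Hypothetical labels (1 = effective, 0 = not effective) and model outputs.
example_true = [1, 0, 1, 1, 0]
example_pred = [0.9, 0.1, 0.8, 0.7, 0.2]
example_score = r2_score(example_true, example_pred)
```

A score of 1.0 means the predictions match the targets exactly, while a score of 0 means the model does no better than always predicting the mean of the targets.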