Title: Big data analytics: an improved method for large-scale fabrics detection based on feature importance analysis from cascaded representation
Authors: Ming-Hu Wu; Song Cai; Chun-Yan Zeng; Zhi-Feng Wang; Nan Zhao; Li Zhu; Juan Wang
Addresses (all authors): Hubei Key Laboratory for High-efficiency Utilization of Solar Energy and Operation Control of Energy Storage System, Hubei University of Technology, Wuhan, China; Department of Digital Media Technology, Central China Normal University, Wuhan, China
Abstract: To address the curse of dimensionality and data imbalance in large-scale fabric data, this paper proposes a fabric image classification method based on feature fusion and feature selection. First, a representation learning model based on transfer learning is established to extract semantic features from fabric images. The features generated by the different models are then cascaded so that they complement one another. Next, extremely randomised trees (Extra-Trees) are used to analyse the importance of the cascaded representation and to reduce the computation time of the classification model on big data with high-dimensional representations. Finally, a multilayer perceptron classifies the selected features. Experimental results demonstrate that the method detects fabrics with high accuracy, and that feature importance analysis effectively speeds up detection when the model processes big data.
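The pipeline outlined in the abstract (cascade features, rank them with Extra-Trees, classify the selected subset with a multilayer perceptron) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation: the two feature blocks are synthetic stand-ins for embeddings from pretrained networks, and the sizes, the mean-importance threshold, and the MLP layout are all assumptions chosen for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Assumption: synthetic data stands in for semantic features that two
# pretrained (transfer-learning) models would extract from fabric images.
X, y = make_classification(
    n_samples=600, n_features=200, n_informative=20,
    n_redundant=0, n_classes=3, random_state=0,
)
feats_a, feats_b = X[:, :120], X[:, 120:]  # hypothetical model A / model B outputs

# Feature fusion: cascade (concatenate) the two representations per sample.
cascaded = np.concatenate([feats_a, feats_b], axis=1)

X_train, X_test, y_train, y_test = train_test_split(
    cascaded, y, test_size=0.25, stratify=y, random_state=0
)

# Feature importance analysis with extremely randomised trees.
forest = ExtraTreesClassifier(n_estimators=200, random_state=0)
forest.fit(X_train, y_train)

# Keep features whose importance is at least the mean importance
# (assumed selection rule; the abstract does not state the criterion).
selector = SelectFromModel(forest, threshold="mean", prefit=True)
X_train_sel = selector.transform(X_train)
X_test_sel = selector.transform(X_test)

# Multilayer perceptron classifies the reduced representation.
mlp = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
mlp.fit(X_train_sel, y_train)
accuracy = mlp.score(X_test_sel, y_test)

print(f"kept {X_train_sel.shape[1]} of {cascaded.shape[1]} features")
print(f"test accuracy: {accuracy:.3f}")
```

Selecting features before training the classifier shrinks the MLP's input layer, which is the mechanism by which the abstract's reported speed-up would arise; the actual gain depends on how many features survive selection.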
Keywords: big data; representation learning; feature fusion; feature selection.
DOI: 10.1504/IJGUC.2021.112483
International Journal of Grid and Utility Computing, 2021 Vol.12 No.1, pp.81 - 93
Received: 14 Jan 2020
Accepted: 01 May 2020
Published online: 19 Jan 2021