Title: Using unstructured logs generated in complex large-scale micro-service-based architecture for data analysis
Authors: Anukampa Behera; Sitesh Behera; Chhabi Rani Panigrahi; Tien-Hsiung Weng
Addresses: Department of Computer Science and Engineering, ITER, S'O'A University, Odisha, India; Department of Computer Science, Rama Devi Women's University, Odisha, India ' Plivo Inc., Bengaluru, India ' Department of Computer Science, Rama Devi Women's University, India ' Department of Computer Science and Information Engineering, Providence University, Taichung 43301, Taiwan
Abstract: Deployments of complex, large-scale micro-service architectures generate so much data that a typical production infrastructure becomes huge, complicated and difficult to manage. In this scenario, logs play a major role and are an important source of information in a large-scale secured environment. To date, many researchers have contributed methods for converting unstructured logs to structured ones. However, after conversion, the dimension of the generated dataset increases manyfold, making it too complex for data analysis. In this paper, we discuss techniques and methods for extracting all features from a produced structured log and reducing N-dimensional features to fixed dimensions, cost-efficiently and without compromising the quality of the data, so that the result can be used for any further machine learning-based analysis.
Keywords: json data; micro services; data parsing; principal component analysis; PCA; multivariate data; unstructured data; tagged data; feature reduction; reverse indexed database; profiling.
DOI: 10.1504/IJBIDM.2023.127294
International Journal of Business Intelligence and Data Mining, 2023 Vol.22 No.1/2, pp.248 - 263
Received: 03 Oct 2020
Accepted: 03 Aug 2021
Published online: 30 Nov 2022
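
The abstract describes extracting features from structured (JSON) logs and reducing them to a fixed number of dimensions, and the keywords name principal component analysis. The following is a minimal illustrative sketch of that general idea, not the authors' implementation: the sample log lines, the helper names (`flatten`, `to_matrix`, `pca`), the choice to keep only numeric fields, and the use of power iteration to find the components are all assumptions made for this example.

```python
import json
import math


def is_number(value):
    """True for int/float values, excluding booleans."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def flatten(record, prefix=""):
    """Flatten a nested JSON object into {dotted.key: value} pairs."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


def to_matrix(records):
    """One column per numeric key seen in any record; missing values become 0.0."""
    rows = [flatten(r) for r in records]
    columns = sorted({k for row in rows for k, v in row.items() if is_number(v)})
    matrix = [
        [float(row[c]) if is_number(row.get(c)) else 0.0 for c in columns]
        for row in rows
    ]
    return columns, matrix


def pca(matrix, k, iterations=200):
    """Project rows onto the top-k principal components (power iteration + deflation)."""
    n, d = len(matrix), len(matrix[0])
    means = [sum(row[j] for row in matrix) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in matrix]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    components = []
    for _ in range(k):
        v = [1.0 / math.sqrt(d)] * d  # deterministic starting vector
        for _ in range(iterations):
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # no variance left to explain
                break
            v = [x / norm for x in w]
        eigenvalue = sum(
            v[a] * sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)
        )
        components.append(v)
        # Deflate: remove the variance explained by this component.
        cov = [
            [cov[a][b] - eigenvalue * v[a] * v[b] for b in range(d)]
            for a in range(d)
        ]
    return [[sum(r[j] * c[j] for j in range(d)) for c in components] for r in centered]


# Hypothetical structured log lines with differing sets of keys.
raw_logs = [
    '{"service": "auth", "latency_ms": 12, "http": {"status": 200, "bytes": 512}}',
    '{"service": "cart", "latency_ms": 48, "http": {"status": 500, "bytes": 128}, "retries": 2}',
    '{"service": "auth", "latency_ms": 15, "http": {"status": 200, "bytes": 640}}',
    '{"service": "pay", "latency_ms": 95, "http": {"status": 503, "bytes": 64}, "retries": 3}',
]
records = [json.loads(line) for line in raw_logs]
columns, matrix = to_matrix(records)
reduced = pca(matrix, k=2)  # every record now has exactly 2 features
```

In this sketch the number of columns grows with every new key that appears in any log record, which is the dimensionality growth the abstract refers to, while the output of `pca` always has `k` columns. The features are not standardized here; in practice, scaling them before PCA is usually advisable when fields have very different ranges.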