Title: Analysing research collaboration through co-authorship networks in a big data environment: an efficient parallel approach
Authors: Carlos Roberto Valêncio; José Carlos De Freitas; Rogéria Cristiane Gratão De Souza; Leandro Alves Neves; Geraldo Francisco Donegá Zafalon; Angelo Cesar Colombini; William Tenório
Addresses: Institute of Biosciences, Humanities and Exact Sciences (IBILCE), São Paulo State University (UNESP), Campus São José do Rio Preto, São Paulo, Brazil; Institute of Biosciences, Humanities and Exact Sciences (IBILCE), São Paulo State University (UNESP), Campus São José do Rio Preto, São Paulo, Brazil; Institute of Biosciences, Humanities and Exact Sciences (IBILCE), São Paulo State University (UNESP), Campus São José do Rio Preto, São Paulo, Brazil; Institute of Biosciences, Humanities and Exact Sciences (IBILCE), São Paulo State University (UNESP), Campus São José do Rio Preto, São Paulo, Brazil; Institute of Biosciences, Humanities and Exact Sciences (IBILCE), São Paulo State University (UNESP), Campus São José do Rio Preto, São Paulo, Brazil; Fluminense Federal University (UFF), Niterói, Rio de Janeiro, Brazil; Institute of Biosciences, Humanities and Exact Sciences (IBILCE), São Paulo State University (UNESP), Campus São José do Rio Preto, São Paulo, Brazil
Abstract: Bibliometry is the quantitative study of scientific production and enables the characterisation of scientific collaboration networks. However, as science develops and scientific production increases, large collaborative networks are formed, which makes it difficult to extract bibliometrics. In this context, this work presents an efficient parallel optimisation, using multithread programming, of three bibliometrics for co-authorship network analysis: transitivity, average distance, and diameter. Our experiments found that, as the size of the co-authorship network grows, the time taken to calculate the transitivity value using the sequential approach grows 4.08 times faster than with the proposed parallel approach. Similarly, the time taken to calculate the average distance and diameter values using the sequential approach grows 5.27 times faster than with the proposed parallel approach. In addition, we report relevant speedup and efficiency values for the developed algorithms.
Keywords: bibliometrics; graphs; knowledge extraction; co-authorship network; NoSQL; parallel computing.
DOI: 10.1504/IJCSE.2020.106061
International Journal of Computational Science and Engineering, 2020, Vol. 21, No. 3, pp. 364-374
Received: 10 May 2018
Accepted: 07 Nov 2018
Published online: 27 Mar 2020
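
Illustrative sketch (not the authors' implementation): the abstract names three metrics (transitivity, average distance, and diameter) and a multithreaded approach, but gives no algorithmic details. The Python code below is a minimal sketch of one common way to compute them, assuming an undirected, unweighted, non-empty co-authorship graph stored as a dictionary of neighbour sets. The function names (`bfs_stats`, `closed_and_total_triples`, `network_metrics`), the `workers` parameter, and the sample graph are all hypothetical. Work is split per node across a thread pool to mirror the per-node independence that makes these metrics parallelisable; note that in CPython the global interpreter lock limits the actual speedup of CPU-bound threads, so this shows the decomposition rather than the reported performance.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor


def bfs_stats(graph, source):
    """Breadth-first search from one author.

    Returns (sum of distances, number of reachable authors, eccentricity).
    """
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return sum(dist.values()), len(dist) - 1, max(dist.values())


def closed_and_total_triples(graph, v):
    """Count triples centred on v and how many of them are closed."""
    nbrs = list(graph[v])
    total = len(nbrs) * (len(nbrs) - 1) // 2
    closed = sum(
        1
        for i, a in enumerate(nbrs)
        for b in nbrs[i + 1:]
        if b in graph[a]
    )
    return closed, total


def network_metrics(graph, workers=4):
    """Return (transitivity, average distance, diameter).

    Assumption: graph is a non-empty dict mapping each node to a set of
    neighbours, with every edge stored in both directions.
    """
    nodes = list(graph)
    # Each node's BFS and triple count is independent of the others,
    # so the per-node tasks can be handed to a pool of worker threads.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        bfs = list(pool.map(lambda s: bfs_stats(graph, s), nodes))
        tri = list(pool.map(lambda v: closed_and_total_triples(graph, v), nodes))

    dist_sum = sum(r[0] for r in bfs)
    pairs = sum(r[1] for r in bfs)        # ordered reachable pairs
    diameter = max(r[2] for r in bfs)     # largest eccentricity
    closed = sum(t[0] for t in tri)       # equals 3 x number of triangles
    total = sum(t[1] for t in tri)

    transitivity = closed / total if total else 0.0
    avg_distance = dist_sum / pairs if pairs else 0.0
    return transitivity, avg_distance, diameter


# Hypothetical toy network: A, B and C co-authored together; D only with C.
sample = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
transitivity, avg_distance, diameter = network_metrics(sample)
```

For this toy network the sketch yields a transitivity of 0.6 (one triangle, five connected triples), an average distance of 16/12 (about 1.33), and a diameter of 2. On a disconnected graph the average distance here covers reachable pairs only, which is a simplifying assumption of the sketch.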