Forthcoming and Online First Articles

International Journal of Data Mining, Modelling and Management (IJDMMM)

Forthcoming articles have been peer-reviewed and accepted for publication but are pending final changes, are not yet published and may not appear here in their final order of publication until they are assigned to issues. Therefore, the content conforms to our standards but the presentation (e.g. typesetting and proof-reading) is not necessarily up to the Inderscience standard. Additionally, titles, authors, abstracts and keywords may change before publication. Articles will not be published until the final proofs are validated by their authors.

Forthcoming articles must be purchased for the purposes of research, teaching and private study only. These articles can be cited using the expression "in press". For example: Smith, J. (in press). Article Title. Journal Title.

Articles marked with this shopping trolley icon are available for purchase - click on the icon to send an email request to purchase.

Online First articles are published online here, before they appear in a journal issue. Online First articles are fully citeable, complete with a DOI. They can be cited, read, and downloaded. Online First articles are published as Open Access (OA) articles to make the latest research available as early as possible.

Articles marked with this Open Access icon are Online First articles. They are freely available and openly accessible to all without any restriction except the ones stated in their respective CC licenses.

Register for our alerting service, which notifies you by email when new issues are published online.

International Journal of Data Mining, Modelling and Management (11 papers in press)

Regular Issues

  • Improving Intrusion Detection in the IoT with African Vultures Optimization Algorithm-Based Feature Selection
    by Mohammed Alweshah, Ghadeer Ahmad Alhebaishan, Sofian Kassaymeh, Saleh Alkhalaileh, Mohammed Ababneh 
    Abstract: The security of the system may be jeopardised by unsecured data transmitted through IoT devices, and ensuring the reliability of data is critical to maintaining the integrity of information over the internet. To enhance the intrusion detection rate, several investigations have been conducted to develop methodologies capable of identifying the minimum required secure features. One such method is the use of the feature selection (FS) procedure with metaheuristic algorithms. In this study, the African vultures optimisation algorithm (AVO) was used in two wrapper FS approaches to select the most secure features in IoT. The first approach used AVO, while the second employed OBL-AVO, a hybrid model combining AVO with opposition-based learning (OBL) to enhance exploration. Based on the outcomes, it was found that OBL-AVO is superior to AVO in enhancing FS. Furthermore, the proposed methods were evaluated and compared to four recent approaches.
    Keywords: intrusion detection; internet of things; IoT; feature selection; hybrid metaheuristics; African vultures optimisation algorithm; AVO; opposition-based learning; OBL.
    DOI: 10.1504/IJDMMM.2024.10060965
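
The opposition-based learning step mentioned in the abstract can be illustrated with a minimal Python sketch. It uses the textbook definition of an opposite point (lower bound plus upper bound minus the current value) on a continuous search space and assumes minimisation. The function names `opposite` and `obl_refine`, the bounds, and the toy fitness function are hypothetical; the abstract does not say how OBL-AVO applies the idea to feature subsets.

```python
def opposite(position, lower, upper):
    """Opposite point: x_opp = lower + upper - x, per dimension."""
    return [lo + hi - x for x, lo, hi in zip(position, lower, upper)]


def obl_refine(population, fitness, lower, upper):
    """For each candidate keep whichever of (candidate, opposite) has the
    lower fitness. Minimisation is an assumption of this sketch."""
    return [min(c, opposite(c, lower, upper), key=fitness) for c in population]


def sphere(point):
    """Toy fitness function (sum of squares), used only for illustration."""
    return sum(x * x for x in point)


# Hypothetical two-dimensional population on the box [0, 10] x [0, 10].
lower, upper = [0, 0], [10, 10]
population = [[9, 8], [1, 2]]
refined = obl_refine(population, sphere, lower, upper)
```

Evaluating each candidate together with its opposite doubles the number of fitness evaluations per step, which is the usual price paid for the wider exploration.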
     
  • An Irregular CLA-based Novel Frequent Pattern Mining Approach
    by Moumita Ghosh, Sourav Mondal, Harshita Moondra, Dina Tri Utari, Anirban Roy, Kartick Chandra Mondal 
    Abstract: Frequent itemset mining has received a lot of attention in the field of data mining. Its main objective is to find groups of items that consistently appear together in datasets. Although frequent itemset mining is useful, the algorithms for mining frequent itemsets have quite high resource requirements. To reduce time and memory needs, a few improvements have been made in recent years. This study proposes CellFPM, a straightforward yet effective cellular learning automata-based method for finding frequent itemset occurrences. It works efficiently with large datasets. The efficiency of the proposed approach in time and memory requirements has been evaluated using benchmark datasets explicitly designed for performance measurement. The varying size and density of the test datasets have confirmed the scalability of the suggested method. The findings show that CellFPM consistently surpasses the leading algorithms in terms of runtime and memory usage, particularly memory usage.
    Keywords: cellular learning automata; CLA; frequent itemsets; data mining; knowledge discovery.
    DOI: 10.1504/IJDMMM.2024.10061507
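
To make the notion of a frequent itemset concrete, here is a minimal brute-force sketch in Python. It is not CellFPM and uses no cellular learning automata, since the abstract gives no algorithmic detail. The function name `frequent_itemsets`, the toy transactions, and the `min_support` threshold are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for every itemset contained in at least
    `min_support` transactions. Brute-force enumeration: fine for a toy
    example, exponential in transaction length."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        # Count every non-empty subset of this transaction once.
        for size in range(1, len(items) + 1):
            for subset in combinations(items, size):
                counts[frozenset(subset)] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


# Hypothetical toy data, not one of the benchmark datasets.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]
result = frequent_itemsets(transactions, min_support=2)
```

Enumerating all subsets is exactly the resource cost that dedicated mining algorithms try to avoid, which is why runtime and memory are the headline metrics in this line of work.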
     
  • A Comparative Analysis of User Attitudes towards ICO and IEO in Blockchain Projects: Insights from Social Media Big Data
    by ShengJuan Zhao, GyooGun Lim 
    Abstract: This study conducts a comparative analysis of two popular crowdfunding methods in the blockchain market, the initial coin offering (ICO) and the initial exchange offering (IEO) models. Using project names as keywords, we collected and analysed big data, applying techniques such as TF-IDF, LDA, social network analysis, and sentiment analysis. Our findings show that the attitudes of the target groups towards ICO and IEO projects are not significantly different, although IEO targets exhibit more interest in entertainment-related topics. Social network analysis reveals that the ICO target group is more sensitive to popular elements, such as pop singers, while the IEO target group is more interested in soccer competitions. Both target groups show a strong interest in the US election. Our study suggests that IEO, as an upgraded financing model of ICO, does not yet enjoy high levels of trust from the market crowd. By identifying the preferences of the target groups for both models through multiple analyses, we recommend that these preferences be taken into consideration to improve the efficiency of targeted marketing.
    Keywords: blockchain; big data; token issuance; initial coin offering; ICO; initial exchange offering; IEO.
    DOI: 10.1504/IJDMMM.2024.10062229
     
  • A Node sets based Fast and Scalable Frequent Itemset (FSFIM) Algorithm for Mining Big Data using MapReduce Paradigm
    by Borra Sivaiah, R. Rajeswara Rao
    Abstract: Big data is rapidly growing, making traditional tools inefficient for handling large amounts of data. Existing algorithms for frequent itemset mining struggle with scalability due to limitations in parallel processing power. In this paper, we propose a fast and scalable frequent itemset mining (FSFIM) algorithm for generating frequent itemsets from huge data. Preorder coding (POC) trees and Nodeset data structures save half the memory of node-lists and N-lists. FSFIM uses Cloudera’s CDH MapReduce framework. The experimental results reveal that FSFIM outperforms state-of-the-art methods such as HBPFP, Mlib PFP, and Big FIM, with a maximum speedup value of 1.85 when minimal support is set to 1. FSFIM is therefore more scalable and faster for mining frequent itemsets from big data.
    Keywords: big data; frequent itemset mining; FIM; MapReduce paradigm; fast and scalable frequent itemset mining; FSFIM.
    DOI: 10.1504/IJDMMM.2024.10062349
     
  • Data mining techniques along with fuzzy logic control to find solutions to road traffic accidents: Case study in Morocco
    by Halima Drissi Touzani, Sanaa Faquir, Ali Yahyaouy 
    Abstract: Collecting data on road accidents is important. However, it is equally important to analyse and process this data to prevent future accidents. Data analysis can provide valuable insights and help identify patterns, contributing to the development of effective strategies and interventions to improve road safety. Over the years, many research efforts have examined the causes of traffic accidents in an attempt to identify risk factors. Various statistics indicate that most accidents are due to human error. In Morocco, many studies have looked at making car systems automatic or semi-automatic to avoid serious injuries due to poor driving practices. This paper presents data mining techniques applied to real traffic accident data using statistical analysis, the K-means clustering algorithm and fuzzy logic. The data represents accidents that happened in Morocco during 2014. The results revealed important features that caused previous accidents, which were used to implement a fuzzy logic-based algorithm that trains a semi-autonomous car to make the right decisions whenever needed and therefore prevent accidents from happening.
    Keywords: data analysis; data mining techniques; road traffic accidents; semi-autonomous cars; fuzzy logic control; decision algorithm; statistical methods; Morocco.
    DOI: 10.1504/IJDMMM.2024.10063889
     
  • Discrete Cuckoo Search for 0-1 knapsack problem
    by Aziz Ouaarab 
    Abstract: This paper presents a discrete cuckoo search (DCS) algorithm for solving space management optimisation problems such as the 0-1 knapsack problem (KP). The proposed approach includes an adaptation process of three main components: the objective function, the solution representation, and the step move operator. A simplified conception of these three components is designed without introducing an additional technique, especially in the search process for the optimal solution. Three sets of benchmark instances have been taken from the literature to test the performance of DCS. Experimental results show that DCS is effective in solving different types of 0-1 KP instances. The result comparisons with other state-of-the-art algorithms show that DCS is a competitive approach that outperforms most of them.
    Keywords: 0-1 knapsack problem; discrete cuckoo search; DCS; combinatorial optimisation; Lévy flights; approximate algorithm.
    DOI: 10.1504/IJDMMM.2024.10064048
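
For readers unfamiliar with the 0-1 knapsack problem, the following Python sketch solves a small instance exactly by dynamic programming. It is a plain baseline, not the discrete cuckoo search described in the abstract. The function name `knapsack_01` and the instance values are hypothetical, and integer weights are assumed. The returned 0/1 list is the kind of binary solution representation a metaheuristic would search over.

```python
def knapsack_01(values, weights, capacity):
    """Solve the 0-1 knapsack problem exactly by dynamic programming.

    Assumes positive integer weights and an integer capacity.
    Returns (best_value, chosen), where chosen is a 0/1 list marking
    the selected items."""
    n = len(values)
    # best[i][c] = max value using the first i items with capacity c.
    best = [[0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        v, w = values[i - 1], weights[i - 1]
        for c in range(capacity + 1):
            best[i][c] = best[i - 1][c]  # option 1: skip item i-1
            if w <= c:  # option 2: take item i-1 if it fits
                best[i][c] = max(best[i][c], best[i - 1][c - w] + v)
    # Trace back through the table to recover which items were taken.
    chosen = [0] * n
    c = capacity
    for i in range(n, 0, -1):
        if best[i][c] != best[i - 1][c]:
            chosen[i - 1] = 1
            c -= weights[i - 1]
    return best[n][capacity], chosen


# Hypothetical instance, not one of the benchmark sets used in the paper.
values = [60, 100, 120]
weights = [10, 20, 30]
best_value, chosen = knapsack_01(values, weights, capacity=50)
```

The table has (n + 1) × (capacity + 1) entries, so exact solving becomes impractical for large capacities, which is one motivation for approximate methods such as DCS.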
     
  • Early Stage Analysis of Breast Cancer Using Intelligent System
    by Arpita Nath Boruah, Mrinal Goswami 
    Abstract: Breast cancer (BC) poses a considerable global health concern and is a significant issue for women's well-being worldwide. It is crucial to develop a system that can proactively identify the critical risk factors associated with BC. The present study introduces an intelligent system for BC by analysing risk factors (IS-BC-analysing-RF), which utilises decision tree rules to accurately identify the primary risk factors underlying BC. The rules are processed based on the proposed score function to get the most relevant ones. Finally, using the sequential search approach, the critical risk factors are identified along with their respective ranges. Based on the simulation results using the University of California at Irvine (UCI) repository BC dataset, the findings indicate that the proposed IS-BC-analysing-RF system is highly significant and has the potential to effectively mitigate the risk of BC by targeting and managing one or two crucial risk factors.
    Keywords: decision system; breast cancer; decision tree; machine learning; risk factor.
    DOI: 10.1504/IJDMMM.2024.10064214
     
  • A Novel LWT-based Robust Watermark Strategy for Colour Images
    by Prachee Dewangan, Debabala Swain, Monalisa Swain 
    Abstract: With the progress of information technology, digital data theft and duplication have become much easier. Image watermarking is a major domain of cryptography that provides manifold security features such as confidentiality, authenticity and integrity. This research introduces a robust watermarking scheme for colour images. The proposed technique segments the colour image into three layers: red, green and blue. The lifting wavelet transform (LWT) and differential histogram shifting are used to embed text watermark information into the R layer. The performance of the proposed technique was assessed using the SIPI image dataset. Test outputs show that the proposed scheme maintains the balance between imperceptibility and robustness. The scheme has better resistance against all types of attacks, such as different noises, filter effects and image compressions. In addition, the text watermark can be successfully extracted after different types of tampering, such as content removal attacks and content addition attacks.
    Keywords: robust watermarking; geometric attack; fragile attack; dual watermark; lifting wavelet transform.
    DOI: 10.1504/IJDMMM.2024.10064256
     
  • Detecting Driver Mutations in Colorectal Cancer through Big Data Analysis
    by Amna Sethi, Muhammad Saad Khan, Fatima Hashmi, Saim Ali Akber 
    Abstract: Colorectal cancer (CRC) is a complex disease that poses a significant challenge to global health, with profound impacts on morbidity and mortality. There is a need to identify genetic biomarkers for early diagnosis of the disease. In this study, a comprehensive analysis of CRC genomes was conducted to identify consistent mutations in both coding and non-coding regions, highlighting their pivotal role in CRC pathogenesis. The results of this study revealed consistent mutations in coding regions that validated known CRC driver genes. Consistent non-coding mutations were also identified within transcription factors binding sites (TFBS) in CRC cell lines. The statistical significance of these mutations suggests their potential impact on gene regulation leading to the development and progression of CRC. They might act as potential biomarkers for early diagnosis of the disease. To conclude, the findings of this study might provide novel therapeutic targets and diagnostic markers for personalised medicine.
    Keywords: colorectal cancer; CRC; driver mutations; driver genes; biomarkers; transcription factors binding sites; TFBS.
    DOI: 10.1504/IJDMMM.2024.10064784
     
  • Enhancing Link Prediction in Dynamic Social Networks: A Novel Algorithm Integrating Global and Local Topological Structures
    by Shambhu Kumar, Arti Jain, Dinesh Bisht
    Abstract: The link prediction problem has gained significant importance due to the emergence of many social networks. Existing link prediction algorithms in social networks often prioritise local or global attributes, yielding satisfactory performance on specific network types but with limitations like reduced accuracy or higher computational burden. This paper presents a novel link prediction approach that integrates global and local topological structures, assessing node similarity through a similarity index for node pairs that is based on three key features: the number of common neighbours between the nodes, with a penalty factor introduced for each common node; node influence; and the shortest path distance between unconnected nodes. Evaluation using AUC has been performed on seven datasets and demonstrates significant improvement over baseline and state-of-the-art methods, enhancing accuracy by 30% and 6.75%, respectively. This highlights the efficacy of integrating global and local features for more accurate link prediction.
    Keywords: social network; link prediction; common neighbour; similarity measure; degree centrality; node distance.
    DOI: 10.1504/IJDMMM.2025.10064902
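
The three ingredients named in the abstract can be sketched in Python as separate functions. This does not reproduce the paper's similarity index, whose formula is not given here. As labelled assumptions, the penalty on each common neighbour is taken to be 1/degree (the resource allocation index), node influence is taken to be degree centrality (listed among the keywords), and all function names and the toy network are hypothetical.

```python
from collections import deque


def build_graph(edges):
    """Undirected adjacency sets from a list of (u, v) edges."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def penalised_common_neighbours(graph, u, v):
    """Sum 1/degree over common neighbours (resource allocation index).
    The 1/degree penalty is an assumption, not the paper's penalty factor."""
    return sum(1.0 / len(graph[z]) for z in graph[u] & graph[v])


def degree_centrality(graph, u):
    """Degree divided by the maximum possible degree (n - 1)."""
    return len(graph[u]) / (len(graph) - 1)


def shortest_path_length(graph, source, target):
    """Breadth-first search hop count; None if target is unreachable."""
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


# Hypothetical toy network; "a" and "d" are not directly connected.
graph = build_graph([("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")])
features = (
    penalised_common_neighbours(graph, "a", "d"),  # local structure
    degree_centrality(graph, "a"),                 # node influence
    shortest_path_length(graph, "a", "d"),         # global structure
)
```

Common neighbours are a local signal and path length a global one, so a score that combines them can rank node pairs that share no neighbours at all.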
     
  • Comparative Analysis of Distance Measures in Stock Network Construction and Cluster Analysis
    by Serkan Alkan 
    Abstract: The mutual information (MI) metric and the Pearson correlation metric are both widely used in cluster analysis and stock network construction. This paper presents a detailed comparison between the MI metric and the Pearson correlation metric. To detect nonlinear relationships, polynomial and natural cubic spline regressions are proposed as alternatives to the MI metric. The methodology for computing model-fitting indices for determining network adjacencies is explained in detail, along with a comparison of the results with the MI methodology. This study employs two data sets derived from the log returns of the daily adjusted closing prices of 402 stocks in the S&P500 index to measure the impact of a financial crisis on nonlinearity: one covering the crisis period from January 2007 to December 2009, and the other covering the non-crisis period between January 2012 and December 2015. The local and global properties of hierarchical stock networks are compared using the minimum spanning tree for each distance measure. The graph-theoretic internal cluster validity indices and external indices are also used to investigate the relationship between the performance of the community detection algorithm and the selection of metrics.
    Keywords: financial networks; mutual information; Pearson correlation; regression models; community detection.
    DOI: 10.1504/IJDMMM.2025.10065097
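
The Pearson-based pipeline described in the abstract (log returns, a correlation-derived distance, then a minimum spanning tree) can be sketched in pure Python. The distance d = sqrt(2(1 − ρ)) is a common choice in the stock network literature and is an assumption here, since the abstract does not state which transformation the paper uses. The ticker names and prices are made up, and the mutual information and regression-based measures are not shown.

```python
import math


def log_returns(prices):
    """Log returns r_t = ln(p_t / p_{t-1}) of a price series."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_distance(rho):
    """d = sqrt(2 (1 - rho)); assumed transformation, clamped at zero
    to absorb floating-point rounding when rho is marginally above 1."""
    return math.sqrt(max(0.0, 2.0 * (1.0 - rho)))


def minimum_spanning_tree(names, dist):
    """Kruskal's algorithm with a simple union-find.
    `dist` maps (a, b) pairs to distances; returns a list of (a, b, d)."""
    parent = {name: name for name in names}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for (a, b), d in sorted(dist.items(), key=lambda kv: kv[1]):
        ra, rb = find(a), find(b)
        if ra != rb:  # the edge joins two components, so keep it
            parent[ra] = rb
            tree.append((a, b, d))
    return tree


# Made-up prices for three hypothetical tickers; B moves exactly like A.
prices = {
    "A": [100, 102, 101, 105, 107],
    "B": [50, 51, 50.5, 52.5, 53.5],
    "C": [20, 19, 21, 20, 22],
}
returns = {name: log_returns(series) for name, series in prices.items()}
names = sorted(returns)
dist = {
    (a, b): correlation_distance(pearson(returns[a], returns[b]))
    for i, a in enumerate(names)
    for b in names[i + 1:]
}
tree = minimum_spanning_tree(names, dist)
```

Because Pearson correlation only captures linear co-movement, a pair with a strong nonlinear relationship can still end up far apart in this tree, which is the gap the MI and regression-based alternatives in the paper aim to address.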