Forthcoming and Online First Articles

International Journal of Data Analysis Techniques and Strategies (IJDATS)

Forthcoming articles have been peer-reviewed and accepted for publication but are pending final changes, are not yet published and may not appear here in their final order of publication until they are assigned to issues. Therefore, the content conforms to our standards but the presentation (e.g. typesetting and proof-reading) is not necessarily up to the Inderscience standard. Additionally, titles, authors, abstracts and keywords may change before publication. Articles will not be published until the final proofs are validated by their authors.

Forthcoming articles must be purchased for the purposes of research, teaching and private study only. These articles can be cited using the expression "in press". For example: Smith, J. (in press). Article Title. Journal Title.

Online First articles are published online here, before they appear in a journal issue. Online First articles are fully citeable, complete with a DOI. They can be cited, read, and downloaded. Online First articles are published as Open Access (OA) articles to make the latest research available as early as possible.

Articles marked with this Open Access icon are Online First articles. They are freely available and openly accessible to all without any restriction except the ones stated in their respective CC licenses.

International Journal of Data Analysis Techniques and Strategies (19 papers in press)

Regular Issues

  • Sentiment analysis of customer reviews for Algerian dialect using DziriBERT model
    by Fateh Bougamouza, Samira Hazmoune 
    Abstract: The increasing volume of daily comments and tweets presents a valuable resource for improving various processes, from business strategies to service management. However, the Algerian Dialect, despite its growing presence on social media, has been overlooked in sentiment analysis. This study addresses this gap by proposing an approach for sentiment analysis of Algerian Dialect feedback, specifically from customers of Algerian telephone operators (Djezzy, Mobilis, and Ooredoo). Leveraging Transfer Learning, the pre-trained DziriBERT model was fine-tuned, with experiments refining data preprocessing techniques and hyperparameters. The outcome is an impressive 82.01% accuracy rate, offering promising insights into sentiment analysis in the Algerian Dialect and highlighting its potential significance for companies and researchers in the field.
    Keywords: Sentiment Analysis; Algerian Arabic Dialect; DziriBERT; Transfer Learning; Algerian telephone operators; Emoji categorization.
    DOI: 10.1504/IJDATS.2024.10062272
     
  • Boosting CNN Network Performance for Face Recognition in an Authentication System
    by Hamza Benyezza, Reda Kara, Mounir Bouhedda, Zine Eddine Safar Zitoun, Samia Rbouh 
    Abstract: Face recognition technology has made significant advancements through the utilisation of Convolutional Neural Networks (CNN) in various applications. However, accurately identifying individuals from similar backgrounds remains a notable challenge due to inherent similarities in facial features among individuals with shared genetic ancestry or cultural heritage. This paper addresses the limitations of traditional CNNs in accurately identifying individuals from the same origins and presents an approach to enhance the performance of CNNs and improve the reliability of face recognition in authentication systems. The proposed approach incorporates an advanced face detection and identification algorithm based on the VGG-Face CNN descriptor model, along with the cosine distance algorithm. Promising results were obtained through a prototype implementation on a Raspberry Pi 4. Comparative evaluations against alternative face recognition strategies showcased exceptional performance, achieving an accuracy rate of 96.33% for positive pairs and 95.38% for negative pairs at an optimal threshold of 20.
    Keywords: Smart Authentication system; Face detection and identification; VGG-Face CNN descriptor; IoT; Cosine distance algorithm.
    DOI: 10.1504/IJDATS.2024.10062942
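
The cosine distance check named in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy two-dimensional "embeddings" and the threshold of 0.4 are all assumptions (the abstract reports a threshold of 20 without stating its scale, so no attempt is made to reproduce it).

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)

def same_person(emb_a, emb_b, threshold=0.4):
    """Accept a pair when the distance is at or below the threshold.

    The default threshold is a hypothetical placeholder, not the paper's value.
    """
    return cosine_distance(emb_a, emb_b) <= threshold

# Toy vectors standing in for face descriptors (real ones have many more dimensions).
print(same_person([1.0, 0.0], [0.9, 0.1]))  # True: nearly parallel vectors
print(same_person([1.0, 0.0], [0.0, 1.0]))  # False: orthogonal vectors
```

In a real system the two vectors would come from a face descriptor network, and the threshold would be tuned on labelled positive and negative pairs.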
     
  • Optimizing IPL Squad Composition: A Mathematical Framework for Efficient Team Selection on a Limited Budget in a Multi-Criteria, Multi-Objective Environment
    by Pabitra Kumar Dey, Abhijit Banerjee, Dipendra Nath Ghosh 
    Abstract: Selection of the finest cricket squads for Twenty-20 cricket while considering multiple criteria and a limited budget is indeed a challenging problem for team management. For the formation of the best team squads, the objectives could include maximising batting and bowling strength, considering player performances, experiences, age, and captaincy capabilities while spending the minimum amount. To tackle this problem, a multi-objective optimisation approach can be valuable to find the best possible team composition. A comprehensive approach for the selectors was proposed by combining the multi-objective genetic algorithm in a multi-criteria environment. Overall, the aims of this research work are to provide selectors with a mathematical framework that can assist them in choosing the best cricket squad with a lower budget. This approach can help automate the process of selecting teams in a multi-criteria environment, such as player auctions, and provide selectors with a range of optimal options to consider.
    Keywords: Optimum Team Selection; Modified Group Decision Algorithm (MGDA); Modified Multi-Objective Genetic Algorithm (MMOGA); Non-Dominated Sorting Genetic Algorithm-II; IPL T-20 Cricket; Strategy Planning.
    DOI: 10.1504/IJDATS.2024.10062994
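
The idea of offering selectors "a range of optimal options" in the abstract above rests on Pareto dominance, which the following sketch illustrates. It shows only the non-dominated filtering step, not a genetic algorithm, and it is not the authors' method: the two objectives (a single strength score to maximise, a cost to minimise) and all numbers are hypothetical.

```python
def dominates(a, b):
    """True if squad a is no worse than b on both objectives and better on one.

    Each squad is a (strength, cost) pair: strength is maximised, cost minimised.
    """
    strength_a, cost_a = a
    strength_b, cost_b = b
    no_worse = strength_a >= strength_b and cost_a <= cost_b
    strictly_better = strength_a > strength_b or cost_a < cost_b
    return no_worse and strictly_better

def pareto_front(squads):
    """Keep every squad that no other squad dominates."""
    return [s for s in squads if not any(dominates(o, s) for o in squads)]

# Hypothetical candidate squads as (strength score, cost) pairs.
candidates = [(80, 90), (75, 70), (80, 95), (60, 50), (70, 75)]
print(pareto_front(candidates))  # [(80, 90), (75, 70), (60, 50)]
```

The surviving squads trade strength against cost; none can be improved on one objective without giving ground on the other, which is why a selector is shown a set rather than a single answer.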
     
  • CEVAB: NIR-VIS Face Recognition using Convolutional Encoder-based Visual Attention Block
    by Patil Jayashree Madhukara, Ashok Kumar P. M, Raju Anitha 
    Abstract: Recent research in night vision face recognition has spiked due to the rise of night-time surveillance in public areas, where cameras often use near-infrared (NIR) images. This paper presents a new face recognition method, the convolutional encoder-based visual attention block (CEVAB), optimised for NIR and visible spectrum (VIS) images. CEVAB combines a convolutional encoder with an attention-based architecture, focusing on critical facial features to enhance accuracy against watchlists. Tested on the FaceSurv dataset with over 132,000 images, CEVAB outshines traditional methods in VIS, achieving 95.08% Rank 1 accuracy at close distances, and in NIR, with 74.00% Rank 1 accuracy, surpassing competitors like Verilook and ResNet-50. These results prove CEVAB's exceptional adaptability and performance in various imaging conditions, significantly advancing night vision face recognition technology.
    Keywords: Deep learning; Face recognition; NIR Images; Visual Attention.
    DOI: 10.1504/IJDATS.2024.10063484
     
  • Integrated Cyber Security Risk Management-Insurance and Investment Cost Analysis
    by Thomas Lee 
    Abstract: An insurer offers cyber insurance coverage to several firms with risk-averse decision makers. The cyber insurance premium offered depends on the cyber security level implemented at the firm. Each firm faces attacks by multiple types of hackers and decides on the level of investment for cyber security countermeasures. We address the software monoculture issue by considering that there is common, popular software used by all firms and that it is a source of correlated risk. We analyse two types of cyber security interdependence breaching processes due to the software monoculture risk. We derive the probability distribution for the number of breaches and develop the cyber insurance pricing model. We also introduce the concept of cyber security defence level. Furthermore, we propose to determine the optimal cyber insurance price given a targeted defence level. Finally, we demonstrate the use of our model through several numerical examples.
    Keywords: Cyber Insurance; Breaching probability; Cyber security; Correlated Risks; software monoculture Risk; Defense Level; Integrated Risk Management Strategy.
    DOI: 10.1504/IJDATS.2024.10063783
     
  • A Machine Learning based Food Recommendation System with Nutrition Estimation
    by Anupama Nandeppanavar, Medha Kudari, Prasanna Bammigatti, Kaveri Vakkund 
    Abstract: The human body needs energy, supplied as calories, to perform various activities. The proposed work is an efficient, user-friendly tool to assist calorie calculation. The system takes inputs such as height, weight, age, gender, and daily exercise level to estimate the recommended daily caloric intake. To achieve this, three machine learning models, K-Nearest Neighbors (KNN), Decision Tree and Random Forest, are employed to enhance the accuracy of predictions. Model accuracy achieved is 96.4% for KNN, 97.1% for Decision Tree and 96.8% for Random Forest. In addition to providing personalized caloric intake recommendations, the proposed system also offers diet recipes for breakfast, lunch and dinner tailored to the individual's specific needs and preferences. Through the integration of machine learning algorithms, a user-friendly GUI, and personalized diet recommendations, the project aims to promote healthier eating habits and overall well-being for users.
    Keywords: Body Mass Index; Calorie; Dietary; Recipes; Data processing; Visualization; User interface.
    DOI: 10.1504/IJDATS.2024.10064354
     
  • Novel Approach for Depression Detection on Reddit Post
    by Tushtee Varshney, Sonam Gupta, Lipika Goel, Ishaan Saxena, Arjun Singh, Arun Yadav, Pradeep Gupta 
    Abstract: Psychotic disorder is one of the major health problems found in humans. Almost every age group of the population is affected by a psychotic disorder called depression. Depression causes low mood and loss of interest, idleness during working time, and irregularities in sleep and eating habits. The emotional feelings behind a text are detected by a machine learning technique called sentiment analysis or psychological analysis. In this study, we took Reddit as the social platform from which to collect datasets and studied the hidden behaviour of individuals using the machine learning algorithms logistic regression, naive Bayes, decision tree, and XGBoost, the deep learning classifier CNN, and maximum entropy. The classifiers are first studied individually on the dataset; the proposed model is then created using the classifiers logistic regression, multilayer perceptron, and XGBoost, with an accuracy of approx. 93% and precision of 95%.
    Keywords: Machine learning; depression; XGBoost; Reddit; Multilayer perceptron; Logistic regression; Psychotic disorder; Deep Learning.
    DOI: 10.1504/IJDATS.2024.10064390
     
  • Question optimisation: Building quiz bowl tournament sets
    by Kara Combs, Trevor Bihl 
    Abstract: Quiz Bowl is an activity in which players test their knowledge against others in tournaments. Quiz Bowl set organisation is a lengthy and involved process with many expectations related to the set's content and quality. Current techniques to address question placement rely on lengthy, manually edited databases, if any. Ensuring a set meets all expectations is vital to producing a high-quality set that is suitable for competition. We propose a repeatable methodology for optimising question placement, implemented in both Python and Excel, to be compared to the traditional manual method. On the initial data, the baseline manually produced set was matched qualitatively by the other methods, which also offered repeatability, traceability, and a reduction in time spent. These results were furthermore supported by a three-way comparison of a portion of the real-world 2022 state competition questions by the Head Editor, who recommended the Python version for future use.
    Keywords: quiz bowl; quizbowl; optimisation; simplex linear programming; placement.
    DOI: 10.1504/IJDATS.2024.10064492
     
  • Machine Learning Made Easy: A Beginner's Guide for Causal Inference and Discovery Methods using Python
    by Irfan Saleem, Ali Irfan 
    Abstract: Machine learning is widely recognised and extensively used for data modelling and prediction across fields such as business and healthcare to support informed decision-making. Numerous machine learning algorithms have been devised and deployed across multiple programming languages over the preceding decades for causal inference and discovery. This research briefly introduces causal inference and discovery methods, accompanied by Python code for beginners. First, this study discusses machine learning in brief. Second, it differentiates between causal discovery and causal inference. Third, it describes popular machine learning methods. Finally, it demonstrates the practical uses of causal inference and discovery packages in Python. The study recommends future research and discusses implications for using machine learning methods.
    Keywords: Python; Machine Learning; Causal discovery (CD); CausalInference (CI); Linear Regression; Peter-Clark (PC) algorithm.
    DOI: 10.1504/IJDATS.2025.10064732
     
  • Brain Tumour Detection and Multi Classification Using GNB-Based Machine Learning Architecture
    by Satish N. Gujar, Ashish Gupta, Sanjaykumar P. Pingat, Rashmi Pandey, Atul Kumar, Deepak Gupta, Priya Pise 
    Abstract: Brain tumours are abnormal tissues with rapidly reproducing cells, posing significant challenges for identification and treatment. This study proposes a multimodal approach using machine learning and medical techniques for early diagnosis and segmentation of brain tumours. Noisy magnetic resonance imaging (MRI) scans are processed with a geometric mean to simplify noise removal. Fuzzy c-means algorithms segment the images, aiding in the detection of specific areas of interest. The grey-level co-occurrence matrix (GLCM) algorithm is used for dimension reduction and feature extraction. Various machine learning techniques, including Convolutional Neural Networks (CNN), Artificial Neural Networks (ANN), Support Vector Machine (SVM), Gaussian Naive Bayes (NB), and Adaptive Boosting, classify the images. Among these methods, Gaussian NB is particularly effective for identifying and classifying brain tumours. This approach leverages advanced AI and neural network techniques to enhance early diagnosis and improve treatment outcomes.
    Keywords: Machine Learning; GLCM; Gaussian Naive Bayes; Adaptive boosting; MRI.
    DOI: 10.1504/IJDATS.2025.10064741
     
  • Establishing the Significance of Spiritual Environment with the Effects of Herbals: An Empirical Approach on Students' QoL in South Asian Continent
    by Rohit Rastogi, Mamta Saxena, Richa Singh, Yati Varshney, Pranav Sharma, Vaibhav Aggarwal 
    Abstract: In the past few years, humanity has faced many challenges due to the outbreak of the coronavirus pandemic. Even in such difficulty the world cannot stop growing, so we adopted various lifestyle changes, such as working from home and online education, to get our jobs done. These changes have affected us physically and mentally in many ways, and their major impact is seen on adolescents (age group of 10-18 years), who are being diagnosed with many physical and mental health ailments, such as obesity, lack of mobility, stress and mental trauma, due to less exposure to the outside environment and a sudden increase in social media use. All these aspects hinder the overall growth needed in that age group. This study therefore examines the impact of some techniques based on Indian Vedic science to rejuvenate the physical and mental health of adolescents and to counter post-COVID effects.
    Keywords: Adolescents; Obesity; Mental Trauma; Saraswati Panchak; Yoga; Yajna; Control Group; Experimental Group.
    DOI: 10.1504/IJDATS.2024.10064819
     
  • Application of Text Mining Analysis in Understanding GameFi Adoption
    by Yimiao Zhang, Jing Ren, Wenting Liu, Ding Ding 
    Abstract: The blockchain-based gaming industry has been expanding over the past two years, but the GameFi sector has yet to solve its biggest problem: the lack of mass gamer adoption. In this work, text mining was leveraged to study the adoption status of GameFi and explore the possible requirements and concerns of game players regarding blockchain games. Quora questions relating to GameFi were collected to examine the key topics discussed by GameFi users or potential users. Our findings disclosed that GameFi is in the early stage of the innovation diffusion process and has not been widely adopted by the public. Individuals are concerned about the risk and return of play-to-earn (P2E) games, and some potential users are deterred by the high entry barriers of GameFi. Through studying the opinions of players or potential players, this study sheds some light on the possible strategies for improving blockchain game design in the near future.
    Keywords: GameFi; P2E; Mass Adoption; Text Analysis.
    DOI: 10.1504/IJDATS.2025.10064876
     
  • Tackling Data Sparsity: A Hybrid Filtering Paradigm for Robust Recommender Systems
    by Umarani Srikanth, Lijetha C. Jaffrin, Sushmitha Srikanth, Shyam Ramesh 
    Abstract: This paper introduces a hybrid recommender system approach that aims to tackle the problems associated with data sparsity, also referred to as the cold start problem. Recommender systems use user preferences to filter information. To improve recommendation accuracy, our method combines user-based and content-based collaborative filtering techniques. More specifically, content-based filtering takes over when there is little data. When there is a high degree of user similarity, user-based collaborative filtering is used to maximise accuracy by suggesting diverse items. This strategy can be used in a variety of fields, including e-commerce, music, books, and film.
    Keywords: hybrid filtering; recommender systems; collaborative filtering; Singular Value Decomposition (SVD); machine learning; k-nearest neighbors.
    DOI: 10.1504/IJDATS.2025.10064959
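
The switching rule described in the abstract above (content-based filtering when data is scarce, user-based collaborative filtering when a sufficiently similar user exists) might look roughly like this. It is a sketch under stated assumptions, not the paper's algorithm: the function names, the `min_ratings` and `min_similarity` thresholds, the tag-overlap scoring and the sample data are all hypothetical.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity of two {item: rating} dicts; unrated items count as 0."""
    dot = sum(u[i] * v[i] for i in set(u) & set(v))
    norm_u = sqrt(sum(r * r for r in u.values()))
    norm_v = sqrt(sum(r * r for r in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def recommend(user, ratings, item_tags, min_ratings=3, min_similarity=0.8):
    """Return (strategy, item). Both thresholds are illustrative placeholders."""
    own = ratings.get(user, {})
    if len(own) >= min_ratings:
        # Enough history: look for the most similar other user.
        sim, neighbour = max(
            ((cosine(own, r), other) for other, r in ratings.items() if other != user),
            default=(0.0, None),
        )
        if sim >= min_similarity:
            unseen = {i: s for i, s in ratings[neighbour].items() if i not in own}
            if unseen:
                return "user-based", max(unseen, key=unseen.get)
    # Sparse data or no close neighbour: fall back to tag overlap with rated items.
    liked_tags = set().union(*(item_tags.get(i, set()) for i in own))
    candidates = [i for i in item_tags if i not in own]
    if not candidates:
        return "content-based", None
    return "content-based", max(candidates, key=lambda i: len(item_tags[i] & liked_tags))

ratings = {
    "ana": {"film_a": 5, "film_b": 4, "film_c": 5},
    "ben": {"film_a": 5, "film_b": 4, "film_c": 5, "film_d": 4},
    "cy": {"film_a": 4},
}
item_tags = {
    "film_a": {"sci-fi", "space"}, "film_b": {"drama"}, "film_c": {"sci-fi"},
    "film_d": {"comedy"}, "film_e": {"space", "sci-fi"},
}
print(recommend("ana", ratings, item_tags))  # ('user-based', 'film_d')
print(recommend("cy", ratings, item_tags))   # ('content-based', 'film_e')
```

Here "ana" has enough ratings and a close neighbour, so the suggestion comes from that neighbour's history, while "cy" has a single rating and is served from item tags instead.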
     
  • Using the BIRCH Algorithm and Affinity Propagation, an Advanced Descriptor for Video Processing
    by Jayanta Mondal, Jitendra Pramanik, Satyajit Pattnaik, Bijay Paikaray 
    Abstract: Video summarisation is the most preferred approach to administer the augmentation of video content. In the areas of video surveillance and object and intrusion detection, video summarisation has been the most popular as it provides concise and less redundant information. As video content continues to expand quickly, an automatic video summary would be helpful for anyone who wants to learn more quickly and with less effort. Most existing methods depend on various network architectures to train a single score predictor for shot rating and selection. This study addresses the issue of video summarisation, which involves selecting significant frames to succinctly and comprehensively express the material of the original video. The current paper presents a comparative study of the application of the advanced texture descriptors Local Phase Quantization (LPQ), Local Ternary Pattern (LTP), and Local Binary Pattern (LBP) in the process of video summarisation. Clusters of key frames have been extracted by the unsupervised learning algorithms Affinity Propagation and BIRCH. The proposed video summarisation method has shown good trial results.
    Keywords: Local Ternary Pattern; Local Binary Pattern; Affinity Propagation; Local Phase Quantization; BIRCH; Key Feature.
    DOI: 10.1504/IJDATS.2025.10065080
     
  • Streamlining Checks Processing: Advancing Arabic Handwriting Verification with a CNN-Based System
    by Hamza Benyezza, Reda Kara, Mounir Bouhedda, Mosaab Benhadjer, Patrice Wira, Samia Rebouh 
    Abstract: Arabic handwriting analysis and verification pose challenges due to their unique characteristics. Deep learning techniques have gained prominence in computer vision for their ability to learn from data. This study proposes a high-speed and precise solution using a convolutional neural network (CNN) to automate the verification process of Algerian postal checks written in Arabic handwriting. The solution consists of hardware and software components. The software includes four CNN models to identify the check's ID number (CID), user's signature (US), handwriting courtesy (HCA), and legal amount (HLA). The hardware setup involves a camera connected to a Raspberry Pi 3. Test results demonstrate the proposed approach's effectiveness, achieving accuracies of 100% for CID, 98.61% for US, 99.28% for HCA, and 96.35% for HLA. This comprehensive system offers a promising solution for efficient verification of Arabic handwritten postal checks.
    Keywords: Deep learning; CNN; computer vision; Arabic handwriting; check verification.
    DOI: 10.1504/IJDATS.2024.10065082
     
  • Prediction of Success Factors for Mobile Application using Machine Learning Technique
    by Jyoti Deone, Nilima Dongre, Mohammad Atique Junaid 
    Abstract: The remarkable boom in the mobile market has attracted many developers to build mobile apps. However, the majority of developers struggle to generate earnings. For those developers, knowing the characteristics of successful apps may be vital. We propose an approach which examines the categories of apps by two factors. First, the correlation between app features is measured; second, concepts are extracted from apps to understand the common themes present in them. For this, we selected 3,000 applications available in the Google Play Store. The observations indicate a strong correlation between user rating and the number of app downloads, but no correlation between price and downloads, nor between price and rating. Moreover, we find concepts unique to highly rated apps and to low-rated apps. The correlations, along with the concepts, prove useful for application developers to understand market trends and customer demand more easily than earlier approaches.
    Keywords: Android; Latent Semantic Analysis; Correlation Analysis; Concept Extraction.
    DOI: 10.1504/IJDATS.2025.10065137
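
A correlation measurement of the kind the abstract above describes can be illustrated with the Pearson coefficient. The abstract does not say which coefficient the authors used, so Pearson is an assumption here, and the rating and download figures are invented for the example.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)

# Hypothetical per-app figures, not taken from the paper's data set.
ratings = [3.1, 3.8, 4.0, 4.4, 4.7]
downloads_thousands = [12, 40, 55, 90, 130]
r = pearson(ratings, downloads_thousands)
print(f"rating vs downloads: r = {r:.2f}")
```

A value near +1 or -1 indicates a strong linear relationship and a value near 0 indicates none; with skewed quantities such as download counts, a rank-based coefficient may be a better fit.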
     
  • Nutritional Cluster Analysis of Leguminous Food Sources Across West Africa
    by Donald D. Atsa'am, Gabriel S. Iorundu, Moses T. Ukeyima 
    Abstract: The present form of the data on West African legumes reported in the West Africa Food Composition Table (WAFCT) does not reflect sub-groupings based on (dis)similarity in nutritive value. A possible consequence is that an uninformed user interested in leguminous food could randomly pick any from the data, since all are summarily classified as one family in the WAFCT. To resolve this, the objective of this study was to apply the clustering technique to form sub-groups based on similarity in nutritional content. Three clusters were extracted, and unique properties have been established for food sources in each cluster at the granular level of nutrients. Going by the clustering, users who are interested or not interested in a particular content could look up the cluster with a lower, moderate, or higher content of the desired or non-desired element. The results are useful in the selection of raw materials, formulation of nutritional guidelines, and food labelling.
    Keywords: Legumes; nutritional analysis; legumes food sources; West Africa food composition table; k-means clustering.
    DOI: 10.1504/IJDATS.2025.10065146
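
The k-means clustering named in the keywords above can be sketched in a few lines. This is a bare-bones version of Lloyd's algorithm with a deterministic initialisation, written for illustration only: the nutrient values are invented rather than taken from the WAFCT, and the two-feature layout (protein and fat per 100 g) is an assumption.

```python
def kmeans(points, k, iterations=20):
    """Lloyd's algorithm; the first k points seed the centroids (a simplification)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical (protein g, fat g) values per 100 g for six legume samples.
samples = [(8.0, 1.0), (22.0, 1.5), (36.0, 18.0), (9.0, 0.8), (23.0, 2.0), (35.0, 19.0)]
labels, centroids = kmeans(samples, k=3)
print(labels)  # [0, 1, 2, 0, 1, 2]
```

With real composition data the features would normally be standardised first so that nutrients measured on large scales do not dominate the distance, and the initial centroids would be chosen more carefully than by taking the first k rows.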
     
  • Jasminum Grandiflorum Flower Images Classification: Deep Learning and Transfer Learning Models with the Influence of Preprocessing via Contours and Convex Hull in Agritech 4.0
    by A. Anushya, Savita Shiwani, Ayush Shrivastava 
    Abstract: This study centres on classifying Jasminum Grandiflorum flowers through deep learning and transfer learning techniques. To achieve this, the research leverages deep learning models such as CNNs, along with transfer learning using pre-trained architectures like VGG16, VGG19, ResNet18, and Vision Transformer (ViT). CNN stood out, excelling after extensive iterations. VGG16 and VGG19 showed solid performance with fewer iterations, indicating competence in shorter training times. ResNet18 achieved the highest accuracy with fewer iterations but took longer (about 8 minutes per epoch), balancing efficiency and accuracy. ViT impressed with high accuracy despite needing more iterations, showcasing prowess in intricate learning and pattern recognition in the Jasminum Grandiflorum flower image dataset. The intended outcome of this research is to contribute significantly to the advancement of Agritech 4.0 by establishing a robust methodology for accurate Jasminum Grandiflorum flower classification without human participation.
    Keywords: Convolutional Neural Network; VGG16; VGG19; ResNet18; Vision Transformer; Jasminum Grandiflorum; AgriTech 4.0.
    DOI: 10.1504/IJDATS.2025.10065343
     
  • Prediction Model for AQI through Indian Vedic Science: Knowledge Management Technique to Control Pollution and for Sustainable Society
    by Rohit Rastogi, Saransh Chauhan, Yash Rastogi, Vaibhav Aggarwal, Utkarsh Agrawal, Richa Singh 
    Abstract: The paper outlines how Indian Vedic Sciences can be used for preventing and predicting the ill effects of pollution on the human body and nature by adopting the simple methods of Yajna and Hawan in the daily routine. Compared with other resources such as land and water, air is considered the most important. Evidence shows that Indian Vedic Sciences primarily focus on prana vayu, which means the air that we breathe. The authors' team and the Central Pollution Control Board (CPCB) have gathered data and readings from the last four months through sensors installed in an isolated as well as a non-isolated environment that was continuously under the effects of Yajna and Hawan.
    Keywords: AQI; PM 2.5; PM 10; Climate Change; Yajna; Mantra; Human Health; Economic Growth; Knowledge Management; Knowledge Pyramid; Sustainable Society; Knowledge Levels and Extractions.
    DOI: 10.1504/IJDATS.2025.10065356