Forthcoming and Online First Articles

International Journal of Biometrics

International Journal of Biometrics (IJBM)

Forthcoming articles have been peer-reviewed and accepted for publication but are pending final changes, are not yet published and may not appear here in their final order of publication until they are assigned to issues. Therefore, the content conforms to our standards but the presentation (e.g. typesetting and proof-reading) is not necessarily up to the Inderscience standard. Additionally, titles, authors, abstracts and keywords may change before publication. Articles will not be published until the final proofs are validated by their authors.

Forthcoming articles must be purchased for the purposes of research, teaching and private study only. These articles can be cited using the expression "in press". For example: Smith, J. (in press). Article Title. Journal Title.

Articles marked with this shopping trolley icon are available for purchase - click on the icon to send an email request to purchase.

Online First articles are published online here, before they appear in a journal issue. Online First articles are fully citeable, complete with a DOI. They can be cited, read, and downloaded. Online First articles are published as Open Access (OA) articles to make the latest research available as early as possible.

Articles marked with this Open Access icon are Online First articles. They are freely available and openly accessible to all without any restriction except the ones stated in their respective CC licenses.

Register for our alerting service, which notifies you by email when new issues are published online.

International Journal of Biometrics (31 papers in press)

Regular Issues

  • GenVeins: An Artificially Generated Hand Vein Database   Order a copy of this article
    by Emile Beukes, Hanno Coetzer 
    Abstract: An artificially generated dorsal hand vein database called "GenVeins" (see Beukes (2023)) is developed in this study for the purpose of acquiring sets of fictitious training and validation individuals which are large enough to represent the entire population. The development of said database is motivated by experimental results which indicate that system proficiency is severely impaired when training on an insufficient number of different individuals. A number of dorsal hand vein-based authentication systems are proposed in this study for the purpose of determining whether or not the utilisation of the GenVeins database may increase system proficiency when compared to training and validating the proposed systems on small sets of different individuals. The results clearly indicate that the utilisation of the GenVeins database significantly increases system proficiency when compared to the scenario in which an insufficient number of different individuals are utilised for training and validation.
    Keywords: biometric authentication; hand vein; deep learning; similarity measure networks; siamese networks; two-channel networks; segmentation; artificial data; convolutional neural networks.
    DOI: 10.1504/IJBM.2024.10062266
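
The abstract above mentions siamese and two-channel similarity-measure networks for vein matching. Below is a minimal, illustrative siamese sketch in Keras; the layer sizes, the 128x128 greyscale input and the contrastive-loss margin are assumptions, not the authors' architecture.

```python
# Minimal siamese similarity network for vein images (illustrative only).
import tensorflow as tf
from tensorflow.keras import layers, Model

def embedding_net(input_shape=(128, 128, 1)):
    inp = layers.Input(shape=input_shape)
    x = layers.Conv2D(32, 3, activation="relu", padding="same")(inp)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(64, 3, activation="relu", padding="same")(x)
    x = layers.MaxPooling2D()(x)
    x = layers.Flatten()(x)
    x = layers.Dense(128)(x)
    return Model(inp, x, name="embedding")

emb = embedding_net()
img_a = layers.Input(shape=(128, 128, 1))
img_b = layers.Input(shape=(128, 128, 1))
# Euclidean distance between the two embeddings drives the match decision.
dist = layers.Lambda(
    lambda t: tf.sqrt(tf.reduce_sum(tf.square(t[0] - t[1]), axis=1, keepdims=True) + 1e-9)
)([emb(img_a), emb(img_b)])
siamese = Model([img_a, img_b], dist)

# Contrastive loss: small distance for genuine pairs (label 1), large for impostors (label 0).
def contrastive_loss(y_true, d, margin=1.0):
    y_true = tf.cast(y_true, d.dtype)
    return tf.reduce_mean(y_true * tf.square(d) +
                          (1.0 - y_true) * tf.square(tf.maximum(margin - d, 0.0)))

siamese.compile(optimizer="adam", loss=contrastive_loss)
```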
     
  • An Intelligent Approach to Detect Facial Retouching using Fine Tuned VGG16   Order a copy of this article
    by Kinal Sheth 
    Abstract: It is common practice to digitally edit or ‘retouch’ facial images for various purposes, such as enhancing one’s appearance on social media or matrimonial sites, or even as authentic proof. When regulations are not strictly enforced, it is easy to manipulate digital data, as editing tools are readily available. In this paper, we apply a transfer learning approach, fine-tuning a pre-trained VGG16 model with ImageNet weights to classify retouched face images from the standard ND-IIITD faces dataset. The study also places strong emphasis on the choice of optimisers used during the training and fine-tuning stages to achieve faster convergence and better overall performance. Our approach achieves a training accuracy of 99.54% and a validation accuracy of 98.98% for the fine-tuned VGG16 with the RMSprop optimiser, and an overall accuracy of 97.92% on the two-class (real vs. retouched) classification of the ND-IIITD dataset.
    Keywords: Adam; retouching; RMSprop; transfer learning; TL; VGG16.
    DOI: 10.1504/IJBM.2024.10062315
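
As a rough illustration of two-stage VGG16 transfer learning with an RMSprop optimiser, in the spirit of the abstract above; the head layers, learning rates and the unfrozen block are assumed values, not the paper's settings.

```python
# Stage 1: train a new classification head on frozen ImageNet features;
# Stage 2: unfreeze the last convolutional block and fine-tune at a lower learning rate.
import tensorflow as tf
from tensorflow.keras import layers, Model
from tensorflow.keras.applications import VGG16

base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False
x = layers.GlobalAveragePooling2D()(base.output)
x = layers.Dense(256, activation="relu")(x)
out = layers.Dense(1, activation="sigmoid")(x)          # real vs. retouched
model = Model(base.input, out)
model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=1e-4),
              loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=10)

for layer in base.layers:                               # unfreeze only block5_*
    layer.trainable = layer.name.startswith("block5")
model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=1e-5),
              loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=10)
```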
     
  • Multi-view multi-input CNN-based architecture for diagnosis of Alzheimer's disease in its prodromal stages   Order a copy of this article
    by Mohamed Amine Zayene, Hend Basly, Fatma Ezahra Sayadi 
    Abstract: Alzheimer’s disease (AD) is a progressive neurodegenerative brain disorder, the leading cause of dementia, characterised by memory loss and cognitive decline affecting daily life. Early detection is crucial for effective treatment. 18F-FDG-PET is the most accurate clinical test for AD diagnosis, yet current methods often involve laborious data preprocessing. Thus, we propose utilising deep learning techniques, known for their effectiveness. Our study introduces a 3D convolutional neural network (3D CNN) capable of learning inter and intra-slice information simultaneously. We evaluated our method on 540 subjects from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database, including normal controls (CN), early and late mild cognitive impairment (EMCI, LMCI), and AD subjects. Results demonstrate an 85.71% accuracy in CN vs. EMCI vs. LMCI vs. AD classification on the ADNI database.
    Keywords: Alzheimer’s disease diagnosis; FDG-PET neuroimaging data; convolutional neural networks; CNN; multi-view; multi-input.
    DOI: 10.1504/IJBM.2024.10063995
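
A minimal 3D CNN sketch for four-class staging of PET volumes, in the spirit of the abstract above; the volume size and layer widths are illustrative assumptions rather than the authors' design.

```python
# Toy 3D CNN over (depth, height, width, channel) PET volumes for CN/EMCI/LMCI/AD staging.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(64, 64, 64, 1)),
    layers.Conv3D(16, 3, activation="relu", padding="same"),
    layers.MaxPooling3D(2),
    layers.Conv3D(32, 3, activation="relu", padding="same"),
    layers.MaxPooling3D(2),
    layers.Conv3D(64, 3, activation="relu", padding="same"),
    layers.GlobalAveragePooling3D(),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.3),
    layers.Dense(4, activation="softmax"),   # CN, EMCI, LMCI, AD
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```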
     
  • Enabling secure authentication using fingerprint and visual cryptography   Order a copy of this article
    by Sneha Annappanavar, Pallavi Chavan 
    Abstract: Biometrics is information associated with an individual that enables unique identification and verification across different platforms. The fingerprint is an important biometric trait and has become the most popular for authentication and authorisation. However, maintaining the secrecy of fingerprint data over the cloud is a challenging task. This paper presents a novel encryption approach using visual cryptography that encrypts fingerprints and stores them in the form of shares. The visual cryptography scheme implemented in this paper is expansionless and has no greying effect. The authors collected fingerprint data from residential societies. The reconstructed fingerprints achieve the highest peak signal-to-noise ratio when compared with state-of-the-art visual cryptography schemes, and a mean square error of zero, enabling 100% correct identification of the fingerprint. The paper also presents a secure voting mechanism using fingerprint authentication for general elections.
    Keywords: biometrics; fingerprint recognition; confidentiality; authentication; visual cryptography; secret sharing; shares.
    DOI: 10.1504/IJBM.2024.10064524
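
For context, one well-known expansionless, lossless (2, 2) sharing construction is XOR-based; the sketch below shows that style of share generation together with the MSE/PSNR check, and is not necessarily the scheme proposed in the paper above.

```python
# Illustrative XOR-based (2, 2) visual-cryptography-style sharing with an MSE/PSNR check.
import numpy as np

def make_shares(secret: np.ndarray, rng=np.random.default_rng()):
    """Split a binary fingerprint image (0/1) into two same-sized, random-looking shares."""
    share1 = rng.integers(0, 2, size=secret.shape, dtype=np.uint8)
    share2 = np.bitwise_xor(secret, share1)      # each share alone reveals nothing
    return share1, share2

def reconstruct(share1, share2):
    return np.bitwise_xor(share1, share2)        # exact recovery, so MSE = 0

def mse_psnr(original, recovered, peak=1.0):
    mse = np.mean((original.astype(float) - recovered.astype(float)) ** 2)
    psnr = float("inf") if mse == 0 else 10 * np.log10(peak ** 2 / mse)
    return mse, psnr

secret = (np.random.rand(64, 64) > 0.5).astype(np.uint8)   # stand-in for a binarised fingerprint
s1, s2 = make_shares(secret)
rec = reconstruct(s1, s2)
print(mse_psnr(secret, rec))    # (0.0, inf) -- lossless reconstruction
```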
     
  • American sign language classification using deep learning   Order a copy of this article
    by Harsh Parikh, Nisarg Panchal, Vraj Patel, Ankit Sharma 
    Abstract: Image classification is a process that involves analysing and extracting useful information from an image. It addresses a wide range of real-world problems and has applications in artificial intelligence, robotics, biomedical imaging, and motion recognition, among many others. In this paper, we apply support vector machines (SVM), decision trees (DT), k-nearest neighbours (kNN), convolutional neural networks (CNN), VGG-16, ResNet-50, MobileNet-V2, and DenseNet-201 to an American sign language dataset. The paper describes a system that uses deep learning and machine learning to recognise gestures in images and assign the corresponding English letter to each gesture. The comparison metrics and performance of all these models are studied and documented, making the results useful for the classification of American sign language.
    Keywords: image classification; convolutional neural network; American sign language; ASL; decision trees; k-nearest neighbour; support vector machine; SVM; transfer learning; VGG-16; ResNet-50; DenseNet-201; MobileNet-V2.
    DOI: 10.1504/IJBM.2024.10064116
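
A toy comparison of the classical classifiers named above on flattened gesture images; the synthetic arrays stand in for the ASL dataset, and the hyperparameters are arbitrary assumptions.

```python
# Compare SVM, kNN and a decision tree on flattened greyscale gesture images (toy data).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X = np.random.rand(500, 64 * 64)        # stand-in for flattened 64x64 gesture images
y = np.random.randint(0, 26, 500)       # stand-in for letter labels A-Z
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

for name, clf in [("SVM", SVC(kernel="rbf", C=10)),
                  ("kNN", KNeighborsClassifier(n_neighbors=5)),
                  ("Decision tree", DecisionTreeClassifier(max_depth=20))]:
    clf.fit(X_tr, y_tr)
    print(name, accuracy_score(y_te, clf.predict(X_te)))
```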
     
  • Feature ranking for effective continuous user authentication using keystroke and mouse dynamics with the cat recurrent neural model   Order a copy of this article
    by Princy Ann Thomas, Preetha Mathew Keerikkattil 
    Abstract: Behavioural biometric modalities such as keystroke and mouse dynamics are ideal for continuous user authentication because of their non-intrusive nature. The success of an authentication framework is largely determined by the discriminative power of the features used, so selecting the most discriminative features is critical for optimal authentication performance. In this research, we apply multiple ranking algorithms to features derived from the temporal information of keystroke and mouse dynamics to assess their discriminative capacity. The ranked features are then employed for continuous authentication using the cat recurrent neural model (CRNM), which optimises the search space and authenticates users. This work thereby proposes a strategy for developing commercially deployable continuous authentication systems with broad applicability. Experiments are carried out with filter, wrapper, and embedded feature ranking approaches, and the authentication outcomes are compared within the CRNM framework. The findings indicate that discriminative information lies in uncommon rather than typical user behaviour. Furthermore, applying feature ranking reduces authentication time from 198 seconds to 138 seconds and improves accuracy from 98.25% to 99.21%.
    Keywords: ranking; temporal features; keystroke dynamics; mouse dynamics; cat swarm optimisation; recurrent neural model.
    DOI: 10.1504/IJBM.2024.10064403
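
A small sketch of filter-style feature ranking (mutual information) followed by top-k selection, the general step the abstract describes before authentication; the toy arrays and k=15 are assumptions, and CRNM itself is not reproduced.

```python
# Rank keystroke/mouse timing features by mutual information and keep the top k.
import numpy as np
from sklearn.feature_selection import mutual_info_classif

X = np.random.rand(500, 40)        # 40 timing features (dwell, flight, mouse velocity, ...)
y = np.random.randint(0, 2, 500)   # genuine user vs. impostor labels (toy data)

scores = mutual_info_classif(X, y, random_state=0)
ranking = np.argsort(scores)[::-1]          # feature indices, most informative first
top_k = ranking[:15]                        # keep the 15 highest-ranked features
X_selected = X[:, top_k]                    # reduced feature matrix for the authenticator
print("top features:", top_k)
```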
     
  • A systematic literature review and bibliometric analysis of signature verification spanning four decades   Order a copy of this article
    by Sameera Khan, Dileep Kumar Singh 
    Abstract: This article conducts a systematic literature review and bibliometric analysis spanning four decades of research in the field of signature verification (SV). SV holds substantial significance in practical domains like finance, law enforcement, and document authentication. The primary objective of this study is to offer a comprehensive overview of SV’s evolution, pinpoint research trends, and illuminate gaps within the existing literature. The review encompasses 1,552 studies published from 1982 to the present, with analysis focusing on various SV facets such as feature extraction, classification algorithms, datasets, evaluation metrics, and applications. The findings underscore substantial growth and diversification within the field, showcasing the development and testing of diverse approaches. Nevertheless, challenges such as the absence of standardised evaluation metrics and limited accessibility to public datasets emerge. The article concludes with a discourse on prospective directions for SV, considering the potential influence of emerging technologies like deep learning and biometric authentication on the field’s future.
    Keywords: signature verification; SV; bibliometric analysis; thematic evaluation; cluster analysis.
    DOI: 10.1504/IJBM.2025.10064620
     
  • An action recognition of track and field athletes based on Gaussian mixture model   Order a copy of this article
    by Qin Yang, Zhenhua Zhou 
    Abstract: To address the low recognition accuracy caused by the complexity of individual actions in track and field, a method of action recognition for track and field athletes based on a Gaussian mixture model is proposed. First, the data are analysed through the interaction of spatiotemporal features. Second, a low-pass filter is used to remove noise from the data and reduce computational cost. On the basis of the pre-processed data, the Hilbert-Huang transform (HHT) is used for feature extraction to capture and represent athletes' motion characteristics more accurately, thereby improving the accuracy of movement recognition. The Gaussian mixture model is then used to model the feature parameters, determine the number of mixture components, and initialise the model parameters, completing the movement recognition of track and field athletes. The experimental results show that traditional methods incur high computational cost and low recognition accuracy, whereas the proposed method has very low computational cost and a recognition accuracy of up to 98%. The comparison shows that this method offers low computational complexity, high accuracy, and good recognition performance.
    Keywords: Gaussian mixture model; GMM; interaction of spatiotemporal features; action data; low pass filter; athlete movement recognition.
    DOI: 10.1504/IJBM.2025.10064813
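
The generic pipeline in the abstract, low-pass filtering followed by fitting a Gaussian mixture model to extracted features, can be sketched as follows; the cut-off frequency, component count and stand-in feature arrays are illustrative choices, not the paper's values.

```python
# Low-pass filter a motion signal, then fit a GMM to feature vectors and assign components.
import numpy as np
from scipy.signal import butter, filtfilt
from sklearn.mixture import GaussianMixture

fs = 100.0                                   # sampling rate of the motion signal (Hz)
b, a = butter(4, 10.0 / (fs / 2), btype="low")
raw = np.random.randn(2000)                  # stand-in for a raw acceleration channel
smooth = filtfilt(b, a, raw)                 # noise suppressed before feature extraction

features = np.random.rand(300, 8)            # stand-in for HHT-derived feature vectors
gmm = GaussianMixture(n_components=5, covariance_type="full", random_state=0)
gmm.fit(features)
labels = gmm.predict(features)               # mixture component assigned to each vector
```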
     

Special Issue on: Advanced Technologies for Emotion Recognition

  • Identifying illegal actions method of basketball players based on improved genetic algorithm   Order a copy of this article
    by Zhenyu Zhu 
    Abstract: To reduce the time required to identify athlete violations and improve the recognition rate, this paper proposes a basketball player violation recognition method based on an improved genetic algorithm. First, the surface electromyographic (sEMG) signals of athletes are collected using a wireless sEMG acquisition device. Second, the signal acquisition locations are determined and the time-domain features of the signal are extracted. A composite filter is then used to denoise the signal. Finally, the genetic algorithm is improved by combining it with support vector machines to design an action recognition classifier, which outputs the results of illegal action recognition. Experiments show that the method improves the recognition rate by 9.44% and recognises basketball players' illegal actions well within 0.5 minutes.
    Keywords: improved genetic algorithm; wavelet transform threshold denoising; action recognition classifier; surface electromyographic signal; sEMG.
    DOI: 10.1504/IJBM.2024.10060862
     
  • A multistate pedestrian target recognition and tracking algorithm in public places based on Camshift algorithm   Order a copy of this article
    by GaoFeng Han, Yuanquan Zhong 
    Abstract: To improve the accuracy of multistate pedestrian target recognition and tracking and to shorten tracking time, this paper proposes a multistate pedestrian target recognition and tracking algorithm for public places based on the Camshift algorithm. First, the input image is converted to greyscale and HOG is used to select multistate pedestrian target features in public places. Then, the probability density of the target area model is calculated and a pedestrian target recognition and tracking model is constructed. Finally, the colour features of the target are extracted, the Bhattacharyya coefficient is used to calculate the similarity between the target model and the candidate model, and the Camshift algorithm is applied for target recognition, tracking, and matching to obtain the final results. The experimental results show that the accuracy of the proposed method reaches 97.78% and the processing time is only 0.082 s per frame, indicating that the proposed method effectively improves target recognition and tracking performance.
    Keywords: gamma correction method; Camshift algorithm; HSV space; hue histogram; search window.
    DOI: 10.1504/IJBM.2024.10061224
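
A generic hue-histogram back-projection plus cv2.CamShift tracking loop, the mechanism the abstract builds on; the video path, initial window and termination criteria are placeholder assumptions.

```python
# Track a pedestrian window with CamShift over a hue back-projection (illustrative only).
import cv2
import numpy as np

cap = cv2.VideoCapture("pedestrians.mp4")        # placeholder video path
ok, frame = cap.read()
if not ok:
    raise SystemExit("could not read the video")

x, y, w, h = 200, 150, 60, 120                   # assumed initial pedestrian window
roi_hsv = cv2.cvtColor(frame[y:y+h, x:x+w], cv2.COLOR_BGR2HSV)
hist = cv2.calcHist([roi_hsv], [0], None, [180], [0, 180])   # hue histogram of the target
cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)
criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)

track_window = (x, y, w, h)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    backproj = cv2.calcBackProject([hsv], [0], hist, [0, 180], 1)
    rot_rect, track_window = cv2.CamShift(backproj, track_window, criteria)
    pts = cv2.boxPoints(rot_rect).astype(np.int32)   # rotated box around the tracked pedestrian
    cv2.polylines(frame, [pts], True, (0, 255, 0), 2)
```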
     
  • Rapid recognition of athlete's anxiety emotion based on multimodal fusion   Order a copy of this article
    by Li Wang 
    Abstract: The diversity of anxiety emotions and individual differences among different athletes have increased the difficulty of emotion recognition. To address this, a rapid recognition method of athlete's anxiety emotion based on multimodal fusion is proposed. Wireless sensor networks are used to collect facial expression images of athletes, and wavelet transform is applied for denoising the collected images. Image features are extracted using grey-level co-occurrence matrix, and the athlete's facial expression images are normalised. Features related to the athlete's emotions, such as voice characteristics, facial expression features, and physiological indicators, are obtained. These features from different perceptual modalities are fused to achieve rapid recognition of athletes' anxiety emotions. The test results demonstrate that this method not only improves the image denoising effect but also achieves high accuracy and efficiency in emotion recognition, enabling accurate and real-time recognition of athletes' emotions.
    Keywords: multimodal fusion; rapid recognition; wireless sensor networks; wavelet transform.
    DOI: 10.1504/IJBM.2024.10060859
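
A short sketch of grey-level co-occurrence matrix (GLCM) texture features, the image descriptor named in the abstract; the distances, angles and property set are common defaults, not the paper's settings.

```python
# Extract GLCM texture features from a greyscale face patch for later multimodal fusion.
import numpy as np
from skimage.feature import graycomatrix, graycoprops

face = (np.random.rand(96, 96) * 255).astype(np.uint8)   # stand-in for a denoised face patch
glcm = graycomatrix(face, distances=[1, 2], angles=[0, np.pi / 4, np.pi / 2],
                    levels=256, symmetric=True, normed=True)
feats = np.hstack([graycoprops(glcm, prop).ravel()
                   for prop in ("contrast", "homogeneity", "energy", "correlation")])
print(feats.shape)   # one texture feature vector per face patch
```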
     
  • Facial micro-expression recognition method based on CNN and transformer mixed model   Order a copy of this article
    by Yi Tang, Jiaojun Yi, Feigang Tan 
    Abstract: Existing methods for facial micro-expression recognition suffer from low efficiency and accuracy. Therefore, a facial micro-expression recognition method based on a hybrid CNN-transformer model is proposed. Hierarchical facial features are extracted using the hybrid CNN-transformer model and used as inputs to a deep network. At the same time, the facial micro-expression image region is segmented and the image is smoothed by thresholding to obtain the feature vectors of the facial micro-expression. These feature vectors are fed into the hybrid CNN-transformer model to recognise facial micro-expressions. The experimental results show that the proposed method can recognise facial micro-expressions in complete or incomplete images, with the recognition delay kept below 5 ms. In addition, compared with traditional methods, this method achieves a higher average recognition accuracy of up to 98%.
    Keywords: CNN; transformer mixed model; micro-expression of human face; recognition method.
    DOI: 10.1504/IJBM.2024.10060860
     
  • Athlete facial micro-expression recognition method based on graph convolutional neural network   Order a copy of this article
    by Haochen Xu, Zhiqiang Zhu 
    Abstract: The recognition accuracy of athlete facial micro-expressions is low because of insufficient consideration of the data, failure to remove invalid samples, and inaccurate extraction of micro-expression features. To this end, a new method for athlete facial micro-expression recognition based on graph convolutional neural networks is studied. First, the athlete facial data are pre-processed using facial alignment, frame unification, and optical flow extraction algorithms. Then, the graph convolutional neural network is used to extract athlete facial micro-expression features. Finally, to improve micro-expression recognition performance, a classification layer is added before the output layer of the network, and a support vector machine is introduced to optimise the graph convolutional neural network and adjust the discriminative boundaries between categories, achieving more accurate and effective micro-expression recognition. The experimental results show that the proposed method can accurately extract micro-expression features, with a recognition accuracy of 97.0% and good convergence, effectively improving the recognition performance.
    Keywords: graph convolutional neural network; facial micro-expression; support vector machine; optical flow extraction algorithm; unified frame.
    DOI: 10.1504/IJBM.2024.10061500
     
  • Multimodal emotion detection of tennis players based on deep reinforcement learning   Order a copy of this article
    by Wenjia Wu 
    Abstract: Research on the multimodal emotion detection of tennis players is of great significance for understanding their psychological state and improving technical performance. To address the high detection error and low recall rate of traditional detection methods, a multimodal emotion detection method for tennis players based on deep reinforcement learning is designed. The facial expressions, speech emotion signals, and body-behaviour emotion feature parameters of tennis players are extracted, and the resulting emotion feature parameters are used as input vectors for a multimodal emotion detection model based on deep reinforcement learning. The high dimensionality of the multimodal emotion parameters is addressed through the value function of reinforcement learning, and the model outputs the multimodal emotion detection results for tennis players. The experimental results demonstrate that the proposed method yields low detection error and a high recall rate.
    Keywords: deep reinforcement learning; tennis players; multimodal emotion detection; facial expression; voice emotion signal; body behaviour emotion.
    DOI: 10.1504/IJBM.2024.10061499
     
  • Multi-pose face recognition method based on improved depth residual network   Order a copy of this article
    by Feigang Tan, Yi Tang, Jiaojun Yi 
    Abstract: Multi-pose face recognition methods can reduce the interference of pose changes on facial characteristics by analysing pose variation. To improve the accuracy of multi-pose face recognition and shorten the recognition time, a multi-pose face recognition method based on an improved depth residual network is proposed. The multi-pose face image is transformed logarithmically and enhanced by a homomorphic filtering algorithm. A spatial transformation network is introduced to improve the depth residual network model, and the enhanced face image is fed into the improved model. Through the calculation of the loss function and the updating of gradient parameters, multi-pose face image recognition is completed. The experimental results show that this method has strong multi-pose face image enhancement capability, can effectively recognise multi-pose face images, and achieves high recognition accuracy. When the occlusion is 30%, the face recognition accuracy reaches 0.989.
    Keywords: improved depth residual network; multi-pose; face recognition; image enhancement; Softmax regression model.
    DOI: 10.1504/IJBM.2024.10063084
     
  • Dynamic emotion recognition of human face based on convolutional neural network   Order a copy of this article
    by Lanbo Xu 
    Abstract: To improve the accuracy and speed of dynamic facial emotion recognition, a dynamic facial emotion recognition method based on a convolutional neural network is designed. First, the facial region is located and converted to greyscale, and the chaotic frog jump algorithm is used to enhance the clarity of image features. Then, the geometric and texture features of the facial image are analysed separately to determine the key feature points. Finally, after training the convolutional neural network, the geometric and texture features are input, the basic parameters, dynamic emotion parameters, and feature loss function of the dynamic facial emotion features are calculated, and the geometric features, texture features, and emotion template categories are matched to obtain the final recognition results. Experiments show that the recognition accuracy of this method is between 97.6% and 98.7%, and the maximum recognition time is only 112 ms, indicating high recognition accuracy and speed.
    Keywords: dynamic facial images; emotional recognition; regional positioning; greyscale processing; enhanced processing; feature extraction; convolutional neural network.
    DOI: 10.1504/IJBM.2024.10063905
     

Special Issue on: Advanced Bio-Inspired Algorithms for Biometrics - Part 2

  • A method of badminton video motion recognition based on adaptive enhanced AdaBoost algorithm   Order a copy of this article
    by YunTao Chang 
    Abstract: To overcome the low recognition accuracy, poor recall, and long recognition time of traditional badminton video action recognition methods, a badminton video action recognition method based on an adaptive enhanced AdaBoost algorithm is proposed. First, badminton actions are collected through inertial sensors and badminton action videos are captured to construct an action dataset. The data in this dataset are normalised, and the badminton video action features are then extracted and fused using a weighted fusion method. Finally, based on the fused action features, a badminton video action classifier is constructed using the adaptive enhanced AdaBoost algorithm, and the badminton video action recognition results are output by the classifier. The experimental results show that the proposed method performs well in recognising badminton video actions.
    Keywords: inertial sensor; weighted fusion method; AdaBoost algorithm; motion recognition; data standardisation.
    DOI: 10.1504/IJBM.2024.10063377
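
An AdaBoost action classifier over fused features can be sketched with scikit-learn as below; the weak-learner depth, number of estimators and toy data shapes are assumptions, and the paper's adaptive enhancement is not reproduced.

```python
# Boosted decision stumps over fused, normalised inertial-sensor action features (toy data).
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

X = np.random.rand(400, 24)           # fused action features per video clip
y = np.random.randint(0, 4, 400)      # four badminton stroke classes (toy labels)

# The weak learner is passed positionally so the call works across scikit-learn versions.
clf = AdaBoostClassifier(DecisionTreeClassifier(max_depth=2),
                         n_estimators=200, learning_rate=0.5, random_state=0)
print(cross_val_score(clf, X, y, cv=5).mean())
```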
     
  • Motion recognition of football players based on deformable convolutional neural networks   Order a copy of this article
    by Lingqiang Xuan, Di Zhang 
    Abstract: To improve the accuracy of football player action recognition and the number of frames processed per second, a football player action recognition method based on a deformable convolutional neural network is proposed. First, action images of football players are collected through binocular vision, and distortion correction and disparity calculation are performed on the images to improve their quality. Second, based on the collected images, the receptive field of the action images is calculated in two-dimensional convolution to extract football player action features. Finally, the extracted action features are input into a support vector machine to construct the optimal classification plane and complete the recognition of football player actions. The experimental results show that the action recognition accuracy of the proposed method reaches up to 99.3%, and its processing speed remains stable at 24 frames per second or above.
    Keywords: variable convolutional neural network; CNN; football players; action recognition; binocular vision.
    DOI: 10.1504/IJBM.2024.10063378
     
  • Basketball player action recognition based on improved LSTM neural network   Order a copy of this article
    by Xudong Yang  
    Abstract: To improve the IoU value and accuracy of basketball player action recognition methods, this paper proposes a basketball player action recognition method based on an improved LSTM neural network. First, a coordinate system is established in the vision system and appropriate sequence transformations are performed on the collected basketball player action images to complete image acquisition. Next, a Kalman filter is used to filter the collected action images. Finally, two sigmoid gating units are introduced into the LSTM neural network unit to improve it. Using the filtered action images as input and the action recognition results as output, the improved LSTM neural network is used to construct an action recognition model and obtain the recognition results. The experimental results show that the proposed method achieves significant improvement in IoU value and accuracy, with the highest recognition accuracy reaching 98.26%.
    Keywords: improving LSTM neural network; basketball players; action recognition.
    DOI: 10.1504/IJBM.2024.10063379
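
A plain LSTM sequence classifier over filtered pose features, to illustrate the general pipeline in the abstract; the sequence length, keypoint layout and class set are assumptions, and the paper's sigmoid-gating modification is not reproduced.

```python
# Two-layer LSTM classifier over 30-frame sequences of filtered pose keypoints.
from tensorflow.keras import layers, models

seq_len, n_features = 30, 34           # 30 frames, 17 joints x (x, y) per frame (assumed)
model = models.Sequential([
    layers.Input(shape=(seq_len, n_features)),
    layers.LSTM(64, return_sequences=True),
    layers.LSTM(64),
    layers.Dense(64, activation="relu"),
    layers.Dense(5, activation="softmax"),   # e.g. shoot / pass / dribble / rebound / defend
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```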
     
  • Facial expression recognition method based on multi-level feature fusion of high-resolution images   Order a copy of this article
    by Li Wan, Wenzhi Cheng 
    Abstract: To improve the accuracy of facial expression recognition, this paper designs a facial expression recognition method based on the multi-level feature fusion of high-resolution images. First, noise and texture in the facial image are smoothed and the image is enhanced. Second, multi-level features of the facial image are extracted and then fused through reverse solving. Next, the attributes of different facial regions are extracted and assigned to the corresponding representation data, and decoupled facial expression data are extracted from the feature fusion results. The decoupled representation and the representation data are compared to complete facial expression recognition. Experiments show that the geometric mean of the recognition results obtained by this method is between 0.963 and 0.989, and the similarity of the feature vectors is between 0.972 and 0.988, indicating that the method can accurately output facial expression recognition results.
    Keywords: facial images; expression recognition; high resolution images; multi-level feature fusion.
    DOI: 10.1504/IJBM.2024.10063380
     
  • A method for identifying foul actions of athletes based on multimodal perception   Order a copy of this article
    by Jiuying Hu 
    Abstract: To improve the recall rate and accuracy of foul action recognition for track and field athletes and to address the poor classification of foul actions, this study proposes a multimodal perception-based foul action recognition method for track and field athletes. First, a foul action dataset of track and field athletes is constructed. Then, a wavelet denoising method is used to remove noise from the athletes' movement images. Finally, a recognition function for the foul actions of track and field athletes is established through multimodal perception, a bidirectional ranking loss is used to train the function, and the similarity of the skeleton-to-video matching is calculated to obtain the final foul action recognition results. The experimental results show that the foul action identification accuracy is 98.5%, the classification accuracy is 98.6%, and the recognition recall rate is 99.2%, with high recognition sensitivity and good practical performance.
    Keywords: multimodal perception; athletes; identification of foul actions; bidirectional ranking loss.
    DOI: 10.1504/IJBM.2024.10063381
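
The wavelet denoising step named in the abstract can be sketched as follows; the Daubechies-4 wavelet, three decomposition levels and a universal soft threshold are common defaults, not the paper's choices.

```python
# Soft-threshold wavelet denoising of a 1-D signal (applied per row/channel in practice).
import numpy as np
import pywt

def wavelet_denoise(signal, wavelet="db4", level=3):
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    # Universal threshold estimated from the finest-scale detail coefficients.
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745
    thr = sigma * np.sqrt(2 * np.log(len(signal)))
    coeffs = [coeffs[0]] + [pywt.threshold(c, thr, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)[:len(signal)]

noisy = np.sin(np.linspace(0, 8 * np.pi, 512)) + 0.3 * np.random.randn(512)
clean = wavelet_denoise(noisy)
```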
     
  • Character emotion recognition algorithm in small sample video based on multimodal feature fusion   Order a copy of this article
    by Jian Xie, Dan Chu 
    Abstract: To overcome the low accuracy and poor precision of traditional character emotion recognition algorithms, this paper proposes a small-sample video character emotion recognition algorithm based on multimodal feature fusion. The algorithm extracts facial scene features and expression features from small-sample videos, uses GloVe to extract text features, and obtains character speech features through filter banks. A bidirectional LSTM model is then used to fuse the multimodal features, and emotions are classified using fully connected layers and a softmax function. The experimental results show that the method achieves an emotion recognition accuracy of up to 98.6%, with a recognition rate of 64% for happy emotions and 62% for neutral emotions.
    Keywords: multimodal feature fusion; bidirectional LSTM model; attention mechanism; softmax function.
    DOI: 10.1504/IJBM.2024.10063382
     
  • Fine grain emotional intelligent recognition method for athletes based on multi physiological information fusion   Order a copy of this article
    by Dong Guo 
    Abstract: To address the low accuracy of multi-physiological information collection, the low recognition rate of fine-grained emotions, and the long recognition time of traditional methods, a fine-grained emotion recognition method for athletes based on multi-physiological information fusion is proposed. Various physiological signals of athletes are collected using ECG, EMG, EDA, and airflow sensors to acquire the electrocardiogram, electromyogram, skin conductance, and respiration. The collected information is denoised and then fused using a Bayesian method. Fuzzy neural networks are used to extract the fine-grained emotional characteristics of athletes, and the fine-grained emotion recognition results are obtained in combination with base classifiers. Experimental results show that the average accuracy of multi-physiological information collection with the proposed method is 97.2%, the average recognition rate is 97.5%, and the average recognition time is 1.41 s.
    Keywords: multi physiological information fusion; athletes; fine grain emotional intelligent recognition; Bayesian method; fuzzy neural networks; base classifiers.
    DOI: 10.1504/IJBM.2024.10063383
     
  • Classroom learning behavior recognition method for English teaching students based on adaptive feature fusion   Order a copy of this article
    by Shuyu Li 
    Abstract: A new method for recognising students' classroom learning behaviour in English teaching based on adaptive feature fusion is proposed to address the low recognition rate of existing approaches. First, video images of English teaching classes are collected, divided into frames, and converted to greyscale. Second, an improved guided filtering algorithm is used to enhance the images. The maximum between-class variance method is then used to segment the images. Finally, the SIFT algorithm is introduced to design an adaptive feature fusion architecture that adaptively allocates feature weights and fuses shallow and deep features to realise learning behaviour recognition. The experimental results show that the proposed method achieves a peak signal-to-noise ratio of 51.7 dB, a recognition rate of 97.9%, and a maximum delay of 1.9 s, accurately identifying classroom learning behaviour.
    Keywords: adaptive feature fusion; English teaching; student classroom; learning behaviour recognition; guided filtering algorithm; maximum between-class variance method.
    DOI: 10.1504/IJBM.2025.10063979
     
  • A method for identifying abnormal classroom behaviours of students based on multi-objective weighted learning   Order a copy of this article
    by Lin Zou 
    Abstract: To improve the accuracy of identifying abnormal student behaviour and shorten the recognition time, a method for identifying abnormal classroom behaviours based on multi-objective weight learning is proposed. First, mixed Gaussian background modelling is used to remove noise from student classroom monitoring images and improve image quality. Second, the coordinates of student behaviour postures are normalised and classroom behaviour characteristics are extracted from both temporal and spatial features. Finally, taking the student behaviour characteristics as input and the abnormal classroom behaviour recognition results as output, a multi-objective weight learning recognition model is constructed to obtain the recognition results. The experimental results show that the proposed method improves the recognition accuracy of abnormal classroom behaviour to 95.4% and shortens the recognition time to no more than 3.5 seconds.
    Keywords: multi-objective weight learning; abnormal behaviour; student behaviour; classroom monitoring images.
    DOI: 10.1504/IJBM.2025.10063980
     
  • A method of athlete foul action recognition based on DTW algorithm   Order a copy of this article
    by Weihai Zhong, Yanpeng Zhao, Cheng Yang, Chuan Wang 
    Abstract: Different sports events involve various foul actions that differ in form, speed, and rhythm, which increases the difficulty of action recognition. Therefore, a DTW-based athlete foul action recognition method is proposed. An OV7670 camera is selected to collect athlete action images, and image contour information is extracted and used as the feature of athlete foul actions. Based on the collected images, the contours are evenly sampled to obtain contour feature points. The DTW algorithm is optimised in three respects: reference template modelling, distortion calculation, and determination of the optimal path. The extracted action feature sequence is used as the input of the DTW algorithm to output the recognition results of athlete foul actions. The experimental results show that the distortion of this method is as low as 0.58, indicating that it can accurately identify athlete foul actions and has high application value.
    Keywords: DTW algorithm; athletes; foul action; identification method; distortion; optimal path.
    DOI: 10.1504/IJBM.2025.10064549
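
A self-contained dynamic time warping (DTW) distance between two feature sequences, the template-matching core named above; the templates and feature dimensions are toy values, and the paper's optimisations are not reproduced.

```python
# Classic O(n*m) DTW between an observed action sequence and a reference template.
import numpy as np

def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    """a, b: (length, dim) feature sequences; returns the accumulated alignment cost."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])      # local distortion
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

template = np.random.rand(40, 16)     # reference (legal or foul) action template
observed = np.random.rand(55, 16)     # extracted contour-feature sequence
print(dtw_distance(observed, template))   # smaller = closer to the template action
```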
     
  • English translation robot pronunciation error correction method based on semantic matching   Order a copy of this article
    by Xiaohong Yu 
    Abstract: To improve the pronunciation accuracy of English translation robots, a semantic matching-based pronunciation error correction method for English translation robots is proposed. Digital filters are used to pre-emphasise the collected pronunciation signals, improving their accuracy and clarity. Then, by calculating the semantic weight of the pronunciation signal and constraining the vector direction with its weighted semantic feature vector, the pronunciation semantics of the English translation robot are matched. To further improve accuracy, state separation is applied when processing the feature quantities of the pronunciation signals, and pronunciation error correction is achieved through feature fusion. The experimental results show that this method is less affected by noise and improves the reliability of robot pronunciation correction.
    Keywords: semantic matching; speech recognition; English translation robot; mispronunciation signal; pronunciation recognition.
    DOI: 10.1504/IJBM.2025.10064550
     
  • Keystroke Dynamics and Quantum Machine Learning   Order a copy of this article
    by Namisha Bhasin, Sanjay Kumar Sharma, Rajesh Mishra 
    Abstract: The performance of machine learning algorithms is often suboptimal in identifying and classifying patterns; hence, there has always been a need for methods that can provide better solutions. Quantum algorithms have demonstrated significantly greater efficiency than traditional machine learning algorithms in many tasks. Quantum computers leverage unique properties such as entanglement and superposition, allowing them to generate patterns inaccessible to classical systems. Keystroke dynamics, a method of user identification based on typing style, is categorised into static authentication, where users input a username/password combination of 15-20 letters, and dynamic authentication, where users type free text such as emails, chats, or online exams. Both static and dynamic authentication primarily involve involuntary actions. This paper focuses on authenticating users from static keystroke dynamics using various quantum and hybrid algorithms.
    Keywords: Quantum support vector classifier; QSVC; ZZFeatureMap; quantum neural network; QNN; variational quantum circuit; VQC.
    DOI: 10.1504/IJBM.2025.10065008
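
A hedged sketch of a quantum-kernel classifier (QSVC with a ZZFeatureMap) on keystroke timing features, assuming the qiskit-machine-learning package; its API has changed across releases, so treat this as an outline under those assumptions rather than a drop-in script.

```python
# QSVC with a ZZFeatureMap-based fidelity quantum kernel on toy keystroke features.
import numpy as np
from qiskit.circuit.library import ZZFeatureMap
from qiskit_machine_learning.kernels import FidelityQuantumKernel
from qiskit_machine_learning.algorithms import QSVC

# Toy static-authentication data: 4 hold/flight-time features per password sample.
X = np.random.rand(60, 4)
y = np.random.randint(0, 2, 60)        # genuine user vs. impostor

feature_map = ZZFeatureMap(feature_dimension=4, reps=2, entanglement="linear")
kernel = FidelityQuantumKernel(feature_map=feature_map)
qsvc = QSVC(quantum_kernel=kernel)
qsvc.fit(X, y)
print(qsvc.score(X, y))
```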
     
  • Anomalous behaviour recognition in MOOC learning based on local intuitionistic fuzzy support vector machine   Order a copy of this article
    by Qingyun An 
    Abstract: To improve the accuracy and efficiency of recognising anomalous behaviour in MOOC learning, a MOOC learning anomalous behaviour recognition method based on a local intuitionistic fuzzy support vector machine is proposed. First, a sliding filter is constructed for the MOOC learning video image grid and the MOOC learning video image channels are filtered. Second, using the key points of the student skeleton as behavioural posture features, anomalous behaviour in MOOC learning is detected. Finally, based on the theory of local intuitionistic fuzzy sets, the local intuitionistic indices of positive and negative samples of MOOC learning behaviour are calculated, and a decision function is constructed to classify and recognise anomalous MOOC learning behaviours. The results show that the recognition accuracy of the proposed method is consistently above 90%, and the recognition time does not exceed 3 s.
    Keywords: local intuitionistic fuzzy support vector machine; MOOC learning; anomalous recognition; posture features.
    DOI: 10.1504/IJBM.2025.10065015
     
  • Accurate recognition of emotions of audio-visual bimodal characters based on dual level feature dimensions   Order a copy of this article
    by Xiao Zhang  
    Abstract: To recognise the emotions of bimodal characters accurately and quickly, a precise emotion recognition method for audio-visual bimodal characters based on dual-level feature dimensions is proposed. First, logarithmic transformation and a cepstral function are used to extract emotional features from the character audio signals. Second, local binary patterns and the Gabor wavelet transform are used to extract emotional feature maps from the character videos. Finally, after cross-modal interaction processing of the audio and video emotion features, a feature fusion model based on gated neural networks is constructed, using the post-interaction visual and acoustic features as inputs to obtain the final audio-visual bimodal character emotion recognition results. The experimental results show that, compared with existing methods, the highest accuracy of character emotion recognition with this method is 0.99, and the longest recognition time does not exceed 10 s.
    Keywords: dual level feature dimension; audio and video dual-mode; character emotions; accurate recognition.
    DOI: 10.1504/IJBM.2025.10065016
     
  • Detection method of students English classroom learning behaviour: multi-channel feature fusion   Order a copy of this article
    by Yun Zhang 
    Abstract: To address the large root mean square error and low F1-value of existing methods, a student English classroom learning behaviour detection method based on multi-channel feature fusion is proposed. First, classroom data such as students' oral pronunciation, note-taking texts, eye-tracking data, and facial expressions are collected. Second, features of students' English classroom learning behaviour are extracted, including oral pronunciation features, text features, and visual attention features. These features are then input into a recurrent neural network to achieve feature fusion. Finally, a machine learning model is established and trained with semi-supervised learning, and the trained model is used to detect students' English classroom learning behaviour. The experimental results show that the average RMSE of the proposed method is 0.30 and its F1-value is higher, indicating better detection performance.
    Keywords: multi-channel feature fusion; English classroom; learning behaviour; acoustic signal processing technology; semi-supervised learning.
    DOI: 10.1504/IJBM.2025.10065017