Forthcoming Articles
International Journal of Information and Communication Technology

Forthcoming articles have been peer-reviewed and accepted for publication but are pending final changes, are not yet published and may not appear here in their final order of publication until they are assigned to issues. Therefore, the content conforms to our standards but the presentation (e.g. typesetting and proof-reading) is not necessarily up to the Inderscience standard. Additionally, titles, authors, abstracts and keywords may change before publication. Articles will not be published until the final proofs are validated by their authors.
Forthcoming articles must be purchased for the purposes of research, teaching and private study only. These articles can be cited using the expression "in press". For example: Smith, J. (in press). Article Title. Journal Title.
Online First articles are also listed here. Online First articles are fully citeable, complete with a DOI. They can be cited, read, and downloaded. Online First articles are published as Open Access (OA) articles to make the latest research available as early as possible.
International Journal of Information and Communication Technology (65 papers in press)
Regular Issues
Abstract: With the rapid growth of data-intensive applications, achieving low-latency and reliable content retrieval in complex networks has become a major challenge. Information-centric networking (ICN) leverages content naming and pervasive in-network caching to enable retrieval from multiple replicas, making replica selection crucial for performance. However, selection is complicated by replica capacity limits, bursty workloads, and dynamic path variations. To address these issues, we propose a replica selection strategy that integrates the multi-armed bandit (MAB) framework with dynamic redundancy control. By modelling selection as an MAB problem, the strategy incorporates path variability, service heterogeneity, and blocking risk into decision-making, enabling adaptive exploration and exploitation. An additional load-aware redundancy mechanism adjusts redundancy levels to curb exploration overhead and suppress tail latency. Simulations on a real-world topology show that the method significantly reduces latency and improves robustness. Compared with nearest-replica routing, it reduces average latency by 32.09% and P99 tail latency by 45.76%. Keywords: information-centric networking; ICN; multi-armed bandits; MAB; adaptive redundancy; in-network cache; replica selection. DOI: 10.1504/IJICT.2026.10078035
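The bandit formulation of replica selection described in the abstract above can be illustrated with a minimal sketch. The abstract does not name the bandit algorithm, so standard UCB1 is used here as a stand-in; the class name, the latency cap, and the exponential latency model are all assumptions made for illustration, and the load-aware redundancy mechanism is not modelled.

```python
import math
import random


class UCBReplicaSelector:
    """UCB1 over replicas: each replica is an arm; lower latency means higher reward.

    Hypothetical illustration, not the authors' implementation.
    """

    def __init__(self, n_replicas, latency_cap_ms=500.0):
        self.counts = [0] * n_replicas         # requests sent to each replica
        self.mean_reward = [0.0] * n_replicas  # running mean reward per replica
        self.latency_cap_ms = latency_cap_ms   # assumed cap used to scale rewards to [0, 1]
        self.total = 0

    def select(self):
        # Send one request to every replica before using the UCB index.
        for i, c in enumerate(self.counts):
            if c == 0:
                return i
        # UCB1 index: empirical mean plus an exploration bonus.
        return max(
            range(len(self.counts)),
            key=lambda i: self.mean_reward[i]
            + math.sqrt(2.0 * math.log(self.total) / self.counts[i]),
        )

    def update(self, replica, latency_ms):
        # Map the observed latency to a reward in [0, 1].
        reward = 1.0 - min(latency_ms, self.latency_cap_ms) / self.latency_cap_ms
        self.counts[replica] += 1
        self.total += 1
        self.mean_reward[replica] += (reward - self.mean_reward[replica]) / self.counts[replica]


def simulate(mean_latencies_ms, rounds=5000, seed=0):
    """Run the selector against replicas with exponentially distributed latency (assumed model)."""
    rng = random.Random(seed)
    selector = UCBReplicaSelector(len(mean_latencies_ms))
    for _ in range(rounds):
        replica = selector.select()
        latency = rng.expovariate(1.0 / mean_latencies_ms[replica])
        selector.update(replica, latency)
    return selector.counts


counts = simulate([150.0, 30.0, 300.0])
```

In this toy setting the selector should concentrate most requests on the replica with the lowest mean latency while still sampling the others occasionally, which is the exploration and exploitation trade-off the abstract refers to.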
Abstract: Japanese kana writing is fundamental to learning the Japanese language, and its standardisation has a significant impact on language learning outcomes. To address the inefficiency and subjectivity of traditional manual evaluation, this study proposes an intelligent evaluation model that integrates a convolutional long short-term memory (ConvLSTM) network with a conditional random field (CRF). First, the model utilises the ConvLSTM to efficiently extract spatiotemporal features of handwriting traces. Second, the CRF layer optimises sequence annotation to achieve automatic quantitative evaluation of kana writing accuracy, fluency, and structural standardisation. Finally, a self-constructed dataset containing 2,000 handwriting trace samples from five common hiragana and five katakana categories was used for evaluation experiments. The results show that the model achieved a 98.2% accuracy rate in kana character recognition, a Pearson correlation coefficient of 0.91 between its writing style score and expert evaluations, and a 91.2% accuracy rate in kana stroke regularity assessment, significantly outperforming the single LSTM and CNN-CRF models. Keywords: writing trajectory evaluation; ConvLSTM; CRF; Japanese kana; intelligent evaluation; sequence labelling. DOI: 10.1504/IJICT.2026.10078036
Abstract: Tourism route planning has traditionally emphasised shortest-path optimisation, often overlooking the importance of enhancing tourists' overall experiences. Many travellers rely on user-generated content to guide their journeys, yet manually searching and adjusting routes in real time can be inefficient and inaccurate. This study focuses on building a high-quality database to support model training for tourism route optimisation and dynamic adjustments. By leveraging graph theory and the Floyd-Warshall algorithm, the proposed approach integrates various tourism-related data factors to enhance route planning accuracy based on personalised preferences. The high-quality dataset, sourced from travel agencies and user-generated data, ensures the algorithm's adaptability in real-world scenarios. The model is tested on an online tourism platform, with its effectiveness evaluated through a framework grounded in tourism theories and user behaviour research. The results demonstrate significant improvements in both route planning accuracy and the efficiency of real-time adjustments when travellers modify their plans mid-journey. Keywords: database establishment; machine learning; tourism route planning adjustment. DOI: 10.1504/IJICT.2026.10078037
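The Floyd-Warshall algorithm named in the abstract above computes shortest paths between all pairs of nodes. The following is a minimal sketch of the textbook algorithm with path reconstruction; the attraction graph, the edge costs, and the function names are hypothetical, and how the paper combines tourism-related factors into an edge cost is not stated in the abstract.

```python
import math


def floyd_warshall(n, edges):
    """All-pairs shortest paths on an undirected graph with n nodes.

    edges: iterable of (u, v, cost) tuples; cost is an assumed scalar that could
    blend travel time with preference penalties.
    """
    dist = [[math.inf] * n for _ in range(n)]
    nxt = [[None] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0.0
        nxt[i][i] = i
    for u, v, w in edges:
        if w < dist[u][v]:
            dist[u][v] = dist[v][u] = w
            nxt[u][v] = v
            nxt[v][u] = u
    # Allow each node k in turn to act as an intermediate stop.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
                    nxt[i][j] = nxt[i][k]
    return dist, nxt


def route(nxt, u, v):
    """Rebuild the node sequence from u to v; returns an empty list if unreachable."""
    if nxt[u][v] is None:
        return []
    path = [u]
    while u != v:
        u = nxt[u][v]
        path.append(u)
    return path


# Four hypothetical attractions, numbered 0 to 3.
dist, nxt = floyd_warshall(4, [(0, 1, 4.0), (1, 2, 1.0), (0, 2, 7.0), (2, 3, 2.0)])
best_route = route(nxt, 0, 3)
```

Because every pair is precomputed, a mid-journey change of destination only needs a table lookup and a path reconstruction, which is one plausible reason to prefer this algorithm for real-time adjustments.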
Abstract: To address the problem that traditional customer segmentation relies on static data and struggles to respond to behavioural changes in real time, this paper proposes a real-time customer segmentation framework based on big data analysis and clustering analysis. The data comes from e-commerce websites and includes user activities, transactions, and demographic information. Preprocessing involves data cleaning, normalisation, and TF-IDF feature extraction. The key features include transaction frequency, interest in product categories, and page dwell time. The proposed model is an adaptive k-nearest neighbour (k-NN) logistic regression based on clonal selection (CS-AK-LR), integrating adaptive K-means clustering (AK) and logistic regression (LR) for customer clustering and value classification prediction. The clonal selection algorithm (CS) optimises the hyperparameters of AK and LR. The segmentation detection rate of this method reaches 96.21%, and the error rate is reduced by 1.03% compared to existing methods. Combining big data with real-time clustering analysis can effectively enhance the speed and accuracy of marketing responses. Keywords: consumer segmentation; clonal selection-based adaptive K-logistic regression; CS-AK-LR; marketing strategy; big data; cluster analysis. DOI: 10.1504/IJICT.2026.10078097
Abstract: The complex sea ice and marine environment in the polar region significantly affects marine safety operations. How to accurately simulate the complex polar environment is a key concern at home and abroad. The greenhouse effect leads to ice melting, with the expanding area of broken ice posing new challenges to ice navigation. This paper reviews the principle of discrete element method (DEM), special features of ship navigation in broken ice areas, and the progress of DEM applications in broken ice research. Based on this foundation, it discusses the existing challenges and key research applying DEM to broken ice studies. Keywords: discrete element method; DEM; broken ice areas; computational fluid dynamics; ship navigation; review. DOI: 10.1504/IJICT.2026.10078099
Abstract: Government agencies struggle to track and respond to public sentiment on social media platforms like Weibo. This case study describes the design and development of a monitoring system for an anonymous municipal government in China, leveraging deep learning to analyse sentiment and emerging topics. The case details the system architecture, implementation challenges, and how the outputs can be used for targeted public communication. To achieve effective management of social public opinion, this article uses deep learning and clustering algorithms to process public opinion information on the Weibo platform and establishes a Weibo public opinion analysis system. Focusing on user blog posts and comments, we first use distributed crawlers to obtain data, and then complete preprocessing through cleaning and word segmentation. Sentiment analysis yields sentiment polarity and probability, and a latent Dirichlet allocation topic model is used to explore potential themes. The experimental results show that the established model has high accuracy in emotion classification. Using real Weibo data, the emotional value change curve of netizens is plotted to determine the impact of topics on netizens' emotions. The system supports targeted public opinion intervention for governmental use. Keywords: Weibo; public opinion; analysis. DOI: 10.1504/IJICT.2026.10078157
Abstract: This study proposes a framework for suppressing the spread of fake news on social networks based on multimodal sentiment analysis. It employs the BERT model to extract contextual semantic vectors from news texts. These are then fused with the output of a bidirectional long short-term memory (BiLSTM) network through feature concatenation, enabling simultaneous capture of local context and global long-range dependencies. Emoticon sentiment features are then extracted through autoencoders and deeply integrated to accurately identify user sentiment inclinations. The study's core innovations are: 1) a multi-tiered fake news detection and suppression architecture; 2) deep fusion of text and emoticon features through multimodal sentiment analysis; 3) dual-strategy dissemination suppression combining detection and sentiment immunity. Experimental results demonstrate that the fake news detection model achieves an accuracy of up to 89.4%. The proposed model can provide effective solutions for building a timely and accurate false news prevention and control system. Keywords: fake news; multi-modal data; sentiment analysis; dissemination suppression; BERT model. DOI: 10.1504/IJICT.2026.10078158
Abstract: Cultural and creative products are a significant carrier of museum culture and successfully meet customers' demands for spiritual culture. The original design of such products can bring customers closer to museums and greatly increase museums' social awareness. Given the significant homogenisation, exorbitant prices, and lack of functional development of current museum cultural products, this paper suggests using the KANO model to innovate their design from the perspective of consumer demand. The KANO model is utilised to analyse and prioritise consumers' demands for museum cultural products. This analysis employs a series of metrics to assess consumer needs and determine the most pressing issues within the field. The application of the KANO model in this particular context facilitates the generation of innovative concepts in the domain of product design and development. Keywords: KANO model; cultural and creative products of museums; product design; consumer demand. DOI: 10.1504/IJICT.2026.10078159
Abstract: This study proposes a collaborative management framework for tourist destination dynamic carrying capacity based on multi-agent deep reinforcement learning (MADRL) and spatio-temporal graph neural network (STGNN). A multi-dimensional topological model is constructed to characterise the spatio-temporal correlation of passenger flow, resources, environment, and service. An STGNN module embedded with spatio-temporal attention is designed to capture dynamic evolution features. A hierarchical MADRL structure realises global coordination. Experiments show that the framework reduces MAE to 0.037, shortens response delay to within 8.2 s, and improves carrying capacity utilisation to 92.6%. It outperforms traditional models in prediction, response, and multi-objective balance, providing an effective method for intelligent and sustainable tourism management. Keywords: tourist destination; dynamic bearing capacity; multi-agent deep reinforcement learning; MADRL; spatio-temporal graph neural network. DOI: 10.1504/IJICT.2026.10078160
Abstract: Addressing psychological and social factors in graduate employment prediction, this paper proposes a graph network model that integrates psychological time-series data with dynamic social relationships. Traditional methods use static academic data and cannot capture key psychological factors like anxiety and career efficacy, or their interaction with peer and alumni resources. By constructing a time-series graph from psychological scales and social ties, tested on public graduate data, the model achieves an area under the curve of 0.891 for employment prediction. It significantly outperforms long short-term memory networks (area under the curve 0.801) and static graph neural networks (area under the curve 0.832), with normalised discounted cumulative gain at rank position 5 of 0.882, demonstrating reliable destination ranking. This work provides a data-driven approach for precise employment guidance through psychological monitoring. Keywords: temporal graph convolutional network; T-GCN; employment trajectory prediction; mental health; dynamic social network. DOI: 10.1504/IJICT.2026.10078197
Abstract: This research tackles the pressing challenge of real-time automatic error detection in piano performance, a task where conventional approaches often propagate inaccuracies due to the decoupling of audio-score alignment and error identification. This paper introduces the DiffAlignTransformer framework, which incorporates a differentiable dynamic programming mechanism to jointly learn probabilistic note-level alignment and error classification within a hierarchical cross-modal encoder. Evaluated on the Vienna Synchronous Library dataset using a leave-one-performer-out validation strategy, the model attains an overall F1-score of 0.872, exceeding the strongest baseline by 6.0%, with marked gains in onset (7.2%) and offset (8.1%) error recognition. Inference requires only 78 milliseconds per second of audio, satisfying strict real-time constraints. These outcomes confirm that our method successfully resolves the intertwined alignment-detection problem and delivers precise, instantaneous feedback for piano pedagogy. Keywords: piano performance assessment; error detection; differentiable alignment; cross-modal transformer; real-time feedback. DOI: 10.1504/IJICT.2026.10078198
Abstract: This paper proposes a quantum-threat-mitigated encryption scheme by reengineering core algorithms via mathematical lattice constructs. While offering quantum-resistant security, lattice-based homomorphic encryption suffers from high latency and storage overhead. To overcome this, we redesign the ciphertext structure and decryption algorithm, introducing a polynomial Chinese remainder theorem-based method to pack multiple complex plaintexts into a single polynomial. A reconfigurable modular unit and a hybrid crossbar-fixed interconnection network are co-designed to optimise operational efficiency. This dual approach facilitates algorithm reconstruction and optimisation. Security analysis and simulations confirm that our method not only resists quantum computing attacks but also achieves an encryption time of 0.98 ms per bit, meeting real-time requirements. Keywords: post-quantum cryptography; lattice-based construction; homomorphic encryption; algorithm reengineering. DOI: 10.1504/IJICT.2026.10078199
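The packing idea mentioned in the abstract above rests on the Chinese remainder theorem (CRT). The sketch below uses the integer CRT as a simplified analogue: several small values are packed into one integer, and arithmetic on the packed value acts on every slot at once. It is not the polynomial-ring construction or the encryption scheme from the paper; the moduli and function names are assumptions chosen for illustration.

```python
from math import prod


def crt_pack(values, moduli):
    """Pack one value per modulus into a single integer modulo prod(moduli).

    The moduli must be pairwise coprime. Requires Python 3.8+ for pow(x, -1, m).
    """
    big_m = prod(moduli)
    packed = 0
    for value, m in zip(values, moduli):
        partial = big_m // m
        # partial * inverse(partial mod m) is 1 mod m and 0 mod every other modulus.
        packed += (value % m) * partial * pow(partial, -1, m)
    return packed % big_m


def crt_unpack(packed, moduli):
    """Recover each slot by reducing modulo its own modulus."""
    return [packed % m for m in moduli]


moduli = [5, 7, 11]              # assumed pairwise-coprime slot moduli
a = crt_pack([2, 3, 4], moduli)
b = crt_pack([1, 5, 9], moduli)
big_m = prod(moduli)

slot_sums = crt_unpack((a + b) % big_m, moduli)      # slot-wise addition
slot_products = crt_unpack((a * b) % big_m, moduli)  # slot-wise multiplication
```

One addition or multiplication on the packed integers updates all three slots, which is the property that makes CRT packing attractive for amortising the cost of homomorphic operations.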
Abstract: This study presents a new AI-supported cyber-physical system (CPS) architecture that enables real-time co-creation between artists and intelligent machines based on an adaptive communication set. The architecture has three layers: a perception layer, in which a generative adversarial network (GAN)-based design recommender is optimised by a reinforcement learning (RL) feedback loop that captures the system's sensory feedback; a cognitive layer, which interprets the input data into a recommended creative modification; and a final layer, which implements the suggested modification in the work environment using robotic actuators and additive manufacturing tools. The semantic communication protocol runs over the message queuing telemetry transport (MQTT) and open platform communications unified architecture (OPC-UA) standards to support uninterrupted data exchange and synchronisation between the artist's interface and the physical production space. Keywords: cyber-physical system; AI-assisted design; co-creation framework; generative adversarial network; GAN; ceramic manufacturing. DOI: 10.1504/IJICT.2026.10078237
Abstract: News interests shift quickly, and collecting fine-grained reading logs in one place is increasingly risky, so privacy-preserving personalisation must handle heterogeneous clients and unstable feedback. This paper proposes a dynamic-threshold federated reinforcement learning scheme for personalised news delivery. In the scheme, first, each device learns a sequential policy from local interactions to optimise long-horizon utility. Then, each round estimates update reliability and adjusts a participation cutoff to filter noisy client contributions. Finally, the server aggregates selected shared updates while keeping lightweight personalisation on device. Experimental results show that the proposed scheme raises NDCG at ten from 0.401 to 0.423, improves diversity from 0.287 to 0.319, increases cumulative reward from 1.866 to 2.034, and reduces communication per round from 10.9 to 7.8 megabytes, achieving a stronger balance of utility, diversity, and efficiency. Keywords: dynamic threshold; federated reinforcement learning; personalised news recommendation; client heterogeneity; communication efficiency.
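The ranking metric reported in the abstract above, NDCG at ten, can be computed as in the following minimal sketch. It uses the common linear-gain form with a log2 position discount; the abstract does not state which gain variant the authors used, so that choice and the function names are assumptions.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k items (linear gain, log2 discount)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k=10):
    """DCG of the given ranking divided by the DCG of the ideal ordering of the same items."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Relevance of recommended articles in ranked order (1 = clicked, 0 = skipped); toy data.
perfect = ndcg_at_k([1, 0, 0])
late_hit = ndcg_at_k([0, 1])
no_hits = ndcg_at_k([0, 0, 0])
```

A ranking that places the clicked article first scores 1, while the same click in second position is discounted by the logarithm of its position, so the metric rewards surfacing relevant news early in the list.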
Abstract: Artificial intelligence (AI), edge computing, and the internet of things (IoT) are all helping to make real-time analytics and immersive fan interaction possible in the sports world. However, typical cloud-based sports communication networks suffer from high latency, bandwidth congestion, and limited scalability, which makes real-time sports analytics and interactive fan experiences difficult. This paper presents an AI-driven edge-IoT sports communication framework (AIESCF) for the intelligent processing of sports data, utilising edge-based deep learning inference, adaptive bandwidth-aware communication protocols, and distributed IoT sensing infrastructures. It uses spatiotemporal event recognition models, edge-level data filtering, and AI-assisted communication optimisation to detect events and analyse player performance without putting too much strain on the network. The system has an accuracy of 90%, a latency of 85 ms, a bandwidth optimisation of 55%, and an engagement rate of 87%. The results demonstrate that the architecture can be deployed scalably and efficiently. Keywords: AI-driven networks; Edge-IoT; sports analytics; fan engagement; real-time communication.
Abstract: In view of the increasing role of cultural and tourism videos in promoting tourism, assessing exactly how effective they are has become an urgent issue. In this paper, a multimodal transformer-based model is proposed to evaluate the communication efficiency of cultural and tourism videos. This model combines VI, BERT and MFCC for the extraction of image, text and audio characteristics. Cross-modal attention and consistency constraints are used to improve the fusion of information. To verify its effectiveness, a series of experiments was carried out to evaluate the model on user engagement, dissemination breadth, and sentiment fluctuation. Experimental results show that the prediction accuracy of the proposed model is 89.2%, the prediction of user interaction is 87.1%, and the correlation coefficient is 0.78. Compared with the conventional single-mode model, the performance of this model is significantly improved, which indicates that multimodal data fusion plays an important role in the evaluation of communication efficiency. Keywords: multimodal learning; cross-modal fusion; short video marketing; emotional fluctuation analysis; cultural tourism dissemination; sentiment analysis; user behaviour analysis. DOI: 10.1504/IJICT.2026.10078252
Abstract: In the face of increasingly covert cyber-attacks, traditional detection models struggle to effectively capture the complex contextual correlation features in traffic, resulting in insufficient ability to identify new threats. To address this issue, this study proposes a detection model based on a bidirectional self-attention mechanism, which achieves deep perception of abnormal behaviours by simultaneously learning the context information of the traffic sequence. Experimental results show that, compared with mainstream long short-term memory and standard transformer methods, this model improves the average area under the curve by over 4.2% and increases the recall rate for low-rate attacks by 7.5%, significantly enhancing the accuracy and robustness of detection. This study provides a new idea for improving the active defence capability of network security. Keywords: cybersecurity; anomaly detection; self-attention; bidirectional encoding.
Abstract: The ceramic industry generates an enormous amount of waste every year. Traditional recycling relies on manual sorting, with an accuracy rate of only about 78%, and the production line scheduling is rigid. To achieve efficient resource utilisation, this paper innovatively integrates convolutional and recurrent neural networks to construct an intelligent waste recognition model, and embeds it into a discrete event simulation system for dynamic optimisation. Experiments show that the new method increases the classification accuracy to 96.7%, the system throughput increases by 32.4% after simulation optimisation, and the utilisation rate of key equipment increases by 22.8%. This research provides an intelligent solution for the precise identification and system regulation of ceramic waste recycling, promoting the implementation of the circular economy. Keywords: ceramic waste recycling; hybrid neural network; HNN; discrete event simulation; DES; resource utilisation rate.
Abstract: Precise prediction of the purchase intention for marine cultural and creative products is of vital importance for e-commerce marketing. Addressing the issues of the existing methods that separately analyse images and comments, and the difficulty in capturing cross-modal collaborative effects, this study proposes a dual-stream deep learning model that integrates visual saliency and text sentiment. This model achieves a deeper understanding of user preferences by simultaneously extracting the salient regions of the images and the sentiment tendencies of the comments. Experiments on public datasets show that the purchase intention prediction accuracy of this method reaches 85.6%, significantly outperforming the baseline models that only use images (72.1%) or text (78.3%), with a recall rate increase of over 10 percentage points. This study provides an effective tool for multimodal fusion analysis and personalised recommendations in the marine cultural and creative field. Keywords: purchase intention prediction; visual saliency; text sentiment analysis; multimodal fusion; marine cultural products.
Abstract: This paper proposes a deep learning-based model for computer music denoising, addressing accuracy and efficiency limitations in existing methods. It employs a dual-branch network to separately identify transient and periodic noise, combined with an improved spectral subtraction for precise audio separation. Model compression via pruning and knowledge distillation ensures real-time capability. Experimental results on AudioSet show recognition accuracies of 93.5% (transient) and 94.1% (periodic), with average denoising gains of 15.1 dB, 15.0 dB, and 16.3 dB for transient, periodic, and mixed noise, respectively. When processing 100 minutes of lab-recorded audio, latency remains under 21.0 ms, outperforming three benchmark models in speed and stability. The model demonstrates robust noise reduction and real-time performance, suitable for applications like live music, low-latency communication, high-quality post-production, and restoration of noisy historical recordings. Keywords: dual-branch communication; spectral subtraction; computer music denoising; model pruning; knowledge distillation.
Abstract: This research tackles the persistent challenge of uncontrolled approximation errors and unreliable convergence in neural network-based methods for solving partial differential equations. We introduce a novel error-controlled implicit neural representation framework, which incorporates a trainable error indicator network and an adaptive weighting scheme to dynamically steer the optimisation process. Our approach utilises a dual-encoding architecture to represent physical fields with high fidelity and a cooperative training mechanism that iteratively estimates and reduces local errors. Experimental validation on a National Aeronautics and Space Administration (NASA) turbulent flat-plate boundary layer benchmark demonstrates that the error-controlled implicit neural representation achieves a relative L2 error of 8.73 × 10⁻⁴, outperforming the best existing baseline by 42.6% and improving boundary-layer accuracy by 52.0%. Moreover, the proposed method reduces training time by 34.7% to 55.2% while maintaining physically consistent solutions, confirming its efficacy and efficiency in error-aware numerical simulation. Keywords: partial differential equations; PDEs; adaptive optimisation; numerical simulation; implicit neural representation; INR.
Abstract: Academic pressure has a significant impact on students' physical and mental health and academic performance. Understanding its dynamic formation mechanism is crucial for effective intervention. This paper proposes an innovative framework that combines causal discovery with time series modeling to simulate the development path of academic pressure. The framework first uses a systematic causal discovery algorithm to learn a directed acyclic graph that represents the causal relationships between key factors; it then builds a causally constrained time series model to simulate the dynamic evolution of student pressure. In an evaluation on comprehensive longitudinal data from 300 students, the model increases the average accuracy of pressure precursor recognition by 7.4% and reduces the trajectory simulation error by 28.4% compared with the traditional benchmark model. These findings provide operational insights for early warning of stress and personalized intervention strategies. Keywords: academic pressure; simulation; causal discovery; time series modeling; oriented acyclic diagram.
Abstract: To solve the problems of modal heterogeneity, temporal asynchrony and cognitive adaptation imbalance in multimodal real-time interaction, a multi-modal real-time fusion architecture driven by cognitive load theory (CLT) is proposed. Experimental verification on the HoloAssist dataset shows that the interactive intention prediction accuracy of the proposed architecture reaches 95.2% ± 1.3%, which is 3.5 percentage points higher than that of the AlignMamba model. The end-to-end delay is 0.18 s ± 0.02 s, and the alignment delay is as low as 0.028 s. The subjective cognitive load score is 3.2 ± 0.8, which is significantly better than that of the baseline model. Ablation experiments confirm that each core module is crucial to performance improvement, and the model has excellent robustness in scenarios with modal loss and noise interference. This research provides support for the implementation of real-time multimodal interaction technology. Keywords: multi-modal fusion; cognitive load theory; optimal transmission; real-time human-computer interaction; adaptive weight.
Abstract: Digital English learning environments generate massive interaction data, offering potential for adaptive learning path optimisation. However, many existing approaches treat learning state estimation and learning path recommendation as separate tasks, restricting long-term personalised learning support. This study proposes a framework that integrates a multi-factor fusion knowledge tracing (MFFKT) model with reinforcement learning (RL). It jointly analyses behaviour sequences, item attributes, knowledge structures and temporal features to dynamically capture learners' knowledge states, which serve as environment states for long-term reward-driven path optimisation. Experiments on ASSISTments 2017 and EdNet-KT4 show that MFFKT achieves AUC scores of 0.834 and 0.812, surpassing baseline models. Ablation studies validate the efficacy of multi-dimensional feature fusion. When combined with conservative Q-learning, the method outperforms greedy, rule-based, and random strategies in cumulative reward, completion rate, and efficiency. Overall, the proposed framework enables coordinated modelling of learning states and learning path decisions, providing an effective technical approach for adaptive and personalised English learning within digital learning environments. Keywords: multi-factor fusion knowledge tracing; MFFKT; RL; digital English learning; personalisation; sequential decision-making.
Abstract: Detecting subtle tampering traces within complex backgrounds remains a significant challenge in image copy-move forgery detection, primarily due to the inadequacy of shallow feature extraction. To overcome these limitations, this paper proposes an enhanced DeepLabV3+ architecture designed for efficient multi-scale feature fusion. The framework utilises a lightweight MobileNetV3 backbone within an encoder-decoder structure, integrated with an improved atrous spatial pyramid pooling (ASPP) module employing depthwise separable dilated convolutions. To strictly preserve low-level details, we introduce a dual-branch shallow feature enhancement module (dual-branch SFEM) augmented by efficient channel attention (ECA). Furthermore, the feature fusion stage is optimised through architectural restructuring to reduce computational complexity while maintaining performance. A key innovation is the inclusion of a lightweight gating network that generates spatially adaptive weights, dynamically balancing the trade-off between semantic abstraction and detail preservation. Extensive experiments on the CASIA, DEFACTO, and COVERAGE datasets demonstrate the model's superiority over state-of-the-art methods. Specifically, the proposed method achieves an AUC of 95.41% and an F1 score of 77.24% on the DEFACTO dataset, while exhibiting robust generalisation capabilities on CASIA 1.0 (AUC: 78.93%, F1: 57.68%). Keywords: image forgery detection; gated fusion; efficient channel attention mechanism; dual-branch shallow feature enhancement module; DB-SFEM.
Abstract: Aiming at the problems of semantic distortion and over-correction that often occur in the text error correction of English learners by existing generative models, this paper accordingly proposes a novel generative adversarial network method integrating grammatical rule constraints (generative adversarial networks with grammatical rule constraints). By introducing formal grammatical knowledge as a flexible constraint into the network training process, the model is effectively guided to correct errors while better maintaining the original meaning and overall fluency of sentences. Experiments conducted on the public learner corpus show that this method significantly increases the error correction accuracy by 12% and effectively reduces the number of over-correction cases by 16%. The research thereby provides an effective way to solve the persistent balance problem between accuracy and naturalness in automatic grammar correction. Keywords: English interlanguage correction; generative adversarial network; grammatical rule constraints; semantic retention; excessive correction.
Abstract: To better integrate visual content and semantic information in the analysis of oil paintings, this paper proposes an emotion recognition model for oil paintings based on a multimodal adaptive deep network. The model handles visual and textual information with a two-path design: it extracts deep visual features from paintings and contextual semantic features from the associated texts. An adaptive feature fusion module adjusts the fusion weights of the different modality features using cross-modal attention and gating mechanisms. Experiments on the ArtEmis oil painting dataset show that the proposed model achieves 76.8% accuracy in the discrete emotion classification task and an RMSE of 0.319 in continuous emotion dimension prediction. Compared with the basic model, it achieves better classification accuracy, which supports the validity of the adaptive fusion mechanism in multimodal art emotion analysis. Keywords: emotion recognition; oil painting analysis; multimodal learning; adaptive fusion.
Abstract: This study proposes a unified framework that jointly models personalised learning path recommendation and knowledge tracing to improve individualised learning support in large-scale online education. The framework integrates learners' knowledge states, prerequisite relationships, learning load, and preferences within a single space, enabling dynamic tracking and coordinated optimisation. An online-updatable knowledge tracing model captures mastery levels, which inform a scoring and recommendation mechanism that adapts as knowledge states evolve. Experiments on the EdNet KT1 dataset show the proposed model achieves superior prediction accuracy and lower mean absolute error than recent baselines, with reduced parameters and training time. This approach balances predictive performance and computational efficiency, offering a practical solution for personalised learning support. Keywords: large-scale online education; knowledge tracing; personalised learning path recommendation; deep learning; educational data mining.
Abstract: Addressing the urgent need for cross-language text translation quality assessment, this paper proposes a neural network-based model for evaluating English-Chinese translation quality. Current widely adopted automated evaluation methods exhibit significant limitations in handling specialised terminology and nuanced semantics, particularly when addressing culture-specific concepts. The neural network model constructed in this study integrates deep semantic representation with contextual correlation analysis, achieving remarkable results on the Chinese-English test set of the public WMT 2020 metrics shared task dataset. It achieved a core correlation metric (Pearson's r) of 0.682, along with a multi-dimensional classification evaluation (macro-F1) of 0.689 and a ranking quality metric (normalised discounted cumulative gain @10) of 0.927, comprehensively outperforming mainstream baseline models. This model provides a reliable technical tool for cross-language text quality control. Keywords: neural network; translation quality assessment; cross-language application.
Abstract: Virtual learning settings raise concerns about the protection of students' privacy, the authenticity of the materials used in courses, and the safety of students communicating with one another. A growing number of educational institutions offer online degree programs. Older systems have a single point of failure, which could be exploited to alter academic records, gain unauthorised access to systems, and cause problems, because these systems were designed around a centralised architecture. Blockchain-enabled secure distance learning (BESDL) is one approach that could address these issues. The blockchain's permanent record and decentralised consensus make data more trustworthy, open, and dependable than it would otherwise be. The system has three primary components, the secure content-based access control (SCBAC), decentralised identity management (DIM), and encrypted content delivery (ECD) protocols, which work together to safeguard educational resources. According to the test findings, BESDL enhances system security, maintains examinable academic records, and accelerates the checking process. In summary, BESDL is both dependable and flexible, making it a strong option for higher education institutions that plan to incorporate online learning. Keywords: blockchain; distance learning; higher education; smart contracts; secure authentication.
Abstract: Psychological stress among college students is increasingly prominent, requiring efficient and objective identification methods. However, existing real-time systems struggle to balance accuracy with processing speed and lack deep integration of multi-source information (such as expressions, voices, and texts). This study proposes a real-time recognition system based on faster robust bidirectional encoder representations from transformers multimodal fusion, significantly improving computing efficiency through an innovative lightweight fusion mechanism. Experiments on public datasets show the system achieves 86.5% accuracy in stress recognition, significantly improving on traditional methods (e.g., 73.2% for a single-modal convolutional neural network). Its inference speed meets real-time requirements (30 fps), with the key area under the curve indicator increasing to 0.91 (from 0.82). This study provides an effective approach for non-intrusive, real-time environments. Keywords: psychological stress; multimodal fusion; real-time system; mental health.
Abstract: The extraction efficiency of natural antioxidants is influenced by the dynamic coupling of multiple factors, and traditional static optimisation methods are unable to cope with real-time disturbances. This study proposes a dynamic optimisation framework based on multi-agent simulation, which realises real-time precise control of the extraction process by simulating the autonomous decisions and collaboration of agents such as solvents, temperature, and equipment. The experiment uses an open dataset of antioxidant kinetics for verification. Compared with traditional methods, this method increases the extraction rate by an average of 12.5%, and the improvements in key indicators (area under the curve, normalised discounted cumulative gain) pass statistical significance tests (p < 0.05), offering a new approach to the dynamic optimisation problem in food processing. Keywords: multi-agent simulation; dynamic optimisation; process control; antioxidant extraction.
Abstract: In response to the three major challenges in AI music generation (limited chord representation, monotonous emotions, and low audio fidelity), this research proposes a novel end-to-end framework termed PEMF that integrates PerformanceNet with a multi-emotion music generation model. The core innovations include a structured four-dimensional chord encoding method using root, third, fifth, and crown notes to expand harmonic diversity to 60 chord types, a dual-encoding transformer architecture that independently processes melody and chord streams for superior structural coherence, and a fine-grained emotion regulation mechanism mapping pitch histograms and rhythm density parameters to Russell's two-dimensional emotion space for continuous control. For audio synthesis, an asymmetric U-net structure combined with a multi-band residual learning mechanism and a flooding loss strategy significantly enhances spectral fidelity and training stability. Experimental results demonstrate that PEMF achieves a chord vocabulary coverage near 1.0, an emotion recognition accuracy of 92.3% (significantly outperforming the 78.6% of symbolic transformers), a high-frequency energy retention rate of 89.1%, and a Frechet audio distance of 0.5. System performance shows a 36.9% improvement in emotional consistency and a 64.2% reduction in latency compared to staged training, validating its efficacy in practical applications like music therapy and film scoring. Keywords: PerformanceNet; multi-emotional music generation model; emotional regulation; audio synthesis.
Abstract: Focused on calligraphy, this research addresses style transfer distortion and inadequate compositional aesthetics in AI-generated art. We propose an algorithm that integrates decoupling representation learning with content-aware layout modelling. A dual-encoder architecture separates character structure and brushstroke style features, enabling precise and controllable style transfer via dynamic instance normalisation. A visual-linguistic bimodal network with hierarchical spatial modules is introduced to model relationships at the character, line, and global levels. The proposed method achieves a style similarity of 0.751, a content preservation PSNR of 36.9 dB, and an 8%-16% improvement in cross-font generalisation accuracy on unseen characters. For layout generation, the framework maintains a line-spacing fluctuation coefficient of 0.032, achieves a layout aesthetics score of 4.8, and demonstrates strong long-text stability with a cross-page style consistency of 0.94. Ablation studies further confirm the effectiveness of the dynamic weight adjustment mechanism, achieving an optimisation efficiency of 0.98. This work addresses key technical bottlenecks in digital calligraphy generation, providing a practical tool for cultural heritage preservation and a transferable framework for other structured art generation tasks, thereby advancing the integration of artificial intelligence with traditional arts. Keywords: decoupled representation learning; content-aware; generative adversarial networks; GANs; digital art generation; deep learning.
Abstract: To enhance pop music creation, this study proposes an automatic accompaniment generation method combining sliding window technology with the MuseFlow model. The sliding window segments long music sequences into short-time overlapping frames, balancing time and frequency resolution to capture local signal characteristics. MuseFlow employs an enhanced bidirectional mapping architecture and training objectives to accurately model complex relationships in multi-track music data. Experimental results show that MuseFlow achieves Frechet inception distance (FID) scores of 26.3 on the POP909 dataset and 25.4 on the FreeMidi dataset, significantly outperforming baseline models. These findings demonstrate that the proposed method generates high-quality, diverse accompaniments compatible with main melodies, providing an efficient tool for music creators. Keywords: MuseFlow; sliding windows; SWs; popular music accompaniment; STFT; audio quality; bass track generation; multi-track coordination.
Abstract: To address voltage unbalance induced by three-phase load discrepancies in rural areas, this paper proposes a voltage compensation technology utilising hybrid photovoltaic-energy storage (PV-ESS) inverters. Data analysis from 12 typical distribution stations indicates an average three-phase load unbalance of 18.7% and a maximum phase-voltage deviation of 7.2%, contributing to a 35% rise in user-side equipment failure rates. The study employs a collaborative PV-ESS control strategy that dynamically modulates inverter output by monitoring three-phase currents alongside real-time active and reactive power. A three-month pilot verification involving 500 households in a distribution area demonstrated that voltage unbalance dropped from 15.3% to 2.1%, the power factor improved from 0.82 to 0.96, line losses decreased by 12.8%, and the user-side voltage compliance rate rose from 92.1% to 98.7%. Through optimised charge-discharge strategies, the technology achieves a PV self-consumption rate exceeding 85%, effectively mitigating heavy loads on distribution transformers. This study provides a quantifiable technical solution for rural grid voltage regulation, with empirical data validating its significant compensation efficacy. Keywords: three-phase load imbalance; PV-ESS inverter; voltage unbalance compensation; data analysis; distribution network optimisation. DOI: 10.1504/IJICT.2026.10077723
Abstract: To address the fragmentation in tourist need identification and the disconnect between a multi-model fusion analysis method is proposed. This approach uses a bidirectional long short-term memory (Bi-LSTM) network to extract semantics from review texts and a latent Dirichlet allocation (LDA) model to identify core topics. A spatiotemporal cube structure maps emotional labels to spatiotemporal coordinates, quantifying experiential differences and optimising tourist group segmentation. Experimental results showed that five themes from both positive and negative comments and five from negative comments were well-separated, effectively reflecting dimensional differences in tourist feedback. Multiple regression models indicated varied group preferences, with one group favouring architectural features (preference coefficient of 1.820) and another prioritising affordability (preference coefficient of 2.186). The overall prediction accuracy of the model was 0.82. The results provide a data-driven basis for decision-making on precise service design and optimised resource allocation in scenic spots. Keywords: scenic area management; multi-clustering; data mining; tourist behaviour; latent Dirichlet allocation; LDA. DOI: 10.1504/IJICT.2026.10077728
Abstract: With the rapid growth of global tourism, Xiamen has drawn attention for its sustainable tourism development. This study applies GIS analysis, an improved CNN model, grey correlation analysis, and a Swin transformer to analyse the spatial distribution and influencing factors of Xiamen's tourism industry. The enhanced CNN performs well in spatial analysis, accurately identifying tourism resource distribution. Results show the improved CNN achieves an accuracy of 0.953, recall of 0.947, F1-score of 0.950, MSE of 0.039, and MAE of 0.201. The values are all superior to those of the multilayer perceptron (MLP) and the long short-term memory network (LSTM). The Swin transformer also excels in predicting employment impact and resource efficiency, with accuracies of 0.882 (energy consumption), 0.856 (resource recovery), 0.874 (water use efficiency), and 0.863 (waste management). Its performance is also superior to that of the vision transformer (ViT) and data-efficient image transformers (DeiT) models. The findings indicate the improved CNN effectively captures spatial distribution patterns, while the Swin transformer reliably predicts employment and resource utilisation outcomes. This research provides a valuable basis for policymaking and sustainable development of Xiamen. Keywords: Xiamen's tourism industry; XMTI; GIS spatial analysis; improved CNN model; Swin transformer model; grey correlation analysis; GCA. DOI: 10.1504/IJICT.2026.10077766
Abstract: Building energy efficiency management is crucial for sustainable development amid global energy challenges. This study integrates big data analytics and artificial intelligence to develop an intelligent scheduling system for building energy optimisation. Using long short-term memory (LSTM) networks, a deep learning model was trained on multi-source data including energy consumption, weather forecasts, and pedestrian flow, achieving over 95% prediction accuracy. The system dynamically adjusts building equipment operations based on predictive outcomes, reducing overall energy consumption by 20%. Experimental results demonstrate significant economic benefits and enhanced energy efficiency. The research also explores broader applications of AI in energy management, such as equipment failure prediction and performance evaluation. This work provides a novel technological pathway for green building development and supports global sustainability goals. Keywords: big data analytics; artificial intelligence; building energy efficiency; intelligent scheduling; deep learning. DOI: 10.1504/IJICT.2026.10077974
Abstract: To address the shortcomings of neural machine translation in handling complex sentences and terminology, this paper proposes a translation quality improvement model based on the quantum-optimised osprey optimisation algorithm (QOOA). This model integrates quantum computing and metaheuristic algorithms, enhancing population diversity through qubit encoding, dynamically adjusting individual positions using a quantum rotation gate strategy to balance global exploration and local exploitation, and constructing a multi-objective fitness function that combines semantic similarity and syntactic complexity. Experiments on the WMT2018 English-Chinese dataset show that, compared to the baseline model, this method improves the BLEU score by 3.2 percentage points and reduces the TER by 12.7%, significantly reducing translation confusion. The results demonstrate that QOOA effectively improves translation quality, especially in long sentences and technical texts. Keywords: quantum optimisation; osprey algorithm; machine translation; parameter optimisation; BLEU index; meta-heuristic algorithm. DOI: 10.1504/IJICT.2026.10077980
Abstract: In today's globalised and technology-driven world, improving spoken English is increasingly important. However, traditional automatic speech recognition (ASR) systems often produce outputs with grammatical errors, poor word choices, and pronunciation ambiguities, hindering effective communication. To address this, we propose MTG-ERR, a novel multimodal transformer-GCN framework that integrates acoustic and textual information for real-time and accurate spoken English error correction. The model uses a transformer-based acoustic encoder to capture temporal speech features and a GCN-based module with dependency syntactic trees to model grammatical structures. A dynamic fusion mechanism effectively combines both modalities, significantly enhancing error correction. Experiments on the L2-ARCTIC and LibriSpeech corpora show our framework outperforms baseline models, achieving a 92.7% F1-score in grammatical error correction. Ablation studies confirm that incorporating grammatical information improves performance on long, complex sentences by 12.1% in F1-score. With an average response latency under 320 ms, the system meets real-time interactive requirements. This research provides valuable insights for developing robust spoken language assistance systems, with significant potential for educational and commercial applications. Keywords: oral error correction; multimodal learning; transformer; graph convolutional networks; GCNs; real-time systems; grammatical dependency analysis. DOI: 10.1504/IJICT.2026.10078001
Abstract: As the number of university graduates has continued to expand, the evaluation of employability has become a critical issue in higher education management and talent cultivation. This study aimed to develop a scientific, quantitative, and multidimensional method for assessing graduate employability. An employability indicator system was constructed using the analytic hierarchy process (AHP). The results indicated that professional competence had the highest weight (0.55). Within this dimension, technical operation and experimental skills, the application of theoretical knowledge, and the quality of project experience contributed most significantly to employment competitiveness. A back propagation neural network (BPNN) model was further applied to train and predict the sample data. The results demonstrated a high level of consistency between the predicted values and the actual values. The absolute error ranged from 0.03 to 0.11, the relative error remained below 2.12%, and the overall accuracy reached 0.926. Universities should strengthen professional practice and innovation capacity development, and attention should be given to enhancing students' professional qualities to improve overall employment competitiveness. This study provides a decision-making reference for higher education management, career guidance, and policy formulation. Keywords: back propagation neural network; BPNN; neural network model; university graduates; employability. DOI: 10.1504/IJICT.2026.10077937
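The AHP weighting step mentioned in the abstract above can be illustrated with a minimal sketch. It uses the standard principal-eigenvector method with Saaty's consistency ratio; the function name `ahp_weights` and the example comparison matrix are hypothetical and are not taken from the paper, whose actual indicator system and judgements are not given here.

```python
import numpy as np

# Saaty's random consistency index for matrix sizes 3..9.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}

def ahp_weights(pairwise):
    """Return (weights, consistency_ratio) for a reciprocal pairwise matrix."""
    A = np.asarray(pairwise, dtype=float)
    n = A.shape[0]
    eigvals, eigvecs = np.linalg.eig(A)
    k = int(np.argmax(eigvals.real))       # index of the principal eigenvalue
    w = np.abs(eigvecs[:, k].real)
    w = w / w.sum()                        # normalise weights to sum to 1
    lam_max = eigvals[k].real
    if n < 3:
        return w, 0.0                      # 1x1 and 2x2 matrices are always consistent
    ci = (lam_max - n) / (n - 1)           # consistency index
    return w, ci / RANDOM_INDEX[n]         # consistency ratio (sizes 3..9 only)

# Hypothetical 3-criterion comparison matrix (illustrative values only).
example = [[1.0, 3.0, 5.0],
           [1.0 / 3.0, 1.0, 2.0],
           [1.0 / 5.0, 1.0 / 2.0, 1.0]]
weights, cr = ahp_weights(example)
print(weights, cr)
```

A consistency ratio below 0.1 is the conventional threshold for accepting a set of pairwise judgements.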
Abstract: As the Belt and Road project continues to expand, the China-Laos Railway serves both as connecting infrastructure and as a place of cultural exchange. This paper examines the architecture of an interactive three-dimensional (3D) book, powered by artificial intelligence, that presents the geography, ethnic culture, and transport development along the railway. The study improves visual representation and multimedia integration by using AIGC image generation and layout together with interactive design tools. A four-dimensional framework comprising railway culture, spatial structure, interactive experience, and AI generation mechanism is created to maximise visual narratives and dynamic content. The results show that AI enhances design efficiency and creative practice, offering a new avenue for merging cultural communication with technology in conventional paper-based media. Keywords: China-Laos Railway; three-dimensional book design; artificial intelligence; AI; interactive narrative; cultural communication; visual expression. DOI: 10.1504/IJICT.2026.10077981
Abstract: Student mental health issues are becoming increasingly severe, yet traditional scale assessments suffer from limitations such as high subjectivity and delayed feedback. To address this challenge, this paper proposes an intelligent evaluation framework, the multi-branch adaptive social-emotional fusion network, that integrates social-emotional analysis with multi-branch neural networks. This framework continuously and seamlessly integrates multidimensional digital footprints generated by students in campus life (e.g., text, voice, behavioural patterns) to enable dynamic psychological risk assessment. In this work, digital footprints refer to passively collected multimodal data from students' daily digital activities, including text messages, voice recordings, smartphone sensor logs, and social interaction records. Experiments on the public StudentLife dataset demonstrate that the proposed method achieves an area under the receiver operating characteristic curve, the key evaluation metric, of 0.927, surpassing mainstream unimodal models by over 3.2%. It also achieves an overall accuracy of 89.7% and passes statistical significance tests. This confirms the effectiveness and feasibility of utilising multi-source socio-emotional signals for early, objective intervention. Keywords: student mental health; social-emotional analysis; multi-branch neural networks; intelligent assessment; digital footprint. DOI: 10.1504/IJICT.2026.10077722
Abstract: Complex occlusions and rapid movements in basketball make tracking difficult, but deep learning-based visual computing provides effective new solutions. This study proposes an object tracking method that integrates the YOLOv5 model with the simple online and realtime tracking (SORT) algorithm. To address the challenge of multi-source information fusion, a cross-modal transformer model was designed to achieve adaptive deep integration of visual and motion data. Experiments utilised the public SportsMOT dataset, featuring 240 HD clips from real games across diverse arenas, lighting, and tactics. Validation on datasets shows the algorithm achieved a recall of 0.97 and a precision of 97.2%, with the mean average precision improving by 15% over the baseline. The multiple object tracking accuracy and precision reached 98.1% and 96.2% respectively. The algorithm thus proves to be an efficient and accurate tracking solution, offering robust data for coaching analysis and strategy. Keywords: tracking method; basketball players; multi-source data; attention mechanism; DeepSORT algorithm. DOI: 10.1504/IJICT.2026.10077727
Abstract: This paper proposes a personalised English teaching knowledge recommendation system based on the MoE-RAG algorithm. The system integrates a mixture of experts (MoE) architecture with eight specialised sub-models and a sparse gated network to dynamically select the most relevant experts for each query. Combined with a retrieval-augmented generation (RAG) module, it retrieves relevant knowledge from multiple bases and fuses it with expert output via a transformer-based generator to produce personalised recommendations. This approach effectively addresses cold-start issues and enhances interpretability. Experiments on 10,000 interaction records from 500 students show that MoE-RAG significantly outperforms traditional models (e.g., collaborative filtering), achieving 87.5% accuracy, 90.2% precision, 84.1% recall, and an 87.0% F1-score. Through a real-time feedback and reinforcement learning mechanism, the system dynamically adjusts resources and optimises learning paths, demonstrating strong adaptability across different learning stages and improving student engagement, time optimisation, and satisfaction. This system promotes the intelligent, personalised development of English education. Keywords: MoE algorithm; RAG algorithm; personalised recommendation system; English teaching. DOI: 10.1504/IJICT.2026.10077815
Abstract: This paper proposes a GAN-LSTM model for predicting brand value fluctuations and managing risks in rural characteristic industries. The model integrates generative adversarial networks to enhance limited data samples and long short-term memory networks to capture long-term dependencies in time-series data, improving prediction accuracy and robustness. Analysing a decade of data from traditional agriculture, tourism, handicrafts, and local food across multiple provinces, the study reveals distinct fluctuation patterns. Prediction errors were minimal, with handicrafts at -4.00% and local food at 3.16%. The GAN-LSTM model outperformed traditional and basic LSTM methods, reducing average prediction errors by 15% and 8%, respectively. It also provides quantified risk assessments, achieving up to 92% prediction accuracy and 90% risk management effectiveness. The findings offer theoretical guidance and practical support for the sustainable development of rural industries. Keywords: GAN-LSTM model; fluctuations in brand value; risk management; rural characteristic industries. DOI: 10.1504/IJICT.2026.10077858
Abstract: Aiming at the problem of prediction deviation caused by ignoring deep semantic information in online marketing effect perception, this study proposes an innovative framework that deeply integrates neural networks and multi-level semantic mining. Traditional methods mostly rely on shallow interaction features, making it difficult to capture complex intentions in texts. Our model achieves deep understanding and alignment of user preferences and product connotations through collaborative fine-tuning of pre-trained language models and graph neural networks. Experiments on public datasets show that, compared with mainstream baseline models, this framework has increased the area under the receiver operating characteristic curve for click-through rate prediction by 2.1% and the ranking metric normalised discounted cumulative gain @10 by 4.7%. All improvements are statistically significant (p < 0.01). This research provides an effective approach for building a more precise and interpretable intelligent marketing system. Keywords: online marketing; deep neural networks; DNNs; semantic mining; effect perception; recommendation systems. DOI: 10.1504/IJICT.2026.10077936
Abstract: This study proposes a scenario-based stochastic optimisation framework for the optimal placement and sizing of energy storage systems (ESS) in distribution networks. The model integrates ICT-enabled data acquisition and communication infrastructures to process real-time load and renewable energy data. A complete mixed-integer linear programming (MILP) formulation is developed, incorporating power balance, ESS dynamics, and network operational constraints across multiple uncertainty scenarios. The proposed method is validated on a real distribution network case study, demonstrating operational cost reductions, improved grid stability, and enhanced renewable energy utilisation compared with deterministic approaches. Keywords: energy storage systems; ESS; distribution networks; stochastic optimisation; mixed-integer linear programming; MILP. DOI: 10.1504/IJICT.2026.10078003
Abstract: This paper addresses the prevalent issue of 'over-correction' in current AI English writing tools - where correct personalised expressions are often misidentified as errors - by proposing an innovative adversarial generative error correction system. This system mimics the 'teacher-student interaction' mechanism: one network attempts to modify sentences, while another network judges the necessity of such modifications, thereby achieving more precise error correction. For instance, a system might incorrectly 'correct' a stylistically chosen active-voice sentence (e.g., 'our team analysed the data') into a passive construction ('the data was analysed by our team'), thereby altering the author's intended emphasis. Another common over-correction involves replacing a correctly used but less frequent disciplinary term with a more common, yet less precise, synonym. In public dataset evaluations, the system achieves an 89.5% correction accuracy - a significant improvement over traditional rule-based methods (approximately 70.2%) - while maintaining an over-correction rate of only 12.1%, substantially lower than that of a general-purpose large model (approximately 35.7%). This demonstrates the advantages of adversarial generation methods in understanding writing intent and context, providing an effective pathway for developing smarter, more human-like writing assistance tools. Keywords: English writing assistance; adversarial generative networks; grammar correction; overcorrection. DOI: 10.1504/IJICT.2026.10077721
Abstract: To address the challenges of data silos and privacy in cross-institutional collaboration, this study introduces a secure data collaboration framework combining federated learning (FL) and differential privacy (DP). The framework enables collaborative model training by keeping data local while using client-side DP to counter privacy threats like membership inference attacks. An adaptive privacy budget allocation (APBA) strategy further optimises the utility-privacy balance. Evaluations on real educational datasets show the framework maintains strong privacy (attack success <5%), achieves a 95% F1 score - comparable to centralised training - and improves communication efficiency by ~40%. This work provides a technical foundation for building secure and efficient cross-institutional platforms in innovation and entrepreneurship education. Keywords: federated learning; FL; differential privacy; DP; data security; innovation and entrepreneurship education; collaboration mechanism; privacy protection. DOI: 10.1504/IJICT.2026.10077814
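As a rough illustration of the client-side differential privacy idea in the abstract above, the following sketch clips each client's model update to a fixed L2 norm and adds Gaussian noise before the server averages the updates. This is the generic Gaussian-mechanism pattern, not the paper's APBA strategy; the names `privatize_update` and `aggregate`, and the parameters `clip_norm` and `noise_multiplier`, are assumptions made for this example.

```python
import numpy as np

def privatize_update(update, clip_norm, noise_multiplier, rng):
    """Clip an update to L2 norm <= clip_norm, then add Gaussian noise."""
    update = np.asarray(update, dtype=float)
    norm = np.linalg.norm(update)
    # Scale down only when the norm exceeds the clipping bound.
    clipped = update * min(1.0, clip_norm / max(norm, 1e-12))
    # Noise standard deviation is proportional to the clipping bound.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=update.shape)
    return clipped + noise

def aggregate(client_updates, clip_norm, noise_multiplier, rng):
    """Average the privatised updates, as a plain FedAvg-style server step."""
    noisy = [privatize_update(u, clip_norm, noise_multiplier, rng)
             for u in client_updates]
    return np.mean(noisy, axis=0)

# Hypothetical updates from three clients (illustrative values only).
rng = np.random.default_rng(0)
updates = [[0.2, -0.1], [3.0, 4.0], [-0.5, 0.5]]
print(aggregate(updates, clip_norm=1.0, noise_multiplier=0.5, rng=rng))
```

A larger `noise_multiplier` gives stronger privacy at the cost of model utility, which is the trade-off an adaptive budget allocation scheme would tune.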
Abstract: In the current era of deeply integrated global supply chains, new-quality productivity acts as a powerful engine, continuously driving the high-quality development of the transportation and logistics sectors. This study focuses on innovative productivity, technical productivity, and green productivity as core dimensions of new-quality productivity. By gathering panel data from 31 provinces through multiple sources, an empirical model is constructed to analyse the influence mechanisms of each productivity component on the evolution of railway logistics. The findings reveal that innovation-driven productivity encourages railway logistics enterprises to continually explore new service models. Meanwhile, digital and technical productivity, along with regional e-commerce development levels, exert a substantial impact on railway logistics and facilitate industrial synergy through optimised resource integration. The results not only offer a foundation for enterprises to refine investment strategies and enhance operational development, but also provide valuable insights for policymakers to formulate targeted industrial policies and promote the dynamic growth of new logistics business forms. Keywords: new quality productivity; railway logistics; factor synergy. DOI: 10.1504/IJICT.2026.10077726
Abstract: To achieve dynamic attitude monitoring of shipboard equipment and overcome the limited generality of traditional single-view methods, this paper proposes a multi-feature-point, multi-view attitude measurement approach. The method is designed as a redundant supplement to existing attitude measurement systems, providing additional attitude information under complex operating conditions. An improved MobileNet-v2 network incorporating squeeze-and-excitation modules is employed for feature point extraction from equipment images, achieving a detection accuracy of 95.4% on the test set. Based on multi-view observations and known equipment geometry, the three-dimensional coordinates of feature points in the camera coordinate system are reconstructed, and the equipment attitude is estimated by solving the rotation matrix between the camera and body coordinate systems using singular value decomposition. Experimental validation on a small-scale multirotor UAV platform demonstrates root mean square errors of 0.708° in pitch, 0.833° in roll, and 0.593° in heading, indicating good potential for attitude monitoring of large-scale ship equipment. Keywords: shipboard drones; attitude measurement; MobileNet-v2; multi-view. DOI: 10.1504/IJICT.2026.10077859
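The SVD step mentioned in this abstract can be illustrated with the standard Kabsch procedure, which recovers the rotation between two sets of matched 3D points. This is a minimal sketch, not the authors' implementation: the function name `estimate_rotation` and the assumption that feature points are available as matched body-frame and camera-frame coordinates are introduced here for illustration only.

```python
import numpy as np

def estimate_rotation(body_pts, cam_pts):
    """Return (R, t) such that cam_pts ~= R @ body_pts + t (Kabsch method).

    Hypothetical helper; both inputs are (N, 3) arrays of matched points.
    """
    body_pts = np.asarray(body_pts, dtype=float)
    cam_pts = np.asarray(cam_pts, dtype=float)
    # Centre both point sets on their centroids.
    cb = body_pts.mean(axis=0)
    cc = cam_pts.mean(axis=0)
    # 3x3 cross-covariance between the centred sets.
    H = (body_pts - cb).T @ (cam_pts - cc)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection so that det(R) = +1.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cc - R @ cb
    return R, t
```

Pitch, roll, and heading would then be read off `R` using whichever Euler-angle convention the measurement system adopts; the abstract does not say which one is used.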
Abstract: Social network data has become a vital resource driving product innovation and design. Current research struggles to fully uncover users' emotional needs toward products when dealing with unstructured, high-dimensional social data, resulting in subpar product quality. To address this, this paper first employs a multi-scale attention network to analyse product emotional needs, capturing users' emotional demands. Subsequently, a spatial cross-reconstruction module is designed within the generative adversarial network to obtain more refined features. Simultaneously, a semantic correlation attention module is designed for mapping emotional needs to product images. This extracts attribute and word encodings as semantic representations to guide image generation, enhancing semantic consistency between emotional needs and visual content. Experimental results demonstrate that the proposed method achieves 92.71% accuracy in emotional need recognition and an FID of 11.88 for product images, outperforming state-of-the-art methods and delivering outstanding performance in innovative product design tasks. Keywords: innovative product design; social network; deep learning; emotional needs analysis; generative adversarial network; GAN. DOI: 10.1504/IJICT.2026.10077935
Abstract: Evaluating students' actual learning progress accurately and dynamically remains a challenge in education. Existing evaluation methods rely mainly on static exam scores and therefore cannot reflect students' competency development in a timely manner. This paper proposes a novel intelligent evaluation model. Based on meta-learning, the model can adapt to various course requirements and combine the evaluation experience of multiple virtual educational specialists. Experiments show that the system tracks the learning process dynamically and evaluates it precisely, with an accuracy rate of 89.2% and an error rate of only 7.8%. Even in the early stage, the model achieves a high accuracy of 80%. The model overcomes the lag effect of existing evaluation methods and helps educators obtain timely information for improving teaching. Keywords: meta-learning; knowledge distillation; outcome-based education; OBE; dynamic evaluation. DOI: 10.1504/IJICT.2026.10077765
Abstract: Lexical ambiguity is a core challenge in machine translation; for example, 'apple' may need to be translated as either 'fruit' or 'Apple Inc.' depending on context. While existing neural machine translation models produce fluent output, they often exhibit bias in selecting specialised terminology and low-frequency words. To address this issue, this study combines statistical patterns from large-scale corpora with the probabilistic modelling capabilities of neural networks to construct a lexical selection optimisation framework. Experiments on the publicly available Workshop on Machine Translation (WMT) English-German translation dataset demonstrate that this approach improves the bilingual evaluation understudy (BLEU) score from 31.2 to 33.3 while reducing the translation error rate from 52.1% to 49.8%. This confirms that integrating statistical prior knowledge effectively enhances machine translation accuracy and lexical consistency. Keywords: machine translation; lexical choice; corpus statistics; probabilistic modelling. DOI: 10.1504/IJICT.2026.10077720
Abstract: Traditional course recommendation methods lack flexibility and perform poorly, as their quality relies heavily on historical datasets, making them ill-suited for complex-attribute tasks. This study integrates collaborative filtering (CF) with deep learning (DL), leveraging item-based recommendation timeliness to boost accuracy. Experimental outcomes indicate that the proposed algorithm achieves a mean square error of 0.572, whereas the mean square errors of content-based recommendation algorithms and singular value decomposition methods are higher by 0.076 and 0.099, respectively. The core new insight lies in the bidirectional empowerment fusion architecture of CF and DL: integrating CF-derived course similarity as a priori information into the DL model's input and attention mechanism, which not only solves CF's sensitivity to sparse data but also compensates for DL's poor interpretability and reliance on massive data. This provides a novel hybrid recommendation paradigm for addressing personalised course recommendation issues in higher vocational colleges. Keywords: deep learning; course recommendation; collaborative filtering; recurrent neural network; RNN; higher vocational colleges. DOI: 10.1504/IJICT.2026.10077725
Abstract: This study tackles inefficient resource allocation in concurrent English translation teaching platforms by proposing the C-GHM model. This model integrates a greedy heuristic task migration algorithm with multi-dimensional corpus features. It constructs a priority evaluation system (using vectors like task professionalism and syntactic complexity) and a simulated annealing optimisation layer for intelligent computing resource allocation. Experimental results show C-GHM significantly outperforms traditional algorithms: reducing average task completion time to 125.3 seconds, increasing throughput to 45.2 tasks/second, and optimising load imbalance to 0.15. It also excels in robustness, energy efficiency, and scalability tests. Its core contribution is a transferable, collaborative scheduling framework that synergistically combines greedy heuristics, corpus features, and simulated annealing, achieving superior performance in heterogeneous task environments, with potential applications beyond translation platforms. Keywords: greedy heuristic task migration algorithm; English translation teaching platform; resource scheduling optimisation; corpus-driven; multi-objective optimisation. DOI: 10.1504/IJICT.2026.10077857
Abstract: This paper proposes a federated learning framework integrated with adaptive graph convolution for accurate and privacy-preserving carbon emission calculation in cross-regional power grids. It addresses data silos and privacy concerns by training models locally, avoiding raw data transfer. The adaptive graph convolution component automatically captures the dynamic spatial dependencies and carbon flow effects between grid regions. Validated on a Chinese grid dataset, the method reduces calculation errors by 22.3% and 14.7% compared to centralised and traditional distributed approaches, respectively, while demonstrating strong robustness against grid topology and operational fluctuations. Keywords: federated learning; adaptive graph convolutional networks; grid carbon emissions; collaborative computing; privacy protection. DOI: 10.1504/IJICT.2026.10078002
Abstract: Vocational skill assessment needs decisions that are accurate, explainable, and stable when rubrics change. To reduce hidden contradictions and cohort drift in evidence-driven assessment, this paper proposes a joint management approach that integrates a knowledge graph with rule-checked reasoning and distribution monitoring. First, indicator dependencies and evidence links are organised into a graph to keep decisions traceable. Then, explicit rules are validated and applied together with model outputs to prevent inconsistent judgements. Finally, evidence embeddings are analysed with manifold and density diagnostics to reveal sparse regions and guide targeted governance tests. Experiments on five cohorts and large-scale interaction logs show that decision accuracy increases by 3.4 percentage points, rule consistency improves from 88.1% to 96.7%, and uncovered rule paths decrease by 41% compared with strong baselines. The approach supports maintainable, audit-ready assessment at scale. Keywords: vocational skill assessment; knowledge graph; rule-based reasoning; evidence governance; embedding manifold; robustness monitoring. DOI: 10.1504/IJICT.2026.10077813
Abstract: Accurately attributing Chinese second language grammatical errors is crucial for optimising teaching strategies. However, traditional methods are prone to being disturbed by confounding factors such as the learner's level, making it difficult to distinguish between superficial correlation and true causation. To address this, this paper introduces the framework of counterfactual causal inference for the first time. By simulating 'correction' interventions on specific grammatical points, it aims to identify the root causes of the errors. Experiments based on a large-scale public Chinese proficiency test dynamic composition corpus show that this method achieves an accuracy rate of 87.5% in error attribution, an improvement of 8.2% over the best baseline model; its causal effect ranking quality reaches 0.92, significantly outperforming traditional correlation analysis. This method provides interpretable and verifiable causal insights for Chinese second language teaching, and can directly serve the construction of personalised learning paths. Keywords: second language acquisition; attribution of grammatical errors; dual machine learning; DML. DOI: 10.1504/IJICT.2026.10077934
Abstract: Foreign language anxiety is a key psychological barrier to language acquisition. In this study, we propose a dynamic mitigation framework based on multi-modal conditional generative adversarial networks. The framework uses cross-modal transformers to fuse speech, text, and facial features in real time to identify anxiety states, and conditions on these states to generate text rewriting, speech adjustment, and visual guidance feedback. Experiments show that the anxiety recognition accuracy of the system reaches 85.3%, and the naturalness score of the generated feedback is significantly better than the baseline (mean opinion score 4.2). Longitudinal studies found that participants who used the system had an average 30% decrease in state anxiety scores and a 25% increase in spoken fluency. This study provides an effective paradigm for developing personalised and adaptive emotion regulation systems. Keywords: adversarial networks; dynamic mitigation; affective computing; conditional generation. DOI: 10.1504/IJICT.2026.10077719
Abstract: While neural machine translation facilitates effective cross-lingual information transfer, existing lightweight architectures continue to encounter significant challenges in structural modelling precision and decoding stability. They struggle with long-range and syntactic dependencies, shallow attention limits fine-grained structure representation, and compressed architectures often cause semantic drift or repetition. To address these issues, we propose LDCIR-Trans, a lightweight structure-aware translation model. It introduces essential structural priors and a stable decoding mechanism while remaining compact. First, a dependency graph modelling (DGM) module explicitly constructs dependency graphs to supply syntactic cues and compensate for limited global modelling capacity. Second, a dependency-constrained iterative refinement (DCIR) mechanism guides decoding with structure-enhanced signals, enables progressive correction, and reduces semantic deviation. Finally, the lightweight structure-aware decoder (LSAD) employs parameter sharing and distribution calibration to improve representation and stability. Experiments show that LDCIR-Trans achieves high efficiency and significantly outperforms existing lightweight baselines. Keywords: neural machine translation; NMT; dependency graph modelling; DGM; gate-augmented iterative update; lightweight structure-aware decoder; LSAD. DOI: 10.1504/IJICT.2026.10077724
Abstract: Accurate photovoltaic (PV) model parameter identification is crucial for reliable simulation and maximum power point tracking (MPPT). To address common metaheuristic shortcomings like sensitivity to initialisation and premature convergence, this study proposes a modified improved crested porcupine optimiser (MICPO) featuring a dynamic balancing framework. MICPO integrates chaotic reverse learning, optimal value-guided search, and polynomial differential learning to maintain a robust global-local search balance. Validated on single- and double-diode models, MICPO achieves state-of-the-art accuracy (e.g., RMSE of 9.8602E-04) with faster, more stable convergence. Its superior generalisation is further demonstrated on the CEC2017 benchmark and commercial PV modules under varying conditions. Results confirm MICPO as a highly accurate, efficient, and robust solution for practical PV parameter extraction. Keywords: PV cell parameter identification; single-diode and double-diode model; modified improved crested porcupine optimisation; MICPO; algorithm; dynamic balancing framework. DOI: 10.1504/IJICT.2026.10077856
Abstract: Combining personalised expression with high-fidelity geometric reconstruction has been a challenging task. To generate realistic virtual humans, this paper proposes an effective new framework. By combining users' emotional preferences with implicit neural radiance fields, personalised virtual human bodies are generated. This method encodes multi-modal user input into structured conditional variables, and then guides the conditional neural radiance field model to generate facial images with emotional expressiveness. The innovative learnable user-specific embeddings can capture individual expression styles. Additionally, the attention-based fusion module ensures precise alignment between emotional semantics and facial details. Through experiments on standard datasets, the proposed method achieved a Fréchet inception distance score of 15.38 and an emotion recognition accuracy of 0.892, significantly outperforming three baseline approaches. These results demonstrate its substantial advantages in emotional accuracy, identity preservation, and overall visual quality. Keywords: virtual human generation; implicit neural radiance fields; affective computing; conditional generation. DOI: 10.1504/IJICT.2026.10078000
Open Access
