Forthcoming Articles
International Journal of Information and Communication Technology

Forthcoming articles have been peer-reviewed and accepted for publication but are pending final changes, are not yet published and may not appear here in their final order of publication until they are assigned to issues. Therefore, the content conforms to our standards but the presentation (e.g. typesetting and proof-reading) is not necessarily up to the Inderscience standard. Additionally, titles, authors, abstracts and keywords may change before publication. Articles will not be published until the final proofs are validated by their authors.
Forthcoming articles must be purchased for the purposes of research, teaching and private study only. These articles can be cited using the expression "in press". For example: Smith, J. (in press). Article Title. Journal Title.
Online First articles are also listed here. Online First articles are fully citeable, complete with a DOI. They can be cited, read, and downloaded. Online First articles are published as Open Access (OA) articles to make the latest research available as early as possible.
International Journal of Information and Communication Technology (16 papers in press)
Regular Issues
Abstract: The amount of garbage continues to rise, making intelligent garbage classification increasingly important for future resource recovery. Current methods still rely largely on static images, which perform poorly in dynamic real-world settings. Moreover, practical applications such as surveillance cameras or mobile inspections face additional challenges including computational efficiency, scene diversity, and long-term robustness, which traditional approaches cannot adequately address. This paper presents a real-time garbage classification framework suitable for both image and video surveillance. We design an encoder-decoder structure that eliminates matrix multiplication, significantly reducing computational cost. Additionally, we introduce a dynamic tanh (DyT) layer to enhance normalisation, replace the traditional feedforward module with a Kolmogorov-Arnold network (KAN) for better interpretability of features, and employ dense layers without matrix multiplication to further boost efficiency. Experiments demonstrate that our method achieves an effective balance of accuracy, computational cost, and robustness, making it well-suited for complex, dynamic garbage detection scenarios. Keywords: waste sorting; image; MatMul-free; transformer; dynamic tanh; DyT; Kolmogorov-Arnold network; KAN. DOI: 10.1504/IJICT.2026.10078892
Abstract: This study addresses the limitations of existing UAV logistics research, which often neglects communication instability and multiple disturbances in real-time path planning. The objective is to optimise coordinated UAV paths by considering line crossing and multifactorial disruptions to improve delivery efficiency and system robustness. The methodology enhances a simulated annealing algorithm with a sub-path local search operator, integrated with a communication sensing strategy and energy consumption model. Results from six test cases show the improved algorithm reduces average distance by 3.13–5.41%, computation time by 13.6–17.6%, and boosts path quality by 11.5% in communication-hostile settings. The algorithm effectively manages disturbances like wind, obstacles, and demand changes, enhancing stability and adaptability. This research offers a novel co-optimisation method for UAV coordination, balancing communication and energy efficiency, with significant practical value for urban delivery systems. Keywords: terminal logistics; route planning; cruising range; simulated annealing algorithm. DOI: 10.1504/IJICT.2026.10078893
Abstract: To address the issue of existing spelling correction methods neglecting semantic relevance in academic English translation quality analysis, a semantic spelling correction algorithm is proposed. The pipeline first performs pre-analytic validation to reduce spelling and semantic noise, then conducts coherence, grammaticality, and terminology assessments using learned features. A curated corpus comprising 20,000 academic text fragments is utilised for training and evaluation, and benchmark baselines are included to ensure methodological comparability. Quantitative results demonstrate that the designed algorithm achieves a spelling error recognition rate of 93.45% and a processing speed of 240.78 words/second, significantly improving the accuracy and efficiency of spelling correction, while maintaining semantic integrity (cosine similarity 0.87), which is significant for improving the quality of academic English translation. The work reframes correction as methodological infrastructure within quality analysis, integrating a semantic-aware module that safeguards metric fidelity before analytic scoring. Keywords: academic English translation; spelling correction algorithm; semantic analysis; academic quality assessment; deep learning model. DOI: 10.1504/IJICT.2026.10078894
Abstract: In response to the challenges faced by public art design in the urbanisation process, such as low efficiency and insufficient diversity, as well as the issues of black-boxing and poor controllability inherent in traditional generative adversarial networks (GANs), this paper proposes an innovative algorithm that integrates GANs with parametric design. The algorithm aims to decouple structure and style through a dual-branch generator, achieve dynamic regulation through a parametric attention fusion module, and enhance authenticity with a multi-scale discriminator. The dual-branch generator architecture is employed to separately process structural primitives and stylistic details of the form, while introducing a parametric attention fusion module to dynamically modulate the feature fusion process. Additionally, a multi-scale discriminator and decoupling loss function are incorporated to improve generation quality and stability. Experimental results show a FID of 15.2, an inception score (IS) of 8.9, a peak signal-to-noise ratio (PSNR) of 28.5 dB, a robustness FID of only 12.5%, and a user rating of 4.3. Keywords: generative adversarial network; GAN; public art; generation algorithm; parametric design; double branch generator. DOI: 10.1504/IJICT.2026.10078895
Abstract: With societal development, traditional community planning relies on designers' experience and static norms, failing to dynamically address residents' diverse subjective needs. Existing generative adversarial network (GAN) models aid design but suffer from pixel blurring, structural defects and insufficient UX semantic guidance. This study integrates GIS environmental features, embedded coding social features and scale-analysed UX features; via a gated cross-attention mechanism, it enables the generator to focus on key spatial elements. Multi-discriminator collaboration and cycle consistency loss (CCL) ensure generation quality at pixel, structural and semantic levels. SCD-20k experiments show the peak signal-to-noise ratio of Sustainable Community user experience generation (SCUE-Gen) is 34.6 dB. Its experience vector similarity (EVS) is 0.89, outperforming benchmarks like cGAN and Pix2Pix; professional and non-professional satisfaction both exceed 8.8. The framework fits urban planning workflows, offering iterative schemes balancing professional norms and residents' needs, data-driven support for sustainable communities, and interdisciplinary backing for humanistic smart cities. Keywords: fused generative adversarial network; attention model; community user experience; visual attention mechanism; multimodal conditions. DOI: 10.1504/IJICT.2026.10078896
Abstract: This study develops a transformer-based multimodal intelligent evaluation model for college Russian translation instruction, tackling frequent pragmatic failures and context deficiency that lead to lagged feedback. It leverages XLM-RoBERTa to capture Russian's intricate morphological and syntactic features, adopts ViT for global visual context of accompanying images, and uses multi-head cross-attention (MCA) to deeply integrate and calibrate textual-visual semantics. Experiments on the improved multimodal corpus based on Wikipedia-based image text (WIT) show that the model's Pearson correlation coefficient, used to gauge the consistency of scoring, is as high as 0.835, and the error diagnosis reaches 89.4%. In identifying high-order pragmatic inconsistency errors, its F1-score improved significantly to 0.87 (p < 0.01) compared with the ResNet+XLM-R (naive fusion) baseline model. Highly consistent with expert scores in real teaching, the model proves valuable for accurate teacher feedback and cultivating students' text-image integrated translation thinking. Keywords: multimodal representation learning; Russian translation teaching; translation intelligence evaluation; XLM-RoBERTa; vision transformer. DOI: 10.1504/IJICT.2026.10078897
Abstract: This study addresses the critical challenge of automatically harvesting high-quality Korean language teaching resources from the open web, where existing methods focus on topical relevance rather than pedagogical suitability. The study proposes a novel quality-aware adaptive crawling and cleansing framework. It integrates a real-time linguistic quality assessment module, powered by universal dependencies parsing, with an adaptive crawling strategy driven by a contextual bandit algorithm. Experimental results demonstrate that the quality-aware adaptive crawling and cleansing framework significantly outperforms current state-of-the-art methods. It achieves a high-quality page acquisition rate of 7.47 pages per hour (a 39% improvement), a pedagogical precision of 0.892, and a top-ranking accuracy of 0.915. The framework successfully bridges linguistic theory and web mining, offering an effective solution for building structured, high-quality pedagogical resource repositories. Keywords: adaptive web crawling; quality assessment; universal dependencies; resource cleansing. DOI: 10.1504/IJICT.2026.10078898
Abstract: With the rapid growth of energy literature data, accurately mining semantic associations between keywords and dynamically tracking topic evolution patterns is a key challenge. This paper proposes an algorithm for keyword association mining and topic evolution analysis for knowledge graphs in the energy field. The model integrates topic modelling, graph neural networks and time series analysis. It combines the topic probability distribution from BERTopic with the knowledge graph topology via a topic-graph coupling mechanism, and uses a graph attention network to optimise association weights. Test results show the model outperforms baseline models like LDA and BERTopic in accuracy (91.2%), F1-score (0.892) and topic consistency (0.848). It also excels in robustness (F1-score drops 7.2% with 20% noise), interpretability (expert score 4.5/5) and generalisation (performance degradation 6.3%). These results verify the model's efficiency and reliability for practical energy knowledge analysis, providing support for energy policy evaluation and technology trend prediction. Keywords: knowledge graph of energy field; keyword association mining; theme evolution analysis; graph neural network; GNN; dynamic topic model; DTM. DOI: 10.1504/IJICT.2026.10078952
Abstract: English learning anxiety significantly impairs learners' cognitive performance and language acquisition, yet existing interventions lack real-time responsiveness and personalisation. This paper introduces multimodal generative artificial intelligence for anxiety intervention in conversation (MAGIC), a dialogue system that continuously perceives a learner's anxiety level through audio, video, and text, and generates adaptive supportive responses to alleviate anxiety in real time. The system integrates a cross-modal transformer with bidirectional long short-term memory for anxiety perception, a conditional variational autoencoder for generating empathetic responses, and deep reinforcement learning to optimise when to intervene. A new English learning anxiety corpus comprising 120 real learners is constructed for training and evaluation. Experiments demonstrate that MAGIC significantly reduces self-reported anxiety (a = 0.31, p < 0.01) compared to baseline methods, confirming its effectiveness in providing timely and personalised emotional support. Keywords: multimodal perception; generative dialogue system; English learning anxiety; ELA; real-time intervention; cognitive load theory; CLT. DOI: 10.1504/IJICT.2026.10078953
Abstract: In the field of intelligent interaction and mental health screening, accurately identifying the emotions in singing is crucial. However, traditional methods rely solely on voice features, which are prone to misjudgment when recognising complex emotions (such as sarcastic or mixed emotions). To address this, this paper proposes a multimodal deep learning model that integrates singing and lyrics text, achieving deep collaboration of the two types of information through a cross-modal attention mechanism. Experiments show that the sentiment recognition accuracy of the model proposed in this paper reaches 81.3%, which is significantly higher than that of the model using only voice (accuracy 72.1%) and the early fusion method (accuracy 78.5%). Its comprehensive discrimination ability is also superior to the comparison baseline. This confirms that multimodal fusion can more comprehensively capture emotional cues, providing a reliable solution for achieving more refined human-computer interaction. Keywords: vocal emotion recognition; multimodal learning; attention mechanism; deep learning. DOI: 10.1504/IJICT.2026.10078954
Abstract: Peak-time congestion in smart scenic areas often concentrates at a few hotspots and spreads quickly, raising safety pressure and degrading visitor experience. To address dynamic diversion under non-stationary demand, this paper proposes a constrained deep reinforcement learning framework for real-time guidance. First, a graph-based encoder captures spatial spillover among attractions and corridors. Then, a spatiotemporal attention module anticipates short-horizon surges and stabilises decisions. Finally, constraint-aware learning keeps recommendations within safety margins while balancing waiting, load equity, and throughput. Experiments on calibrated peak-demand scenarios show that the proposed method reduces average waiting time from 29.2 to 24.6 minutes and cuts safety violation rate from 2.0% to 1.2% compared with a vanilla learning baseline. Relative to rule-based control, waiting drops from 38.6 to 24.6 minutes and near-violation time decreases from 41.2 to 14.8 minutes. The framework delivers robust improvements with steadier operating behaviour under diverse demand regimes. Keywords: smart scenic area; visitor diversion; crowd management; constrained deep reinforcement learning. DOI: 10.1504/IJICT.2026.10078955
Abstract: To address the difficulty online learning systems have in perceiving and optimising learners' internal states in real time, this paper proposes a dynamic optimisation algorithm based on a cognitive-emotional load model and multimodal fusion. The algorithm constructs a theoretical framework for the collaborative computing of cognition and emotion, and uses a hierarchical deep reinforcement learning architecture to optimise intervention strategies over a continuous space. On a dataset constructed in a simulation environment, and compared with a variety of cutting-edge baselines, the algorithm significantly improves the standardised learning benefit to 85.7 and reduces the incidence of harmful cognitive-emotional overload events to 9.3%. This research provides a theoretically and technically feasible path for building state-adaptive intelligent education systems. Keywords: cognitive emotional load; multi-modal fusion; deep reinforcement learning; DRL; online learning intervention. DOI: 10.1504/IJICT.2026.10079056
Abstract: To address the issues of insufficient real-time performance and accuracy in the communication of sports competitions, this study proposes a dynamic LSTM for sports event communication effectiveness (DLSTM-SEC) model based on dynamic long short-term memory (LSTM) to optimise the communication effect of sports events. The model combines compressed sensing feature selection and the momentum-improved adaptive moment estimation (Adam) algorithm to achieve efficient capture and rapid response to key events. Experimental results show that the DLSTM-SEC model achieves a test accuracy of 91.1% with an average loss value below 0.35. In terms of communication delay, 95% of the latency is controlled within 250 ms, and the 99th-percentile delay is no more than 450 ms. Under abnormal load conditions, the packet loss recovery rate remains above 92%. The results demonstrate that the model has stable real-time communication capability and data adaptability in complex dynamic scenarios, and can support the dynamic optimisation and efficient operation of sports event communication in an intelligent media environment. This study aims to provide a reliable competition communication tool for sports event organisers, intelligent media platforms, and audiences to improve real-time information acquisition and user experience, and realise more efficient event content push and interactive feedback. Keywords: long short-term memory; LSTM; Adam algorithm; sports events; intelligent media communication; interactive feedback. DOI: 10.1504/IJICT.2026.10079116
Abstract: The increasing demand for sustainable and technology-enabled education underscores the necessity of adaptive learning systems to meet diverse learner needs. Traditional static curricula struggle to support dynamic knowledge domains and personalised learning paths. This study presents a big data-driven personalised learning framework that integrates educational datasets and learner analytics from wearable-enabled environments to dynamically adjust content delivery. Experiments using real educational data and simulated interaction logs show that the proposed framework outperforms conventional static approaches, with a 32.6% improvement in learning gain, a 22% increase in quiz accuracy, and a 50% rise in learner engagement time. Comparative assessments against four existing adaptive models verify its superior effectiveness and robustness. The findings highlight the value of big data analytics and intelligent models in aligning academic learning with evolving industry skills. This framework offers a scalable, sustainable solution for modern education, supporting personalised, adaptive learning tailored to future skill requirements. Keywords: big data; personalised learning; adaptive content; educational data analytics; intelligent education. DOI: 10.1504/IJICT.2026.10079117
Abstract: In response to challenges in temporal modelling, heterogeneous data fusion, and recommendation transparency in student career development, this paper proposes a temporal knowledge graph method based on large language models and interpretable reasoning. The approach designs a dynamic graph with time intervals to capture skill evolution, builds a self-validating extraction pipeline to automatically extract temporal information from unstructured resumes, and integrates symbolic logic with vector matching for interpretable reasoning. Experiments on CareerHop demonstrate strong recommendation accuracy with area under the curve reaching 0.842, an 11.7% improvement over graph neural networks, and temporal extraction accuracy reaching 0.957, a 57% increase over rule-based baselines. This technical approach addresses limitations of static representations in capturing ability growth and provides an accurate, transparent solution for high-stakes career decisions. Keywords: temporal knowledge graph; explainable recommendation; large language model; LLM; person-job matching. DOI: 10.1504/IJICT.2026.10079189
Abstract: Traditional subjective observation for diagnosing students' state in English classes suffers from bias, low efficiency, and inability to capture multi-dimensional information. This study builds a dynamic diagnosis system integrating text, audio, and video data. It employs an improved active learning algorithm with diversity constraints to optimise annotation, a semantic-state decoupling framework using triplet loss to reduce interference, feature alignment and cross-modal attention fusion to improve feature quality, and parallel deployment to accelerate response. Experimental results show multi-modal fusion achieves 90.5% accuracy, 8% higher than the best single modality. The active learning strategy yields 83.1% sample utilisation and 91.9% model accuracy within 180 minutes. The decoupling mechanism lowers diagnostic error to 9.0% in high-semantic complexity scenarios, and parallel deployment cuts response delay to 80 ms. The system significantly enhances diagnostic accuracy and efficiency, providing reliable technical support for personalised and real-time teaching intervention in English classes. Keywords: students’ status in English class; dynamic diagnosis; multi-modal data fusion; feature decoupling; improved active learning. DOI: 10.1504/IJICT.2026.10079190 | Open Access
