Forthcoming Articles

International Journal of Information and Communication Technology (IJICT)

Forthcoming articles have been peer-reviewed and accepted for publication but are pending final changes, are not yet published and may not appear here in their final order of publication until they are assigned to issues. Therefore, the content conforms to our standards but the presentation (e.g. typesetting and proof-reading) is not necessarily up to the Inderscience standard. Additionally, titles, authors, abstracts and keywords may change before publication. Articles will not be published until the final proofs are validated by their authors.

Forthcoming articles must be purchased for the purposes of research, teaching and private study only. These articles can be cited using the expression "in press". For example: Smith, J. (in press). Article Title. Journal Title.

Online First articles are also listed here. Online First articles are fully citeable, complete with a DOI. They can be cited, read, and downloaded. Online First articles are published as Open Access (OA) articles to make the latest research available as early as possible.

Articles marked with the Open Access icon are Online First articles. They are freely available and openly accessible to all, without any restriction except those stated in their respective CC licenses.

International Journal of Information and Communication Technology (28 papers in press)

Regular Issues

  •   Three dimensional temperature field modelling and simulation for the deicing of post insulators
    (Free Full-text Access) CC-BY-NC-ND
    by Shenli Wang, Song Xie, Di Zhang, Jun Wu, Yi Shi, Jin Li, Zhenqiang Liao 
    Abstract: The icing of pillar insulators in actual substation operating environments can lead to potential hazards such as power outages. Effective deicing of pillar insulators is of great significance. As an emerging deicing method, jet heating deicing lacks existing models. This study innovatively explores the temperature field distribution through advanced numerical simulation methods based on COMSOL, unlike traditional methods that rely mainly on experimental measurements or simple theoretical models. It further describes the process of establishing a three-dimensional model in detail. Through simulation analysis, the three-dimensional temperature field distributions of post insulators under different working conditions or heat source parameters are obtained, visually presenting the change trends and distribution characteristics of temperature. The research results provide a theoretical basis for in-depth understanding of the heat transfer mechanism during the deicing process of post insulators.
    Keywords: pillar insulator; three-dimensional temperature field distribution; distribution characteristics; heat transfer mechanism.
    DOI: 10.1504/IJICT.2025.10074671
     
  •   Development of an instructional model for Korean translation in multilingual classroom contexts
    (Free Full-text Access) CC-BY-NC-ND
    by Rong Rong, Xiaojian Liu 
    Abstract: Multilingual classroom contexts pose significant pedagogical challenges, particularly when students have diverse native languages and varying levels of Korean proficiency. This study introduces a data-driven instructional model for Korean translation education that employs machine learning to address learner diversity. The model evaluates translation outputs, identifies learner-specific error patterns, and personalises instruction based on three key variables: native language influence, historical translation accuracy, and individual learning progression. A dataset of Korean translation tasks was collected from university students representing six L1 backgrounds: Chinese, Vietnamese, Arabic, Russian, Japanese, and Spanish. Texts were pre-processed through tokenisation, lemmatisation, and POS tagging, with Word2Vec embeddings used for feature extraction. The proposed Sparrow Search Optimiser Tuned Attention-based Sequence-to-Sequence (SSO-Attn-Seq2Seq) model demonstrated substantial improvements, achieving 88-91% across accuracy, precision, recall, and F1-score. Results highlight its adaptability in handling idiomatic expressions and syntactic variation, providing a scalable solution for multilingual Korean language education.
    Keywords: multilingual classroom settings; grammatical variations; languages; SSO-Attn-Seq2Seq.
    DOI: 10.1504/IJICT.2025.10074672
     
  •   Application of visual data analysis system based on artificial intelligence
    (Free Full-text Access) CC-BY-NC-ND
    by Xinyun Cheng, Shijie Zhang, Pengfei Wang, Zhikang Wang, Lincheng Qi 
    Abstract: To address the shortcomings of traditional systems in automated data processing and predictive analysis, this study explores a visual data analysis system based on advanced artificial intelligence technology, integrating three core functions: automated data preparation, intelligent recommendation and predictive analysis. Data cleaning was carried out using weighted k-nearest neighbours imputation and the isolation forest algorithm. Unstructured data was handled using bidirectional encoder representations from transformers (BERT) models, and key patterns, trends and anomalies were discovered by means of association rule learning techniques. The autoregressive integrated moving average (ARIMA) model was used to forecast the time series data. Distributed deployment supports the hardware and solves the system layout problem. The evaluation results showed that the ARIMA model performed best in data prediction, with an average prediction time of only 1.075 seconds, the lowest RMSE (7.19) and MAE (4.70), and the highest prediction accuracy (96.00%). This paper provides efficient and intelligent data support and solutions to help decision-making and strategic planning in various industries.
    Keywords: visual data analysis system; artificial intelligence technology; distributed deployment; predictive analysis; ARIMA model; data automation processing.
    DOI: 10.1504/IJICT.2025.10074751
     
  •   Modelling and optimisation of intelligent speech feedback mechanisms for French pronunciation correction
    (Free Full-text Access) CC-BY-NC-ND
    by Ge Song, Wenyong Guo 
    Abstract: In order to improve the accuracy of French pronunciation correction, this study develops a multimodal feedback system, which adopts the improved wav2vec2 model to integrate the physiological features of articulation, and combines time-frequency analysis to extract the acoustic parameters; generates the targeted training materials through the dynamic knowledge graph and integrates the articulatory organ visualisation module; and designs the selective spectrum enhancement strategy to assist in the listening discrimination training. Experiments show that the feedback delay of the system is 155 ms, and the VOT recognition error is reduced by 9.2%; after ten weeks of training, the confusion rate of articulatory parts is reduced by 5.1%, and the accuracy rate of question rhymes reaches 79.2%. The results confirm that moderate multimodal feedback has a progressive optimisation effect on French pronunciation.
    Keywords: multimodal feedback systems; French language; acoustics; dynamic knowledge mapping.
    DOI: 10.1504/IJICT.2025.10074805
     
  •   Sports movement analysis method considering early fusion network structure and key points of human body
    (Free Full-text Access) CC-BY-NC-ND
    by Jianjun Yin, Jie Chen 
    Abstract: With the rapid development of sports and computer technology, accurate sports movement analysis has become crucial for enhancing athlete performance and rehabilitation. Traditional methods face challenges such as difficulty in recognising multi-scene actions and inconsistent sequence lengths. To address this, a novel approach combining an early fusion network with human key point data is proposed. By integrating skeleton node information and using Neville interpolation, the method enhances feature extraction and temporal localisation. Experimental results show significant improvements: compared to traditional models such as LSTM and ST-GCN, the EF-GCN model proposed in this study achieves an increase in classification accuracy of up to 18.5% across various neural networks, and performance metrics such as accuracy, precision, recall, and F1-score improve by around 10%. This approach offers substantial advancements in motion analysis and holds great potential for future sports training and rehabilitation applications.
    Keywords: early fusion network structure; key points of the human body; sports movement analysis; Neville interpolation method; temporal positioning.
    DOI: 10.1504/IJICT.2025.10074850
     
  •   Dynamic scheduling of hospital social security settlement based on multi-agent reinforcement learning
    (Free Full-text Access) CC-BY-NC-ND
    by Yinping Tian 
    Abstract: Hospital social security settlement is a core link connecting medical services, medical insurance systems and patient interests, with its operational efficiency directly affecting medical service quality and social security system sustainability. Expanded medical insurance coverage, surging daily settlements and frequent policy adjustments make traditional static scheduling unable to adapt to system dynamics, causing long patient waits, low terminal utilisation and high verification failures. This study proposes a dynamic scheduling framework for hospital social security settlement based on multi-agent reinforcement learning, with four intelligent agents for distributed decision-making, plus a multi-objective reward function and constrained action mechanism. Experiments with real data from a tertiary Grade A hospital show the framework cuts average settlement delay by 38.2% and 21.5%, raises terminal utilisation by 27.6% and maintains over 99.5% compliance. It offers an intelligent solution to boost settlement efficiency and supports medical insurance service digital transformation.
    Keywords: hospital social security settlement; dynamic scheduling; multi-agent reinforcement learning; MARL; intelligent agent; settlement efficiency.
    DOI: 10.1504/IJICT.2025.10074806
     
  •   An automatic fluency evaluation method for broadcast hosting speech: autoregressive speech LLM
    (Free Full-text Access) CC-BY-NC-ND
    by Bingyuan Li 
    Abstract: Oral fluency is a key indicator for evaluating the professional skills of broadcast hosting. To address the current research gap in modelling deep semantic associations for spoken fluency, this paper first utilises Res2Net for multiscale feature extraction from broadcast hosts' speech. Subsequently, a pause prediction module is proposed. This module predicts multiple types of pause labels based on the original text. It then predicts a Gaussian mixture distribution for each phoneme and achieves diverse phoneme durations through random sampling. Finally, an autoregressive large language model and a Transformer-based discriminative module are proposed. This module is applied at each time step of the autoregressive process and prevents misalignment phenomena via the Transformer and judging mechanism. Experimental results show that the proposed model achieves an evaluation accuracy of 93.35% and a word error rate of 0.7%, enabling high-accuracy fluency evaluation for oral speech.
    Keywords: spoken fluency assessment; feature extraction; Res2Net model; autoregressive large language model; Transformer model.
    DOI: 10.1504/IJICT.2025.10074862
     
  •   Japanese pronunciation detection and corpus construction based on cross-modal attention
    (Free Full-text Access) CC-BY-NC-ND
    by Xiaolu Liu 
    Abstract: To address Japanese pronunciation error detection, this paper proposes a fusion method based on cross-modal attention mechanisms and constructs a Japanese pronunciation corpus. The model integrates audio Mel-spectrogram and visual lip-motion features through attention mechanisms, effectively capturing fine-grained cross-modal interactions and enabling precise phoneme-level error recognition. Evaluated on both the public corpus from Saruwatari Lab, University of Tokyo and a self-built corpus, the proposed approach achieves an accuracy of 92.3%, which is 3.1% higher than the best baseline model. Moreover, it maintains a robust accuracy of 85.3% under a low signal-to-noise ratio of 5 dB, representing a 6.6% improvement compared to other methods. This study provides an effective and noise-robust tool for multimodal speech learning with strong potential for educational applications. The released corpus contains 50 hours of multimodal data with detailed annotations, offering comprehensive support for Japanese language teaching and advanced speech technology development.
    Keywords: cross-modal learning; pronunciation error detection; Japanese speech processing; attention mechanisms; corpus construction.
    DOI: 10.1504/IJICT.2025.10074807
     
  •   The impact of a personalised music recommendation system driven by reinforcement learning on college students' psychological adjustment
    (Free Full-text Access) CC-BY-NC-ND
    by Xiaomei Xu, Tiantian Xu 
    Abstract: Music serves as a convenient and effective tool for emotional regulation and holds significant value in psychological adaptation. Addressing the issue that existing research has overlooked real-time changes in college students' interests, leading to insufficient analysis of the impact on psychological adaptation, this study first embeds music input into a long short-term memory network model, modelling the sequence processing issue as a Markov decision process, and uses a multilayer perceptron model as the decision agent. High and low-score decision actions are input into a pseudo-twin network to generate delayed rewards. The agent gradually learns strategies that maximise rewards, enabling reliable music recommendations. Finally, the study analyses the system's impact on college students' psychological adaptation. Experimental results show that the proposed model's hit rate improves by at least 4.73%, significantly enhancing college students' psychological well-being.
    Keywords: psychological adjustment; reinforcement learning; music recommendation; Markov decision process; long short-term memory network model.
    DOI: 10.1504/IJICT.2025.10074808
     
  •   AI-driven management information system for cost accounting and budget optimisation
    (Free Full-text Access) CC-BY-NC-ND
    by Xinying Lu 
    Abstract: This study aims to enhance traditional management accounting by developing an AI-based system for cost accounting and budget optimisation. The proposed framework follows a structured nine-step process, beginning with problem identification and concluding with system validation. Each stage ensures transparency and effective implementation. AI contributes to improved prediction accuracy, cost reduction, and more reliable financial decision-making, while highlighting the limitations of outdated, paper-based methods. In practice, AI assists in tasks such as tax processing, error detection, and forecasting. Historical data are used to train AI models, which are then applied to accounting operations and validated for accuracy and relevance. Despite challenges in integration, scalability, and ethical considerations, results indicate strong reliability, with Cronbach's alpha and composite reliability values exceeding 0.8 in SEM tests. Overall, the AI model outperformed traditional methods by reducing costs and adapting effectively to workload variations.
    Keywords: AI-driven; management information system; MIS; cost accounting; budget optimisation; machine learning; decision support systems; financial management; predictive analytics; reinforcement learning; digital transformation.
    DOI: 10.1504/IJICT.2025.10074809
     
  •   Smart tourism services and resource optimisation based on big data and knowledge graphs
    (Free Full-text Access) CC-BY-NC-ND
    by Jia Zuo, Jian Li 
    Abstract: To address the challenges of information overload and resource misallocation in the tourism industry, this paper proposes an intelligent service framework that integrates multi-source big data with knowledge graphs. By constructing a tourism-specific knowledge graph from the Yelp dataset (containing over 12,537 POIs and 45,821 users) and combining relational graph convolutional networks with long short-term memory models, the framework achieves precise personalised recommendations and dynamic resource optimisation. The proposed multi-task learning architecture jointly optimises recommendation accuracy and resource prediction performance. Extensive experiments show that the model significantly outperforms baseline methods, achieving a Precision@10 of 0.0914 and Recall@20 of 0.2542, along with a root mean square error of 21.73 in flow prediction, demonstrating notable improvements in interpretability and robustness. This study provides an effective technical pathway for enhancing tourism service intelligence and operational efficiency.
    Keywords: knowledge graph; smart tourism; resource optimisation; recommendation system; big data analysis.
    DOI: 10.1504/IJICT.2025.10074810
     
  •   Spatio-temporal convolutional networks empowering political ideology trend prediction on social media platforms
    (Free Full-text Access) CC-BY-NC-ND
    by Wencheng Liu, Xiaojuan Deng 
    Abstract: This paper proposes a prediction model based on a spatio-temporal graph convolutional network to capture the spatio-temporal dependencies of user interactions and improve the accuracy of predicting trends in the dissemination of ideological and political themes. Experimental results show that this method achieves an average improvement of 6.8% in the F1 score and a reduction of 7.2% in the prediction root mean square error for key metrics such as the probability of ideological and political hot topics appearing in the coming week and changes in regional sentiment distribution. The model effectively integrates the spatial structural information of social networks with dynamic temporal features, providing a reliable computational tool for quantitative analysis and forward-looking assessment of ideological and political dynamics in social media environments. This helps relevant departments sense and guide the online ideological and political ecosystem in a timely manner.
    Keywords: spatio-temporal convolutional networks; social media platforms; ideological and political communication.
    DOI: 10.1504/IJICT.2025.10074811
     
  •   A preschool education social media-monitoring system based on optimised-sentiment analysis
    (Free Full-text Access) CC-BY-NC-ND
    by Jie Qiu 
    Abstract: This paper proposes a sentiment monitoring system for the preschool industry that tracks public sentiment regarding early childhood learning. The system gathers and pre-processes social media posts with the help of transformer-based language models and entropy scoring, sentiment classification, and unpredictability measurement. The information is collected and presented in real time on a dynamic dashboard. Findings indicate that there is no consistency between the magnitude and the sentiment change of post volume and that entropy-based metrics provide a more precise analysis of the volume. The system is capable of identifying abrupt shifts in mood, so organisations can respond to current issues at the earliest opportunity. In preschool learning, this method increases parent involvement, organisational sensitivity, and relationship development by using AI-based sentiment analysis.
    Keywords: sentiment analysis; social media monitoring; emotional entropy; transformer models.
    DOI: 10.1504/IJICT.2025.10074812
     
  •   Emotion recognition in artistic images based on feature fusion and transfer learning
    (Free Full-text Access) CC-BY-NC-ND
    by Laohui Liang 
    Abstract: Currently, artistic images are scarce with limited sample sizes, and most sentiment analysis relies on low-level image features with low accuracy. To address this, this paper first extracts two-dimensional features from images in different colour spaces. It then employs multi-scale convolutional kernels to extract deep semantic information from images, fusing feature information from different dimensions to effectively preserve semantic features across scales. Finally, the transfer component analysis algorithm is employed to reduce dimensionality of features in source and target domains within original space. An improved joint subspace learning method is used to learn a feature transformation subspace, reducing the conditional probability distribution distance between source and target domains while balancing recognition accuracy across categories. Model optimisation is achieved through adversarial training. Experimental results demonstrate that the proposed model improves recognition accuracy by at least 3.82%, effectively enhancing the accuracy of emotional recognition in artistic images.
    Keywords: artistic image emotion recognition; feature fusion; transfer learning; adversarial training; feature extraction.
    DOI: 10.1504/IJICT.2025.10074813
     
  •   Adaptive genetic algorithm for multi-topic polyphonic music generation
    (Free Full-text Access) CC-BY-NC-ND
    by Huiyan Zeng 
    Abstract: Automatic composition struggles to balance multi-theme planning with strict polyphonic constraints. To address this, this paper proposes an adaptive genetic framework for multi-topic polyphonic music generation. In the scheme, first a bar-aligned, voice-aware representation is prepared with tonal cues and a theme schedule. Then a domain-aware evolutionary core explores the search space through voice-preserving crossover, musically constrained mutation, and lightweight local repair around exposed phrases. Finally, a composite evaluator guides selection while an adaptive controller adjusts operator rates using diversity and stagnation signals. Experiments on chorale, chamber, and modern tonal sets show fewer rule violations, higher consonance with 83% vertical consonance and tonal stability, stronger theme recognisability, and faster convergence without extra runtime. The approach delivers structured, stylistically credible music with strong controllability, clear diagnostics, and room for interactive use.
    Keywords: algorithmic composition; polyphonic music; adaptive genetic algorithm; thematic scheduling.
    DOI: 10.1504/IJICT.2025.10074814
     
  •   Self-identification of legal conflicts in intellectual property contracts based on zero-knowledge proofs
    (Free Full-text Access) CC-BY-NC-ND
    by Jing Xu 
    Abstract: The rapid expansion of the digital economy heightens the need for privacy and trust in intellectual property transactions. Traditional centralised approaches to identifying legal conflicts in intellectual property contracts are prone to data leakage and fail to balance transparency with confidentiality. This paper proposes a self-identification method for legal conflicts in intellectual property contracts using zero-knowledge proofs. By combining a light gradient boosting machine learning model with the zero-knowledge succinct non-interactive argument of knowledge protocol, our approach allows verifiable detection of potential legal conflicts without revealing sensitive information. Experiments on the United States Patent and Trademark Office patent dataset demonstrate that the method achieves high performance in conflict prediction (area under the receiver operating characteristic curve = 0.872) and verification efficiency (<10 ms), providing a novel and practical framework for privacy-aware legal technology.
    Keywords: zero-knowledge proof; ZKP; intellectual property contract; automatic identification of legal conflicts; privacy protection; machine learning.
    DOI: 10.1504/IJICT.2025.10074815
     
  •   English reading text generation based on optimised variational autoencoder
    (Free Full-text Access) CC-BY-NC-ND
    by Liu Yang 
    Abstract: To address the critical global demand from 1.3 billion English as a Foreign Language learners for personalised reading materials, this study develops a dual-channel regularised variational autoencoder. The model systematically overcomes conventional limitations in readability control and semantic coherence by establishing dynamic mappings between educational linguistic features and latent space, designing a novel readability-driven regularisation loss that integrates lexical complexity, syntactic simplification, and discourse cohesion, and implementing curriculum learning for progressive optimisation. Comprehensive evaluations on the Newsela benchmark corpus demonstrate statistically significant improvements: 7.2% in BLEU-4, 32.8% reduction in readability errors, and 20.6% enhancement in teacher-assessed quality. This framework provides an efficient solution for adaptive learning systems, advancing intelligent generation and scalable deployment of educational resources with high practical utility.
    Keywords: optimised variational autoencoder; English reading text generation; readability control; integration of educational features; Newsela dataset.
    DOI: 10.1504/IJICT.2025.10074816
     
  •   Simulation and evaluation of green power consumption policies driven by spatio-temporal graph convolutional networks
    (Free Full-text Access) CC-BY-NC-ND
    by Jie Jiao, Jiyuan Zhang, Wenshi Ren 
    Abstract: Green power consumption has become a key challenge in the energy transition. Existing research struggles to capture complex relationships between the spatio-temporal dynamics of the power system and policy interventions. To this end, this paper first designs a power load forecasting model based on spatio-temporal graph convolutional networks. The model dynamically adjusts the graph structure according to users' electricity consumption patterns and introduces a weighted skip connection mechanism, assigning different weights to connections at different time steps. Then, a mathematical model for optimal combinations of power consumption policies is established. Through deep reinforcement learning algorithms interacting with the environment, it solves for the optimal combination of power consumption policies that minimises economic and carbon emission costs. Experimental results demonstrate that the proposed method achieves a green power consumption rate of 97.16%, outperforming comparison methods, thus helping to promote efficient green power consumption.
    Keywords: green power; consumption policy; spatio-temporal graph convolutional network; deep reinforcement learning algorithms; skip connections.
    DOI: 10.1504/IJICT.2025.10074817
     
  •   Multi-objective optimisation for sustainable landscape planning using genetic algorithms
    (Free Full-text Access) CC-BY-NC-ND
    by Miao Weng 
    Abstract: Rapid urbanisation intensifies pressures on urban landscapes, driving sustainability challenges like ecological degradation and unequal green space access. This study develops a genetic algorithm (GA)-based multi-objective optimisation (MOO) framework for sustainable landscape planning in Nanchang City's first ring road. The non-dominated sorting genetic algorithm II (NSGA-II) is adopted to simultaneously optimise ecological, social and economic objectives. Spatial data, including land use and population density, are integrated within a grid-based model, with constraints such as ecological protection lines. In the park green space case, optimisation achieves 100% service coverage, reduces residents' total travel time by 28.2%, increases 15-minute accessible population from 70.35% to 94.31%, and enhances efficiency. The Pareto optimal solution set illustrates critical trade-offs, while the optimised spatial layout demonstrates significant accessibility gains. This approach provides a robust decision-making tool for sustainable urban development, balancing ecological integrity, social equity, and economic viability in high-density environments.
    Keywords: multi-objective optimisation; MOO; non-dominated sorting genetic algorithm II; NSGA-II; sustainable landscape planning; genetic algorithm; GA.
    DOI: 10.1504/IJICT.2025.10074818
     
  •   DRL-MusicEdu: a deep reinforcement learning-based dynamic music teaching recommender system
    (Free Full-text Access) CC-BY-NC-ND
    by Pengfei Wu, Ruixue Sun, Wu Jun 
    Abstract: Addressing the inability of traditional music teaching systems to dynamically adapt to learners' personalised states, this study proposes deep reinforcement learning-MusicEdu, a dynamic recommender system based on deep reinforcement learning. The framework constructs an intelligent agent that continuously perceives multidimensional learner states (skill proficiency, interests, fatigue) and dynamically optimises teaching-resource sequences via deep reinforcement learning (using proximal policy optimisation). This leverages a structured resource library derived from the Lakh Musical Instrument Digital Interface Dataset, annotated with metadata including difficulty, style, and technical attributes. Experimental validation across 20 weeks with five learner profiles demonstrates that deep reinforcement learning-MusicEdu significantly outperforms baselines, improving skill growth rate by 19.2% (p < 0.01) and user retention by 18.1%. The system enables personalised adaptive learning pathways, establishing an innovative decision-making framework for intelligent music education.
    Keywords: deep reinforcement learning; DRL; music education; personalised recommendations; Lakh MIDI Dataset; adaptive learning.
    DOI: 10.1504/IJICT.2025.10074819
     
  •   Digital media operations prediction based on user sentiment analysis and deep neural networks
    (Free Full-text Access) CC-BY-NC-ND
    by Xinyu Chen, Zhenbin Huang 
    Abstract: Against the backdrop of increasingly fierce competition in the digital media industry, accurately predicting operational effects has become key to enhancing media competitiveness. To address the fusion redundancy that arises when existing research ignores the mutual influence among modalities, this paper first uses BERT and an improved vision transformer model to extract text and image features respectively. Cross-modal shared computing is then used to enhance the complementarity among the features of each modality. Text gating enhancement is introduced, using text information as prior knowledge to guide and improve the representation of image features. Finally, the fused features are input into the classification layer for prediction. Experimental results indicate that the prediction accuracy of the proposed approach is 95.3%, an improvement of at least 2.2%, significantly improving the accuracy of predicting the operational effect of digital media.
    Keywords: digital media; operation effect prediction; sentiment analysis; convolutional neural network; vision transformer.
    DOI: 10.1504/IJICT.2025.10074866
     
  •   Free full-text access Open AccessTowards an enhanced evaluation framework for English reading competence: leveraging multimodal learning analytics
    ( Free Full-text Access ) CC-BY-NC-ND
    by Lina Liu 
    Abstract: As the field of educational assessment grows, traditional ways of testing English reading ability cannot adequately capture the different kinds of information that learners use when they read. Consequently, how to employ multimodal learning behaviour data to make more accurate assessments is currently a popular topic in educational research. This research proposes the MLB-ERAM model for assessing English reading proficiency based on multimodal learning behaviour data. MLB-ERAM combines a large volume of multimodal learning behaviour data with deep learning (DL) technology to build a comprehensive picture of how well students can read. The experimental results show that the MLB-ERAM model works well with multimodal data, overcomes the limitations of standard assessment methods, and offers a useful guide for the future development of educational assessment technology.
    Keywords: multimodal data; learning behaviour; English reading proficiency assessment; DL.
    DOI: 10.1504/IJICT.2025.10074867
     
  •   Free full-text access Open AccessSparse coding-based vocal music feature extraction and real-time transmission
    ( Free Full-text Access ) CC-BY-NC-ND
    by Fangzi Zhang, Jinyi Hu 
    Abstract: Traditional audio compression and transmission methods struggle with bandwidth usage and transmission delay, creating a growing need for real-time audio transmission. This work presents a sparse coding-based approach for vocal audio feature extraction and real-time transmission (SCTRT) to address these difficulties. Using sparse coding, the model efficiently compresses and extracts audio information, lowering data redundancy and improving transmission efficiency. The model comprises three components: audio capture and pre-processing, feature extraction and compression, and real-time transmission and recovery, which together ensure low latency and effective transmission of audio signals. The experimental findings show that the SCTRT model has notable benefits over conventional techniques in compression ratio, audio quality and transmission delay, making it particularly appropriate for real-time audio transmission applications.
    Keywords: sparse coding; vocal feature extraction; audio compression; real-time transmission.
    DOI: 10.1504/IJICT.2025.10074749
     
  •   Free full-text access Open AccessM-DRAMA: a multimodal-driven framework for classical drama short video promotion
    ( Free Full-text Access ) CC-BY-NC-ND
    by Jun Su 
    Abstract: Facing declining youth engagement in traditional theatre (under 30% attendance), this study addresses the paradox of surging opera-related short video consumption by proposing a multimodal-driven framework for targeted classical drama promotion. We introduce M-DRAMA, an integrated model leveraging three technical innovations: a drama knowledge graph (DKG) with hyperbolic embedding to structure cultural metadata; a cross-modal alignment (CMA) module enforcing frame-level synchronisation of lyrics, movements, and music via matrix constraints, reducing semantic deviation to < 0.2 s; and a spatiotemporal interest decoupling network capturing ephemeral youth preferences through gated LSTM-TCN fusion. Validated on the CDS-1K dataset, M-DRAMA achieves an NDCG of 0.341 and elevates the cultural diffusion index (CDI) by 40%. The framework increases youth user penetration to 37.5%, demonstrating efficacy in minimising cultural discount while balancing algorithmic reach and heritage preservation.
    Keywords: interactive digital media; media convergence; dissemination path optimisation; reinforcement learning; information entropy.
    DOI: 10.1504/IJICT.2025.10074750
     
  •   Free full-text access Open AccessReal-time detection of Business English grammar errors driven by transfer learning
    ( Free Full-text Access ) CC-BY-NC-ND
    by Zhenxin Fang, Zhenyu Song 
    Abstract: Improving the grammatical accuracy of Business English writing is crucial, but general grammar checking tools often struggle to adapt to professional contexts. This study proposes a real-time grammar error detection method based on BERT transfer learning, aimed at enhancing performance in business scenarios. Methodologically, the BERT-base pre-trained model is directly utilised to capture general language features. To meet real-time requirements, a lightweight model inference architecture was designed. Experimental results show that the model fine-tuned for the business domain achieves an accuracy rate of 89.2% and an F1 score of 0.842. The improvements are particularly significant in detecting formal expressions and complex sentence structures specific to business texts. This study demonstrates that combining BERT-based transfer learning with fine-tuning using small yet representative domain-specific datasets can effectively enhance the practicality and accuracy of grammar error detection in Business English.
    Keywords: transfer learning; Business English; grammar error detection; BERT.
    DOI: 10.1504/IJICT.2025.10074621
     
  •   Free full-text access Open AccessEnergy efficiency analysis and optimisation strategies for green building design based on gravitational search algorithm
    ( Free Full-text Access ) CC-BY-NC-ND
    by Yaxi Gong, Yingyi Ma, Shanshan Cheng 
    Abstract: As requirements for energy saving and emission reduction continue to increase, energy consumption in buildings has received growing attention. Efficiently optimising the energy consumption of green buildings has become an important research goal in energy consumption analysis and architectural design. To address the energy consumption problem in green buildings, this study designs an optimisation method based on the gravitational search algorithm (GSA). First, sensor data from equipment in the building is collected. Then, a multi-objective optimisation model is constructed whose goal is the lowest energy consumption without reducing comfort. The experimental results show that overall building energy use decreased by 33.8%, indicating that the GSA can effectively reduce the overall energy consumption of building equipment and meets the requirements for energy consumption optimisation in green buildings.
    Keywords: green buildings; gravitational search algorithm; GSA; energy consumption analysis; multi-objective optimisation model.
    DOI: 10.1504/IJICT.2025.10074623
     
  •   Free full-text access Open AccessIdentification of translation bias in Chinese-Korean Confucian texts based on pre-trained language models
    ( Free Full-text Access ) CC-BY-NC-ND
    by Zhengfeng Huang 
    Abstract: Confucian classics hold a foundational position in the history of Sino-Korean cultural exchange. However, machine translation of these texts often leads to semantic distortion and cultural bias. This paper proposes an automated bias identification framework based on the pre-trained cross-lingual model XLM-RoBERTa. Through a multi-task architecture that integrates contrastive learning, semantic role labelling, and context-aware alignment, our method effectively identifies and quantifies semantic, cultural, and grammatical deviations in translated Confucian texts. Experimental results on multiple publicly available corpora demonstrate that the proposed approach achieves an F1-score of 0.83 and accuracy of 85%, outperforming existing baselines in both metrics, especially in identifying culturally specific terms and nuanced expressions (F1 = 0.86 for cultural bias). This research provides valuable methodological insights for evaluating classical text translation quality and supports the accurate dissemination and digital preservation of Confucian cultural heritage.
    Keywords: pre-trained language models; PLMs; Chinese-Korean translation; Confucian texts; bias identification; cross-language processing.
    DOI: 10.1504/IJICT.2025.10074595
     
  •   Free full-text access Open AccessA spatio-temporal transformer predictive model for elderly-oriented tourism via attention mechanism
    ( Free Full-text Access ) CC-BY-NC-ND
    by Jiya Sun 
    Abstract: To address the issue that current models for predicting the potential of retirement destinations overlook the spatio-temporal correlations between influencing factors, this paper first selects the influencing factors of retirement destination potential and designs an improved empirical mode decomposition algorithm to decompose these factors into individual mode components. The characteristics of each mode component are then captured, and the spatio-temporal dependencies are unified through an adaptive embedding mechanism. Subsequently, a temporal self-attention module is designed to capture temporal dependencies, and a spatial self-attention mechanism is implemented to model geographical relationships. Feature fusion is achieved using a multi-head attention mechanism, and the prediction results are output through a feedforward neural network. Experimental results indicate that the prediction accuracy of the proposed model improves by 2.7%-11.8% compared to the baseline model, validating its superiority.
    Keywords: potential prediction; spatiotemporal transformer; empirical mode decomposition; EMD; attention mechanism.
    DOI: 10.1504/IJICT.2025.10074622