Title: Unsupervised generation of Arabic words
Authors: Ahmed Khorsi; Abeer Saad Alsheddi
Addresses: Al-Imam Mohammad Ibn Saud Islamic University, Riyadh, Saudi Arabia; Al-Imam Mohammad Ibn Saud Islamic University, Riyadh, Saudi Arabia
Abstract: Automated word generation may be seen as the reverse process of morphology learning. The aim is to automatically coin valid words in the target language. As with many other challenges in the field of natural language processing (NLP), the generation engine may be built using a supervised or an unsupervised approach. The former requires a clean learning data set of a decent size, whereas the latter needs no more than plain text. Nonetheless, unsupervised approaches are usually criticised for their low accuracy. The present article reports the results of an investigation into context-free generation of classical Arabic words. Unsupervised and relatively simple, the proposed approach readily reached an accuracy of 90%.
Keywords: Arabic language; classical vocabulary; computational linguistics; corpus expansion; linguistic corpora; morphology learning; natural language processing; unsupervised learning; statistical linguistics; word generation.
DOI: 10.1504/IJISTA.2019.100793
International Journal of Intelligent Systems Technologies and Applications, 2019 Vol.18 No.4, pp.340 - 352
Received: 28 May 2017
Accepted: 23 Nov 2017
Published online: 18 Jul 2019