Title: Indic script identification from handwritten document images
Authors: Pawan Kumar Singh; Ram Sarkar; Mita Nasipuri
Addresses: Department of Computer Science and Engineering, Jadavpur University, 188, Raja S.C. Mullick Road, Kolkata-700032, West Bengal, India ' Department of Computer Science and Engineering, Jadavpur University, 188, Raja S.C. Mullick Road, Kolkata-700032, West Bengal, India ' Department of Computer Science and Engineering, Jadavpur University, 188, Raja S.C. Mullick Road, Kolkata-700032, West Bengal, India
Abstract: Script identification plays an important role in document image processing, especially in multilingual environments. This paper employs two conventional textural methods to recognise the scripts of handwritten documents written in different Indic scripts. The first method extracts the well-known Haralick features from the spatial grey-level dependence matrix (SGLDM), and the second computes the fractal dimension using segmentation-based fractal texture analysis (SFTA). Together, the two methods yield a 104-element feature vector for each page image. The proposed technique is evaluated on a dataset of 360 handwritten document pages written in 12 official Indian scripts, namely Bangla, Devanagari, Gujarati, Gurumukhi, Kannada, Malayalam, Manipuri, Oriya, Tamil, Telugu, Urdu and Roman. Experiments with multiple classifiers show that the multilayer perceptron (MLP) achieves the highest identification accuracy, 96.94%. This encouraging outcome confirms the efficacy of conventional textural features for handwritten Indic script identification.
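As a rough illustration of the first method only, the sketch below builds a normalised, symmetric grey-level co-occurrence matrix (another name for the SGLDM) for a single pixel offset and derives four commonly used Haralick-style statistics from it. The abstract does not state the offsets, the number of grey levels, or the exact feature subset used in the paper, so the function names `cooccurrence_matrix` and `haralick_subset`, the default offset, and the choice of features are all assumptions made for illustration. The SFTA part and the 104-element feature vector are not reproduced here.

```python
import numpy as np


def cooccurrence_matrix(image, dx=1, dy=0, levels=8):
    """Normalised symmetric co-occurrence matrix for one offset (dx, dy).

    `image` must already be quantised to integers in [0, levels).
    The offset and level count are illustrative assumptions.
    """
    img = np.asarray(image, dtype=np.intp)
    rows, cols = img.shape
    # Restrict to reference pixels whose neighbour lies inside the image.
    r0, r1 = max(0, -dy), min(rows, rows - dy)
    c0, c1 = max(0, -dx), min(cols, cols - dx)
    ref = img[r0:r1, c0:c1]
    nbr = img[r0 + dy:r1 + dy, c0 + dx:c1 + dx]
    counts = np.zeros((levels, levels), dtype=np.float64)
    # Unbuffered accumulation so repeated (i, j) pairs are all counted.
    np.add.at(counts, (ref.ravel(), nbr.ravel()), 1.0)
    counts += counts.T  # count each pair in both directions
    return counts / counts.sum()


def haralick_subset(p):
    """Four texture statistics computed from a normalised matrix p."""
    i, j = np.indices(p.shape)
    nonzero = p[p > 0]
    return {
        "contrast": float(np.sum(p * (i - j) ** 2)),
        "energy": float(np.sum(p ** 2)),  # angular second moment
        "entropy": float(-np.sum(nonzero * np.log(nonzero))),
        "homogeneity": float(np.sum(p / (1.0 + (i - j) ** 2))),
    }


if __name__ == "__main__":
    # Tiny 4-level toy image, not data from the paper.
    toy = np.array([[0, 0, 1, 1],
                    [0, 0, 1, 1],
                    [0, 2, 2, 2],
                    [2, 2, 3, 3]])
    features = haralick_subset(cooccurrence_matrix(toy, dx=1, dy=0, levels=4))
    for name, value in features.items():
        print(f"{name}: {value:.4f}")
```

In practice such statistics are usually computed for several offsets and directions and then concatenated or averaged; which of those choices the authors made is not stated in the abstract.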
Keywords: script identification; handwritten Indic documents; textural features; spatial grey-level dependence matrix; segmentation-based fractal texture analysis; MLP classifier; Haralick features; multiple classifiers; statistical significance tests.
DOI: 10.1504/IJISTA.2019.099341
International Journal of Intelligent Systems Technologies and Applications, 2019 Vol.18 No.3, pp.303 - 321
Received: 07 Dec 2016
Accepted: 27 Aug 2017
Published online: 29 Apr 2019