Title: Multi-domain intelligent system for document image retrieval
Authors: Donato Barbuzzi; Alessandro Massaro; Angelo Galiano; Leonardo Pellicani; Giuseppe Pirlo; Matteo Saggese
Addresses: Dyrecta Lab srl, Via V. Simplicio, 45, 70014, Conversano (BA), Italy ' Dyrecta Lab srl, Via V. Simplicio, 45, 70014, Conversano (BA), Italy ' Dyrecta Lab srl, Via V. Simplicio, 45, 70014, Conversano (BA), Italy ' Dyrecta Lab srl, Via V. Simplicio, 45, 70014, Conversano (BA), Italy ' Bari University, Via E. Orabona, 4 – 70125, Bari, Italy ' Kibematsrl, Via del Pescaccio, 30 – 00166, Rome, Italy
Abstract: This paper presents an experimental analysis of document image retrieval using a multi-domain intelligent system. More specifically, it discusses the combination of three different domains on the same document image: layout, logo and signature. The proposed method analyses every individual decision provided by the multi-domain system so that, in the training phase, a new sample classified with a confidence dissimilar to that of the previously trained samples is used to update the system. Dynamic time warping (DTW), Euclidean distance and cosine similarity are used for the analysis of layout, logo and signature, respectively. Finally, a weighted combination of the individual decisions is considered. The experimental results, obtained on 30 rotated forms belonging to 13 different companies, show that the proposed approach outperforms single-domain retrieval systems according to the ANR performance index, which is used to evaluate the multi-domain system.
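The weighted combination described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the feature representations (a 1-D layout profile and fixed-length logo and signature vectors), the distance-to-similarity mapping `1 / (1 + d)`, the equal default weights, and all function names are assumptions made only for illustration.

```python
import math

def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def euclidean_distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

def cosine_similarity(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def to_similarity(distance):
    """Assumed mapping from a distance in [0, inf) to a similarity in (0, 1]."""
    return 1.0 / (1.0 + distance)

def combined_score(query, candidate, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum of per-domain similarities (weights are hypothetical)."""
    s_layout = to_similarity(dtw_distance(query["layout"], candidate["layout"]))
    s_logo = to_similarity(euclidean_distance(query["logo"], candidate["logo"]))
    s_signature = cosine_similarity(query["signature"], candidate["signature"])
    w_layout, w_logo, w_signature = weights
    return w_layout * s_layout + w_logo * s_logo + w_signature * s_signature

def rank(query, candidates, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Return candidate indices ordered from most to least similar."""
    scores = [combined_score(query, c, weights) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
```

In this sketch each domain contributes a similarity in a comparable range before weighting, so that no single distance measure dominates the sum purely because of its scale; how the paper normalises and weights the individual decisions is not stated in the abstract.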
Keywords: document management system; document image retrieval; multi-expert intelligent system; feedback-based strategy; instance selection.
DOI: 10.1504/IJAIS.2019.108381
International Journal of Adaptive and Innovative Systems, 2019 Vol.2 No.4, pp.282 - 297
Received: 01 Dec 2015
Accepted: 06 May 2016
Published online: 13 Jul 2020