A Recognition-Based Approach to Segmenting Arabic Handwritten Text

Abstract

Segmenting Arabic handwriting has been a subject of research in Arabic character recognition for more than 25 years. Most reported segmentation techniques share a critical shortcoming: over-segmentation. The aim of segmentation is to produce the letters (segments) of a handwritten word; over-segmentation occurs when a resulting letter (segment) is made up of more than one piece (stroke) instead of one. Our objective is to overcome this problem by using an Artificial Neural Network (ANN) to verify each resulting segment. We propose a set of heuristic-based rules that assemble strokes in order to report the precise segmented letters. Preprocessing phases, including normalization and feature extraction, are a prerequisite for the ANN system used for recognition and verification. In our previous work [1], we achieved a segmentation success rate of 86%, but without recognition. In this work, our experimental results confirm a segmentation success rate of no less than 95%.
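
The abstract does not give the heuristic rules or the network itself, so the following is only an illustrative sketch of the general idea of recognition-based stroke assembly: consecutive strokes are grouped, and a verifier decides which grouping most plausibly forms one letter. Everything named here (`assemble_strokes`, `toy_verify`, `TOY_SCORES`, `max_group`, `threshold`) is hypothetical, strokes are represented by plain strings, and a lookup table stands in for a trained ANN. The greedy grouping is an assumption made for illustration, not the authors' rule set.

```python
from typing import Callable, List, Sequence, Tuple

Stroke = str  # hypothetical stand-in for a stroke image or feature vector
Verifier = Callable[[Sequence[Stroke]], Tuple[str, float]]


def assemble_strokes(
    strokes: List[Stroke],
    verify: Verifier,
    max_group: int = 3,
    threshold: float = 0.5,
) -> List[Tuple[str, List[Stroke]]]:
    """Greedily group consecutive strokes into letters.

    At each position, groups of 1..max_group strokes are scored by the
    verifier and the highest-scoring group is kept. If no group reaches
    `threshold`, a single stroke is emitted with the label "?".
    """
    letters = []
    i = 0
    while i < len(strokes):
        best_size, best_label, best_score = 1, "?", 0.0
        for size in range(1, min(max_group, len(strokes) - i) + 1):
            label, score = verify(strokes[i:i + size])
            if score > best_score:
                best_size, best_label, best_score = size, label, score
        if best_score < threshold:
            best_size, best_label = 1, "?"  # unverified: keep one stroke
        letters.append((best_label, strokes[i:i + best_size]))
        i += best_size
    return letters


# Toy verifier (assumption): a lookup table in place of a trained ANN.
TOY_SCORES = {
    ("a1", "a2"): ("A", 0.9),  # two strokes that together form one letter
    ("a1",): ("A", 0.3),       # the first stroke alone is a weak match
    ("b",): ("B", 0.8),
}


def toy_verify(group: Sequence[Stroke]) -> Tuple[str, float]:
    return TOY_SCORES.get(tuple(group), ("?", 0.0))


result = assemble_strokes(["a1", "a2", "b"], toy_verify)
```

In this toy run, the over-segmented strokes `a1` and `a2` are merged into one letter because the verifier scores the pair higher than `a1` alone, while `b` is kept as a letter on its own.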

Share and Cite:

Elnagar, A. and Bentrcia, R. (2015) A Recognition-Based Approach to Segmenting Arabic Handwritten Text. Journal of Intelligent Learning Systems and Applications, 7, 93-103. doi: 10.4236/jilsa.2015.74009.

Conflicts of Interest

The authors declare no conflicts of interest.

References

[1] Elnagar, A. and Bentrcia, R. (2012) A Multi-Agent Approach to Arabic Handwritten Text Segmentation. Journal of Intelligent Learning Systems and Applications, 4, 207-215.
http://dx.doi.org/10.4236/jilsa.2012.43021
[2] Cheriet, M., Kharma, N., Liu, C.-L. and Suen, C.Y. (2007) Character Recognition Systems: A Guide for Students and Practitioners. John Wiley & Sons, Inc., Hoboken.
[3] Elnagar, A. and Alhajj, R. (2003) Segmentation of Connected Handwritten Numeral Strings. Pattern Recognition, 36, 625-634.
http://dx.doi.org/10.1016/S0031-3203(02)00097-3
[4] Vellasques, A.E., Oliveira, L.S., Britto Jr., A.S., Koerich, A.L. and Sabourin, R. (2008) Filtering Segmentation Cuts for Digit String Recognition. Pattern Recognition, 41, 3044-3052.
http://dx.doi.org/10.1016/j.patcog.2008.03.019
[5] Al-Badr, B. and Mahmoud, S.A. (1995) Survey and Bibliography of Arabic Optical Text Recognition. Signal Processing, 41, 49-77.
http://dx.doi.org/10.1016/0165-1684(94)00090-M
[6] Amin, A. (1998) Offline Arabic Character Recognition: The State of the Art. Pattern Recognition, 31, 517-530.
http://dx.doi.org/10.1016/S0031-3203(97)00084-8
[7] Eldin, A.S. and Nouh, A.S. (1998) Arabic Character Recognition: A Survey. SPIE Proceedings: Optical Pattern Recognition IX, 3386, 331-340.
[8] Khorsheed, M.S. (2002) Off-Line Arabic Character Recognition: A Review. Pattern Analysis and Applications, 5, 31-45.
http://dx.doi.org/10.1007/s100440200004
[9] Parhami, B. and Taraghi, M. (1981) Automatic Recognition of Printed Farsi Texts. Pattern Recognition, 14, 395-403.
http://dx.doi.org/10.1016/0031-3203(81)90084-4
[10] Amin, A. and Masini, G. (1986) Machine Recognition of Multi-Font Printed Arabic Texts. Proceedings of the 8th IEEE International Joint Conference on Pattern Recognition, 392-395.
[11] Gillies, A., Erlandson, E., Trenkle, J. and Schlosser, S. (1999) Arabic Text Recognition System. Proceedings of the Symposium on Document Image Understanding Technology, Annapolis, 14-16 April 1999, 253-260.
[12] Hamami, L. and Berkani, D. (2002) Recognition System for Printed Multi-Font and Multi-Size Arabic Characters. The Arabian Journal for Science and Engineering, 27, 57-72.
[13] Dehghan, M., Faez, K., Ahmadi, M. and Shridhar, M. (2001) Handwritten Farsi (Arabic) Word Recognition: A Holistic Approach Using Discrete HMM. Pattern Recognition, 34, 1057-1065.
http://dx.doi.org/10.1016/S0031-3203(00)00051-0
[14] Al-Qahtani, S.A. and Khorsheed, M.S. (2004) An Omni-Font HTK-Based Arabic Recognition System. Proceedings of the 8th IASTED International Conference on Artificial Intelligence and Soft Computing, Marbella, 1-3 September 2004.
[15] Al-Qahtani, S.A. and Khorsheed, M.S. (2004) A HTK-Based System to Recognize Arabic Script. Proceedings of the 4th IASTED International Conference on Visualization, Imaging, and Image Processing, Marbella, 6-8 September 2004.
[16] Al-Badr, B. and Haralick, R. (1998) A Segmentation-Free Approach to Text Recognition with Application to Arabic Text. International Journal on Document Analysis and Recognition, 1, 147-166.
http://dx.doi.org/10.1007/s100320050014
[17] Al-Badr, B. and Haralick, R. (1995) Segmentation-Free Word Recognition with Application to Arabic. Proceedings of the 3rd International Conference on Document Analysis and Recognition, Montreal, 14-16 August 1995, 355-359.
http://dx.doi.org/10.1109/ICDAR.1995.599012
[18] Khorsheed, M.S. and Clocksin, W.F. (1999) Structural Features of Cursive Arabic Script. British Machine Vision Conference, Nottingham, 13-16 September 1999, 422-431.
http://dx.doi.org/10.5244/c.13.42
[19] Amin, A. (2000) Recognition of Printed Arabic Text Based on Global Features and Decision Tree Learning Techniques. Pattern Recognition, 33, 1309-1323.
http://dx.doi.org/10.1016/S0031-3203(99)00114-4
[20] Pechwitz, M. and Maergner, V. (2003) HMM-Based Approach for Handwritten Arabic Word Recognition Using the IFN/ENIT-Database. Proceedings of the 7th International Conference on Document Analysis and Recognition, ICDAR, Edinburgh, 6 August 2003, 890-894.
http://dx.doi.org/10.1109/icdar.2003.1227788
[21] Gonzalez, R.C. and Wintz, P. (1987) Digital Image Processing. 2nd Edition, Addison-Wesley, Boston.
[22] Teuber, J. (1991) Digital Image Processing. Prentice Hall International Series in Acoustics, Speech and Signal Processing, Prentice Hall, Upper Saddle River.

Copyright © 2023 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.