ABSTRACT
Scenic word images suffer from degradations such as motion blur, uneven illumination, shadows and defocusing, which make segmentation difficult. As a result, the recognition results reported on the scenic word image datasets of ICDAR have been low. We introduce a novel technique in which we choose the middle row of the image as a sub-image and segment it first. The labels from this segmented sub-image are then propagated to the other pixels in the image. This approach, which is distinct from existing methods, results in improved segmentation. Bayesian classification and Max-flow methods have been used independently for label propagation. This midline-based approach limits the impact of degradations in the image. The segmented text image is recognized using the trial version of Omnipage OCR. We have tested our method on the ICDAR 2003 and ICDAR 2011 datasets. Our word recognition rates of 64.5% and 71.6% are better than those of methods in the literature and of methods that competed in the Robust Reading competitions. Our method implicitly assumes that degradation is not present in the middle row.
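The midline idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the middle row is segmented or which features drive the Bayesian step, so this example assumes a grayscale image, Otsu thresholding of the middle row, and one Gaussian intensity model per class. The function names `otsu_threshold` and `propagate_from_midline` are hypothetical, and the Max-flow variant is not shown.

```python
import numpy as np

def otsu_threshold(values):
    """Otsu threshold for a 1-D array of 8-bit gray values (assumed seed segmenter)."""
    hist = np.bincount(values.astype(np.uint8), minlength=256).astype(float)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                      # probability of values <= t
    mu = np.cumsum(prob * np.arange(256))        # cumulative mean up to t
    denom = omega * (1.0 - omega)
    denom[denom == 0] = np.nan                   # thresholds that leave one class empty
    sigma_b = (mu[-1] * omega - mu) ** 2 / denom # between-class variance
    # Raises ValueError if the row is constant (no valid threshold exists).
    return int(np.nanargmax(sigma_b))

def propagate_from_midline(gray, eps=1.0):
    """Label every pixel using class statistics learned from the middle row only.

    Assumption: each class is modelled by a single Gaussian over gray level,
    and a pixel takes the class with the larger posterior.
    """
    mid = gray[gray.shape[0] // 2].astype(float)
    seed = mid > otsu_threshold(mid)             # labels of the middle row

    pixels = gray.astype(float)
    log_post = []
    for label in (False, True):
        samples = mid[seed == label]
        mean = samples.mean()
        var = samples.var() + eps                # eps avoids zero variance
        prior = samples.size / mid.size
        log_post.append(
            np.log(prior)
            - 0.5 * np.log(2.0 * np.pi * var)
            - (pixels - mean) ** 2 / (2.0 * var)
        )
    # True marks the brighter midline class; which class is text is not decided here.
    return log_post[1] > log_post[0]
```

Because the class models come only from the middle row, degradations elsewhere in the image do not affect them, which is the property the abstract relies on. The sketch inherits the same stated limitation: if the middle row itself is degraded, the seed labels are unreliable.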
Index Terms
- MAPS: midline analysis and propagation of segmentation