DOI: 10.1145/2425333.2425348
research-article

MAPS: midline analysis and propagation of segmentation

Published: 16 December 2012

ABSTRACT

Scenic word images suffer degradations from motion blur, uneven illumination, shadows and defocusing, all of which make segmentation difficult. As a result, the recognition rates reported on the ICDAR scenic word image datasets have been low. We introduce a novel technique in which we take the middle row of the image as a sub-image and segment it first. The labels from this segmented sub-image are then propagated to the other pixels in the image. This approach, which is distinct from existing methods, results in improved segmentation. Bayesian classification and max-flow methods have been used independently for label propagation. The midline-based approach limits the impact of the degradations affecting the image. The segmented text image is recognized using the trial version of Omnipage OCR. We have tested our method on the ICDAR 2003 and ICDAR 2011 datasets. Our word recognition results of 64.5% and 71.6% are better than those of methods in the literature and of the methods that competed in the Robust Reading competition. Our method makes an implicit assumption that degradation is not present in the middle row.
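
The abstract gives no implementation details, so the following is only a minimal sketch of the midline idea with Bayesian label propagation, under stated assumptions: the image is grayscale, the middle row is split into two classes by a simple iterative mean threshold (a stand-in, not necessarily the authors' midline segmentation), and each class is modeled as a one-dimensional Gaussian over intensity. All function names are hypothetical.

```python
import math


def split_midline(row):
    """Two-class split of the middle row (assumed: iterative mean threshold)."""
    t = sum(row) / len(row)
    for _ in range(50):
        lo = [v for v in row if v <= t]
        hi = [v for v in row if v > t]
        if not lo or not hi:
            break
        new_t = 0.5 * (sum(lo) / len(lo) + sum(hi) / len(hi))
        if abs(new_t - t) < 1e-6:
            break
        t = new_t
    return [1 if v > t else 0 for v in row]


def gaussian_stats(values):
    """Mean and variance of a class; the variance floor is an assumption."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, max(var, 1.0)  # floor avoids division by zero


def log_posterior(v, mean, var, prior):
    """Unnormalized log posterior of a class given intensity v."""
    return (math.log(prior)
            - 0.5 * math.log(2 * math.pi * var)
            - (v - mean) ** 2 / (2 * var))


def propagate_labels(image):
    """Label every pixel using class models learned from the middle row only."""
    mid = image[len(image) // 2]
    mid_labels = split_midline(mid)
    params = []
    for k in (0, 1):
        vals = [v for v, lab in zip(mid, mid_labels) if lab == k]
        if not vals:
            raise ValueError("middle row contains only one class")
        mean, var = gaussian_stats(vals)
        params.append((mean, var, len(vals) / len(mid)))
    # Each pixel gets the class with the larger posterior (Bayes decision rule).
    return [[max((0, 1), key=lambda k: log_posterior(v, *params[k]))
             for v in row]
            for row in image]


# Toy image: dark text (label 0) on a bright background (label 1).
image = [
    [200, 200, 30, 25, 210, 205],
    [198, 22, 20, 202, 199, 201],
    [190, 28, 35, 195, 205, 30],
]
labels = propagate_labels(image)
```

In this sketch the class statistics come solely from the middle row, which mirrors the abstract's implicit assumption that the middle row is free of degradation. The max-flow variant mentioned in the abstract would additionally couple neighboring pixels and is not shown here.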


          Published in
          ICVGIP '12: Proceedings of the Eighth Indian Conference on Computer Vision, Graphics and Image Processing
          December 2012
          633 pages
          ISBN:9781450316606
          DOI:10.1145/2425333

          Copyright © 2012 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States


          Acceptance Rates

          Overall Acceptance Rate: 95 of 286 submissions, 33%
