research-article

MAPS: midline analysis and propagation of segmentation

Authors:
Deepak Kumar, Indian Institute of Science, Bangalore, India
M. N. Anil Prasad, Indian Institute of Science, Bangalore, India
A. G. Ramakrishnan, Indian Institute of Science, Bangalore, India

ICVGIP '12: Proceedings of the Eighth Indian Conference on Computer Vision, Graphics and Image Processing, December 2012, Article No. 15, Pages 1–7
https://doi.org/10.1145/2425333.2425348

Published: 16 December 2012

ABSTRACT

Scenic word images suffer degradations such as motion blur, uneven illumination, shadows and defocusing, which make segmentation difficult. As a result, the recognition rates reported on the scenic word image datasets of ICDAR have been low. We introduce a novel technique in which the middle row of the image is chosen as a sub-image and segmented first. The labels from this segmented sub-image are then propagated to the other pixels in the image. This approach, which is distinct from existing methods, results in improved segmentation. Bayesian classification and max-flow methods have been used independently for label propagation. The midline-based approach limits the impact of the degradations present in the image. The segmented text image is recognized using the trial version of Omnipage OCR. We have tested our method on the ICDAR 2003 and ICDAR 2011 datasets. Our word recognition rates of 64.5% and 71.6% are better than those of methods in the literature and of the methods that competed in the Robust Reading competition. Our method implicitly assumes that degradation is not present in the middle row.
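The midline idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the middle row is segmented or which features the Bayesian classifier uses, so the sketch assumes a grayscale image, splits the middle row at the midpoint of its minimum and maximum intensity, fits one Gaussian per class to the midline pixels, and labels every pixel by the larger posterior. The function names `split_midline`, `fit_gaussian`, `log_gaussian` and `propagate_labels` are hypothetical, and the max-flow variant is not shown.

```python
import math


def split_midline(row):
    """Two-class split of the middle row at the midpoint of its min and max.

    Assumed stand-in for the midline segmentation step; label 1 marks the
    brighter class, which may be either text or background.
    """
    threshold = (min(row) + max(row)) / 2.0
    return [1 if value > threshold else 0 for value in row]


def fit_gaussian(samples):
    """Return (mean, variance) of the samples, with a small variance floor."""
    mean = sum(samples) / len(samples)
    var = sum((s - mean) ** 2 for s in samples) / len(samples)
    return mean, max(var, 1e-6)


def log_gaussian(value, mean, var):
    """Log density of a 1-D Gaussian at the given value."""
    return -0.5 * math.log(2.0 * math.pi * var) - (value - mean) ** 2 / (2.0 * var)


def propagate_labels(image):
    """Label every pixel using class models learned from the middle row only."""
    midline = image[len(image) // 2]
    mid_labels = split_midline(midline)

    # One (log prior, mean, variance) model per class, estimated on the midline.
    models = []
    for cls in (0, 1):
        samples = [v for v, lab in zip(midline, mid_labels) if lab == cls]
        if not samples:
            raise ValueError("midline contains only one class")
        log_prior = math.log(len(samples) / len(midline))
        models.append((log_prior,) + fit_gaussian(samples))

    def classify(value):
        # Bayes decision: pick the class with the larger log posterior.
        scores = [lp + log_gaussian(value, m, v) for lp, m, v in models]
        return 0 if scores[0] >= scores[1] else 1

    return [[classify(v) for v in row] for row in image]
```

For a 3x4 image whose middle row is `[205, 50, 55, 200]`, the midline split gives `[1, 0, 0, 1]`, and the remaining rows are labeled from the two Gaussians fitted to those four pixels. Because the models come from the middle row alone, the sketch shares the assumption stated in the abstract: if the middle row is degraded, the propagated labels degrade with it.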


Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Image segmentation
        Video segmentation
      2. Computer vision tasks


Published in
ICVGIP '12: Proceedings of the Eighth Indian Conference on Computer Vision, Graphics and Image Processing
December 2012
633 pages
ISBN: 9781450316606
DOI: 10.1145/2425333
Program Chairs: Bill Triggs (CNRS, France), Kavita Bala (Cornell University), Sharat Chandran (IIT Bombay, India)
Copyright © 2012 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Bayesian classification
ICDAR 2003 dataset
ICDAR 2011 dataset
max-flow
midline
min-max method
optical character recognition
propagation
segmentation
text recognition
word recognition

Acceptance Rates
Overall Acceptance Rate: 95 of 286 submissions, 33%