ABSTRACT
Document image retrieval systems are increasingly used by businesses, government bodies, and academic organizations. ELEPAP (Hellenic Protection and Rehabilitation Centre for Disabled Children) is an organization that needs more efficient ways of managing its large volume of archived documents. This paper deals with the preprocessing procedures of well-known OCR systems, with the aim of extracting specific features from ELEPAP's patient cards. We show that the proposed methodology can provide ELEPAP with a practical IT solution for extracting information from its old archives.
Index Terms
- An information extraction system from patient historical documents