ABSTRACT
This paper addresses the problem of patent image retrieval. It describes the difficulties of extracting semantic data from images in patents and outlines a framework for integrating that data with semantic information extracted from text. Combining the two sources of knowledge is on the wish list of many patent information users, because current systems either search only the textual data or offer extremely limited image processing functionality. In the patent domain, depictions of the product or method are often vital to understanding the invention, yet they are almost completely unsearchable. They are tools enclosed in a glass case: we can look at them, but we cannot really make use of them. The IMPEx Project (Image Mining for Patent Exploration) cracks open this case with a new focus on processing this particular type of image. This paper presents the motivations, status and aims of the project.