ABSTRACT
Temporal information is common in textual documents, so identifying, normalizing, and organizing temporal expressions is an important task in information retrieval (IR). Although some tools for temporal tagging exist, little research has addressed the relevance of temporal expressions. Apart from counting their frequency and checking whether they satisfy a temporal search query, existing work usually considers temporal expressions in isolation. There are no methods for calculating the relevance of temporal expressions, either in general or with respect to a query.
In this paper, we present an approach to identify the top relevant temporal expressions in documents using expression-, document-, corpus-, and query-based features. We present two relevance functions: one that calculates relevance scores for temporal expressions in general, and one that calculates them with respect to a search query, which may consist of a textual part, a temporal part, or both. Using two evaluation scenarios, we demonstrate the effectiveness of our approach.
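The abstract does not specify the features or how they are combined, so the following Python sketch is only an illustration of the general idea of scoring temporal expressions with and without a query. Every name, feature, and weight in it is a hypothetical assumption: in-document frequency and position of first mention stand in for document-based features, corpus rarity for a corpus-based feature, and a simple interval match for the temporal part of a query. The textual part of a query is not modeled.

```python
from dataclasses import dataclass
from math import log


@dataclass(frozen=True)
class TemporalExpression:
    """Hypothetical record for one normalized temporal expression in a document."""
    value: str             # normalized day-level ISO value, e.g. "2011-03-11"
    doc_frequency: int     # how often this value occurs in the document
    first_position: float  # relative position of first mention (0.0 = start, 1.0 = end)
    corpus_documents: int  # number of corpus documents mentioning this value (>= 1)


def general_relevance(te, total_mentions, corpus_size, weights=(0.4, 0.3, 0.3)):
    """Query-independent score: weighted sum of three illustrative features.

    The weights are arbitrary placeholders, not values from the paper.
    """
    w_freq, w_early, w_rare = weights
    freq = te.doc_frequency / total_mentions   # share of all mentions in the document
    early = 1.0 - te.first_position            # earlier first mention scores higher
    # Normalized inverse corpus frequency in [0, 1]; assumes corpus_size > 1.
    rarity = log(corpus_size / te.corpus_documents) / log(corpus_size)
    return w_freq * freq + w_early * early + w_rare * rarity


def query_relevance(te, general_score, query_interval=None, query_weight=0.5):
    """Blend the general score with a match against the query's temporal part.

    query_interval is a (start, end) pair of day-level ISO strings; ISO dates of
    equal granularity compare correctly as plain strings.
    """
    if query_interval is None:
        return general_score
    start, end = query_interval
    match = 1.0 if start <= te.value <= end else 0.0
    return (1.0 - query_weight) * general_score + query_weight * match


def rank(expressions, corpus_size, query_interval=None):
    """Return expression values ordered from most to least relevant."""
    total = sum(te.doc_frequency for te in expressions)
    scored = []
    for te in expressions:
        score = general_relevance(te, total, corpus_size)
        score = query_relevance(te, score, query_interval)
        scored.append((score, te.value))
    return [value for _, value in sorted(scored, reverse=True)]


if __name__ == "__main__":
    doc = [
        TemporalExpression("2011-03-11", 5, 0.0, 10),
        TemporalExpression("1995-01-17", 1, 0.8, 10),
    ]
    print(rank(doc, corpus_size=1000))
    print(rank(doc, corpus_size=1000, query_interval=("1995-01-01", "1995-12-31")))
```

In this toy setup, a frequently and early mentioned date ranks first without a query, while a temporal query interval can promote a less prominent date that falls inside it. A linear weighted sum is used here only because it is easy to read; the paper's actual relevance functions may differ.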
Index Terms
- Identification of top relevant temporal expressions in documents