
Identification of top relevant temporal expressions in documents

Authors:
Jannik Strötgen, Heidelberg University, Germany
Omar Alonso, Microsoft Corp., Mountain View, CA
Michael Gertz, Heidelberg University, Germany

TempWeb '12: Proceedings of the 2nd Temporal Web Analytics Workshop, April 2012, Pages 33–40. https://doi.org/10.1145/2169095.2169102

Published: 17 April 2012

ABSTRACT

Temporal information is very common in textual documents, and thus, identifying, normalizing, and organizing temporal expressions is an important task in IR. Although there are some tools for temporal tagging, there is a lack of research focusing on the relevance of temporal expressions. Apart from counting their frequency and verifying whether they satisfy a temporal search query, temporal expressions are usually considered only in isolation. There are no methods to calculate the relevance of temporal expressions, either in general or with respect to a query.

In this paper, we present an approach to identify top relevant temporal expressions in documents using expression-, document-, corpus-, and query-based features. We present two relevance functions: one to calculate relevance scores for temporal expressions in general, and one with respect to a search query, which consists of a textual part, a temporal part, or both. Using two evaluation scenarios, we demonstrate the effectiveness of our approach.
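
To make the idea of two relevance functions concrete, here is a minimal sketch in Python. The abstract does not specify the actual features, weights, or formulas, so everything below is an assumption for illustration only: the class `TemporalExpression`, its three feature fields, the weighted-mean combination, and the `query_weight` interpolation are hypothetical and are not the authors' method.

```python
from dataclasses import dataclass


@dataclass
class TemporalExpression:
    """A normalized temporal expression with HYPOTHETICAL feature scores in [0, 1]."""
    value: str               # normalized value, e.g. "2012-04-17"
    expression_score: float  # assumed expression-based feature (e.g. granularity)
    document_score: float    # assumed document-based feature (e.g. frequency in the document)
    corpus_score: float      # assumed corpus-based feature (e.g. rarity in the collection)


def general_relevance(te, weights=(1.0, 1.0, 1.0)):
    """Query-independent score: weighted mean of the three feature groups (an assumption)."""
    feats = (te.expression_score, te.document_score, te.corpus_score)
    return sum(w * f for w, f in zip(weights, feats)) / sum(weights)


def query_relevance(te, query_match, query_weight=0.5):
    """Query-dependent score: interpolate the general score with a query-match value in [0, 1]."""
    return (1 - query_weight) * general_relevance(te) + query_weight * query_match


def top_k(expressions, k, score=general_relevance):
    """Return the k expressions with the highest score, best first."""
    return sorted(expressions, key=score, reverse=True)[:k]
```

Calling `top_k(expressions, 3)` would rank by the query-independent score, while passing `score=lambda te: query_relevance(te, match(te))` with some hypothetical `match` function would rank with respect to a query.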

Index Terms

Identification of top relevant temporal expressions in documents
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Language resources
2. Information systems
  1. Information retrieval
    1. Document representation
      1. Content analysis and feature selection

Published in
TempWeb '12: Proceedings of the 2nd Temporal Web Analytics Workshop
April 2012
55 pages
ISBN: 9781450311885
DOI: 10.1145/2169095
Conference Chairs:
Ricardo Baeza-Yates, Yahoo! Research, Spain
Julien Masanès, Internet Memory Foundation, France and Netherlands
Marc Spaniol, Max-Planck-Institut für Informatik, Germany
Copyright © 2012 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
ranking
relevance scoring
snippets
temporal expressions
temporal information
Qualifiers
- research-article
Article Metrics
- Total Citations: 21
- Total Downloads: 415
- Downloads (Last 12 months): 8
- Downloads (Last 6 weeks): 1