research-article

Clustering and exploring search results using timeline constructions

Authors:
Omar Alonso, University of California, Davis, Davis, CA, USA
Michael Gertz, University of Heidelberg, Heidelberg, Germany
Ricardo Baeza-Yates, Yahoo! Research, Barcelona, Spain
CIKM '09: Proceedings of the 18th ACM conference on Information and knowledge management, November 2009, Pages 97–106
https://doi.org/10.1145/1645953.1645968

Published: 02 November 2009

ABSTRACT

Time is an important dimension of any information space and can be very useful in information retrieval, in particular for clustering and exploring search results. Search result clustering is a feature integrated into some of today's search engines, allowing users to explore search results further. However, little work has been done on exploiting temporal information embedded in documents for the presentation, clustering, and exploration of search results along well-defined timelines.

In this paper, we present an add-on to traditional information retrieval applications in which we exploit various kinds of temporal information associated with documents to present and cluster documents along timelines. Temporal information, expressed in the form of, e.g., date and time tokens or temporal references, appears in documents as part of the textual context or metadata. Using temporal entity extraction techniques, we show how temporal expressions are made explicit and used in the construction of multiple-granularity timelines. We discuss how hit-list based search results can be clustered according to temporal aspects, anchored in the constructed timelines, and how time-based document clusters can be used to explore search results that include temporal snippets. We also outline a prototypical implementation and evaluation that demonstrate the feasibility and functionality of our framework.
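
The idea of clustering a hit list along timelines of different granularity can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that temporal expressions have already been extracted and normalized to ISO-style strings, and the names `GRANULARITY_PREFIX`, `cluster_by_time`, and the sample `hits` are hypothetical.

```python
from collections import defaultdict

# Assumed granularities: each maps to the number of characters of an
# ISO date string ("YYYY-MM-DD") that identifies a point on that timeline.
GRANULARITY_PREFIX = {"year": 4, "month": 7, "day": 10}


def cluster_by_time(hits, granularity):
    """Group hit-list entries by the time points they mention.

    `hits` maps a document id to the normalized temporal expressions found
    in that document (e.g. "2009", "2009-11", "2009-11-02"). An expression
    coarser than the requested granularity is skipped, because it cannot be
    placed on the finer timeline.
    """
    width = GRANULARITY_PREFIX[granularity]
    clusters = defaultdict(set)
    for doc_id, expressions in hits.items():
        for expr in expressions:
            if len(expr) >= width:
                clusters[expr[:width]].add(doc_id)
    # ISO strings sort chronologically, so sorted keys follow the timeline.
    return {key: sorted(docs) for key, docs in sorted(clusters.items())}


# Hypothetical hit list: document ids with their extracted expressions.
hits = {
    "d1": ["2009-11-02", "2008"],
    "d2": ["2009-11", "2009-03-15"],
    "d3": ["2008-06-01"],
}
by_year = cluster_by_time(hits, "year")
by_month = cluster_by_time(hits, "month")
```

In this sketch a document can fall into several clusters, one for each time point it mentions, and switching the granularity regroups the same hit list on a coarser or finer timeline.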

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
2. Information systems
  1. Information retrieval
    1. Document representation
      1. Content analysis and feature selection

Published in
CIKM '09: Proceedings of the 18th ACM conference on Information and knowledge management
November 2009
2162 pages
ISBN: 9781605585123
DOI: 10.1145/1645953
General Chairs: David Cheung (University of Hong Kong, Hong Kong), Il-Yeol Song (Drexel University, USA)
Program Chairs: Wesley Chu (UCLA, USA), Xiaohua Hu (Drexel University, USA), Jimmy Lin (University of Maryland, USA)
Copyright © 2009 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 2 November 2009
Author Tags
exploratory search
hit list clustering
temporal information
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%