DOI: 10.1145/1645953.1645968
research-article

Clustering and exploring search results using timeline constructions

Published: 02 November 2009

ABSTRACT

Time is an important dimension of any information space and can be very useful in information retrieval, in particular for clustering and exploring search results. Search result clustering is a feature integrated into some of today's search engines, allowing users to further explore search results. However, little work has been done on exploiting temporal information embedded in documents for the presentation, clustering, and exploration of search results along well-defined timelines.

In this paper, we present an add-on to traditional information retrieval applications in which we exploit various kinds of temporal information associated with documents to present and cluster documents along timelines. Temporal information, expressed in the form of, e.g., date and time tokens or temporal references, appears in documents as part of the textual context or metadata. Using temporal entity extraction techniques, we show how temporal expressions are made explicit and used in the construction of multiple-granularity timelines. We discuss how hit-list based search results can be clustered according to temporal aspects, anchored in the constructed timelines, and how time-based document clusters can be used to explore search results that include temporal snippets. We also outline a prototypical implementation and evaluation that demonstrate the feasibility and functionality of our framework.
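
To make the idea of multiple-granularity timelines concrete, the following is a minimal sketch, not the authors' implementation. It assumes that temporal expressions appear as explicit ISO-style dates (YYYY, YYYY-MM, or YYYY-MM-DD) and uses a regular expression as a stand-in for a real temporal tagger. The names `extract_dates`, `timeline_key`, and `cluster_by_time` are hypothetical and chosen only for illustration.

```python
import re
from collections import defaultdict

# Assumption: a regex stands in for a temporal entity extractor. It only
# recognizes explicit dates written as YYYY, YYYY-MM, or YYYY-MM-DD with a
# year in the range 1900-2099.
DATE_RE = re.compile(r"\b((?:19|20)\d{2})(?:-(\d{2})(?:-(\d{2}))?)?\b")


def extract_dates(text):
    """Return (year, month, day) tuples; missing parts are None."""
    return [
        (int(y), int(m) if m else None, int(d) if d else None)
        for y, m, d in DATE_RE.findall(text)
    ]


def timeline_key(date, granularity):
    """Map a date to its timeline position, or None if the date is too coarse."""
    year, month, day = date
    if granularity == "year":
        return f"{year:04d}"
    if granularity == "month":
        return f"{year:04d}-{month:02d}" if month else None
    if granularity == "day":
        return f"{year:04d}-{month:02d}-{day:02d}" if month and day else None
    raise ValueError(f"unknown granularity: {granularity}")


def cluster_by_time(docs, granularity="year"):
    """Group document ids by the timeline positions their dates fall on.

    docs maps a document id to its text. A document can appear in several
    clusters if it mentions several dates.
    """
    clusters = defaultdict(set)
    for doc_id, text in docs.items():
        for date in extract_dates(text):
            key = timeline_key(date, granularity)
            if key is not None:
                clusters[key].add(doc_id)
    # Zero-padded keys sort lexicographically in chronological order.
    return dict(sorted(clusters.items()))
```

Calling `cluster_by_time(docs, "year")` and then `cluster_by_time(docs, "month")` on the same hit list gives a coarse and a finer view of the same results. In this sketch, a date that is less precise than the requested granularity is simply left out of the finer timeline.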

      • Published in

        CIKM '09: Proceedings of the 18th ACM conference on Information and knowledge management
        November 2009
        2162 pages
        ISBN:9781605585123
        DOI:10.1145/1645953

        Copyright © 2009 ACM

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%
