
Topic tracking with time granularity reasoning

Published: 01 December 2006

Abstract

Temporal information is an important attribute of a topic, since a topic usually exists for a limited period. Many researchers have therefore explored the use of temporal information in topic detection and tracking (TDT), using either a story's publication time or the temporal expressions in its text to derive the temporal relatedness between two stories or between a story and a topic. However, past research neglects the fact that people tend to express a time at different granularities as time elapses. Based on a careful investigation of temporal information in news streams, we propose a new strategy that applies time granularity reasoning to the use of temporal information in topic tracking. A set of topic times, which as a whole represents the temporal attribute of a topic, is distinguished from the other times in the given on-topic stories. The temporal relatedness between a story and a topic is then determined by the highest coreference level between any time in the story and any topic time. The coreference level between a test time and a topic time is inferred from the two times themselves, their granularities, and the time distance between the topic time and the publication time of the story in which the test time appears. Furthermore, the similarity value between an incoming story and a topic, that is, the likelihood that the story is on-topic, is adjusted only when the new story is both temporally and semantically related to the target topic. Experiments on two different TDT corpora show that the proposed method makes good use of temporal information in news stories and consistently outperforms the baseline centroid algorithm and other algorithms that consider temporal relatedness.
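The abstract gives no formulas, so the following Python sketch only illustrates the shape of the computation it describes: a coreference level per pair of times, the maximum over all pairs as the temporal relatedness, and a similarity adjustment gated on both temporal and semantic relatedness. Everything concrete here is an assumption made for illustration and is not taken from the paper: the three granularities, the 0/1/2 level scale, the `MIN_AGE_DAYS` thresholds, and the `sem_threshold` and `boost` parameters.

```python
from dataclasses import dataclass
from datetime import date

# Granularity ranks: a larger rank means a coarser unit (assumed levels).
GRANULARITY_RANK = {"day": 0, "month": 1, "year": 2}

# Hypothetical minimum age, in days, before a coarser mention of a topic
# time is treated as plausible (people coarsen references as time elapses).
MIN_AGE_DAYS = {"day": 0, "month": 30, "year": 365}


@dataclass(frozen=True)
class TimeExpr:
    start: date        # first day covered by the expression
    end: date          # last day covered by the expression
    granularity: str   # "day", "month", or "year"


def contains(outer: TimeExpr, inner: TimeExpr) -> bool:
    """True if the interval of `outer` covers the interval of `inner`."""
    return outer.start <= inner.start and inner.end <= outer.end


def coreference_level(test: TimeExpr, topic: TimeExpr, published: date) -> int:
    """Illustrative score: 2 = same time, 1 = plausible coarser mention, 0 = none."""
    if test == topic:
        return 2
    age_days = (published - topic.end).days
    coarser = GRANULARITY_RANK[test.granularity] > GRANULARITY_RANK[topic.granularity]
    if coarser and contains(test, topic) and age_days >= MIN_AGE_DAYS[test.granularity]:
        return 1
    return 0


def temporal_relatedness(story_times, topic_times, published: date) -> int:
    """Highest coreference level over every (story time, topic time) pair."""
    return max(
        (coreference_level(s, t, published) for s in story_times for t in topic_times),
        default=0,
    )


def adjusted_similarity(sim: float, relatedness: int,
                        sem_threshold: float = 0.2, boost: float = 0.1) -> float:
    """Raise the similarity only if the story is both temporally and semantically related."""
    if relatedness > 0 and sim >= sem_threshold:
        return min(1.0, sim + boost * relatedness)
    return sim


# Example: a topic time at day granularity and a later story that mentions the month.
topic_day = TimeExpr(date(2003, 3, 20), date(2003, 3, 20), "day")
march = TimeExpr(date(2003, 3, 1), date(2003, 3, 31), "month")
late_level = temporal_relatedness([march], [topic_day], date(2003, 6, 1))
early_level = temporal_relatedness([march], [topic_day], date(2003, 3, 25))
```

Under these assumed thresholds, the month-level mention counts as coreferent with the day-level topic time only for the story published well after the event. The story published five days later would be expected to name the day itself, so its month-level mention scores zero.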

