Abstract
Temporal information is an important attribute of a topic, and a topic usually exists for a limited period. Many researchers have therefore explored the use of temporal information in topic detection and tracking (TDT). They use either a story's publication time or the temporal expressions in its text to derive the temporal relatedness between two stories, or between a story and a topic. However, past research neglects the fact that people tend to express a time at different granularities as time elapses. Based on a careful investigation of temporal information in news streams, we propose a new strategy with time granularity reasoning for utilizing temporal information in topic tracking. A set of topic times, which as a whole represents the temporal attribute of a topic, is distinguished from the other times in the given on-topic stories. The temporal relatedness between a story and a topic is then determined by the highest coreference level between any time in the story and any topic time. The coreference level between a test time and a topic time is inferred from the two times themselves, their granularities, and the time distance between the topic time and the publication time of the story in which the test time appears. Furthermore, the similarity value between an incoming story and a topic, that is, the likelihood that the story is on-topic, is adjusted only when the new story is both temporally and semantically related to the target topic. Experiments on two different TDT corpora show that the proposed method makes good use of temporal information in news stories and consistently outperforms the baseline centroid algorithm and other algorithms that consider temporal relatedness.
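The scoring scheme described in the abstract can be illustrated with a minimal Python sketch. It only mirrors the two steps the abstract states: taking the highest coreference level over all pairs of story time and topic time, and adjusting the similarity only when the story is both temporally and semantically related. Everything else is an assumption made for illustration and is not taken from the paper: the `TimeRef` type, the granularity labels, the `toy_coref` rule, the `sim_threshold` and `boost` parameters, and the additive form of the adjustment are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Iterable

@dataclass(frozen=True)
class TimeRef:
    """A normalized temporal expression (hypothetical representation)."""
    start: date        # first day covered by the expression
    granularity: str   # assumed labels: "day", "month", "year"

# (test time, topic time, days between topic time and story publication) -> level
CorefFn = Callable[[TimeRef, TimeRef, int], int]

def temporal_relatedness(story_times: Iterable[TimeRef],
                         topic_times: Iterable[TimeRef],
                         pub_date: date,
                         coref_level: CorefFn) -> int:
    """Highest coreference level over every (story time, topic time) pair."""
    story_times = list(story_times)
    best = 0
    for topic_time in topic_times:
        # Distance between the topic time and the story's publication time.
        distance = abs((pub_date - topic_time.start).days)
        for test_time in story_times:
            best = max(best, coref_level(test_time, topic_time, distance))
    return best

def adjusted_similarity(semantic_sim: float, relatedness: int,
                        sim_threshold: float = 0.2, boost: float = 0.1) -> float:
    """Adjust the score only if the story is temporally AND semantically related.

    The threshold, the boost, and the additive update are assumptions.
    """
    if relatedness > 0 and semantic_sim >= sim_threshold:
        return min(1.0, semantic_sim + boost * relatedness)
    return semantic_sim

def toy_coref(test: TimeRef, topic: TimeRef, distance_days: int) -> int:
    """Toy stand-in for granularity reasoning; NOT the paper's inference rules."""
    if test.start == topic.start and test.granularity == topic.granularity:
        return 2  # same time at the same granularity
    same_month = (test.start.year, test.start.month) == (topic.start.year, topic.start.month)
    # A coarser mention is accepted only once enough time has passed.
    if test.granularity == "month" and same_month and distance_days > 30:
        return 1
    return 0

topic_times = [TimeRef(date(2003, 3, 20), "day")]
story_times = [TimeRef(date(2003, 3, 1), "month")]
level = temporal_relatedness(story_times, topic_times, date(2003, 6, 1), toy_coref)
score = adjusted_similarity(0.5, level)
```

Passing the coreference rule in as a function keeps the sketch independent of any particular set of granularity inference rules, which the abstract does not spell out.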
Index Terms
Topic tracking with time granularity reasoning