ABSTRACT
In this paper, we define the problem of topic-sentiment analysis on Weblogs and propose a novel probabilistic model that captures the mixture of topics and sentiments simultaneously. The proposed Topic-Sentiment Mixture (TSM) model can reveal the latent topical facets in a Weblog collection, the subtopics in the results of an ad hoc query, and their associated sentiments. It can also provide general sentiment models that are applicable to any ad hoc topic. With a specifically designed HMM structure, the sentiment models and topic models estimated with TSM can be used to extract topic life cycles and sentiment dynamics. Empirical experiments on different Weblog datasets show that this approach is effective for modeling topic facets and sentiments and for extracting their dynamics from Weblog collections. The TSM model is quite general: it can be applied to any text collection with a mixture of topics and sentiments, and thus has many potential applications, such as search result summarization, opinion tracking, and user behavior prediction.
Index Terms
- Topic sentiment mixture: modeling facets and opinions in weblogs