DOI: 10.1145/1242572.1242596
Article

Topic sentiment mixture: modeling facets and opinions in weblogs

Published: 08 May 2007

ABSTRACT

In this paper, we define the problem of topic-sentiment analysis on Weblogs and propose a novel probabilistic model to capture the mixture of topics and sentiments simultaneously. The proposed Topic-Sentiment Mixture (TSM) model can reveal the latent topical facets in a Weblog collection, the subtopics in the results of an ad hoc query, and their associated sentiments. It can also provide general sentiment models that are applicable to any ad hoc topic. With a specifically designed HMM structure, the sentiment models and topic models estimated with TSM can be used to extract topic life cycles and sentiment dynamics. Empirical experiments on different Weblog datasets show that this approach is effective for modeling topic facets and sentiments and for extracting their dynamics from Weblog collections. The TSM model is quite general: it can be applied to any text collection with a mixture of topics and sentiments, and thus has many potential applications, such as search result summarization, opinion tracking, and user behavior prediction.
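
The abstract does not state the model's equations, so the following is only a minimal sketch of what a "mixture of topics and sentiments" could look like, not the authors' formulation. It assumes each word is drawn either from a background word distribution or from one of several topics, and that within a topic the word comes from a neutral topic distribution or from shared positive and negative sentiment distributions. Every name, weight, and probability below is hypothetical and chosen for illustration.

```python
from typing import Dict, List


def word_probability(
    word: str,
    background: Dict[str, float],
    topics: List[Dict[str, float]],
    positive: Dict[str, float],
    negative: Dict[str, float],
    topic_weights: List[float],
    sentiment_weights: List[List[float]],
    background_weight: float,
) -> float:
    """Probability of `word` in one document under the assumed mixture."""
    p_background = background.get(word, 0.0)
    p_topics = 0.0
    for j, topic in enumerate(topics):
        # Per-topic coverage of neutral, positive, and negative wording.
        neutral_w, positive_w, negative_w = sentiment_weights[j]
        p_given_topic = (
            neutral_w * topic.get(word, 0.0)
            + positive_w * positive.get(word, 0.0)
            + negative_w * negative.get(word, 0.0)
        )
        p_topics += topic_weights[j] * p_given_topic
    return background_weight * p_background + (1.0 - background_weight) * p_topics


# Hypothetical toy distributions over a five-word vocabulary.
vocabulary = ["the", "battery", "screen", "great", "awful"]
background = {"the": 0.6, "battery": 0.1, "screen": 0.1, "great": 0.1, "awful": 0.1}
topics = [{"battery": 0.9, "the": 0.1}, {"screen": 0.9, "the": 0.1}]
positive = {"great": 1.0}
negative = {"awful": 1.0}
topic_weights = [0.5, 0.5]                               # topic coverage in the document
sentiment_weights = [[0.6, 0.3, 0.1], [0.6, 0.1, 0.3]]  # neutral, positive, negative
background_weight = 0.2

probabilities = {
    w: word_probability(
        w, background, topics, positive, negative,
        topic_weights, sentiment_weights, background_weight,
    )
    for w in vocabulary
}
```

Because every component distribution and every weight vector sums to one, the resulting word probabilities also sum to one. In a real setting these parameters would be estimated from data rather than set by hand.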


    Published in

      WWW '07: Proceedings of the 16th international conference on World Wide Web
      May 2007
      1382 pages
      ISBN: 9781595936547
      DOI: 10.1145/1242572

      Copyright © 2007 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%
