research-article

Interactive Topic Modeling for Exploring Asynchronous Online Conversations: Design and Evaluation of ConVisIT

Published: 22 February 2016
Abstract

Since the mid-2000s, there has been exponential growth of asynchronous online conversations, thanks to the rise of social media. Analyzing and gaining insights from such conversations can be quite challenging for a user, especially when the discussion becomes very long. A promising solution to this problem is topic modeling, since it may help the user to understand quickly what was discussed in a long conversation and to explore the comments of interest. However, the results of topic modeling can be noisy, and they may not match the user’s current information needs. To address this problem, we propose a novel topic modeling system for asynchronous conversations that revises the model on the fly on the basis of users’ feedback. We then integrate this system with interactive visualization techniques to support the user in exploring long conversations, as well as in revising the topic model when the current results are not adequate to fulfill the user’s information needs. Finally, we report on an evaluation with real users that compared the resulting system with both a traditional interface and an interactive visual interface that does not support human-in-the-loop topic modeling. Both the quantitative results and the subjective feedback from the participants illustrate the potential benefits of our interactive topic modeling approach for exploring conversations, relative to its counterparts.

            • Published in

              ACM Transactions on Interactive Intelligent Systems, Volume 6, Issue 1
              Special Issue on New Directions in Eye Gaze for Interactive Intelligent Systems (Part 2 of 2), Regular Articles and Special Issue on Highlights of IUI 2015 (Part 1 of 2)
              May 2016
              219 pages
              ISSN: 2160-6455
              EISSN: 2160-6463
              DOI: 10.1145/2896319

              Copyright © 2016 ACM

              Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

              Publisher

              Association for Computing Machinery

              New York, NY, United States

              Publication History

              • Published: 22 February 2016
              • Revised: 1 November 2015
              • Accepted: 1 November 2015
              • Received: 1 July 2015

              Qualifiers

              • research-article
              • Research
              • Refereed
