Abstract
Since the mid-2000s, there has been exponential growth of asynchronous online conversations, thanks to the rise of social media. Analyzing and gaining insights from such conversations can be quite challenging for a user, especially when the discussion becomes very long. A promising solution to this problem is topic modeling, since it may help the user to understand quickly what was discussed in a long conversation and to explore the comments of interest. However, the results of topic modeling can be noisy, and they may not match the user’s current information needs. To address this problem, we propose a novel topic modeling system for asynchronous conversations that revises the model on the fly on the basis of users’ feedback. We then integrate this system with interactive visualization techniques to support the user in exploring long conversations, as well as in revising the topic model when the current results are not adequate to fulfill the user’s information needs. Finally, we report on an evaluation with real users that compared the resulting system with both a traditional interface and an interactive visual interface that does not support human-in-the-loop topic modeling. Both the quantitative results and the subjective feedback from the participants illustrate the potential benefits of our interactive topic modeling approach for exploring conversations, relative to its counterparts.
Supplemental Material
Supplemental movie, appendix, image, and software files for "Interactive Topic Modeling for Exploring Asynchronous Online Conversations: Design and Evaluation of ConVisIT"