
Metrics and Algorithms for Routing Questions to User Communities

Author:
Aditya Pal

IBM Research, San Jose, CA


ACM Transactions on Information Systems, Volume 33, Issue 3, Article No. 14, pp. 1–29. https://doi.org/10.1145/2724706

Published: 09 March 2015


Abstract

An online community consists of a group of users who share a common interest, background, or experience and whose collective goal is to contribute toward the welfare of the community members. Several websites allow their users to create and manage niche communities, such as Yahoo! Groups, Facebook Groups, Google+ Circles, and WebMD Forums. These community services also exist within enterprises, such as IBM Connections. Question answering within these communities enables their members to exchange knowledge and information with other community members. However, the onus of finding the right community in which to ask a question lies with the individual user. The overwhelming number of communities creates a need for a good question routing strategy, so that new questions are routed to an appropriately focused community and thus resolved in a reasonable time frame.

In this article, we consider the novel problem of routing a question to the right community and propose a framework for selecting and ranking the relevant communities for a question. We propose several novel features for modeling the three main entities of the system: questions, users, and communities. These features include language attributes, inclination to respond, user familiarity, and difficulty of a question; based on them, we propose similarity metrics between the routed question and the system entities. We introduce a Cutoff-Aggregation (CA) algorithm that aggregates the entity similarity within a community to compute that community's relevance. We introduce two k-nearest-neighbor (knn) algorithms that are natural, computationally efficient instantiations of the CA algorithm, and we evaluate several ranking algorithms over the aggregate similarity scores computed by the two knn algorithms. We propose clustering techniques to speed up our recommendation framework and show how pipelining can improve the model performance. We demonstrate the effectiveness of our framework on two large real-world datasets.
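The abstract does not spell out the details of the CA algorithm, so the following is only a rough illustration of the general cutoff-then-aggregate idea, not the article's method. It assumes that entities are represented as sparse term-weight dictionaries, that similarity is plain cosine similarity, that the cutoff keeps the k entities nearest to the question, and that aggregation is a per-community sum. The names `cosine` and `knn_community_scores` are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts (assumed metric)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = sqrt(sum(w * w for w in a.values()))
    nb = sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def knn_community_scores(question, entities, k):
    """Rank communities by the summed similarity of the k entities nearest to the question.

    question: term-weight dict for the routed question.
    entities: list of (community_id, term_weight_dict) pairs, e.g. past
              questions or user profiles tagged with their community.
    """
    sims = [(cosine(question, vec), cid) for cid, vec in entities]
    # Cutoff step (assumption): keep only the k most similar entities.
    top = sorted(sims, key=lambda p: p[0], reverse=True)[:k]
    # Aggregation step (assumption): sum surviving similarities per community.
    scores = defaultdict(float)
    for sim, cid in top:
        scores[cid] += sim
    # Highest aggregate score first.
    return sorted(scores.items(), key=lambda p: p[1], reverse=True)
```

With a question about `python` and `pandas`, two closely matching entities from one community outrank a single partial match from another, and a community whose entities all fall below the cutoff does not appear in the ranking at all.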



Published in
ACM Transactions on Information Systems Volume 33, Issue 3
March 2015
184 pages
ISSN: 1046-8188
EISSN: 1558-2868
DOI: 10.1145/2737814
Editor: Maarten de Rijke, University of Amsterdam, The Netherlands
Copyright © 2015 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 9 March 2015
- Revised: 1 January 2015
- Accepted: 1 January 2015
- Received: 1 March 2014
Author Tags
Question answering
community question routing
group recommendation
Qualifiers
- research-article
- Research
- Refereed