Learning to Respond with Deep Neural Networks for Retrieval-Based Human-Computer Conversation System

Authors:
Rui Yan, Baidu Inc., Beijing, China
Yiping Song, Baidu Inc., Beijing, China
Hua Wu, Baidu Inc., Beijing, China

SIGIR '16: Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval, July 2016, Pages 55–64. https://doi.org/10.1145/2911451.2911542

Published: 07 July 2016

ABSTRACT

Establishing an automatic conversation system between humans and computers is regarded as one of the hardest problems in computer science, involving interdisciplinary techniques from information retrieval, natural language processing, and artificial intelligence. The challenge lies in how to respond so as to maintain a relevant and continuous conversation with humans. With the growth of Web 2.0, we can now collect massive amounts of publicly available conversational data, which creates a great opportunity to launch automatic conversation systems. Owing to the diversity of Web resources, a retrieval-based conversation system can find at least some responses in the massive repository for any user input. Given a human-issued message, i.e., a query, our system provides a reply after adequate training in how to respond. In this paper, we propose a retrieval-based conversation system with a deep learning-to-respond schema, built on a deep neural network framework driven by web data. The proposed model is general and unified across different open-domain conversation scenarios. We incorporate the impact of multiple data inputs, and formulate various features and factors with optimization within the deep learning framework. In the experiments, we investigate the effectiveness of the proposed deep neural network structures when combining all the different sources of evidence. We demonstrate significant performance improvements over a series of standard and state-of-the-art baselines in terms of p@1, MAP, nDCG, and MRR for conversational purposes.
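The abstract names four ranking metrics (p@1, MAP, nDCG, MRR) without defining them. As a minimal sketch, assuming the common textbook definitions and not necessarily the exact variants used in the paper, they can be computed per query from a ranked list of relevance labels and then averaged over queries. The function names, the label format (0 for an unsuitable reply, positive integers for suitable ones), and the exponential gain in nDCG are all assumptions made for illustration.

```python
import math


def precision_at_1(ranked_labels):
    """1.0 if the top-ranked candidate reply is relevant, else 0.0."""
    return 1.0 if ranked_labels and ranked_labels[0] > 0 else 0.0


def average_precision(ranked_labels):
    """Mean of precision@k over the ranks k that hold a relevant reply.

    Assumes every relevant candidate appears in the ranked list, so the
    number of hits equals the total number of relevant candidates.
    """
    hits, total = 0, 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label > 0:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def reciprocal_rank(ranked_labels):
    """1 / rank of the first relevant reply, or 0.0 if there is none."""
    for rank, label in enumerate(ranked_labels, start=1):
        if label > 0:
            return 1.0 / rank
    return 0.0


def ndcg(ranked_labels):
    """DCG of the given ranking divided by the DCG of the ideal ranking.

    Uses the gain 2**label - 1 and a log2(rank + 1) discount; this is one
    common variant, chosen here as an assumption.
    """
    def dcg(labels):
        return sum((2 ** label - 1) / math.log2(rank + 1)
                   for rank, label in enumerate(labels, start=1))

    ideal = dcg(sorted(ranked_labels, reverse=True))
    return dcg(ranked_labels) / ideal if ideal > 0 else 0.0


def evaluate(queries):
    """Average each per-query metric over a non-empty list of rankings."""
    n = len(queries)
    return {
        "p@1": sum(precision_at_1(q) for q in queries) / n,
        "MAP": sum(average_precision(q) for q in queries) / n,
        "nDCG": sum(ndcg(q) for q in queries) / n,
        "MRR": sum(reciprocal_rank(q) for q in queries) / n,
    }


# Two hypothetical queries: each list holds the labels of the candidate
# replies in the order the system ranked them.
results = evaluate([[0, 1, 0, 1], [1, 0, 0, 0]])
print(results)
```

In this toy input the first query has its first suitable reply at rank 2 and the second query at rank 1, so p@1 averages to 0.5 and MRR to 0.75.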

Index Terms

1. Information systems
  1. Information retrieval
    1. Document representation
    2. Retrieval models and ranking
      1. Learning to rank
  2. World Wide Web
    1. Web applications
      1. Internet communications tools
        1. Chat

Published in
SIGIR '16: Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval
July 2016
1296 pages
ISBN: 9781450340694
DOI: 10.1145/2911451
General Chairs: Raffaele Perego (ISTI-CNR, Italy), Fabrizio Sebastiani (Qatar Computing Research Institute, HBKU, Qatar)
Program Chairs: Javed Aslam (Northeastern University, US), Ian Ruthven (University of Strathclyde, UK), Justin Zobel (University of Melbourne, Australia)
Copyright © 2016 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
contextual modeling
conversation system
deep neural networks
learning-to-respond
Qualifiers
- research-article
Acceptance Rates
SIGIR '16 Paper Acceptance Rate: 62 of 341 submissions, 18%
Overall Acceptance Rate: 792 of 3,983 submissions, 20%