ABSTRACT
Establishing an automatic conversation system between humans and computers is regarded as one of the hardest problems in computer science; it draws on interdisciplinary techniques from information retrieval, natural language processing, artificial intelligence, and related fields. The challenge lies in how to respond so as to maintain a relevant and continuous conversation with humans. With the prosperity of Web 2.0, we are now able to collect massive amounts of publicly available conversational data, which offers a great opportunity to launch automatic conversation systems. Owing to the diversity of Web resources, a retrieval-based conversation system can find at least some responses in the massive repository for any user input. Given a human-issued message, i.e., a query, our system provides a reply after adequate training in how to respond. In this paper, we propose a retrieval-based conversation system with a deep learning-to-respond schema, built on a deep neural network framework driven by Web data. The proposed model is general and unified across different open-domain conversation scenarios. We incorporate the impact of multiple data inputs, and formulate various features and factors with optimization into the deep learning framework. In the experiments, we investigate the effectiveness of the proposed deep neural network structures with better combinations of all the different evidence. We demonstrate significant performance improvements over a series of standard and state-of-the-art baselines in terms of P@1, MAP, nDCG, and MRR for conversational purposes.
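The four ranking metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code. It assumes each query comes with a ranked list of candidate replies carrying relevance labels (0 for an unsuitable reply, positive for a suitable one), that every relevant candidate appears somewhere in the ranked list, and that DCG uses the common `rel / log2(rank + 1)` discount. All function names are hypothetical.

```python
import math


def precision_at_1(rels):
    """1.0 if the top-ranked reply is relevant, else 0.0."""
    return 1.0 if rels and rels[0] > 0 else 0.0


def average_precision(rels):
    """Mean of precision values taken at each relevant position.

    Assumption: all relevant replies are present in `rels`, so the
    normaliser is the number of relevant items found in the list.
    """
    hits, total = 0, 0.0
    for rank, rel in enumerate(rels, start=1):
        if rel > 0:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def reciprocal_rank(rels):
    """1 / rank of the first relevant reply, or 0.0 if there is none."""
    for rank, rel in enumerate(rels, start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0


def dcg(rels):
    """Discounted cumulative gain with a log2(rank + 1) discount (assumed variant)."""
    return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(rels, start=1))


def ndcg(rels):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(rels, reverse=True))
    return dcg(rels) / ideal if ideal > 0 else 0.0


def evaluate(ranked_lists):
    """Average each metric over all queries."""
    n = len(ranked_lists)
    return {
        "P@1": sum(precision_at_1(r) for r in ranked_lists) / n,
        "MAP": sum(average_precision(r) for r in ranked_lists) / n,
        "nDCG": sum(ndcg(r) for r in ranked_lists) / n,
        "MRR": sum(reciprocal_rank(r) for r in ranked_lists) / n,
    }


if __name__ == "__main__":
    # Two toy queries: relevance labels of the retrieved replies, in ranked order.
    print(evaluate([[0, 1, 1], [1, 0, 0]]))
```

P@1 and MRR only look at the first suitable reply, which matches a setting where the system returns a single response, while MAP and nDCG reward placing all suitable replies high in the list.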
Index Terms
- Learning to Respond with Deep Neural Networks for Retrieval-Based Human-Computer Conversation System