Research Article · Open Access
DOI: 10.1145/3292500.3330683

Automatic Dialogue Summary Generation for Customer Service

Published: 25 July 2019

ABSTRACT

Dialogue summarization extracts the useful information from a dialogue, helping people quickly grasp its highlights without reading through long and sometimes convoluted utterances. In customer service, it saves the human effort currently required to write dialogue summaries by hand. A central challenge of dialogue summarization is designing a mechanism that ensures the logic, integrity, and correctness of the generated summaries. In this paper, we introduce auxiliary key point sequences to address this challenge. A key point sequence describes the logic of the summary. During training, the key point sequence acts as an auxiliary label that helps the model learn the summary's logic. During prediction, the model first predicts the key point sequence and then uses it to guide generation of the summary. Alongside the auxiliary key point sequence, we propose a novel Leader-Writer network: the Leader net predicts the key point sequence, and the Writer net generates the summary conditioned on the decoded key points. The Leader net ensures that the summary is logically coherent and complete, while the Writer net focuses on producing fluent sentences. We evaluate our model on customer service scenarios, and the results show that it outperforms competing models not only on BLEU and ROUGE-L scores but also in the logic and integrity of its summaries.
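To make the two-stage Leader-Writer inference flow concrete, here is a minimal, hypothetical PyTorch sketch. Everything in it is an illustrative assumption rather than the authors' implementation: the module and function names (DialogueEncoder, Decoder, leader_writer_infer), the GRU backbone, greedy decoding, and all sizes are placeholders. What the sketch does show is the control flow the abstract describes: the Leader emits one key point per step, and for each key point the Writer decodes one summary sub-sentence starting from the Leader's hidden state.

```python
# Hypothetical sketch of two-stage Leader-Writer inference (not the paper's code).
import torch
import torch.nn as nn

class DialogueEncoder(nn.Module):
    """Encodes dialogue token ids into hidden states (stand-in for the paper's encoder)."""
    def __init__(self, vocab_size, hidden):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)

    def forward(self, tokens):                       # tokens: (batch, seq)
        states, last = self.rnn(self.embed(tokens))
        return states, last                          # (batch, seq, h), (1, batch, h)

class Decoder(nn.Module):
    """Single-step greedy GRU decoder, reused for both the Leader and the Writer."""
    def __init__(self, vocab_size, hidden):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def step(self, token, state):                    # token: (batch, 1)
        output, state = self.rnn(self.embed(token), state)
        return self.out(output[:, -1]), state        # logits: (batch, vocab)

def leader_writer_infer(encoder, leader, writer, dialogue, bos, eos,
                        max_key_points=8, max_words=20):
    """Stage 1: the Leader decodes a key point sequence from the dialogue encoding.
    Stage 2: for each key point, the Writer decodes one summary sub-sentence,
    initialized from the Leader's hidden state so the key point guides generation."""
    _, state = encoder(dialogue)
    summary = []
    token = torch.full((1, 1), bos, dtype=torch.long)
    for _ in range(max_key_points):                  # Leader: one step per key point
        logits, state = leader.step(token, state)
        key_point = logits.argmax(-1, keepdim=True)
        if key_point.item() == eos:
            break
        # Writer: expand this key point into a sub-sentence, seeded by the Leader state.
        w_state = state
        w_token = torch.full((1, 1), bos, dtype=torch.long)
        words = []
        for _ in range(max_words):
            w_logits, w_state = writer.step(w_token, w_state)
            w_token = w_logits.argmax(-1, keepdim=True)
            if w_token.item() == eos:
                break
            words.append(w_token.item())
        summary.append((key_point.item(), words))
        token = key_point                            # feed the key point back to the Leader
    return summary

# Usage with untrained toy modules (illustrative sizes; output is meaningless until trained):
enc = DialogueEncoder(vocab_size=5000, hidden=64)
ldr = Decoder(vocab_size=50, hidden=64)              # small key-point vocabulary
wtr = Decoder(vocab_size=5000, hidden=64)            # word vocabulary
print(leader_writer_infer(enc, ldr, wtr, torch.randint(0, 5000, (1, 30)), bos=1, eos=2))
```

The design point worth noting is that the Writer is re-seeded from the Leader's state at every key point, so each sub-sentence is generated under the logic the Leader has already committed to. The published network is certainly more elaborate than this greedy GRU stand-in, but the staging, Leader first, Writer guided by the decoded key points, is the mechanism the abstract emphasizes.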

Supplemental Material

p1957-liu.mp4 (MP4, 1.1 GB)

Published in

KDD '19: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
July 2019, 3305 pages
ISBN: 9781450362016
DOI: 10.1145/3292500

        Copyright © 2019 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

KDD '19 paper acceptance rate: 110 of 1,200 submissions (9%). Overall KDD acceptance rate: 1,133 of 8,635 submissions (13%).
