DOI: 10.1145/3209978.3209997
research-article

Dialogue Act Recognition via CRF-Attentive Structured Network

Published: 27 June 2018

ABSTRACT

Dialogue Act Recognition (DAR) is a challenging problem in dialogue interpretation that aims to attach semantic labels to utterances and characterize the speaker's intention. Existing approaches formulate DAR in ways ranging from multi-class classification to structured prediction, but they suffer from handcrafted feature extensions and attentive contextual dependencies. In this paper, we tackle DAR by extending the Conditional Random Field (CRF) with richer structured dependencies without abandoning end-to-end training. We incorporate hierarchical semantic inference with a memory mechanism into utterance modeling at multiple levels. We then apply a structured attention network to the linear-chain CRF to dynamically separate the utterances into cliques. Extensive experiments on two primary benchmark datasets, Switchboard Dialogue Act (SWDA) and Meeting Recorder Dialogue Act (MRDA), show that our method achieves better performance than other state-of-the-art solutions to the problem.
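
To make the linear-chain CRF component more concrete, the following is a minimal sketch of forward-backward inference over a sequence of utterances. It is not the authors' model: the function name `crf_marginals`, the score layout, and the toy numbers are assumptions chosen for illustration. It only shows how per-utterance label marginals can be computed from emission and transition scores, which is the kind of quantity a structured attention layer over a linear-chain CRF works with.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def crf_marginals(emissions, transitions):
    """Per-position label marginals of a linear-chain CRF.

    emissions[t][k]   : score of label k for utterance t (hypothetical layout)
    transitions[i][j] : score of moving from label i to label j
    Returns marginals[t][k] = P(label_t = k | whole dialogue).
    """
    n, k = len(emissions), len(emissions[0])

    # Forward pass: alpha[t][j] is the log-score of all prefixes ending in label j.
    alpha = [list(emissions[0])]
    for t in range(1, n):
        alpha.append([
            emissions[t][j]
            + logsumexp([alpha[t - 1][i] + transitions[i][j] for i in range(k)])
            for j in range(k)
        ])

    # Backward pass: beta[t][i] is the log-score of all suffixes after label i at t.
    beta = [[0.0] * k for _ in range(n)]
    for t in range(n - 2, -1, -1):
        beta[t] = [
            logsumexp([
                transitions[i][j] + emissions[t + 1][j] + beta[t + 1][j]
                for j in range(k)
            ])
            for i in range(k)
        ]

    log_z = logsumexp(alpha[-1])  # log partition function
    return [
        [math.exp(alpha[t][j] + beta[t][j] - log_z) for j in range(k)]
        for t in range(n)
    ]


# Toy dialogue: 3 utterances, 2 hypothetical dialogue-act labels.
toy_emissions = [[1.0, 0.0], [0.5, 2.0], [0.0, 0.0]]
toy_transitions = [[1.0, -1.0], [-1.0, 1.0]]  # favors keeping the same label
toy_marginals = crf_marginals(toy_emissions, toy_transitions)
```

Each row of `toy_marginals` sums to 1. When all transition scores are zero, the marginals reduce to an independent softmax over each utterance's emission scores, which is a quick way to sanity-check the recursion.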

Published in

SIGIR '18: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval
June 2018
1509 pages
ISBN: 9781450356572
DOI: 10.1145/3209978

Copyright © 2018 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

SIGIR '18 paper acceptance rate: 86 of 409 submissions, 21%. Overall acceptance rate: 792 of 3,983 submissions, 20%.
