DOI: 10.1145/2939672.2939801
research-article
Open Access

Smart Reply: Automated Response Suggestion for Email

Published: 13 August 2016

ABSTRACT

In this paper we propose and investigate a novel end-to-end method for automatically generating short email responses, called Smart Reply. It generates semantically diverse suggestions that can be used as complete email responses with just one tap on mobile. The system is currently used in Inbox by Gmail and is responsible for assisting with 10% of all mobile responses. It is designed to work at very high throughput and process hundreds of millions of messages daily. The system exploits state-of-the-art, large-scale deep learning.

We describe the architecture of the system as well as the challenges that we faced while building it, like response diversity and scalability. We also introduce a new method for semantic clustering of user-generated content that requires only a modest amount of explicitly labeled data.
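
The abstract does not spell out how semantic clusters are used to keep suggestions diverse. As a minimal sketch of one possible approach, and not the paper's actual method, the Python below keeps the highest-scoring candidate responses while allowing at most one response per semantic cluster. The function name `pick_diverse`, the scores, and the cluster labels are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def pick_diverse(
    scored: List[Tuple[str, float]],
    cluster_of: Dict[str, str],
    k: int = 3,
) -> List[str]:
    """Return up to k responses, best score first, at most one per cluster.

    `scored` pairs each candidate response with a model score (higher is
    better). `cluster_of` maps a response to a semantic cluster label.
    Both inputs are illustrative assumptions, not data from the paper.
    """
    chosen: List[str] = []
    seen_clusters = set()
    for response, _score in sorted(scored, key=lambda p: p[1], reverse=True):
        # A response with no known cluster is treated as its own cluster.
        cluster = cluster_of.get(response, response)
        if cluster in seen_clusters:
            continue  # skip near-duplicates of an already chosen intent
        seen_clusters.add(cluster)
        chosen.append(response)
        if len(chosen) == k:
            break
    return chosen


# Hypothetical candidates with made-up log-probability scores.
candidates = [
    ("Sounds good!", -0.4),
    ("Sounds great!", -0.5),
    ("Sounds good to me.", -0.6),
    ("Sorry, I can't.", -1.2),
    ("What time?", -1.5),
]
clusters = {
    "Sounds good!": "agree",
    "Sounds great!": "agree",
    "Sounds good to me.": "agree",
    "Sorry, I can't.": "decline",
    "What time?": "ask_time",
}

suggestions = pick_diverse(candidates, clusters)
print(suggestions)
```

Without the cluster constraint, the top three candidates here would all be variants of agreement; with it, the three suggestions cover three different intents.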

Published in

KDD '16: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
August 2016
2176 pages
ISBN: 9781450342322
DOI: 10.1145/2939672

Copyright © 2016 Owner/Author

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery
New York, NY, United States


Acceptance Rates

KDD '16 paper acceptance rate: 66 of 1,115 submissions (6%)
Overall acceptance rate: 1,133 of 8,635 submissions (13%)
