Smart Reply: Automated Response Suggestion for Email

Authors: Anjuli Kannan, Karol Kurach, Sujith Ravi, Tobias Kaufmann, Andrew Tomkins, Balint Miklos, Greg Corrado, Laszlo Lukacs, Marina Ganea, Peter Young, Vivek Ramavajjala

Affiliation (all authors): Google, Mountain View, CA, USA

KDD '16: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 2016, Pages 955–964. https://doi.org/10.1145/2939672.2939801

Published: 13 August 2016

ABSTRACT

In this paper we propose and investigate a novel end-to-end method for automatically generating short email responses, called Smart Reply. It generates semantically diverse suggestions that can be used as complete email responses with just one tap on mobile. The system is currently used in Inbox by Gmail and is responsible for assisting with 10% of all mobile responses. It is designed to work at very high throughput and process hundreds of millions of messages daily. The system exploits state-of-the-art, large-scale deep learning.

We describe the architecture of the system as well as the challenges that we faced while building it, like response diversity and scalability. We also introduce a new method for semantic clustering of user-generated content that requires only a modest amount of explicitly labeled data.
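
The abstract names response diversity and semantic clustering as challenges but does not spell out the mechanism. As a minimal sketch, assuming each candidate response already has a model score and a semantic cluster label, one simple way to keep suggestions diverse is to take the highest-scoring candidates while allowing at most one per cluster. The function name `pick_diverse`, the scores, and the cluster labels below are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple


def pick_diverse(
    scored: List[Tuple[str, float]],
    cluster_of: Dict[str, str],
    k: int = 3,
) -> List[str]:
    """Return up to k responses, best score first, at most one per cluster.

    Illustrative only: the scores and cluster labels are assumed inputs.
    """
    chosen: List[str] = []
    used_clusters = set()
    # Visit candidates from highest to lowest score.
    for response, _score in sorted(scored, key=lambda pair: pair[1], reverse=True):
        # A response without a label is treated as its own cluster.
        cluster = cluster_of.get(response, response)
        if cluster in used_clusters:
            continue  # a response with the same intent was already chosen
        used_clusters.add(cluster)
        chosen.append(response)
        if len(chosen) == k:
            break
    return chosen


# Hypothetical candidates and labels, made up for this example.
candidates = [
    ("Sounds good!", 0.90),
    ("Sounds great.", 0.85),
    ("Yes, that works.", 0.80),
    ("Thanks!", 0.60),
    ("Sorry, I can't.", 0.40),
]
clusters = {
    "Sounds good!": "agree",
    "Sounds great.": "agree",
    "Yes, that works.": "agree",
    "Thanks!": "thanks",
    "Sorry, I can't.": "decline",
}

print(pick_diverse(candidates, clusters))
# → ['Sounds good!', 'Thanks!', "Sorry, I can't."]
```

Without the cluster check, the three top-scoring candidates would all express agreement, which is the kind of redundancy that semantic diversity is meant to avoid.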

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
  2. Machine learning
    1. Machine learning approaches
      1. Neural networks
2. Information systems
  1. Information systems applications
    1. Data mining
      1. Clustering
  2. World Wide Web
    1. Web applications
      1. Internet communications tools
        1. Email

Published in
KDD '16: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
August 2016, 2176 pages
ISBN: 9781450342322
DOI: 10.1145/2939672
General Chairs: Balaji Krishnapuram (IBM), Mohak Shah (Bosch)
Program Chairs: Alex Smola (Amazon), Charu Aggarwal (IBM), Dou Shen (Baidu), Rajeev Rastogi (Amazon)
Copyright © 2016 Owner/Author
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.
Publisher: Association for Computing Machinery, New York, NY, United States
Author Tags: clustering, deep learning, email, lstm, semantics
Qualifiers: research-article

Acceptance Rates
KDD '16 paper acceptance rate: 66 of 1,115 submissions (6%)
Overall acceptance rate: 1,133 of 8,635 submissions (13%)