research-article

Open Access

Learning topic models -- provably and efficiently

Authors:
Sanjeev Arora, Princeton University, Princeton, NJ
Rong Ge, Duke University, Durham, NC
Yoni Halpern, Google, Cambridge, MA
David Mimno, Cornell University, Ithaca, NY
Ankur Moitra, MIT, Cambridge, MA
David Sontag, MIT, Cambridge, MA
Yichen Wu
Michael Zhu, Stanford University, Stanford, CA

Communications of the ACM, Volume 61, Issue 4, April 2018, pp. 85–93. https://doi.org/10.1145/3186262

Published: 26 March 2018

References

Ahmed, A., Aly, M., Gonzalez, J., Narayanamurthy, S., Smola, A.J. Scalable inference in latent variable models. In WSDM '12: Proceedings of the fifth ACM international conference on Web search and data mining (New York, NY, USA, 2012), ACM, 123--132.
Anandkumar, A., Foster, D., Hsu, D., Kakade, S., Liu, Y. Two SVDs suffice: Spectral decompositions for probabilistic topic modeling and latent dirichlet allocation. In NIPS (2012).
Arora, S., Ge, R., Kannan, R., Moitra, A. Computing a nonnegative matrix factorization---Provably. In STOC (2012), 145--162.
Arora, S., Ge, R., Moitra, A. Learning topic models---Going beyond SVD. In FOCS (2012).
Blei, D. Introduction to probabilistic topic models. Commun. ACM (2012), 77--84.
Blei, D., Lafferty, J. A correlated topic model of science. Ann. Appl. Stat. (2007), 17--35.
Blei, D., Ng, A., Jordan, M. Latent dirichlet allocation. J. Mach. Learn. Res. (2003), 993--1022. Preliminary version in NIPS 2001.
Deerwester, S., Dumais, S., Landauer, T., Furnas, G., Harshman, R. Indexing by latent semantic analysis. JASIS (1990), 391--407.
Dempster, A.P., Laird, N.M., Rubin, D.B. Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Statist. Soc. Ser. B (1977), 1--38.
Ding, W., Rohban, M.H., Ishwar, P., Saligrama, V. Efficient distributed topic modeling with provable guarantees. JMLR (2014), 167--175.
Donoho, D., Stodden, V. When does non-negative matrix factorization give the correct decomposition into parts? In NIPS (2003).
Ge, R., Zou, J. Intersecting faces: Non-negative matrix factorization with new guarantees. In Proceedings of The 32nd International Conference on Machine Learning (2015), 2295--2303.
Griffiths, T.L., Steyvers, M. Finding scientific topics. Proc. Natl. Acad. Sci. 101 (2004), 5228--5235.
Hoffman, M.D., Blei, D.M. Structured stochastic variational inference. In 18th International Conference on Artificial Intelligence and Statistics (2015).
Kalai, A.T., Moitra, A., Valiant, G. Disentangling gaussians. Commun. ACM 55, 2 (Feb. 2012), 113--120.
Kivinen, J., Warmuth, M.K. Exponentiated gradient versus gradient descent for linear predictors. Inform. and Comput. 132 (1995).
Lee, M., Bindel, D., Mimno, D.M. Robust spectral inference for joint stochastic matrix factorization. In NIPS (2015).
Lee, M., Mimno, D. Low-dimensional embeddings for interpretable anchor-based topic inference. In EMNLP (2014).
Li, W., McCallum, A. Pachinko allocation: Dag-structured mixture models of topic correlations. In ICML (2007), 633--640.
McCallum, A. Mallet: A machine learning for language toolkit (2002). http://mallet.cs.umass.edu.
Mimno, D., Wallach, H., Talley, E., Leenders, M., McCallum, A. Optimizing semantic coherence in topic models. In EMNLP (2011).
Nguyen, T., Hu, Y., Boyd-Graber, J. Anchors regularized: Adding robustness and extensibility to scalable topic-modeling algorithms. In ACL (2014).
Pearson, K. Contributions to the mathematical theory of evolution. Philos. Trans. R. Soc. Lond. A. 185 (1894), 71--110.
Roberts, M.E., Stewart, B.M., Tingley, D. Navigating the local modes of big data: The case of topic models. In Data Science for Politics, Policy and Government (Cambridge University Press, New York, 2014).
Sontag, D., Roy, D. Complexity of inference in latent dirichlet allocation. In NIPS (2011), 1008--1016.
Vavasis, S. On the complexity of nonnegative matrix factorization. SIAM J. Optim. (2009), 1364--1377.
Zhou, T., Bilmes, J.A., Guestrin, C. Divide-and-conquer learning by anchoring a conical hull. In NIPS (2014), 1242--1250.

Index Terms

Learning topic models -- provably and efficiently
1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Unsupervised learning
        Topic modeling
2. Theory of computation
  1. Design and analysis of algorithms

Published in
Communications of the ACM Volume 61, Issue 4
April 2018
88 pages
ISSN:0001-0782
EISSN:1557-7317
DOI:10.1145/3200906
Editor:
Andrew A. Chien
Association for Computing Machinery, New York, NY
Copyright © 2018 Owner/Author
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 26 March 2018
Qualifiers
- research-article
- Research
- Refereed