Probabilistic topic models

Author:
David M. Blei

Princeton University, Princeton, N.J.

Communications of the ACM, Volume 55, Issue 4, April 2012, pp. 77–84. https://doi.org/10.1145/2133806.2133826

Published: 01 April 2012

Abstract

Surveying a suite of algorithms that offer a solution to managing large document archives.

Index Terms

Probabilistic topic models
1. Information systems
  1. Information retrieval
  2. Information systems applications

Recommendations

Probabilistic topic models
KDD '11 Tutorials: Proceedings of the 17th ACM SIGKDD International Conference Tutorials

Probabilistic topic modeling provides a suite of tools for the unsupervised analysis of large collections of documents. Topic modeling algorithms can uncover the underlying themes of a collection and decompose its documents according to those themes. ...

Probabilistic topic models with biased propagation on heterogeneous information networks
KDD '11: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

With the development of Web applications, textual documents are not only getting richer, but also ubiquitously interconnected with users and other objects in various ways, which brings about text-rich heterogeneous information networks. Topic models ...

Topic evolution based on the probabilistic topic model: a review

Accurately representing the quantity and characteristics of users' interest in certain topics is an important problem facing topic evolution researchers, particularly as it applies to modern online environments. Search engines can provide information ...

Published in

Communications of the ACM Volume 55, Issue 4
April 2012
110 pages
ISSN: 0001-0782
EISSN: 1557-7317
DOI: 10.1145/2133806

Copyright © 2012 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.
Publisher
Association for Computing Machinery
New York, NY, United States
Qualifiers
- research-article
- Popular
- Refereed