research-article

Probabilistic topic models

Published: 01 April 2012

Abstract

Surveying a suite of algorithms that offer a solution to managing large document archives.

      • Published in

        Communications of the ACM, Volume 55, Issue 4
        April 2012
        110 pages
        ISSN: 0001-0782
        EISSN: 1557-7317
        DOI: 10.1145/2133806

        Copyright © 2012 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Qualifiers

        • research-article
        • Popular
        • Refereed
