Shared logistic normal distributions for soft parameter tying in unsupervised grammar induction

Published: 31 May 2009

ABSTRACT

We present a family of priors over probabilistic grammar weights, called the shared logistic normal distribution. This family extends the partitioned logistic normal distribution, enabling factored covariance between the probabilities of different derivation events in the probabilistic grammar, providing a new way to encode prior knowledge about an unknown grammar. We describe a variational EM algorithm for learning a probabilistic grammar based on this family of priors. We then experiment with unsupervised dependency grammar induction and show significant improvements using our model for both monolingual learning and bilingual learning with a non-parallel, multilingual corpus.
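As a rough illustration of the idea behind such priors, a logistic normal draw can be sketched as Gaussian-distributed logits pushed through a softmax, and soft tying as letting two multinomials average in a common Gaussian draw. This is a minimal sketch under stated assumptions, not the construction defined in the paper: the function names, the diagonal covariance, and the simple equal-weight averaging of a shared and a private draw are all hypothetical simplifications.

```python
import math
import random


def softmax(logits):
    """Map real-valued logits to a probability vector."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_logistic_normal(mean, std, rng):
    """Draw one multinomial: independent Gaussian logits, then softmax.

    Assumption: diagonal covariance (one std per coordinate).
    """
    logits = [rng.gauss(mu, sd) for mu, sd in zip(mean, std)]
    return softmax(logits)


def sample_tied_pair(mean_shared, mean_a, mean_b, std, rng):
    """Draw two multinomials that are softly tied (hypothetical scheme).

    Each multinomial's logits are the average of a private Gaussian draw
    and a draw shared by both, so the two resulting probability vectors
    covary even though they belong to different multinomials.
    """
    shared = [rng.gauss(mu, std) for mu in mean_shared]
    private_a = [rng.gauss(mu, std) for mu in mean_a]
    private_b = [rng.gauss(mu, std) for mu in mean_b]
    logits_a = [(s + p) / 2.0 for s, p in zip(shared, private_a)]
    logits_b = [(s + p) / 2.0 for s, p in zip(shared, private_b)]
    return softmax(logits_a), softmax(logits_b)


rng = random.Random(0)
theta = sample_logistic_normal([0.0, 0.0, 0.0], [1.0, 1.0, 1.0], rng)
theta_a, theta_b = sample_tied_pair(
    [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], 1.0, rng
)
```

In this sketch, `theta`, `theta_a`, and `theta_b` are each a valid probability vector over three events; the shared draw is what couples `theta_a` and `theta_b`.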


Published in

NAACL '09: Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
May 2009
716 pages
ISBN: 9781932432411

Publisher

Association for Computational Linguistics, United States


Acceptance Rates

Overall Acceptance Rate: 21 of 29 submissions, 72%
