ABSTRACT
We present a family of priors over probabilistic grammar weights, called the shared logistic normal distribution. This family extends the partitioned logistic normal distribution, enabling factored covariance between the probabilities of different derivation events in the probabilistic grammar, providing a new way to encode prior knowledge about an unknown grammar. We describe a variational EM algorithm for learning a probabilistic grammar based on this family of priors. We then experiment with unsupervised dependency grammar induction and show significant improvements using our model for both monolingual learning and bilingual learning with a non-parallel, multilingual corpus.
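The idea of tying multinomials through shared Gaussian components can be sketched in a few lines of Python. The abstract does not spell out the construction, so everything here is an illustrative assumption rather than the authors' definition: the function names are hypothetical, each Gaussian "expert" uses a diagonal covariance for brevity, and two multinomials are tied by averaging one shared expert with a private one before a softmax.

```python
import math
import random


def softmax(values):
    """Map a real vector to a point on the probability simplex."""
    peak = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def draw_expert(mean, std, rng):
    """Draw one Gaussian vector (assumed diagonal covariance)."""
    return [rng.gauss(m, std) for m in mean]


def draw_tied_multinomials(rng, dim=3, std=1.0):
    """Draw two multinomials whose logits share one Gaussian expert.

    Assumed simplification: each multinomial averages the shared expert
    with a private expert, then applies softmax. The shared term is what
    makes the two resulting probability vectors correlated.
    """
    zero = [0.0] * dim
    shared = draw_expert(zero, std, rng)
    private_a = draw_expert(zero, std, rng)
    private_b = draw_expert(zero, std, rng)
    logits_a = [(s + p) / 2.0 for s, p in zip(shared, private_a)]
    logits_b = [(s + p) / 2.0 for s, p in zip(shared, private_b)]
    return softmax(logits_a), softmax(logits_b)


rng = random.Random(0)
theta_a, theta_b = draw_tied_multinomials(rng)
```

Under these assumptions, `theta_a` and `theta_b` are each valid probability vectors, and repeated draws would show them moving together because of the shared expert. A full-covariance Gaussian and a richer partition structure would be needed to match the family described in the abstract.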
Index Terms
- Shared logistic normal distributions for soft parameter tying in unsupervised grammar induction