DOI: 10.5555/1608912.1608924
research-article

Pruning the search space of a hand-crafted parsing system with a probabilistic parser

Published: 28 June 2007

ABSTRACT

The demand for deep linguistic analysis of huge volumes of data makes it increasingly important to minimize the time taken to parse such data. In the XLE parsing model, a hand-crafted, unification-based parsing system, most of the time is spent on unification: searching for valid f-structures (dependency attribute-value matrices) within the space of the many valid c-structures (phrase structure trees). We carried out an experiment to determine whether pruning the search space at an earlier stage of the parsing process improves the overall time taken to parse while maintaining the quality of the f-structures produced. We retrained a state-of-the-art probabilistic parser and used it to pre-bracket input to the XLE, constraining the valid c-structure space for each sentence. We evaluated against the PARC 700 Dependency Bank and show that it is possible to decrease the time taken to parse by ~18% while maintaining accuracy.
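
The pre-bracketing step described in the abstract can be illustrated with a minimal Python sketch. It assumes the probabilistic parser emits a Penn-Treebank-style bracketed tree, and it flattens that tree back into a token string in which only selected constituent types remain bracketed. The function names (`parse_ptb`, `prebracket`), the `keep` parameter, and the `[NP ...]` output notation are assumptions made for illustration; they are not the XLE's actual input syntax or the authors' implementation.

```python
from typing import Set, Union

# A tree is either a leaf token (str) or a list: [label, child, child, ...].
Tree = Union[str, list]


def parse_ptb(text: str) -> Tree:
    """Parse a Penn-Treebank-style bracketed string into nested lists."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read() -> Tree:
        nonlocal pos
        if tokens[pos] == "(":
            pos += 1                  # consume "("
            node = [tokens[pos]]      # constituent label
            pos += 1
            while tokens[pos] != ")":
                node.append(read())
            pos += 1                  # consume ")"
            return node
        word = tokens[pos]            # a leaf token
        pos += 1
        return word

    return read()


def prebracket(tree: Tree, keep: Set[str]) -> str:
    """Flatten a tree to a token string, keeping brackets only for labels in `keep`.

    The "[LABEL ...]" notation is hypothetical; a real system would emit
    whatever bracketing format its downstream parser accepts.
    """
    if isinstance(tree, str):
        return tree
    label, children = tree[0], tree[1:]
    inner = " ".join(prebracket(child, keep) for child in children)
    if label in keep:
        return f"[{label} {inner}]"
    return inner


if __name__ == "__main__":
    tree = parse_ptb("(S (NP (DT the) (NN dog)) (VP (VBD chased) (NP (DT a) (NN cat))))")
    print(prebracket(tree, {"NP"}))
```

Each bracket that survives rules out any c-structure whose constituents cross it, which is how pre-bracketing narrows the search space before unification. The choice of which labels to keep is a tuning decision: more brackets prune more aggressively but make the result more dependent on the accuracy of the probabilistic parser.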

References

  1. Srinivas Bangalore and Aravind K. Joshi. 1999. Supertagging: An approach to almost parsing. Computational Linguistics, 25(2):237--265.
  2. Dan Bikel. 2002. Design of a Multi-lingual, Parallel-processing Statistical Parsing Engine. In Proceedings of HLT, pages 24--27, San Diego, CA.
  3. Miriam Butt, Helge Dyvik, Tracy Holloway King, Hiroshi Masuichi, and Christian Rohrer. 2002. The Parallel Grammar Project. In Proceedings of Workshop on Grammar Engineering and Evaluation, pages 1--7, Taiwan.
  4. Aoife Cahill, Martin Forst, Michael Burke, Mairead McCarthy, Ruth O'Donovan, Christian Rohrer, Josef van Genabith, and Andy Way. 2005. Treebank-based acquisition of multilingual unification grammar resources. Journal of Research on Language and Computation, pages 247--279.
  5. Eugene Charniak and Mark Johnson. 2005. Coarse-to-Fine n-Best Parsing and MaxEnt Discriminative Reranking. In Proceedings of ACL, pages 173--180, Ann Arbor, Michigan.
  6. Eugene Charniak. 2000. A maximum entropy inspired parser. In Proceedings of NAACL, pages 132--139, Seattle, WA.
  7. Stephen Clark and James R. Curran. 2004. The Importance of Supertagging for Wide-Coverage CCG Parsing. In Proceedings of COLING, pages 282--288, Geneva, Switzerland, Aug 23--Aug 27. COLING.
  8. Richard Crouch, Ron Kaplan, Tracy Holloway King, and Stefan Riezler. 2002. A comparison of evaluation metrics for a broad coverage parser. In Proceedings of the LREC Workshop: Beyond PARSEVAL, pages 67--74, Las Palmas, Canary Islands, Spain.
  9. Ron Kaplan and Joan Bresnan. 1982. Lexical Functional Grammar, a Formal System for Grammatical Representation. In Joan Bresnan, editor, The Mental Representation of Grammatical Relations, pages 173--281. MIT Press, Cambridge, MA.
  10. Ron Kaplan, Stefan Riezler, Tracy Holloway King, John T. Maxwell, Alexander Vasserman, and Richard Crouch. 2004. Speed and Accuracy in Shallow and Deep Stochastic Parsing. In Proceedings of HLT-NAACL, pages 97--104, Boston, MA.
  11. Tracy Holloway King, Richard Crouch, Stefan Riezler, Mary Dalrymple, and Ron Kaplan. 2003. The PARC 700 dependency bank. In Proceedings of LINC, pages 1--8, Budapest, Hungary.
  12. Alexandra Kinyon. 2000. Hypertags. In Proceedings of COLING, pages 446--452, Saarbrücken.
  13. Takuya Matsuzaki, Yusuke Miyao, and Jun'ichi Tsujii. 2007. Efficient HPSG Parsing with Supertagging and CFG-filtering. In Proceedings of IJCAI, pages 1671--1676, India.
  14. John T. Maxwell and Ronald M. Kaplan. 1993. The interface between phrasal and functional constraints. Computational Linguistics, 19(4):571--590.
  15. Takashi Ninomiya, Takuya Matsuzaki, Yoshimasa Tsuruoka, Yusuke Miyao, and Jun'ichi Tsujii. 2006. Extremely Lexicalized Models for Accurate and Fast HPSG Parsing. In Proceedings of EMNLP, pages 155--163, Australia.
  16. Eric W. Noreen. 1989. Computer Intensive Methods for Testing Hypotheses: An Introduction. Wiley, New York.
  17. Adwait Ratnaparkhi. 1996. A Maximum Entropy Part-Of-Speech Tagger. In Proceedings of EMNLP, pages 133--142, Philadelphia, PA.
  18. Stefan Riezler, Tracy King, Ronald Kaplan, Richard Crouch, John T. Maxwell, and Mark Johnson. 2002. Parsing the Wall Street Journal using a Lexical-Functional Grammar and Discriminative Estimation Techniques. In Proceedings of ACL, pages 271--278, Philadelphia, PA.
Published in

DeepLP '07: Proceedings of the Workshop on Deep Linguistic Processing
June 2007, 171 pages
Publisher: Association for Computational Linguistics, United States

Acceptance Rates

DeepLP '07 paper acceptance rate: 10 of 45 submissions (22%). Overall acceptance rate: 10 of 45 submissions (22%).
