DOI: 10.5555/1642049.1642055
research-article

The development of the Index Thomisticus Treebank valency lexicon

Published: 30 March 2009

ABSTRACT

We present a valency lexicon for Latin verbs extracted from the Index Thomisticus Treebank, a syntactically annotated corpus of Medieval Latin texts by Thomas Aquinas.

In our corpus-based approach, the lexicon reflects the empirical evidence of the source data. Verbal arguments are induced directly from annotated data.

The lexicon contains 432 Latin verbs with 270 valency frames. It is useful for NLP applications and can support annotation.


Published in

LaTeCH-SHELT&R '09: Proceedings of the EACL 2009 Workshop on Language Technology and Resources for Cultural Heritage, Social Sciences, Humanities, and Education
March 2009, 85 pages

Publisher: Association for Computational Linguistics, United States
