ABSTRACT
We present a valency lexicon for Latin verbs extracted from the Index Thomisticus Treebank, a syntactically annotated corpus of Medieval Latin texts by Thomas Aquinas.
Our approach is corpus-based: verbal arguments are induced directly from the annotated data, so the lexicon reflects the empirical evidence of the source texts.
The lexicon contains 432 Latin verbs with 270 valency frames. It is intended for use in NLP applications and can also support annotation.
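The idea of inducing verbal arguments from annotated data can be illustrated with a minimal sketch. It is not the authors' extraction procedure: the toy sentences, the token layout, the function name `induce_frames`, and the argument labels `Sb` and `Obj` are all assumptions chosen for illustration. The sketch collects, for each verb lemma, the sets of argument labels attached to it in dependency trees and counts how often each set occurs.

```python
from collections import Counter, defaultdict

# Hypothetical toy treebank: each token is (id, lemma, pos, head_id, deprel).
# The sentences and the labels "Sb", "Obj", "Pred" are illustrative assumptions.
SENTENCES = [
    [(1, "deus", "N", 2, "Sb"), (2, "creo", "V", 0, "Pred"), (3, "mundus", "N", 2, "Obj")],
    [(1, "homo", "N", 2, "Sb"), (2, "creo", "V", 0, "Pred"), (3, "ars", "N", 2, "Obj")],
    [(1, "anima", "N", 2, "Sb"), (2, "vivo", "V", 0, "Pred")],
]

# Dependency relations treated as arguments (assumption for this sketch).
ARGUMENT_LABELS = {"Sb", "Obj"}


def induce_frames(sentences, argument_labels=ARGUMENT_LABELS):
    """Map each verb lemma to a Counter of observed argument frames."""
    lexicon = defaultdict(Counter)
    for sentence in sentences:
        # Verbs in this sentence, keyed by token id.
        verbs = {tid: lemma for tid, lemma, pos, _, _ in sentence if pos == "V"}
        # Argument labels of the dependents of each verb.
        deps = defaultdict(list)
        for _, _, _, head, deprel in sentence:
            if head in verbs and deprel in argument_labels:
                deps[head].append(deprel)
        # A frame is the sorted tuple of argument labels seen with the verb.
        for tid, lemma in verbs.items():
            lexicon[lemma][tuple(sorted(deps[tid]))] += 1
    return lexicon


lexicon = induce_frames(SENTENCES)
for lemma, frames in sorted(lexicon.items()):
    print(lemma, dict(frames))
```

Because the frames are read off the trees rather than written by hand, the resulting lexicon contains only what the corpus attests, which is the sense in which a corpus-based lexicon reflects the evidence of its source data.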