The Valency Lexicon of Czech Verbs with Complex Syntactic-Semantic Annotation

The Valency Lexicon of Czech Verbs is a collection of linguistically annotated data and documentation; it provides a formal, machine-readable description of valency frames of Czech verbs and additional syntactico-semantic information useful for the analysis and synthesis of Czech texts as well as other applied tasks in NLP. It covers the common senses of the most frequent Czech verbs (in total, over 6,800 senses of over 4,700 lemmas).

The lexicon provides:

  • valency frames with basic syntactico-semantic characterization of the most frequent verbs in their particular senses (number of complementations, their morphological forms and obligatoriness);
  • glosses, examples;
  • additional characteristics – idioms and multiword expressions (light verb constructions), control, reflexivity, reciprocity, diatheses, lexicalized alternations, and syntactico-semantic class.
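As a rough illustration of the kind of information listed above, a valency frame can be thought of as a list of complementations, each with a label, its admissible morphological forms, and an obligatoriness flag. The sketch below is only one possible in-memory representation: the class names, field names, form encoding, and the toy entry are all hypothetical and do not reproduce the actual VALLEX data format or any real lexicon entry.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Complementation:
    # Hypothetical fields chosen for this sketch; not the VALLEX schema.
    label: str              # name of the complementation, e.g. "ACT"
    forms: List[str]        # admissible morphological forms (illustrative encoding)
    obligatory: bool = True


@dataclass
class LexicalUnit:
    lemma: str
    gloss: str
    frame: List[Complementation] = field(default_factory=list)

    def obligatory_labels(self) -> List[str]:
        """Return the labels of all obligatory complementations, in frame order."""
        return [c.label for c in self.frame if c.obligatory]


# Toy entry written for this sketch; it is not copied from the lexicon.
unit = LexicalUnit(
    lemma="give",
    gloss="to hand something over to somebody",
    frame=[
        Complementation("ACT", ["nominative"]),
        Complementation("ADDR", ["dative"]),
        Complementation("PAT", ["accusative"]),
        Complementation("LOC", ["adverbial"], obligatory=False),
    ],
)

print(len(unit.frame))           # number of complementations
print(unit.obligatory_labels())  # labels of the obligatory ones
```

Keeping obligatoriness as a per-complementation flag, rather than as two separate lists, makes it easy to query both the full frame and its obligatory core from the same structure.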

The lexicon is available in three formats:

 

VALLEX 4.0

VALLEX 4.0 is an enhanced successor of VALLEX 3.0 and 3.5. In addition to the information stored there, VALLEX 4.0 also contains a detailed classification of verbs expressing reciprocity and reflexivity. VALLEX 4.0 covers 324 lexical units for inherently reciprocal verbs; further, it identifies almost 2,750 lexical units allowing for syntactic reciprocalization and almost 2,050 lexical units allowing for syntactic reflexivization.

The annotation of reflexivity and reciprocity has been developed within the project Between Reciprocity and Reflexivity: The Case of Czech Reciprocal Constructions supported by the Grant Agency of the Czech Republic, grant No. 18-03984S.

The theoretical part of the lexicon (including the Grammar Component) has been published as a Technical report in the ÚFAL series.

 

How to cite the VALLEX lexicon

If you make use of VALLEX, please cite (at least one of) the following papers:

  • Lopatková, M., Kettnerová, V., Bejček, E., Vernerová, A., Žabokrtský, Z.: Valenční slovník českých sloves VALLEX. Praha, Karolinum, 698 p., 2016
  • Kettnerová, V., Lopatková, M.: Ke způsobům vyjádření vzájemnosti v češtině. Slovo a slovesnost, Vol. 81, No. 4, pp. 243-268, 2020
  • Kettnerová, V., Lopatková, M.: Reflexives in Czech from a Dependency Perspective. In Proceedings of the Fifth International Conference on Dependency Linguistics (Depling, Syntaxfest 2019), ACL, Paris, France, ISBN 978-1-950737-63-5, pp. 14-25, 2019
  • Kettnerová, V., Lopatková, M., Bejček, E.: The Syntax-Semantics Interface of Czech Verbs in the Valency Lexicon. In Proceedings of the 15th EURALEX International Congress, Department of Linguistics and Scandinavian Studies, University of Oslo, Oslo, Norway, pp. 434-443, 2012
  • Lopatková, M., Kettnerová, V., Vernerová, A., Bejček, E., Žabokrtský, Z.: VALLEX 4.0, LINDAT/CLARIAH-CZ Digital Library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University, Prague, http://hdl.handle.net/11234/1-3498, Dec 2020
  • Lopatková, M., Kettnerová, V., Vernerová, A., Bejček, E., Žabokrtský, Z.: Valenční slovník českých sloves VALLEX. ÚFAL Technical report TR-2021-68. Praha, ÚFAL, ISSN 1214-5521, 184 p., 2021
@book{2016-vallex-book,
       title = {Valen{\v{c}}n{\'{i}} slovn{\'{i}}k {\v{c}}esk{\'{y}}ch sloves {VALLEX}},
       author = {Mark{\'{e}}ta Lopatkov{\'{a}} and V{\'{a}}clava Kettnerov{\'{a}} and Eduard Bej{\v{c}}ek and Anna Vernerov{\'{a}} and Zden{\v{e}}k {\v{Z}}abokrtsk{\'{y}}},
       year = {2016},
       publisher = {Karolinum},
       address = {Praha},
       isbn = {978-80-246-3542-2},
}

@article{2020-sas-recipr,
       journal = {Slovo a slovesnost},
       title = {Ke zp{\r{u}}sob{\r{u}}m vyj{\'{a}}d{\v{r}}en{\'{\i}} vz{\'{a}}jemnosti v {\v{c}}e{\v{s}}tin{\v{e}}},
       author = {V{\'{a}}clava Kettnerov{\'{a}} and Mark{\'{e}}ta Lopatkov{\'{a}}},
       year = {2020},
       volume = {81},
       number = {4},
       pages = {243--268},
       issn = {0037-7031},
}

@inproceedings{2019-depling-reflex,
       booktitle = {Proceedings of the Fifth International Conference on Dependency Linguistics (Depling, Syntaxfest 2019)},
       title = {Reflexives in Czech from a Dependency Perspective},
       editor = {Kim Gerdes and Sylvain Kahane},
       author = {V{\'{a}}clava Kettnerov{\'{a}} and Mark{\'{e}}ta Lopatkov{\'{a}}},
       year = {2019},
       publisher = {Association for Computational Linguistics},
       organization = {Universit{\'{e}} Paris Sorbonne Nouvelle},
       address = {Paris, France},
       pages = {14--25},
       isbn = {978-1-950737-63-5},
}

@inproceedings{2012-EURALEX-sep,
       booktitle = {Proceedings of the 15th {EURALEX} International Congress},
       title = {The Syntax-Semantics Interface of Czech Verbs in the Valency Lexicon},
       editor = {Ruth Fjeld and Julie Torjusen},
       author = {V{\'{a}}clava Kettnerov{\'{a}} and Mark{\'{e}}ta Lopatkov{\'{a}} and Eduard Bej{\v{c}}ek},
       year = {2012},
       publisher = {Department of Linguistics and Scandinavian Studies, University of Oslo},
       address = {Oslo},
       pages = {434--443},
}

@misc{vallex-4-0-data,
       title = {{VALLEX} 4.0},
       author = {Mark{\'{e}}ta Lopatkov{\'{a}} and V{\'{a}}clava Kettnerov{\'{a}} and Anna Vernerov{\'{a}} and Eduard Bej{\v{c}}ek and Zden{\v{e}}k {\v{Z}}abokrtsk{\'{y}}},
       year = {2020},
       publisher = {{LINDAT}/{CLARIAH}-{CZ} Digital Library at the Institute of Formal and Applied Linguistics ({\'{U}}{FAL}), Faculty of Mathematics and Physics, Charles University},
       address = {Prague},
       url = {http://hdl.handle.net/11234/1-3498},
}

@book{2021-vallex-techreport,
       title = {Valen{\v{c}}n{\'{i}} slovn{\'{i}}k {\v{c}}esk{\'{y}}ch sloves {VALLEX}},
       subtitle = {{\'{U}}{FAL} Technical report {TR}-2021-68},
       author = {Mark{\'{e}}ta Lopatkov{\'{a}} and V{\'{a}}clava Kettnerov{\'{a}} and Anna Vernerov{\'{a}}
                 and Eduard Bej{\v{c}}ek and Zden{\v{e}}k {\v{Z}}abokrtsk{\'{y}}},
       year = {2021},
       publisher = {{\'{U}}{FAL}},
       address = {Praha},
       issn = {1214-5521},
}

VALLEX Archive

VALLEX 3.5

VALLEX 3.5 is an enhanced successor of VALLEX 3.0. In addition to the information stored in VALLEX 3.0, VALLEX 3.5 contains an annotation of light verb constructions, covering almost 3,000 collocations of predicative nouns with light verbs (counted as combinations of a lemma of a light verb and a lemma of a predicative noun), which correspond to almost 1,500 light verb constructions (counted as individual combinations of a lexical unit of a light verb and a lexical unit of a predicative noun).

The annotation of light verb constructions has been developed within the project Combining Words: Syntactic Properties of Czech Multiword Expressions with Light Verbs supported by the Grant Agency of the Czech Republic, grant No. GA15-09979S.

VALLEX 3.0

VALLEX 3.0 is an enhanced, cleaned and corrected successor of VALLEX 2.5. In addition to the information stored in VALLEX 2.5, it contains:

  • annotation of grammaticalized alternations (diatheses and reciprocity) and lexicalized alternations,
  • links to real-world sentences annotated with the lexicon entries for more than one hundred Czech verbs, and
  • links to PDT-Vallex, a lexicon connected with the Prague Dependency Treebank.

VALLEX 3.0 has been developed within the project Delving Deeper: Lexicographic Description of Syntactic and Semantic Properties of Czech Verbs supported by the Grant Agency of the Czech Republic, grant No. GA P406/12/0557.

VALLEX 2.7

VALLEX 2.7 is an enhanced, cleaned and corrected successor of VALLEX 2.5. In addition to the information stored in VALLEX 2.5, it contains:

  • annotation of grammaticalized alternations (diatheses and reciprocity) and lexicalized alternations,
  • links to real-world sentences annotated with the lexicon entries for more than one hundred Czech verbs, and
  • links to PDT-Vallex, a lexicon connected with the Prague Dependency Treebank.

VALLEX 2.5

VALLEX 2.5 is a cleaned and corrected successor of VALLEX 2.0. It was released electronically at the end of 2007, and since spring 2008 it has also been available as a book issued by Karolinum Press, the publishing house of Charles University in Prague.

VALLEX 2.0

VALLEX 2.0 contains roughly 2,730 lexeme entries, which together comprise around 6,460 lexical units ("senses"). Unlike traditional dictionaries, and unlike VALLEX 1.0, VALLEX 2.0 treats a pair of perfective and imperfective aspectual counterparts as a single lexeme (if perfective and imperfective verbs were counted separately, the size of VALLEX 2.0 would effectively grow to 4,250 verb entries).
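The difference between the two ways of counting can be illustrated with a small sketch. It assumes a hypothetical list of lexemes in which each entry holds one or two aspectual variants; the dictionary layout, the function names, and the three-item toy list are invented for illustration and are not part of VALLEX.

```python
# Toy data: each lexeme maps an aspect tag to a verb lemma (hypothetical layout).
lexemes = [
    {"impf": "dávat", "pf": "dát"},
    {"impf": "kupovat", "pf": "koupit"},
    {"impf": "spát"},  # listed here without a perfective counterpart
]


def count_lexemes(entries):
    """Count aspectual pairs as single lexemes (the VALLEX 2.0 way of counting)."""
    return len(entries)


def count_verbs_separately(entries):
    """Count every perfective and imperfective verb as its own entry."""
    return sum(len(entry) for entry in entries)


print(count_lexemes(lexemes))           # 3
print(count_verbs_separately(lexemes))  # 5
```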

VALLEX 1.0

VALLEX 1.0 contains roughly 1,400 verbs (counting only perfective and imperfective verbs, not their iterative counterparts). The 1,000 most frequent Czech verbs were selected according to their number of occurrences in a part of the Czech National Corpus (only 'být' (to be) was excluded); their perfective or imperfective aspectual counterparts were then added if they were missing.

License

VALLEX can be used under the Creative Commons license BY-NC-SA 4.0.

VALLEX can be used free of charge by any academic, educational or research institution, or by any other organization or individual using VALLEX for non-commercial research and/or educational purposes. Legal use of VALLEX is conditional on completing the registration form.