Universal Derivations (UDer)

Universal Derivations (UDer) is a collection of harmonized lexical networks capturing word-formation, especially derivation, in a cross-linguistically consistent annotation scheme for many languages. The annotation scheme is based on a rooted tree data structure (as used in the DeriNet 2.0 database), in which nodes correspond to lexemes while edges represent derivational relations or compounding.
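The rooted-tree scheme described above can be illustrated with a minimal Python sketch. It is not the DeriNet 2.0 API and does not read the UDer file format. The `Lexeme` class, the helper functions, and the small English example are hypothetical names introduced only for illustration. The sketch assumes that each lexeme has at most one parent, so every derivational family forms a tree with a single root.

```python
class Lexeme:
    """A node of the derivational forest (hypothetical class, not the DeriNet API)."""

    def __init__(self, lemma, pos):
        self.lemma = lemma
        self.pos = pos
        self.parent = None      # at most one parent keeps each family a tree
        self.children = []


def add_relation(parent, child):
    """Add an edge parent -> child, refusing a second parent or a cycle."""
    if child.parent is not None:
        raise ValueError(f"{child.lemma} already has a parent")
    node = parent
    while node is not None:     # walk up to the root to detect a cycle
        if node is child:
            raise ValueError("relation would create a cycle")
        node = node.parent
    child.parent = parent
    parent.children.append(child)


def root_of(lexeme):
    """Return the root of the tree (family) that the lexeme belongs to."""
    while lexeme.parent is not None:
        lexeme = lexeme.parent
    return lexeme


def count_relations(lexemes):
    """Each non-root lexeme contributes exactly one edge."""
    return sum(1 for lex in lexemes if lex.parent is not None)


def families(lexemes):
    """Group lexemes by the root of their tree."""
    groups = {}
    for lex in lexemes:
        groups.setdefault(root_of(lex), []).append(lex)
    return groups


# Illustrative data only; these lexemes are not taken from any UDer resource.
teach = Lexeme("teach", "VERB")
teacher = Lexeme("teacher", "NOUN")
teachable = Lexeme("teachable", "ADJ")
unteachable = Lexeme("unteachable", "ADJ")
school = Lexeme("school", "NOUN")      # a family of its own (a single root)

add_relation(teach, teacher)
add_relation(teach, teachable)
add_relation(teachable, unteachable)

lexemes = [teach, teacher, teachable, unteachable, school]
fams = families(lexemes)
print(len(lexemes), count_relations(lexemes), len(fams))
```

In this sketch a lexeme with no parent, such as `school`, counts as a family by itself, so the number of families equals the number of roots.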

Each resource in the UDer collection can be searched online using DeriSearch, and the data can be processed using the DeriNet 2.0 API.

The current version

The current version of the collection is UDer 0.5. It contains eleven harmonized resources covering eleven different languages (listed in the table below). UDer 0.5 is available in the LINDAT/CLARIN digital library (item: http://hdl.handle.net/11234/1-3041). The license for each of the harmonized resources included in the collection is specified in the appropriate language/resource directory.

Resource             | Language   |   Lexemes | Relations | Families | License
Démonette 1.2        | French     |    21,290 |    13,808 |    7,482 | CC BY-NC-SA 3.0
DeriNet 2.0          | Czech      | 1,027,665 |   808,682 |  218,383 | CC BY-NC-SA 3.0
DeriNet.ES           | Spanish    |   151,173 |    36,935 |  114,238 | CC BY-NC-SA 4.0
DeriNet.FA           | Persian    |    43,357 |    35,745 |    7,612 | CC BY-NC-SA 3.0
DErivBase 2.0        | German     |   280,775 |    44,830 |  235,945 | CC BY-SA 3.0
English WordNet 3.0  | English    |    13,813 |     7,855 |    5,958 | CC BY-NC-SA 3.0
EstWordNet 2.1       | Estonian   |       988 |       507 |      481 | CC BY-SA 3.0
FinnWordNet 2.0      | Finnish    |    20,035 |    13,687 |    6,348 | CC BY 3.0
NomLex-PT 2017       | Portuguese |     7,020 |     4,201 |    2,819 | CC BY 4.0
Polish WFN 0.5       | Polish     |   262,887 |   189,217 |   73,670 | CC BY-NC-SA 3.0
Word Formation Latin | Latin      |    29,708 |    22,641 |    5,320 | CC BY-NC-SA 4.0

Related publications