PDT & Monolingual Corpora

The Prague Dependency Treebank

The Prague Dependency Treebank (PDT) contains a large body of Czech text with rich, interlinked morphological, syntactic and semantic annotation; in addition, certain properties of sentence information structure and coreference relations are annotated at the semantic level. ... [learn more]

Prague Discourse Treebank

Annotation of discourse relations is a project related to the Prague Dependency Treebank 2.5 (PDT; Bejček et al. 2011), which is a revised, updated and extended version of the Prague Dependency Treebank 2.0 (Hajič et al. 2006). The discourse annotation is a new manually annotated layer of language description, placed above the existing layers of the PDT (morphology, surface syntax and underlying syntax), and it portrays linguistic phenomena from the perspective of discourse structure and coherence. ... [learn more]

Prague Database of Spoken Language

The project focuses on speech reconstruction for Czech and English. It is part of the Prague Dependency Treebank family of annotated corpus resources and tools, to which it adds the spoken language layer(s). It consists of the Prague DaTabase of Spoken English and the Prague DaTabase of Spoken Czech ... [learn more]

Other Monolingual Corpora

Project | Tags
Czech Academic Corpus | Corpora, Data, Monolingual
Czech Legal Text Treebank | Annotations, Corpora, Data, Monolingual
Czech Named Entity Corpus | Corpora, Data, Monolingual
EngVallex - English valency lexicon linked to corpora | Annotations, Corpora, Data, Lexicons, Monolingual, Semantics, Valency
HindEnCorp | Corpora, Data, Machine Translation, Monolingual, Multilingual
Lindat KonText | Annotations, Corpora, Data, Monolingual, Multilingual, Tools
MorfFlex CZ | Corpora, Data, Lexicons, Monolingual, Morphology
PDT-Vallex: valency lexicon linked to Czech corpora | Annotations, Corpora, Data, Lexicons, Monolingual, Semantics, Valency
PDTSC 2.0 | Annotations, Corpora, Data, Linked data, Monolingual, Morphology, Multi-modality, Semantics, Speech Recognition, Speech Retrieval, Valency
Prague Dependency Treebank | Annotations, Corpora, Data, Monolingual
Prague Dependency Treebank 3.0 | Annotations, Coreference, Corpora, Data, Discourse, Information Structure, Monolingual, Morphology, Multiword Expressions, Semantics
Prague Discourse Treebank 1.0 | Annotations, Coreference, Corpora, Data, Discourse, Information Structure, Monolingual, Morphology, Multiword Expressions, Semantics
Prague Discourse Treebank 2.0 | Annotations, Coreference, Corpora, Data, Discourse, Information Structure, Monolingual, Morphology, Multiword Expressions, Semantics
Prague English Dependency Treebank | Annotations, Corpora, Data, Lexicons, Monolingual, Valency
ROMi 1.0 | Corpora, Data, Dialog, Monolingual, Speech Recognition
Semantic Pattern Recognition | Annotations, Corpora, Data, Lexicons, Monolingual, Morphology, Parsers, Publications, Semantics, Taggers, Tools, Valency
Sentiment Analysis in Czech | Annotations, Corpora, Data, Lexicons, Monolingual, Semantics, Tools
TextLink: Skladba diskurzu v evropských jazycích | Annotations, Corpora, Data, Discourse, Lexicons, Linked data, Monolingual
UrMonoCorp | Corpora, Data, Monolingual
VPS-30-En: Verb Pattern Sample - 30 English | Annotations, Corpora, Data, Lexicons, Monolingual, Semantics, Valency
VPS-GradeUp | Annotations, Corpora, Data, Lexicons, Machine Learning, Monolingual, Semantics, Valency
Working with the Penn Discourse Treebank | Annotations, Corpora, Data, Discourse, Linked data, Monolingual, Tools