
Institute of Formal and Applied Linguistics

at Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic



Year 2012
Type oral presentation
Status published
Language English
Author(s) Klyueva, Natalia; Bojar, Ondřej; Garabík, Radovan; Týnovský, Miroslav
Title Czech-Russian corpus via a simple web interface
Czech title Česko-ruský korpus v novém webovém rozhraní
Publisher's city and country Mainz, Germany
Venue Johannes Gutenberg University
Month September
URL http://www.slavistik.uni-mainz.de/606.php
Supported by 2012-2013 GAUK 639012/2012 (Nástroje a data pro strojový překlad mezi blízkými jazyky); 2010-2013 GAP406/10/0875 (Komputační lingvistika: Explicitní popis jazyka a anotovaná data se zřetelem na češtinu)
Czech abstract (translated) We present a new web interface for the Czech-Russian parallel corpus UMC 0.1.
English abstract We describe the Czech-Russian parallel corpus that was initially created as training data for Machine Translation systems. The corpus has been available in a format suitable for computer processing, but theoretical linguists interested in the resource could not benefit much from it in that form, so we decided to put the corpus into a user-friendly environment. The data are now accessible via a simple web interface based on the Manatee backend. Both parts of each corpus can be queried using full CQL syntax, with regular-expression-based search over word forms, lemmas and POS/morphological tags.
Specialization linguistics ("jazykověda")
Confidentiality default – not confidential
Event Workshop on parallel corpora
Presentation type contributed talk at conference/workshop
Open access no
