Milan Straka

office: Room 420
office hours: Monday 9-17, Tuesday 10-16
email: straka@ufal.mff.cuni.cz
phone: (+420) 95155 4361
address: Malostranské náměstí 25, 118 00 Praha 1, Czech Republic

Main Research Interests

  • Machine Learning
    • Artificial Neural Networks
    • Deep Learning
    • Structured Prediction
    • Bayesian Nonparametric Modelling and Unsupervised Learning
  • NLP Tools
    • POS Tagging
    • Dependency Parsing
    • Named Entity Recognition and Linking

Projects

Curriculum Vitae

Teaching

Selected Bibliography

ORCID

Papers

  1. Petr Bělohlávek, Ondřej Plátek, Zdeněk Žabokrtský, Milan Straka (2018): Using Adversarial Examples in Natural Language Processing. In: Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC 2018), pp. 3693-3700, European Language Resources Association, Paris, France, ISBN 979-10-95546-00-9 (url, biblio, attachment.pdf, bibtex)
  2. Daniel Kondratyuk, Tomáš Gavenčiak, Milan Straka, Jan Hajič (2018): LemmaTag: Jointly Tagging and Lemmatizing for Morphologically Rich Languages with BRNNs. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing EMNLP 2018, pp. 4921-4928, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-948087-84-1 (url, biblio, attachment.pdf, bibtex)
  3. Jakub Náplava, Milan Straka, Pavel Straňák, Jan Hajič (2018): Diacritics Restoration Using Neural Networks. In: Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC 2018), pp. 1-10, European Language Resources Association, Paris, France, ISBN 979-10-95546-00-9 (url, biblio, attachment.pdf, bibtex)
  4. Milan Straka (2018): UDPipe 2.0 Prototype at CoNLL 2018 UD Shared Task. In: Proceedings of CoNLL 2018: The SIGNLL Conference on Computational Natural Language Learning, pp. 197-207, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-948087-72-8 (pdf, biblio, attachment.pdf, bibtex)
  5. Milan Straka, Nikita Mediankin, Tom Kocmi, Zdeněk Žabokrtský, Vojtěch Hudeček, Jan Hajič (2018): SumeCzech: Large Czech News-Based Summarization Dataset. In: Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC 2018), pp. 3488-3495, European Language Resources Association, Paris, France, ISBN 979-10-95546-00-9 (url, biblio, attachment.pdf, bibtex)
  6. Daniel Zeman, Jan Hajič, Martin Popel, Martin Potthast, Milan Straka, Filip Ginter, Joakim Nivre, Slav Petrov (2018): CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies. In: Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pp. 1-21, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-948087-82-7 (url, biblio, attachment1.pdf, attachment2.pdf, bibtex)
  7. Natalia Klyueva, Antoine Doucet, Milan Straka (2017): Neural Networks for Multi-Word Expression Detection. In: Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017), pp. 60-65, Association for Computational Linguistics (ACL), Stroudsburg, PA, USA, ISBN 978-1-945626-48-7 (pdf, biblio, attachment.pdf, obd, bibtex)
  8. Milan Straka, Jana Straková (2017): Tokenizing, POS Tagging, Lemmatizing and Parsing UD 2.0 with UDPipe. In: Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pp. 88-99, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-945626-70-8 (pdf, biblio, attachment.pdf, obd, bibtex)
  9. Milan Straka, Jana Straková, Jan Hajič (2017): Prague at EPE 2017: The UDPipe System. In: Proceedings of the 2017 Shared Task on Extrinsic Parser Evaluation at the Fourth International Conference on Dependency Linguistics and the 15th International Conference on Parsing Technologies, pp. 65-74, Association for Computational Linguistics (ACL), Stroudsburg, PA, USA, ISBN 978-1-945626-74-6 (pdf, biblio, attachment.pdf, obd, bibtex)
  10. Jana Straková, Milan Straka, Magda Ševčíková, Zdeněk Žabokrtský (2017): Czech Named Entity Corpus. In: Handbook of Linguistic Annotation, pp. 855-873, Springer Netherlands, Netherlands, ISBN 978-94-024-0879-9 (biblio, obd)
  11. Daniel Zeman, Martin Popel, Milan Straka, Jan Hajič, Joakim Nivre, Filip Ginter, Juhani Luotolahti, Sampo Pyysalo, Slav Petrov, Martin Potthast, Francis Tyers, Elena Badmaeva, Memduh Gökırmak, Anna Nedoluzhko, Silvie Cinková, Jan Hajič, jr., Jaroslava Hlaváčová, Václava Kettnerová, Zdeňka Urešová, Jenna Kanerva, Stina Ojala, Anna Missilä, Christopher Manning, Sebastian Schuster, Siva Reddy, Dima Taji, Nizar Habash, Herman Leung, Marie-Catherine de Marneffe, Manuela Sanguinetti, Maria Simi, Hiroshi Kanayama, Valeria de Paiva, Kira Droganova, Héctor Martínez Alonso, Çağrı Çöltekin, Umut Sulubacak, Hans Uszkoreit, Vivien Macketanz, Aljoscha Burchardt, Kim Harris, Katrin Marheinecke, Georg Rehm, Tolga Kayadelen, Mohammed Attia, Ali Elkahky, Zhuoran Yu, Emily Pitler, Saran Lertpradit, Michael Mandl, Jesse Kirchner, Hector Fernandez Alcalde, Jana Strnadová, Esha Banerjee, Ruli Manurung, Antonio Stella, Atsuko Shimada, Sookyoung Kwak, Gustavo Mendonça, Tatiana Lando, Rattima Nitisaroj, Josie Li (2017): CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies. In: Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pp. 1-19, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-945626-70-8 (pdf, biblio, attachment.pdf, obd, bibtex)
  12. Milan Straka, Jan Hajič, Jana Straková (2016): UDPipe: Trainable Pipeline for Processing CoNLL-U Files Performing Tokenization, Morphological Analysis, POS Tagging and Parsing. In: Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC 2016), pp. 4290-4297, European Language Resources Association, Paris, France, ISBN 978-2-9517408-9-1 (pdf, biblio, attachment.pdf, obd, bibtex)
  13. Jana Straková, Milan Straka, Jan Hajič (2016): Neural Networks for Featureless Named Entity Recognition in Czech. In: Text, Speech, and Dialogue: 19th International Conference, TSD 2016, Lecture Notes in Computer Science, ISSN 0302-9743, 9924, pp. 173-181, Springer International Publishing, Cham / Heidelberg / New York / Dordrecht / London, ISBN 978-3-319-45509-9 (url, biblio, attachment.pdf, obd, bibtex)
  14. Magda Ševčíková, Zdeněk Žabokrtský, Jonáš Vidra, Milan Straka (2016): Lexikální síť DeriNet: elektronický zdroj pro výzkum derivace v češtině. In: Časopis pro moderní filologii, ISSN 0008-7386, vol. 98, no. 1, pp. 62-76 (biblio, obd, bibtex)
  15. Zdeněk Žabokrtský, Magda Ševčíková, Milan Straka, Jonáš Vidra, Adéla Limburská (2016): Merging Data Resources for Inflectional and Derivational Morphology in Czech. In: Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC 2016), pp. 1307-1314, European Language Resources Association, Paris, France, ISBN 978-2-9517408-9-1 (pdf, biblio, attachment.pdf, obd, bibtex)
  16. Milan Straka, Jan Hajič, Jana Straková, Jan Hajič, jr. (2015): Parsing Universal Dependency Treebanks using Neural Networks and Search-Based Oracle. In: 14th International Workshop on Treebanks and Linguistic Theories (TLT 2015), pp. 208-220, IPIPAN, Warszawa, Poland, ISBN 978-83-63159-18-4 (pdf, biblio, attachment.pdf, obd, bibtex)
  17. Jana Straková, Milan Straka, Jan Hajič (2014): Open-Source Tools for Morphology, Lemmatization, POS Tagging and Named Entity Recognition. In: Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pp. 13-18, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-941643-00-6 (pdf, biblio, attachment.pdf, obd, bibtex)
  18. David Mareček, Milan Straka (2013): Stop-probability estimates computed on a large corpus improve Unsupervised Dependency Parsing. In: Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, pp. 281-290, Association for Computational Linguistics, Sofia, Bulgaria, ISBN 978-1-937284-50-3 (pdf, biblio, attachment.pdf, obd, bibtex)
  19. Jana Straková, Milan Straka, Jan Hajič (2013): A New State-of-The-Art Czech Named Entity Recognizer. In: Text, Speech and Dialogue: 16th International Conference, TSD 2013. Proceedings, Lecture Notes in Computer Science, ISSN 0302-9743, 8082, pp. 68-75, Springer Verlag, Berlin / Heidelberg, ISBN 978-3-642-40584-6 (url, biblio, attachment.pdf, obd, bibtex)
  20. Milan Straka (2011): Adams’ Trees Revisited – Correct and Efficient Implementation. In: Proceedings of TFP 2011, Symposium on Trends in Functional Programming, Madrid, Spain, May 2011 (pdf)
  21. Milan Straka (2010): The performance of the Haskell containers package. In: Proceedings of Haskell 2010, 3rd ACM Haskell symposium on Haskell, Baltimore, Maryland, September 2010 (pdf)
  22. Milan Straka (2009): Optimal worst-case fully persistent arrays. In: TFP 2009, Symposium on Trends in Functional Programming, Komarno, Slovakia, June 2009 (pdf)
  23. Martin Mareš, Milan Straka (2007): Linear-Time Ranking of Permutations. In: Proceedings of ESA 2007, 15th Annual European Symposium, Eilat, Israel, October 2007 (pdf)

Theses