SIS code:
Semester: summer
E-credits: 6 (summer semester)
Examination: 2/2 C+Ex

Dependency Grammars and Treebanks

Lectures: Markéta Lopatková, Daniel Zeman

  • Wed, room S1, 15:40-17:10

Practical sessions: Jiří Mírovský, Daniel Zeman

  • Fri, SU1, 12:20-13:50

Remote classes from March 11, 2020 - please study the teaching material provided below.

Zoom on-line classes from March 18 on (Wednesday 15:40): https://matfyz.zoom.us/j/501653775

I will do my best to provide slides and additional reading each Tuesday afternoon. If you are interested, I am available for individual consultations, preferably during the lecture time slot (Wednesday afternoon). Please contact me by email in advance.

Lectures

  • Lecture 1 (February 19, 2020): Introduction, trees, word order, projectivity (pdf);
    • reading:
      • Kuhlmann, M., Nivre, J. (2006): Mildly Non-Projective Dependency Structures. In COLING/ACL Main Conference Poster Sessions, 507–514 (link).
      • Havelka, J. (2007): Mathematical Properties of Dependency Trees and their Application to Natural Language Syntax. PhD Thesis, MFF UK (link)
  • Lecture 2 (February 26, 2020): A bit of history; Dependency and Non-dependency relations (pdf)
    • reading:
      • Osborne, T. (2019) A Dependency Grammar of English. John Benjamins Publishing Company, Amsterdam/Philadelphia (available in my office)
      • Hajičová, E., Panevová, J., Sgall, P. (2002) Úvod do teoretické a počítačové lingvistiky, sv. I. Karolinum, Praha (available in the secretariat)
      • Štěpánek, J. (2006) Závislostní zachycení větné struktury v anotovaném syntaktickém korpusu. PhD Thesis, MFF UK (link)
      • Wikipedia - basic articles on dependency grammar are consistent with Timothy Osborne's approach
  • Lecture 3 (March 4, 2020): Intro to a stratificational language description (pdf)
    • reading:
      • Hajičová, E., Panevová, J., Sgall, P. (2002) Úvod do teoretické a počítačové lingvistiky, sv. I. Karolinum, Praha (available in the secretariat)
      • Štekauer, P., ed. (2000) Rudiments of English Linguistics. Slovacontact, Prešov.
      • Sgall, P. (1967) Generativní popis jazyka a česká deklinace. Academia, Praha (available in my office)
      • Žabokrtský, Z. (2006) Resemblances between Meaning ⇔ Text Theory and Functional Generative Description. In Proceedings of the 2nd International Conference of Meaning-Text Theory, Slavic Culture Languages Publishers House, Moskva, pp. 549-557. (link)
      • https://www.britannica.com/science/linguistics/Stratificational-grammar
  • Lecture 4 (March 11, 2020): TOPIC 1: Prague Dependency Treebank: Intro (pdf)
    • reading:
      • Hajičová, E., Panevová, J., Sgall, P. (2002) Úvod do teoretické a počítačové lingvistiky, sv. I. Karolinum, Praha (available in the secretariat)
      • PDT guide: http://ufal.mff.cuni.cz/pdt2.0
      • documentation (see individual corpora)
    • TOPIC 2: PDT: morphological annotation (pdf)
      • note: You are not expected to memorize the tag structure, but you might be asked to provide examples (using the following table pdf);
    • reading:
  • Lecture 5 (March 18, 2020): Intro to UD, morphology (pdf, video)
    • reading:
      • Nivre, J. et al. (2020) Universal Dependencies v2: An Evergrowing Multilingual Treebank Collection. To appear in: Proceedings of LREC 2020.
      • https://universaldependencies.org/
  • Lecture 6 (March 25, 2020): Surface syntactic annotation in PDT (a-layer) (pdf)
    • reading:
      • Hajič, J. (1998) Building a Syntactically Annotated Corpus: The Prague Dependency Treebank. In E. Hajičová (ed.): Issues of Valency and Meaning. Studies in Honour of Jarmila Panevová, Karolinum, Charles University Press, Prague, Czech Republic, pp. 106-132 (link)
      • Štekauer, P., ed. (2000) Rudiments of English Linguistics. Slovacontact, Prešov (chapter 4, Syntax)
      • Quirk, R., Greenbaum, S., Leech, G., Svartvik, J. (1985) A Comprehensive Grammar of the English Language, Longman, London.
      • PDT documentation: Manual for Analytical Annotation (link)
      • Table with analytical functions in PDT 2.0 (pdf)
  • Lecture 7 (April 1, 2020): Syntax in UD (pdf – covered until slide no. 35; video)
    • reading:
      • Nivre, J. et al. (2020) Universal Dependencies v2: An Evergrowing Multilingual Treebank Collection. To appear in: Proceedings of LREC 2020.
      • https://universaldependencies.org/
  • Lecture 8 (April 8, 2020): Enhanced dependencies in UD
    • reading:
      • Schuster, S., Manning, C. (2016) Enhanced English Universal Dependencies: An Improved Representation for Natural Language Understanding Tasks. In: Proceedings of LREC 2016, Portorož, Slovenia (pdf)
      • Droganova, K., Zeman, D. (2019) Towards Deep Universal Dependencies. In: Proceedings of Depling/Syntaxfest 2019, Paris, France (pdf)
      • https://universaldependencies.org/u/overview/enhanced-syntax.html

Plan

  • 9. April 15, 2020: PropBank (pdf)
    • reading:
  • 10. April 22, 2020: PDT: tectogrammatical layer (pdf)
    • reading:

Sample Final Test

Anybody may take the test without having completed the homework; however, the final grade will be registered in SIS only after all homework has been completed.

Note: At least 40% of the total score on the final test is required to pass!

Useful Links and Other Materials

  • Table with Czech positional morphological tags (pdf);
  • Table with analytical functions in PDT 2.0 (pdf);
  • Table with T-nodes attributes in PDT 2.0 (pdf);
  • PDT 2.0 Guide (also available as pdf)
  • PDT documentation (PDT 3.0, 2.0)
  • Universal Dependencies (link)

Practical (lab) sessions

(click here)

Homeworks

  • All homework must be committed to the svn repository at https://svn.ms.mff.cuni.cz/svn/undergrads/students; do not send homework by e-mail.
  • Submit your work into your personal directory.
  • Each homework has an explicit submission deadline: Tuesday before midnight.
  • If you miss the deadline, ask for an additional homework. All homework must be submitted in order to get the credit (zápočet).
  • You can solve an additional homework even if you submitted the regular homework on time (i.e., you can improve your average by solving some of the additional homework assignments).
  • E-mail us to confirm that your additional homework is ready to be graded. All additional homework must be submitted at least one week before the credit.
  • Each student is expected to work out all homework solutions independently; any cheating will be penalised (but you can send us an e-mail if you are stuck).

Final grade

  • Homework (40%)
  • Activity (10%)
  • Final test (50%) ... Note: At least 40% of the total score on the final test is required to pass!
  • Excellent: >= 90 %
  • Very good: >= 70 %
  • Good: >= 50 %
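
The weighting above can be illustrated with a short sketch. It is only one possible reading of the rules, not an official grading script. The function name final_grade, the convention that each component is scored as a percentage from 0 to 100, and the "Fail" label for totals below 50 % are assumptions made for illustration. The requirement that all homework be submitted before the grade is registered is not modelled.

```python
def final_grade(homework: float, activity: float, test: float) -> str:
    """Combine component scores (each a percentage, 0-100) into a grade.

    Hypothetical helper: weights and thresholds follow the list above;
    the 'Fail' outcome below 50 % is an assumption, not stated on the page.
    """
    for name, value in (("homework", homework),
                        ("activity", activity),
                        ("test", test)):
        if not 0 <= value <= 100:
            raise ValueError(f"{name} must be between 0 and 100, got {value}")

    # The final test has its own minimum: at least 40 % of its points.
    if test < 40:
        return "Fail"

    # Weighted total: homework 40 %, activity 10 %, final test 50 %.
    total = (40 * homework + 10 * activity + 50 * test) / 100

    if total >= 90:
        return "Excellent"
    if total >= 70:
        return "Very good"
    if total >= 50:
        return "Good"
    return "Fail"


if __name__ == "__main__":
    print(final_grade(homework=80, activity=100, test=70))
```

For instance, 80 % on homework, full activity and 70 % on the final test give a weighted total of 32 + 10 + 35 = 77 %, which falls in the "Very good" band.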

Archive