Jan Hajič jr.

Malostranské náměstí 25
118 00 Praha 1
Czech Republic

Main Research Interests


Optical Music Recognition (my dissertation: ODEVZDANE_IPTX_2013_2_11320_0_455244_0_153947.pdf)

I've recently published the MUSCIMA++ dataset.

Music Information Retrieval in general (see e.g. the defended Bc. thesis of Marek Židek).

Bayesian models, non-parametric Bayesian models

Neural networks for text modeling

Multimodal (text/image) models


Ribosomal RNA secondary structure prediction

Sentiment analysis

Unsupervised morphology

Generative parsing



Multimodal Optical Music Recognition (GAUK 1444217), 2017 - 2019 (PI).

Convolutional Neural Networks for Optical Music Recognition (GAUK 170217), 2017 - 2018 (Co-investigator).


rRNA Secondary Structure Prediction (GAUK 550214), 2015 - 2016 (PI).

Curriculum Vitae

My CV is available here: CV_HajicJr.pdf

Selected Bibliography

Jan Hajič jr., Matthias Dorfer, Gerhard Widmer, Pavel Pecina: Towards Full-Pipeline Handwritten OMR with Musical Symbol Detection by U-Nets. In: 19th International Society for Music Information Retrieval Conference, Paris, France, 2018. [pdf]

Alexander Pacha, Jan Hajič jr., Jorge Calvo-Zaragoza: A Baseline for Musical Object Detection with Deep Learning. Applied Sciences 8 (9), 1488-1509. 2018. [pdf]

Matthias Dorfer, Jan Hajič jr., Andreas Arzt, Harald Frostel, Gerhard Widmer: Learning Audio–Sheet Music Correspondences for Cross-Modal Retrieval and Piece Identification. Transactions of the International Society for Music Information Retrieval, 1 (1), 2018. [html]

Jan Hajič jr., Pavel Pecina: The MUSCIMA++ Dataset for Handwritten Optical Music Recognition. Proceedings of the 14th IAPR International Conference on Document Analysis and Recognition, Osaka Prefecture University, Kyoto, Japan, November 2017. [pdf]

Hajič jr., J. & Pecina, P.: Detecting Noteheads with ConvNets and Bounding Box Regression. Technical report, to appear in ArXiv e-prints, 2017 [pdf]

Hajič jr., J. & Pecina, P.: In Search of a Dataset for Handwritten Optical Music Recognition: Introducing MUSCIMA++. ArXiv e-prints, 1703.04824, 2017 [pdf]

Hajič jr., J.; Novotný, J.; Pecina, P. & Pokorný, J.: Further Steps towards a Standard Testbed for Optical Music Recognition. Proceedings of the 17th International Society for Music Information Retrieval Conference, New York University, 2016, 157-163 [pdf]

Straka, M.; Hajič, J.; Straková, J. & Hajič jr., J.: Parsing Universal Dependency Treebanks using Neural Networks and Search-Based Oracle. 14th International Workshop on Treebanks and Linguistic Theories (TLT 2015), IPIPAN, 2015, 208-220

Hajič jr., J. & Pecina, P.: Matching Illustrative Images to “Soft News” Articles. In: UFAL WDS 2015 (Conference of PhD Students in Mathematical Linguistics), Institute of Formal and Applied Linguistics, Charles University in Prague, 2015, 49-56

Veselovská, K.; Hajič jr., J. & Šindlerová, J.: Subjectivity Lexicon for Czech: Implementation and Improvements. Journal for Language Technology and Computational Linguistics, German Society for Computational Linguistics and Language Technology, 2014, 29, 47-61

Veselovská, K. & Hajič jr., J.: Why Words Alone Are Not Enough: Error Analysis of Lexicon-based Polarity Classifier for Czech. Proceedings of the 6th International Joint Conference on Natural Language Processing, Asian Federation of Natural Language Processing, 2013, 1-5 [pdf]

Veselovská, K.; Hajič jr., J. & Šindlerová, J.: Creating Annotated Resources for Polarity Classification in Czech. Proceedings of the 11th Conference on Natural Language Processing, Schriftenreihe der Österreichischen Gesellschaft für Artificial Intelligence (ÖGAI), 2012


I am open to topics concerning music technology. Currently, I am supervising:

Marek Židek (defended Bc. thesis on generating music with LSTMs, includes significant effort in evaluation)

Jiří Balhar (working on Bc. thesis on melody extraction from orchestral audio)


I've recently finished my PhD studies at ÚFAL, where I wrote my thesis on Optical Music Recognition under RNDr. Pavel Pecina. I am generally interested in music informatics: if you are a student with an interest in music, especially machine learning for musical applications, I will be happy to hear about it! For instance, we did some music generation (interview in Czech).

My Mgr. thesis, also under RNDr. Pecina, was on the topic of automatically selecting images for news articles. This work was also done for the CEMI project. My thesis is available here: Matching Images to Texts

I previously worked on the SEANCE project on sentiment analysis, both in my Bc. thesis and in the following years.