Maarten Janssen
Postdoctoral Fellow
Department for the Study of Ancient and Medieval Thought
Institute of Philosophy of the Czech Academy of Sciences
Jilská 1, 110 00 Praha 1
Czech Republic
janssen@cas.flu.cz
Institute of Formal and Applied Linguistics
Charles University
Malostranské náměstí 25
118 00 Praha 1
Czech Republic
janssen@ufal.mff.cuni.cz
About
Maarten Janssen is a computational linguist with a special interest in corpora, lexica, and formal semantics. He was responsible for the system behind the official vocabulary of Portuguese (VOC) and is now best known as the main author of the TEITOK corpus system. He holds a PhD from Utrecht University and has worked at various research institutes around Europe: OTS (Utrecht), ERSS (Toulouse), ILTEC (Lisbon), IULA (Barcelona), CLUL (Lisbon), and CELGA (Coimbra). He currently works at UFAL (Prague).
Maarten Janssen oversees the creation of a corpus from the source material used in Alchemies of Scent, and is developing a system for modeling recipes and ingredients directly over that source material.
Recent Publications
M. Janssen. 2021. UDWiki: Guided creation and exploitation of UD treebanks. Proceedings of the 5th Workshop on Universal Dependencies (UDW 2021), part of SyntaxFest 2021, pages 84–95.
C. Navarretta and M. Eskevich, editors. 2020. Integrating TEITOK and Kontext at LINDAT. CLARIN, Madrid, Spain.
Egon W. Stemle, Adriane Boyd, Maarten Janssen, A. Rosen, D. Rosén, E. Volodina, et al. 2019. Working together towards an ideal infrastructure for language learner corpora.
M. Janssen. 2018. Adding words to manuscripts: From PagesXML to TEITOK. Lecture Notes in Computer Science, 11057:152–157.
M. Janssen. 2016. TEITOK: Text-faithful annotated corpora. Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016, pages 4037–4043.