Morphology Project: StateMorph
Research:
We develop information-theoretic methods for modeling morphology.
Morphological segmentation
Evaluation of morphological segmentation
Finnish
Russian
Turkish
Resources:
Unsupervised Learning of Morphology:
Morphological Segmentation package from the SLSP-2017 paper
Evaluation package from the LREC-2016 paper
Finnish dataset
Transliteration:
People:
Mian Du, PhD student
Suvi Hiltunen, MSc student
Guowei Lv, MSc student
Javad Nouri, MSc student
Kirill Reshetnikov, Russian Academy of Sciences, Institute of Linguistics, Moscow
Arto Vihavainen, MSc student
Hannes Wettig, PhD student
Roman Yangarber, Project Lead
Collaboration:
Russian Academy of Sciences (RAS), Institute of Linguistics.
Bayesian Statistics Group, led by J Corander, Department of Mathematics and Statistics, COIN Center of Excellence of the Academy of Finland.
Supported by:
Academy of Finland, the UraLink Project,
Russian Fund for the Humanities / Russian Foundation for Basic Research,
HIIT: Helsinki Institute for Information Technology,
Algodan Center of Excellence: Algorithmic Data Analysis.
Publications: conference and journal papers, book chapters, dissertations
Learning Morphology of Natural Language as a Finite-State Grammar
Javad Nouri, Roman Yangarber
In Proceedings of SLSP: the 5th International Conference on Statistical Language and Speech Processing (2017) Le Mans, France

Minimum Description Length Models for Unsupervised Learning of Morphology (Master's Thesis)
Javad Nouri
(2016) University of Helsinki, Department of Computer Science

A novel method for evaluation of morphological segmentation
Javad Nouri, Roman Yangarber
In Proceedings of LREC: 10th International Conference on Language Resources and Evaluation (2016) Portorož, Slovenia

MDL-based Models for Transliteration Generation
Javad Nouri, Lidia Pivovarova, Roman Yangarber
In SLSP: International Conference on Statistical Language and Speech Processing, Springer Verlag, Lecture Notes in Artificial Intelligence (LNAI) Volume 7978 (2013) Tarragona, Spain

Hidden Markov models for induction of morphological structure of natural language
Hannes Wettig, Suvi Hiltunen, Roman Yangarber
WITMSE-2010: Workshop on Information Theoretic Methods in Science and Engineering (2010) Tampere, Finland