Allomorfessor: Towards Unsupervised Morpheme Analysis

Reference:

Oskar Kohonen, Sami Virpioja, and Mikaela Klami. Allomorfessor: Towards unsupervised morpheme analysis. Lecture Notes in Computer Science, 5706, Evaluating Systems for Multilingual and Multimodal Information Access, 9th Workshop of the Cross-Language Evaluation Forum, CLEF 2008, Aarhus, Denmark, September 17-19, 2008, Revised Selected Papers, Editors Carol Peters, Thomas Deselaers, Nicola Ferro, Julio Gonzalo, Gareth J. F. Jones, Mikko Kurimo, Thomas Mandl, Anselmo Peñas, Vivien Petras, 2009.

Abstract:

We extend the unsupervised morpheme segmentation method Morfessor Baseline to account for the linguistic phenomenon of allomorphy, where one morpheme has several different surface forms. Our method discovers common base forms for allomorphs from an unannotated corpus. We evaluate the method by participating in the Morpho Challenge 2008 competition, where inferred analyses are compared against a linguistic gold standard. While our competition entry achieves high precision but low recall, and therefore low F-measure scores, we show that a small model change gives state-of-the-art results.

Suggested BibTeX entry:

@article{okohonen_clefpostproc_2009,
    author = {Oskar Kohonen and Sami Virpioja and Mikaela Klami},
    journal = {Lecture Notes in Computer Science},
    publisher = {Springer},
    title = {Allomorfessor: Towards Unsupervised Morpheme Analysis},
    volume = {5706},
    note = {Evaluating Systems for Multilingual and Multimodal Information Access, 9th Workshop of the Cross-Language Evaluation Forum, CLEF 2008, Aarhus, Denmark, September 17-19, 2008, Revised Selected Papers, Editors Carol Peters, Thomas Deselaers, Nicola Ferro, Julio Gonzalo, Gareth J. F. Jones, Mikko Kurimo, Thomas Mandl, Anselmo Pe{\~n}as, Vivien Petras},
    year = {2009},
}

This work is not available online here.