Tagging the Teleman Corpus

Thorsten Brants, Christer Samuelsson
DOI: https://doi.org/10.48550/arXiv.cmp-lg/9505026
1995-05-11
Abstract: Experiments were carried out comparing the Swedish Teleman and the English Susanne corpora using an HMM-based and a novel reductionistic statistical part-of-speech tagger. They indicate that tagging the Teleman corpus is the more difficult task, and that the performance of the two different taggers is comparable.
Computation and Language