Typesafe Modeling in Text Mining

Fabian Steeg
DOI: https://doi.org/10.48550/arXiv.1108.0363
2011-07-29
Abstract: Based on the concept of annotation-based agents, this report introduces tools and a formal notation for defining and running text mining experiments using a statically typed domain-specific language embedded in Scala. Using machine learning for classification as an example, the framework is used to develop and document text mining experiments, and to show how the concept of generic, typesafe annotation corresponds to a general information model that goes beyond text processing.
Programming Languages, Information Retrieval