Solving Multi-instance Learning Problem with Evaluating the Importance of Concept in Instances

GAN Rui, YIN Jian
DOI: https://doi.org/10.3969/j.issn.1002-137X.2012.07.033
2012-01-01
Computer Science
Abstract: In multi-instance learning, the training set is composed of labeled bags, each of which consists of many unlabeled instances, and the goal is to learn a classifier from the training set that correctly labels unseen bags. Some previous studies of multi-instance learning adapt single-instance learning algorithms to the multi-instance representation, while others propose new methods that model the relationship between instances and bags and use it to solve the problem. This paper instead adapts the representation of the bag and proposes a new algorithm, the concept evaluating algorithm. First, the algorithm uses a clustering algorithm to partition all instances into d groups, each of which can be treated as a concept in the instances. Then, it uses the TF-IDF (term frequency-inverse document frequency) algorithm to compute the importance of each concept in each bag. Finally, each bag is re-represented as a d-dimensional vector, the concept evaluating vector, whose ith value is the importance of the ith group in the bag. Because the re-represented data set is no longer multi-instance, propositional single-instance learning algorithms can be applied to solve the multi-instance learning problem effectively.
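The three steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the clustering algorithm or the exact TF-IDF weighting, so the sketch assumes plain k-means and the common tf × log(N / df) form. The function names `kmeans`, `nearest`, and `concept_vectors` are hypothetical.

```python
import math
import random


def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(point, centroids[i])),
    )


def kmeans(points, d, iters=50, seed=0):
    """Plain k-means (an assumed choice of clustering); returns d centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, d)]
    for _ in range(iters):
        groups = [[] for _ in range(d)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Recompute each centroid as the mean of its group;
        # an empty group keeps its previous centroid.
        updated = [
            [sum(col) / len(g) for col in zip(*g)] if g else centroids[i]
            for i, g in enumerate(groups)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def concept_vectors(bags, d, seed=0):
    """Re-represent each bag as a d-dimensional vector of concept importances."""
    # Step 1: cluster all instances from all bags into d concepts.
    all_instances = [inst for bag in bags for inst in bag]
    centroids = kmeans(all_instances, d, seed=seed)
    assignments = [[nearest(inst, centroids) for inst in bag] for bag in bags]

    # Step 2: TF-IDF, treating a bag as a document and a concept as a term.
    # Assumed weighting: tf = share of the bag's instances in the concept,
    # idf = log(number of bags / number of bags containing the concept).
    n_bags = len(bags)
    df = [sum(1 for a in assignments if j in a) for j in range(d)]

    # Step 3: one d-dimensional vector per bag.
    vectors = []
    for a in assignments:
        vec = []
        for j in range(d):
            tf = a.count(j) / len(a)
            idf = math.log(n_bags / df[j]) if df[j] else 0.0
            vec.append(tf * idf)
        vectors.append(vec)
    return vectors
```

Each bag is passed as a list of numeric tuples, for example `concept_vectors([[(0.0, 0.0), (5.0, 5.0)], [(0.1, 0.0)]], d=2)`. The resulting vectors, paired with the bag labels, can then be given to any ordinary single-instance classifier. Under this weighting, a concept that occurs in every bag receives an importance of zero.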