Drosophila Gene Expression Pattern Annotations Via Multi-Instance Biological Relevance Learning

Hua Wang, Cheng Deng, Hao Zhang, Xinbo Gao, Heng Huang
DOI: https://doi.org/10.1609/aaai.v30i1.10173
2016-01-01
Abstract: Recent developments in biology have produced a large number of gene expression patterns, many of which have been annotated textually with anatomical and developmental terms. These terms correspond spatially to local regions of the images, but they are attached collectively to groups of images. Because one does not know which term is assigned to which region of which image in the group, developmental stage classification and anatomical term annotation turn out to be a multi-instance learning (MIL) problem, which treats the input as bags of instances and assigns labels to the bags rather than to individual instances. Most existing MIL methods routinely use Bag-to-Bag (B2B) distances, which, however, are often computationally expensive and may not truly reflect the similarities between the anatomical and developmental terms. In this paper, we approach the MIL problem from a new perspective using Class-to-Bag (C2B) distances, which directly assess the relations between annotation terms and image panels. Taking into account the two challenging properties of multi-instance gene expression data, high heterogeneity and weak label association, we compute the C2B distance by introducing class-specific distance metrics and locally adaptive significance coefficients. We apply our new approach to automatic gene expression pattern classification and annotation on the Drosophila melanogaster species. Extensive experiments have demonstrated the effectiveness of our new method.
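To make the idea of a C2B distance concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not give the formula, so the sketch assumes one common formulation: each instance pooled from a class's training bags is matched to its nearest instance in the query bag, and the matched squared distances are summed. The function name `c2b_distance` and the parameters `metric` (standing in for a class-specific distance metric) and `weights` (standing in for per-instance significance coefficients) are hypothetical. How the paper actually learns these quantities is not described here.

```python
import numpy as np

def c2b_distance(class_instances, bag, metric=None, weights=None):
    """Illustrative Class-to-Bag distance (assumed formulation).

    class_instances: (n, d) instances pooled from all training bags of one class.
    bag:             (m, d) instances of the query bag.
    metric:          optional (d, d) positive semi-definite matrix; identity if None.
    weights:         optional (n,) per-instance coefficients; all ones if None.
    """
    class_instances = np.asarray(class_instances, dtype=float)
    bag = np.asarray(bag, dtype=float)
    n, d = class_instances.shape
    M = np.eye(d) if metric is None else np.asarray(metric, dtype=float)
    w = np.ones(n) if weights is None else np.asarray(weights, dtype=float)

    # Pairwise differences between class instances and bag instances: (n, m, d).
    diff = class_instances[:, None, :] - bag[None, :, :]
    # Squared Mahalanobis-style distance diff^T M diff for every pair: (n, m).
    sq_dist = np.einsum("nmd,de,nme->nm", diff, M, diff)
    # Each class instance is matched to its nearest instance in the bag.
    nearest = sq_dist.min(axis=1)
    # Weighted sum over class instances.
    return float(np.dot(w, nearest))

# Toy data: two classes and one query bag (values are made up for illustration).
class_a = [[0.0, 0.0], [1.0, 0.0]]
class_b = [[5.0, 5.0], [6.0, 5.0]]
query_bag = [[0.0, 0.1], [1.0, 0.1], [3.0, 3.0]]

dist_a = c2b_distance(class_a, query_bag)
dist_b = c2b_distance(class_b, query_bag)
predicted = "a" if dist_a < dist_b else "b"
```

Under this assumed formulation, a bag is assigned to the class with the smallest C2B distance, so classification needs one distance per class rather than one per training bag, which is the computational advantage over B2B distances that the abstract alludes to.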