Filtering Microarray Correlations by Statistical Literature Analysis Yields Potential Hypotheses for Lactation Research

Maurice HT Ling, Christophe Lefevre, Kevin R. Nicholas
DOI: https://doi.org/10.48550/arXiv.0901.0213
2009-01-02
Abstract: Our results demonstrated that a previously reported protein name co-occurrence method (5-mention PubGene), although not based on a hypothesis testing framework, is generally statistically more significant than a co-occurrence method based on the 99th percentile of the Poisson distribution. It also agrees with previous methods that use natural language processing to extract protein-protein interactions from text: more than 96% of the interactions found by natural language processing methods overlap with the results of the 5-mention PubGene method. However, less than 2% of the gene co-expressions identified by microarray analysis were found through direct co-occurrence or through interaction information extracted from the literature. At the same time, by combining microarray and literature analyses, we derive a novel set of 7 potential functional protein-protein interactions that have not been previously described in the literature.
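To make the two co-occurrence criteria named in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that the expected number of abstracts mentioning both protein names is computed under independence (count of A times count of B divided by the corpus size) and that a pair passes the Poisson criterion when its observed count exceeds the 99th percentile of a Poisson distribution with that mean. The abstract does not spell out this formulation, so both assumptions and all function names here are hypothetical.

```python
import math


def poisson_percentile(lam, q=0.99):
    """Smallest integer k with P(X <= k) >= q for X ~ Poisson(lam).

    Intended for the small expected counts typical of name co-occurrence;
    very large means are rejected because exp(-lam) would underflow.
    """
    if lam < 0 or lam > 700:
        raise ValueError("lam must be between 0 and 700 for this sketch")
    k = 0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    while cdf < q:
        k += 1
        term *= lam / k    # P(X = k) from P(X = k - 1)
        cdf += term
    return k


def expected_cooccurrence(n_a, n_b, n_total):
    """Expected co-mentions if names A and B occurred independently (assumption)."""
    return n_a * n_b / n_total


def passes_poisson(observed, n_a, n_b, n_total, q=0.99):
    """True if the observed co-mention count exceeds the q-th Poisson percentile."""
    lam = expected_cooccurrence(n_a, n_b, n_total)
    return observed > poisson_percentile(lam, q)


def passes_five_mention(observed):
    """Fixed-threshold criterion: at least 5 co-mentions."""
    return observed >= 5
```

With these hypothetical helpers, a pair of names can be checked against both criteria by passing its observed co-mention count together with the individual name counts and the corpus size. The fixed threshold of 5 ignores how frequent each name is, whereas the Poisson threshold rises with the expected count.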
Digital Libraries, Databases