Molecular Signatures from Gene Expression Data

Ramon Diaz-Uriarte
DOI: https://doi.org/10.48550/arXiv.q-bio/0401043
2004-10-08
Abstract: Motivation: "Molecular signatures" or "gene-expression signatures" are used to predict patients' characteristics using data from coexpressed genes. Signatures can enhance understanding of biological mechanisms and have diagnostic use. However, available methods to search for signatures fail to address key requirements of signatures, especially the discovery of sets of tightly coexpressed genes. Results: After suggesting an operational definition of signature, we develop a method that fulfills these requirements, returning sets of tightly coexpressed genes with good predictive performance. This method can also identify when the data are inconsistent with the hypothesis of a few, stable, easily interpretable sets of coexpressed genes. Identification of molecular signatures in some widely used data sets is questionable under this simple model, which emphasizes the need for further work on the operationalization of the biological model and the assessment of the stability of putative signatures. Availability: The code (R with C++) is available from http://www.ligarto.org/rdiaz/Software/Software.html under the GNU GPL.
Quantitative Methods, Genomics