Paul Cullen, Stefan Lorkowski, Steffen Hennig, Albert Poustka, Georgia Panopoulou, Hans Lehrach, Eberhard Korsching, Oxana Pickeral, Ralf Herwig, Alexander Kel, Dmitrij Tchekmenev, Edgar Wingender, Johannes Streicher, Gerd B. Müller, Takeshi Kawashima, Kazuhiro W. Makabe, Inna Dubchak, Hongkai Ji, Kousaku Okubo, Shoko Kawamoto, Ellen Fricke, Dagmar Karas, Martin Haubrock, Sigrid Land, Stella Rotert, Xin Chen, Joan Pontius, Eric Eveno, Charles Auffray, Charles Decraene, Claude Chelala, Geneviève Piétu, Marie‐Dominique Devignes, Régine Mariage‐Samson, Sandrine Imbeaud, Sylvie Bortoli, Alon Amit, Martin Ringwald, Michael J. de Veer, Bryan R. G. Williams, Eldon M. Walker, Jamie A. Davies, Christoph Grunau, Richard Baldock, Duncan R. Davidson, Christian J. Stoeckert, Angel Pizarro, Elisabetta Manduchi, Gregory R. Grant, Jonathan Crabtree, Junmin Liu, Phuc V. Le, Shannon K. McWeeney, Stephen Welle, Catherine A. Ball, David Botstein, Gail Binkley, Gavin J. Sherlock, J. Michael Cherry, Kara Dolinski, Laurie Issel‐Tarver, Mark Schroeder, Selina S. Dwight, Shuai Wenig, John C. Matese, Heng Jin, Jeremy Gollub, Joan Hebert, Miroslava Kaloper, Patrick O. Brown, Tina Hernandez‐Boussard, Anuj Kumar, Kei‐Hoi Cheung, Luis Marenco, Michael Snyder, Nick Tosches, Paul Bertone, Perry Miller, Peter Masiar, Yang Liu, Graziano Pesole, Philippe Marc, Margaret Biswas, Paul Kersey, Rolf Apweiler, Christine Hoogland

Abstract: This chapter contains sections titled: Introduction; Comparative expressed sequence tag analysis; Introduction; Processing expressed sequence tags prior to content analysis; Gene content and annotation of expressed sequence tags; Expressed sequence tags in comparative genomics; In silico subtraction using clustered sets of expressed sequence tags; Expressed sequence tag data repositories and cDNA clone distribution centres; Data management and data mining; Introduction; Current situation; Future development; Taking part in bioinformatics; Hardware and software demands; Data types, structures and processing; Communication structures; Building a test scenario; Microarray experiments; Analysing the workflow – getting things done; Designing the question and choosing the right tools for the answer; Scaling up; Strategies of data mining; Data evaluation and representation; Principles of query languages; Data mining; Custom solutions; Summary; Integration of heterogeneous high-throughput gene expression data; Introduction; Steps towards data integration; Initial steps in realising data integration; Conclusions; Cluster analysis of gene expression profiles; Introduction; Information content of gene expression clusters; Similarity matrices and gene expression matrices; Clustering algorithms; Hierarchical clustering; Self-organising maps; K-means; Gene shaving; Evaluation of gene expression clusters; Conclusion; Promoter finding in eukaryotic genomes; Introduction; Transcription regulation in eukaryotes; Promoter structure; Transcription factors; Combinatorial nature of transcription regulation; Databases on transcriptional regulation; In silico study of gene transcription regulation; Recognition of cis-regulatory elements; Recognition of composite regulatory elements; Recognition of promoters; Conclusions; GeneEMAC – Three-dimensional visualisation of gene expression; Introduction; Principles and basics of the GeneEMAC concept; Specimen preparation; Whole-mount in situ hybridisation; Embedding; Introduction of external markers; Capturing of a reference image; Histological sectioning; Microscopy and digital image processing; Image capturing; Image congruencing; Image segmentation; Generation of a three-dimensional model; Visualisations of models; Examples; Discussion; RNA-based gene expression databases and analyses tools; Introduction; ASDB – The Alternative Splicing Database; AsMamDB – The Alternative Splice Database of Mammals; BodyMap – An anatomical gene expression database of human and mouse; The CYTOMER® Gene Expression Database on human organs and cell types; Database of three-dimensional visualisation of gene expression; dbEST – The Database of Expressed Sequence Tags; DDD – Digital Differential Display; The Genexpress IMAGE Knowledge Base of the Human Genome and Transcriptomes; The Gencarta™ Database; GEO – Gene Expression Omnibus Database; GXD – The Mouse Gene Expression Database; ISG Database – Interferon-Stimulated Gene Database; The Kidney Development Database; MAGEST – The Maboya Gene Expression Patterns and Sequence Tags Database; MethDB – The DNA Methylation Database; EMAGE – The Edinburgh Mouse Atlas Gene Expression Database; RAD – The RNA Abundance Database; The Rochester Muscle Database; SAGEmap – The serial analysis of gene expression tag to gene mapping database; SGD – The Saccharomyces Genome Database and its Expression Connection; SMD – The Stanford Microarray Database; TRIPLES – The Database of Transposon-Insertion Phenotypes, Localisation, and Expression in Saccharomyces cerevisiae; UTRdb and UTRsite – The Specialised Databases of Sequences and Functional Elements of 5′ and 3′-Untranslated Regions of Eukaryotic mRNAs; yMGV – The Yeast Microarray Global Viewer; Protein-based gene expression databases and analyses tools; Introduction to protein-based gene expression databases; The Proteome Analysis Database; SWISS-2DPAGE – A two-dimensional polyacrylamide gel electrophoresis database; Further gene-expression databases in the internet; Summary; References

Textrous!: extracting semantic textual meaning from gene sets

Gene Set linkage analysis: a tool for interpreting the overall functional impacts of observed transcriptomic changes

Application and evaluation of automated semantic annotation of gene expression experiments

MedMiner: An Internet Text-Mining Tool for Biomedical Information, with Application to Gene Expression Profiling

Extracting Information for Meaningful Function Inference through Text-Mining

Text mining for contexts and relationships in cancer genomics literature

Tagging gene and protein names in biomedical text

Identifying overrepresented concepts in gene lists from literature: a statistical approach based on Poisson mixture model

GenoTEX: A Benchmark for Evaluating LLM-Based Exploration of Gene Expression Data in Alignment with Bioinformaticians

Semantic computing for human phenotypes

Extraction of semantic biomedical relations from text using conditional random fields

Data-Driven Information Extraction and Enrichment of Molecular Profiling Data for Cancer Cell Lines

WebGestalt: an integrated system for exploring gene sets in various biological contexts

Discovery of perturbation gene targets via free text metadata mining in Gene Expression Omnibus

Integration of Text- and Data-Mining Using Ontologies Successfully Selects Disease Gene Candidates

Computational Methods and Bioinformatic Tools

Rummagene: massive mining of gene sets from supporting materials of biomedical research publications

Gene Set Summarization using Large Language Models

Text mining for finding functional community of related genes using TCM knowledge

The BioLexicon: a large-scale terminological resource for biomedical text mining

GEOGLE: Context Mining Tool for the Correlation Between Gene Expression and the Phenotypic Distinction