Abstract: Many efforts exist to design and implement approaches and tools for data capture, integration and analysis in the life sciences. The challenges include not only the heterogeneity, size and distribution of information sources, but also the danger of producing too many solutions for the same problem. Methodological, technological, infrastructural and social aspects appear to be essential for the development of a new generation of best practices and tools. In this paper, we analyse and discuss these aspects from different perspectives, extending some of the ideas that arose during the NETTAB 2012 Workshop and referring especially to the European context. First, the relevance of using data and software models for the management and analysis of biological data is stressed. Second, some of the most relevant community achievements of recent years, which should be taken as a starting point for future efforts in this research domain, are presented. Third, some of the main outstanding issues, challenges and trends are analysed. Particular attention is given to the challenges raised by the tendency to fund and create large-scale international research infrastructures and public-private partnerships in order to address the complex demands of data-intensive science. The needs and opportunities of Genomic Computing (the integration, search and display of genomic information at a very specific level, e.g. at the level of a single DNA region) are then considered. In the current data- and network-driven era, social aspects can become crucial bottlenecks. We then discuss how these may best be tackled to unleash the technical abilities needed for effective data integration and validation efforts. In particular, the apparent lack of incentives for already overwhelmed researchers appears to limit the sharing of information and knowledge with other scientists.
We also point out how the bioinformatics market is growing at an unprecedented speed, owing to the impact that powerful new in silico analyses promise to have on diagnosis, prognosis, drug discovery and treatment, on the way towards personalized medicine. Finally, we discuss an open business model for bioinformatics, which appears able to reduce undue duplication of effort and to support increased reuse of valuable data sets, tools and platforms.

Integration and Querying of Genomic and Proteomic Semantic Annotations for Biomedical Knowledge Extraction

META-BASE: A Novel Architecture for Large-Scale Genomic Metadata Integration

Semantic Health Knowledge Graph: Semantic Integration of Heterogeneous Medical Knowledge and Services

Integrated Bio-Search: challenges and trends for the integration, search and comprehensive processing of biological information

Extending traditional query-based integration approaches for functional characterization of post-genomic data

ROBOKOP KG and KGB: Integrated Knowledge Graphs from Federated Sources

An open source knowledge graph ecosystem for the life sciences

Querying semantic catalogues of biomedical databases

A concept-driven biomedical knowledge extraction and visualization framework for conceptualization of text corpora

Functional analysis of OMICs data and small molecule compounds in an integrated "knowledge-based" platform

GenoSurf: metadata driven semantic search system for integrated genomic datasets

A knowledge graph to interpret clinical proteomics data

An Information Extraction and Knowledge Graph Platform for Accelerating Biochemical Discoveries

An approach for proteins and their encoding genes synonyms integration based on protein ontology

Large-Scale Knowledge Synthesis and Complex Information Retrieval from Biomedical Documents

Application and evaluation of automated semantic annotation of gene expression experiments

Biomedical Multi-hop Question Answering Using Knowledge Graph Embeddings and Language Models

A large collection of bioinformatics question-query pairs over federated knowledge graphs: methodology and applications

Representing Semantified Biological Assays in the Open Research Knowledge Graph

Know2BIO: A Comprehensive Dual-View Benchmark for Evolving Biomedical Knowledge Graphs

Data Cleaning and Semantic Improvement in Biological Databases