Abstract: In this review, we bring together two important topics for deciphering the role and functions of microbial communities: metabolic modelling and metagenomics. We survey the methodological approaches, highlight the changes brought about by third‐generation sequencing, and provide resources to bridge the gap between sequencing reads and models. Building models is essential for understanding the functions and dynamics of microbial communities. Metabolic models built on genome‐scale metabolic network reconstructions (GENREs) are especially relevant as a means to decipher the complex interactions occurring among species. Model reconstruction increasingly relies on metagenomics, which permits direct characterisation of naturally occurring communities that may contain organisms that cannot be isolated or cultured. In this review, we provide an overview of the field of metabolic modelling and its increasing reliance on, and synergy with, metagenomics and bioinformatics. We survey the means of assigning functions and reconstructing metabolic networks from (meta‐)genomes, and present the variety and mathematical fundamentals of metabolic models that foster the understanding of microbial dynamics. We emphasise the characterisation of interactions and the scaling of model construction to large communities, two important bottlenecks in the applicability of these models. We give an overview of the current state of the art in metagenome sequencing and bioinformatics analysis, focusing on the reconstruction of genomes in microbial communities. Metagenomics benefits tremendously from third‐generation sequencing, and we discuss the opportunities of long‐read sequencing, strain‐level characterisation and eukaryotic metagenomics. We aim to provide algorithmic and mathematical support, together with tool and application resources, that help bridge the gap between metagenomics and metabolic modelling.

Deciphering enzymatic potential in metagenomic reads through DNA language model

Integrated De Novo Gene Prediction and Peptide Assembly of Metagenomic Sequencing Data

Deciphering microbial gene function using natural language processing

Decoding proteome functional information in model organisms using protein language models

Language model-guided anticipation and discovery of unknown metabolites

Community‐scale models of microbiomes: Articulating metabolic modelling and metagenome sequencing

GENA-LM: A Family of Open-Source Foundational DNA Language Models for Long Sequences

Understanding the Natural Language of DNA using Encoder-Decoder Foundation Models with Byte-level Precision

Assembling bacterial puzzles: piecing together functions into microbial pathways

GENA-Web - GENomic Annotations Web Inference using DNA language models

BEND: Benchmarking DNA Language Models on biologically meaningful tasks

Leveraging Large Language Models for Metagenomic Analysis

Genomic language model predicts protein co-regulation and function

Deciphering the Language of Protein-DNA Interactions: A Deep Learning Approach Combining Contextual Embeddings and Multi-Scale Sequence Modeling

DNABERT: pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome

Advancing Plant Metabolic Research By Using Large Language Models To Expand Databases And Extract Labelled Data

Machine Learning and Deep Learning Applications in Metagenomic Taxonomy and Functional Annotation

Deciphering the Biosynthetic Potential of Microbial Genomes Using a BGC Language Processing Neural Network Model

Metabolic Network Analysis Integrated with Transcript Verification for Sequenced Genomes

Nucleotide dependency analysis of DNA language models reveals genomic functional elements

Metaproteomics beyond databases: addressing the challenges and potentials of de novo sequencing