Combinatorial Design Testing in Genomes with POLAR-seq

Klaudia Ciurkot, Xinyu Lu, Anastasiya Malyshava, Livia Soro, Aidan Lees, Thomas E Gorochowski, Tom Ellis
DOI: https://doi.org/10.1101/2024.06.06.597521
2024-06-06
Abstract: Synthetic biology projects increasingly use modular DNA assembly or synthetic in vivo recombination to generate diverse combinatorial libraries of genetic constructs for testing. But as these designs expand to multigene systems, it becomes challenging to sequence them in a cost-effective way that reveals the genotype-to-phenotype relationships in the libraries. Here, we introduce a quick, low-cost method for assessing combinatorial designs of genome-integrated multigene constructs that we call Pool of Long Amplified Reads (POLAR) sequencing. POLAR-seq takes genomic DNA isolated from library pools and uses long-range PCR to amplify target genomic regions up to 35 kb long containing combinatorial designs. The pool of long amplicons is then read directly by nanopore sequencing. Full-length reads are used to identify the gene content and structural variation of individual genotypes in the library, and read counts indicate how abundant each genotype is within the pool. Using yeast cells with loxP-containing synthetic gene clusters that rearrange in vivo in the presence of Cre recombinase, we demonstrate how POLAR-seq can be used to identify global patterns from combinatorial experiments, find the most abundant genotypes in a pool, and also be adapted to sequence-verify gene clusters from isolated strains.
Synthetic Biology
What problem does this paper attempt to address?
### The Problem Addressed by the Paper

The paper addresses the rapid identification of genotypes in combinatorial libraries in synthetic biology. When researchers build multigene combinatorial libraries in cells, existing approaches such as Sanger sequencing or short-read sequencing struggle to identify the genotypes and structural variations in these libraries efficiently and economically. The problem is most pronounced when the constructs are long multigene systems integrated into the host genome.

The paper proposes a new method, Pool of Long Amplified Reads sequencing (POLAR-seq), for rapid, low-cost identification of multigene combinatorial libraries integrated into the genome. POLAR-seq uses long-range PCR to amplify target regions of up to 35 kb and nanopore sequencing to obtain full-length reads. The reads reveal the gene content and structure of each genotype in the library, and read counts indicate its abundance in the pool. The method is demonstrated on gene rearrangement experiments in yeast cells, where it identifies structural variations such as gene deletions, duplications, transpositions, and inversions.
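The read-counting idea described above can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes each full-length read has already been annotated upstream (for example by aligning it against the individual gene sequences) into an ordered tuple of `(gene, strand)` pairs. The gene names, the `REFERENCE` layout, the label names, and the helpers `genotype_abundance` and `classify` are all hypothetical and exist only for illustration.

```python
from collections import Counter

# Hypothetical parental cluster: three genes, all on the forward strand.
REFERENCE = (("geneA", "+"), ("geneB", "+"), ("geneC", "+"))


def genotype_abundance(reads):
    """Count identical genotypes and return (genotype, count, fraction),
    most abundant first. Each read is an ordered sequence of (gene, strand)."""
    counts = Counter(tuple(read) for read in reads)
    total = sum(counts.values())
    return [(g, n, n / total) for g, n in counts.most_common()]


def classify(genotype, reference=REFERENCE):
    """Label a genotype relative to the reference gene order (coarse labels)."""
    genotype = tuple(genotype)
    if genotype == tuple(reference):
        return {"parental"}
    labels = set()
    ref_strand = dict(reference)
    gene_counts = Counter(gene for gene, _ in genotype)
    # A reference gene that never appears in the read was lost.
    if any(gene not in gene_counts for gene, _ in reference):
        labels.add("deletion")
    # Any gene seen more than once was copied.
    if any(n > 1 for n in gene_counts.values()):
        labels.add("duplication")
    # A gene on the opposite strand to the reference was flipped.
    if any(strand != ref_strand.get(gene, strand) for gene, strand in genotype):
        labels.add("inversion")
    # Same genes and strands but a different order.
    return labels or {"reordering"}


# Toy pool of five annotated reads (made-up data, not from the paper).
POOL = [
    REFERENCE,
    REFERENCE,
    (("geneA", "+"), ("geneC", "+")),
    (("geneA", "+"), ("geneB", "-"), ("geneC", "+")),
    (("geneA", "+"), ("geneB", "+"), ("geneB", "+"), ("geneC", "+")),
]

for genotype, count, fraction in genotype_abundance(POOL):
    print(count, f"{fraction:.2f}", sorted(classify(genotype)), genotype)
```

Treating each distinct gene order as a dictionary key keeps the tally simple, but it relies on the annotation step being accurate. In practice, nanopore read errors and truncated amplicons would need filtering before counting, which this sketch does not attempt.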