Molecular recordings by directed CRISPR spacer acquisition

Seth L. Shipman, Jeff Nivala, Jeffrey D. Macklis, George M. Church
DOI: https://doi.org/10.1126/science.aaf1175
IF: 56.9
2016-07-29
Science
Abstract:

INTRODUCTION Although recent advances in DNA synthesis and sequencing technologies have made practical the writing and readout of arbitrary data in the form of synthetic DNA, still lacking are the robust tools necessary to generate a dynamic record of such information within the genomes of living cells. An in vivo system, built out of biological parts with large storage capacity, would enable the recording of defined biological events into stable genetic memory and facilitate the tracking of long molecular and cellular histories.

RATIONALE The CRISPR (clustered regularly interspaced short palindromic repeats)–Cas system is a prokaryotic type of immunological memory. Foreign DNA sequences originating from viral infections are stored within genome-based arrays in the form of short sequences, called spacers, that confer sequence-specific resistance to the invading nucleic acids. These arrays not only preserve the spacer sequences but also record the order in which the sequences are acquired, generating a temporal record of acquisition events. We harnessed this system to record arbitrary DNA sequences into a genomic CRISPR array in the form of spacers acquired from synthetic oligonucleotides electroporated into a population of cells overexpressing the CRISPR adaptation proteins Cas1 and Cas2. This enabled the recording of defined molecular events into a stable genomic locus over time and the storage of arbitrary information across a population of cells.

RESULTS We show that the Cas1-Cas2 complex can be used in vivo to integrate synthetic DNA of a defined sequence into the Escherichia coli genome. We used this feature to examine the type I-E CRISPR-Cas spacer acquisition process and optimized the synthetic spacer design to achieve higher acquisition efficiency and specific integration orientation through the addition of an AAG protospacer adjacent motif (PAM). We then generated stable genomic recordings of multiple molecular events by electroporating sets of oligonucleotides over several days. These molecular records were read out with high-throughput sequencing and then decoded with a program that identified and faithfully reconstructed the temporal event order. Last, we used directed evolution to generate many Cas1-Cas2 mutants with modified PAM specificity (PAM^NC). By modulating expression of these mutant and wild-type Cas1-Cas2 complexes, we could dynamically control the orientation of spacer integration. This enabled us to record acquisition events in multiple modes. That is, information was encoded in both the temporal order of the spacers and the orientation in which they were integrated.

CONCLUSION Our results establish a recording system that uses the nucleotide content, temporal ordering, and orientation of defined DNA sequences within a CRISPR array in order to encode arbitrary information within the genomes of a population of cells. Because information can be encoded in spacer nucleotide space (up to two bits per base) and in alternate modes, the system has the potential to record and permanently store higher capacities of information than any other synthetic biological system to date. This lays the foundation for an in vivo recording device that could be coupled with diverse molecular phenomena and used for applications that require tracing of long molecular histories. We also demonstrate that delivery of synthetic DNA substrates to a CRISPR-Cas adaptation system in vivo is a practical method to probe and adapt the system.

Two modes of encoding information into the CRISPR locus. (A) Oligonucleotides containing an AAG PAM and 32 variable bases were electroporated into cells overexpressing Cas1-Cas2 and inserted into the genomic CRISPR array. Delivery of oligos with distinct sequence over time generates a molecular record. (B) Cas1-Cas2 mutants identified through directed evolution alter the orientation of acquisition. Varying expression ratios of wild-type and mutant Cas1-Cas2 over time generates a record encoded in spacer orientation.
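The abstract does not describe the decoding program itself, so the following is only a minimal sketch of how a temporal order could be recovered from sequenced arrays, not the authors' method. It assumes that each new spacer is inserted at the leader-proximal end of the array, so within a single array a spacer nearer the leader was acquired later. The function name `infer_order`, the spacer labels, and the majority-vote rule are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def infer_order(arrays):
    """Infer an oldest-to-newest ordering of spacers (illustrative sketch).

    arrays: list of sequenced arrays, each a list of spacer IDs written
            leader-proximal first (assumption: leader-proximal = newest).
    """
    votes = defaultdict(int)  # votes[(a, b)]: arrays suggesting a is older than b
    spacers = set()
    for array in arrays:
        spacers.update(array)
        for i, j in combinations(range(len(array)), 2):
            newer, older = array[i], array[j]
            if newer != older:
                votes[(older, newer)] += 1

    # Keep only pairwise relations supported by a strict majority of arrays.
    successors = defaultdict(set)
    indegree = {s: 0 for s in spacers}
    for (older, newer), count in votes.items():
        if count > votes.get((newer, older), 0):
            successors[older].add(newer)
            indegree[newer] += 1

    # Kahn's topological sort over the majority relations.
    ready = sorted(s for s in spacers if indegree[s] == 0)
    order = []
    while ready:
        current = ready.pop(0)
        order.append(current)
        for nxt in sorted(successors[current]):
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    if len(order) != len(spacers):
        raise ValueError("pairwise evidence is cyclic; no consistent order")
    return order


# Hypothetical population: most cells carry only one or two of the spacers.
population = [["B", "A"], ["C", "B"], ["C", "A"], ["C"]]
print(infer_order(population))
```

Because individual cells typically hold only part of the record, the order is pieced together from pairwise evidence across the population. Spacers that never co-occur in any array cannot be ordered relative to each other by this sketch, and it places them arbitrarily.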
multidisciplinary sciences