
Information on the threshold of immortality, part 2: DNA storage

Imagine being able to store every bit of information in something as small as a microscopic drop of blood. Sounds strange? Yet this is exactly what scientists have long been striving for: storing data in DNA. The 5D storage we recently wrote about does not even come close! What is this new and unusual type of storage? How does it work? And, most importantly, what is its potential?



DNA (deoxyribonucleic acid) is a biological polymer that carries the genetic information governing the development and functioning of an organism. Almost all living things on Earth store their genetic information in the form of DNA. DNA consists of monomers (nucleotides), each made up of a sugar (deoxyribose), a nitrogenous base (adenine, guanine, cytosine or thymine) and a phosphate group. The order of the nucleotides, and therefore of their bases, spells out the genetic code that regulates all biological processes.


The composition and structure of the DNA molecule
Many properties of DNA are already known, and DNA manipulation is widely used in research and medicine: genetically modified organisms, vaccines, HIV treatment, gene therapy. But DNA is not only a biological molecule: at its core it is a chemical polymer with physical properties that could not escape the attention of chemists, physicists and engineers. DNA can stabilize nanostructures, store terabytes of information, and become an important component of the new information age.

DNA is practically the only storage medium that can last for thousands or even a million years, and the DNA of ancient fossils is proof of that. The molecule is stable in the environment: scientists have even recovered and sequenced 300,000-year-old mitochondrial DNA from a bear. Imagine that all of human history, in the form of text, images, video and audio, could be artificially encoded in DNA and preserved for our descendants for thousands of years.

In addition, conventional information processing is based on binary code (1 or 0). DNA has greater potential for data storage, since each position in the chain holds one of four letters (A, G, T, C) and can therefore encode two bits rather than one. Synthesized DNA molecules with a specific nucleotide sequence can hold up to 1 ZB (a billion terabytes) of information in just a few grams of DNA.
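To make the four-letter idea concrete, here is a minimal sketch of the simplest possible scheme: every pair of bits becomes one nucleotide, so one byte becomes four letters. The particular mapping (00 to A, 01 to C, 10 to G, 11 to T) and the function names are our own assumptions for illustration; the encodings used in real experiments are more elaborate, for instance to avoid long runs of the same letter.

```python
# Illustrative mapping only: which base stands for which bit pair
# is an arbitrary assumption, not a standard.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a nucleotide string, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Turn a nucleotide string produced by encode() back into bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    strand = encode(b"DNA")
    print(strand)          # four letters per input byte
    print(decode(strand))  # round-trips to the original bytes
```

Each byte always yields exactly four letters, which is where the "two bits per nucleotide" figure comes from.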


The process of writing data to DNA

According to some estimates, DNA will be able to accommodate the amount of data contained in one hundred industrial data centers and store it in a space the size of a shoebox.

DNA achieves this in two ways. First, the coding units are very small, less than half a nanometer in size, whereas a transistor in modern computer storage is no smaller than 10 nanometers. Second, this difference in size increases storage capacity not tenfold but by a factor of a thousand or even 100,000, thanks to another great advantage of DNA: it packs data in three dimensions.
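A rough back-of-the-envelope check shows why the gain is so much larger than the ratio of sizes alone. The sketch below takes the two figures quoted above (0.5 nm and 10 nm) and raises their ratio to the number of dimensions in which units can be packed. The function name is ours, and the calculation ignores everything except geometry, so treat it as an order-of-magnitude illustration only.

```python
def density_gain(small_unit_nm: float, large_unit_nm: float, dimensions: int) -> float:
    """How many more units fit when packing in the given number of dimensions."""
    return (large_unit_nm / small_unit_nm) ** dimensions


if __name__ == "__main__":
    for dims in (1, 2, 3):
        print(dims, density_gain(0.5, 10.0, dims))
```

Along a single line the gain is only 20 times, but in three dimensions the same ratio gives 8,000 times, which falls within the range of a thousand to 100,000 mentioned above.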

A couple of flies in the ointment


Of course, using DNA as a storage device has its limitations. For example, synthesizing long DNA sequences takes a very long time, the cost of these manipulations is relatively high, and the probability of error during chemical synthesis of DNA is high as well. But scientists are on their way to overcoming these difficulties. First, thousands of short DNA molecules (up to 200 nucleotides) are used to store information instead of one or a few very long polymers. Second, the cost of DNA synthesis and sequencing is falling exponentially from year to year: encoding one megabyte costs about $500 (a third of what it cost two years ago), and reading it back costs about $200.
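The idea of using many short molecules instead of one long one can be sketched as follows. Each fragment carries a small index so that the pieces can be put back in order no matter how they are mixed in the test tube. The 200-nucleotide limit comes from the text above; the 8-letter index, the base-4 numbering and all function names are hypothetical choices made for this illustration, not a description of any published scheme.

```python
BASES = "ACGT"
FRAGMENT_LEN = 200                      # upper bound quoted in the text
INDEX_LEN = 8                           # hypothetical: 4**8 = 65,536 addressable fragments
PAYLOAD_LEN = FRAGMENT_LEN - INDEX_LEN  # letters left for actual data


def index_to_bases(n: int) -> str:
    """Write a fragment number as a fixed-width base-4 string of letters."""
    if not 0 <= n < 4 ** INDEX_LEN:
        raise ValueError("index does not fit into the index field")
    digits = []
    for _ in range(INDEX_LEN):
        n, remainder = divmod(n, 4)
        digits.append(BASES[remainder])
    return "".join(reversed(digits))


def bases_to_index(letters: str) -> int:
    """Inverse of index_to_bases()."""
    n = 0
    for letter in letters:
        n = n * 4 + BASES.index(letter)
    return n


def split(sequence: str) -> list[str]:
    """Cut a long sequence into indexed fragments of at most FRAGMENT_LEN letters."""
    return [
        index_to_bases(start // PAYLOAD_LEN) + sequence[start:start + PAYLOAD_LEN]
        for start in range(0, len(sequence), PAYLOAD_LEN)
    ]


def reassemble(fragments: list[str]) -> str:
    """Sort fragments by their index and join the payloads."""
    ordered = sorted(fragments, key=lambda f: bases_to_index(f[:INDEX_LEN]))
    return "".join(f[INDEX_LEN:] for f in ordered)


if __name__ == "__main__":
    original = "ACGT" * 250             # a 1,000-letter toy sequence
    pieces = split(original)
    print(len(pieces), max(len(p) for p in pieces))
    print(reassemble(list(reversed(pieces))) == original)
```

Because every fragment is self-describing, the order in which the molecules are synthesized, stored or read does not matter.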


Increase in sequencing speed over time

In 2013, a team from the European Bioinformatics Institute reported successfully recording 739 kilobytes of data in DNA, including a color image, all 154 of Shakespeare's sonnets, and an excerpt from Martin Luther King's speech "I Have a Dream". More recently, scientists from the Institute for Chemical and Bioengineering at ETH Zurich developed an improved method for encoding data and correcting errors during DNA sequencing, and also increased storage efficiency: a DNA molecule encased in a silica glass shell can now be stored for up to 1,000,000 years at -18 °C.

DNA = RAM?


What makes DNA data storage unique, besides the medium itself, is that it does not work like a hard disk but more closely resembles a computer's RAM: it does not matter where in the DNA chain the data is stored, because you can retrieve it from anywhere.


Here it is worth noting the extremely efficient way DNA is stored, starting with the double helix. Chromatin, the complex of DNA and proteins that makes up the chromosomes, is essentially a very sophisticated mechanism that lets the DNA molecule coil up tightly while still unwinding quickly when the body urgently needs certain parts of the DNA.

The process of extracting a specific section of DNA

This natural chromatin-based system, which allows any gene to be extracted from any part of the genome with approximately the same efficiency, led researchers to compare DNA to a kind of computer random access memory, or RAM. Unfortunately, the similarity ends here, and the main drawback appears: speed. DNA can be stored almost forever, but downloading files from it would take years.

DNA is much harder and slower to read than ordinary computer transistors. In terms of access speed, it is even less like computer memory than a flash drive or hard disk is.

This is because the incredible abilities of a data storage solution shaped by evolution do not necessarily include instant reading of information. To read from a DNA molecule, it is necessary to unravel the complex structure of chromatin, then unwind the double helix of the DNA itself, make a copy of the sequence and pack everything back. Clearly, this takes a lot of time.

Reading the data requires an extra step. It relies on an old-fashioned biotech lab technique called the polymerase chain reaction (PCR) to amplify, that is, to copy many times over, the sequence we want to read. The entire sample is then sequenced, and every sequence that does not occur many times is discarded: what remains is the sequence of interest. To point PCR at the right place, the DNA segments are labeled with small target sequences that allow the replication process to begin.
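The logic of this read-out can be imitated in a few lines. In the toy model below, a "pool" is a list of strands, each beginning with a short tag; "amplification" copies only the strands whose tag matches the chosen primer, and "sequencing" keeps only the strands that turn up many times. All names, tags and numbers are made up for the illustration, and real PCR works on complementary strands with primers at both ends, so this is a sketch of the selection principle rather than of the chemistry.

```python
from collections import Counter


def amplify(pool: list[str], primer: str, cycles: int = 10) -> list[str]:
    """Toy PCR: strands starting with the primer double every cycle, others stay single."""
    reads = []
    for strand in pool:
        copies = 2 ** cycles if strand.startswith(primer) else 1
        reads.extend([strand] * copies)
    return reads


def select_abundant(reads: list[str], min_count: int) -> list[str]:
    """Toy sequencing filter: keep only strands seen at least min_count times."""
    counts = Counter(reads)
    return sorted(strand for strand, count in counts.items() if count >= min_count)


if __name__ == "__main__":
    # Hypothetical pool: the first four letters of each strand act as its tag.
    pool = ["ACGTTTGACC", "ACGTGGCATA", "TTAGCCGATA", "GGCAATCGTA"]
    reads = amplify(pool, primer="ACGT")
    print(len(reads))
    print(select_abundant(reads, min_count=100))
```

Changing the primer selects a different subset of the same pool, which is what gives DNA storage its RAM-like random access.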


DNA digestion for reading information

In cells, genes are turned on and off by changing the availability of these target sequences. This can be done by winding and unwinding chromatin, by directly adding or removing a protein blocker, or even through interaction with other regions of the genome. In theory, this process could be made far more efficient, but that would require a level of sophistication in protein engineering that has not yet been achieved.

Source: https://habr.com/ru/post/317110/

