
"The principle of pasta": scientists have organized random access to DNA memory

Scientists from the University of Illinois at Urbana-Champaign have implemented DNA storage with random data access. Below we describe their approach and explain what DNA strands have in common with pasta.


/ photo by Tim Sackton CC

How it works


To store data in DNA, scientists convert binary code into a sequence of four nitrogenous bases: adenine (A), thymine (T), guanine (G), and cytosine (C). The sequence is then synthesized as short chains that overlap one another. For example, if the chains are one hundred base pairs long, the last 75 pairs of one chain are the first 75 pairs of the next.

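
The encoding step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two-bits-per-base mapping (00 to A, 01 to C, 10 to G, 11 to T) and the function names are hypothetical, since the article does not say which mapping the researchers use. Only the fragment length of 100 and the overlap of 75 come from the example above.

```python
BASES = "ACGT"  # assumed mapping: 00 -> A, 01 -> C, 10 -> G, 11 -> T


def bytes_to_bases(data: bytes) -> str:
    """Encode each byte as four bases, most significant bit pair first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def bases_to_bytes(seq: str) -> bytes:
    """Inverse of bytes_to_bases; the length must be a multiple of four."""
    if len(seq) % 4:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


def overlapping_fragments(seq: str, length: int = 100, overlap: int = 75) -> list:
    """Split seq into chains where each one repeats the tail of the previous one."""
    step = length - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than length")
    if len(seq) <= length:
        return [seq]
    starts = list(range(0, len(seq) - length + 1, step))
    fragments = [seq[s:s + length] for s in starts]
    # If the regular steps do not reach the end, add one chain aligned to the end.
    if starts[-1] + length < len(seq):
        fragments.append(seq[-length:])
    return fragments


print(bytes_to_bases(b"Hi"))  # CAGACGGC
chains = overlapping_fragments(bytes_to_bases(bytes(range(50))))
print(len(chains), chains[0][25:] == chains[1][:75])  # 5 True
```

With these defaults every chain starts 25 bases after the previous one, so most bases end up in four different chains.
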
For this reason, accessing an arbitrary “point in memory” is difficult: the entire body of data has to be decoded to obtain a single file. Picking the right molecule out of the many in the pool is like trying to fish one specific piece of macaroni out of a bowl of soup. The chance of grabbing it at random is small.

However, scientists from the University of Illinois found a solution. If you copy the same piece of macaroni again and again until the bowl is full of it, whichever piece you grab will be the one you need. They therefore decided to synthesize the encoded chains with additional sequences that act as addresses.

Primers use these addresses to identify the DNA strands that need to be replicated. As a result, the scientists were able to find and copy the DNA chains holding the required data using the polymerase chain reaction (PCR). This makes it much easier to find a copy of the desired chain.
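
The effect of address-based replication can be shown with a toy model. It is a deliberately simplified sketch, not a simulation of real PCR chemistry: it assumes the address is a plain prefix of the strand and that every cycle exactly doubles the matching strands. The function names and the sample sequences are made up for illustration.

```python
from collections import Counter


def make_strand(address: str, payload: str) -> str:
    """Prepend an address sequence to the data-carrying part of a strand."""
    return address + payload


def amplify(pool: Counter, address: str, cycles: int) -> Counter:
    """Double the copy count of every strand with the given address, once per cycle."""
    amplified = Counter()
    for strand, copies in pool.items():
        if strand.startswith(address):
            amplified[strand] = copies * 2 ** cycles
        else:
            amplified[strand] = copies
    return amplified


def fraction_with_address(pool: Counter, address: str) -> float:
    """Chance that a strand drawn at random from the pool carries the address."""
    total = sum(pool.values())
    matching = sum(c for s, c in pool.items() if s.startswith(address))
    return matching / total


pool = Counter({
    make_strand("ACGT", "TTTT"): 1,
    make_strand("GGCC", "AAAA"): 1,
    make_strand("TATA", "CCCC"): 1,
})
print(fraction_with_address(pool, "ACGT"))  # one strand in three
after = amplify(pool, "ACGT", cycles=10)
print(after["ACGTTTTT"], fraction_with_address(after, "ACGT"))  # 1024 copies, over 99% of the pool
```

After ten cycles nearly every strand in the pool is a copy of the addressed one, which is the "full bowl of the same macaroni" from the analogy above.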

The scientists still have a number of difficulties to overcome. Some of them stem from the sequencers, which are prone to substitution errors: it proved difficult to reassemble the genome after it had been broken into separate fragments for reading without mixing up the segments. Work is therefore under way on error-correcting codes.
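
To give a feel for how redundancy helps against substitution errors, here is a minimal sketch of a per-position majority vote over several reads of the same strand. This is a stand-in technique chosen for simplicity, not the error-correcting codes the researchers are developing, and it assumes the reads are already aligned and of equal length. The function name and sample reads are hypothetical.

```python
from collections import Counter


def consensus(reads: list) -> str:
    """Return the most common base at each position across aligned reads."""
    if not reads:
        raise ValueError("at least one read is required")
    length = len(reads[0])
    if any(len(r) != length for r in reads):
        raise ValueError("all reads must have the same length")
    # zip(*reads) yields one column of bases per position.
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*reads))


reads = [
    "ACGTAC",
    "ACGTTC",  # substitution at position 4
    "AGGTAC",  # substitution at position 1
]
print(consensus(reads))  # ACGTAC
```

A vote like this only corrects a position when most reads agree on it, which is one reason dedicated codes are needed in practice.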

Another difficulty is that DNA synthesis companies are not yet ready to switch to new methods: their production processes are automated, and retooling them to generate the additional chains is too expensive. The researchers therefore still have work to do to bring down the cost of every operation.


/ photo by University of Michigan CC

Who else is involved in DNA storage


Microsoft, together with the University of Washington, is also working on DNA storage with random access (we wrote about this in one of our previous posts). At the beginning of the year, they encoded and accurately recovered more than 400 MB of data. The storage capacity is planned to grow to 1 TB and beyond, and Microsoft even plans to add DNA storage to its cloud platform.

Scientists from Harvard are also working in this area. They managed to record an animation of a rider on a horse and Shakespeare's sonnets in a bacterium, and one of the researchers, George Church, immortalized his book Regenesis in DNA (he created 90 billion copies of it).

To write and read the information, the biologists used the CRISPR system. This is a natural defense mechanism with which bacteria build immunity to invading viruses. Bacteria capture fragments of viral DNA, generate so-called spacers from them, and insert these into their own CRISPR locus. The scientists encoded the desired information into spacers and delivered it to the bacteria disguised as viral DNA.

When will the DNA future come?


Despite the success of all the experiments mentioned, scientists cannot yet put the technology into mass production, mainly because of its high cost. It is therefore too early to talk about it reaching the wider market.

However, there are already commercial companies that offer customers the service of recording information in DNA. For example, the startup Twist Bioscience "preserved" 12 megabytes of customer data for 100 thousand dollars. At the same time, the company's management predicts that in a couple of years the cost of recording will drop to just 10 cents.

At Twist Bioscience, DNA chains are synthesized by a special machine that resembles an inkjet printer. It deposits A, T, G, and C molecules into 9.6 thousand "nanowells", each about the diameter of a human hair. These tiny wells sit on a black silicon plate about the size of a postcard. In total, Twist Bioscience synthesizes about 3 million chains per day.

Another example: the British trip-hop group Massive Attack decided to store its third studio album, Mezzanine, in DNA to mark the album's twentieth anniversary. The encoding will be carried out by a laboratory in Zurich, and the result is expected in a month.




Source: https://habr.com/ru/post/359040/

