Introduction
I think every Habr reader is familiar with humanity's progress in microelectronics; the overwhelming majority also follow the conquest of space, and a considerable share follow physics. But almost no one knows that a revolution is under way in biology right now, one that will change our lives over the next few decades no less than the spread of computers did. Moreover, this revolution is directly tied to progress in building powerful computing systems. Of course, some ripples do spread outward. But not everyone can connect the media hysteria about GMOs, the word “recombinant” on a vial of interferon or insulin, and the vague (in Russia) rumors about something called 23andme. In fact, all these phenomena hang on a single thread, and it is best to unravel that thread from the very beginning.
Mendel and genes
It is probably worth starting with the research of the monk Gregor Mendel, unknown to anyone in his own time (the 19th century); it is only now that every school teaches about him. He noticed strange patterns in how pea plants inherit flower color. To explain them, he introduced the concept of what we now call a gene: a unit of hereditary information that shows up as a set of external features (the phenotype). Genes are inherited in pairs; the two genes in a pair may encode different variants of the same trait (alleles), and which variant is actually expressed is determined by the properties of the gene itself. This theory explained the observed features of inheritance very well. So well that no one was surprised when a physical mechanism was later discovered that produces precisely these laws of inheritance.
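To make the "inherited in pairs" idea concrete, here is a minimal sketch of a textbook single-gene cross. The names `cross` and `phenotype` and the convention "uppercase letter = dominant variant" are my own assumptions for illustration, not anything from Mendel's notes.

```python
from collections import Counter
from itertools import product


def cross(parent1, parent2):
    """Count offspring genotypes for two parents.

    Each parent is a two-letter string such as "Aa": one letter per
    inherited copy of the gene. Each parent passes on one copy, and all
    four combinations are treated as equally likely.
    """
    offspring = Counter()
    for a, b in product(parent1, parent2):
        # Sort so that "aA" and "Aa" are counted as the same genotype
        # (uppercase letters sort before lowercase ones).
        offspring["".join(sorted((a, b)))] += 1
    return offspring


def phenotype(genotype):
    """Assumed rule: one dominant (uppercase) copy is enough to show the trait."""
    return "dominant" if any(c.isupper() for c in genotype) else "recessive"


genotypes = cross("Aa", "Aa")
phenotypes = Counter()
for g, n in genotypes.items():
    phenotypes[phenotype(g)] += n

print(dict(genotypes))   # genotype counts out of 4 combinations
print(dict(phenotypes))  # phenotype counts out of 4 combinations
```

Crossing two "Aa" parents gives one "AA", two "Aa" and one "aa", that is, three offspring showing the dominant trait for every one showing the recessive trait.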
DNA
The next milestone is the discovery of the structure of DNA by James Watson and Francis Crick in 1953. Of course, even before them it was known that some substance is concentrated in the cell nucleus, where it forms chromatin (tangled threads visible under a light microscope after special staining) and, during cell division, chromosomes (much more clearly visible, compact structures). But it was Watson and Crick who showed what this substance is and how it is built.
It is worth making a small digression to recall what DNA, proteins and RNA are. DNA is an organic polymer: a substance composed of a large number of fairly simple "bricks". In the case of DNA, the bricks are the nucleotides thymine, adenine, guanine and cytosine, usually denoted by the letters T, A, G and C, respectively. DNA in a cell is a double helix (see picture) consisting of two chains of complementary nucleotides: nucleotides bond with each other in fixed pairs (T with A, G with C). This complementarity makes single strands of DNA floating in a solution find their “partners” and join with them. Such self-organization is extremely characteristic of biochemical processes in general. RNA is quite similar to DNA but usually does not form a double chain, which makes it less stable but more mobile and chemically active. Proteins are polymers of amino acids. Usually they are built from an RNA template, which in turn is built from DNA.
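Complementarity is easy to play with in code. Below is a small sketch; the function names are mine. It also relies on two textbook facts the text above does not spell out: the two strands of the helix run in opposite directions (hence the "reverse" complement), and RNA uses the letter U (uracil) where DNA has T.

```python
# Pairing rule from the text: A pairs with T, G pairs with C.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(strand):
    """Letter-by-letter partner of each nucleotide in the strand."""
    return strand.translate(COMPLEMENT)


def reverse_complement(strand):
    """The partner strand, read in its own direction (strands are antiparallel)."""
    return complement(strand)[::-1]


def transcribe(coding_strand):
    """RNA copy of a DNA coding strand: same letters, with U in place of T."""
    return coding_strand.replace("T", "U")


strand = "ATGCGT"
print(complement(strand))
print(reverse_complement(strand))
print(transcribe(strand))
```

Taking the reverse complement twice returns the original strand, which is exactly why either strand is enough to rebuild the other.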
Genetic Engineering
After the structure of DNA was discovered, further discoveries were not long in coming. A “genetic engineer's toolkit” gradually took shape, made up of proteins that are found mainly in bacteria and viruses and that perform various operations on DNA:
- restriction enzymes (restrictases), which cut DNA at precisely defined sites;
- ligases, which "glue" DNA back together;
- transcriptases, which copy information from DNA to RNA and, in the case of reverse transcriptases, from RNA back to DNA;
- and many other proteins.
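As a toy illustration of the first two tools in the list, here is a sketch of "cutting" and "gluing" on a plain string. It is heavily simplified: real DNA is double-stranded and real cuts can leave overhanging ("sticky") ends, which this ignores. The function names are hypothetical. The default site is that of the well-known enzyme EcoRI, which recognizes GAATTC and cuts right after the G.

```python
def digest(dna, site="GAATTC", cut_offset=1):
    """Cut dna at every occurrence of site, cut_offset letters into the site.

    Returns the list of fragments in their original order.
    """
    fragments = []
    start = 0
    pos = dna.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(dna[start:cut])
        start = cut
        pos = dna.find(site, pos + 1)
    fragments.append(dna[start:])
    return fragments


def ligate(fragments):
    """Glue fragments back together end to end (toy model of a ligase)."""
    return "".join(fragments)


dna = "AAGAATTCCGGGAATTCTT"
pieces = digest(dna)
print(pieces)
print(ligate(pieces) == dna)
```

Gluing the pieces back in a different order, or splicing in a fragment from another organism, is the essence of the recombinant DNA mentioned in the introduction.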
The availability of such a toolkit, along with ever-improving instruments, has let biologists understand the structure of living things far more deeply than was ever possible before. In effect, the gulf that for centuries lay between chemistry (which we can study in vitro) and biology (which studies living organisms) has begun to narrow. The first sign of this convergence was a clear understanding of how cells synthesize proteins, which allowed us to create transgenic bacteria and plants whose cells synthesize the proteins we need in addition to those required for their own survival. Even this rather crude kind of modification has given humanity:
- Insulin and interferon. Previously, obtaining these proteins required processing tons of biological raw material (pig pancreases, in the case of insulin); now bacteria produce them from the simplest ingredients, such as a glucose solution.
- Glyphosate (also known as Roundup). Compared with other herbicides, glyphosate is practically harmless to animals (the lethal dose for an adult human is measured in hundreds of grams), yet it kills all plants except those genetically modified to resist it. In other words, it solves the weed problem without interfering with the growth of the crop. In fact, the term “genetically modified soy” almost always means precisely soy modified for resistance to Roundup. In light of this, opponents of GMOs look particularly funny: they are effectively calling for the use of herbicides that really are dangerous to people, because there is no alternative, and without herbicides it is impossible to feed all 6.5 billion of us.
Bioinformatics
The improvement of biological tools brought new problems: the staggering complexity of the processes inside a living cell became obvious, along with a volume of information that no single person could analyze “by hand”. By that time, however, computers were already widespread enough to be the obvious answer to the question of how to tame complexity. This is how bioinformatics was born: a science designed to cope with huge amounts of data, riddled with internal dependencies and non-obvious patterns more than anything humanity had known before. The problems it solves are extremely new, complex and relevant for humanity, and it is this field that now attracts the minds of the world's strongest algorithm designers.
Practically all of these problems arise because the available computing power lags several orders of magnitude behind what direct simulation of the processes in a cell would require. The steady stream of news about ever more powerful supercomputers can create the impression that computing power is not a problem and that the hardest task in modern IT is building a fault-tolerant chat service for a couple of million people. Well, that is not true :)
What does bioinformatics do?
- Sequence assembly. It is now possible to “take apart” a protein or a piece of DNA and determine the sequence of nucleotides or amino acids it is built from (this process is called sequencing). Unfortunately, because of technical limitations, the longest section that can be read “in one go” is orders of magnitude shorter than the whole human genome; moreover, the ends of the sections are not precisely fixed, and the sections themselves can be duplicated. The genome therefore has to be assembled from small sections, using heuristics to guess how exactly the “tails” of these sections overlap and in what order they should be placed. To appreciate the computational complexity of the problem, note that the full human genome written out as a string takes about 3 gigabytes, while the total length of the sections can exceed the length of the final sequence tenfold.
- Sequence comparison and alignment. Twenty years ago even comparing two sequences could be a problem; now, thanks to faster computers and better algorithms, it is no longer a difficult task. However, the problem of comparing many sequences at once has become extremely pressing. Metagenomics, for example, places particularly high demands on comparison. It is a method for classifying bacteria (many species are indistinguishable under a microscope) that extracts a mixture of short DNA sections from a mass of bacteria, sequences them, and then compares, purely in software, the resulting "soup" of random DNA sections from different bacteria against the bacterial genomes that are already known.
- Folding. The function of proteins (which are the best catalysts known to mankind, capable of driving virtually any reaction) is determined by their three-dimensional structure. That structure is determined by the sequence of amino acids in the protein and by some of the operations the cell performs on the protein after it is synthesized. Naturally, in order to design proteins with given properties, it is crucially important to be able to predict the three-dimensional structure from the linear sequence of amino acids. Molecular dynamics methods (direct simulation of the atoms that make up a molecule) break down at the number of atoms typical of proteins (tens of thousands). One example of how complex protein structure can be is so-called protein self-splicing, in which an already synthesized protein, purely by virtue of its own structure, cuts a section out of its sequence and “glues” the cut back together, forming two different proteins.
- Docking. Obviously, there are a great many proteins, and they interact with each other. Docking is the process by which proteins interact with each other and with simpler molecules. Computing this interaction includes folding as an integral part but is not limited to it.
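To give a feel for the first task in the list, here is a minimal sketch of assembling a sequence from overlapping sections. It uses a greedy heuristic (repeatedly merge the two sections with the longest overlap), which I picked for brevity; I am not claiming this is what real assemblers do, and the function names and the tiny example sections are made up. It also assumes error-free sections and does not handle a section fully contained inside another one.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b.

    Overlaps shorter than min_len are ignored and reported as 0.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Merge the pair with the longest overlap until no overlaps remain."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # nothing overlaps any more; return separate pieces
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


reads = ["ATTAGACCTG", "CCTGCCGGAA", "AGACCTGCCG", "GCCGGAATAC", "CCTGCCGGAA"]
assembled = greedy_assemble(reads)
print(assembled)
```

Even this toy version compares every pair of sections on every round. With billions of letters, tenfold coverage and sequencing errors, it becomes clear why the problem needs both cleverer algorithms and serious hardware.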
Why all this? To better understand how life works, of course. Biology has already left behind the stage when there was some kind of mystery or mysticism in the manifestations of life; no one is searching for a black cat in a dark room any more. There is only a giant mountain range whose peaks science climbs, gritting its teeth, only to see even more impressive peaks beyond. But there is no mystery in them: all the processes that take place in cells proceed just as well in a test tube.
Which peaks is biology leading us to?
As I noted in the introduction, our lives cannot help but change over the next couple of decades. We have learned too much about life for this knowledge to stay barren. Last year, for example, an event took place that is comparable to the Moon landing but attracted far less attention: the creation of the first artificial organism. Hey, religions of the world! We humans have created life! No God was needed for this! What comes next?
- Cancer and AIDS drugs. Keep in mind that about 5-10 years of testing and certification pass between a drug appearing in the laboratory and reaching the market. Laboratories already have drugs that can cure certain types of cancer with a single injection (publication). No chemotherapy, hair loss, radiation, surgery or weight loss. Just one injection. Could we even have hoped for this 10 years ago?
- Real biotechnology. Plants that need neither herbicides nor pesticides; bananas that ripen in the tundra; grass that can rival asphalt in durability. All of this is possible and is becoming reality right before our eyes.
- Real genetic diagnostics. Already today a startup (although can the company of Sergey Brin's wife really be called a startup? :)), 23andme.com, tests for dozens of hereditary diseases and predispositions (which may only “fire” at the age of 50) for $200. And this is just the beginning.
- Slowing down aging. Armed with new knowledge, a large number of scientists are studying the aging process. Given the explosive growth of biology over the last 10-20 years, many of us have a good chance of not dying in the next 100 years :)
Conclusion
I hope this brief excursion into modern biology was interesting. If anything caught your interest, I can try to write a follow-up on the topics you would like to hear about. In conclusion, I will allow myself a small moral: do not believe charlatans who claim to know better how a human being works and who talk about all kinds of “undetectable energies”, the “soul” and so on. Modern biology has left almost no truly dark corners in living matter, and the likelihood that explaining some effect will require physics unknown to science is vanishingly small. Living matter is extremely complex, but this complexity is simple at its foundations and rests on reactions that anyone can repeat in a test tube without expensive reagents and equipment. I would also like to say a special thank you to vyahhi and to all the people who made the wonderful bioinformatics courses at SPbAU possible. You are very, very cool.