Unbreakable code exists

Computer scientists can prove that there are no errors in a program with the same certainty that mathematics can prove theorems. These achievements are used to increase security in various areas, from UAVs to the Internet.

In the spring of 2015, a hacker team tried to crack an unmanned military helicopter called the Little Bird . The helicopter, similar to the manned version of the beloved US special forces aircraft, was located on the territory of the Boeing company in Arizona. The hackers had a head start: at the beginning of their work, they had access to one of the subsystems of the control computer. They could only hack the main on-board flight computer and get control of the UAV.

At the very beginning of the project, the Red Team hacker team could hack the helicopter system almost as easy as your home WiFi. In the following months, DARPA engineers applied a new security mechanism, a software system not subject to enforced expropriation. The key parameters of the Little Bird system cannot be cracked with existing technologies, and its code can be trusted as mathematical evidence. And although the team was given six weeks of access to the UAV, and access to the computer system exceeded the access capabilities of the real hacker team, they could not crack the Birds protection.
')
“They weren’t able to hack or disrupt the operation,” said Kathleen Fisher , a computer science professor at Tufts University and project manager for High-Assurance Cyber Military Systems (HACMS). “As a result, DARPA representatives stood up and said, oh my goodness, you can actually use this technology in important systems.”

A technology opposed to hackers is a programming style called formal confirmation . Unlike ordinary code, written informally and usually evaluated only by its performance, formally confirmed software looks like a mathematical proof: each statement follows logically from the previous one. The entire program can be tested with the same confidence as the mathematical theorem.

“You write a mathematical formula that describes the behavior of the program and, using a tool to verify the evidence, which verifies the correctness of this statement,” says Brian Parno , who studies formal confirmation and security at Microsoft Research.

The desire to create formally confirmed programs existed almost as long as the field of informatics itself. And it remained impossible for a very long time, but the achievements of the past ten years in the field of "formal methods" pushed this approach closer to common practice. Today, formal confirmation of software is studied in academic groups receiving funding, in US military projects and technology companies such as Microsoft and Amazon.

Interest grows with an increase in the number of vital tasks going through online. When computers existed in isolation in homes and offices, programming errors were only an inconvenience. Now they are opening security holes on the machines on the network, allowing any knowledgeable people to penetrate computer systems.

“In the twentieth century, when the program had a mistake, it was bad, it could fall, well, God bless her,” said Andrew Appel , a computer science professor at Princeton University and a leader in program confirmation. But in the 21st century, a mistake could open up “to hackers to gain control of the program and steal your data. From a tolerable mistake, it turned into a vulnerability, which is much worse, ”he says.

Kathleen Fisher

Dreams of perfect programs

In October 1973, Edsger Dijkstra came up with the idea of creating a code without errors. At night, while at a hotel at a conference, he suddenly thought about how to make programming more mathematical. Explaining the idea later, he said, "With an agitated mind, I got out of bed at 2:30 in the morning, and wrote for more than an hour." This material was the beginning of his fruitful 1976 book The Discipline of Programming, which, along with the work of Charles Hoare (like Dijkstra, who received the Turing Award ), defined views on the use of proof of correctness when writing programs.

But computer science did not follow this idea, mainly because for many years it seemed impractical, or even impossible, to determine the function of a program using the rules of formal logic.

A formal definition is a way to determine what the program is specifically doing. Formal confirmation is a way to prove without any doubt that the program code perfectly fits this definition. Imagine that you are writing a program for a mobile car taking you to the grocery store. At the level of operations, you determine the movements that the robot has to achieve the goal - it can turn, brake or accelerate, turn on or off at the beginning and end of the path. Your program will consist of these basic operations, built in the right order so that in the end you come to the store, and not to the airport.

The traditional way of testing the program is testing. Programmers give the program a lot of input data (unit tests) to make sure that it behaves as it should. If your program sets the algorithm for the movement of the robot, you can test its travel between many different points. Such testing ensures the correct operation of programs in most cases, which is sufficient for most applications. But unit tests do not guarantee that the software will always work correctly - you cannot test the program on all possible inputs. Even if the algorithm works for all destinations, there is always the possibility that it will behave wrong in rare conditions — or, as they say, in “boundary conditions” —and open a security hole. In real programs, such errors can be simple, for example, a buffer overflow, in which the program copies a little more data than necessary and overwrites part of the computer's memory. Such a seemingly harmless mistake is hard to eliminate, and it provides a hole for a hacker attack - a weak door hinge that opens the way to the lock.

“One hole somewhere in the software, and this is already a vulnerability. It’s very difficult to check all the paths for all possible inputs, ”says Parno.

These specifications are harder than going to the store. Programmers, for example, may need a program notarizing and stamping the date of receipt of documents in the order in which they were received. In this case, the specification should indicate that the counter is always incremented, and that the program should never allow the key used for the signature to leak.

In human language it is easy to describe. It is much harder to turn a specification into a formal language understandable to a computer - which explains the main problem of writing programs.

“To make a formal specification that is understandable for a computer is difficult in essence,” says Parno. “It’s easy at the top level to say“ don't let the password leak ”, but you need to think about how to turn it into a mathematical definition.”

Another example is a program that sorts a list of numbers. A programmer trying to formalize a specification for it may come up with the following:

For each item j in the list, make sure that j ≤ j + 1

But even in this formal specification - to make sure that each element in the list is less than or equal to the next one - there is an error. The programmer assumes that the output will contain the modified input. If the list is [7, 3, 5], he expects the program to return [3, 5, 7]. But the output of the program [1, 2] will satisfy the specifications - because “this is also a sorted list, but not the list that you probably expected,” says Parno.

In other words, it is hard to turn the idea of what a program should do into a formal specification that excludes any misinterpretation of what you want from the program. And the given example describes the simplest sorting program. Now imagine a description of something much more abstract, such as password protection. “What does this mean mathematically? For such a definition, it may be necessary to write a mathematical description of what “keep secret” means, or what it means to be safe for the encryption algorithm, says Parno. “We considered all these questions and advanced in their study, but they can be extremely tricky to implement.”

Block security

In fact, it is necessary to write both specifications and additional comments necessary for the software to make conclusions about the code. A program that includes its formal confirmation may be five times the size of a regular program written for the same purpose.

This burden can be a little easier with the help of suitable tools - programming languages and auxiliary programs that help programmers create bullet-proof programs. But in the 1970s they did not exist. “Many aspects of science and technology were not yet sufficiently developed, so by 1980 many areas of computer science lost interest in this,” said Appel, a leading researcher for the DeepSpec group, which is developing formally validated computer systems.

Despite the improvement in tools, another barrier arose in the way of confirming programs: there was no certainty that it was needed at all. Although the enthusiasts of the method said that small errors can lead to big problems, the rest drew attention to the fact that mostly computer programs work quite well. Sometimes they fall, yes - but losing a little unsaved data, or sometimes restarting the system seems like a small price to pay for the lack of the need to painstakingly translate every piece of the program into the language of formal logic. Over time, even those who were at the origins of software confirmation, began to doubt its usefulness. In the 1990s, even Hoar admitted that specifications can be a time-consuming solution to a non-existent problem. In 1995 he wrote:

Ten years ago, researchers of formal methods (and among them I was wrong most of all) predicted that the world of programming would gratefully accept the help of formalization ... It turned out that the world did not suffer much enough from the problems that our research was trying to solve.

Then the Internet appeared, which made for errors in the code the same as air travel made for infections: when all computers are connected with each other, inconvenient but tolerable errors can lead to increasing security problems.

“We did not fully understand this,” says Appel. “There are programs open to all Internet hackers, and if there is an error in such a program, it can become a vulnerability.”

By the time researchers began to realize the seriousness of the security threats presented by the Internet, the confirmation of the programs was ready to return. To begin with, researchers have made progress in the fundamental parts of formal methods. This is an improvement in support programs such as Coq and Isabelle; development of logical systems ( dependent type theory ) providing a development platform that helps computers evaluate code; "Operational semantics" is a language with the right words to express what is required of the program.

“Starting with the specification in human language, you are confused with ambiguities,” says Janette Wing , vice president of Microsoft Research. - Any natural language contains ambiguities. In the formal specification, you write the exact description on the basis of mathematics, to explain what you want from the program. "

Janette Ving from Microsoft Research

In addition, formal method researchers have revised their goals. In the 1970s and 1980s, they wanted to create fully proven computer systems, from electronic circuits to programs. Today, most researchers are working on small, but the most critical or vulnerable parts of a system — for example, operating systems or cryptographic protocols.

“We are not saying that we will prove the correctness of the entire system by 100%, up to electronic circuits,” says Wing. - Such statements are ridiculous. We are precise in what we can or cannot do. ”

The HACMS project shows how greater security guarantees can be achieved by describing one small part of a computer system. The first goal of the project was to create an unbreakable quadrocopter for entertainment. The software that came with the quadric was monolithic - that is, by gaining access to one part of it, the hacker gained access to all the others. For two years, the HACMS team was engaged in splitting the code in the control computer into parts.

The team rewrote the software architecture using what Fisher calls “high-confidence building blocks” —the tools that allow the programmer to prove the purity of the code. One of them provides evidence that by gaining access to one of the parts, it is impossible by escalating privileges to reach the rest.

Later, the programmers installed such software on "Bird". In the test with the Red Team, they were given access to one of the parts of the system that controlled some part of the helicopter, for example, a camera — but not the main functions. Failure of hackers was mathematically guaranteed. “They proved in a machine-verifiable way that hackers would not be able to go beyond the limits of a system partition, so this is not surprising,” says Fisher. “This coincides with the theorem, but it is always useful to check.”

In the year following the audit, DARPA applied the project’s tools and technologies to other areas of military technology, such as satellites and automated trucks. New initiatives coincide with the way formal confirmation has spread over the past ten years: each successful project encourages the next. “Excuses on the complexity of this principle no longer pass,” says Fisher.

Check Internet

Safety and reliability are the two main goals that inspire formal methods. Every day the need for improvements is becoming increasingly apparent. In 2014, a small software error, which a formal description would be like, would have opened the way for a Heartbleed error that threatened to break the Internet. A year later, a couple of “legal” hackers confirmed the worst concerns about cars connected to the Internet, gaining remote control over the Jeep Cherokee .

With higher stakes, formal methods researchers are setting increasingly ambitious goals. The DeepSpec team, led by Appel (also working on HACMS), is trying to build a complex, fully proven system like a web server. If successful, the project, which received a grant from the State Science Foundation of $ 10 million, will bring together many smaller successful projects of the last decade. Researchers have created several proven secure components like the operating system kernel. “What hasn’t been done yet is the integration of these components across specification interfaces,” says Appel.

In Microsoft Research, programmers have two ambitious projects. The first, Everest, seeks to create a confirmed version of HTTPS, a protocol that ensures the security of web browsers, which Wing calls the "Achilles heel of the Internet."

The second is the creation of confirmed specifications of complex cyber-physical systems, such as UAVs. The project has significant difficulties. While conventional programs operate with separate, unambiguous steps, programs that control the movement of drones use machine learning to make probabilistic decisions based on a continuous flow of environmental data. It is far from obvious how you can make logical decisions with such a degree of inaccuracy, or describe them in the form of a formal specification. But only in the last ten years, formal methods have advanced quite far, and Wing, managing the project, is optimistic that researchers will be able to solve all these problems.

Source: https://habr.com/ru/post/398825/

All Articles

Unbreakable code exists

Dreams of perfect programs

Block security

Check Internet

More articles: