📜 ⬆️ ⬇️

Unexpected Turing fullness everywhere

A catalog of software constructs, languages ​​and APIs that are unexpectedly Turing complete; implications for safety and reliability. Application: how many computers in your computer?

Any rather complex C or Fortran program contains a rewritten, unspecified, buggy, and slow implementation of half of Common Lisp . - The tenth rule of Greenspan

Turing completeness (TC) is a property of the system, with some simple representation of input and output, to implement any computable function.

Turing completeness is a fundamental concept in computer science. It helps to answer many key questions, for example, why it is impossible to create an ideal anti-virus program. But at the same time, it is amazingly common . It would seem that a computer system is difficult to achieve such universality to perform any program, but it turns out the opposite: it is difficult to write a useful system that does not immediately turn into full Turing. It turns out that even a small control over the input data and converting them into a result, as a rule, allows you to create a turing-complete system. It can be fun, useful (although not usually ), harmful or extremely unsafe, and a real gift for a hacker (see about “language safety” , which studies methods of hacking “strange machines” 1 ). Amazing examples of such behavior remind us that Turing completeness lurks everywhere, and protecting the system is extremely difficult.

“Too powerful” programming languages ​​can also trigger unpleasant DoS attacks. Afl fazzer found such a roff in OpenBSD that it is capable of generating an infinite loop , abusing some string substitution rules.
')
Probably, these unexpected examples of turing-complete systems are best viewed as a subset of the “discovered” or “found” esoteric programming languages . So an extraordinarily minimalist in its essence FRACTRAN is not considered 2 , as well as the specially obfuscated Malbolge language (where writing a trivial program will take years), because it is a specially designed esoteric PL. Also, “Life” is not part of our subset, because the questions about turing completeness appeared immediately after its release, and its recognition by Turing was not a surprise. And given the complexity of networks with routing and packet switching, it is not surprising that you can build a cellular automaton or program logic circuits on these networks, and planning / validating air tickets is not only NP-hard and even EXPSPACE-hard, but completely unsolvable (due to complex airline regulations).

Many configurations, special languages, tools, or complex games, as it turns out, violate the rule of least power and “accidentally become Turing-complete” , like MediaWiki templates , sed or multiple regexp / find-replace commands in an editor. In general, any form of replacing strings or templating, or compiling on the fly with high probability is a turing-complete system itself or when repeated, as they often support lambda calculus or rewriting of terms of a language or a label, for example, esoteric languages ​​" /// " or thue .

XSLT , Infinite Minesweeper , Dwarf Fortress 3 , Starcraft, Minecraft , Ant , Transport Tycoon , C ++ templates and Java generalizations , DNA calculations and so on - all of these are Turing-complete systems, and this is also not surprising. Many games support scripts to simplify development and custom mods. Therefore, to make the game turing-complete elementary: it is enough to include the syntax to call more famous languages, such as Perl.

Turing completeness may simply be a little-known part of the standard format. Probably, in our time many people do not know that TrueType and many fonts are PostScript programs on stack machines, similar to ELF metadata and DWARF debugging information . Or that some music formats go beyond MIDI , support scripts, and need interpretation. If you know about the turing of the fullness of fonts, it is no longer surprising that the Turing documents are full of TeX, which naturally causes many serious and interesting security vulnerabilities in fonts and media, such as BLEND or Linux exploits SNES and NES . In other formats like PDF, just a terrible amount of vulnerabilities 4 . Again, outstanding achievements such as the creation of a small Turing machine from Lego or Domino 5 cubes are not considered, since we have long known how mechanical computers work.

On the other hand, the direction of computer security research called “weird machines” often reveals truly astounding turing-complete systems. Moreover, in different people they cause surprise in varying degrees: one seems unusual that others are not surprised.


The following systems may be accidentally turing-complete:


see also



Links



application


How many computers are there in your computer?


Some get bogged down in disputes about weird machines or about how “big” an AI agent will become: one such, two, ten or millions will be created. It doesn't matter, since this is just an organizational question. In fact, the inputs and outputs of the system are important: how efficient is the system as a whole, and what resources does it consume? No one cares if Google runs on 50 supercomputers, 50,000 mainframes, 5 million servers, 50 million embedded / mobile processors, or a combination of all of the above . It doesn't matter that Google uses a variety of chips: from self-made “tensor processors” to unique silicon processors (Intel implements them on chips for Xeon processors for a number of major customers), FPGA, GPU, CPU to even more exotic equipment like D-Wave quantum computers. It is only important that it remain competitive and can provide services for a reasonable fee. In the end, today, a supercomputer usually looks like a large number of servers in racks with a huge amount of GPUs and unusually high-speed InfiniBand connections. That is, the supercomputer is not so different from the data center, as you might think. Any of the listed equipment can support numerous strange machines, depending on its internal dynamics and connectivity.

Similarly, any AI system can be implemented as a single giant neural network or a multitude of individual neural networks operating asynchronously, or as a heterogeneous set of microservices, or as a “society of reason”, and so on. All this is not particularly important. In terms of complexity or risk, it’s not so important how the system is organized as long as it works. The system can be seen on many levels, each of which is equally invalid in and of itself, but useful for different purposes in a common system.

Here is an example of a badly defined question: how many computers are in your pockets and on the table now? How many computers are there in your “computer”? Think only one? Let's take a closer look.

It's not just about the CPU: nowadays, transistors and processor cores are so cheap that now it often makes sense to allocate separate cores for real-time tasks, for increasing performance, for security, to avoid loading the main OS, for compatibility with old architecture or existing software package. Just because a DSP or core is faster to program than to create a specialized ASIC, or because this is the simplest possible solution. In addition, many of these components can be used as computational elements, even if they are not designed or hide this functionality altogether.

So:


So in a regular smartphone or desktop computer there will be from fifteen to several thousand computers in the sense of turing-full devices. Each of them can be programmed, it has enough power to run many programs and can be used by an attacker to monitor, exfiltrate, or attack the rest of the system.

There is nothing unusual in the historical context, because even the very first mainframes usually included several computers where the main computer performs batch processing, and auxiliary computers provide high-speed I / O operations that would otherwise interfere with the main machine with their own interrupts.

In practice, apart from the information security community (since all these computers are insecure and therefore useful for the NSA and virus writers), all other users do not care that insanely complex systems are hidden under the hood of our computers, which are more accurately regarded as a colorful menagerie from hundreds computers, awkwardly connected with each other (it is not clear, “network is a computer” or “computer is a network” ...?). The user perceives and uses this as one computer.



1. The active area of ​​research is the creation of languages ​​and systems that are carefully designed and are not guaranteed to be turing-complete (for example, totally functional programming ). Why make so much effort to create a language in which it is impossible to write many programs? The fact is that Turing completeness is closely related to Godel's incompleteness theorem and Rice's theorem.. Therefore, if we allow TC, then we lose all possible provability properties. On the contrary, different useful things are easily proved in a non-Turing language: for example, that a program is complete, type safe or not, that it can be easily converted into a logical theorem, that it consumes a limited amount of resources, that the implementation of the protocol is true or equivalent to another implementation. It is easy to prove that there are no side effects and that the program can be converted into a logically equivalent, but faster option (this is especially important for declarative languages ​​like SQL, where the ability of the optimizer to convert queries is the key to acceptable performance. Although, of course, you can do amazing things with SQL, such asgradient descent for machine learning models , and some SQL extensions make it turing-complete anyway, allowing you to either encode a looping system of tags , or modelDSL , or call PL / SQL , etc.

Here is some literature about strange machines:




2. Although linear neural networks exploit floating-point mode with rounding to zero to encode potentially turing-complete behavior (for RNN), but this is not noticeable in normal operation, which is also a random turing-complete behavior and a good example of a secure language.

3. Dwarf Fortress gives clockwork, so Turing completeness is not surprising. But water is implemented as a simple cellular automaton, so there are even more ways to get turing completeness! Now the game wiki names four potential ways to create logic gates: fluids, clockwork mechanisms, mine carts, and creature / animal logic gates with doors and pressure sensors.

4. The full specification of PDF is solely inflated. For example, in a simple PDF viewer that supports enough PDF specifications, like Google Chrome’s browser, you can play Breakout (because PDF includes its own strange subset of JavaScript). The official Adobe PDF viewer supports up to 3D CAD functionality.

5. See the domino logic gates on the Think Math and the 4-bit adder domino demo .

Source: https://habr.com/ru/post/429602/


All Articles