
18th-century Copiale Cipher decrypted using statistical machine translation

More than 60 years ago, Warren Weaver, a pioneer in the field of machine translation, first proposed the use of cryptanalysis to interpret foreign language texts.

In his famous 1947 letter to the mathematician Norbert Wiener, he wrote: “It is only natural to ask ourselves whether the translation problem can be considered as a cryptography problem. When I see a text in Russian, I say: ‘In fact, it is written in English, but coded with some kind of strange characters. Now I will try to decipher it.’”

This conjecture led to the development of a whole generation of statistical machine translation programs, such as Google Translate, and, incidentally, to the emergence of new tools for analyzing historical ciphers.

Now a group of Swedish and American linguists has used the techniques of statistical machine translation to break one of the most difficult ciphers: the Copiale Cipher, a handwritten 105-page manuscript from the late 18th century. The scientists published their work on the eve of the conference of the Association for Computational Linguistics in Portland.



Discovered in an academic archive in East Germany, the volume, finely bound in gold and green brocade, contains 75,000 characters of text in a baffling mix of mysterious symbols and Latin letters. The manuscript takes its name, the Copiale Cipher, from one of only two unencrypted inscriptions in the document.

Kevin Knight, a specialist at the Information Sciences Institute of the University of Southern California, together with colleagues Beáta Megyesi and Christiane Schaefer of Uppsala University (Sweden), managed to decrypt the first 16 pages. They contain a detailed description of the ritual of a secret society that was interested in eye surgery and ophthalmology.


The first page of the manuscript


The second and third pages of the manuscript

The work began this year as a weekend hobby, Dr. Knight said in an interview, adding: “I don't have much experience in cryptography. My work is mainly in computational linguistics and machine translation.”

Not knowing the original language, the researchers had to make some blind assumptions and then test their guesses. First, they assumed that all the information was carried by the Latin characters alone, so they simply ignored the abstract symbols. They took the Latin characters and checked the text against 80 languages of the world.

When this approach failed, the researchers concluded that the text had in fact been produced with a substitution cipher, one in which each character of the original is replaced with a different character. They also guessed that the original language was German, since the manuscript was found in Germany.
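For readers unfamiliar with the term, here is a minimal sketch of a simple substitution cipher. It is only an illustration of the general idea: the key, the sample message, and the function names are invented for this example and have nothing to do with the actual Copiale symbols.

```python
import random
import string

def make_key(seed=0):
    """Build a random one-to-one mapping from plaintext letters to cipher letters."""
    rng = random.Random(seed)
    letters = list(string.ascii_lowercase)
    symbols = letters[:]
    rng.shuffle(symbols)
    return dict(zip(letters, symbols))

def encrypt(plaintext, key):
    # Characters outside the key (spaces, punctuation) pass through unchanged.
    return "".join(key.get(ch, ch) for ch in plaintext.lower())

def decrypt(ciphertext, key):
    # Invert the mapping to recover the plaintext.
    inverse = {cipher: plain for plain, cipher in key.items()}
    return "".join(inverse.get(ch, ch) for ch in ciphertext)

key = make_key(seed=42)
message = "die geheime gesellschaft"  # made-up sample text
secret = encrypt(message, key)
assert decrypt(secret, key) == message
```

Anyone who holds the key can reverse the mapping; a codebreaker has to reconstruct it from the ciphertext alone.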

In the end, they concluded that the Latin characters are in fact so-called "nulls", meaningless symbols intended to mislead the codebreaker, and that some special characters denote spaces between words. The second breakthrough was the discovery that a colon means the preceding consonant is doubled.
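These three observations can be read as a cleanup pass over a transcription of the manuscript. The sketch below assumes the text has already been transcribed into a list of symbol names; the names used here ("del", "tri", "lam", "SP") and the function `preprocess` are hypothetical and are not taken from the researchers' work.

```python
def preprocess(tokens, nulls, space_symbols):
    """Drop nulls, turn space symbols into blanks, and expand ':' into a doubled symbol."""
    out = []
    for tok in tokens:
        if tok in nulls:
            continue                  # nulls carry no information
        if tok in space_symbols:
            out.append(" ")           # word boundary
        elif tok == ":":
            if out and out[-1] != " ":
                out.append(out[-1])   # colon repeats the previous symbol
        else:
            out.append(tok)
    return out

# Hypothetical transcription: Latin letters a, b, c are nulls, "SP" marks a space.
tokens = ["a", "del", "b", "tri", ":", "SP", "c", "lam"]
cleaned = preprocess(tokens, nulls={"a", "b", "c"}, space_symbols={"SP"})
print(cleaned)  # ['del', 'tri', 'tri', ' ', 'lam']
```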

After that, the researchers used well-known machine translation techniques, such as analyzing the expected frequency of characters, to guess which symbols correspond to which letters of the German alphabet. First of all, they worked out which pair of symbols corresponds to the combination ch.
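A rough sketch of the frequency idea: count how often each cipher symbol occurs and pair the most frequent symbols with the most frequent German letters. The letter order in `GERMAN_BY_FREQ` is an approximation chosen for this example, and the resulting mapping is only a first guess that would have to be checked and corrected against the text.

```python
from collections import Counter

# Approximate order of the most frequent German letters (an assumption for this sketch).
GERMAN_BY_FREQ = "enisratdhu"

def guess_mapping(cipher_symbols, reference_order=GERMAN_BY_FREQ):
    """Pair cipher symbols with reference letters by descending frequency."""
    counts = Counter(s for s in cipher_symbols if s != " ")
    ranked = [symbol for symbol, _ in counts.most_common()]
    return dict(zip(ranked, reference_order))

# Toy ciphertext: "x" is the most frequent symbol, then "y", then "z".
print(guess_mapping(list("xxxxyyyz")))  # {'x': 'e', 'y': 'n', 'z': 'i'}
```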



Once that was established, frequency analysis suggested which symbol corresponds to the letter t, which in German most often follows the combination ch. And so, step by step, all the other symbols were identified. The only symbols the researchers could not decipher were the large characters, which are probably code names for secret people and organizations.
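The step from ch to t can be sketched as a count of which symbol most often follows a known pair. The function name `most_common_successor` and the toy symbols below are invented for this illustration.

```python
from collections import Counter

def most_common_successor(symbols, pair):
    """Return the symbol that most often follows the given two-symbol pair, or None."""
    first, second = pair
    followers = Counter(
        symbols[i + 2]
        for i in range(len(symbols) - 2)
        if symbols[i] == first and symbols[i + 1] == second
    )
    return followers.most_common(1)[0][0] if followers else None

# Toy data: suppose "Q" and "W" are already known to stand for c and h.
symbols = list("QWzQWzQWkab")
print(most_common_successor(symbols, ("Q", "W")))  # z
```

In this toy example "z" would be the candidate for t; each symbol pinned down this way narrows the choices for the rest.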



“It turned out that we can use many linguistic methods for cryptanalysis,” says Dr. Knight.

The result was highly praised by other experts: “Deciphering the Copiale Cipher is an exquisite piece of work by Kevin Knight and his colleagues,” said Nick Pelling, a British software developer and security specialist who runs the Cipher Mysteries blog on news in the field of cryptography.

Although breaking this cipher was a significant success, Dr. Knight and his colleagues cannot rest on their laurels. They note with regret that many ancient books and entire languages of great historical value still remain undeciphered.

The Copiale Cipher is of interest mainly to historians who study the spread of political ideas. Secret societies were in vogue in the 18th century, says Dr. Knight, and to some extent they influenced the events of the French Revolution and the American War of Independence. Kevin Knight recently sent the decoded Copiale text to Andreas Önnerfors, a historian at Lund University (Sweden) and an expert on secret societies.

“When he saw the book and the decrypted version, he was extremely excited,” says Dr. Knight. “He found a political commentary at the end of the text that referred to inalienable human rights. It's quite interesting that such things are found in such an early document.”

More recent examples of ciphers that remain unsolved include the letters of the serial killer nicknamed Zodiac, sent to the California police in the 1960s and 1970s, and Kryptos, a sculpture bearing encrypted text that stands in front of the CIA's headquarters in Langley and has been only partially decoded.

But the most important mystery for the cryptographic community, the real “Holy Grail” of the cryptographic world, remains the Voynich manuscript, a mysterious book written about 600 years ago by an unknown author in an unknown language using an unknown alphabet. It consists of 240 richly illustrated pages with text that has defied the best cryptographers in the world. For a long time experts considered it a hoax, but recent radiocarbon dating confirmed that the document was created at the beginning of the 15th century.

Together with a colleague from the University of Chicago, Dr. Knight this year published a detailed analysis of the manuscript. It does not settle the question of whether the manuscript is a hoax, but it does provide evidence that the text contains some structures typical of natural language.

“This is the most mysterious manuscript in the world,” says Kevin Knight. “It is chock-full of patterns, and whoever created such a thing spent a huge amount of time on it. So it seems to me that this is probably a cipher.”

Source: https://habr.com/ru/post/131285/

