Steganography in the 21st century: goals, practical applications, relevance

I think everyone has heard of steganography at least once. Steganography (στεγανός, hidden + γράφω, I write; literally “secret writing”) is an interdisciplinary science and art of transmitting hidden data inside other, non-hidden data. The hidden data is usually called a stego message, and the data that carries the stego message is called a container.

Habrahabr has published many articles on specific steganographic algorithms, for example DarkJPEG, TCP steganography, and of course the LSB algorithm, loved by all students during their coursework (for example, LSB steganography, Steganography in GIF, Kotfuskatsiya of executable .NET code).

There are countless steganographic methods. At the time of this writing, at least 95 patents on steganography have been published in the USA, and at least 29 in Russia. My favorite is the patent by Kursh K. and Lav R. Varshney, “Food steganography” (PDF).
Picture from the "food" patent to attract attention:

However, having read a decent amount of articles and works on steganography, I wanted to systematize my ideas and knowledge in this area. This article is purely theoretical and I would like to discuss the following questions:

The goals of steganography are actually three, not one.
Practical application of steganography - I counted 15.
The place of steganography in the 21st century - I think that from a technical point of view, the modern world has already been prepared, but “socially” steganography is still “late”.

I have tried to summarize my own research on this topic (which means a lot of text).
I hope for reasonable criticism and advice from the Habr community.

Goals of steganography

A goal is an abstract task around which a scientific theory and a methodology for achieving it are developed. Do not confuse goal and application: a goal is extremely abstract, whereas an application is concrete.

As I said, in steganography there are three goals.

Digital Fingerprints (CO) (Digital Fingerprint)

This type of steganography implies that each copy of the container carries a different steganographic label. For example, a digital fingerprint (CO) can be used to protect an exclusive right. If the adversary can use some algorithm to extract the CO from the container, the adversary can no longer be identified; but until the adversary learns to forge the CO, he cannot distribute the protected container without being detected.

Thus, when attacking a digital fingerprint (CO), a third party (i.e., an adversary) can pursue two goals:

extraction of the CO from the container (“weak goal”);
substitution of one CO with another CO (“strong goal”).

An example of a digital fingerprint (CO) is the sale of electronic books (for example, in *.PDF format). When the book is paid for and sent to the recipient, information about the buyer's e-mail, IP address, data entered by the user, etc. can be embedded in the *.pdf. Of course, this is no fingerprint or DNA analysis, but, you must agree, it is better than nothing. Perhaps in Russia, because of a different culture and a different, historically established attitude to exclusive rights, this use of steganography is irrelevant; but in Japan, for example, where one can be jailed for downloading torrent files, the use of steganographic CO is more likely.
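The per-copy label described above can be sketched roughly as follows. This is only an illustration under assumptions: the container is treated as a raw byte buffer (for example, uncompressed image samples rather than a real PDF), the label is placed in the least significant bits, and the function names and the choice of a truncated SHA-256 of the buyer's e-mail as the label are hypothetical, not taken from any real shop.

```python
import hashlib


def embed_lsb(container: bytes, payload: bytes) -> bytes:
    """Hide the payload bits in the least significant bit of each container byte."""
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(container):
        raise ValueError("container too small for payload")
    out = bytearray(container)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit  # overwrite only the lowest bit
    return bytes(out)


def extract_lsb(container: bytes, payload_len: int) -> bytes:
    """Read payload_len bytes back from the least significant bits."""
    out = bytearray()
    for i in range(payload_len):
        value = 0
        for byte in container[i * 8:(i + 1) * 8]:
            value = (value << 1) | (byte & 1)
        out.append(value)
    return bytes(out)


def fingerprint_copy(container: bytes, buyer_email: str) -> bytes:
    """Return a copy of the container carrying a buyer-specific label (hypothetical scheme)."""
    label = hashlib.sha256(buyer_email.encode()).digest()[:8]
    return embed_lsb(container, label)


original = bytes(range(256))
copy_for_alice = fingerprint_copy(original, "alice@example.com")
copy_for_bob = fingerprint_copy(original, "bob@example.com")
```

Each buyer receives a copy of the same size whose hidden label differs, so a leaked copy can be traced back by calling `extract_lsb(leaked_copy, 8)` and comparing the result with the stored labels.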

Steganographic Watermarks (SVZ) (Stego Watermarking)

In contrast to the digital fingerprint (CO), the SVZ implies identical labels on every copy of the container. In particular, the SVZ can be used to confirm copyright. For example, when recording with a video camera, information about the recording time, the camera model and/or the name of the camera operator can be embedded in every frame.
If the footage falls into the hands of a competing company, you can try to use the SVZ to confirm the authorship of the recording. If the key is kept secret from the camera owner, the SVZ can be used to confirm the authenticity of photos and/or video. By the way, our colleague Dmitry Vitalyevich Sklyarov successfully broke the steganography on some Canon camera models. True, the problem was in the hardware, and Dmitry Vitalyevich did not touch the steganography itself; nevertheless, he steganographically "proved" the authenticity of a photo of Stalin with an iPhone.

Photo of Stalin with an iPhone by D. V. Sklyarov (with a correct SVZ)

Hidden data transmission (SPD)

This is the "classic" goal of steganography, known since the days of Aeneas Tacticus (Αινείας ο Τακτικός; see his work "How to Survive under Siege", which contains simple steganographic techniques). The task is to transfer data in such a way that the adversary does not even suspect that a message exists.

In modern Russian-language works on steganography, the term "digital watermarks" (CEH) is often used. This term means either the SVZ or the digital fingerprint (CO), and sometimes both at once, even within one article! Yet the problems and tasks in implementing the CO and the SVZ are fundamentally different. The SVZ is the same on all copies of an electronic document, while the CO is different on every copy. For this reason, for example, a collusion attack is fundamentally impossible against the SVZ! If only for this reason, the SVZ and the CO must be distinguished. I strongly advise everyone who is going to work in steganography not to use the term CEH.
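The point about collusion can be illustrated with a small sketch. It assumes, purely for illustration, that labels are written into the least significant bits of the first bytes of a container; the helper names and the bit patterns are made up. Two colluding buyers simply compare their copies: with per-copy fingerprints the differing positions expose where the label lives, while with an identical watermark the copies are byte-for-byte equal and the comparison reveals nothing.

```python
def mark(container: bytes, label_bits: list[int]) -> bytes:
    """Write label bits into the least significant bits of the first bytes."""
    out = bytearray(container)
    for i, bit in enumerate(label_bits):
        out[i] = (out[i] & 0xFE) | bit
    return bytes(out)


def differing_positions(copy_a: bytes, copy_b: bytes) -> list[int]:
    """What two colluding buyers learn by comparing their copies."""
    return [i for i, (a, b) in enumerate(zip(copy_a, copy_b)) if a != b]


container = bytes([0x10] * 16)

# Fingerprints: every buyer gets a different label.
alice_copy = mark(container, [1, 0, 1, 1, 0, 0, 1, 0])
bob_copy = mark(container, [1, 1, 0, 1, 0, 1, 1, 0])
fingerprint_leak = differing_positions(alice_copy, bob_copy)

# Watermark: every copy carries the same label.
watermark = [1, 0, 0, 1, 1, 0, 1, 0]
watermark_leak = differing_positions(mark(container, watermark), mark(container, watermark))
```

Here `fingerprint_leak` lists the positions where the two fingerprints disagree, and `watermark_leak` is empty. A real collusion-resistant fingerprinting code has to be designed around exactly this leak.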

This seemingly obvious thought still puzzles many. A similar view on the need to distinguish between the SVZ and the digital fingerprint (CO) has been expressed by "steganographers" well known in narrow circles, such as Cachin, Petitcolas and Katzenbeisser.

Each of these three goals needs its own security criteria for the steganographic system and its own formal information-theoretic models, because the purpose of using steganography is different in each case. The fundamental difference between the SVZ and the digital fingerprint (CO) is described above. But maybe it makes sense to merge SPD with the CO or with the SVZ? No! The point of SPD is the hidden data transfer itself, while the CO and the SVZ are designed to protect the container itself. Moreover, the very existence of a CO or SVZ need not be secret, unlike in most SPD tasks. For this reason in particular, discussing the possibility of building a perfectly secure stegosystem (according to Cachin) for a CO or SVZ makes no practical sense for most practical tasks.

The international Springer journal Information Hiding generally proposes to count only SPD as steganography, since the digital fingerprint (CO) and the SVZ do not require hiding the fact of data transmission. For example, you know that a 100-ruble banknote is protected. Some of the mechanisms are known to you, some are known only to specialists, and some are state secrets. But everyone knows that some secret mechanisms exist; what is unknown is exactly how and with which technologies the paper banknote is additionally protected ...

Restricting the word steganography to SPD alone, as Springer does, seems reasonable, but the Russian-language term "Information Concealment" has not yet taken root in Russian articles and dissertations. Therefore, in this work the word steganography covers everything that Springer calls Information Hiding, and what Springer calls steganography is only one of its goals: SPD.

Practical application of steganography

Having discussed the objectives, we turn to practical applications. I found 15 tasks for which steganography can be relevant. If you disagree with something or I have missed something, I will be glad to receive your comments! Feel free to write!

1. Imperceptible information transfer (SPD)

This is the most obvious application, the first that comes to mind. Unlike cryptographic methods (which are secret, but not covert), steganography can be used as a method of unnoticeable information transfer. This is the “classic practical application” of steganography, which is why it comes first.

2. Hidden storage of information (SPD)

This application of steganography is very similar to the previous one. Only here steganography is used not to transmit but to store information whose very existence (even in encrypted form) the user does not want discovered. Obviously, this task is solved on storage media rather than in communication channels. Moreover, the redundancy of many media can be incredibly large. For example, the total amount of data (including RLL codes) that can be burned onto a CD is 1,828 MB. This is a huge redundancy that can be used to hide data!

If Gena Ryzhov from the film Khottabych had thought about this and had not been too lazy to do a little soldering and programming, he would hardly have kept CDs with “compromising” software in a cactus pot. I think the hacker Gennady would simply have embedded the data in the error-correction codes of optical discs, and the discs themselves, with photos of cats on them, would have been stored in plain sight! You must agree, that is much better than a cactus pot! :)

3. Undeclared information storage (SPD)

Many information resources allow you to store only data of a certain type. For example, YouTube accepts only video in the formats MOV, MPEG4, AVI, WMV, MPEG-PS, FLV, 3GPP, WebM. However, steganography can be used to store data in other formats there. I admit that with resources like Yandex Disk around, this application may seem strange. Most likely it has no practical significance; it is just4fun and an amusing coursework topic for a student.

However, hid.im allows users to hide .torrent files inside PNG images. This is how Michael Nutt, the creator of the project, commented:

This is an attempt to make torrents more robust. The difference is that there is no longer any need for an indexing site to store your torrent file. Many forums allow uploading images, but no other file types.
(@Mithgol Hid.im converts torrents to PNG images )

There is also the StegTorrent project, which, unlike the online hid.im service, requires installation.
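The idea of carrying an arbitrary file as the pixels of a PNG image can be sketched as follows. This is not hid.im's actual format, which is not described here; it is a minimal assumption-laden illustration that stores a 4-byte length followed by the payload as 8-bit grayscale pixels, using only the standard library. The function names are hypothetical, and the reader only understands files produced by this writer (filter type 0, no CRC verification).

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def _chunk(kind: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def bytes_to_png(payload: bytes, width: int = 64) -> bytes:
    """Pack a length-prefixed payload into the pixels of a grayscale PNG."""
    body = struct.pack(">I", len(payload)) + payload
    height = -(-len(body) // width)  # ceiling division
    body = body.ljust(width * height, b"\x00")
    # Every scanline starts with filter type 0 (no filtering).
    rows = b"".join(b"\x00" + body[y * width:(y + 1) * width] for y in range(height))
    # IHDR: width, height, bit depth 8, color type 0 (grayscale), no interlace.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    return (PNG_SIGNATURE + _chunk(b"IHDR", ihdr)
            + _chunk(b"IDAT", zlib.compress(rows)) + _chunk(b"IEND", b""))


def png_to_bytes(png: bytes) -> bytes:
    """Recover the payload from a PNG produced by bytes_to_png."""
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    pos, width, idat = len(PNG_SIGNATURE), 0, b""
    while pos < len(png):
        length, kind = struct.unpack(">I4s", png[pos:pos + 8])
        data = png[pos + 8:pos + 8 + length]
        if kind == b"IHDR":
            width = struct.unpack(">I", data[:4])[0]
        elif kind == b"IDAT":
            idat += data
        pos += 12 + length  # length + type + data + CRC
    raw = zlib.decompress(idat)
    stride = width + 1  # one filter byte per scanline
    body = b"".join(raw[i + 1:i + stride] for i in range(0, len(raw), stride))
    (size,) = struct.unpack(">I", body[:4])
    return body[4:4 + size]


torrent_like = b"d8:announce21:http://example.invalid4:infod4:name4:demoee"
picture = bytes_to_png(torrent_like)
```

The result is a valid image file that looks like noise, so it passes an "images only" upload filter while `png_to_bytes(picture)` returns the original bytes. Note that this only disguises the file type; it does not hide the data inside a meaningful picture.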

In February 2015, Hacker published a note, "Tens of thousands of MongoDB databases are available via the Internet." In principle, such blunders can, among other things, be used for undeclared data storage. (Another analogue of the cactus pot?)

Tens of thousands of MongoDB databases are available via the Internet (Hacker, 02/12/2015)

Three students from the Center for IT-Security, Privacy and Accountability (CISPA) at Saarland University found 39890 MongoDB databases accessible via the Internet. Some belong to large companies and contain personal and financial information on millions of people.

MongoDB is a popular cross-platform document-oriented open source database. It is used by Craigslist, eBay, SourceForge, Viacom and many others.

As you might guess, for the search the students used the well-known search engine Shodan, which scans ports and indexes information that is not available through other search engines. In particular, they looked for servers with an open TCP port 27017, which is the default in the MongoDB configuration.

curl $SHODANURL | grep -i class=\"ip\" | cut -d '/' -f 3 \
  | cut -d '"' -f 1 | uniq > db.ip

“Without any special tools and without circumventing any defense mechanisms, we could read and write information into these databases,” the authors write.

The largest finds are the database of a French Internet provider and mobile operator, with the addresses and phone numbers of millions of customers, and the database of a German online store, which also contains payment information. The location of the victims and the size of their potential losses are shown on the map.

The affected companies, information security departments, CERT centers and the MongoDB developers have been notified of the vulnerability.

See the published report (pdf) for more detailed research results with protection recommendations.

4. Protection of exclusive rights (CO)

A possible application is the Holographic Versatile Disc (HVD). (However, there is a view that this technology was "stillborn" from the start.) The HVDs currently under development can hold up to 200 GB of data per cartridge. These technologies are intended for television and radio broadcasting companies to store video and audio. A digital fingerprint (CO) inside the error-correction codes of these discs can serve as a main or additional means of protecting license rights.

Another example, as I wrote earlier, is the online sale of information resources: books, movies, music, etc. Each copy must contain a CO that identifies the person (at least indirectly), or a special label for checking whether the copy is licensed.

In 2007-2011, amazon.com attempted to implement this application. A quote from artty's article "Protection" of mp3 files on amazon.com:

In plain terms: the downloaded file will contain a unique identifier of the purchase, the date/time of the purchase, and other information (...).

Downloading the files directly did not work (Amazon complains and says it can only sell them in the US). I had to ask American acquaintances, and after a while I had the same song in my hands, downloaded independently by two different people from different Amazon accounts. The files looked absolutely identical; their sizes matched to the byte.

But since Amazon wrote that it includes a download identifier in each mp3, I decided to compare the two files piece by piece and immediately found differences.

5. Copyright Protection (SVZ)

In this case, one and the same mark protects every copy of the content. Take a photograph, for example. If it is published without the photographer's permission, with the claim that he is not the author, the photographer can try to prove his authorship with the help of steganography. For this, every photo should have embedded in it the serial number of the camera and/or any other data that "binds" the photo to a single camera; through the camera, the photographer can then try to prove indirectly that he is the author of the picture.

6. Protection of authenticity of documents (SVZ)

The technology may be the same as for copyright protection. Only here steganography is used not to confirm authorship but to confirm the authenticity of a document. A document without an SVZ is considered “not real”, i.e. fake. Dmitry Sklyarov, mentioned above, solved exactly the opposite problem: he found a vulnerability in a Canon camera and was able to fake the authenticity of a photo of Stalin with an iPhone.

7. Individual fingerprint in an EDMS (CO)

In an electronic document management system (EDMS), an individual fingerprint can be embedded in *.odt, *.docx and other documents whenever a user works with them. This requires special applications and/or drivers installed and running in the system. If this is done, the individual fingerprint makes it possible to identify who worked with a document and who did not. Of course, it would be foolish to make steganography the only criterion here, but as an additional factor for identifying those who worked with a document it can be useful.

8. Watermark in DLP systems (SVZ)

Steganography can be used to prevent information leaks (Data Leak Prevention, DLP). Unlike the individual fingerprint in an EDMS, in this application a fixed label is embedded when a confidential document is created. The label does not change, regardless of the number of copies and/or revisions of the document.

A stego key is required to extract the label. The stego key is, of course, kept secret. Before approving or refusing to release a document to the outside, the DLP system checks whether the watermark is present. If it is, the system does not allow the document to be sent outside.
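A gateway check of this kind might look roughly like the sketch below. Everything in it is an assumption made for illustration: the document is treated as a raw byte buffer, the stego key seeds a pseudo-random choice of 64 byte positions, the fixed label is derived from the key with HMAC, and the label bits are written into the least significant bits at those positions. The function names are hypothetical, and `random.Random` is used only for brevity; a real system would need a cryptographically sound position generator and an embedding that survives format conversion.

```python
import hashlib
import hmac
import random

LABEL_BITS = 64


def _positions(stego_key: bytes, container_len: int) -> list[int]:
    """Key-dependent byte positions that carry the label."""
    rng = random.Random(stego_key)  # deterministic for a given key
    return rng.sample(range(container_len), LABEL_BITS)


def _label(stego_key: bytes) -> list[int]:
    """The fixed label: the first 64 bits of an HMAC over a constant string."""
    digest = hmac.new(stego_key, b"confidential", hashlib.sha256).digest()
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(LABEL_BITS)]


def embed_watermark(container: bytes, stego_key: bytes) -> bytes:
    """Mark a confidential document; the label is the same for every copy."""
    if len(container) < LABEL_BITS:
        raise ValueError("container too small")
    out = bytearray(container)
    for pos, bit in zip(_positions(stego_key, len(out)), _label(stego_key)):
        out[pos] = (out[pos] & 0xFE) | bit
    return bytes(out)


def has_watermark(container: bytes, stego_key: bytes) -> bool:
    """Check the key-dependent positions for the expected label."""
    if len(container) < LABEL_BITS:
        return False
    found = [container[pos] & 1 for pos in _positions(stego_key, len(container))]
    return found == _label(stego_key)


def dlp_allows_sending(document: bytes, stego_key: bytes) -> bool:
    """The gateway rule from the text: marked documents may not leave."""
    return not has_watermark(document, stego_key)


key = b"dlp stego key"
public_doc = bytes(200)
secret_doc = embed_watermark(public_doc, key)
```

With these definitions `dlp_allows_sending(public_doc, key)` is true and `dlp_allows_sending(secret_doc, key)` is false. Without the key, an insider does not know which positions to inspect or overwrite.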

9. Hidden transmission of the control signal (SPD)

Suppose that the recipient is a system (for example, a satellite); and the sender is the operator. In this case, steganography can be applied to deliver any control signal to the system. If the system can be in different states and we want the enemy to not even realize that the system has moved to another state, we can use steganography. Using only cryptography, without steganography, can give the enemy information that something has changed and provoke him into unwanted actions.

I think no one will dispute that this task is incredibly relevant in the military sphere. It may also be relevant for criminal organizations. Accordingly, law enforcement agencies should be equipped with a solid theory of this issue and should promote the development of programs, algorithms and systems to counter this use of steganography.

10. Steganographic botnet network (SPD)

To be pedantic, this application can be considered a special case of hidden transmission of a control signal. However, I decided to list it separately. A colleague from TSU sent me a very interesting article by Shishir Nagaraja, Amir Houmansadr, Pratch Piyawongwisal, Vijit Singh, Pragya Agarwal and Nikita Borisov, "Stegobot: a covert social network botnet". I am not a botnet specialist, so I cannot say whether it is rubbish or an interesting idea. I would like to hear the opinion of the Habr community!

11. Confirmation of the accuracy of the information transmitted (CO)

The stego message in this case contains data confirming that the container data is correct, for example a checksum or a hash (digest). Confirming accuracy is relevant when the adversary needs to forge the container data; for this reason, this application should not be confused with the protection of the authenticity of documents! For example, for a photo, protection of authenticity is proof that the photo is real and not forged in Photoshop: we are, in effect, protecting ourselves from the sender (here, the photographer). Confirmation of accuracy, by contrast, requires protection against a third party (a man in the middle) who is able to forge data between the sender and the receiver.
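A digest carried inside the container itself could be sketched like this. The assumptions are mine: the container is a raw byte buffer, the first 128 least significant bits carry the tag, and the tag is a keyed digest (HMAC-SHA-256 with a key shared by sender and receiver) rather than a plain hash, because a man in the middle could simply recompute an unkeyed one. The function names are hypothetical.

```python
import hashlib
import hmac

TAG_BITS = 128


def _tag_bits(container: bytes, key: bytes) -> list[int]:
    """Keyed digest of the container with the tag-carrying bits cleared."""
    head = bytes(b & 0xFE for b in container[:TAG_BITS])
    tag = hmac.new(key, head + container[TAG_BITS:], hashlib.sha256).digest()
    return [(tag[i // 8] >> (7 - i % 8)) & 1 for i in range(TAG_BITS)]


def seal(container: bytes, key: bytes) -> bytes:
    """Embed the digest of the container into its own least significant bits."""
    if len(container) < TAG_BITS:
        raise ValueError("container too small")
    out = bytearray(container)
    for i, bit in enumerate(_tag_bits(container, key)):
        out[i] = (out[i] & 0xFE) | bit
    return bytes(out)


def is_intact(container: bytes, key: bytes) -> bool:
    """Recompute the digest and compare it with the embedded bits."""
    if len(container) < TAG_BITS:
        return False
    embedded = [b & 1 for b in container[:TAG_BITS]]
    return embedded == _tag_bits(container, key)


shared_key = b"sender-receiver key"
sealed = seal(bytes(range(256)), shared_key)

tampered = bytearray(sealed)
tampered[200] ^= 0x10  # the man in the middle changes one byte
tampered = bytes(tampered)
```

The receiver accepts `sealed` and rejects `tampered`. Because the tag-carrying bits are cleared before hashing, embedding the tag does not invalidate it.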

This problem has many classic solutions, including cryptographic ones. Using steganography is another way to solve this problem.

12. Funkspiel (“Radio Game”) (SPD)

From Wikipedia :

Definition of Funkspiel

Radio game (a calque of the German Funkspiel, "radio game" or "radio play"): in 20th-century intelligence practice, the use of radio communications to disinform enemy intelligence agencies. A radio game often uses an intelligence radio operator who has been captured and turned by counterintelligence, or a double agent. The radio game makes it possible to imitate the activity of a destroyed or never-existing intelligence network (and thus reduce the enemy's activity in sending in new agents), feed the enemy disinformation, obtain information about the intentions of its intelligence agencies, and achieve other intelligence and counterintelligence goals.

The possibility of failure and the subsequent radio game was taken into account when planning intelligence operations. Various signs in the radiogram were stipulated in advance, by the presence or absence of which one could understand that the radio operator is working under the control of the enemy.

In this case, the stego message contains data indicating whether the container should be taken seriously. It can be a hash value or simply a pre-agreed sequence of bits. It can also be a hash of the transmission start time (in this case, to avoid clock-synchronization problems between sender and recipient, the time should be rounded to minutes or even hours, not to seconds or milliseconds).
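The time-based variant can be sketched as follows. The text speaks of a hash of the start time; the sketch swaps in a keyed hash (HMAC with a secret shared in advance), since otherwise anyone who knows the scheme could compute the label. The one-hour step, the acceptance of the previous time slot, the 8-byte truncation and the function names are all illustrative assumptions.

```python
import hashlib
import hmac


def time_label(shared_key: bytes, unix_time: float, step_seconds: int = 3600) -> bytes:
    """Label for the time slot containing unix_time (rounded down to the step)."""
    slot = int(unix_time // step_seconds)
    digest = hmac.new(shared_key, slot.to_bytes(8, "big"), hashlib.sha256).digest()
    return digest[:8]


def should_trust(label: bytes, shared_key: bytes, received_at: float,
                 step_seconds: int = 3600) -> bool:
    """Accept the label of the current or the previous slot to tolerate clock skew."""
    for offset in (0, -step_seconds):
        expected = time_label(shared_key, received_at + offset, step_seconds)
        if hmac.compare_digest(label, expected):
            return True
    return False


key = b"agreed before the mission"
sent_at = 1_700_000_000.0
label = time_label(key, sent_at)
```

A message received two minutes later, or in the next hourly slot, is still trusted; one received several hours later, or carrying a made-up label, is treated as part of a radio game and ignored.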

If the stego message fails validation, the recipient must ignore the container, whatever its content. In this way steganography can be used to disinform the enemy. For example, the container may be an encrypted message. The sender, wishing to mislead the enemy, encrypts the data with a compromised cryptographic key that the enemy is known to have, and the stego message ensures that the recipient does not accept the false container.

Suppose the enemy is able to destroy the embedded label. In this case, funkspiel can be turned against the interests of the sender: the recipient, not finding the label, will not ignore the received container. In some practical solutions it may be reasonable to use funkspiel together with confirmation of accuracy. In that case, any information without a valid label is ignored; accordingly, for a radio game you simply do not embed the label in the message.

13. Inalienability of information (SVZ)

There are a number of documents for which integrity is important. It can be ensured by backing up the data. But what if documents must be kept in such a way that one piece of information cannot be separated from another? An example is medical images. For the sake of reliability, many authors suggest embedding the patient's first name, surname and other data inside the images. See, for example, the book by Stefan Katzenbeisser and Fabien A. P. Petitcolas, "Information Hiding Techniques for Steganography and Digital Watermarking":

An excerpt about the use of steganography in medicine, from the book "Information Hiding Techniques for Steganography and Digital Watermarking"

(...) the use of hiding techniques. They use standards such as digital imaging (...), such as (...) a physician. It could be a useful safety measure. It is a (...) question (...) revealing that this might be feasible. Another emerging technique is (...) in DNA sequences. It could be used in the field of molecular biology or genetics.

Similar reasoning can be made about modern astronomy. Here is a quote from Russian astronomer Vladimir Georgievich Surdin ( link to video ):

I envy those who are entering science now. For the past 20 years we [astronomers] have, on the whole, been marking time. But now the situation has changed. Several telescopes with absolutely unique capabilities have been built around the world. They see almost the whole sky and collect huge amounts of information every night. Suffice it to say that over the previous 200 years astronomers discovered several thousand objects. (...) That is over 200 years! Today, every night we discover three hundred new objects in the Solar System! That is more than a person could enter into a catalog with a pen. [per day]

Just think: 300 new objects every night. Clearly these are various small asteroids, not new planets, but still ... Wouldn't it be reasonable to embed the shooting time, location and other data directly into the image? Then, when exchanging pictures, astronomers could always tell where, when and under what circumstances a particular picture was taken. The information can even be embedded without a key, on the assumption that there is no adversary. That is, steganography is used only to keep the images "inseparable" from their additional information, relying on the honesty of users; this might be much more convenient than accompanying each image with a separate description.

From the world of computer games, WoW is an example. When you take a screenshot of the game, an SVZ containing the user name, the time the screenshot was taken (to within a minute) and the server's IP address is automatically embedded.

14. Steganographic distraction (?)

As the name implies, the task of steganographic distraction is to divert the attention of the enemy. This task can be set if there is any other reason for using steganography. For steganographic distraction, it is necessary that the generation of stegocontainers be significantly “cheaper” (in terms of machine and time resources) than the detection of steganography by the enemy.

Roughly speaking, steganographic distraction is a bit like DoS and DDoS attacks. You distract the enemy’s attention from containers that really contain something of value.

15. Steganographic Tracking (SPD)

This application is somewhat similar to item 7, the individual fingerprint in an EDMS, only the goal is different: to catch an intruder who leaks information. A real-world example is marked banknotes ("marked money"). Law enforcement agencies use them so that a criminal who has received money for some illegal activity cannot later claim that he had the money before the transaction.

Why not borrow the experience of our "real-world colleagues" for our virtual world? In this sense, steganographic tracking is somewhat like a honeypot.

Forecast about the future of steganography in the first quarter of the XXI century

Having read some fifty articles on steganography and several books, I will venture to express my own opinion. It is only my opinion, and I do not impose it on anyone. I am ready for constructive criticism and dialogue.

Thesis. I believe that the world is technically ready for steganography, but in a “cultural” sense the modern information society has not yet matured. I think that in the near future (2015–2025) something will happen that may later be called the “steganographic revolution” ... Maybe this is a somewhat arrogant statement, but I will try to justify my point of view in four points.

The first. At the moment there is no unified theory of steganography. A perfectly secure stegosystem (according to Cachin) is certainly better than nothing, but in my opinion it is a black-and-white photo of the tail of a spherical virtual horse in a vacuum ... Mittelholzer tried to improve Christian Cachin's results slightly, but so far this is a very loose theory.

The lack of a unified theory is a major brake. It is mathematically proven that the Vernam cipher (the "one-time pad") cannot be broken; for this reason, the link between V. V. Putin and Barack Obama is implemented with precisely this algorithm. Cryptography has a definite theory that creates and studies abstract (mathematical) cryptographic objects (bent functions, LFSRs, Feistel networks, SP-networks, etc.). Steganography has a zoo of terms and models, but most of them are unfounded, not fully understood, or far-fetched.

Nevertheless, there is already some progress in this direction. Modest attempts are being made to use steganography, if not as the main or only solution, then as an auxiliary tool. A huge shift in the theory has occurred over the past fifteen years (2000-2015), but that deserves a separate post; it is hard to cover in a few words.

The second. Steganography is an interdisciplinary science! This is the first thing any novice “steganographer” should understand. While cryptography can abstract itself from the equipment and solve problems purely in the world of discrete mathematics, a specialist in steganography is obliged to study the medium. Of course, building cryptosystems has its own practical problems, for example side-channel attacks; but those are not the fault of the cipher's quality. I think steganography will evolve along with the study of the media in which hidden messages are transmitted. Thus, it is reasonable to expect the appearance of “chemical steganography”, “steganography in images”, “steganography in error correction codes”, ~~“food steganography”~~, etc.

Since about 2008, everyone has realized this. Not only mathematician-cryptographers but also linguists, philologists and chemists have become interested in steganography. I think this is a positive shift that says a lot.

The third. The modern virtual world is oversaturated with texts, pictures of cats, videos and so on and so forth ... On YouTube alone, more than 100 hours of video are uploaded every minute! Just think, every minute! (...) BigData (...) Internet of Things.

. -, , " " . , , ENIAC , " ", , 1938 . 1939 ( ) . Colossus …

, , 2000- . RSA, . , , , , , .

? , : « », « »; , . This is normal. ( ).

- «» (.. ) , 1998-2008 . ( , ). … , / ? , , YouTube ; botnet- -22, .

Source: https://habr.com/ru/post/253045/
