
Elliot Alderson hid secret information in audio CD files. However, the technique used by the fictional hacker from the TV series “Mr. Robot” is far from a mere invention of television writers. It is just one of the many steganography methods that hackers and cybercriminals use to bypass security systems.
Derived from the Greek words steganos (hidden) and graphos (letter), the word steganography refers to the practice of hiding data. To understand how best to deal with this hidden threat, we spoke with Daniel Lerch, who holds a Ph.D. in computer science from the Universitat Oberta de Catalunya (Catalonia, Spain) and is one of the leading experts on steganography in Spain.
Luis Corrons (LC): How would you define steganography? How is it different from cryptography?
Daniel Lerch (DL): Steganography is the study of how to hide information in a media object (an image, audio file, text, or network protocol). While in cryptography the goal is to prevent an attacker from reading the message, in steganography the goal is to hide the very fact that any communication is taking place.
These two sciences are not mutually exclusive. In fact, steganography usually uses cryptography to encrypt a message before hiding it. But their goals are different: not everyone who needs to protect information also needs to hide it. So steganography can be an extra level of security.
LC: Who can benefit more from steganography: cybercriminals or security solution providers?
DL: Undoubtedly cybercriminals. Those responsible for the security of companies and institutions do not need to hide their communications. To ensure their security, cryptography is sufficient.
Steganography is a tool of great interest to different types of criminals, because it allows them to communicate without risk of detection. Typical examples include communication between members of terrorist cells, the distribution of illegal material, the leaking of confidential information, and the concealment of malicious programs or of the commands used to control them remotely.
LC: How has this technique developed lately?
DL: Its development has differed depending on the medium in which steganography is used.
The area that has developed the most is image steganography. Images are very difficult to model statistically, so it is very easy to make changes to them that go unnoticed. For example, a pixel value in a grayscale image can be represented by one byte, i.e. a number from 0 to 255. If this value changes slightly, the human eye may not notice. The problem is that such a change is not easy to detect with statistical image analysis either. Images are a great medium for hiding data, as are video and audio.
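The pixel example can be made concrete with a minimal sketch of least-significant-bit (LSB) embedding, one of the simplest image steganography techniques. It is an illustration only, not a method described in the interview: the image is assumed to be a flat list of grayscale values from 0 to 255, and the names `embed_lsb` and `extract_lsb` are hypothetical.

```python
def embed_lsb(pixels, message):
    """Hide the bits of `message` (bytes) in the lowest bit of each pixel value."""
    # Expand the message into individual bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("cover image is too small for this message")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        # Clear the lowest bit of the pixel, then set it to the message bit.
        stego[i] = (stego[i] & 0xFE) | bit
    return stego


def extract_lsb(pixels, n_bytes):
    """Read `n_bytes` of hidden data back from the lowest bit of each pixel."""
    out = bytearray()
    for i in range(n_bytes):
        byte = 0
        for value in pixels[i * 8:(i + 1) * 8]:
            byte = (byte << 1) | (value & 1)
        out.append(byte)
    return bytes(out)


# Hypothetical 8x8 grayscale cover image, flattened to 64 values.
cover = [(i * 37) % 256 for i in range(64)]
stego = embed_lsb(cover, b"hi")
recovered = extract_lsb(stego, 2)

# No pixel moves by more than one gray level out of 256.
max_change = max(abs(a - b) for a, b in zip(cover, stego))
```

Each pixel changes by at most 1, which is why the modification is invisible to the eye. Real tools usually encrypt the message first and spread the bits across the image rather than writing them into the first pixels.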
Another medium that gets a lot of attention is network protocols. Unlike images, however, network protocols are precisely defined. If we change the information in a network packet, the change can be noticeable, so there is less room for maneuver when hiding data. Yet even though such changes can in principle be detected easily, these techniques can be very effective because of the difficulty of analyzing the huge volume of traffic in today's networks.
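As a rough illustration of how little room a protocol leaves, the sketch below places two bytes of a message in the 16-bit Identification field of an IPv4 header, a textbook example of a protocol covert channel. This is an assumption chosen for illustration, not a technique named in the interview. The function names are hypothetical, the addresses come from the 192.0.2.0/24 documentation range, the checksum is left at zero, and nothing is sent over a network.

```python
import socket
import struct


def build_ipv4_header(src_ip, dst_ip, identification):
    """Pack a minimal 20-byte IPv4 header (no options, checksum left at 0)."""
    version_ihl = (4 << 4) | 5  # IPv4, header length of five 32-bit words
    return struct.pack(
        "!BBHHHBBH4s4s",
        version_ihl,
        0,               # DSCP/ECN
        20,              # total length: header only in this sketch
        identification,  # 16-bit Identification field carrying hidden data
        0,               # flags and fragment offset
        64,              # time to live
        6,               # protocol number for TCP
        0,               # checksum placeholder, not computed here
        socket.inet_aton(src_ip),
        socket.inet_aton(dst_ip),
    )


def hide_in_headers(message, src_ip="192.0.2.1", dst_ip="192.0.2.2"):
    """Split `message` into 2-byte chunks, one per header."""
    if len(message) % 2:
        message += b"\x00"  # pad to a whole number of 16-bit fields
    return [
        build_ipv4_header(src_ip, dst_ip, int.from_bytes(message[i:i + 2], "big"))
        for i in range(0, len(message), 2)
    ]


def recover_from_headers(headers):
    """Read the Identification field (byte offset 4) back from each header."""
    return b"".join(
        struct.unpack_from("!H", header, 4)[0].to_bytes(2, "big")
        for header in headers
    )


headers = hide_in_headers(b"data")
recovered = recover_from_headers(headers)
```

Only two bytes fit per packet, and every other field has to keep its defined meaning. That contrasts with an image, where each pixel can absorb a small change.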
One of the oldest carriers of information, and the least developed in the digital age, is text. However, text steganography may take a significant leap forward thanks to machine learning. With the techniques developed in recent years, hiding information is tedious: the user has to intervene manually to produce a “normal” text that makes sense but contains a hidden message. However, modern advances in deep learning applied to NLP (Natural Language Processing) make it possible to generate increasingly realistic texts, so we may soon see text steganography that is difficult to detect.
LC: In which areas of computer security is steganalysis applied? What techniques are commonly used?
DL: From a corporate security point of view, the main applications are detecting malware that uses steganography to hide itself and detecting malware that attempts to exfiltrate confidential information.
From the point of view of the intelligence services responsible for national security, the main applications of steganalysis are the detection of terrorist or espionage communications.
Although most of the steganography tools that can be found on the Internet are fairly simple and can be detected with simple, well-known attacks, there are no good public tools that automate the detection of steganography in network protocols, images, video, audio, text, and so on.
Maybe this is not yet possible. For example, in image steganography, the advanced techniques currently being researched can be detected only with great difficulty, even using machine learning. In addition, if the information is spread across several carrier objects, which significantly reduces the amount of information in each one, detecting it with current technology becomes almost impossible.
LC: What role do you think steganography will play in the coming years? Will it be used more often as a weapon of attack or as a means of defense?
DL: Steganography as a means of defense is very unusual, although such examples exist: for instance, the extraction of information by activists in a country under a totalitarian regime.
The main role of steganography in the next few years will lie in its use as a tool for hiding malicious programs and for sending them the required control commands. This is already being done, although with the help of rather primitive techniques. Using modern steganography techniques to hide malicious code will significantly complicate detection, forcing security tools to use advanced methods of steganalysis.
LC: What advice would you give to information security specialists who are thinking about using steganalysis?
DL: Most likely, they are interested in malware detection or data exfiltration. The first step is to stay up to date, so as to know which tools exist and when and how to use them. After that, it all comes down to practice: test, and test again, the technologies we deploy using huge amounts of data.
If you use machine learning to perform steganalysis, you should be careful about the data you use to train the system. The model should be able to make predictions on data it has never seen. Errors can arise if the data used for training is also used to validate the model. In machine learning it is often said that a model is only as good as its training data. Therefore, if the training data is not representative, our model will most likely not be reliable. The more data we use to train the model, the less likely it is to perform poorly. Otherwise, we risk ending up with tools that work well only in our laboratory with our test data.
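The point about keeping training and validation data apart can be sketched as follows. This is a minimal illustration under stated assumptions, not a procedure from the interview: the file names are invented, and `split_dataset` is a hypothetical helper that produces two disjoint sets from a list of samples.

```python
import random


def split_dataset(samples, validation_fraction=0.2, seed=0):
    """Shuffle the samples and split them into disjoint training and validation sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_validation = int(len(shuffled) * validation_fraction)
    validation = shuffled[:n_validation]
    training = shuffled[n_validation:]
    return training, validation


# Hypothetical dataset of 100 image files.
samples = [f"image_{i:03d}.png" for i in range(100)]
training, validation = split_dataset(samples)

# Guard against leakage: no sample may appear in both sets.
overlap = set(training) & set(validation)
if overlap:
    raise RuntimeError(f"{len(overlap)} samples appear in both sets")
```

A validation score measured on samples the model was trained on says little about how it will behave on new data, which is the laboratory-only risk described above.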
LC: What role will artificial intelligence and machine learning play in enterprise information security strategies?
DL: One example is the automatic detection of security holes in software. Another is replacing an antivirus program that detects the signatures of known viruses with an artificial intelligence system that identifies viruses based on common characteristics and behavior.
LC: In an environment where there are more and more connected devices, what security measures should be taken to protect the confidentiality of data at the enterprise level?
DL: Security measures on IoT devices should be the same as those applied to other devices connected to the same network. It may seem strange that you need to manage the security of an office air conditioner at the same level as a PC, but from an attacker's point of view, it is as good an entry point to the network as any other.