DarkJPEG: steganography for all

As part of the DarkJPEG project, a new-generation steganographic web service has been developed. It lets you hide confidential information as imperceptible noise in JPEG images, and the information can be extracted only by someone who knows the secret password specified during encoding.

The project aims to support freedom of information for people in countries that violate human rights by censoring the media or legally prohibiting the use of cryptography.

The service uses strong steganography to conceal the very fact that information is hidden, along with strong cryptography to protect data sent over open channels from being compromised (accessed by unauthorized persons). The project source code is distributed under the MIT license.

Key Features:

Using SHA-3 to generate keys;
Symmetric AES-256 encryption;
JPEG (DCT LSB) steganography;
RarJPEG and double-concealment support;
Selection of a random container;
Calculations without server participation;
Guaranteed complete privacy.

Intro

Steganography is the science of covertly transmitting information by keeping the very fact of transmission secret. As a rule, the message looks like something else, such as an image, an article or a letter. Steganography is usually used together with cryptography, complementing it. The advantage of steganography over pure cryptography is that the messages do not attract attention. Messages that are visibly encrypted arouse suspicion and may be incriminating in themselves in countries where cryptography is prohibited. Thus, cryptography protects the content of a message, while steganography protects the very fact that a hidden message exists.

How it all began

The idea of creating an accessible, fast, private and, most importantly, completely free steganographic web service came to me a little less than a month ago. In light of certain amusing laws hastily adopted in this country (and not only this one), it struck me that something was terribly wrong: every person should be able to freely exchange information (no matter what kind) with other people over open and/or unprotected communication channels, without having to study the intricacies of GPG encryption or blow the dust off tools such as steghide, and it should all be cross-platform, with really good usability, "here and now". That is why I got excited about making such a service, just like that, just for fun. And although this is my first experience building such a service, and I am not much of an interface designer, after three weeks of exciting development in the evenings after work it seems to me that everything worked out. But let's take things in order.

Method

To begin with, about a dozen scientific articles were studied, describing which steganography methods exist, how they are implemented, and how they are analyzed and detected. JPEG images were chosen as containers, being the most common type of content on the Internet. As it turned out, some methods easily give themselves away under testing by using non-standard quantization matrices, others fail the histogram test, and still others offer a useful capacity of only about 9-13% of the container size. That is, to hide 500 KB of useful information we would have to find a container of at least 5 MB, which is pretty sad.

As a result, having studied how the fairly new F5 method and the methods based on quantization error work, it was decided to use simple, trivial LSB (least significant bit) embedding, complemented by prior AES-256 encryption of the data. Besides allowing a 256-bit key, the encryption yields a pseudo-random bit sequence at its output, which is exactly what F5 achieves through random permutation of data blocks. Schematically, the method works as follows:

take the 256-bit SHA-3 hash of the password entered by the user plus a randomly generated salt;
encrypt the data plus a header (signature, file name and size) with AES-256, then prepend the salt and append 0xFFD9, the JPEG End of Image marker (why this is done is explained a little below);
select a container of a size suitable for the data and convert its colors from RGB to YCbCr;
apply a discrete cosine transform to each 8x8 pixel block;
write the data, bit by bit, into the two least significant bits of each non-zero coefficient;
compress the coefficients using run-length coding and Huffman codes.
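The embedding step of the scheme above can be illustrated with a minimal sketch. It is not the dark.js code: the function names embedBits and extractBits are hypothetical, and the sketch makes one assumption the text does not state. To stay round-trippable it uses only coefficients whose magnitude is at least 4, so that overwriting the two low bits can never turn a used coefficient into a skipped one; how dark.js itself treats small coefficients is not described here.

```javascript
// Hypothetical helpers for illustration only; not the dark.js API.
// Assumption: only coefficients with |c| >= 4 carry data, so the decoder
// selects exactly the same coefficients as the encoder did.

// Writes the payload bytes (MSB first) into the two low bits of the
// magnitude of each eligible coefficient; the sign is preserved.
function embedBits(coeffs, payload) {
  const out = Int16Array.from(coeffs);
  const totalBits = payload.length * 8;
  let bitPos = 0;
  for (let i = 0; i < out.length && bitPos < totalBits; i++) {
    const mag = Math.abs(out[i]);
    if (mag < 4) continue;                       // skipped by both sides
    let two = 0;
    for (let k = 0; k < 2; k++) {                // take the next two bits
      const bit = (payload[bitPos >> 3] >> (7 - (bitPos & 7))) & 1;
      two = (two << 1) | bit;
      bitPos++;
    }
    const newMag = (mag & ~3) | two;             // replace the two low bits
    out[i] = out[i] < 0 ? -newMag : newMag;
  }
  if (bitPos < totalBits) throw new Error("container too small");
  return out;
}

// Reads byteCount bytes back from the eligible coefficients.
function extractBits(coeffs, byteCount) {
  const payload = new Uint8Array(byteCount);
  const totalBits = byteCount * 8;
  let bitPos = 0;
  for (let i = 0; i < coeffs.length && bitPos < totalBits; i++) {
    const mag = Math.abs(coeffs[i]);
    if (mag < 4) continue;
    const two = mag & 3;
    for (let k = 1; k >= 0; k--) {
      payload[bitPos >> 3] |= ((two >> k) & 1) << (7 - (bitPos & 7));
      bitPos++;
    }
  }
  if (bitPos < totalBits) throw new Error("not enough data");
  return payload;
}
```

Because the payload is AES output, the bits written this way look pseudo-random, which is the property the text relies on.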

As you can see, the JPEG encoding process skips the step of quantizing the coefficients with the corresponding matrices; instead, matrices consisting only of ones are written to the file, thereby simulating compression at 100% quality. On the one hand, this of course significantly inflates the file size (adding useful capacity for data); on the other hand, it reduces suspicion, since such all-ones quantization matrices are not at all uncommon. Hooray! Everything worked out! The resulting file is a fully valid JPEG with about 20% of its size occupied by our data, and we boldly hand it to the user. Decoding follows a similar scheme in reverse.
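To put the capacity figures in perspective, here is a small arithmetic sketch. The function name minContainerBytes is hypothetical, and the percentages are simply the approximate values quoted in the text, not measured constants.

```javascript
// Hypothetical helper: smallest container size (in bytes) able to carry
// payloadBytes when roughly capacityPercent of the container is usable.
function minContainerBytes(payloadBytes, capacityPercent) {
  if (capacityPercent <= 0 || capacityPercent > 100) {
    throw new RangeError("capacityPercent must be in (0, 100]");
  }
  // Integer arithmetic first, so the result is not distorted by 0.2-style fractions.
  return Math.ceil((payloadBytes * 100) / capacityPercent);
}

const payload = 500 * 1024;                       // 500 KB of hidden data
console.log(minContainerBytes(payload, 10));      // 5120000, about 5 MB
console.log(minContainerBytes(payload, 20));      // 2560000, about 2.5 MB
```

At roughly 20% capacity the same 500 KB payload needs a container about half the size required by a method with roughly 10% capacity.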

Gluing and RarJPEG

In fact, it turned out that this method is often overkill: in practice (unless we are hiding something really serious!) it usually suffices to simply append the encrypted data to the end of a JPEG file. This is where the fake End of Image marker that we add manually comes in: it doesn't do much, but it diverts suspicion from the appended tail. File concatenation also offers the fun possibility of creating and processing RarJPEGs: just attach a ZIP or RAR file and skip the encryption step by giving an empty password, and the hidden content can then be accessed by opening the resulting image in almost any archiver (while the picture remains a valid JPEG!).
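The concatenation described above can be sketched as follows. This is an illustration under stated assumptions, not the dark.js implementation: the names findJpegEnd, joinTail and splitTail are hypothetical. The sketch walks the JPEG marker segments to find the real End of Image marker, so that an FFD9 byte pair inside a metadata segment (for example, an embedded thumbnail) is not mistaken for the end of the picture.

```javascript
// Hypothetical helpers for illustration only; not the dark.js API.

// Returns the offset just past the real EOI (0xFFD9) marker of a JPEG.
function findJpegEnd(bytes) {
  if (bytes[0] !== 0xFF || bytes[1] !== 0xD8) throw new Error("not a JPEG");
  let i = 2;
  while (i + 1 < bytes.length) {
    if (bytes[i] !== 0xFF) throw new Error("marker expected at offset " + i);
    const marker = bytes[i + 1];
    if (marker === 0xFF) { i++; continue; }            // fill byte
    if (marker === 0xD9) return i + 2;                 // EOI
    i += 2;
    if (marker === 0x01 || (marker >= 0xD0 && marker <= 0xD7)) continue; // no length field
    i += (bytes[i] << 8) | bytes[i + 1];               // skip the segment body
    if (marker === 0xDA) {
      // Entropy-coded data: FF00 is a stuffed byte, FFD0..FFD7 are restart markers.
      while (i + 1 < bytes.length) {
        const next = bytes[i + 1];
        if (bytes[i] === 0xFF && next !== 0x00 && !(next >= 0xD0 && next <= 0xD7)) break;
        i++;
      }
    }
  }
  throw new Error("EOI marker not found");
}

// Appends the tail (already encrypted data, or a ZIP/RAR archive) to the JPEG.
function joinTail(jpeg, tail) {
  const out = new Uint8Array(jpeg.length + tail.length);
  out.set(jpeg, 0);
  out.set(tail, jpeg.length);
  return out;
}

// Returns everything that follows the picture itself.
function splitTail(container) {
  return container.subarray(findJpegEnd(container));
}
```

Image viewers stop reading at the End of Image marker, while archivers look for their own signatures anywhere in the file, which is why the same file can open both ways.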

What is the result

Thus, three steganography methods are available: auto, join and steg. The default, auto, (by my decision) uses join for encoding, sticking files together (not necessarily with an archive, with anything at all). The only difference is that join alone accepts an empty password, which is what allows creating a RarJPEG; auto and steg do not, for security reasons. There is another tricky feature: a file can be encoded into the container with the steg method, and then something else can be attached to it with the join method. This lets you give up the password for the join part if you are “pushed against the wall” without compromising the steg part, yielding a container with a “double bottom”. By the way, if the picture is modified in any way (cropped, recompressed, etc.), no steganography, alas, will survive by definition: JPEG is a lossy compression format.

Containers

As for containers, there are also three options: rand, grad and image. Here everything is much simpler: rand, used by default, downloads a random image from Wikimedia; grad is used when the first method fails (for example, when there is no Internet connection or the data is too large); image lets the user choose their own picture as the container. There is also an unsafe option, recommended only for owners of weak computers on which nothing works otherwise because of insufficient memory; until then, enabling it is not recommended, for the same security reasons.

Confidentiality

Let us turn to the most interesting and most controversial part: confidentiality and anonymity. Generally speaking, I agree that the expressions “web service” and “confidentiality” in one sentence sound, to put it mildly, rather strange. In fact, things are not so bad. All of the service code runs exclusively on the client, no computation leaves the open browser tab, and no information about user actions is monitored, cached, saved, logged or transmitted in any way. Moreover, a web server is not really needed at all: it is enough to save the project (or clone the repository) to disk and open index.html just like that, without installing any web server. The only catch is that Chrome (or Chromium and its derivatives) should be started with the options --allow-file-access-from-files --disable-web-security, to allow access to the local file system during encoding and cross-domain downloads by link during decoding.

Of course, if we consider extreme cases, such as GitHub being compromised, there is a danger of the scripts being replaced. However, one should not forget that absolutely everything connected in any way with the outside world is subject to this scenario: for instance, the private key of any software maintainer can be stolen and used to sign modified packages, which, by the way, has already been observed. The only difference with a web service is that it runs in a much more restricted sandbox, and the worst that can happen is user tracking and data compromise. Therefore, I recommend to everyone the perfect solution to any paranoia problems: please use my service via Tor (and, of course, do the same when sending encoded content over any communication channel).

Security

I do not know how quickly, in case of urgent need, certain people with gloomy faces would be able to identify users by IP through their provider, but here, as with I2P, only the fact of visiting the service can be proved; tracking the user's actions is almost impossible (unless the person is spied on directly).

As for detecting darkjpegs on various sites, it is somewhat difficult with the join method and, generally speaking, quite difficult with steg. For example, only a resource-intensive chi-square test is suitable for detecting the latter, so there is not much to worry about.

If someone wants to decrypt the encoded data without knowing the password, it should be remembered that the cipher uses a 256-bit key. Unless the password is something like qwerty, 123 or another easily guessed dictionary word, the only option left is to visit the sender with a hot soldering iron (and the sender still has to be found, which is far from trivial when Tor is used; please use Tor), since a brute-force search lasting trillions of years seems a doubtful prospect.

App Engine Support

Perhaps an equally controversial part is the use of Google App Engine to fetch images when a link is entered instead of a file. Since we have no server (GitHub Pages, which can only serve static pages, does not count), we need some way to download pictures (for decoding) from other sites. Browsers prohibit cross-domain downloads unless the server we are downloading from explicitly allows them, and this restriction cannot be bypassed directly. If we still very much want cross-domain loading, there are four workarounds:

saving the file to disk manually and then selecting it from the local file system: inconvenient;
using a local copy of the project opened from the local file system (file:///home/...): hardly worth it;
automatic use by the service, when a direct download fails, of two proxy services on the App Engine platform: convenient, but limited to 2 GB of transferred data per day;
using the ready-made browser extensions, available from the main page of the project, to decode images directly on any site by simply clicking on them through the context menu: ideal.

However, despite the paranoia that the word “Google” provokes, the proxy service is just as home-made as the rest: it saves nothing and caches nothing, but simply accepts a base64-encoded link and returns the content with an additional CORS header. It is used only if cross-domain downloads on the target site are either disabled or simply not supported. The source of the “hugs” (the services hugs-01 and hugs-02) is also on GitHub.

For developers

The core of the project, dark.js, can be used in any third-party project. It is designed as an asynchronous web worker and accepts the following requests:

- {action: "encrypt", name: "file.ext", pass: "password", buffer: ArrayBuffer}
- {action: "encode", method: "join", width: 0, height: 0, buffer: ArrayBuffer}
- {action: "encode", method: "auto | join | steg", width: image.width, height: image.height, buffer: ImageData}
- {action: "decode", method: "auto | join | steg", buffer: ArrayBuffer}
- {action: "decrypt", pass: "password"}

Replies are handled by the worker.onmessage function and look like this:

- {type: "encrypt", size: encrypted}
- {type: "encode", time: duration, isize: res.length, csize: enc.length, rate: 100*isize/csize, buffer: ArrayBuffer}
- {type: "decode", time: duration, isize: res.length, csize: dec.length, rate: 100*isize/csize}
- {type: "decrypt", name: "file.ext", buffer: ArrayBuffer}
- {type: "progress", name: "encrypt | decrypt | encode | decode", progress: percent}
- {type: "error", name: "encrypt | decrypt | encode | decode", msg: message}
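A possible way to drive the worker with these messages is sketched below. The message shapes are the ones listed above; everything else is an assumption: the helper names attachHandlers and decodeFile are hypothetical, and the order of calls (decode first, then decrypt) is inferred from the request list rather than documented.

```javascript
// Hypothetical helpers for illustration only; they are not part of dark.js.

// Routes each reply from the worker to a handler chosen by reply.type.
function attachHandlers(worker, handlers) {
  worker.onmessage = (event) => {
    const reply = event.data;
    const handler = handlers[reply.type];
    if (handler) handler(reply);
  };
}

// Assumed flow: "decode" pulls the hidden data out of the container,
// then "decrypt" turns it back into the original file.
function decodeFile(worker, containerBuffer, password, onDone, onError) {
  attachHandlers(worker, {
    decode: () => worker.postMessage({ action: "decrypt", pass: password }),
    decrypt: (reply) => onDone(reply.name, reply.buffer),
    error: (reply) => onError(new Error(reply.name + ": " + reply.msg)),
  });
  worker.postMessage({ action: "decode", method: "auto", buffer: containerBuffer });
}

// Browser usage (assumes dark.js is served next to the page):
//   const worker = new Worker("dark.js");
//   decodeFile(worker, buffer, "password",
//     (name, data) => console.log("recovered", name, data.byteLength),
//     (err) => console.error(err));
```

A "progress" handler can be added to the same table to drive a progress bar; replies without a handler are simply ignored.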

The exact format of the encoded file is as follows:

- container: [ JPEG <+> encoded data ] or [ JPEG ][ encoded data ]
- encoded: [ 16-bit encryption salt ][ AES256 encrypted data ][ 0xFFD9 ]
- encrypted: [ 0x3141593 ][ 32-bit file size ][ 16-bit file name length ][ UTF-16 file name ][ DATA ][ zero padding ]
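The innermost layer of this format (the data that gets encrypted) can be sketched as follows. Several details are assumptions, because the text does not specify them: little-endian byte order, a 32-bit signature field, the name length counted in UTF-16 code units, and zero padding up to a multiple of the 16-byte AES block. The names packPlaintext and unpackPlaintext are hypothetical.

```javascript
// Hypothetical helpers for illustration only; not the dark.js code.
// Assumptions: little-endian fields, 32-bit signature, name length in
// UTF-16 code units, padding to a multiple of 16 bytes (the AES block size).
const SIGNATURE = 0x3141593;

function packPlaintext(name, data) {
  const headerLen = 4 + 4 + 2 + name.length * 2;
  const paddedLen = Math.ceil((headerLen + data.length) / 16) * 16;
  const buf = new ArrayBuffer(paddedLen);          // zero-filled, so padding is free
  const view = new DataView(buf);
  let off = 0;
  view.setUint32(off, SIGNATURE, true); off += 4;
  view.setUint32(off, data.length, true); off += 4;
  view.setUint16(off, name.length, true); off += 2;
  for (let i = 0; i < name.length; i++) {
    view.setUint16(off, name.charCodeAt(i), true); off += 2;
  }
  new Uint8Array(buf, off, data.length).set(data);
  return new Uint8Array(buf);
}

function unpackPlaintext(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  if (view.getUint32(0, true) !== SIGNATURE) {
    throw new Error("bad signature (wrong password or not a DarkJPEG payload)");
  }
  const size = view.getUint32(4, true);
  const nameLen = view.getUint16(8, true);
  let name = "";
  for (let i = 0; i < nameLen; i++) {
    name += String.fromCharCode(view.getUint16(10 + 2 * i, true));
  }
  const start = 10 + 2 * nameLen;
  return { name, data: bytes.slice(start, start + size) };
}
```

The signature at the start is what lets a decoder tell a successful decryption from garbage produced by a wrong password.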

Read at your leisure

Summary

works only in modern browsers (Safari 6, Chrome 25, Firefox 21, Opera 15 and all newer versions);
yes, Opera 12.xx, alas, is not supported, because Presto lacks many HTML5 features (such as Blob URLs, the download attribute, transferable objects, etc.);
the working scripts are minified with UglifyJS; to check for the absence of backdoors you can run git clone, then make, and then diff the result against what was already in the build directory before building: everything is served to the client from there, directly from GitHub;
all computation really is performed on the client, without the slightest involvement of the server; you can save the project to disk and simply open index.html without any Apache, and everything will work;
yes, everything really is confidential: nothing is saved anywhere, nothing is tracked, no logs are kept, and the GitHub Pages server can only serve static pages;
there are three encoding methods: auto, join and steg; the default is auto, which basically uses join;
join sticks files together: the size of the inserted data is limited only by common sense, and the security is average;
steg uses genuine DCT LSB steganography: the allowable size of the inserted data is about 20% of the container size, and the security is high;
the difference between auto and join is that join allows encoding with an empty password, which makes it possible to create and process RarJPEGs;
there are three types of container: rand, grad and image; the first selects a random image of suitable size from Wikimedia, the second is used when the first is impossible, and the third allows using any user-specified image loaded from the local file system;
files can be attached from the local file system by pressing Enter, by clicking the plus sign, by drag-and-drop, or by URL;
all the multi-colored arrows (and not only they) are clickable;
the "darkjpeg" caption is also clickable and acts like refreshing the page without reloading it;
when a URL is specified for decoding, two proxy services on Google App Engine are used because of CORS restrictions;
for development you can use dark.js, which contains JavaScript implementations of a JPEG encoder and decoder, AES-256 and SHA-3, and is implemented as an asynchronous web worker;
how my service is used, whether for posting ponies on imageboards or for coordinating activists, is on your conscience; I allow everything.

License

This software is provided "as is", without warranty of any kind, express or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose and noninfringement. In no event shall the authors or copyright holders be liable for any claim, damages or other liability, whether in an action of contract, tort or otherwise, arising from, out of or in connection with the software or the use or other dealings in the software.

Thanks

Emily Stark, Mike Hamburg, Dan Boneh, Stanford University for their implementation of AES-256 in JavaScript;
Chris Drost for his implementation of SHA-3 Keccak;
Yury Delendik, Brendan Dahl, notmasteryet for parts of their JavaScript JPEG decoder;
Andreas Ritter for his amazing JavaScript port of a JPEG encoder;
Dan Gries for his examples of very beautiful fractal gradients;
Brsev for the gear icon from his Token Dark set;
Fabrizio Panattoni for his Premade Background 019;
My girlfriend for inspiration;
You, for reading; please don't judge too harshly :3

Source: https://habr.com/ru/post/187402/
