
Hash steganography in datasets. This time, fast

Happy Friday, everyone! In my last post about hash steganography, I proposed a different approach to steganography: instead of injecting any information into the container, simply arrange the containers in the right order and thereby transmit the hidden information. Two days ago, romabibi published a proof of concept for hash steganography in the social network VKontakte.


However, using pictures as containers has an important flaw. I quote the comment by alekseev_ap:


All this is very interesting, but the efficiency of such a system is extremely low. How many dozens (or even hundreds) of kilobytes have to be sent to transfer a string of several words?

Indeed, if an image weighs roughly 0.5 to 2 MB and each image carries 1 to 3 nibbles, the resulting throughput is very small: from 0.5 to 6 nibbles per MB, that is, roughly 0.25 to 3 B/MB.


Therefore, for practical use, we need to find a container that has the following properties:


  1. it is very small;
  2. a large number of containers placed next to each other does not arouse suspicion;
  3. changing the order of the containers does not arouse suspicion.

So Captain Obvious has a solution: implement hash steganography in large datasets. One line, one nibble.



GIF animation showing the essence of hash steganography in datasets. Of course, in practice the message must be compressed and encrypted before steganography.



Idea


The idea is simple and obvious:


  1. Take a very large dataset.
  2. Hash each line and take the first n bits of the hash: this gives the set of containers for hash steganography.
  3. Compress and encrypt the message, then split it into blocks of n bits.
  4. Arrange the lines in the order dictated by the message being transmitted.
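
The steps above can be sketched in a few lines of Python. This is only an illustration, not the code of chs.py: the choice of SHA-256, the fixed n = 4, and the function names `nibble_of`, `to_nibbles`, `embed` and `extract` are all assumptions made for the example, and compression and encryption are left out.

```python
import hashlib
from collections import defaultdict


def nibble_of(line):
    """First 4 bits of SHA-256(line) as an integer 0..15 (hash choice is an assumption)."""
    return hashlib.sha256(line.encode("utf-8")).digest()[0] >> 4


def to_nibbles(payload):
    """Split bytes into 4-bit blocks, high nibble first."""
    nibbles = []
    for byte in payload:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)
    return nibbles


def embed(donor_lines, payload):
    """Pick donor lines so that their hash nibbles spell out the payload."""
    buckets = defaultdict(list)
    for line in donor_lines:
        buckets[nibble_of(line)].append(line)
    stego = []
    for nib in to_nibbles(payload):
        if not buckets[nib]:
            raise ValueError("donor has no unused line for nibble %d" % nib)
        stego.append(buckets[nib].pop())  # each donor line is used at most once
    return stego


def extract(stego_lines):
    """Re-hash every line and glue the nibbles back into bytes."""
    nibs = [nibble_of(line) for line in stego_lines]
    return bytes((hi << 4) | lo for hi, lo in zip(nibs[0::2], nibs[1::2]))


if __name__ == "__main__":
    donor = ["row-%d" % i for i in range(2000)]  # stand-in for a real dataset
    stego = embed(donor, b"hi")
    print(len(stego), extract(stego))
```

Note that nothing is written into the lines themselves: the receiver only needs the same hash function and the order of the lines. In this sketch unused donor lines are simply dropped, which a real tool may or may not do.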

CSV donor example


As an example, take the CSV file with city coordinates, world-cities.csv.


Each line contains the fields listed in the file header: name, country, subcountry, geonameid.



On average, one record is 33 bytes long.
That is for my dataset. You can take another one as a "donor"; the order of magnitude will be the same.
If we transmit one nibble (4 bits) per record, the overall steganographic throughput is as much as 16,000 B/MB, which is three orders of magnitude (sic!) more than hash steganography with pictures!
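
The throughput estimate is easy to reproduce. The snippet below is just the arithmetic, using the figures from the text (33 bytes and 4 bits per record; a 0.5 MB picture carrying 3 nibbles as the best case for images). The helper name `throughput_bytes_per_mb` and the choice of 1 MB = 1024 * 1024 bytes are assumptions for the illustration.

```python
import math

MB = 1024 * 1024  # assumption: binary megabyte


def throughput_bytes_per_mb(hidden_bits, container_bytes):
    """Hidden bytes transferred per megabyte of containers sent."""
    containers_per_mb = MB / container_bytes
    return containers_per_mb * hidden_bits / 8


# One nibble hidden in each 33-byte CSV record.
csv_rate = throughput_bytes_per_mb(4, 33)

# Best case for pictures: 3 nibbles (12 bits) in a 0.5 MB image.
image_rate = throughput_bytes_per_mb(12, 0.5 * MB)

print("CSV:    %.0f B/MB" % csv_rate)
print("images: %.2f B/MB" % image_rate)
print("gain:   about 10^%d" % math.floor(math.log10(csv_rate / image_rate)))
```

With a decimal megabyte (10^6 bytes) the CSV figure comes out slightly lower, but the order of magnitude does not change.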


CHS


The example is called CHS (CSV Hash Steganography).


Generate a CSV file with the message:


$ python3 chs.py -m ", !" -i data/world-cities.csv -o stego.csv 

Extract the message:


 $ python3 chs.py -i stego.csv 

When generating and extracting, you must of course specify the same password.


Example

Generation


 ~$ python3 chs.py -m ", !" -i data/world-cities.csv -o stego.csv
 Run chs 2018-03-23 09:33:03.242100
  : 12345
 header:: 'name,country,subcountry,geonameid'
 65 --> 'Soignies,Belgium,Wallonia,2786420'
 129 --> 'Lagoa do Itaenga,Brazil,Pernambuco,3396769'
 196 --> 'Dubai,United Arab Emirates,Dubai,292223'
 138 --> 'Qarqīn,Afghanistan,Jowzjān,1129516'
 94 --> 'Arroyo Seco,Argentina,Santa Fe,3865385'
 44 --> 'Shahrak,Afghanistan,Ghowr,1125896'
 48 --> 'Palpalá,Argentina,Jujuy,3842190'
 235 --> 'Lashkar Gāh,Afghanistan,Helmand,1134720'
 39 --> 'Karukh,Afghanistan,Herat,1137807'
 23 --> 'Uíge,Angola,Uíge,2236568'
 166 --> 'La Paz,Argentina,Entre Rios,3432079'
 240 --> 'Monte Caseros,Argentina,Corrientes,3430598'
 121 --> 'Berat,Albania,Berat,3186084'
 48 --> 'Amstetten,Austria,Lower Austria,2782555'
 206 --> 'Ansfelden,Austria,Salzburg,3323063'
 101 --> 'Kuçovë,Albania,Berat,3185060'
 43 --> 'Morayfield,Australia,Queensland,2156934'
 198 --> 'Río Ceballos,Argentina,Cordoba,3838902'
 9 --> 'Esperanza,Argentina,Santa Fe,3856022'
 168 --> 'Goris,Armenia,Syunik Province,174895'
 119 --> 'Posadas,Argentina,Misiones,3429886'
 187 --> 'San Miguel de Tucumán,Argentina,Tucumán,3836873'
 89 --> 'San Pedro,Argentina,Jujuy,3836772'
 61 --> 'Mādārīpur,Bangladesh,Dhaka,1337245'
 1 --> 'Caxito,Angola,Bengo,2242001'
 13 --> 'Tres Isletas,Argentina,Chaco,3833794'
 192 --> 'Nivelles,Belgium,Wallonia,2790101'
 25 --> 'Fier,Albania,Fier,3185672'
 5 --> 'Botevgrad,Bulgaria,Sofiya,733014'
 239 --> 'Ārt Khwājah,Afghanistan,Takhār,1148106'
 41 --> 'Masis,Armenia,Ararat Province,616435'
 178 --> 'Schwechat,Austria,Lower Austria,2765388'

Extraction


 ~$ python3 chs.py -i stego.csv
 Run chs 2018-03-23 11:34:12.443084
  : 12345
  :', !'

Nuances


Can the steganography be detected? The most delicate point is the "donor CSV". Ideally, you generate it yourself and destroy it after each use. If unique data is used for every transfer and the message is protected by a reliable cryptosystem before steganography, the hash steganography system can be considered reliable.


Also, the CSV you use should preferably not imply any natural ordering. For example, mouse-track data is naturally ordered by time. As for the world-cities.csv file, it would probably be logical to sort it either by country or alphabetically by city. (By the way, the file is sorted by city ;))
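
Before choosing a donor, it is worth checking whether the file has an obvious ordering that rearranging the lines would destroy. Below is a minimal sketch of such a check. The function name `sorted_columns` is made up for the example, values are compared as plain strings (so numeric columns are compared lexicographically), and every row is assumed to have as many fields as the header.

```python
import csv
import io


def sorted_columns(csv_text):
    """Return the header names of columns whose values are already in ascending order."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    header, body = rows[0], rows[1:]
    result = []
    for index, name in enumerate(header):
        column = [row[index] for row in body]
        if column == sorted(column):  # plain string comparison (assumption)
            result.append(name)
    return result


if __name__ == "__main__":
    sample = "name,country\nBerat,Albania\nDubai,United Arab Emirates\nFier,Albania\n"
    print(sorted_columns(sample))
```

If the function reports any column for the donor, a stego file built from it will visibly break that ordering, so such a donor is a poor choice.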


Sources


Published on GitHub: https://github.com/PavelMSTU/CHS


This is a proof of concept. There is no foolproofing and no pretty GUI.
Thanks for your attention.


Spelling is not my forte. If you see an error, don't be lazy: send me a private message, please.



Source: https://habr.com/ru/post/339432/

