Hello, dear inhabitants of Habr! In this publication (and, most likely, the cycle) I will talk about my implementation of one of the encryption algorithms. Why about implementation? Because the idea is not new, and it is impossible to assert that the idea belongs to me. But the method is quite interesting, so it’s worth finding out about it.
In this part I will briefly describe the principle of operation of the algorithm itself and my implementation.
Essence of the work
This section will describe the encryption algorithm. To describe it, we will answer some questions:
- What do we encrypt? ( input )
')
We will encrypt strings, or rather each individual character. Although with the same success coding can be all integer numbers. You can encrypt and fractional, but due to their special presentation in the computer temporarily discard them.
- What will we return? ( output )
Having processed the input string, we will return it in encrypted form. At the same time, the length and encoding of the string remain unchanged, which is quite convenient.
What happens inside? This is the most interesting! We feed each character of the input string to the
elementary cellular automaton ! Why he?
- It's simple. The algorithm of the elementary cellular automaton is simple to understand and implement.
- It is fast. Even my (not the most successful) implementation works quickly enough (we will be convinced of it, having finished work).
How does this happen? Yes, very simple!
- We split the input string into characters
- The code of each character is divided into binary digits.
- Write this string of ones and zeros in the field of the cellular automaton.
- We calculate the evolution of the cellular automaton using the state transition rule and at some moment return the current state of the field to the caller.
- Transform this state back to character code.
- "Push" the character into the output string
Actions 2-6 are repeated for each character and the output line is ready, we return it.
Implementation
The link on GitHub I will give in the end of the description.
To implement my idea, the object-oriented programming language C ++ was used. For clarity and convenience, all entities identified in the first part were implemented as separate classes. They received logical names - Field and Rule.
Class rule
This is an auxiliary class that stores information about the state transition rule and contains functions for working with the rule.
Data
Each instance has only cell states stored in std :: vector in accordance with each neighborhood case. Why std :: vector? Because for the bool type there is a specialization of this container. At the same time, only 1 bit is allocated for each element, instead of a whole byte. Due to this, the amount of occupied memory is reduced by 8 times.
Functions
- Constructor. The input is a constant unsigned integer of type size_t. This number is checked for belonging to the numerical segment [0; 255]. If the check is successful, it is divided into binary digits, which are stored in an internal bool type vector. Otherwise, an out_of_range exception is thrown.
- The index operator. It takes an unsigned integer such as size_t. If the number is in the numerical segment [0; 7], then an element of the bool type vector located at the given index is returned. Otherwise, we again generate the out_of_range exception. This number is the neighborhood of the cell. In this neighborhood, the transition rule matches the state of the cell in the middle in a future generation.
The class is easy to use and understand. But here there is a topic for discussion. Since it is used only in conjunction with an instance of the Field class, so why not make it internal? I do not have a clear answer to this question. And really, why not? You can make it private so that the objects are constructed as needed. It is tempting, but at the moment I think that he is quite independent. After all, you can define in advance several separate transition rules and encode strings with them that require different encryption methods. I do not want to deprive programmers of this convenience, so in my implementation the type Rule is declared separately from the type Field.
Field class
This is the main type in which the functions that directly perform the encoding are defined.
Data
- Rule of state transition. An instance of the Rule class that stores a state transition rule. The class interface is described above.
- The current state of the field. An instance of the class std :: vector <bool>, which stores the current state of all cells of the field. The vector is initialized by the binary bits of the encoded character in the constructor and is changed when the operator is called (), in which the encoding occurs.
- Auxiliary field. An auxiliary instance of the std :: vector <bool> class that stores the state of all the cells in a field. Used when miscalculation of the next generation of cells. It would be possible to define it directly in the () operator, but this operator is used quite often, so it is more practical to define the vector separately.
- Field size The size_t constant, which stores the size of the cellular automaton field, is equal to 8, since at the moment only ASCII encoding is supported.
Functions
- Constructor. Accepts a constant reference to an object of type Rule and a constant of type unsigned char. Since the constructor of the Rule class is not declared explicit, we can send an integer instead of a reference — a transition rule number, written in the form of a Wolfram code (see the link at the beginning). The transmitted character is divided into binary digits, written in order in the vector that stores the current state of the field.
- Operator (). This is where the character is encoded. The next generation is calculated (for each cell, we look at its neighborhood and, using the transition rule, we recognize the state of the cell in the middle in the next generation). The field is closed in a ring; therefore, we separately consider the state of the cells along the edges of the field. After this, the field is transferred back to the decimal number system, and we get the character code, which we return from the function.
Some tests
You can check the functionality of the program yourself at the link below. Here I will give some facts about performance that seemed interesting to me. Remember, they characterize only this program, but in no case the algorithm itself. Let's start!
To encrypt all Unicode characters using rule 110, my system took 645,000 ticks or approximately 0.645 seconds (average values for 10 starts).
To encrypt a single Unicode character encoding using rule 110, it took 80 ticks or approximately 0.00008 seconds (average values for 10 starts).
All Unicode characters using all 256 rules were encrypted in 133500000 ticks or approximately 134 seconds (average values for 10 runs).
But the strangest thing is that you get the right results! Checked manually by personally encrypting some numbers.
Conclusion
In my opinion, it turned out quite well and clearly. From mistakes you can not escape, but who is not without sin. The implementation works correctly, does its job. Of course, it is still too early to use it, much is not fully implemented, but I threw you some food for thought and I hope that your attempts to implement this algorithm will soon appear.