I have long had the idea to implement something interesting related to quineas. And finally it turned out to bring this thing to the end. I once saw the implementation of the game "Life" in the source code: Self Printing Game of Life and I wanted to do something similar. After some thought, I remembered that there are ASCII graphics . But it would be even more interesting and more difficult to make animation in ASCII graphics. It did not take long to find an example, almost immediately the Star Wars animation was found, more precisely, 4 episodes of “New Hope” called Asciimation . The work of the Japanese programmer Yusuke Endoh "Quine Relay" also spurred me to implement this idea. In the process of implementing such a “film”, many other problems had to be solved, which, however, was not to be regretted.
The resulting source is: gist . To make it easier to “view,” it also contains a script that will twist the entire animation. The sources of the entire project for its generation, the interface and other utilities are laid out on the githaba: Freaky-Sources (The quine-palindromes, which I already wrote about, are included there). It is clear that such a thing does not carry any practical benefits, it is just just for fun.
Introduction
Animation in a quine can only be output in one way: output each subsequent frame to the console at each compilation. Accordingly, the speed of the “movie” in this case depends on the size of the code (the larger the code, the longer it will compile and flip through the console), the decompression algorithm, the compilation speed (of iron). Moreover, the code is mainly occupied by compressed animation frames.
The aforementioned “Game of Life” uses strings as the field, which are initialized and displayed at the beginning of the program source code. Such an approach is valid if the amount of code is small, and it will always be visible during the compilation process. However, in my case, the code size is a few hundred kilobytes, and therefore frames must be output at the end so that they can be seen. It was decided to display the frames not in the form of real lines, but in the form of single-line comments (multi-line ones can also be used), which made it possible to output them without unnecessary tail characters (the final curly bracket) and quotes. ')
Since I have a low IQ, I do not have special abilities and cannot operate with a large number of parameters in my head, I decided to automate the process of writing a quine to the maximum. In addition, it allowed for a better understanding of the Quine structure and gaining experience in related fields (parsing and processing C # source code using the NRefactory).
Quine generation stages
Thus, I decided to break the process of writing any quine into the following steps:
Code generation
Data generation
Code Minification.
Formatting code.
Quine generation
All these steps are performed for a specially crafted template. By “code,” as applied to star wars, is meant a code that unpacks compressed data. And the data is the packed array of animation frames. Minification is needed to make the code unreadable to reduce the text size of the code.
The truth is, at the beginning, a version with the usual .NET GZipStream class was written to decompress the compressed data. But I thought it was not sporty, and besides, such a class exists only for this platform. Thus, I wrote a more complex version with my own archiver.
In order to make it easier to navigate in these stages, I wrote the interface on WinForms.
At the top left there is a text field in which a quine template with all markers is entered. Their purpose is discussed below. The first four stages are performed in this field.
At the bottom left are various parameters and buttons for performing the above steps. Also, the code of the template or the resulting Quine can be immediately checked for errors using compilation.
At the top right of the field, which is filled with the quine code, generated at the last stage.
At the top right of the field, the next iteration of the Quine is displayed after compilation. Also, the “Console Output -> Output” button with the number of iterations is needed in order to loop the Quine compilation process and see the frame-by-frame animation (well, or something else).
To highlight C # syntax, the BigObfuscatorFastColoredTextBox component wasused . Unfortunately, it works rather slowly, so the resulting “movie” is more convenient to look through a regular console (although it, unfortunately, is also not very fast).
Quine template
The template is the starting point of such a complex quine, and it may look like this:
Quine template
using System; using System.Text; using System.Collections.Generic; namespaceAsciimation_1_3 { classProgram { /*#Asciimation_1_3*//*Asciimation_1_3#*//*#DecodeBase64*//*DecodeBase64#*//*#HuffmanTree*//*HuffmanTree#*//*#HuffmanRleDecode2*//*HuffmanRleDecode2#*//*#Enums*//*Enums#*//*#Utils*//*Utils#*/staticstring Data = /*%Data_1_3*/""/*Data_1_3%*/; staticint CurrentFrame = /*$CurrentFrame*/0/*CurrentFrame$*/; staticvoidMain() { var output = Decompress_v_1_3(Data, CurrentFrame++); if (CurrentFrame == 3591) CurrentFrame = 3590; /*$@$*/ } } } /*$Output_1_3$*/
As you can see, in the above code a bunch of incomprehensible comments. But in reality, they are markers that indicate where a particular code or data should be inserted. The advantage of this approach is that the code can be inserted from other full versions of classes, and they, in turn, can be tested (I use NUnit). I classified these markers as follows:
/*#...*/… /*...#*/ - Code. Code sections between these markers are searched for files from the specified directory.
/*%...*/… /*...%*/ - Data. Between these markers, data generated using certain methods is inserted, which should be presented in string form (well, a regular string, an array of strings, an array of bytes, integers, etc.). Generally speaking, you can retrieve information about a method from data markers, look for it in a compiled assembly, and call it using reflection. But I have decided so far to limit myself to manually calling the necessary methods, because this is a more universal approach (it is easier to port into other languages).
/*$...*/… /*...$*/ - Options. These markers indicate the code that changes with each subsequent iteration of the Quine display. They are indicated in a special table, in which the replacement code is indicated. In this case, the parameters are the current frame number (it is incremented each time) and the frame output field at the bottom, implemented using single-line comments. Some parameters are introns , i.e. lines that do not affect the code in the next iteration. In this case, the intron is an output field.
/ * @ * / - Print. This is a special marker into which the code itself is inserted to output the entire quine. It exists only in a single instance. Replacement occurs at the very end, after generating the code and data in the previous steps. It is important to note that in order to prevent duplication of the data section (in order not to print them in the quine itself and in this line), at this stage there is a replacement not on the data itself, but on the parameters (i.e. instead of "asdf" contained in output, output '' '+ output +' "').
Thus, combining information about the stages and markers, you can build the following graphical scheme: It is worth noting that the minifier leaves the parameters of the quine itself unchanged, otherwise a situation will arise in which the parameter will be assigned a different name, and because of this, the quine will not compile or will output what is not necessary. (True, the parameters are still minified, but in a more cunning way).
More details about all the stages listed below.
Code generation
When generating the code, all source files from the specified folder are analyzed. And the code between the markers / * # ... .. * / ... / * ... ... # * / is copied to the appropriate places in the original template. For example, in the following example, the HuffmanRle2 class is entirely copied to the area between / * # HuffmanRleDecode2 * // * HuffmanRleDecode2 # * / . In order to ignore some code fragments inside these markers (for example, .ToString () methods are not needed in Quine, they are needed for debugging) and not to split this fragment into two, markers were entered / * # Ignore * // * Ignore # * / , which allow you to ignore all the code inside them.
An example of markers in the source code
/*#HuffmanRleDecode2*/publicclassHuffmanRle2 { publicstaticbyte[] Decode(HuffmanTree tree, byte[] bytes, refint curBit, int bytesCount, int bitsCountPerRepLength = 8, int bitsCountPerNotRepLength = 8) { int minLength = Math.Min(bitsCountPerRepLength, bitsCountPerNotRepLength); var maxCount = 1 << (minLength - 1); var root = tree.Root; var result = new List<byte>(bytes.Length * 2); int curBytesCount = 0; int i = 0; while (curBytesCount < bytesCount) { var length = Utils.GetInt(bytes, ref curBit, minLength); if ((length & maxCount) == 0) { curBit -= minLength; length = Utils.GetInt(bytes, ref curBit, bitsCountPerRepLength); var repeatCount = length + 2; bytevalue = Utils.GetValue(root, bytes, ref curBit); for (int j = 0; j < repeatCount; j++) { result.Add(value); curBytesCount++; } } else { curBit -= minLength; length = Utils.GetInt(bytes, ref curBit, bitsCountPerNotRepLength); var notRepeatCount = (((1 << (bitsCountPerNotRepLength - 1)) - 1) & length) + 1; for (int j = 0; j < notRepeatCount; j++) { bytevalue = Utils.GetValue(root, bytes, ref curBit); result.Add(value); curBytesCount++; } } i++; } return result.ToArray(); } } /*HuffmanRleDecode2#*/
Although the code collected from different files may not be optimal from the point of view of the criterion of the minimum length, since it was written in the standard style, however, this approach allowed us to write and run tests (NUnit) for compressing and extracting Quine data immediately before its generation, which simplifies the development process and makes it less tedious. Although code minification, which is described below, allows to virtually neutralize the disadvantage of this approach.
Data generation
The data compression used the simplest lossless compression algorithms, such as RLE and Huffman coding. However, this animation, like any movie, can be divided into main and intermediate frames. Intermediate frames contain information that has changed since the last frame, while the main frames contain all information about the frame. Thus, in order to get all the intermediate frame characters, you must decode all previous frames to the main frame, and then render all frames to the current frame, taking into account the changes received. It turned out such a pseudo-MPEG. The structure of the main and intermediate frames is presented below. In addition to normal and intermediate frames, I decided to also include intermediate frames obtained by shifting to the left, right, up and down, since they are not so rare.
00 - one char: 2..12 - position 12 ... - huffman code
01 - line horizontal: 2..12 - position 12..19 - chars count 19 ... - huffman codes
10 - line vertical: 2..12 - position 12..16 - chars count 16 ... - huffman codes
Unfortunately, the final compression was not as big as we would like, but quite acceptable. And I would like to at the level of lzma in 7-zip, i.e. less than 100 Kb. Most likely, with the help of more advanced methods (interval coding, vocabulary methods), an even greater degree of compression could be achieved.
Code Minification
Minification is needed in order to reduce the size of the code in text format. For processing C # code, the NRefactory library was used . Its advantage over Roslyn is that the parse tree can be modified in the process, which is required in the task of the minifator.
The shortening of the names of classes, methods, fields, local variables
The most important task of the minifier was precisely this stage. To bypass the constructed syntactic tree, the “Visitor” pattern was used, implemented as an overload of the necessary methods of traversing the nodes during a depth crawl.
In order to use the minimum number of identifiers on the one hand, and on the other, so that they do not overlap and do not lead to compilation errors, the shortening algorithm was divided into the following steps:
Minification of local variables. At this stage, the list of all names of local variables , the arguments of the methods from their definitions and the corresponding nodes in the tree are aggregated. Each variable is associated with a specific method. After that, for each old identifier within one method, a new one is replaced.
Minification of member types. At this stage, the list of all names of fields , methods , properties , enumeration members , events and corresponding nodes in the tree is aggregated. Each member is associated with a specific type ( class , structure , interface ). After that, for each old identifier within the same type, a new one is replaced.
Type name minification. At this stage, the list of all type names ( classes , structures , interfaces ) and the corresponding nodes in the tree are aggregated. After that, each name is replaced with a new one.
To generate substitutions, a class is used that can generate either minimal identifiers or random ones. This project uses the first option. When generating a replacement at each stage, any identifier names are ignored (so that there are no intersections and potential errors).
It is worth noting that the library supports ignored identifiers, i.e. those that for some reason should not be shortened, which is used in this project (Quine parameters).
To obtain a list of all references to used identifiers (Find All References in Visual Studio), the NRefactory mechanism is used. To do this, you first need to compile the tree (at each stage: local variables, type members, types):
var unresolvedFile = SyntaxTree.ToTypeSystem(); var projectContent = projectContent.AddOrUpdateFiles(_unresolvedFile); var compilation = projectContent.CreateCompilation(); var resolver = new CSharpAstResolver(_compilation, SyntaxTree, _unresolvedFile);
Then you can search for links as follows:
var resolveResult = resolver.Resolve(node); var findReferences = new FindReferences(); ... findReferences.FindLocalReferences(); ... findReferences.FindReferencesInFile();
Other minification
The following small minifikatsii were also implemented:
Deleting the private keyword
Removing namespaces
Reduction of variable initialization expressions:
List<byte> a = new List<byte>() -> var a = new List<byte>()
Remove curly braces with only one expression inside: if (a) { b; } => if (a) b;if (a) { b; } => if (a) b;
The abbreviation of the comparison entry with a boolean type: if (a == true) => if (a); if (a == false) => if (!a)if (a == true) => if (a); if (a == false) => if (!a)
Deleting regions and comments.
Remove whitespace and line breaks
After all kinds of minifikatsii made, from the code it remains to remove only extra whitespace characters and translations, and so that the length of the lines does not exceed a certain value (for beauty).
Minifiers implemented a separate project and it can be downloaded here: CSharp-Minifier . It was primarily developed with the aim of shortening the code in quineas, so there is nothing surprising in the fact that it has some strange functionality. However, I will gladly accept your comments and / or pull-ricksta.
Code formatting
At this stage, the minified code can be brought to mind so that it looks like some kind of picture. Formatting goes exactly after the minification, because the generated code is basically an array of single characters (they can be broken), from which it is easier to build a picture. The picture should have only black and white format, so that the symbol represents a white pixel, and black - its absence. For a better match, you can use any heuristic selection algorithms (genetic). My formatting was not implemented.
Quine generation
The following was used as a marker where the final line of the quine should be printed: / * @ * / Step by step everything looks like this. All code can be completely arbitrary, but in the example line breaks are added for better readability.
Template initialization: template =
classP { staticvoidMain() { var a = 6; /*@*/int b = 10; } }
Generate the line pattern used for quine printing. Quotation marks, backslashes and line breaks must be escaped, so that their presentation is separate parameters when outputting a formatted string. kernel ="var s = {1}{0}{1}; System.Console.Write(s,s,'{1}','{2}{2}',{1}{2}r{2}n{1});"
Replaces the marker in the template with the string used to print from the previous step. Since in C #, curly brackets are used when displaying parameters, they must also be escaped. This is done by duplicating them. str =template.Replace("{", "{{").Replace("}", "}}").Replace("/*@*/", kernel) str =
classP {{ staticvoidMain() {{ var a = 6; var s = {1}{0}{1}; System.Console.Write(s,s,'{1}','{2}{2}',{1}{2}r{2}n{1}); int b = 10; }} }}
Generating the resulting string used to print a quine: insertToResult ="var s=\"" + str + "\"" insertToResult =
var s=" class P {{ static void Main() {{ int a=6; var s={1}{0}{1}; Console.Write(s,s,'{1}','{2}{2}',{1}{2}r{2}n{1}); int b=10; }} }}"; Console.Write(s,s,'"', '\\', "\r\n");
Replacing the marker in the original template with the resulting string: result =template.Replace("/*@*/", insertToResult) result =
using System; classc { staticvoidMain() { int a=6; var s="using System;class c{{static void Main(){{int a=6;var s={1}{0}{1};Console.Write(s,s,'{1}','{2}{2}',{1}{2}r{2}n{1});int b=10;}}}}"; Console.Write(s,s,'"', '\\', "\r\n"); int b=10; } }
Done: Received a program that can print itself. The described approach can be generalized before generating quine palindromes , but so far I have not implemented it.
Launch
After the quine is generated, for example, you can save it into the Asciimation.cs file and output one frame with the following command: "C:\Windows\Microsoft.NET\Framework64\v4.0.30319\csc.exe" Asciimation.cs && (Asciimation > Asciimation.cs) && Asciimation
And to see the entire generated "movie", you need to use the following script:
echo off SET /ai=0 :LOOP IF %i%==3591 GOTO END "C:\Windows\Microsoft.NET\Framework64\v4.0.30319\csc.exe" Asciimation.cs && (Asciimation > Asciimation.cs) && Asciimation SET /ai=%i%+1 GOTO LOOP :END pause
Conclusion
The principles described in the article are applicable in the case of the implementation of various quineas in other programming languages. Unfortunately, using the online services ( ideone.com , compileonline ), it is impossible to test the developed code, because it still exceeds the maximum size, despite the compression.
The generated files and batch files can be downloaded here: AsciimationQuine_1_3.7z . In the future, you can try to engage in chain quines and, perhaps, repeat the feat of Yusuke Endoh, only by manual generation, not with manual generation.
Especially for those who do not have the ability to compile, but want to see the process: