
Brainfuck compiler in .NET

Hello.
I see that it is Brainfuck week here, so I decided to write a compiler, especially since the comments on this article asked for more detail about dynamic methods. In this article we will look at this way of compiling code and try to write a compiler for the simplest possible language.

Introduction


To compile any source code on the .NET platform, you need to do 3 things:
  1. Create a new project and import the System.Reflection and System.Reflection.Emit namespaces.
  2. Read in the source code and verify it.
  3. Compile this code into an executable file.

The first step, I hope, will cause no problems, but verification and compilation may raise some questions.

Source code verification


To check the source code for correctness, we need to know which commands it consists of. There are only 8 of them:
  1. > - move to the next cell
  2. < - move to the previous cell
  3. + - increase the value in the current cell by 1
  4. - - decrease the value in the current cell by 1
  5. . - print the value from the current cell
  6. , - read a value from outside and store it in the current cell
  7. [ - if the value of the current cell is zero, jump forward in the program text to the command following the matching ] (taking nesting into account)
  8. ] - if the value of the current cell is not zero, jump back in the program text to the matching [ (taking nesting into account)
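
To make the semantics of the eight commands concrete, here is a minimal reference interpreter. It is only a sketch and not part of the compiler: the names BfInterpreter and Run are made up for illustration, input is taken from a string one character per ',' (an assumption; the compiler below reads a number per line instead), and the source is assumed to have balanced brackets.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class BfInterpreter
{
    // Runs src against 30,000 one-byte cells and returns everything printed by '.'.
    public static string Run(string src, string input = "")
    {
        var memory = new byte[30000];
        int point = 0, inPos = 0;
        var output = new StringBuilder();

        // Precompute the position of the matching bracket for every '[' and ']'.
        var match = new int[src.Length];
        var open = new Stack<int>();
        for (int i = 0; i < src.Length; i++)
        {
            if (src[i] == '[') open.Push(i);
            else if (src[i] == ']') { int j = open.Pop(); match[i] = j; match[j] = i; }
        }

        for (int pc = 0; pc < src.Length; pc++)
        {
            switch (src[pc])
            {
                case '>': point++; break;
                case '<': point--; break;
                case '+': memory[point]++; break; // wraps 255 -> 0 in the default unchecked context
                case '-': memory[point]--; break; // wraps 0 -> 255
                case '.': output.Append((char)memory[point]); break;
                case ',': memory[point] = inPos < input.Length ? (byte)input[inPos++] : (byte)0; break;
                case '[': if (memory[point] == 0) pc = match[pc]; break; // skip the loop body
                case ']': if (memory[point] != 0) pc = match[pc]; break; // repeat the loop body
            }
        }
        return output.ToString();
    }
}
```

For example, BfInterpreter.Run("++++++++[>++++++++<-]>+.") builds 8 * 8 + 1 = 65 in the second cell and returns "A".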


So, for the source code to be correct, it must be non-empty and its loop brackets must be properly nested.
Let's write the checking function:

public bool CheckSource()
{
    if (Src.Length == 0) throw new ArgumentException("The source code is empty.");
    int State = 0; // current loop nesting depth
    for (int i = 0; i < Src.Length; i++)
    {
        if (Src[i] == '[') State++;
        if (Src[i] == ']') State--;
        // A negative depth means a ']' appeared before its matching '['.
        if (State < 0) throw new ArgumentException(String.Format("Unmatched closing bracket. Position: {0}", i));
    }
    if (State != 0) Console.WriteLine("Unclosed opening bracket.");
    return State == 0;
}


The function simply verifies that the code is not empty, that the opening and closing brackets come in the correct order, and that their numbers match.
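
The same check can also be written as a standalone function that reports where the nesting breaks. This is a sketch under assumptions: BracketCheck and FirstError are hypothetical names, and a stack of positions replaces the depth counter so that an unclosed '[' can be located as well.

```csharp
using System.Collections.Generic;

public static class BracketCheck
{
    // Returns -1 when the brackets are balanced,
    // otherwise the index of the first offending bracket.
    public static int FirstError(string src)
    {
        var open = new Stack<int>(); // positions of '[' still waiting for a ']'
        for (int i = 0; i < src.Length; i++)
        {
            if (src[i] == '[') open.Push(i);
            else if (src[i] == ']')
            {
                if (open.Count == 0) return i; // ']' with no matching '['
                open.Pop();
            }
        }
        if (open.Count == 0) return -1;

        // Some '[' were never closed: report the outermost one (bottom of the stack).
        int first = -1;
        while (open.Count > 0) first = open.Pop();
        return first;
    }
}
```

A caller could turn a non-negative result into the same ArgumentException that CheckSource throws.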

Compilation


I should say up front that we will implement the compiler with a fixed memory of 30,000 cells. Each cell is 1 byte in size.
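
One consequence of 1-byte cells is that cell values wrap around modulo 256. The sketch below shows that arithmetic in plain C#. CellMath and Step are illustrative names, and the wrap-around itself is an assumption about the intended behaviour, although it matches what a truncating conversion to an unsigned byte (conv.u1 in IL) produces.

```csharp
using System;

public static class CellMath
{
    // Adds delta to a cell and truncates the result to one byte (modulo 256).
    public static byte Step(byte cell, int delta)
    {
        return unchecked((byte)(cell + delta));
    }
}
```

With this definition, CellMath.Step(255, 1) gives 0 and CellMath.Step(0, -1) gives 255.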

To build a compiler, you need to get to grips with the AssemblyBuilder class.
The result of the operations performed on it is an executable file or a library.
AssemblyBuilder ASM = AppDomain.CurrentDomain.DefineDynamicAssembly(new AssemblyName("BrainFuck Compiled Program"), AssemblyBuilderAccess.RunAndSave); // create a dynamic assembly that can be run and saved
ASM.Save(Filename); // write the assembly to disk

This code creates an empty assembly. For the assembly not to be empty, it must be filled with modules.
ModuleBuilder MDB = ASM.DefineDynamicModule(Filename, Filename); // module name and file name; the two-argument overload is needed for the module to be saved to disk

This code creates a module named after the output file.
Next, create a class Program in it.
TypeBuilder TPB = MDB.DefineType("Program", TypeAttributes.Class); // class Program
TPB.CreateType(); // finish the type; in the full program this call comes last, after all members are defined

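Now we have a class. All operations will be performed in it.
We need an array to serve as memory and a pointer into it. Both fields must be static, because they are accessed from the static Main with Ldsfld and Stsfld.
FieldBuilder FDB_1 = TPB.DefineField("Memory", typeof(byte[]), FieldAttributes.Private | FieldAttributes.Static); // private static byte[] Memory; // the memory cells
FieldBuilder FDB_2 = TPB.DefineField("Point", typeof(int), FieldAttributes.Private | FieldAttributes.Static); // private static int Point; // index of the current cell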

The whole program will live in Main.
So let's define Main and make it the entry point:
MethodBuilder MTB = TPB.DefineMethod("Main", MethodAttributes.Public | MethodAttributes.Static, typeof(void), Type.EmptyTypes); // public static void Main()
ASM.SetEntryPoint(MTB); // Main is the entry point of a console application

Now we need to initialize the fields defined earlier. According to MSDN, it is done like this:
ILGenerator MTB_IL = MTB.GetILGenerator();
MTB_IL.Emit(OpCodes.Ldc_I4, 30000); // push 30000, the size of the array
MTB_IL.Emit(OpCodes.Newarr, typeof(byte)); // create a byte array of 30000 elements
MTB_IL.Emit(OpCodes.Stsfld, FDB_1); // store it in Memory (Point is already 0 by default)


Then we emit IL for each command (a lot of code, but it is commented):
foreach (var t in Src) // walk through the source, one command at a time
{
    switch (t)
    {
        case '>':
        {
            MTB_IL.Emit(OpCodes.Ldsfld, FDB_2); // push Point
            MTB_IL.Emit(OpCodes.Ldc_I4_1); // push 1
            MTB_IL.Emit(OpCodes.Add); // add them
            MTB_IL.Emit(OpCodes.Stsfld, FDB_2); // store the result in Point
            break;
        }
        case '<':
        {
            MTB_IL.Emit(OpCodes.Ldsfld, FDB_2); // push Point
            MTB_IL.Emit(OpCodes.Ldc_I4_1); // push 1
            MTB_IL.Emit(OpCodes.Sub); // subtract
            MTB_IL.Emit(OpCodes.Stsfld, FDB_2); // store the result in Point
            break;
        }
        case '+':
        {
            MTB_IL.Emit(OpCodes.Ldsfld, FDB_1); // push Memory
            MTB_IL.Emit(OpCodes.Ldsfld, FDB_2); // push Point
            MTB_IL.Emit(OpCodes.Ldelema, typeof(byte)); // address of Memory[Point]
            MTB_IL.Emit(OpCodes.Dup); // keep one copy of the address for the store
            MTB_IL.Emit(OpCodes.Ldobj, typeof(byte)); // load the current value
            MTB_IL.Emit(OpCodes.Ldc_I4_1); // push 1
            MTB_IL.Emit(OpCodes.Add); // add
            MTB_IL.Emit(OpCodes.Conv_U1); // truncate to a byte
            MTB_IL.Emit(OpCodes.Stobj, typeof(byte)); // store back into Memory[Point]
            break;
        }
        case '-':
        {
            MTB_IL.Emit(OpCodes.Ldsfld, FDB_1); // push Memory
            MTB_IL.Emit(OpCodes.Ldsfld, FDB_2); // push Point
            MTB_IL.Emit(OpCodes.Ldelema, typeof(byte)); // address of Memory[Point]
            MTB_IL.Emit(OpCodes.Dup); // keep one copy of the address for the store
            MTB_IL.Emit(OpCodes.Ldobj, typeof(byte)); // load the current value
            MTB_IL.Emit(OpCodes.Ldc_I4_1); // push 1
            MTB_IL.Emit(OpCodes.Sub); // subtract
            MTB_IL.Emit(OpCodes.Conv_U1); // truncate to a byte
            MTB_IL.Emit(OpCodes.Stobj, typeof(byte)); // store back into Memory[Point]
            break;
        }
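        case '[':
        {
            var Start = MTB_IL.DefineLabel(); // first command of the loop body
            var End = MTB_IL.DefineLabel(); // first command after the loop
            MTB_IL.Emit(OpCodes.Ldsfld, FDB_1); // push Memory
            MTB_IL.Emit(OpCodes.Ldsfld, FDB_2); // push Point
            MTB_IL.Emit(OpCodes.Ldelem_U1); // load Memory[Point]
            MTB_IL.Emit(OpCodes.Brfalse, End); // the cell is zero: skip the whole loop
            MTB_IL.MarkLabel(Start); // the loop body starts here
            Scopes.Push(End); // remember both labels (Scopes is assumed to be a Stack<Label>)
            Scopes.Push(Start);
            break;
        }
        case ']':
        {
            var Start = Scopes.Pop(); // labels of the innermost open loop
            var End = Scopes.Pop();
            MTB_IL.Emit(OpCodes.Ldsfld, FDB_1); // push Memory
            MTB_IL.Emit(OpCodes.Ldsfld, FDB_2); // push Point
            MTB_IL.Emit(OpCodes.Ldelem_U1); // load Memory[Point]
            MTB_IL.Emit(OpCodes.Brtrue, Start); // the cell is not zero: repeat the loop body
            MTB_IL.MarkLabel(End); // execution continues here once the loop is done
            break;
        }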
        case '.':
        {
            MTB_IL.Emit(OpCodes.Ldsfld, FDB_1); // push Memory
            MTB_IL.Emit(OpCodes.Ldsfld, FDB_2); // push Point
            MTB_IL.Emit(OpCodes.Ldelem_U1); // load Memory[Point]
            MTB_IL.EmitCall(OpCodes.Call, typeof(Console).GetMethod("Write", new[] { typeof(char) }), null); // Console.Write((char)Memory[Point]); Write, not WriteLine, so no newline follows each character
            break;
        }
        case ',':
        {
            MTB_IL.Emit(OpCodes.Ldsfld, FDB_1); // push Memory
            MTB_IL.Emit(OpCodes.Ldsfld, FDB_2); // push Point
            MTB_IL.EmitCall(OpCodes.Call, typeof(Console).GetMethod("ReadLine"), null); // Console.ReadLine();
            MTB_IL.Emit(OpCodes.Call, typeof(Convert).GetMethod("ToByte", new[] { typeof(string) })); // parse the line as a number from 0 to 255
            MTB_IL.Emit(OpCodes.Stelem_I1); // store it in Memory[Point]
            break;
        }
    }
}


And that's it!

MTB_IL.Emit(OpCodes.Ret); // return from Main
TPB.CreateType(); // finish the Program type
ASM.Save(Filename); // save the assembly to disk
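
Since dynamic methods were the starting point of this article, here is a sketch of the same idea targeting a DynamicMethod instead of a saved assembly: the Brainfuck program becomes a delegate that runs in memory. The names BfDynamic, Compile and EmitLoadCell are made up for illustration, the memory array is passed in as an argument rather than kept in a static field, and the I/O commands '.' and ',' are left out for brevity. Each loop uses two labels so that '[' can skip the body when the current cell is zero. The source is assumed to have passed the bracket check.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection.Emit;

public static class BfDynamic
{
    // Compiles src into a delegate that runs the program on the given memory array.
    public static Action<byte[]> Compile(string src)
    {
        var dm = new DynamicMethod("Run", typeof(void), new[] { typeof(byte[]) });
        ILGenerator il = dm.GetILGenerator();
        LocalBuilder point = il.DeclareLocal(typeof(int)); // starts at 0
        var loops = new Stack<Label[]>(); // { start, end } for every open loop

        foreach (char c in src)
        {
            switch (c)
            {
                case '>':
                case '<':
                    il.Emit(OpCodes.Ldloc, point);
                    il.Emit(OpCodes.Ldc_I4_1);
                    il.Emit(c == '>' ? OpCodes.Add : OpCodes.Sub);
                    il.Emit(OpCodes.Stloc, point);
                    break;
                case '+':
                case '-':
                    il.Emit(OpCodes.Ldarg_0);               // memory
                    il.Emit(OpCodes.Ldloc, point);          // index
                    il.Emit(OpCodes.Ldelema, typeof(byte)); // address of memory[point]
                    il.Emit(OpCodes.Dup);
                    il.Emit(OpCodes.Ldind_U1);              // current value
                    il.Emit(OpCodes.Ldc_I4_1);
                    il.Emit(c == '+' ? OpCodes.Add : OpCodes.Sub);
                    il.Emit(OpCodes.Conv_U1);               // truncate to a byte
                    il.Emit(OpCodes.Stind_I1);              // store back
                    break;
                case '[':
                {
                    Label start = il.DefineLabel(), end = il.DefineLabel();
                    EmitLoadCell(il, point);
                    il.Emit(OpCodes.Brfalse, end); // zero: jump past the matching ']'
                    il.MarkLabel(start);
                    loops.Push(new[] { start, end });
                    break;
                }
                case ']':
                {
                    Label[] labels = loops.Pop();
                    EmitLoadCell(il, point);
                    il.Emit(OpCodes.Brtrue, labels[0]); // non-zero: back to the loop body
                    il.MarkLabel(labels[1]);
                    break;
                }
            }
        }
        il.Emit(OpCodes.Ret);
        return (Action<byte[]>)dm.CreateDelegate(typeof(Action<byte[]>));
    }

    // Pushes the value of memory[point] onto the evaluation stack.
    static void EmitLoadCell(ILGenerator il, LocalBuilder point)
    {
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Ldloc, point);
        il.Emit(OpCodes.Ldelem_U1);
    }
}
```

A possible use: create var memory = new byte[30000], call BfDynamic.Compile("+++[>++<-]")(memory), and then inspect memory[1], which holds 3 * 2 = 6 once the loop has counted memory[0] down to zero.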


For those interested, here is the source code with a .sln for VS2010.

UPD: Moved to the Abnormal Programming hub.

Source: https://habr.com/ru/post/113215/

