⬆️ ⬇️

We write a simple virtual machine in Python

Hello! Now I will tell you how to write a simple virtual machine in Python. I hope someone finds this article interesting.



We will not implement the parser and the compiler, but for now we will only make the interpreter machine of our assembler.



We will have a stack machine, and it will use two stacks:



The code itself will be a list of commands that are also lists:

code = [ ['val', 2], #  2   ['val', 3], #  3   ['get', '*'], #        * ( ) ['call', 2], #          2    (  ),      ['get', 'puts'], #    ['call', 1], #  ] 




We implement the car as a function:

 # ops -   # env -      def vm(ops, env={}): #    ,           class closure: # pc - ""   (    ops) # env -       def __init__(self, pc, env): self.pc, self.env = pc, env # pc -     # stack -   # fstack (frame stack) -    pc, stack, fstack = 0, [], [] 


')

Before starting the main interpreter loop, we need to calculate the method indexes in the code. The label we will have is a special command label: ['label', 'label-name']

 labels = {op[1]: index for index, op in enumerate(ops) if op[0] == 'label'} 




Main loop:

  while pc < len(ops): #      ,   op, args, pc = ops[pc][0], ops[pc][1:], pc + 1 arg = args[0] if args else None #   label if op == 'label': pass #      elif op == 'val': stack.append(arg) #          elif op == 'set' or op == 'get': 


Here you need to tell a little about the device environments. In our case, they are dict objects, in the keys of which the names of variables are stored, and in the values ​​of the value + in the key '' (empty line) the “pointer” to the parent environment is stored. Therefore, in order to find the environment in which the variable we needed was defined, we must first search in the current environment, and if we did not find it, then search in the parent environment and so on:

  e = env while e is not None and arg not in e: e = e[''] if '' in e else None #  ''    ,       #   ,     ,       if op == 'set': (env if e is None else e)[arg] = stack.pop() #    ,       elif op == 'get': if e: stack.append(e[arg]) else: print('undefined variable %s' % arg) 




In our virtual machine, it will be possible to transfer and return functions from functions, respectively, we will implement closures. The easiest way. When we meet the command to create a new function, this function will remember in what environment (associative array variable: value ) it was created. Thus, we will be able to write this in our high level language (which will be compiled into bytecode for our machine):

 make-person = fn(name, age) { return fn() { //   name  age print name, age++; }; }; vasya = make-person("Vasily", 20); petya = make-person("Peter", 24); vasya(); // Vasily 20 vasya(); // Vasily 21 petya(); // Peter 24 petya(); // Peter 25 




Creating a closure we will deal with the team fn . Everything she does: puts on the stack an object of the closure class, which contains the address of the code of the function (in fact, the address of the label with the name of the desired function) and the current environment.

  elif op == 'fn': stack.append(closure(labels[arg], env)) 


Now about function calls.

We can have two types of functions:



They will be processed in different ways:



During the function call, we will indicate the number of arguments passed to n, and our interpreter will take n elements from the stack, make an array of them and put it on the stack, so we can take any number of arguments. Functions in their code themselves will have to parse an array of arguments and set the corresponding variables.

The ret command returns control to where it was called from.

  #        elif op == 'call' or op == 'tcall': #   fn = stack.pop() #   -         if args and arg: stack.append(popn(arg)) #     if isinstance(fn, closure): #       ,      if op == 'call': fstack.append((pc, env)) #     pc, env = fn.pc, {'': fn.env} #   elif hasattr(fn, '__call__'): stack.append(fn(stack.pop())) #    elif op == 'args': vals = stack.pop() if len(args) != len(vals): print 'warning: wrong arguments count: %d expected, %d given' % (len(args), len(vals)) env.update(dict(zip(args, vals))) # return elif op == 'ret': #  return    ,      if not fstack: break #     pc, env = fstack.pop() 




Full code with lexer (for convenience) and test example: gist.github.com/Butjok/a531316e2a32576974d2

Source: https://habr.com/ru/post/270797/



All Articles