Simple interpreter from scratch in Python # 4

In the previous three parts, we created lexer, parser and AST for our toy language IMP. We even wrote our own library of parsers of combinators. In this final article we will write the last component of the interpreter - the performer.

Content

Simple interpreter from scratch in Python # 1
Simple interpreter from scratch in Python # 2
Simple interpreter from scratch in Python # 3
Simple interpreter from scratch in Python # 4

Let's think about how programs are usually performed. At any given time, there are some “control points” that indicate which expression the program is going to perform next. When the following expression is executed, it modifies the state of the program, by improving the “control point” and changing the values of the variables.

To execute an IMP program, we need three things:

Point of control - we need to know the following expression for execution.
Wednesday - we need to simulate a "change in the state of the program."
Execution functions - we need to know how the state and point of control should be modified for each expression.

The simplest is the control point, at least for IMP. We arranged our intermediate representation in the form of a tree structure. We will simply call the evaluation functions for top-level expressions, which will recursively call the execution function for the expressions inside. In essence, we will use the Python control point as our own. It would not be so easy for languages with more complex control structures, like functions or exceptions, but we can keep it simple for IMP.
')
The environment is also simple. IMP has only global variables, so we can simulate the environment using standard Python dictionaries. When the value changes, we will update the value of the variable in the dictionary.

Performance functions are the only thing we need to think about. Each type of expression will have its own execution function, which uses the current environment and returns a value. Arithmetic expressions return integer, Boolean - true or false . The expressions have no side effects, so the environment will not be modified. Each type of expression also has a performance function. Statements act by modifying the environment, so no result will be returned.

Determine the function of performance

We define them as methods in our AST classes. This will give each function a direct access to the structure that it performs. Here are the arithmetic functions:

class IntAexp(Aexp): ... def eval(self, env): return self.i class VarAexp(Aexp): ... def eval(self, env): if self.name in env: return env[self.name] else: return 0 class BinopAexp(Aexp): ... def eval(self, env): left_value = self.left.eval(env) right_value = self.right.eval(env) if self.op == '+': value = left_value + right_value elif self.op == '-': value = left_value - right_value elif self.op == '*': value = left_value * right_value elif self.op == '/': value = left_value / right_value else: raise RuntimeError('unknown operator: ' + self.op) return value

You can see here a bit of extra logic in the case where the programmer uses a variable that is not defined previously (one that is not defined in the environment dictionary). For simplicity and in order to avoid writing an error trapping system, we will give all undefined variables 0 .

In BinopAexp, we handle the case of an “unknown operator” by throwing out a RuntimeError. The parser cannot create an AST from unknown operators, so it only becomes easier for us. But if someone makes their own AST, they will need to take this into account.

Here are the functions of boolean operations:

 class RelopBexp(Bexp): ... def eval(self, env): left_value = self.left.eval(env) right_value = self.right.eval(env) if self.op == '<': value = left_value < right_value elif self.op == '<=': value = left_value <= right_value elif self.op == '>': value = left_value > right_value elif self.op == '>=': value = left_value >= right_value elif self.op == '=': value = left_value == right_value elif self.op == '!=': value = left_value != right_value else: raise RuntimeError('unknown operator: ' + self.op) return value class AndBexp(Bexp): ... def eval(self, env): left_value = self.left.eval(env) right_value = self.right.eval(env) return left_value and right_value class OrBexp(Bexp): ... def eval(self, env): left_value = self.left.eval(env) right_value = self.right.eval(env) return left_value or right_value class NotBexp(Bexp): ... def eval(self, env): value = self.exp.eval(env) return not value

It is quite simple. We use relational and logical python operators.

And here are the execution functions for each type of expression:

 class AssignStatement(Statement): ... def eval(self, env): value = self.aexp.eval(env) env[self.name] = value class CompoundStatement(Statement): ... def eval(self, env): self.first.eval(env) self.second.eval(env) class IfStatement(Statement): ... def eval(self, env): condition_value = self.condition.eval(env) if condition_value: self.true_stmt.eval(env) else: if self.false_stmt: self.false_stmt.eval(env) class WhileStatement(Statement): ... def eval(self, env): condition_value = self.condition.eval(env) while condition_value: self.body.eval(env) condition_value = self.condition.eval(env)

AssignStatement: we simply execute an arithmetic expression on the right side, and then update the environment with the resulting value. The programmer is not limited in redefining the variables that have already been defined.

CompoundStatement: we execute each expression, one after another. Remember that CompoundStatement is allowed wherever an expression is allowed, so long chains of expressions are decoded as nested.

IfStatement: We first execute the boolean condition of the expression If true, we execute a true expression. If false and the false-expression was determined, we execute the false-expression.

WhileStatement: we fulfill the condition to check whether the body of the loop should be executed once. The condition is executed every iteration of the loop, to check the condition.

Putting it all together

Well, we created the main components of our interpreter and now it remains only to write a unifying program:

 #!/usr/bin/env python import sys from imp_parser import * from imp_lexer import * def usage(): sys.stderr.write('Usage: imp filename\n') sys.exit(1) if __name__ == '__main__': if len(sys.argv) != 2: usage() filename = sys.argv[1] text = open(filename).read() tokens = imp_lex(text) parse_result = imp_parse(tokens) if not parse_result: sys.stderr.write('Parse error!\n') sys.exit(1) ast = parse_result.value env = {} ast.eval(env) sys.stdout.write('Final variable values:\n') for name in env: sys.stdout.write('%s: %s\n' % (name, env[name]))

Only one argument is given to the program - a name for interpretation. It reads the file and sends it to the lexer and parser, reporting an error (if any). Then we extract the AST from the result of the parser and execute it using an empty medium. Since the IMP has no output, we simply output the entire environment to the terminal.

A typical example of factorial calculation:

 n := 5; p := 1; while n > 0 do p := p * n; n := n - 1 end

The execution itself:

 $ ./imp.py hello.imp Final variable values: p: 120 n: 0

Conclusion

In the last article, we wrote an interpreter for our simple language from scratch. The language itself is of little use, but the interpreter is quite extensible and its main components can be used in something else.

I hope this material will provide a good start for people to experiment with the design of the language. Some ideas:

Make variables local to neymspeysa.
Add a for loop.
Add expressions for I / O (input / output).
Add features.

Source: https://habr.com/ru/post/207662/

All Articles

Simple interpreter from scratch in Python # 4

Determine the function of performance

Putting it all together

Conclusion

More articles: