📜 ⬆️ ⬇️

Creating a programming language using LLVM. Part 2: Implementing Parser and AST

Welcome to Chapter 2 of the tutorial "Creating a programming language with LLVM." In this chapter, we will see how to use the lexical analyzer created in Chapter 1 to build a complete parser for our Kaleidoscope language. After we have the parser ready, we will build the Abstract Syntax Tree (AST) (Abstract Syntax Tree).

We will develop the Kaleidoscope parser using the combination of recursive descent and operator precedence (the last for binary expressions and the first for everything else). Before we move on to the parsing itself, let's talk about what we get at the output: Abstract Syntax Tree.

Abstract Syntax Tree (AST)


AST displays the program in such a way that for later stages of the compiler (for example, code generation) it is easily interpreted. We need one object for each language construct. In Kaleidoscope, we have expressions, prototypes, and functions. Let's start with the expressions:

/// ExprAST -      . class ExprAST { public: virtual ~ExprAST() {} }; /// NumberExprAST -       (, "1.0"). class NumberExprAST : public ExprAST { double Val; public: NumberExprAST(double val) : Val(val) {} }; 

The above code shows the definition of the base class ExprAST and its subclass that we use for numeric literals.
')
Now we will create only AST without various useful methods of working with it. If necessary, you can, for example, quite easily add a virtual method of formatted code output. Here are the node definitions for other AST expressions that we will use in Kaleidoscope:

 /// VariableExprAST -      (, "a"). class VariableExprAST : public ExprAST { std::string Name; public: VariableExprAST(const std::string &name) : Name(name) {} }; /// BinaryExprAST -      . class BinaryExprAST : public ExprAST { char Op; ExprAST *LHS, *RHS; public: BinaryExprAST(char op, ExprAST *lhs, ExprAST *rhs) : Op(op), LHS(lhs), RHS(rhs) {} }; /// CallExprAST -      . class CallExprAST : public ExprAST { std::string Callee; std::vector<ExprAST*> Args; public: CallExprAST(const std::string &callee, std::vector<ExprAST*> &args) : Callee(callee), Args(args) {} }; 

Everything is simple: the variable contains the name of the variable, the binary operator contains its opcode (for example, '+') and the left and right expressions (AST nodes), and the function calls contain the name of the function and a list of all the arguments. One of the great things about AST is that it covers language features, regardless of the syntax of a programming language. Please note that there are no controversial points about the priority of binary operators, lexical structure, etc.

We have defined all nodes for expressions of our language. It is not turing-complete, since it does not have a conditional control flow, we will fix this in the next section. The following two things that we need are the interface of functions and the functions themselves:

 /// PrototypeAST -    ""  , ///        (,  , ///    ). class PrototypeAST { std::string Name; std::vector<std::string> Args; public: PrototypeAST(const std::string &name, const std::vector<std::string> &args) : Name(name), Args(args) {} }; /// FunctionAST -     class FunctionAST { PrototypeAST *Proto; ExprAST *Body; public: FunctionAST(PrototypeAST *proto, ExprAST *body) : Proto(proto), Body(body) {} }; 

In Kaleidoscope, functions are typed only by the number of their arguments. Since all values ​​are real double precision numbers, there is no point in storing the type of each argument anywhere. In a real programming language, the class “ExprAST” would probably contain another type field.

Now we can finally talk about parsing expressions and functions in Kaleidoscope.

Parser base


Now that we have AST elements, we need to define a code parser in order to build it. The idea here is that we want to parse something like “x + y” (which is returned as three tokens with lexical parsing) to AST, which can be generated by something like this:

  ExprAST *X = new VariableExprAST("x"); ExprAST *Y = new VariableExprAST("y"); ExprAST *Result = new BinaryExprAST('+', X, Y); 

Before doing this, we will define several auxiliary procedures:

 /// CurTok/getNextToken -    . CurTok -   /// ,  . getNextToken     ///     CurTok. static int CurTok; static int getNextToken() { return CurTok = gettok(); } 

This code implements the simplest token buffer on top of the lexical analyzer. This allows us to look forward to a single token that will be returned by the lexical analyzer. Each function in our parser will assume that CurTok is the current token to parse.

 /// Error* -       . ExprAST *Error(const char *Str) { fprintf(stderr, "Error: %s\n", Str);return 0;} PrototypeAST *ErrorP(const char *Str) { Error(Str); return 0; } FunctionAST *ErrorF(const char *Str) { Error(Str); return 0; } 

These are helper procedures that our parser will use to handle errors. Correction of errors in our parser will be far from the best and not particularly user-friendly, but will be sufficient in the framework of this tutorial.

With these support functions, we can implement the first part of our grammar: numeric literals.

Expression parsing


Let's start with the numeric literals, as with them the easiest. For each grammar rule, we define a function that analyzes this rule. For numeric literals, we have:

 /// numberexpr ::= number static ExprAST *ParseNumberExpr() { ExprAST *Result = new NumberExprAST(NumVal); getNextToken(); //   return Result; } 

The function is very simple: when calling, it expects the current token to be tok_number. It creates a node NumberExprAST , passing it the value of the current number, moves the lexical analyzer to the next token and returns the created node.

There are several interesting aspects. The most important is that this function receives any tokens that match our grammar rule and returns the lexical analyzer buffer with the next token (which is not part of the grammar rule) ready for processing. This is the standard way for parsing a recursive descent. To better understand this, consider parsing the expression in parentheses:

 /// parenexpr ::= '(' expression ')' static ExprAST *ParseParenExpr() { getNextToken(); //  (. ExprAST *V = ParseExpression(); if (!V) return 0; if (CurTok != ')') return Error("expected ')'"); getNextToken(); //  ). return V; } 

This function illustrates a number of interesting things about the parser:
  1. It shows how to use error procedures. When called, the function expects the current token to be the '(' sign, but after parsing the subexpression, it is possible that there is no expected ')'. For example, if the user enters "(4 x" instead of "(4)", then the analyzer should show an error. Since such errors may occur, the parser needs a way to indicate that they occurred: in our parser, if an error occurs, we return null.
  2. Another interesting aspect of this function is the use of recursion when calling ParseExpression (we will soon see that ParseExpression can call ParseParenExpr ). This is a powerful mechanism, because it allows us to process recursive grammars and very easily cope with any rules of such grammars. Please note that no AST nodes are built for parentheses. The important role of brackets is to enable the parser to be grouped. When the parser builds an AST, brackets are no longer needed.
The following grammatical rules are the handling of variable references and function calls:

 /// identifierexpr /// ::= identifier /// ::= identifier '(' expression* ')' static ExprAST *ParseIdentifierExpr() { std::string IdName = IdentifierStr; getNextToken(); //  . if (CurTok != '(') //   . return new VariableExprAST(IdName); //  . getNextToken(); //  ( std::vector<ExprAST*> Args; if (CurTok != ')') { while (1) { ExprAST *Arg = ParseExpression(); if (!Arg) return 0; Args.push_back(Arg); if (CurTok == ')') break; if (CurTok != ',') return Error("Expected ')' or ',' in argument list"); getNextToken(); } } //  ')'. getNextToken(); return new CallExprAST(IdName, Args); } 

This feature works the same as the previous ones. (When called, it expects the current token to be tok_identifier). It also uses recursion and error handling. One interesting point is that she has to look forward to determine whether the current identifier is a reference to a variable or a function call. This is a check, depending on whether the next token token is '(', causes either VariableExprAST or CallExprAST .

Now that we have all the logic to parse the simplest expressions, we can define an auxiliary function to wrap it in one entry point. We will call this class the expression “primary” for the reasons that will become clearer by the end of the textbook . To parse an arbitrary primary expression, we must determine what the expression is:

 /// primary /// ::= identifierexpr /// ::= numberexpr /// ::= parenexpr static ExprAST *ParsePrimary() { switch (CurTok) { default: return Error("unknown token when expecting an expression"); case tok_identifier: return ParseIdentifierExpr(); case tok_number: return ParseNumberExpr(); case '(': return ParseParenExpr(); } } 

Now that we see the definition of this function, it becomes more obvious why we can assume any valid CurTok states in various functions. It uses forward peeking to determine which kind of expression to check, and then the expression itself is analyzed in the appropriately called function.

Now that basic expressions can be processed, we can work with binary expressions. They are much more complicated.

Binary expression parsing


Binary expressions are much harder to make out because they are often ambiguous. For example, for the string "x + y * z" the parser can parse either as "(x + y) * z" or as "x + (y * z)". Knowing the basics of mathematics, we expect the second option, because "*" (multiplication) has a higher priority than "+" (addition).

There are many possible solutions, but the most elegant and efficient way is to parse the primacy (priority) of operators . This parsing technique uses recursion to prioritize operations. First we need a priority table:

 /// BinopPrecedence -      static std::map<char, int> BinopPrecedence; /// GetTokPrecedence -     . static int GetTokPrecedence() { if (!isascii(CurTok)) return -1; // ,     . int TokPrec = BinopPrecedence[CurTok]; if (TokPrec <= 0) return -1; return TokPrec; } int main() { //     // 1 -  . BinopPrecedence['<'] = 10; BinopPrecedence['+'] = 20; BinopPrecedence['-'] = 20; BinopPrecedence['*'] = 40; //  . ... } 

Kaleidoscope will only support 4 binary operators (but it can obviously be extended by our brave and fearless reader). The GetTokPrecedence function returns the priority of the current token, or -1 if the token is not a binary operator. The presence of such a table makes it easy to add new operators and, of course, the algorithm does not depend on specific operators. You can easily remove the priority table and make comparisons in the GetTokPrecedence function.

Now we can start parsing binary expressions. The main idea of ​​parsing the precedence of operators is to split an expression with potentially ambiguous binary operators into parts. Consider, for example, the expression “a + b + (c + d) * e * f + g”. Operator Priority Parser treats it as a stream of primary expressions separated by binary operators. Thus, during the initial parsing, the leading primary expression “a” is first parsed, and it will see the pairs [+, b] [+, (c + d)] [*, e] [*, f] and [+, g ]. Pay attention to the brackets: the parser does not need to worry about nested expressions like (c + d).

To begin with, an expression is a primary expression, potentially followed by a sequence of pairs [binop, primaryexpr]:

 /// expression /// ::= primary binoprhs /// static ExprAST *ParseExpression() { ExprAST *LHS = ParsePrimary(); if (!LHS) return 0; return ParseBinOpRHS(0, LHS); } 

ParseBinOpRHS is a function that analyzes the sequence of pairs. It takes priority and a pointer to the expression to analyze. Note that “x” is an absolutely valid expression: thus, “binoprhs” may be empty, in which case the function returns an expression that was passed to it. In our example above, the code sends the expression “a” to ParseBinOpRHS and the current token “+”.

The priority value transmitted in ParseBinOpRHS indicates the minimum priority of operators necessary for it to be accepted. For example, if the current pair in the stream [+, x] and priority 40 is transmitted in ParseBinOpRHS, it will not accept tokens (since the priority of "+" is only 20). Therefore, ParseBinOpRHS starts with:

 /// binoprhs /// ::= ('+' primary)* static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) { //    ,    while (1) { int TokPrec = GetTokPrecedence(); //           , //  ,    if (TokPrec < ExprPrec) return LHS; 

This code takes the priority of the current token and checks how low it is. Since we have defined invalid tokens that have a priority of -1, an implicit check occurs that determines when the stream of pairs ends. If this test is successfully passed, we know that the token is exactly a binary operator and that it will be included in the expression:

  // ,  ,    . int BinOp = CurTok; getNextToken(); //    //       ExprAST *RHS = ParsePrimary(); if (!RHS) return 0; 

Thus, this code gets (and remembers) the binary operator, and then analyzes the primary expression that follows it. He creates pairs, the first of which in our example would be [+, b].

Now that we have parsed the left side of the expression and one pair from the RHS sequence, we have to decide how to link the expression. In particular, we can have both "(a + b) binop unparsed" and "a + (b binop unparsed)" (where binop is a binary operator, unparsed is an unparsed part). To determine, we look forward to another “binop” (binary operator) to determine its priority and compare it with the priority of BinOp (which in this case is “+”):

  //  BinOp   RHS  ,    RHS, //      RHS  LHS. int NextPrec = GetTokPrecedence(); if (TokPrec < NextPrec) { 

If the priority of the binary operator to the right of “RHS” is less than or equal to the priority of our current operator, then we know that we have the case "(a + b) binop ...". In our example, the current "+" operator and the next "+" operator have the same priority. Therefore, we will create an AST node for “a + b”, and then continue parsing:

  ...     ... } //  LHS/RHS. LHS = new BinaryExprAST(BinOp, LHS, RHS); } } 

In our example, “a + b +” will be returned as “(a + b)” and the next iteration of the loop will be executed, with “+” as the current token. This part will be accepted further, as you remember, should be parsed "(c + d)" as the primary expression, that is, the current pair will be [+, (c + d)]. Then an expression with "*" as a binary operator will be obtained. In this case, the priority "*" is higher than the priority "+", so the above conditional construction will be executed.

The most important question here for the left part will be “how to disassemble the right part in full in this condition”? In particular, in order to correctly construct an AST for our example, it must receive the full expression "(c + d) * e * f" as the variable of the RHS expression. The code for this is surprisingly simple (the code from the above two blocks is duplicated for context):

  //  BinOp   RHS  ,    RHS, //      RHS  LHS. int NextPrec = GetTokPrecedence(); if (TokPrec < NextPrec) { 
  RHS = ParseBinOpRHS(TokPrec+1, RHS); if (RHS == 0) return 0; 
  } //  LHS/RHS. LHS = new BinaryExprAST(BinOp, LHS, RHS); } } 

At the moment, we know that the binary operator on the right side of our primary expression has higher priority than the binary operator that is currently being parsed. Thus, we know that any sequences of pairs of operators having a higher priority than the "+" must be disassembled together and returned as "RHS". To do this, we recursively call the ParseBinOpRHS function with the “TokPrec + 1” parameter as the minimum priority necessary to continue. In our long-suffering example, this will cause the AST node to return to "(c + d) * e * f" as RHS, which then becomes the right side for the expression with '+'.

Finally, on the next iteration of the while loop, the "+ g" part is analyzed and added to the AST. With this small code snippet (14 lines), we correctly handle any binary expressions with an elegant parsing method. It was just a quick review of the code, so I recommend you try it with a few examples to see how it works.

This completes the processing of expressions. At the moment, we can already indicate to the analyzer an arbitrary stream of tokens and construct an expression from it, dwelling on the first token that is not part of the expression. Now we need to deal with the declaration of functions, etc.

Parsing everything else


The next thing we lack is the processing of the function prototypes. In Kaleidoscope, they are used both to declare "external" (extern) functions, and to declare the body of functions. The code for this part of the parsing will be straightforward and not very interesting (after you have gone through the expression parsing):

 /// prototype /// ::= id '(' id* ')' static PrototypeAST *ParsePrototype() { if (CurTok != tok_identifier) return ErrorP("Expected function name in prototype"); std::string FnName = IdentifierStr; getNextToken(); if (CurTok != '(') return ErrorP("Expected '(' in prototype"); //    . std::vector<std::string> ArgNames; while (getNextToken() == tok_identifier) ArgNames.push_back(IdentifierStr); if (CurTok != ')') return ErrorP("Expected ')' in prototype"); //  . getNextToken(); //  ')'. return new PrototypeAST(FnName, ArgNames); } 

Given the simplicity of the function declaration, the function body will contain a prototype + expression:

 /// definition ::= 'def' prototype expression static FunctionAST *ParseDefinition() { getNextToken(); //  def. PrototypeAST *Proto = ParsePrototype(); if (Proto == 0) return 0; if (ExprAST *E = ParseExpression()) return new FunctionAST(Proto, E); return 0; } 


In addition, extern support is needed to declare functions like sin and cos, as well as to support the premature declaration of user-defined functions. 'extern' will consist only of a prototype without a body:

 /// external ::= 'extern' prototype static PrototypeAST *ParseExtern() { getNextToken(); //  extern. return ParsePrototype(); } 

Finally, we need to allow the user to enter arbitrary top-level expressions and calculate them on the fly. We will process them, defining for them anonymous nullary (without arguments) functions:

 /// toplevelexpr ::= expression static FunctionAST *ParseTopLevelExpr() { if (ExprAST *E = ParseExpression()) { //   . PrototypeAST *Proto = new PrototypeAST("", std::vector<std::string>()); return new FunctionAST(Proto, E); } return 0; } 

Now that we have all the parts, let's create a small control program that will allow us to try out all the code that we wrote!

Control program


The control program simply calls the parsing cyclically. There is nothing interesting here, see the section “Top level parsing” in the full code below.

 /// top ::= definition | external | expression | ';' static void MainLoop() { while (1) { fprintf(stderr, "ready> "); switch (CurTok) { case tok_eof: return; case ';': getNextToken(); break; //  ';'  . case tok_def: HandleDefinition(); break; case tok_extern: HandleExtern(); break; default: HandleTopLevelExpression(); break; } } } 

The most interesting part of this code is that we ignore high-level semicolons. You ask why? The main reason is that if you enter “4 + 5” on the command line, the parser does not know that this is the end, will you continue to enter text. , «def foo ...», 4 +5 — , "* 6", . «4+5;» , .

findings


400 (240 ), , AST. Kaleidoscope , . :

$ ./a.out
ready> def foo(xy) x+foo(y, 4.0);
Parsed a function definition.
ready> def foo(xy) x+yy;
Parsed a function definition.
Parsed a top-level expr
ready> def foo(xy) x+y );
Parsed a function definition.
Error: unknown token when expecting an expression
ready> extern sin(a);
ready> Parsed an extern
ready> ^D
$

. AST, , .. , AST LLVM Intermediate Representation (IR) ( ).


. , : LLVM - . (, , C C++). :

#
g++ -g -O3 toy.cpp
#
./a.out

, :

 #include <cstdio> #include <cstdlib> #include <string> #include <map> #include <vector> //===----------------------------------------------------------------------===// // Lexer ( ) //===----------------------------------------------------------------------===// //     [0-255],   , //       enum Token { tok_eof = -1, //  ( ) tok_def = -2, tok_extern = -3, //  ( : , ) tok_identifier = -4, tok_number = -5 }; static std::string IdentifierStr; // ,  tok_identifier static double NumVal; // ,  tok_number /// gettok -       . static int gettok() { static int LastChar = ' '; //  . while (isspace(LastChar)) LastChar = getchar(); if (isalpha(LastChar)) { // : [a-zA-Z][a-zA-Z0-9]* IdentifierStr = LastChar; while (isalnum((LastChar = getchar()))) IdentifierStr += LastChar; if (IdentifierStr == "def") return tok_def; if (IdentifierStr == "extern") return tok_extern; return tok_identifier; } if (isdigit(LastChar) || LastChar == '.') { // : [0-9.]+ std::string NumStr; do { NumStr += LastChar; LastChar = getchar(); } while (isdigit(LastChar) || LastChar == '.'); NumVal = strtod(NumStr.c_str(), 0); return tok_number; } if (LastChar == '#') { //     do LastChar = getchar(); while (LastChar != EOF && LastChar != '\n' && LastChar != '\r'); if (LastChar != EOF) return gettok(); } //   . if (LastChar == EOF) return tok_eof; //         ASCII int ThisChar = LastChar; LastChar = getchar(); return ThisChar; } //===----------------------------------------------------------------------===// // Abstract Syntax Tree (     ) //===----------------------------------------------------------------------===// /// ExprAST -      . class ExprAST { public: virtual ~ExprAST() {} }; /// NumberExprAST -       (, "1.0"). class NumberExprAST : public ExprAST { double Val; public: NumberExprAST(double val) : Val(val) {} }; /// VariableExprAST -      (, "a"). class VariableExprAST : public ExprAST { std::string Name; public: VariableExprAST(const std::string &name) : Name(name) {} }; /// BinaryExprAST -      . class BinaryExprAST : public ExprAST { char Op; ExprAST *LHS, *RHS; public: BinaryExprAST(char op, ExprAST *lhs, ExprAST *rhs) : Op(op), LHS(lhs), RHS(rhs) {} }; /// CallExprAST -      . class CallExprAST : public ExprAST { std::string Callee; std::vector<ExprAST*> Args; public: CallExprAST(const std::string &callee, std::vector<ExprAST*> &args) : Callee(callee), Args(args) {} }; /// PrototypeAST -    ""  , ///        (,  , ///    ). class PrototypeAST { std::string Name; std::vector<std::string> Args; public: PrototypeAST(const std::string &name, const std::vector<std::string> &args) : Name(name), Args(args) {} }; /// FunctionAST -     class FunctionAST { PrototypeAST *Proto; ExprAST *Body; public: FunctionAST(PrototypeAST *proto, ExprAST *body) : Proto(proto), Body(body) {} }; //===----------------------------------------------------------------------===// // Parser (   ) //===----------------------------------------------------------------------===// /// CurTok/getNextToken -    . CurTok -   /// ,  . getNextToken     ///     CurTok. static int CurTok; static int getNextToken() { return CurTok = gettok(); } /// BinopPrecedence -      static std::map<char, int> BinopPrecedence; /// GetTokPrecedence -     . static int GetTokPrecedence() { if (!isascii(CurTok)) return -1; // ,     . int TokPrec = BinopPrecedence[CurTok]; if (TokPrec <= 0) return -1; return TokPrec; } /// Error* -       . ExprAST *Error(const char *Str) { fprintf(stderr, "Error: %s\n", Str);return 0;} PrototypeAST *ErrorP(const char *Str) { Error(Str); return 0; } FunctionAST *ErrorF(const char *Str) { Error(Str); return 0; } static ExprAST *ParseExpression(); /// identifierexpr /// ::= identifier /// ::= identifier '(' expression* ')' static ExprAST *ParseIdentifierExpr() { std::string IdName = IdentifierStr; getNextToken(); //  . if (CurTok != '(') //  . return new VariableExprAST(IdName); //  . getNextToken(); //  ( std::vector<ExprAST*> Args; if (CurTok != ')') { while (1) { ExprAST *Arg = ParseExpression(); if (!Arg) return 0; Args.push_back(Arg); if (CurTok == ')') break; if (CurTok != ',') return Error("Expected ')' or ',' in argument list"); getNextToken(); } } //  ')'. getNextToken(); return new CallExprAST(IdName, Args); } /// numberexpr ::= number static ExprAST *ParseNumberExpr() { ExprAST *Result = new NumberExprAST(NumVal); getNextToken(); //   return Result; } /// parenexpr ::= '(' expression ')' static ExprAST *ParseParenExpr() { getNextToken(); //  (. ExprAST *V = ParseExpression(); if (!V) return 0; if (CurTok != ')') return Error("expected ')'"); getNextToken(); //  ). return V; } /// primary /// ::= identifierexpr /// ::= numberexpr /// ::= parenexpr static ExprAST *ParsePrimary() { switch (CurTok) { default: return Error("unknown token when expecting an expression"); case tok_identifier: return ParseIdentifierExpr(); case tok_number: return ParseNumberExpr(); case '(': return ParseParenExpr(); } } /// binoprhs /// ::= ('+' primary)* static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) { //    ,    while (1) { int TokPrec = GetTokPrecedence(); //           , //  ,    if (TokPrec < ExprPrec) return LHS; // ,  ,    . int BinOp = CurTok; getNextToken(); // eat binop //       ExprAST *RHS = ParsePrimary(); if (!RHS) return 0; //  BinOp   RHS  ,    RHS, //      RHS  LHS. int NextPrec = GetTokPrecedence(); if (TokPrec < NextPrec) { RHS = ParseBinOpRHS(TokPrec+1, RHS); if (RHS == 0) return 0; } //  LHS/RHS. LHS = new BinaryExprAST(BinOp, LHS, RHS); } } /// expression /// ::= primary binoprhs /// static ExprAST *ParseExpression() { ExprAST *LHS = ParsePrimary(); if (!LHS) return 0; return ParseBinOpRHS(0, LHS); } /// prototype /// ::= id '(' id* ')' static PrototypeAST *ParsePrototype() { if (CurTok != tok_identifier) return ErrorP("Expected function name in prototype"); std::string FnName = IdentifierStr; getNextToken(); if (CurTok != '(') return ErrorP("Expected '(' in prototype"); //    . std::vector<std::string> ArgNames; while (getNextToken() == tok_identifier) ArgNames.push_back(IdentifierStr); if (CurTok != ')') return ErrorP("Expected ')' in prototype"); //  . getNextToken(); //  ')'. return new PrototypeAST(FnName, ArgNames); } /// definition ::= 'def' prototype expression static FunctionAST *ParseDefinition() { getNextToken(); //  def. PrototypeAST *Proto = ParsePrototype(); if (Proto == 0) return 0; if (ExprAST *E = ParseExpression()) return new FunctionAST(Proto, E); return 0; } /// toplevelexpr ::= expression static FunctionAST *ParseTopLevelExpr() { if (ExprAST *E = ParseExpression()) { //   . PrototypeAST *Proto = new PrototypeAST("", std::vector<std::string>()); return new FunctionAST(Proto, E); } return 0; } /// external ::= 'extern' prototype static PrototypeAST *ParseExtern() { getNextToken(); //  extern. return ParsePrototype(); } //===----------------------------------------------------------------------===// // Top-Level parsing (  ) //===----------------------------------------------------------------------===// static void HandleDefinition() { if (ParseDefinition()) { fprintf(stderr, "Parsed a function definition.\n"); } else { //      . getNextToken(); } } static void HandleExtern() { if (ParseExtern()) { fprintf(stderr, "Parsed an extern\n"); } else { //      . getNextToken(); } } static void HandleTopLevelExpression() { //      . if (ParseTopLevelExpr()) { fprintf(stderr, "Parsed a top-level expr\n"); } else { //      . getNextToken(); } } /// top ::= definition | external | expression | ';' static void MainLoop() { while (1) { fprintf(stderr, "ready> "); switch (CurTok) { case tok_eof: return; case ';': getNextToken(); break; //     . case tok_def: HandleDefinition(); break; case tok_extern: HandleExtern(); break; default: HandleTopLevelExpression(); break; } } } //===----------------------------------------------------------------------===// // Main driver code (  ) //===----------------------------------------------------------------------===// int main() { //    . // 1 -  . BinopPrecedence['<'] = 10; BinopPrecedence['+'] = 20; BinopPrecedence['-'] = 20; BinopPrecedence['*'] = 40; // highest. fprintf(stderr, "ready> "); getNextToken(); //    " ". MainLoop(); return 0; } 


PS
, « », ( ). , , , :)

Source: https://habr.com/ru/post/120005/


All Articles