if/then/else
and a simple "for"
loop.if/then/else
is a fairly simple task. It is required to add support for this “new” concept to the lexical and syntactic analyzers, AST and code generator. This is a good example because it shows how easy it is to “build up” a language over time, gradually expanding it as new ideas appear. def fib(x) if x < 3 then 1 else fib(x-1)+fib(x-2);
if/then/else
expression, like any others, should return a value. Since we mainly use the functional form, we must calculate the condition and then return the value of "then"
or "else"
depending on its value. This is very similar to the C "?:"
Expression.if/then/else
expression is that it evaluates the condition as a logical value: 0.0
is considered false, and everything else is true. If the condition is true, the first subexpression is calculated and returned; if the condition is false, the second subexpression is calculated and returned. Since Kaleidoscope allows for side effects, it is important to fix this behavior. // tok_if = -6, tok_then = -7, tok_else = -8,
... if (IdentifierStr == "def") return tok_def; if (IdentifierStr == "extern") return tok_extern;
if (IdentifierStr == "if") return tok_if; if (IdentifierStr == "then") return tok_then; if (IdentifierStr == "else") return tok_else;
return tok_identifier;
/// IfExprAST - if/then/else. class IfExprAST : public ExprAST { ExprAST *Cond, *Then, *Else; public: IfExprAST(ExprAST *cond, ExprAST *then, ExprAST *_else) : Cond(cond), Then(then), Else(_else) {} virtual Value *Codegen(); };
/// ifexpr ::= 'if' expression 'then' expression 'else' expression static ExprAST *ParseIfExpr() { getNextToken(); // if. // . ExprAST *Cond = ParseExpression(); if (!Cond) return 0; if (CurTok != tok_then) return Error("expected then"); getNextToken(); // then ExprAST *Then = ParseExpression(); if (Then == 0) return 0; if (CurTok != tok_else) return Error("expected else"); getNextToken(); ExprAST *Else = ParseExpression(); if (!Else) return 0; return new IfExprAST(Cond, Then, Else); }
static ExprAST *ParsePrimary() { switch (CurTok) { default: return Error("unknown token when expecting an expression"); case tok_identifier: return ParseIdentifierExpr(); case tok_number: return ParseNumberExpr(); case '(': return ParseParenExpr();
case tok_if: return ParseIfExpr();
} }
if/then/else
, because here we will introduce new concepts. Everything in the above code was described in detail in previous chapters.
extern foo(); extern bar();
def baz(x) if x then foo() else bar();
declare double @foo() declare double @bar() define double @baz(double %x) { entry: %ifcond = fcmp one double %x, 0.000000e+00 br i1 %ifcond, label %then, label %else then: ; preds = %entry %calltmp = call double @foo() br label %ifcont else: ; preds = %entry %calltmp1 = call double @bar() br label %ifcont ifcont: ; preds = %else, %then %iftmp = phi double [ %calltmp, %then ], [ %calltmp1, %else ] ret double %iftmp }
"llvm-as < t.ll | opt -analyze -view-cfg"
, a window will appear in which you will see this graph:"F->viewCFG()"
or "F->viewCFGOnly()"
(where F
is "Function*"
) by actually including the call in the code and recompiling or calling in the debugger. LLVM has many useful features for visualizing various graphs."x"
) and compares the result with 0.0
using the instruction "fcmp one"
"fcmp one"
. Based on the result of this expression, the code goes into either the "then"
block or the "else"
block, which contain expressions for true / false cases.then/else
blocks are completed, both of these branches go to the "ifcont"
block to execute code that is located after if/then/else
. In this case, the only thing left to do is to return to the calling function. Then the question arises: how do you know the code of the expression to return?Phi
Phi
. If you are not familiar with SSA, then the Wikipedia article will be a good introduction, as well as there are other introductory articles about it that are easily found by your favorite search engine. The short version: the “execution” of the Phi
operation requires the “memorization” of the block from which we came. The operation Phi
takes values ​​and the corresponding input control units. In this case, if the control comes from the "then"
block, it gets the value "calltmp"
. If control comes from an "else"
block, it gets the value "calltmp1"
.= 1; = + 1;
= 1; = + 1;
IfExprAST
method for IfExprAST
: Value *IfExprAST::Codegen() { Value *CondV = Cond->Codegen(); if (CondV == 0) return 0; // 0.0. CondV = Builder.CreateFCmpONE(CondV, ConstantFP::get(getGlobalContext(), APFloat(0.0)), "ifcond");
Function *TheFunction = Builder.GetInsertBlock()->getParent(); // then else. 'then' // . BasicBlock *ThenBB = BasicBlock::Create(getGlobalContext(), "then", TheFunction); BasicBlock *ElseBB = BasicBlock::Create(getGlobalContext(), "else"); BasicBlock *MergeBB = BasicBlock::Create(getGlobalContext(), "ifcont"); Builder.CreateCondBr(CondV, ThenBB, ElseBB);
if/then/else
and correspond to the blocks in the example above. The first line gets the current Function
object (the function being formed). She gets it by asking Builder
about the current base unit and getting its “parent” (current function)."TheFunction"
to the constructor of the "then"
block. This causes the constructor to automatically insert a new block at the end of the specified function. Two other blocks are also created, but not yet inserted into the function.IRBuilder
in IRBuilder
, so the insertion still occurs in the same block in which we are included. Also note that it creates a branch for the "then"
block and the "else"
block, although the "else"
block has not yet been inserted into the function. But everything is in order: this is the standard way to support LLVM links "forward". // . Builder.SetInsertPoint(ThenBB); Value *ThenV = Then->Codegen(); if (ThenV == 0) return 0; Builder.CreateBr(MergeBB); // Codegen of 'Then' , ThenBB PHI. ThenBB = Builder.GetInsertBlock();
"then"
block. Strictly speaking, this call moves the cursor to the end of the specified block. However, since the "then"
block is empty, it also begins our insertion at the beginning of the block. :)"then"
expression from the AST. To complete the block "then"
, we create an unconditional transition to the block merging branches. One of the interesting (and very important) aspects of LLVM IR is that it requires all the basic blocks to “complete” with a control flow statement , such as a return or branch. This means that the entire control flow, including the transfer of control to another code ( fall throughs ), must be explicitly present in the LLVM IR. If this rule is violated, the verifier will return an error.Phi
node in the merge block of the branches, we must specify the block / value pairs that indicate how Phi
will work. It is important to note that the Phi
node expects to get an input point for each block predecessor in the CFG. Why, then, do we get the current block when we just set it in ThenBB
five lines higher? The problem is that the expression "then"
may in fact itself change the block that Builder generates, for example, if it contains nested expressions "if/then/else"
. Since the recursive call of Codegen
can arbitrarily change the current block, we need to get the current actual values ​​for the code to be created in the Phi
node. // else. TheFunction->getBasicBlockList().push_back(ElseBB); Builder.SetInsertPoint(ElseBB); Value *ElseV = Else->Codegen(); if (ElseV == 0) return 0; Builder.CreateBr(MergeBB); // 'Else' , ElseBB PHI. ElseBB = Builder.GetInsertBlock();
"else"
block is almost identical to code generation for the "then"
block. The only significant difference in the first line is that it adds an "else"
block to the function. Recall that earlier the "else"
block was created, but not added to the function. Now that the "then"
and "else"
blocks have been generated, we can end up with the merge code: // . TheFunction->getBasicBlockList().push_back(MergeBB); Builder.SetInsertPoint(MergeBB); PHINode *PN = Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()), 2, "iftmp"); PN->addIncoming(ThenV, ThenBB); PN->addIncoming(ElseV, ElseBB); return PN; }
"merge"
block for the function object. The second block moves the insertion point so that the newly created code will be inserted into the "merge"
block. Then we need to create a PHI
node and set the block / value pairs for the PHI
.CodeGen
function returns the Phi
node as the value calculated for the if/then/else
expression. In our example above, this return value will be passed to the top level function code, which will create a return instruction."for"
:
extern putchard(char)
def printstar(n) for i = 1, i < n, 1.0 in putchard(42); # ascii 42 = '*' # 100 '*' printstar(100);
"i"
), which is iterated from the initial value until the condition (in this case, "i < n"
) becomes true, increasing by the step value (in this case, "1.0"
). If the step value is omitted, then by default it is equal to 1.0
. The loop executes its body expression. Since we have nothing better to return, we will simply define the cycle as always returning 0.0
. Later, when we add mutable variables, it will be more useful.if/then/else
:
... Token ... // tok_if = -6, tok_then = -7, tok_else = -8,
tok_for = -9, tok_in = -10
... in gettok ... if (IdentifierStr == "def") return tok_def; if (IdentifierStr == "extern") return tok_extern; if (IdentifierStr == "if") return tok_if; if (IdentifierStr == "then") return tok_then; if (IdentifierStr == "else") return tok_else;
if (IdentifierStr == "for") return tok_for; if (IdentifierStr == "in") return tok_in;
return tok_identifier;
/// ForExprAST - for/in. class ForExprAST : public ExprAST { std::string VarName; ExprAST *Start, *End, *Step, *Body; public: ForExprAST(const std::string &varname, ExprAST *start, ExprAST *end, ExprAST *step, ExprAST *body) : VarName(varname), Start(start), End(end), Step(step), Body(body) {} virtual Value *Codegen(); };
null
in the AST node: /// forexpr ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression static ExprAST *ParseForExpr() { getNextToken(); // for. if (CurTok != tok_identifier) return Error("expected identifier after for"); std::string IdName = IdentifierStr; getNextToken(); // . if (CurTok != '=') return Error("expected '=' after for"); getNextToken(); // '='. ExprAST *Start = ParseExpression(); if (Start == 0) return 0; if (CurTok != ',') return Error("expected ',' after for start value"); getNextToken(); ExprAST *End = ParseExpression(); if (End == 0) return 0; // . ExprAST *Step = 0; if (CurTok == ',') { getNextToken(); Step = ParseExpression(); if (Step == 0) return 0; } if (CurTok != tok_in) return Error("expected 'in' after for"); getNextToken(); // 'in'. ExprAST *Body = ParseExpression(); if (Body == 0) return 0; return new ForExprAST(IdName, Start, End, Step, Body); }
declare double @putchard(double) define double @printstar(double %n) { entry: ; initial value = 1.0 (inlined into phi) br label %loop loop: ; preds = %loop, %entry %i = phi double [ 1.000000e+00, %entry ], [ %nextvar, %loop ] ; %calltmp = call double @putchard(double 4.200000e+01) ; %nextvar = fadd double %i, 1.000000e+00 ; %cmptmp = fcmp ult double %i, %n %booltmp = uitofp i1 %cmptmp to double %loopcond = fcmp one double %booltmp, 0.000000e+00 br i1 %loopcond, label %loop, label %afterloop afterloop: ; preds = %loop ; loop always returns 0.0 ret double 0.000000e+00 }
phi
node, several expressions, and some basic blocks. Let's see how it fits together.Codegen
very simple: we just get the expression for the initial value of the loop: Value *ForExprAST::Codegen() { // Emit the start code first, without 'variable' in scope. Value *StartVal = Start->Codegen(); if (StartVal == 0) return 0;
if/then/else
or for/in
expressions). // , // . Function *TheFunction = Builder.GetInsertBlock()->getParent(); BasicBlock *PreheaderBB = Builder.GetInsertBlock(); BasicBlock *LoopBB = BasicBlock::Create(getGlobalContext(), "loop", TheFunction); // LoopBB. Builder.CreateBr(LoopBB);
if/then/else
. We remember the block from which we got into the loop, since we will need it to create the node Phi
. Then we create the actual block that starts the loop, and create an unconditional transition between the two blocks. // LoopBB. Builder.SetInsertPoint(LoopBB); // PHI Start. PHINode *Variable = Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()), 2, VarName.c_str()); Variable->addIncoming(StartVal, PreheaderBB);
"preheader"
header for the loop is set, we proceed to generate the loop body code. First, we move the cursor and create a PHI
node for the loop variable. Since we already know the initial value, we add it to the Phi
node. Please note that Phi
will eventually get the second value, but we cannot yet install it (because it does not exist yet!). // PHI. // , , . Value *OldVal = NamedValues[VarName]; NamedValues[VarName] = Variable; // . , , // . , , // , . if (Body->Codegen() == 0) return 0;
"for"
loop introduces a new variable into the symbol table. This means that our symbol table can contain either function arguments or loop variables. Before we generate the loop body code, we add the loop variable as the current value for its name. Please note - it is possible that there is already a variable with the same name in the outer scope. You can easily make this an error (return an error and return null
if there is already an entry for VarName), but we will allow the overlapping of variables. For proper processing, we memorize the potentially overlapping value in OldVal
(which will be null
if there is no overlapping variable). // . Value *StepVal; if (Step) { StepVal = Step->Codegen(); if (StepVal == 0) return 0; } else { // , 1.0. StepVal = ConstantFP::get(getGlobalContext(), APFloat(1.0)); } Value *NextVar = Builder.CreateFAdd(Variable, StepVal, "nextvar");
1.0
, if it is not specified. "NextVar"
will be the value of the loop variable in the next loop step. // . Value *EndCond = End->Codegen(); if (EndCond == 0) return EndCond; // 0.0. EndCond = Builder.CreateFCmpONE(EndCond, ConstantFP::get(getGlobalContext(), APFloat(0.0)), "loopcond");
if/then/else
. // " " . BasicBlock *LoopEndBB = Builder.GetInsertBlock(); BasicBlock *AfterBB = BasicBlock::Create(getGlobalContext(), "afterloop", TheFunction); // LoopEndBB. Builder.CreateCondBr(EndCond, LoopBB, AfterBB); // AfterBB. Builder.SetInsertPoint(AfterBB);
phi
), and then creates a block to exit the loop ( "afterloop"
). Based on the value of the exit condition, it creates a conditional transition that chooses between executing the loop again or exiting the loop. Any subsequent code is added to the block "afterloop"
, so we move the insertion point exactly there. // PHI. Variable->addIncoming(NextVar, LoopEndBB); // . if (OldVal) NamedValues[VarName] = OldVal; else NamedValues.erase(VarName); // for 0.0. return Constant::getNullValue(Type::getDoubleTy(getGlobalContext())); }
"NextVar"
and we can add the loop's PHI value to the node. After that, we delete the loop variable from the symbol table, so it is not in scope after the loop. Finally, the code generated for the loop always returns 0.0
, so we return from ForExprAST::Codegen
.if/then/else
and for
... Going as follows:#
g++ -g toy.cpp `llvm-config --cppflags --ldflags --libs core jit native` -O3 -o toy
#
./toy
#include "llvm/DerivedTypes.h" #include "llvm/ExecutionEngine/ExecutionEngine.h" #include "llvm/ExecutionEngine/JIT.h" #include "llvm/LLVMContext.h" #include "llvm/Module.h" #include "llvm/PassManager.h" #include "llvm/Analysis/Verifier.h" #include "llvm/Analysis/Passes.h" #include "llvm/Target/TargetData.h" #include "llvm/Target/TargetSelect.h" #include "llvm/Transforms/Scalar.h" #include "llvm/Support/IRBuilder.h" #include <cstdio> #include <string> #include <map> #include <vector> using namespace llvm; //===----------------------------------------------------------------------===// // Lexer ( ) //===----------------------------------------------------------------------===// // [0-255], , // enum Token { tok_eof = -1, // ( ) tok_def = -2, tok_extern = -3, // ( : , ) tok_identifier = -4, tok_number = -5, // tok_if = -6, tok_then = -7, tok_else = -8, tok_for = -9, tok_in = -10 }; static std::string IdentifierStr; // , tok_identifier static double NumVal; // , tok_number /// gettok - . static int gettok() { static int LastChar = ' '; // . while (isspace(LastChar)) LastChar = getchar(); if (isdigit(LastChar) || LastChar == '.') { // : [0-9.]+ IdentifierStr = LastChar; while (isalnum((LastChar = getchar()))) IdentifierStr += LastChar; if (IdentifierStr == "def") return tok_def; if (IdentifierStr == "extern") return tok_extern; if (IdentifierStr == "if") return tok_if; if (IdentifierStr == "then") return tok_then; if (IdentifierStr == "else") return tok_else; if (IdentifierStr == "for") return tok_for; if (IdentifierStr == "in") return tok_in; return tok_identifier; } if (isdigit(LastChar) || LastChar == '.') { // Number: [0-9.]+ std::string NumStr; do { NumStr += LastChar; LastChar = getchar(); } while (isdigit(LastChar) || LastChar == '.'); NumVal = strtod(NumStr.c_str(), 0); return tok_number; } if (LastChar == '#') { // do LastChar = getchar(); while (LastChar != EOF && LastChar != '\n' && LastChar != '\r'); if (LastChar != EOF) return gettok(); } // . if (LastChar == EOF) return tok_eof; // ASCII int ThisChar = LastChar; LastChar = getchar(); return ThisChar; } //===----------------------------------------------------------------------===// // Abstract Syntax Tree ( ) //===----------------------------------------------------------------------===// /// ExprAST - . class ExprAST { public: virtual ~ExprAST() {} virtual Value *Codegen() = 0; }; /// NumberExprAST - (, "1.0"). class NumberExprAST : public ExprAST { double Val; public: NumberExprAST(double val) : Val(val) {} virtual Value *Codegen(); }; /// VariableExprAST - (, "a"). class VariableExprAST : public ExprAST { std::string Name; public: VariableExprAST(const std::string &name) : Name(name) {} virtual Value *Codegen(); }; /// BinaryExprAST - . class BinaryExprAST : public ExprAST { char Op; ExprAST *LHS, *RHS; public: BinaryExprAST(char op, ExprAST *lhs, ExprAST *rhs) : Op(op), LHS(lhs), RHS(rhs) {} virtual Value *Codegen(); }; /// CallExprAST - . class CallExprAST : public ExprAST { std::string Callee; std::vector<ExprAST*> Args; public: CallExprAST(const std::string &callee, std::vector<ExprAST*> &args) : Callee(callee), Args(args) {} virtual Value *Codegen(); }; /// IfExprAST - if/then/else. class IfExprAST : public ExprAST { ExprAST *Cond, *Then, *Else; public: IfExprAST(ExprAST *cond, ExprAST *then, ExprAST *_else) : Cond(cond), Then(then), Else(_else) {} virtual Value *Codegen(); }; /// ForExprAST - for/in. class ForExprAST : public ExprAST { std::string VarName; ExprAST *Start, *End, *Step, *Body; public: ForExprAST(const std::string &varname, ExprAST *start, ExprAST *end, ExprAST *step, ExprAST *body) : VarName(varname), Start(start), End(end), Step(step), Body(body) {} virtual Value *Codegen(); }; /// PrototypeAST - "" , /// (, , /// ). class PrototypeAST { std::string Name; std::vector<std::string> Args; public: PrototypeAST(const std::string &name, const std::vector<std::string> &args) : Name(name), Args(args) {} Function *Codegen(); }; /// FunctionAST - class FunctionAST { PrototypeAST *Proto; ExprAST *Body; public: FunctionAST(PrototypeAST *proto, ExprAST *body) : Proto(proto), Body(body) {} Function *Codegen(); }; //===----------------------------------------------------------------------===// // Parser ( ) //===----------------------------------------------------------------------===// /// CurTok/getNextToken - . CurTok - /// , . getNextToken /// CurTok. static int CurTok; static int getNextToken() { return CurTok = gettok(); } /// BinopPrecedence - static std::map<char, int> BinopPrecedence; /// GetTokPrecedence - . static int GetTokPrecedence() { if (!isascii(CurTok)) return -1; // , . int TokPrec = BinopPrecedence[CurTok]; if (TokPrec <= 0) return -1; return TokPrec; } /// Error* - . ExprAST *Error(const char *Str) { fprintf(stderr, "Error: %s\n", Str);return 0;} PrototypeAST *ErrorP(const char *Str) { Error(Str); return 0; } FunctionAST *ErrorF(const char *Str) { Error(Str); return 0; } static ExprAST *ParseExpression(); /// identifierexpr /// ::= identifier /// ::= identifier '(' expression* ')' static ExprAST *ParseIdentifierExpr() { std::string IdName = IdentifierStr; getNextToken(); // . if (CurTok != '(') // . return new VariableExprAST(IdName); // . getNextToken(); // ( std::vector<ExprAST*> Args; if (CurTok != ')') { while (1) { ExprAST *Arg = ParseExpression(); if (!Arg) return 0; Args.push_back(Arg); if (CurTok == ')') break; if (CurTok != ',') return Error("Expected ')' or ',' in argument list"); getNextToken(); } } // ')'. getNextToken(); return new CallExprAST(IdName, Args); } /// numberexpr ::= number static ExprAST *ParseNumberExpr() { ExprAST *Result = new NumberExprAST(NumVal); getNextToken(); // return Result; } /// parenexpr ::= '(' expression ')' static ExprAST *ParseParenExpr() { getNextToken(); // (. ExprAST *V = ParseExpression(); if (!V) return 0; if (CurTok != ')') return Error("expected ')'"); getNextToken(); // ). return V; } /// ifexpr ::= 'if' expression 'then' expression 'else' expression static ExprAST *ParseIfExpr() { getNextToken(); // if. // . ExprAST *Cond = ParseExpression(); if (!Cond) return 0; if (CurTok != tok_then) return Error("expected then"); getNextToken(); // then ExprAST *Then = ParseExpression(); if (Then == 0) return 0; if (CurTok != tok_else) return Error("expected else"); getNextToken(); ExprAST *Else = ParseExpression(); if (!Else) return 0; return new IfExprAST(Cond, Then, Else); } /// forexpr ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression static ExprAST *ParseForExpr() { getNextToken(); // for. if (CurTok != tok_identifier) return Error("expected identifier after for"); std::string IdName = IdentifierStr; getNextToken(); // . if (CurTok != '=') return Error("expected '=' after for"); getNextToken(); // '='. ExprAST *Start = ParseExpression(); if (Start == 0) return 0; if (CurTok != ',') return Error("expected ',' after for start value"); getNextToken(); ExprAST *End = ParseExpression(); if (End == 0) return 0; // . ExprAST *Step = 0; if (CurTok == ',') { getNextToken(); Step = ParseExpression(); if (Step == 0) return 0; } if (CurTok != tok_in) return Error("expected 'in' after for"); getNextToken(); // 'in'. ExprAST *Body = ParseExpression(); if (Body == 0) return 0; return new ForExprAST(IdName, Start, End, Step, Body); } /// primary /// ::= identifierexpr /// ::= numberexpr /// ::= parenexpr /// ::= ifexpr /// ::= forexpr static ExprAST *ParsePrimary() { switch (CurTok) { default: return Error("unknown token when expecting an expression"); case tok_identifier: return ParseIdentifierExpr(); case tok_number: return ParseNumberExpr(); case '(': return ParseParenExpr(); case tok_if: return ParseIfExpr(); case tok_for: return ParseForExpr(); } } /// binoprhs /// ::= ('+' primary)* static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) { // , while (1) { int TokPrec = GetTokPrecedence(); // , // , if (TokPrec < ExprPrec) return LHS; // , , . int BinOp = CurTok; getNextToken(); // // ExprAST *RHS = ParsePrimary(); if (!RHS) return 0; // BinOp RHS , RHS, // RHS LHS. int NextPrec = GetTokPrecedence(); if (TokPrec < NextPrec) { RHS = ParseBinOpRHS(TokPrec+1, RHS); if (RHS == 0) return 0; } // LHS/RHS. LHS = new BinaryExprAST(BinOp, LHS, RHS); } } /// expression /// ::= primary binoprhs /// static ExprAST *ParseExpression() { ExprAST *LHS = ParsePrimary(); if (!LHS) return 0; return ParseBinOpRHS(0, LHS); } /// prototype /// ::= id '(' id* ')' static PrototypeAST *ParsePrototype() { if (CurTok != tok_identifier) return ErrorP("Expected function name in prototype"); // . std::string FnName = IdentifierStr; getNextToken(); if (CurTok != '(') return ErrorP("Expected '(' in prototype"); std::vector<std::string> ArgNames; while (getNextToken() == tok_identifier) ArgNames.push_back(IdentifierStr); if (CurTok != ')') return ErrorP("Expected ')' in prototype"); // . getNextToken(); // ')'. return new PrototypeAST(FnName, ArgNames); } /// definition ::= 'def' prototype expression static FunctionAST *ParseDefinition() { getNextToken(); // def. PrototypeAST *Proto = ParsePrototype(); if (Proto == 0) return 0; if (ExprAST *E = ParseExpression()) return new FunctionAST(Proto, E); return 0; } /// toplevelexpr ::= expression static FunctionAST *ParseTopLevelExpr() { if (ExprAST *E = ParseExpression()) { // . PrototypeAST *Proto = new PrototypeAST("", std::vector<std::string>()); return new FunctionAST(Proto, E); } return 0; } /// external ::= 'extern' prototype static PrototypeAST *ParseExtern() { getNextToken(); // extern. return ParsePrototype(); } //===----------------------------------------------------------------------===// // Code Generation () //===----------------------------------------------------------------------===// static Module *TheModule; static IRBuilder<> Builder(getGlobalContext()); static std::map<std::string, Value*> NamedValues; static FunctionPassManager *TheFPM; Value *ErrorV(const char *Str) { Error(Str); return 0; } Value *NumberExprAST::Codegen() { return ConstantFP::get(getGlobalContext(), APFloat(Val)); } Value *VariableExprAST::Codegen() { // . Value *V = NamedValues[Name]; return V ? V : ErrorV("Unknown variable name"); } Value *BinaryExprAST::Codegen() { Value *L = LHS->Codegen(); Value *R = RHS->Codegen(); if (L == 0 || R == 0) return 0; switch (Op) { case '+': return Builder.CreateFAdd(L, R, "addtmp"); case '-': return Builder.CreateFSub(L, R, "subtmp"); case '*': return Builder.CreateFMul(L, R, "multmp"); case '<': L = Builder.CreateFCmpULT(L, R, "cmptmp"); // 0 1 0.0 1.0 return Builder.CreateUIToFP(L, Type::getDoubleTy(getGlobalContext()), "booltmp"); default: return ErrorV("invalid binary operator"); } } Value *CallExprAST::Codegen() { // . Function *CalleeF = TheModule->getFunction(Callee); if (CalleeF == 0) return ErrorV("Unknown function referenced"); // . if (CalleeF->arg_size() != Args.size()) return ErrorV("Incorrect # arguments passed"); std::vector<Value*> ArgsV; for (unsigned i = 0, e = Args.size(); i != e; ++i) { ArgsV.push_back(Args[i]->Codegen()); if (ArgsV.back() == 0) return 0; } return Builder.CreateCall(CalleeF, ArgsV.begin(), ArgsV.end(), "calltmp"); } Value *IfExprAST::Codegen() { Value *CondV = Cond->Codegen(); if (CondV == 0) return 0; // 0.0. CondV = Builder.CreateFCmpONE(CondV, ConstantFP::get(getGlobalContext(), APFloat(0.0)), "ifcond"); Function *TheFunction = Builder.GetInsertBlock()->getParent(); // then else. 'then' // . BasicBlock *ThenBB = BasicBlock::Create(getGlobalContext(), "then", TheFunction); BasicBlock *ElseBB = BasicBlock::Create(getGlobalContext(), "else"); BasicBlock *MergeBB = BasicBlock::Create(getGlobalContext(), "ifcont"); Builder.CreateCondBr(CondV, ThenBB, ElseBB); // . Builder.SetInsertPoint(ThenBB); Value *ThenV = Then->Codegen(); if (ThenV == 0) return 0; Builder.CreateBr(MergeBB); // 'Then' , ThenBB PHI. ThenBB = Builder.GetInsertBlock(); // else. TheFunction->getBasicBlockList().push_back(ElseBB); Builder.SetInsertPoint(ElseBB); Value *ElseV = Else->Codegen(); if (ElseV == 0) return 0; Builder.CreateBr(MergeBB); // 'Else' , ElseBB PHI. ElseBB = Builder.GetInsertBlock(); // . TheFunction->getBasicBlockList().push_back(MergeBB); Builder.SetInsertPoint(MergeBB); PHINode *PN = Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()), 2, "iftmp"); PN->addIncoming(ThenV, ThenBB); PN->addIncoming(ElseV, ElseBB); return PN; } Value *ForExprAST::Codegen() { // Output this as: // ... // start = startexpr // goto loop // loop: // variable = phi [start, loopheader], [nextvariable, loopend] // ... // bodyexpr // ... // loopend: // step = stepexpr // nextvariable = variable + step // endcond = endexpr // br endcond, loop, endloop // outloop: // Emit the start code first, without 'variable' in scope. Value *StartVal = Start->Codegen(); if (StartVal == 0) return 0; // , // . Function *TheFunction = Builder.GetInsertBlock()->getParent(); BasicBlock *PreheaderBB = Builder.GetInsertBlock(); BasicBlock *LoopBB = BasicBlock::Create(getGlobalContext(), "loop", TheFunction); // LoopBB. Builder.CreateBr(LoopBB); // LoopBB. Builder.SetInsertPoint(LoopBB); // PHI Start. PHINode *Variable = Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()), 2, VarName.c_str()); Variable->addIncoming(StartVal, PreheaderBB); // PHI. // , , . Value *OldVal = NamedValues[VarName]; NamedValues[VarName] = Variable; // . , , // . , , // , . if (Body->Codegen() == 0) return 0; // . Value *StepVal; if (Step) { StepVal = Step->Codegen(); if (StepVal == 0) return 0; } else { // , 1.0. StepVal = ConstantFP::get(getGlobalContext(), APFloat(1.0)); } Value *NextVar = Builder.CreateFAdd(Variable, StepVal, "nextvar"); // . Value *EndCond = End->Codegen(); if (EndCond == 0) return EndCond; // 0.0. EndCond = Builder.CreateFCmpONE(EndCond, ConstantFP::get(getGlobalContext(), APFloat(0.0)), "loopcond"); // " " . BasicBlock *LoopEndBB = Builder.GetInsertBlock(); BasicBlock *AfterBB = BasicBlock::Create(getGlobalContext(), "afterloop", TheFunction); // LoopEndBB. Builder.CreateCondBr(EndCond, LoopBB, AfterBB); // AfterBB. Builder.SetInsertPoint(AfterBB); // PHI. Variable->addIncoming(NextVar, LoopEndBB); // . if (OldVal) NamedValues[VarName] = OldVal; else NamedValues.erase(VarName); // for 0.0. return Constant::getNullValue(Type::getDoubleTy(getGlobalContext())); } Function *PrototypeAST::Codegen() { // : double(double,double) .. std::vector<const Type*> Doubles(Args.size(), Type::getDoubleTy(getGlobalContext())); FunctionType *FT = FunctionType::get(Type::getDoubleTy(getGlobalContext()), Doubles, false); Function *F = Function::Create(FT, Function::ExternalLinkage, Name, TheModule); // (F) , 'Name'. // , . if (F->getName() != Name) { // , . F->eraseFromParent(); F = TheModule->getFunction(Name); // (F) , . if (!F->empty()) { ErrorF("redefinition of function"); return 0; } // (F) , . if (F->arg_size() != Args.size()) { ErrorF("redefinition of function with different # args"); return 0; } } // . unsigned Idx = 0; for (Function::arg_iterator AI = F->arg_begin(); Idx != Args.size(); ++AI, ++Idx) { AI->setName(Args[Idx]); // . NamedValues[Args[Idx]] = AI; } return F; } Function *FunctionAST::Codegen() { NamedValues.clear(); Function *TheFunction = Proto->Codegen(); if (TheFunction == 0) return 0; // . BasicBlock *BB = BasicBlock::Create(getGlobalContext(), "entry", TheFunction); Builder.SetInsertPoint(BB); if (Value *RetVal = Body->Codegen()) { // . Builder.CreateRet(RetVal); // , (). verifyFunction(*TheFunction); // . TheFPM->run(*TheFunction); return TheFunction; } // , . TheFunction->eraseFromParent(); return 0; } //===----------------------------------------------------------------------===// // Top-Level parsing ( ) JIT //===----------------------------------------------------------------------===// static ExecutionEngine *TheExecutionEngine; static void HandleDefinition() { if (FunctionAST *F = ParseDefinition()) { if (Function *LF = F->Codegen()) { fprintf(stderr, "Read function definition:"); LF->dump(); } } else { // . getNextToken(); } } static void HandleExtern() { if (PrototypeAST *P = ParseExtern()) { if (Function *F = P->Codegen()) { fprintf(stderr, "Read extern: "); F->dump(); } } else { // . getNextToken(); } } static void HandleTopLevelExpression() { // . if (FunctionAST *F = ParseTopLevelExpr()) { if (Function *LF = F->Codegen()) { // JIT-, . void *FPtr = TheExecutionEngine->getPointerToFunction(LF); // ( , double), // . double (*FP)() = (double (*)())(intptr_t)FPtr; fprintf(stderr, "Evaluated to %f\n", FP()); } } else { // . getNextToken(); } } /// top ::= definition | external | expression | ';' static void MainLoop() { while (1) { fprintf(stderr, "ready> "); switch (CurTok) { case tok_eof: return; case ';': getNextToken(); break; // . case tok_def: HandleDefinition(); break; case tok_extern: HandleExtern(); break; default: HandleTopLevelExpression(); break; } } } //===----------------------------------------------------------------------===// // "" , // ("extern") . //===----------------------------------------------------------------------===// /// putchard - 0. extern "C" double putchard(double X) { putchar((char)X); return 0; } //===----------------------------------------------------------------------===// // Main driver code ( ) //===----------------------------------------------------------------------===// int main() { InitializeNativeTarget(); LLVMContext &Context = getGlobalContext(); // . // 1 - . BinopPrecedence['<'] = 10; BinopPrecedence['+'] = 20; BinopPrecedence['-'] = 20; BinopPrecedence['*'] = 40; // . // Prime the first token. fprintf(stderr, "ready> "); getNextToken(); // , . TheModule = new Module("my cool jit", Context); // JIT. . std::string ErrStr; TheExecutionEngine = EngineBuilder(TheModule).setErrorStr(&ErrStr).create(); if (!TheExecutionEngine) { fprintf(stderr, "Could not create ExecutionEngine: %s\n", ErrStr.c_str()); exit(1); } FunctionPassManager OurFPM(TheModule); // . , // . OurFPM.add(new TargetData(*TheExecutionEngine->getTargetData())); // AliasAnalysis GVN. OurFPM.add(createBasicAliasAnalysisPass()); // "peephole" "bit-twiddling". OurFPM.add(createInstructionCombiningPass()); // . OurFPM.add(createReassociatePass()); // . OurFPM.add(createGVNPass()); // ( ..). OurFPM.add(createCFGSimplificationPass()); OurFPM.doInitialization(); // , . TheFPM = &OurFPM; // " ". MainLoop(); TheFPM = 0; // . TheModule->dump(); return 0; }
Source: https://habr.com/ru/post/120881/
All Articles