Nim Tutorial (Part 2)

Note from the translator

The first part is here: "Nim Tutorial (Part 1)"

I made this translation for myself, that is, clumsily and in haste. Some phrases had to be wrestled into shape before they even remotely resembled Russian. If you know how to phrase something better, send me a private message and I will fix it.

Introduction

"Repetition renders the ridiculous reasonable." - Norman Wildberger

This document is a tutorial on the advanced constructs of the Nim language. Keep in mind that it is somewhat out of date; the manual has many more up-to-date examples of the advanced language features.

Pragmas

Pragmas are Nim's way of giving the compiler additional information or commands without introducing new keywords. Pragmas are enclosed in special curly braces with dots: {. and .}. They are not covered in this tutorial. For a list of available pragmas, refer to the manual or the user guide.

Object oriented programming

Although Nim's support for object-oriented programming (OOP) is minimalistic, powerful OOP techniques can still be used. OOP is regarded as one way to design a program, but not the only one. Often a procedural approach leads to simpler and more efficient code. In particular, preferring composition over inheritance often results in a better design.

Objects

Like tuples, objects are a means to pack different values together in a single structure. However, objects provide features that tuples do not: inheritance and information hiding. Because objects encapsulate data, the T() object constructor should only be used internally, and the programmer should provide a dedicated proc to initialize the object (this proc is called a constructor).

Objects have access to their type at run time. The of operator can be used to check the type of an object:

    type
      Person = ref object of RootObj
        name*: string  # the * means that `name` is accessible from other modules
        age: int       # no * means that the field is hidden from other modules

      Student = ref object of Person # Student inherits from Person
        id: int                      # with an id field

    var
      student: Student
      person: Person
    assert(student of Student) # is true
    # object construction:
    student = Student(name: "Anton", age: 5, id: 2)
    echo student[]

Object fields that should be visible outside the module in which they are defined are marked with an asterisk (*). Unlike tuples, different object types are never equivalent. New object types can only be defined within a type section.

Inheritance is done with the object of syntax. Multiple inheritance is currently not supported. If an object type has no suitable ancestor, RootObj can be used as its ancestor, but this is only a convention. Objects that have no ancestor are implicitly final. To introduce a new object root that does not inherit from system.RootObj, you can use the inheritable pragma (this is used, for example, in the GTK wrapper).

Ref objects should be used whenever inheritance is used. This is not strictly necessary, but with non-ref objects an assignment such as let person: Person = Student(id: 123) truncates the fields of the child class.

Note: for simple code reuse, composition (a "has-a" relation) is often preferable to inheritance (an "is-a" relation). Because objects in Nim are value types, composition is just as efficient as inheritance.
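
The run-time type check and the conversion back to a subtype can be sketched as follows. This is a minimal illustration, not part of the original tutorial; the field values are made up, and doAssert is used only to make the expected results explicit.

```nim
type
  Person = ref object of RootObj
    name: string
  Student = ref object of Person
    id: int

# The static type of `p` is Person; its run-time type is Student.
let p: Person = Student(name: "Anton", id: 2)

doAssert p of Student          # checked at run time
doAssert Student(p).id == 2    # checked conversion back to the subtype
doAssert not (Person(name: "Eve") of Student)
```

Because Person and Student are ref types here, storing a Student in a Person variable keeps the id field intact.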

Mutually Recursive Types

With objects, tuples and references you can model quite complex data structures that depend on each other and are thus mutually recursive. In Nim, such types can only be declared within a single type section. (Anything else would require arbitrary symbol lookahead, which slows down compilation.)

Example:

    type
      Node = ref NodeObj # a traced reference to a NodeObj
      NodeObj = object
        le, ri: Node     # left and right subtrees
        sym: ref Sym     # leaves contain a reference to a Sym

      Sym = object       # a symbol
        name: string     # the symbol's name
        line: int        # the line the symbol was declared in
        code: Node       # the symbol's abstract syntax tree

Type conversion

Nim distinguishes between type casts and type conversions. A cast is done with the cast operator and forces the compiler to interpret a bit pattern as the specified type.

A type conversion is a much more polite way to turn one type into another: it checks whether the types are convertible. If the conversion is not possible, either the compiler reports it or an exception is raised.

The syntax for type conversion is: destination_type(expression_to_convert) (like an ordinary call).

    proc getID(x: Person): int =
      Student(x).id

If x is not an instance of Student, an InvalidObjectConversionError is raised (current Nim versions call it ObjectConversionDefect).
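
The difference between a conversion and a cast is easiest to see on plain numbers. The following is a small sketch with made-up values: the conversion changes the representation to preserve the (truncated) value, while the cast keeps the bits and only changes how they are interpreted.

```nim
let f = 3.7                     # a 64-bit float

let i = int(f)                  # conversion: the value is truncated to 3
let bits = cast[uint64](f)      # cast: the same 8 bytes, read as an integer

doAssert i == 3
doAssert cast[float](bits) == f # casting back restores the original float
doAssert uint64(i) != bits      # the bit pattern is not the number 3
```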

Variant objects

Often an object hierarchy is overkill, and simple variant types are all that is needed.

For example:

    # This is an example of how an abstract syntax tree could be modelled
    # in Nim
    type
      NodeKind = enum  # the different node types
        nkInt,         # a leaf with an integer value
        nkFloat,       # a leaf with a float value
        nkString,      # a leaf with a string value
        nkAdd,         # an addition
        nkSub,         # a subtraction
        nkIf           # an if statement
      Node = ref NodeObj
      NodeObj = object
        case kind: NodeKind  # the ``kind`` field is the discriminator
        of nkInt: intVal: int
        of nkFloat: floatVal: float
        of nkString: strVal: string
        of nkAdd, nkSub:
          leftOp, rightOp: Node
        of nkIf:
          condition, thenPart, elsePart: Node

    var n = Node(kind: nkFloat, floatVal: 1.0)
    # the following statement raises a `FieldError` exception, because
    # n.kind's value does not fit:
    n.strVal = ""

As the example shows, unlike with an object hierarchy, no conversions between different object types are needed. However, accessing a field that does not belong to the active branch raises an exception.
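
A typical way to consume a variant object is a case statement over the discriminator. The sketch below uses a reduced, hypothetical version of the node type above (only nkInt, nkAdd and nkSub) and a made-up helper lit to evaluate a small expression tree.

```nim
type
  NodeKind = enum nkInt, nkAdd, nkSub
  Node = ref object
    case kind: NodeKind
    of nkInt:
      intVal: int
    of nkAdd, nkSub:
      leftOp, rightOp: Node

proc lit(v: int): Node =
  ## Hypothetical helper: builds an integer leaf.
  Node(kind: nkInt, intVal: v)

proc eval(n: Node): int =
  ## Each branch only touches the fields that are valid for its kind.
  case n.kind
  of nkInt: n.intVal
  of nkAdd: eval(n.leftOp) + eval(n.rightOp)
  of nkSub: eval(n.leftOp) - eval(n.rightOp)

# (4 + 1) - 2
let tree = Node(kind: nkSub,
                leftOp: Node(kind: nkAdd, leftOp: lit(4), rightOp: lit(1)),
                rightOp: lit(2))
doAssert eval(tree) == 3
```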

Methods

In ordinary object-oriented languages, procedures (also called methods) are bound to a class. This approach has the following disadvantages:

adding a method to a class is either out of the programmer's control or requires clumsy workarounds when the method has to be handled separately from the class;
it is often unclear where a method should belong: is join a string method or an array method?

Nim avoids these problems by not assigning methods to classes. All methods in Nim are multimethods. As we will see later, multimethods differ from procs only in that they use dynamic binding.

Method Call Syntax

Nim has special syntactic sugar for calling routines: the syntax obj.method(args) means the same as method(obj, args). If there are no remaining arguments, the parentheses can be omitted: obj.len instead of len(obj).

This method call syntax is not restricted to objects; it can be used for any type:

    import strutils

    echo("abc".len) # is the same as echo(len("abc"))
    echo("abc".toUpperAscii())
    echo({'a', 'b', 'c'}.card)
    stdout.writeLine("Hallo") # the same as writeLine(stdout, "Hallo")

(Another way to look at the method call syntax is that it provides the missing postfix notation.)

This makes it easy to write "pure object-oriented code":

    import strutils, sequtils

    stdout.writeLine("Give a list of numbers (separated by spaces): ")
    stdout.write(stdin.readLine.split.map(parseInt).max.`$`)
    stdout.writeLine(" is the maximum!")
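
The same sugar works for procs you define yourself. In this sketch the procs shout and between are hypothetical names chosen for illustration; each is called once in the ordinary form and once in the method call form.

```nim
import std/strutils

proc shout(s: string): string =
  s.toUpperAscii & "!"

proc between(x, lo, hi: int): bool =
  lo <= x and x <= hi

let n = 5
doAssert shout("abc") == "ABC!"
doAssert "abc".shout == "ABC!"      # no arguments left: parentheses omitted
doAssert between(n, 1, 10)
doAssert n.between(1, 10)           # first argument moves in front of the dot
```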

Properties

As the example above shows, Nim has no need for get-properties: ordinary get-procedures called with the method call syntax achieve the same. Assigning a value is a different matter, though; it requires a special setter syntax:

    type
      Socket* = ref object of RootObj
        host: int # cannot be accessed from outside the module (no star)

    proc `host=`*(s: var Socket, value: int) {.inline.} =
      ## setter of the host address
      s.host = value

    proc host*(s: Socket): int {.inline.} =
      ## getter of the host address
      s.host

    var s: Socket
    new s
    s.host = 34 # the same as `host=`(s, 34)

(The example also shows inline procedures.)

To provide array properties, you can overload the array access operator []:

    type
      Vector* = object
        x, y, z: float

    proc `[]=`* (v: var Vector, i: int, value: float) =
      # setter
      case i
      of 0: v.x = value
      of 1: v.y = value
      of 2: v.z = value
      else: assert(false)

    proc `[]`* (v: Vector, i: int): float =
      # getter
      case i
      of 0: result = v.x
      of 1: result = v.y
      of 2: result = v.z
      else: assert(false)

The example is contrived, since a vector is better modelled with a tuple, which already provides v[] access.
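
For comparison, here is a sketch of the tuple-based alternative mentioned above. The values are made up; note that tuple indices must be compile-time constants.

```nim
var v: tuple[x, y, z: float] = (1.0, 2.0, 3.0)

v[1] = 5.0                  # index access works without any overloads
doAssert v.y == 5.0         # and refers to the same field as the name
doAssert v[0] == 1.0 and v[2] == 3.0
```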

Dynamic binding (dynamic dispatch)

Procedures always use static binding. For dynamic binding, replace the proc keyword with method:

    type
      PExpr = ref object of RootObj ## abstract base class for an expression
      PLiteral = ref object of PExpr
        x: int
      PPlusExpr = ref object of PExpr
        a, b: PExpr

    # watch out: 'eval' relies on dynamic binding
    method eval(e: PExpr): int {.base.} =
      # override this base method
      # (the base pragma is expected by newer Nim versions)
      quit "to override!"

    method eval(e: PLiteral): int = e.x
    method eval(e: PPlusExpr): int = eval(e.a) + eval(e.b)

    proc newLit(x: int): PLiteral = PLiteral(x: x)
    proc newPlus(a, b: PExpr): PPlusExpr = PPlusExpr(a: a, b: b)

    echo eval(newPlus(newPlus(newLit(1), newLit(2)), newLit(4)))

Note that in the example the constructors newLit and newPlus are procs, because static binding suits them better, whereas eval is a method, because it requires dynamic binding.

In a multimethod, all parameters that have an object type are used for dispatching:

    type
      Thing = ref object of RootObj
      Unit = ref object of Thing
        x: int

    method collide(a, b: Thing) {.inline.} =
      quit "to override!"

    method collide(a: Thing, b: Unit) {.inline.} =
      echo "1"

    method collide(a: Unit, b: Thing) {.inline.} =
      echo "2"

    var a, b: Unit
    new a
    new b
    collide(a, b) # output: 2

As the example shows, a multimethod call cannot be ambiguous: collide 2 is preferred over collide 1 because resolution works from left to right. Thus (Unit, Thing) is preferred over (Thing, Unit).

Performance note: Nim does not produce a virtual method table but generates dispatch trees. This avoids an expensive indirect branch on method calls and enables inlining. However, other optimizations, such as compile-time evaluation or dead code elimination, do not work with methods.
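
The difference between static and dynamic binding can be made visible by calling an overloaded proc and a method through a variable of the base type. The type and routine names below are hypothetical, and the sketch assumes a reasonably recent Nim version (it uses the base pragma).

```nim
type
  Animal = ref object of RootObj
  Dog = ref object of Animal

# Overloaded procs: the overload is chosen from the static type.
proc nameStatic(a: Animal): string = "animal"
proc nameStatic(d: Dog): string = "dog"

# Methods: the implementation is chosen from the run-time type.
method nameDynamic(a: Animal): string {.base.} = "animal"
method nameDynamic(d: Dog): string = "dog"

let pet: Animal = Dog()   # static type Animal, run-time type Dog

doAssert pet.nameStatic == "animal"
doAssert pet.nameDynamic == "dog"
```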

Exceptions

In Nim, exceptions are objects. By convention, exception type names end with "Error". The system module defines an exception hierarchy that you may want to stick to. Exceptions derive from system.Exception, which provides the common interface.

Exceptions have to be allocated on the heap because their lifetime is unknown. The compiler will not let you raise an exception created on the stack. Every raised exception should at least state the reason for being raised in its msg field.

By convention, exceptions should be raised only in exceptional cases: for example, failing to open a file should not raise an exception, since this is quite common (the file may not exist).

`raise` statement

Exceptions are raised with the raise statement:

    var
      e: ref OSError
    new(e)
    e.msg = "the request to the OS failed"
    raise e

If the raise keyword is not followed by an expression, the current exception is re-raised. To avoid repeating the code above, you can use the newException template from the system module:

 raise newException(OSError, "the request to the OS failed")
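
The same template works for your own exception types. The sketch below assumes a recent Nim version, where user-defined exceptions usually inherit from CatchableError; ConfigError and checkPort are hypothetical names.

```nim
type
  ConfigError = object of CatchableError   # name follows the "Error" convention

proc checkPort(port: int) =
  if port notin 1..65535:
    raise newException(ConfigError, "port out of range: " & $port)

var message = ""
try:
  checkPort(70000)
except ConfigError as e:
  message = e.msg          # the reason stored by newException

doAssert message == "port out of range: 70000"
```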

`try` statement

The try statement handles exceptions:

    import strutils

    # read the first two lines of a text file that should contain numbers
    # and try to add them
    var
      f: File
    if open(f, "numbers.txt"):
      try:
        let a = readLine(f)
        let b = readLine(f)
        echo "sum: ", parseInt(a) + parseInt(b)
      except OverflowError:
        echo "overflow!"
      except ValueError:
        echo "could not convert string to integer"
      except IOError:
        echo "IO error!"
      except:
        echo "Unknown exception!"
        # reraise the unknown exception:
        raise
      finally:
        close(f)

The statements after try are executed until an exception is raised. If that happens, the matching except branch is executed.

The bare except branch is executed if the raised exception is not among those listed explicitly. It is similar to the else branch of an if statement.

If there is a finally branch, it is always executed after the exception handlers.

The exception is consumed in an except branch. If an exception is not handled, it propagates up the call stack. This means that, when an exception occurs, the rest of the procedure that is not within a finally branch is not executed.
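
The order of execution described above can be checked with a small experiment. In this sketch (all names are made up) each step appends a marker to a sequence: risky handles its exception locally, while unhandled only has a finally branch, so the exception propagates to the caller.

```nim
var events: seq[string]

proc risky() =
  try:
    events.add "try"
    raise newException(ValueError, "boom")
  except ValueError:
    events.add "except"        # the exception is consumed here
  finally:
    events.add "finally"       # runs after the handler
  events.add "after"           # reached, because the exception was handled

proc unhandled(fail: bool) =
  try:
    if fail: raise newException(IOError, "disk")
  finally:
    events.add "cleanup"       # still runs while the exception propagates
  events.add "rest of proc"    # skipped when the exception is not handled

risky()
try:
  unhandled(true)
except IOError:
  events.add "caught outside"

doAssert events == @["try", "except", "finally", "after",
                     "cleanup", "caught outside"]
```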

If you need to get the current exception object or its message within the except branch, you can use the getCurrentException() and getCurrentExceptionMsg() procedures from the system module. Example:

    try:
      doSomethingHere()
    except:
      let
        e = getCurrentException()
        msg = getCurrentExceptionMsg()
      echo "Got exception ", repr(e), " with message ", msg

Annotating procs with raised exceptions

With the optional {.raises.} pragma you can specify that a proc is meant to raise a specific set of exceptions, or none at all. If the {.raises.} pragma is used, the compiler verifies that this is true. For example, if you specify that a proc raises IOError, and at some point it (or one of the procs it calls) raises a different exception, the compiler refuses to compile it. Usage example:

    proc complexProc() {.raises: [IOError, ArithmeticError].} =
      ...

    proc simpleProc() {.raises: [].} =
      ...

Once you have code like this in place, if the list of raised exceptions changes, the compiler stops with an error that points to the line of the proc at which validation of the pragma failed and names the exception that is not in the list. It also reports the file and line where the uncaught exception is raised, which helps you locate the offending code whose change caused this.

If you want to add the {.raises.} pragma to existing code, the compiler can help you as well. You can add the {.effects.} pragma statement to your proc, and the compiler will output all the effects inferred up to that point (exception tracking is part of Nim's effect system). Another, more roundabout way to find the list of exceptions raised by a proc is the nim doc2 command, which generates documentation for a whole module and decorates every proc with the list of exceptions it raises. You can read more about the effect system and the related pragmas in the manual.
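
As a concrete, compilable counterpart to the elided bodies above, here is a minimal sketch. parsePort and portOrDefault are hypothetical procs; the example relies on strutils.parseInt reporting malformed input with ValueError and assumes a recent Nim version.

```nim
import std/strutils

proc parsePort(s: string): int {.raises: [ValueError].} =
  ## May only raise ValueError; the compiler verifies this.
  result = parseInt(s)
  if result notin 1..65535:
    raise newException(ValueError, "port out of range: " & s)

proc portOrDefault(s: string): int {.raises: [].} =
  ## Must not raise at all, so the ValueError has to be handled here.
  try:
    result = parsePort(s)
  except ValueError:
    result = 8080

doAssert parsePort("80") == 80
doAssert portOrDefault("not a number") == 8080
```

Removing the try statement from portOrDefault should make the compiler reject the proc, because ValueError would then escape a proc that promises to raise nothing.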

Generics

Generics are Nim's means of parameterizing procs, iterators or types with type parameters. They are most useful for efficient type-safe containers:

    type
      BinaryTreeObj[T] = object # BinaryTree is a generic type with
                                # generic param ``T``
        le, ri: BinaryTree[T]   # left and right subtrees; may be nil
        data: T                 # the data stored in a node
      BinaryTree*[T] = ref BinaryTreeObj[T] # type that is exported

    proc newNode*[T](data: T): BinaryTree[T] =
      # constructor for a node
      new(result)
      result.data = data

    proc add*[T](root: var BinaryTree[T], n: BinaryTree[T]) =
      # insert a node into the tree
      if root == nil:
        root = n
      else:
        var it = root
        while it != nil:
          # compare the data items; uses the generic ``cmp`` proc
          # that works for any type that has a ``==`` and ``<`` operator
          var c = cmp(it.data, n.data)
          if c < 0:
            if it.le == nil:
              it.le = n
              return
            it = it.le
          else:
            if it.ri == nil:
              it.ri = n
              return
            it = it.ri

    proc add*[T](root: var BinaryTree[T], data: T) =
      # convenience proc:
      add(root, newNode(data))

    iterator preorder*[T](root: BinaryTree[T]): T =
      # Preorder traversal of a binary tree.
      # Since recursive iterators are not yet implemented,
      # this uses an explicit stack (which is more efficient anyway):
      var stack: seq[BinaryTree[T]] = @[root]
      while stack.len > 0:
        var n = stack.pop()
        while n != nil:
          yield n.data
          add(stack, n.ri)  # push the right subtree onto the stack
          n = n.le          # and follow the left pointer

    var
      root: BinaryTree[string]  # instantiate a BinaryTree with ``string``
    add(root, newNode("hello")) # instantiates ``newNode`` and ``add``
    add(root, "world")          # instantiates the second ``add`` proc
    for str in preorder(root):
      stdout.writeLine(str)

The example shows a generic binary tree. Depending on context, square brackets are used either to introduce type parameters or to instantiate a generic proc, iterator or type. As the example shows, generics work with overloading: the best match of add is used. The built-in add procedure for sequences is not hidden and is used in the preorder iterator.
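
A much smaller generic proc shows both uses of the square brackets in isolation. maxOf is a hypothetical name; the sketch assumes a non-empty input, since it reads items[0] unconditionally.

```nim
proc maxOf[T](items: openArray[T]): T =
  ## Works for any T that provides the `<` operator.
  result = items[0]
  for it in items:
    if result < it:
      result = it

doAssert maxOf([3, 1, 4]) == 4               # T inferred as int
doAssert maxOf(["pear", "apple"]) == "pear"  # T inferred as string
doAssert maxOf[float]([1.5, 0.5]) == 1.5     # explicit instantiation
```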

Templates

Templates are a simple substitution mechanism that operates on Nim's abstract syntax trees (AST). Templates are processed in the semantic pass of the compiler. They integrate well with the rest of the language and share none of the usual flaws of C's preprocessor macros.

To invoke a template, call it like a procedure.

Example:

    template `!=` (a, b: expr): expr =
      # this definition exists in the System module
      # (current Nim versions spell `expr` as `untyped`)
      not (a == b)

    assert(5 != 6) # the compiler rewrites that to: assert(not (5 == 6))

The !=, >, >=, in, notin and isnot operators are in fact templates: as a result, if you overload the == operator, the != operator becomes available automatically and works correctly (except for IEEE floating-point numbers: NaN breaks strict boolean logic).

a > b is transformed into b < a. a in b is transformed into contains(b, a). notin and isnot have the obvious meanings.

Templates are especially useful when it comes to lazy evaluation. Consider a simple proc for logging:

    const
      debug = true

    proc log(msg: string) {.inline.} =
      if debug: stdout.writeLine(msg)

    var
      x = 4
    log("x has the value: " & $x)

This code has a flaw: if debug is set to false some day, the rather expensive $ and & operations are still performed! (Argument evaluation for procedures is eager.)

Turning the log procedure into a template solves this problem:

    const
      debug = true

    template log(msg: string) =
      if debug: stdout.writeLine(msg)

    var
      x = 4
    log("x has the value: " & $x)
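
The effect of lazy evaluation can be observed by counting how often the argument expression actually runs. In this sketch debug is set to false, expensive is a hypothetical stand-in for a costly computation, and logProc and logTmpl are made-up names for the two variants.

```nim
const debug = false

var calls = 0

proc expensive(): string =
  ## Stand-in for a costly computation; counts its invocations.
  inc calls
  "x has the value: 4"

proc logProc(msg: string) =
  if debug: echo msg

template logTmpl(msg: string) =
  if debug: echo msg

logProc(expensive())   # the argument is evaluated before the call
logTmpl(expensive())   # the argument is substituted into the `if` body

doAssert calls == 1    # only the proc call evaluated its argument
```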

Parameter types can be ordinary types or the metatypes expr (for expressions), stmt (for statements) or typedesc (for type descriptions). If a template has no explicit return type, stmt is used for consistency with procs and methods. (Current Nim versions call expr and stmt untyped and typed.)

If there is a stmt parameter, it should be the last one in the template declaration. The reason is that statements are passed to a template with a special colon (:) syntax:

    # (current Nim versions use `untyped` instead of `expr` and `stmt`
    # and no longer need the immediate pragma)
    template withFile(f: expr, filename: string, mode: FileMode,
                      body: stmt): stmt {.immediate.} =
      let fn = filename
      var f: File
      if open(f, fn, mode):
        try:
          body
        finally:
          close(f)
      else:
        quit("cannot open: " & fn)

    withFile(txt, "ttempl3.txt", fmWrite):
      txt.writeLine("line 1")
      txt.writeLine("line 2")

In the example, the two writeLine statements are bound to the body parameter. The withFile template contains boilerplate code and helps to avoid a common bug: forgetting to close the file. Note how the let fn = filename statement ensures that filename is evaluated only once.
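
The same pattern, reduced to its core and written for a current Nim version (where the block parameter is declared as untyped), might look like the sketch below. collectInto is a hypothetical template: like withFile, it declares a variable whose name is chosen by the caller and then runs the block passed after the colon.

```nim
template collectInto(name: untyped, body: untyped) =
  var name: seq[string] = @[]   # declared under the caller's identifier
  body                          # the statements passed after the colon

collectInto(lines):
  lines.add "line 1"
  lines.add "line 2"

doAssert lines == @["line 1", "line 2"]
```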

Macros

Macros enable extensive code transformations at compile time, but they cannot change Nim's syntax. This is not a serious restriction, however, because Nim's syntax is quite flexible. Macros have to be implemented in pure Nim code, because the foreign function interface (FFI) is not enabled in the compiler; apart from that restriction (which is to be lifted at some point in the future), you can write any kind of Nim code and the compiler will run it at compile time.

There are two ways to write a macro: either generate Nim source code and let the compiler parse it, or create an abstract syntax tree (AST) by hand and feed it to the compiler. To build the AST, you need to know how concrete Nim syntax is converted to an abstract syntax tree. The AST is documented in the macros module.

Once your macro is finished, there are two ways to invoke it:

invoking the macro like a procedure call (expression macro)
invoking the macro with the special macrostmt syntax (statement macro)

Expression macros

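The following example implements a powerful debug command that accepts any number of arguments:

    # to work with Nim syntax trees, we need an API that is defined in the
    # ``macros`` module:
    import macros

    macro debug(n: varargs[expr]): stmt =
      # `n` is a Nim AST that contains a list of expressions;
      # this macro returns a list of statements:
      result = newNimNode(nnkStmtList, n)
      # iterate over any argument that is passed to this macro:
      for i in 0..n.len-1:
        # add a call to the statement list that writes the expression;
        # `toStrLit` converts an AST to its string representation:
        result.add(newCall("write", newIdentNode("stdout"), toStrLit(n[i])))
        # add a call to the statement list that writes ": "
        result.add(newCall("write", newIdentNode("stdout"), newStrLitNode(": ")))
        # add a call to the statement list that writes the expression's value:
        result.add(newCall("writeLine", newIdentNode("stdout"), n[i]))

    var
      a: array[0..10, int]
      x = "some string"
    a[0] = 42
    a[1] = 45

    debug(a[0], a[1], x)

The macro call expands to:

    write(stdout, "a[0]")
    write(stdout, ": ")
    writeLine(stdout, a[0])

    write(stdout, "a[1]")
    write(stdout, ": ")
    writeLine(stdout, a[1])

    write(stdout, "x")
    write(stdout, ": ")
    writeLine(stdout, x)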
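
A reduced variant of the same idea, written for a current Nim version (untyped instead of expr and stmt), returns the text instead of printing it, which makes the result easy to check. showExpr is a hypothetical name; the macro builds the call tree with newCall, just like debug does.

```nim
import std/macros

macro showExpr(e: untyped): untyped =
  # Builds the AST for:  "<source text> = " & $(<expression>)
  result = newCall("&", newLit(e.repr & " = "), newCall("$", e))

let x = 5
doAssert showExpr(x) == "x = 5"
```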

Statement macros

Statement macros are defined just like expression macros. However, they are invoked by an expression that ends with a colon.

The following example outlines a macro that generates a lexical analyzer from regular expressions:

    macro case_token(n: stmt): stmt =
      # creates a lexical analyzer from regular expressions
      # ... (implementation is an exercise for the reader :-)
      discard

    case_token: # this colon tells the parser it is a macro statement
    of r"[A-Za-z_]+[A-Za-z_0-9]*":
      return tkIdentifier
    of r"0-9+":
      return tkInteger
    of r"[\+\-\*\?]+":
      return tkOperator
    else:
      return tkUnknown
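
Since the lexer macro above is left unimplemented, here is a small statement macro that does compile, as a sketch for a current Nim version. traced and trace are hypothetical names: the macro walks over the statements in the block and, before each one, inserts a call that records its source text.

```nim
import std/macros

macro traced(body: untyped): untyped =
  result = newStmtList()
  for s in body:
    # record the source text of the statement, then emit the statement itself
    result.add(newCall("add", newIdentNode("trace"), newLit(s.repr)))
    result.add(s)

var trace: seq[string]
var x = 0

traced:          # the colon passes the indented block to the macro
  inc x
  x = x * 10

doAssert x == 10
doAssert trace.len == 2
```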

Building your first macro

To give you a starting point for writing macros, we will show how to turn typical dynamic code into something that is compiled statically. As an example, we use the following code fragment as the starting point:

    import strutils, tables

    proc readCfgAtRuntime(cfgFilename: string): Table[string, string] =
      let
        inputString = readFile(cfgFilename)
      var
        source = ""

      result = initTable[string, string]()
      for line in inputString.splitLines:
        # Ignore empty lines
        if line.len < 1: continue
        var chunks = split(line, ',')
        if chunks.len != 2:
          quit("Input needs comma split values, got: " & line)
        result[chunks[0]] = chunks[1]

      if result.len < 1: quit("Input file empty!")

    let info = readCfgAtRuntime("data.cfg")

    when isMainModule:
      echo info["licenseOwner"]
      echo info["licenseKey"]
      echo info["version"]

Presumably, this code snippet could be used in commercial software to read a configuration file and display information about the person who bought the software. This external file would be generated at purchase time to include the licensing information in the program:

    version,1.1
    licenseOwner,Hyori Lee
    licenseKey,M1Tl3PjBWO2CC48m

The readCfgAtRuntime proc opens the given file name and returns a Table from the tables module. The file is parsed (without much care for error handling or corner cases) with the splitLines proc from the strutils module. There are many things that can go wrong; remember that this example explains how to run code at compile time, not how to implement copy protection properly.

Implementing this code as a compile-time proc lets us get rid of the data.cfg file, which would otherwise have to be distributed along with the binary. Moreover, if the information really is constant, it makes little sense to keep it in a mutable global variable; a constant is better. Finally, one of the most valuable benefits is that we can implement some checks at compile time. You can think of this as an improved unit test that never lets you obtain a binary in which something is not working. This prevents shipping to users broken programs that fail to start because of a fault in one small critical file.

Source Code Generation

Let's try to modify the program so that, at compile time, we build a string with the generated source code, which we then pass to the parseStmt proc from the macros module. Here is the modified source code implementing the macro:

     1  import macros, strutils
     2
     3  macro readCfgAndBuildSource(cfgFilename: string): stmt =
     4    let
     5      inputString = slurp(cfgFilename.strVal)
     6    var
     7      source = ""
     8
     9    for line in inputString.splitLines:
    10      # Ignore empty lines
    11      if line.len < 1: continue
    12      var chunks = split(line, ',')
    13      if chunks.len != 2:
    14        error("Input needs comma split values, got: " & line)
    15      source &= "const cfg" & chunks[0] & "= \"" & chunks[1] & "\"\n"
    16
    17    if source.len < 1: error("Input file empty!")
    18    result = parseStmt(source)
    19
    20  readCfgAndBuildSource("data.cfg")
    21
    22  when isMainModule:
    23    echo cfglicenseOwner
    24    echo cfglicenseKey
    25    echo cfgversion

The good news is that not much has changed! First, the handling of the input parameter has changed (line 3). In the dynamic version, the readCfgAtRuntime proc receives a string parameter. In the macro version, however, although the parameter is also declared as string, this is only the macro's external interface. When the macro runs, it actually receives a PNimNode object instead of a string, and we have to call the strVal proc from the macros module (line 5) to obtain the string passed to the macro.

Second, we cannot use the readFile proc from the system module because of the FFI restriction at compile time. If we try to use this proc (or any other that depends on the FFI), the compiler fails with an error saying that it cannot evaluate the code, together with a dump of the macro's source code and a stack trace showing where the compiler was when the error occurred. We can get around this restriction with the slurp proc from the system module, which was made specifically for compile time (there is also a similar gorge proc that executes an external program and captures its output).

Interestingly, our macro does not return a runtime Table object. Instead, it builds up Nim source code in the source variable. For each line of the configuration file, a const variable is generated (line 15). To avoid conflicts, we prefix these variables with cfg. In essence, all the compiler does is replace the line calling the macro with the following code fragment:

    const cfgversion= "1.1"
    const cfglicenseOwner= "Hyori Lee"
    const cfglicenseKey= "M1Tl3PjBWO2CC48m"

You can verify this yourself by adding a line that echoes the source code to the screen at the end of the macro and compiling the program. Another difference is that instead of calling the usual quit proc to abort (which we could still call), this version calls the error proc (line 14). The error proc does the same as quit but also prints the source code and the file line number where the error happened, which helps the programmer find errors during compilation. In this situation it would point to the line invoking the macro, not to the line of data.cfg being processed; that is something we would have to track ourselves.
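
To experiment with source code generation without a data.cfg file, the configuration text can be passed directly as a compile-time string. The following is a sketch for a current Nim version, not the tutorial's own code: constsFromText is a hypothetical macro, and the static[string] parameter type is used so that the macro receives an ordinary string instead of an AST node.

```nim
import std/[macros, strutils]

macro constsFromText(text: static[string]): untyped =
  var source = ""
  for line in text.splitLines:
    # Ignore empty lines
    if line.len < 1: continue
    let chunks = line.split(',')
    if chunks.len != 2:
      error("Input needs comma split values, got: " & line)
    source &= "const cfg" & chunks[0] & " = \"" & chunks[1] & "\"\n"
  if source.len < 1: error("Input is empty!")
  # Let the compiler parse the generated source text:
  result = parseStmt(source)

constsFromText("version,1.1\nlicenseOwner,Hyori Lee")

doAssert cfgversion == "1.1"
doAssert cfglicenseOwner == "Hyori Lee"
```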

Manual AST generation

To generate an AST we would, in theory, need to know exactly the structures used by the Nim compiler, which are exposed in the macros module. At first glance this looks like a daunting task. But we can use the dumpTree macro, invoking it as a statement macro rather than an expression macro. Since we know that we want to generate a bunch of const symbols, we can create the following source file and compile it to see what the compiler expects from us:

    import macros

    dumpTree:
      const cfgversion: string = "1.1"
      const cfglicenseOwner= "Hyori Lee"
      const cfglicenseKey= "M1Tl3PjBWO2CC48m"

While compiling the source code we should see the following lines in the output (since this is a macro, compiling is enough; no binary needs to be run):

    StmtList
      ConstSection
        ConstDef
          Ident !"cfgversion"
          Ident !"string"
          StrLit 1.1
      ConstSection
        ConstDef
          Ident !"cfglicenseOwner"
          Empty
          StrLit Hyori Lee
      ConstSection
        ConstDef
          Ident !"cfglicenseKey"
          Empty
          StrLit M1Tl3PjBWO2CC48m

With this information we have a better idea of the input the compiler needs. We need to generate a list of statements. For each constant in the source code, a ConstSection and a ConstDef are generated. If we moved all the constants into a single const block, we would see only one ConstSection with three children.

You may not have noticed, but in the dumpTree example the first constant explicitly declares its type. That is why, in the tree output, the last two constants have an Empty second child while the first has a string identifier. So, basically, a const definition consists of an identifier, an optional type (which can be an empty node) and a value. Armed with this knowledge, let's look at the finished version of the AST-building macro:

     1  import macros, strutils
     2
     3  macro readCfgAndBuildAST(cfgFilename: string): stmt =
     4    let
     5      inputString = slurp(cfgFilename.strVal)
     6
     7    result = newNimNode(nnkStmtList)
     8    for line in inputString.splitLines:
     9      # Ignore empty lines
    10      if line.len < 1: continue
    11      var chunks = split(line, ',')
    12      if chunks.len != 2:
    13        error("Input needs comma split values, got: " & line)
    14      var
    15        section = newNimNode(nnkConstSection)
    16        constDef = newNimNode(nnkConstDef)
    17      constDef.add(newIdentNode("cfg" & chunks[0]))
    18      constDef.add(newEmptyNode())
    19      constDef.add(newStrLitNode(chunks[1]))
    20      section.add(constDef)
    21      result.add(section)
    22
    23    if result.len < 1: error("Input file empty!")
    24
    25  readCfgAndBuildAST("data.cfg")
    26
    27  when isMainModule:
    28    echo cfglicenseOwner
    29    echo cfglicenseKey
    30    echo cfgversion

Since we started from the previous source generation example, we only note the differences. Instead of creating a temporary string variable and writing the source code into it as if it were written by hand, we use the result variable directly and create a statement list node (nnkStmtList) that will hold our children (line 7).

For each input line we create a const definition (nnkConstDef) and wrap it in a const section (nnkConstSection). Once these variables are created, we fill them hierarchically (line 17), as shown in the previous AST dump: the const definition is a child of the section and contains an identifier node, an empty node (we let the compiler figure out the type) and a string literal with the value.

A final tip for writing macros: if you are not sure whether the AST you are building looks right, you may be tempted to use the dumpTree macro. But it cannot be used inside the macro you are writing or debugging. Instead, echo the string generated by treeRepr. If you add echo treeRepr(result) at the end of this example, you will see the same output as with the dumpTree macro. You do not have to call it at the end; you can call it at any point in the macro where you are having trouble.
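
The tree-building steps can also be tried in isolation, without reading a file. The sketch below targets a current Nim version; defineStrConst is a hypothetical macro that builds the StmtList / ConstSection / ConstDef structure for a single constant from two compile-time strings.

```nim
import std/macros

macro defineStrConst(name, value: static[string]): untyped =
  let constDef = newNimNode(nnkConstDef)
  constDef.add(newIdentNode(name))      # the identifier
  constDef.add(newEmptyNode())          # no explicit type: the compiler infers it
  constDef.add(newStrLitNode(value))    # the value

  let section = newNimNode(nnkConstSection)
  section.add(constDef)

  result = newNimNode(nnkStmtList)
  result.add(section)
  # Uncomment to print the tree while compiling:
  # echo treeRepr(result)

defineStrConst("cfgversion", "1.1")

doAssert cfgversion == "1.1"
```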

Source: https://habr.com/ru/post/271361/
