ObjectScript is a new embeddable, very lightweight, object-oriented open source programming language. ObjectScript extends the capabilities of languages such as JavaScript, Lua, Ruby and PHP. The original syntax of the language is described in this article.
ObjectScript 0.99-vm3 brings a new, fast virtual machine and new language features. Some operators (for example, clone and numberof) have been removed and replaced by functions. The last value in a function is now returned automatically. There is also a short form for accessing members of the object (@varname), a new short syntax for declaring functions, and more.
Part 1. The ObjectScript register virtual machine
Traditionally, programming languages use a stack-based virtual machine (Java, .NET, ActionScript and some others). Such VMs use push and pop commands to place values on the stack and to remove them when necessary. An arithmetic operation takes the two values at the top of the stack as its arguments, and the result of the operation replaces them. For example, the following code:
i = j + k
is compiled on a stack VM into the following commands:
push_local j
push_local k
operator +
set_local i
pop
Previously, ObjectScript also used a stack VM, but it no longer does. The virtual machine has been completely rewritten: ObjectScript now uses a register-based virtual machine, and the example above is compiled into a single command:
add i, j, k
In its debug output, ObjectScript shows this command as follows:
var i = var j [operator +] var k
The difference in the number of commands needed to perform the action is immediately striking: 5 for the stack VM versus 1 for the register VM. Code for a stack VM does contain many more commands than code for a register VM, and the number of distinct VM commands differs greatly as well. For example, the previous stack-based ObjectScript VM2 had 111 commands, while the current VM3 has 36. The fewer the commands, the simpler the VM: it is easier to maintain and optimize, easier to implement in other programming languages if necessary, and easier to equip with a JIT (compilation into machine code).
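The comparison can be tried out with a minimal sketch in Python. It is not ObjectScript's actual VM: the command names follow the listings above, while the tuple encoding and the two interpreter loops are assumptions made for illustration only.

```python
def run_stack(program, local_vars):
    """Run a tiny stack-VM program and return how many commands were dispatched."""
    stack = []
    executed = 0
    for op, arg in program:
        executed += 1
        if op == "push_local":
            stack.append(local_vars[arg])
        elif op == "add":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)      # the result replaces both arguments
        elif op == "set_local":
            local_vars[arg] = stack[-1]     # store, leaving the value on the stack
        elif op == "pop":
            stack.pop()
        else:
            raise ValueError(f"unknown command: {op}")
    return executed


def run_register(program, local_vars):
    """Run a tiny register-VM program and return how many commands were dispatched."""
    executed = 0
    for op, dst, left, right in program:
        executed += 1
        if op == "add":
            local_vars[dst] = local_vars[left] + local_vars[right]
        else:
            raise ValueError(f"unknown command: {op}")
    return executed


# i = j + k for both machines
STACK_PROGRAM = [
    ("push_local", "j"),
    ("push_local", "k"),
    ("add", None),
    ("set_local", "i"),
    ("pop", None),
]
REGISTER_PROGRAM = [("add", "i", "j", "k")]

stack_vars = {"i": 0, "j": 2, "k": 3}
register_vars = dict(stack_vars)
stack_count = run_stack(STACK_PROGRAM, stack_vars)
register_count = run_register(REGISTER_PROGRAM, register_vars)
print(stack_count, register_count, stack_vars["i"], register_vars["i"])
```

Both machines leave the same value in i; only the number of dispatch steps differs, and that is where a real interpreter loop spends much of its time.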
An example of swapping the values of two variables:
i, j = j, i
ObjectScript compiles this code into three commands:
move: # = var j
move: var j = var i
move: var i = #
where # denotes a register used as a temporary variable. The commands show that the swap is performed in an optimal way and is fully equivalent to the following code:
temp = j
j = i
i = temp
What is a Register Virtual Machine?
A register virtual machine operates not on a stack of values but on values in registers. A register is a temporary local variable that exists for the entire time it can be accessed.
This is the first advantage of a register VM. A register is always ready for reading and writing, so the VM does not need extra checks for whether such a variable (register) exists, and does not need to allocate or destroy it.
The second advantage is that there are no push and pop commands. Such commands are quite expensive for a VM: each one changes the stack and moves its top, and the stack has to be checked constantly for overflow and for the need to reallocate. Once the commands that manipulate the stack are gone, the results of operations can be saved directly into their final variables, bypassing temporary storage on the stack.
The third advantage is a much smaller number of commands and a VM that is simpler to implement, including for JIT.
As the example above shows, a stack VM command is an atomic action: it is simpler than a register VM command and takes up less space. For example, the push command can be encoded in 1 byte, and the next command can be encoded in the very next byte. This is both a plus and a big minus.
The plus of a stack VM is that commands are encoded in fewer bytes, so the resulting code takes up less space.
The minus is that some commands still cannot be encoded in one byte, for example push_double, jump, etc. Such commands require an additional argument: for push_double it is a double-precision number (8 bytes), and for jump it is the offset of the transition (for example, 4 bytes, int32). Since the commands in a stack VM follow one another without any alignment, the double or the int32 offset can end up at any address in memory, including an odd one. Those familiar with processor architecture will notice the problem immediately: such arguments cannot always be read with a single processor instruction. An attempt to read, for example, a dword at an odd address can raise an exception on the ARM architecture, and the program terminates with an error. Other architectures may produce either errors or a catastrophic drop in performance. Therefore such multibyte arguments have to be read from the byte stream one byte at a time and assembled into the final value with bit shifts. This somewhat reduces the speed of the VM.
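The byte-by-byte assembly described above can be sketched in Python as follows. The opcode value, the little-endian byte order and the function name are assumptions for illustration; the article does not say how VM2 encoded its arguments.

```python
import struct

JUMP = 0x10  # hypothetical one-byte opcode, not an ObjectScript value


def read_int32_le(code, offset):
    """Assemble a signed little-endian int32 from four single-byte reads."""
    value = (code[offset]
             | (code[offset + 1] << 8)
             | (code[offset + 2] << 16)
             | (code[offset + 3] << 24))
    if value >= 0x80000000:          # reinterpret the top bit as the sign
        value -= 0x100000000
    return value


# One opcode byte followed directly by a 4-byte offset: the offset starts
# at address 1, so an aligned 4-byte load cannot be used to read it.
bytecode = bytes([JUMP]) + struct.pack("<i", -300)
jump_offset = read_int32_le(bytecode, 1)
print(jump_offset)
```

Four loads, three shifts and three ORs replace what would be a single load on aligned data, which is the slowdown the paragraph refers to.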
In a register VM, many commands have three arguments and occupy (together with the arguments) 4 bytes. To eliminate the problems described above, all commands in ObjectScript are aligned and always occupy 4 bytes, even if some of the bits are not actually used. For example, the move command in ObjectScript has only two arguments. Even here ObjectScript finds a use for the spare bits. Very often move commands follow one another and fill two consecutive registers. In this case ObjectScript (at the optimization stage) generates one move2 command instead of two move commands:
move2 R, A, B
which is executed as two move commands:
move R, A
move R+1, B
Thus move2 has three arguments and uses all 4 bytes. Aligning the commands allows each one to be read with a single processor instruction, which increases the speed of the VM.
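A fixed 4-byte command and the move2 optimization can be sketched in Python as below. The field layout (an 8-bit opcode plus three 8-bit arguments), the opcode numbers and the function names are assumptions; the article only states that every command occupies 4 bytes, and the real layout must also be able to address constants, which this sketch ignores.

```python
OP_MOVE, OP_MOVE2 = 1, 2  # hypothetical opcode numbers


def encode(op, a, b=0, c=0):
    """Pack an opcode and up to three arguments into one 32-bit word."""
    for field in (op, a, b, c):
        if not 0 <= field <= 0xFF:
            raise ValueError(f"field out of range: {field}")
    return op | (a << 8) | (b << 16) | (c << 24)


def decode(word):
    """Unpack a 32-bit word back into (op, a, b, c)."""
    return (word & 0xFF, (word >> 8) & 0xFF,
            (word >> 16) & 0xFF, (word >> 24) & 0xFF)


def fuse_moves(program):
    """Peephole pass: 'move R, A' followed by 'move R+1, B' becomes 'move2 R, A, B'."""
    fused = []
    index = 0
    while index < len(program):
        current = program[index]
        following = program[index + 1] if index + 1 < len(program) else None
        if (following is not None
                and current[0] == OP_MOVE and following[0] == OP_MOVE
                and following[1] == current[1] + 1):
            fused.append((OP_MOVE2, current[1], current[2], following[2]))
            index += 2
        else:
            fused.append(current)
            index += 1
    return fused


# move 59, 11 ; move 60, 4  (the unused third argument is stored as 0)
program = [(OP_MOVE, 59, 11, 0), (OP_MOVE, 60, 4, 0)]
optimized = fuse_moves(program)
words = [encode(*command) for command in optimized]
print(optimized, [hex(word) for word in words])
```

Because move2 is executed as the same two moves in the same order, the fusion does not change the program's behavior; it only halves the number of commands to fetch and dispatch.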
Combined with the other advantages, this has made ObjectScript with the new register VM3 about one third faster than with the previous stack VM2.
How a register VM works and where the registers come from
The ObjectScript compiler collects information not only about the local variables that the programmer uses in functions, but also about all temporary variables (those introduced by the compiler itself). Thus, a register is a temporary local variable of a function. The VM addresses local variables and registers in exactly the same way, by index. Therefore the same VM command can work with both local variables and registers, and the result of an operation can be saved directly into a local variable, bypassing a temporary one.
For example, the following code:
k = i - j*k / (x + y - z*i / j) + i
ObjectScript compiles to:
# (59) = var j (5) [operator *] var k (6)
# (60) = var x (7) [operator +] var y (8)
# (61) = var z (9) [operator *] var i (4)
# (61) = # (61) [operator /] var j (5)
# (60) = # (60) [operator -] # (61)
# (59) = # (59) [operator /] # (60)
# (58) = var i (4) [operator -] # (59)
var k (10) = # (58) [operator +] var i (4)
The numbers in parentheses are the indices at which the local variables or registers are located (the indices are calculated at compile time). The information that the compiler collects about all local variables (including the temporary ones, the registers) makes it possible to calculate the maximum stack size needed to execute the function. When the function starts, it saves the top of the stack and reserves the stack size specified by the compiler. The indices of registers (local variables) are used as relative offsets from the saved top of the stack, which makes recursive function calls possible without any difficulty.
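How a compiler can hand out registers and derive the frame size is sketched below in Python, using the standard ast module to parse the same expression. This is not ObjectScript's compiler: the class, the allocation strategy and the slot numbers are assumptions for illustration, and only binary arithmetic on named variables is handled.

```python
import ast
import operator

SYMBOLS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}
APPLY = {"+": operator.add, "-": operator.sub,
         "*": operator.mul, "/": operator.truediv}


class Compiler:
    def __init__(self, local_names):
        # Named locals get the first slots; registers (temporaries) follow them.
        self.slots = {name: index for index, name in enumerate(local_names)}
        self.first_temp = len(local_names)
        self.next_temp = self.first_temp
        self.frame_size = self.first_temp   # grows as registers are allocated
        self.code = []

    def alloc(self):
        slot = self.next_temp
        self.next_temp += 1
        self.frame_size = max(self.frame_size, self.next_temp)
        return slot

    def compile_expr(self, node, dst=None):
        """Emit commands for node and return the slot that holds its value."""
        if isinstance(node, ast.Name):
            return self.slots[node.id]
        if not isinstance(node, ast.BinOp):
            raise NotImplementedError(ast.dump(node))
        if dst is None:
            dst = self.alloc()
        mark = self.next_temp
        # A register may receive the left operand directly; a named local may
        # not, because the right-hand side might still read its old value.
        left_dst = dst if dst >= self.first_temp else None
        left = self.compile_expr(node.left, left_dst)
        right = self.compile_expr(node.right)
        self.code.append((SYMBOLS[type(node.op)], dst, left, right))
        self.next_temp = mark               # release the operands' registers
        return dst

    def compile_assign(self, source):
        statement = ast.parse(source).body[0]
        target = self.slots[statement.targets[0].id]
        self.compile_expr(statement.value, target)


def run(code, stack, base):
    """Execute commands; every slot is an offset from the frame base."""
    for symbol, dst, left, right in code:
        stack[base + dst] = APPLY[symbol](stack[base + left], stack[base + right])


compiler = Compiler(["i", "j", "k", "x", "y", "z"])
compiler.compile_assign("k = i - j*k / (x + y - z*i / j) + i")

base = 3  # pretend a caller already occupies three stack slots
values = {"i": 7.0, "j": 3.0, "k": 5.0, "x": 2.0, "y": 9.0, "z": 4.0}
stack = [None] * (base + compiler.frame_size)
for name, value in values.items():
    stack[base + compiler.slots[name]] = value
run(compiler.code, stack, base)
result_k = stack[base + compiler.slots["k"]]
for command in compiler.code:
    print(command)
```

The frame size is known before the function runs, so the stack is sized once, and because every access goes through base, the same commands work at any depth of recursion.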
Calling a function in a register VM
To call a function, ObjectScript reserves a contiguous sequence of registers and places the function arguments in it. The function is then called with information about where the sequence begins and how many values it contains. The first argument becomes the top of the stack for the new function, so the arguments are already at the required offsets in the stack and become its local variables. Here is what calling a function with three parameters looks like:
func(i, k - x*y * (z + i), j*k)
ObjectScript compiles to:
begin call
move: # (59) = var func (11)
move: # (60) = const null (-1)
move: # (61) = var i (4)
# (63) = var x (7) [operator *] var y (8)
# (64) = var z (9) [operator +] var i (4)
# (63) = # (63) [operator *] # (64)
# (62) = var k (10) [operator -] # (63)
# (63) = var j (5) [operator *] var k (10)
end call: start 59, params 5
The function is called with a sequence of five values starting at index 59. The first two values are service parameters: register 59 holds the function itself, and register 60 holds this for the function (null in this case):
move: # (59) = var func (11)
move: # (60) = const null (-1)
Next come the parameters themselves, located in registers 61 to 63. The first parameter is simply the variable i, which is copied to register 61:
move: # (61) = var i (4)
The second parameter is the result of the expression k - x*y * (z + i), which is stored in register 62:
# (63) = var x (7) [operator *] var y (8)
# (64) = var z (9) [operator +] var i (4)
# (63) = # (63) [operator *] # (64)
# (62) = var k (10) [operator -] # (63)
The third parameter (j*k) is stored in register 63:
# (63) = var j (5) [operator *] var k (10)
Now the sequence is complete and the function can be called.
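The call sequence above can be replayed with a small Python sketch. The register and variable indices are taken from the listing; everything else is an assumption made for illustration: the VM and Function classes, the choice to start the callee's frame at the first value of the sequence (so that the function and this sit at offsets 0 and 1), and returning the result as a plain Python value.

```python
class Function:
    def __init__(self, frame_size, body):
        self.frame_size = frame_size   # computed by the compiler in a real VM
        self.body = body


class VM:
    def __init__(self):
        self.stack = []

    def reserve(self, base, size):
        """Make sure the stack can hold a frame of `size` slots at `base`."""
        missing = base + size - len(self.stack)
        if missing > 0:
            self.stack.extend([None] * missing)

    def call(self, caller_base, start, params):
        """Call the function stored in register `start` of the caller's frame."""
        callee_base = caller_base + start          # the sequence becomes the new frame
        func = self.stack[callee_base]
        self.reserve(callee_base, max(func.frame_size, params))
        return func.body(self.stack, callee_base)


def func_body(stack, base):
    # Callee offsets: 0 = the function, 1 = this, 2..4 = the three parameters.
    return (stack[base + 2], stack[base + 3], stack[base + 4])


# Indices of the caller's local variables, as in the listing above.
I, J, X, Y, Z, K, FUNC = 4, 5, 7, 8, 9, 10, 11


def caller(vm, base):
    r = vm.stack
    r[base + 59] = r[base + FUNC]                 # move: # (59) = var func
    r[base + 60] = None                           # move: # (60) = const null
    r[base + 61] = r[base + I]                    # first parameter
    r[base + 63] = r[base + X] * r[base + Y]
    r[base + 64] = r[base + Z] + r[base + I]
    r[base + 63] = r[base + 63] * r[base + 64]
    r[base + 62] = r[base + K] - r[base + 63]     # second parameter
    r[base + 63] = r[base + J] * r[base + K]      # third parameter
    return vm.call(base, start=59, params=5)      # end call: start 59, params 5


vm = VM()
vm.reserve(0, 65)                                 # caller frame: slots 0..64
for slot, value in ((I, 2), (J, 3), (X, 4), (Y, 5), (Z, 6), (K, 100)):
    vm.stack[slot] = value
vm.stack[FUNC] = Function(frame_size=5, body=func_body)
received = caller(vm, 0)
print(received)
```

No argument is copied at call time: the callee reads the very slots the caller filled, just under smaller offsets.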
Where else register virtual machines are used
A register VM is also used in Lua 5.0 and higher and in QuakeC (the scripting language of Quake 1 and its numerous ports), and there are register VMs for Java (for example, Dalvik VM). Formally, any program in C, C++ or another language that is compiled into machine code uses the register model for internal operations and the stack for calling functions.
Summary
The new ObjectScript (VM3) register virtual machine is about a third faster than the previous stack-based VM2 and has fewer commands: 36 versus 111 in VM2. The smaller number of commands greatly simplifies the VM and improves the chances of implementing a JIT when necessary.
The stack is used for the local variables of functions, which makes recursive calls possible. The required stack size is reserved once, when the function is called; the VM3 commands themselves do not reallocate the stack.
The stack is also used in the ObjectScript API to simplify integration with custom code. The ObjectScript API remains unchanged and is fully compatible with the previous version.
To be continued
The next part will discuss other innovations in ObjectScript. For example, the factorial function can now be written as follows:
print "factorial(10) = " .. {|a| a <= 1 ? 1 : a * _F(a-1)}(10)
which outputs:
factorial(10) = 3628800
Many other features will be covered as well.
Other relevant ObjectScript articles: