
Checking function argument types in Lua

The problem


Lua is a dynamically typed language.

This means that a type is associated not with a variable but with its value:

a = "the meaning of life" --> a holds a string
a = 42                    --> now a holds a number
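
As a quick check, the built-in type function reports the type of the value a variable currently holds. A minimal sketch (the variable b is used here only to show an unassigned name):

```lua
a = "the meaning of life"
print(type(a)) --> string

a = 42
print(type(a)) --> number

-- A global that was never assigned reads as nil.
print(type(b)) --> nil
```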

This is convenient.
However, there are often cases where you want strict control over the type of a value. The most common one is checking function arguments.

Consider a naive example:

function repeater(n, message)
    for i = 1, n do
        print(message)
    end
end

repeater(3, "foo") --> foo
                   --> foo
                   --> foo

If we mix up the arguments of the repeater function, we get a runtime error:

 > repeater("foo", 3)
 stdin:2: 'for' limit must be a number
 stack traceback:
	stdin:2: in function 'repeater'
	stdin:1: in main chunk
	[C]: ?

“What 'for'?!” the user of our function will say on seeing this error message.

The function has suddenly stopped being a black box: its internals are now visible to the user.

It will be even worse if we accidentally forget to pass the second argument:

 > repeater(3)
 nil
 nil
 nil

No error occurred, but the behavior is potentially incorrect.

This happens because, inside a Lua function, any argument that was not passed has the value nil.
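
A minimal sketch of this behavior, using two hypothetical helpers (describe and count_args) that exist only for this illustration. It also shows that select("#", ...) counts the arguments actually passed, so it can tell a missing argument from an explicit nil:

```lua
-- Hypothetical helper: reports the types of its two parameters.
function describe(first, second)
    return type(first) .. ", " .. type(second)
end

print(describe(3, "foo"))     --> number, string
print(describe(3))            --> number, nil
print(describe(3, "foo", {})) --> number, string (the extra argument is dropped)

-- Hypothetical helper: counts the arguments that were really passed.
function count_args(...)
    return select("#", ...)
end

print(count_args(3))      --> 1
print(count_args(3, nil)) --> 2
```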

Another typical error occurs when calling methods on objects:

foo = {}
function foo:setn(n)
    self.n_ = n
end
function foo:repeat_message(message)
    for i = 1, self.n_ do
        print(message)
    end
end

foo:setn(3)
foo:repeat_message("bar") --> bar
                          --> bar
                          --> bar

The colon is syntactic sugar, implicitly passing the object itself as the first argument, self. If we remove all the sugar from the example, we get the following:

foo = {}
foo.setn = function(self, n)
    self.n_ = n
end
foo.repeat_message = function(self, message)
    for i = 1, self.n_ do
        print(message)
    end
end

foo.setn(foo, 3)
foo.repeat_message(foo, "bar") --> bar
                               --> bar
                               --> bar

If you write a dot instead of a colon when calling a method, self is not passed:

 > foo.setn(3)
 stdin:2: attempt to index local 'self' (a number value)
 stack traceback:
	stdin:2: in function 'setn'
	stdin:1: in main chunk
	[C]: ?

 > foo.repeat_message("bar")
 stdin:2: 'for' limit must be a number
 stack traceback:
	stdin:2: in function 'repeat_message'
	stdin:1: in main chunk
	[C]: ?

A short digression


In the case of setn the error message is clear enough, but the error from repeat_message looks mysterious at first glance.

What happened? Let's take a closer look in the console.

In the first case, we tried to write a value at the index "n_" of a number:

 > (3).n_ = nil

To which Lua quite rightly replied:

 stdin:1: attempt to index a number value
 stack traceback:
	stdin:1: in main chunk
	[C]: ?

In the second case, we tried to read the value at the same index "n_" from a string.

 > return ("bar").n_
 nil

The explanation is simple. The string type in Lua has a metatable that redirects indexing operations to the string table.

 > return getmetatable("a").__index == string
 true

This allows a shorthand notation when working with strings. The following three forms are equivalent:

a = "A"
print(string.rep(a, 3)) --> AAA
print(a:rep(3))         --> AAA
print(("A"):rep(3))     --> AAA

Thus, any attempt to read an index from a string is redirected to the string table.

Fortunately, writing is disabled:

 > return getmetatable("a").__newindex
 nil
 > ("a").n_ = 3
 stdin:1: attempt to index a string value
 stack traceback:
	stdin:1: in main chunk
	[C]: ?
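
The same reads and writes can be checked from a script rather than the console. A small sketch, assuming a stock Lua interpreter where the string metatable has not been modified:

```lua
local s = "bar"

-- Method-style access resolves through the string table.
print(s.len == string.len) --> true
print(s:upper())           --> BAR

-- A key the string table does not define simply reads as nil.
print(s.n_)                --> nil

-- Writing raises an error, because strings have no __newindex handler.
local ok = pcall(function() s.n_ = 3 end)
print(ok)                  --> false
```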

The string table has no "n_" key, which is why for complains that it was given nil instead of an upper limit:

 > for i = 1, string["n_"] do
 >> print("bar")
 >> end
 stdin:1: 'for' limit must be a number
 stack traceback:
	stdin:1: in main chunk
	[C]: ?

But we digress.

Solution


So we want to control the types of the arguments of our functions.

That is easy: let's check them.

function repeater(n, message)
    assert(type(n) == "number")
    assert(type(message) == "string")
    for i = 1, n do
        print(message)
    end
end

Let's see the result:

 > repeater(3, "foo")
 foo
 foo
 foo

 > repeater("foo", 3)
 stdin:2: assertion failed!
 stack traceback:
	[C]: in function 'assert'
	stdin:2: in function 'repeater'
	stdin:1: in main chunk
	[C]: ?

 > repeater(3)
 stdin:3: assertion failed!
 stack traceback:
	[C]: in function 'assert'
	stdin:3: in function 'repeater'
	stdin:1: in main chunk
	[C]: ?

Closer to the point, but still not very clear.

Fighting for clarity


Let's try to improve error messages:

function repeater(n, message)
    if type(n) ~= "number" then
        error(
            "bad n type: expected `number', got `" .. type(n) .. "'",
            2
        )
    end
    if type(message) ~= "string" then
        error(
            "bad message type: expected `string', got `"
                .. type(message) .. "'",
            2
        )
    end

    for i = 1, n do
        print(message)
    end
end

The second parameter of the error function is the level in the call stack whose position is reported in the error message. With level 2, the “culprit” is no longer our function but whoever called it.
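
To see what the level changes, the sketch below raises the same error at different levels and prints the resulting messages. The functions fail and caller are hypothetical names used only for this illustration:

```lua
-- Hypothetical helper: raises an error at the requested level.
function fail(level)
    error("something went wrong", level)
end

-- Hypothetical caller; the call below is not a tail call,
-- so this function stays on the call stack.
function caller(level)
    fail(level)
end

-- Level 0: no position is added to the message.
print(select(2, pcall(caller, 0))) --> something went wrong

-- Level 1 (the default): the position points inside fail.
print(select(2, pcall(caller, 1)))

-- Level 2: the position points at the line in caller that called fail.
print(select(2, pcall(caller, 2)))
```

The last two messages carry a chunk name and line number prefix, which depends on where the code is loaded from.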

Error messages have become much better:

 > repeater(3, "foo")
 foo
 foo
 foo

 > repeater("foo", 3)
 stdin:1: bad n type: expected `number', got `string'
 stack traceback:
	[C]: in function 'error'
	stdin:3: in function 'repeater'
	stdin:1: in main chunk
	[C]: ?

 > repeater(3)
 stdin:1: bad message type: expected `string', got `nil'
 stack traceback:
	[C]: in function 'error'
	stdin:6: in function 'repeater'
	stdin:1: in main chunk
	[C]: ?

But now the error handling is about five times longer than the useful part of the function.

Fighting for brevity


Let's move the error handling into separate functions:

function assert_is_number(v, msg)
    if type(v) == "number" then
        return v
    end
    error(
        (msg or "assertion failed")
            .. ": expected `number', got `"
            .. type(v) .. "'",
        3
    )
end

function assert_is_string(v, msg)
    if type(v) == "string" then
        return v
    end
    error(
        (msg or "assertion failed")
            .. ": expected `string', got `"
            .. type(v) .. "'",
        3
    )
end

function repeater(n, message)
    assert_is_number(n, "bad n type")
    assert_is_string(message, "bad message type")

    for i = 1, n do
        print(message)
    end
end

This is already usable.

A more complete implementation of assert_is_* is here: typeassert.lua.
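
The two helpers above differ only in the type name, so one way to avoid the repetition is to generate them. The sketch below is an assumption about how this could be done and is not necessarily what typeassert.lua does; make_assert_is is a hypothetical name:

```lua
-- Hypothetical factory: builds an assert_is_<typename> function.
function make_assert_is(typename)
    return function(v, msg)
        if type(v) == typename then
            return v
        end
        error(
            (msg or "assertion failed")
                .. ": expected `" .. typename .. "', got `"
                .. type(v) .. "'",
            3 -- blame the caller of the function that runs the check
        )
    end
end

assert_is_number = make_assert_is("number")
assert_is_string = make_assert_is("string")
assert_is_table = make_assert_is("table")

function repeater(n, message)
    assert_is_number(n, "bad n type")
    assert_is_string(message, "bad message type")

    for i = 1, n do
        print(message)
    end
end
```

Because each generated function returns the checked value, it can also be used inline, for example local n = assert_is_number(value, "bad n type").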

Working with methods


Now let's rework the method implementation:

foo = {}
function foo:setn(n)
    assert_is_table(self, "bad self type")
    assert_is_number(n, "bad n type")
    self.n_ = n
end

The error message looks a bit confusing:

 > foo.setn(3)
 stdin:1: bad self type: expected `table', got `number'
 stack traceback:
	[C]: in function 'error'
	stdin:5: in function 'assert_is_table'
	stdin:2: in function 'setn'
	stdin:1: in main chunk
	[C]: ?

Using a dot instead of a colon when calling a method is a very common mistake, especially among inexperienced users. Experience shows that the self check should point this out directly in its message:

function assert_is_self(v, msg)
    if type(v) == "table" then
        return v
    end
    error(
        (msg or "assertion failed")
            .. ": bad self (got `" .. type(v) .. "'); use `:'",
        3
    )
end

foo = {}
function foo:setn(n)
    assert_is_self(self)
    assert_is_number(n, "bad n type")
    self.n_ = n
end

Now the error message is as clear as possible:

 > foo.setn(3)
 stdin:1: assertion failed: bad self (got `number'); use `:'
 stack traceback:
	[C]: in function 'error'
	stdin:5: in function 'assert_is_self'
	stdin:2: in function 'setn'
	stdin:1: in main chunk
	[C]: ?

We have achieved the desired result in terms of functionality, but is it still possible to improve usability?

Improving usability


I want to see at a glance in the code what type each argument should be. At the moment the type is baked into the assert_is_* function name and does not stand out much.

It is better to be able to write like this:

function repeater(n, message)
    arguments(
        "number", n,
        "string", message
    )

    for i = 1, n do
        print(message)
    end
end

The type of each argument stands out visually, and less code is needed than with assert_is_*. The declaration is even somewhat reminiscent of old-style C function definitions (also known as K&R style):

void repeater(n, message)
    int n;
    char * message;
{
    /* ... */
}

But back to Lua. Now that we know what we want, we can implement it.

function arguments(...)
    local nargs = select("#", ...)
    for i = 1, nargs, 2 do
        local expected_type, value = select(i, ...)
        if type(value) ~= expected_type then
            error(
                -- math.floor keeps the number an integer ("#1", not "#1.0") on Lua 5.3+
                "bad argument #" .. math.floor((i + 1) / 2)
                    .. " type: expected `" .. expected_type
                    .. "', got `" .. type(value) .. "'",
                3
            )
        end
    end
end

Let's try it out:

 > repeater("bar", 3)
 stdin:1: bad argument #1 type: expected `number', got `string'
 stack traceback:
	[C]: in function 'error'
	stdin:6: in function 'arguments'
	stdin:2: in function 'repeater'
	stdin:1: in main chunk
	[C]: ?

 > repeater(3)
 stdin:1: bad argument #2 type: expected `string', got `nil'
 stack traceback:
	[C]: in function 'error'
	stdin:6: in function 'arguments'
	stdin:2: in function 'repeater'
	stdin:1: in main chunk
	[C]: ?

Disadvantages


We have lost the custom error message, but that is no great loss: the argument's number is enough to tell which argument is meant.

Our function does not check that the call itself is correct, that is, that an even number of arguments was passed and that all type names are valid. Adding these checks is left to the reader.
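
One possible way to add such checks is sketched below. It is only an illustration: the VALID_TYPES table and the wording of the messages are assumptions, not part of the original implementation. Mistakes in the call to arguments itself are reported at level 2, so they point at the function that wrote the incorrect call.

```lua
-- The type names that type() can return.
local VALID_TYPES = {
    ["nil"] = true, boolean = true, number = true, string = true,
    table = true, ["function"] = true, thread = true, userdata = true,
}

function arguments(...)
    local nargs = select("#", ...)
    -- The call itself must consist of (type name, value) pairs.
    if nargs % 2 ~= 0 then
        error(
            "arguments: expected an even number of arguments, got " .. nargs,
            2
        )
    end
    for i = 1, nargs, 2 do
        local expected_type, value = select(i, ...)
        if not VALID_TYPES[expected_type] then
            error(
                "arguments: bad type name at position " .. i
                    .. ": `" .. tostring(expected_type) .. "'",
                2
            )
        end
        if type(value) ~= expected_type then
            error(
                "bad argument #" .. math.floor((i + 1) / 2)
                    .. " type: expected `" .. expected_type
                    .. "', got `" .. type(value) .. "'",
                3
            )
        end
    end
end
```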

Working with methods


The variant for methods differs only in that we must additionally check self:

function method_arguments(self, ...)
    if type(self) ~= "table" then
        error(
            "bad self (got `" .. type(self) .. "'); use `:'",
            3
        )
    end
    arguments(...)
end

foo = {}
function foo:setn(n)
    method_arguments(
        self,
        "number", n
    )
    self.n_ = n
end

The full implementation of the *arguments() family of functions can be found here: args.lua.
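
One subtlety: when arguments is called from inside method_arguments, there is one extra frame on the call stack, so level 3 inside arguments refers to the method itself rather than to its caller. A possible way around this, sketched here with a hypothetical shared helper check_pairs that receives the level explicitly, is shown below. This is an assumption for illustration and not necessarily how args.lua handles it.

```lua
-- Hypothetical shared checker; `level` is the stack level to blame,
-- counted from this function.
local function check_pairs(level, ...)
    local nargs = select("#", ...)
    for i = 1, nargs, 2 do
        local expected_type, value = select(i, ...)
        if type(value) ~= expected_type then
            error(
                "bad argument #" .. math.floor((i + 1) / 2)
                    .. " type: expected `" .. expected_type
                    .. "', got `" .. type(value) .. "'",
                level
            )
        end
    end
end

function arguments(...)
    -- 1 = check_pairs, 2 = arguments, 3 = checked function, 4 = its caller
    check_pairs(4, ...)
end

function method_arguments(self, ...)
    if type(self) ~= "table" then
        error("bad self (got `" .. type(self) .. "'); use `:'", 3)
    end
    -- 1 = check_pairs, 2 = method_arguments, 3 = method, 4 = its caller
    check_pairs(4, ...)
end

foo = {}
function foo:setn(n)
    method_arguments(self, "number", n)
    self.n_ = n
end
```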

Conclusion


We have created a convenient mechanism for checking function arguments in Lua. It lets you declare the expected argument types clearly and efficiently verify that the values passed match them.

The time spent on assert_is_* was not wasted either. Function arguments are not the only place in Lua where types should be controlled, and the assert_is_* family of functions makes such control clearer.

Alternatives


There are other solutions. See the Lua Type Checking page on the Lua-users wiki. The most interesting one is the solution with decorators:

random =
    docstring[[Compute random number.]] ..
    typecheck("number", '->', "number") ..
    function(n)
        return math.random(n)
    end
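
To give an idea of how the `..` syntax can work, here is a simplified sketch built on the __concat metamethod. It is not the wiki's implementation: this typecheck is a hypothetical reduced version that checks only the argument types listed before '->' and ignores the return types.

```lua
-- Hypothetical, simplified typecheck decorator (argument types only).
function typecheck(...)
    local spec = { ... }
    return setmetatable({}, {
        -- Invoked for `decorator .. fn`; returns the wrapped function.
        __concat = function(_, fn)
            return function(...)
                local i = 1
                -- Entries before "->" describe the arguments.
                while spec[i] ~= nil and spec[i] ~= "->" do
                    local value = (select(i, ...))
                    if type(value) ~= spec[i] then
                        error(
                            "bad argument #" .. i
                                .. " type: expected `" .. spec[i]
                                .. "', got `" .. type(value) .. "'",
                            2
                        )
                    end
                    i = i + 1
                end
                return fn(...)
            end
        end,
    })
end

random =
    typecheck("number", "->", "number") ..
    function(n)
        return math.random(n)
    end

print(random(10))           -- a random integer between 1 and 10
print(pcall(random, "ten")) -- false, followed by the error message
```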

Metalua includes a types extension for describing the types of variables (description).

With this extension, you can write:

-{ extension "types" }

function sum(x :: list(number)) :: number
    local acc :: number = 0
    for i = 1, #x do acc = acc + x[i] end
    return acc
end

But this is not quite Lua. :-)

Source: https://habr.com/ru/post/76001/

