Experience creating a mutation testing tool for Erlang

A few weeks ago, I heard about mutation testing for Clojure. It is a way to test the quality of tests: small changes are made to the source code, and the tests either notice them or not. For example, if the program uses the condition "a > 1" and replacing it with "a < 1" does not change the test results, then the tests could be better.
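To make the idea concrete, here is a minimal sketch in Python. The functions `is_big`, `weak_test` and `strong_test` are hypothetical names invented for this illustration; they are not taken from any real project.

```python
# Hypothetical function under test (illustration only).
def is_big(a):
    return a > 1

# A mutant: the same function with the comparison flipped.
def is_big_mutant(a):
    return a < 1

def weak_test(f):
    # Checks only the boundary value, where both versions return False.
    return f(1) is False

def strong_test(f):
    # Also checks a value on which the original and the mutant disagree.
    return f(1) is False and f(2) is True

if __name__ == "__main__":
    print("weak test, original:", weak_test(is_big))
    print("weak test, mutant:", weak_test(is_big_mutant))
    print("strong test, original:", strong_test(is_big))
    print("strong test, mutant:", strong_test(is_big_mutant))
```

The weak test passes for both versions, so the mutant survives and the test suite "could be better". The strong test passes for the original and fails for the mutant, which is the desired outcome.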

I thought that such a tool would be easy to write for Erlang, because the language provides a wide range of functions for processing the internal representations generated by the compiler. It turned out not to be so easy. Below I describe the problems I ran into and the solution I eventually arrived at.

The main thing I tried to achieve was usability. That is, the tool should not require complex configuration or an installation of Emacs. You download it, run it, and are delighted.

The following scenario was assumed: a mutation is introduced into the code and the tests are run. If the tests do not detect the mutation, this information should be stored in a readable form for the report. This is where the difficulties began.
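The scenario can be sketched as a simple loop. This is only an illustration under stated assumptions: the shape of `mutants` (pairs of description and mutated source), the `run_tests` callback, and the toy Python function `f` are all hypothetical and do not describe how the final tool works.

```python
def find_survivors(mutants, run_tests):
    """Return descriptions of the mutants that the tests did not detect.

    mutants: iterable of (description, mutated_source) pairs (assumed shape).
    run_tests: callable that takes source text and returns True if all tests pass.
    """
    survivors = []
    for description, source in mutants:
        if run_tests(source):
            # The tests still pass on changed code: record it for the report.
            survivors.append(description)
    return survivors

def run_tests(source):
    # Toy test suite: load the code and check two sample values.
    namespace = {}
    exec(source, namespace)
    f = namespace["f"]
    return f(1) is False and f(5) is True

MUTANTS = [
    ("a > 1 -> a < 1", "def f(a): return a < 1"),
    ("a > 1 -> a > 2", "def f(a): return a > 2"),
]

if __name__ == "__main__":
    for line in find_survivors(MUTANTS, run_tests):
        print("survived:", line)
```

Here the first mutant is detected, while the second survives because no test checks the value 2.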

Difficulty number 1: run tests

The most universal way to run tests is a shell command, and in Erlang, unfortunately, getting the return code is difficult. This problem is described, for example, here:
erlang.org/pipermail/erlang-questions/2008-October/039176.html

That is, it is possible if you add an external library (https://github.com/saleyn/erlexec), but it requires a particular version of gcc, is not tested on OS X, and on Linux in some cases requires magic, which is described in the Travis file:
github.com/saleyn/erlexec/blob/master/.travis.yml#L23

Okay, let there be an external library, or some additional utility that first starts the mutator written in Erlang, then the tests themselves, and then generates a report if required.
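For comparison, such an external utility is easy to sketch in Python, where the exit code of a shell command is directly available. The function names `run_test_command` and `mutant_survived` are hypothetical, and the convention that exit code 0 means "all tests passed" is an assumption about the test command, not something the tool guarantees.

```python
import subprocess

def run_test_command(command):
    """Run a shell command and return (exit_code, combined_output)."""
    completed = subprocess.run(
        command,
        shell=True,                # the test command is an arbitrary shell line
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout for the report
        text=True,
    )
    return completed.returncode, completed.stdout

def mutant_survived(command):
    # Assumption: the test command exits with 0 only when every test passes.
    code, _output = run_test_command(command)
    return code == 0

if __name__ == "__main__":
    code, output = run_test_command("echo hello")
    print(code, output.strip())
```

In real use, `command` would be whatever the project uses to run its tests.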

Difficulty number 2: pretty printer

I started writing a mutator. In order not to be too abstract, I took jsx: this package has a lot of unit tests, so it seemed to have everything I needed.

I wrote the following code in Erlang:

 {ok, Forms} = epp:parse_file("src/jsx_decoder.erl", []),
 io:format("~s", [erl_prettypr:format(erl_syntax:form_list(Forms))]).

I was a little puzzled by the fact that the construct:

 -compile({inline, [handle_event/3]}).

turned into

 -compile({inline, [{handle_event, 3}]}).

It turns out the parser understands both forms.

But most of the aesthetic pain came from the fact that all the code was reformatted and stripped of comments, so comparing it with the original version became quite difficult.

I decided to try to "catch" the parsed code a little earlier in order to somehow preserve the comments, but it turned out that they disappear right at tokenization:

 1> erl_scan:string("a. % this is my atom").
 {ok,[{atom,1,a},{dot,1}],1}

As far as I understand, this is a rather low-level function that runs before macro substitution:

 erl_scan:string("-define(A).").
 {ok,[{'-',1}, {atom,1,define}, {'(',1}, {var,1,'A'}, {')',1}, {dot,1}], 1}
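By contrast, a tokenizer that keeps comments and whitespace as ordinary tokens allows the source text to be reconstructed exactly. The Python sketch below covers only a tiny, hypothetical subset of Erlang lexical syntax (no strings, floats, or quoted atoms), and its token names are invented for this illustration; it is not the tokenizer of any real tool.

```python
import re

# Tiny illustrative subset of Erlang tokens; comments and spaces are kept.
TOKEN_RE = re.compile(r"""
    (?P<comment>%[^\n]*)
  | (?P<space>\s+)
  | (?P<atom>[a-z][A-Za-z0-9_@]*)
  | (?P<var>[A-Z_][A-Za-z0-9_@]*)
  | (?P<integer>[0-9]+)
  | (?P<op>=<|>=|==|/=|->|[-+*/<>=(),.;\[\]{}|])
""", re.VERBOSE)

def tokenize(text):
    """Split text into (kind, value, offset) triples without losing a character."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}: {text[pos]!r}")
        tokens.append((match.lastgroup, match.group(), pos))
        pos = match.end()
    return tokens

if __name__ == "__main__":
    source = "a. % this is my atom"
    for token in tokenize(source):
        print(token)
```

Because every character ends up in some token, joining the token values gives back the original text, and each token's offset can later be used to patch the source in place.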

Difficulty number 3: macros

If you look at the pretty-printer output, you will notice that the macros are gone. In the case of jsx, all its tests were swallowed when the file was processed, because they were inside "-ifdef(TEST).".

This is of course a solvable problem: it was enough to pass the right set of definitions to epp:parse_file. But macro expansion made the code even less recognizable than pretty printing alone.

Solution

After a couple of days of hesitation, I sat down and wrote an Erlang parser in Python. To begin with, I found the original Erlang grammar:
github.com/erlang/otp/blob/master/lib/stdlib/src/erl_parse.yrl

and decided not to port it one-to-one, since that did not look like a simple task. Instead I wrote an Erlang grammar from memory and then debugged it by running the parser on several well-known projects. Building an identical AST was not required, which simplified the task. During debugging I found some funny places in these repositories, for example in hackney:
github.com/benoitc/hackney/blob/master/src/hackney.erl#L1025

 -define(METHOD_TPL(Method), Method(URL) -> hackney:request(Method, URL)).
 -include("hackney_methods.hrl").
 -define(METHOD_TPL(Method), Method(URL, Headers) -> hackney:request(Method, URL, Headers)).
 -include("hackney_methods.hrl")
 ...

or here, for example, in Virding's lfe, where begin and end are in most cases used to no purpose:
github.com/rvirding/lfe/blob/develop/src/lfe_parse.erl#L238

 reduce(0, [__1|__Vs]) -> [ begin __1 end | __Vs];
 reduce(1, [__1|__Vs]) -> [ begin value (__1) end | __Vs];
 reduce(2, [__1|__Vs]) -> [ begin value (__1) end | __Vs];
 reduce(3, [__1|__Vs]) -> [ begin value (__1) end | __Vs];
 reduce(4, [__1|__Vs]) -> [ begin value (__1) end | __Vs];
 reduce(5, [__1|__Vs]) -> [ begin make_fun (value (__1)) end | __Vs];

The parser stumbled on every construct that I had not described in it, that is, on every construct that I did not remember or did not know, which turned out to be a rather interesting experience.

Once the parser was ready, describing the transformations (mutations) was quite a simple matter.

Result

A bunch of fun.
An Erlang parser in Python under the BSD license.
A library of mutations that changes the code without changing the formatting or deleting comments. As a result, it produces a diff that can be applied to the source code to get a mutant.
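The idea of a formatting-preserving mutation can be sketched with Python's standard `difflib` module. This is not the library's actual code: the helper names `mutate_at` and `make_diff`, the file name, and the sample Erlang snippet are assumptions made for illustration.

```python
import difflib

def mutate_at(source, offset, old, new):
    """Replace `old` with `new` at a known offset, leaving all other text intact."""
    if source[offset:offset + len(old)] != old:
        raise ValueError("source does not contain the expected text at this offset")
    return source[:offset] + new + source[offset + len(old):]

def make_diff(original, mutated, filename):
    """Produce a unified diff between the original and the mutated text."""
    return "".join(difflib.unified_diff(
        original.splitlines(keepends=True),
        mutated.splitlines(keepends=True),
        fromfile=filename,
        tofile=filename,
    ))

# Hypothetical Erlang source with comments and irregular spacing.
SOURCE = (
    "%% returns true for big numbers\n"
    "is_big(A) when A > 1 -> true;   % guard\n"
    "is_big(_) -> false.\n"
)

if __name__ == "__main__":
    offset = SOURCE.index(">")  # first '>' is the comparison in the guard
    mutated = mutate_at(SOURCE, offset, ">", "<")
    print(make_diff(SOURCE, mutated, "src/example.erl"))
```

Since only the characters of one token are replaced, the comments and spacing of the file are untouched and the diff contains a single changed line.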

Library code:
github.com/parsifal-47/muterl

Source: https://habr.com/ru/post/319268/
