I offer readers of "Habrahabr" a loose translation of the article "Rewriting Your Test Suite in Clojure in 24 hours" by the founder of CircleCI.
This story is about how I wrote a compiler for automatic translation of the CircleCI test suite (14,000 lines) to another test library in 24 hours.
To date, this test suite is probably one of the largest in the Clojure world. Our server code is 100% Clojure, including the tests: 14,000 lines in 140 files, with 5,000 assertions. Without parallelization, a run takes 40 minutes.
At the start of this adventure, all tests were written in Midje, a BDD testing library similar to RSpec. We were not particularly happy with Midje and decided to switch to clojure.test, probably the most widely used testing library. clojure.test is simpler, has less magic in it, and has a more developed ecosystem of tools and plugins.
Obviously, rewriting 5,000 tests by hand is impractical. Instead, we decided to use Clojure itself to rewrite them automatically, relying on its metaprogramming facilities.
Clojure is homoiconic, which means that any code can be represented as a data structure. Our translator reads each test file into a Clojure data structure. Then we transform the code and write the result back to disk. Once it is written, we can run the tests, and even automatically add the file back to version control if the tests pass, all without leaving the REPL.
The key to the entire conversion is the read function. read-string is a built-in Clojure function that takes a string containing Clojure code and returns the first form in it as a data structure. The compiler uses the same reader when loading source files. For example, (read-string "[1 2 3]") returns [1 2 3].
We use read to turn the code of our tests into a large nested list, which can then be modified with ordinary Clojure code.
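To make the reading step concrete, here is a minimal sketch (not the code from the article's repository). The helper name read-all-forms is hypothetical; it assumes the whole source file is already available as a string and turns off *read-eval* as a precaution.

```clojure
(import '[java.io PushbackReader StringReader])

;; A single test, read into plain data.
(def fact-form (read-string "(fact \"foo works\" (foo x) => 42)"))

(first fact-form)   ; the symbol fact
(second fact-form)  ; the string "foo works"
(count fact-form)   ; 5 elements: fact, the name, (foo x), =>, 42

(defn read-all-forms
  "Hypothetical helper: reads every top-level form from a string of
  Clojure source and returns them as a vector."
  [source]
  (binding [*read-eval* false]              ; do not evaluate #= forms
    (let [rdr (PushbackReader. (StringReader. source))
          eof (Object.)]                    ; unique end-of-input marker
      (loop [forms []]
        (let [form (read rdr false eof)]
          (if (identical? form eof)
            forms
            (recur (conj forms form))))))))

(def forms
  (read-all-forms "(ns circle.foo-test) (fact \"foo works\" (foo x) => 42)"))
```

read-string only returns the first form of its input, which is why a loop over read is needed for a whole file.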
Our tests were written in Midje, and we want to convert them to clojure.test. An example of a test using Midje:
(ns circle.foo-test
  (:require [midje.sweet :refer :all]
            [circle.foo :as foo]))

(fact "foo works"
  (foo x) => 42)
and the converted version using clojure.test:
(ns circle.foo-test
  (:require [clojure.test :refer :all]))

(deftest foo-works
  (is (= 42 (foo x))))
The conversion includes replacing:
midje.sweet with clojure.test in the ns form;
(fact "a test name" ...) with (deftest a-test-name ...), because clojure.test names tests with identifiers, not strings;
(foo x) => 42 with (is (= 42 (foo x))).
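The second replacement needs a symbol derived from the test's description string. A minimal sketch of one way to do that follows; fact-name->symbol and replace-fact-sketch are hypothetical names, and the naming rule (lower-case, non-alphanumerics collapsed to hyphens) is an assumption rather than the rule the original translator used.

```clojure
(require '[clojure.string :as str])

(defn fact-name->symbol
  "Turns a description such as \"foo works\" into the symbol foo-works.
  Assumed rule: lower-case, runs of other characters become one hyphen."
  [description]
  (-> description
      str/lower-case
      (str/replace #"[^a-z0-9]+" "-")
      (str/replace #"^-+|-+$" "")
      symbol))

(defn replace-fact-sketch
  "Rewrites (fact \"name\" body...) to (deftest name body...).
  Any other form is returned unchanged."
  [form]
  (if (and (seq? form)
           (= 'fact (first form))
           (string? (second form)))
    (list* 'deftest (fact-name->symbol (second form)) (drop 2 form))
    form))

(def renamed (replace-fact-sketch '(fact "foo works" (foo x) => 42)))
;; renamed is (deftest foo-works (foo x) => 42); the arrow is handled separately
```

A description that starts with a digit would still need special handling, since the resulting symbol could not be read back.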
The transformation is a simple depth-first tree traversal:
(defn munge-form [form]
  (let [form (-> form
                 (replace-midje-sweet)
                 (replace-foo)
                 ...)]
    (cond
      (or (list? form) (vector? form))
      ;; ->> passes the collection as the last argument of map
      (->> (-> form
               (replace-fact)
               (replace-arrow)
               (replace-bar)
               ...)
           (map munge-form))
      :else form)))
The -> macro behaves like method chaining in Ruby or jQuery, or like pipes in Bash: it passes the result of one function call as the first argument of the next call.
The first part, (let [form ...]), takes a Clojure form and applies each conversion function to it. The second part takes a list of forms, which represent other Clojure expressions and functions, and transforms them recursively.
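As a small aside on the threading macro mentioned above, this sketch shows that -> is only a rewrite of nested calls, and that its sibling ->> threads into the last argument position instead. The values used are arbitrary.

```clojure
;; Threaded and nested versions evaluate to the same value.
(def threaded (-> 5 (+ 3) (* 2) str))
(def nested   (str (* (+ 5 3) 2)))

;; macroexpand-1 shows the rewrite: the threaded value becomes
;; the first argument of the call.
(def expansion (macroexpand-1 '(-> form (replace-fact))))
;; expansion is (replace-fact form)

;; ->> puts the threaded value last, which is what map expects.
(def incremented (->> [1 2 3] (map inc)))
```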
The interesting work happens in the replacement functions. They all look something like this:
(if (this-form-is-relevant? form) (some-transformation form) form)
That is, they check whether the given form matches the replacement criterion and, if so, transform it as needed. For example, replace-midje-sweet looks like this:
(defn replace-midje-sweet [form]
  (if (= 'midje.sweet form)
    'clojure.test
    form))
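To see a replacement function applied to a whole tree, here is a runnable sketch. It swaps in clojure.walk/postwalk for the hand-written munge-form traversal, so it illustrates the idea rather than reproducing the article's code; ns-form is just the example namespace declaration from earlier.

```clojure
(require '[clojure.walk :as walk])

(defn replace-midje-sweet [form]
  (if (= 'midje.sweet form)
    'clojure.test
    form))

(def ns-form
  '(ns circle.foo-test
     (:require [midje.sweet :refer :all]
               [circle.foo :as foo])))

;; postwalk calls the function on every node, leaves first,
;; and rebuilds lists and vectors with the results.
(def converted-ns (walk/postwalk replace-midje-sweet ns-form))
```

Because the function returns every non-matching form unchanged, it is safe to call it on every node of the tree.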
The entire syntax of Midje tests revolves around “arrows”, a non-idiomatic construct that Midje uses to make tests more declarative in the BDD style. A simple example:
(foo 42) => 5
checks that (foo 42) returns 5.
The behavior varies considerably depending on which arrow is used and what type is on the right-hand side of the arrow.
(foo 42) => map?
In the example above, map? is a function, so Midje checks that applying it to the left-hand side of the expression returns a truthy value (anything other than nil or false). In plain Clojure this would be:
(map? (foo 42))
A few examples of Midje arrows:
(foo 42) => falsey
(foo 42) => map?
(foo 42) => (throws Exception)
(foo 42) =not=> 3
(foo 42) => #"hello world" ;; regex
(foo 42) =not=> "hello"
The real transformation uses about forty core.match rules, but they all look something like this:
(match [actual arrow expected]
  ;; specific rules come before the general equality rule
  [actual '=> 'truthy] `(is ~actual)
  [actual '=> nil] `(is (nil? ~actual))
  [actual '=> (expected :guard regex?)] `(is (re-find ~expected ~actual))
  [actual '=> expected] `(is (= ~expected ~actual)))
(For Clojure experts: to improve readability, I omitted a number of ~' sequences from the syntax-quoted forms above. To see what it really looks like, see the source.)
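For readers without core.match at hand, the same dispatch can be sketched with a plain cond. This is a simplified stand-in, not the article's rule set: rewrite-arrow and regex? are hypothetical helpers, and only four of the roughly forty cases are covered.

```clojure
(defn regex?
  "Hypothetical helper: true for a compiled regular expression."
  [x]
  (instance? java.util.regex.Pattern x))

(defn rewrite-arrow
  "Rewrites the triple `actual => expected` into a clojure.test assertion.
  Returns nil for arrows this sketch does not handle."
  [actual arrow expected]
  (when (= '=> arrow)
    (cond
      (= 'truthy expected) (list 'is actual)
      (nil? expected)      (list 'is (list 'nil? actual))
      (regex? expected)    (list 'is (list 're-find expected actual))
      :else                (list 'is (list '= expected actual)))))

(def equality-check (rewrite-arrow '(foo 42) '=> 5))
;; equality-check is (is (= 5 (foo 42)))
```

The forms are built with list rather than syntax-quote because syntax-quote would namespace-qualify symbols such as is, which is the reason the real code needs the ~' sequences.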
Most transformations are very straightforward. However, everything becomes much more complicated with the contains form:
(foo 42) => (contains {:a 1})
(foo 42) => (contains [:a :b] :gaps-ok)
(foo 42) => (contains [:a :b] :in-any-order)
(foo 42) => (contains "hello")
The last case is especially interesting. For the expression
(foo 42) => (contains "hello")
there are two completely different situations in which the test passes: (foo 42) can return a list that contains the element “hello”, or a string that contains the substring “hello”:
"hello world" => (contains "hello")
["foo" "hello" "bar"] => (contains "hello")
In general, the contains form is hard to convert automatically. Some cases need information that is available only at run time (as in the last example), and since many contains cases, such as (contains [:a :b] :in-any-order), have no ready-made equivalent in Clojure, we decided to skip all contains cases. Instead of trying to translate them automatically, we use a “fall-through” rule, which looks like this:
[actual arrow expected] `(is (~arrow ~expected ~actual))
It turns (foo 42) => (contains bar) into (is (=> (contains bar) (foo 42))). This code does not compile, because Midje's definition of the arrow is not loaded, so we can find these cases and fix them by hand.
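The two points above can be sketched in a few lines. First, a hand-written replacement for (contains "hello") has to pick its behavior at run time; second, the fall-through rule deliberately leaves the arrow in place so the leftover form fails to compile. All names here (contains-value?, fall-through, compiles?) are hypothetical, clojure.string/includes? assumes Clojure 1.8 or later, and none of this is Midje's own implementation.

```clojure
(require '[clojure.string :as str])

(defn contains-value?
  "Substring test for strings, element test for other collections."
  [actual expected]
  (if (string? actual)
    (str/includes? actual expected)
    (boolean (some #(= expected %) actual))))

(defn fall-through
  "Keeps the Midje arrow as the head of the asserted expression."
  [actual arrow expected]
  (list 'is (list arrow expected actual)))

(def leftover (fall-through '(foo 42) '=> '(contains bar)))
;; leftover is (is (=> (contains bar) (foo 42)))

(defn compiles?
  "Tries to compile form inside a zero-argument fn without running it."
  [form]
  (try
    (eval (list 'fn [] form))
    true
    (catch Throwable _ false)))

;; With Midje absent, the symbol => cannot be resolved, so the arrow
;; call inside the assertion does not compile.
(def leftover-compiles? (compiles? (second leftover)))
```

A compile error is a convenient marker here: it cannot be overlooked the way a silently wrong assertion could.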
Automatic conversion had one more complication. Consider two expressions:
(let [bar 3] (foo) => bar)
and
(let [bar clojure.core/map?] (foo) => bar)
How Midje interprets the arrow depends on the expression on the right-hand side, which can be determined reliably only at run time. If bar resolves to data (for example a string, number, list, or map), Midje checks for equality. But if bar resolves to a function, Midje calls that function, i.e. (is (= bar (foo))) versus (is (bar (foo))). Our 90% solution is to require the namespace of the original test and resolve the symbols during conversion:
(defn form-is-fn? [ns f]
  (let [resolved (ns-resolve ns f)]
    (and resolved
         (or (fn? resolved)
             (and (var? resolved) (fn? @resolved))))))
In most cases this works fine, but it breaks when a local binding shadows a global one:
(let [s [1 2 3] count (count s)] (foo s) => count)
In this case we want (is (= count (foo s))), but we get (is (count (foo s))), which is wrong: in the local scope count is a number, and calling a number as a function, as in (3 [1 2 3]), throws an error. Fortunately, such cases were rare, because solving this properly would require writing a full-fledged compiler that tracks local bindings in the environment.
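The gap between static resolution and the run-time value can be shown directly. In this sketch resolves-to-fn? is a hypothetical variant of the check above that always returns a boolean, and clojure.core stands in for the namespace of a real test file.

```clojure
(defn resolves-to-fn?
  "True when sym resolves, in namespace ns, to a function or to a var
  that holds a function. Knows nothing about local bindings."
  [ns sym]
  (let [resolved (ns-resolve ns sym)]
    (boolean (and resolved
                  (or (fn? resolved)
                      (and (var? resolved) (fn? @resolved)))))))

;; Static answer: count is a function in clojure.core.
(def static-answer (resolves-to-fn? 'clojure.core 'count))

;; Run-time truth inside the let: the local count shadows the function
;; and is bound to a number.
(def runtime-value
  (let [s [1 2 3]
        count (count s)]
    count))
```

static-answer is true while runtime-value is 3, which is exactly the mismatch that produces the wrong assertion.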
Once the transformation code was written, we needed to find out whether it worked. Since we run the code in the REPL, after converting a file we can simply run its tests with clojure.test's built-in functions.
The implementation of clojure.test helps tie the transformation and evaluation steps together. All test functions can be called from the REPL, and even (clojure.test/run-all-tests) returns a meaningful value: a map with the number of tests run and the number of assertions that passed, failed, or raised an error:
{:pass 61, :test 21, :error 0, :fail 0}
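A minimal sketch of how that return value can drive the next step, such as deciding whether a converted file is ready to commit. The namespace circle.example-test and the helper tests-pass? are hypothetical; run-tests is used instead of run-all-tests so that only one namespace is exercised.

```clojure
(ns circle.example-test
  (:require [clojure.test :refer [deftest is run-tests]]))

(deftest addition-works
  (is (= 4 (+ 2 2))))

(defn tests-pass?
  "Runs the tests in the given namespace and reports whether the
  summary map shows no failures and no errors."
  [ns-sym]
  (let [{:keys [fail error]} (run-tests ns-sym)]
    (zero? (+ fail error))))

(def ready-to-commit? (tests-pass? 'circle.example-test))
```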
Being able to run tests in the REPL makes the process very convenient: you can change the compiler and rerun the tests, getting feedback immediately.
However, not everything went so smoothly.
The “reader” (Clojure's term for the part of the compiler that implements the read function) is designed to turn source files into data structures, primarily for the compiler's use. It strips comments and expands reader macros, so we had to review every diff by hand and restore those lines. Fortunately, there were only a few of them in the tests. In our programming style we usually prefer docstrings to comments and keep macros in a small number of files, so this did not affect us much.
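Both effects are easy to observe with read-string. The inputs below are arbitrary examples, not lines from the CircleCI test suite.

```clojure
;; The comment is gone after reading; only the form survives.
(def without-comment
  (read-string "(foo 42) ; explains the magic number"))

;; Reader macros are expanded into ordinary forms.
(def quoted  (read-string "'x"))      ; (quote x)
(def derefed (read-string "@state"))  ; (clojure.core/deref state)
```

Writing these structures back to disk therefore produces the expanded forms, not the original source text.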
We did not find a library good enough at producing idiomatic indentation for our new code. We used clojure.pprint, which is probably the best library available but does not handle this task very well. We had no desire to write such a library as part of this project, so some files were written back to disk with non-idiomatic whitespace and indentation. Now we fix that by hand whenever we work on such a file. Doing it properly would require a tool that understands idiomatic formatting and takes into account the file and line metadata from the reading stage.
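For reference, this is roughly what printing a converted form with clojure.pprint looks like. The article does not say which dispatch function was used; code-dispatch is an assumption here, and format-form is a hypothetical helper.

```clojure
(require '[clojure.pprint :as pp])

(defn format-form
  "Pretty-prints a form as code and returns the text as a string."
  [form]
  (with-out-str
    (pp/with-pprint-dispatch pp/code-dispatch
      (pp/pprint form))))

(def converted-test '(deftest foo-works (is (= 42 (foo x)))))

(def formatted-text (format-form converted-test))
```

The printed text reads back to the same data structure, but its line breaks and indentation are chosen by the printer, not taken from the original file.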
A long time passed between rewriting the test suite and publishing this article. In the meantime, rewrite-clj was released. I have not used it, but at first glance it offers exactly what we were missing.
About 40% of the test files passed without any intervention, which is remarkable considering how quickly we put this solution together. In the remaining files, about 90% of the assertions were converted and passed. In total, 94% of the assertions across all files were converted automatically, a great result.
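The 94% figure is consistent with the other two numbers under one assumption the article does not state, namely that assertions are spread evenly across files. A short check with exact ratios:

```clojure
;; Assumption (not from the article): every file holds the same
;; number of assertions.
(def passing-files          2/5)    ; 40% of files passed untouched
(def remaining-files        (- 1 passing-files))
(def converted-in-remaining 9/10)   ; 90% of assertions in the rest

(def overall
  (+ passing-files (* remaining-files converted-in-remaining)))
;; overall is 47/50, that is 94%
```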
Our code is available on GitHub. Let us know if you use it. We would not recommend it for unsupervised conversion, especially because of the comment and macro issues, but it worked well for CircleCI as part of a supervised process.
From the translator. Thanks for the help: comerc, Source, chort409 and artemyarulin.
Source: https://habr.com/ru/post/308734/