
Heisenbug: Version 1.0



Can the very first release of a product be well-tested enough, or will you inevitably ship a bunch of blunders to production? The testing conference Heisenbug, which we recently held in Moscow, took place for the very first time, so it made a good case study. How did it go? Were there problems? And what should a testing conference look like when it covers completely different specializations, handled by experts from different fields?

On the morning of December 10, the lobby of the Radisson Slavyanskaya hotel was filled not only with testers but also with developers who did not want to simply "throw code over the wall" and understood the importance of testing. In the same lobby one could, among other things, play robo-hockey, and since that game is played by controlling bugs, at Heisenbug it looked rather symbolic.

Ilari Henrik Aegerter gave the opening keynote. He is the owner of a beard so remarkable that people who see it tweet "live, it's three times more impressive." Having warned the audience "I am a consultant and a manager, so doubt everything I say," he proceeded to voice ideas genuinely capable of provoking objections. For example, that the division into manual and automated testing is contrived, because "in both cases it's not the hands that matter, but the brain." And that TDD cannot be considered full-fledged testing at all, "because the law of requisite variety demands more variety from the controlling system than from the controlled one."



Urging the audience "not to try to automate absolutely everything," Ilari backed this up with a visual experiment that tested the testers. He proposed jointly drawing up a list of criteria for testing "2 + 2" on a calculator. The audience called out plenty of things worth checking, such as the intervals between key presses. When the suggestions dried up, Ilari continued: "Well, you thought of conditions under which the device might fail to perform the desired action. But what if it does more than that? What if, besides the four, it displays something else? Looking at the screen, we would notice something wrong right away. But writing complete criteria so that all of this could be detected automatically is hard: there are a lot of unknown unknowns."

He finished his talk just as emphatically, with the words "do not consider yourselves bearers of the truth, maintain a level of doubt within yourselves."



Dan Cuellar, who develops Appium, recently told us about his brainchild and about mobile testing in general in an interview. His talk covered much the same ground, but in far more detail: "On iOS, a phishing-site warning may get in the way of testing, and there is no point in paying for a certificate just for tests. In such cases the warning can be bypassed with safariIgnoreFraudWarning." He ended his talk on a loud note as well: "Believe in yourselves. I'm sure you can do better than Appium. If SpaceX landed a rocket on a barge, that doesn't mean they understand something you can't!"
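
For illustration, here is a minimal sketch of how that capability might be passed to an Appium session using the Java client; the server address, device name, and target URL below are placeholder assumptions, not details from the talk:

```java
import java.net.URL;

import io.appium.java_client.ios.IOSDriver;
import org.openqa.selenium.remote.DesiredCapabilities;

public class SafariNoFraudWarning {
    public static void main(String[] args) throws Exception {
        DesiredCapabilities caps = new DesiredCapabilities();
        caps.setCapability("platformName", "iOS");
        caps.setCapability("deviceName", "iPhone Simulator");
        caps.setCapability("browserName", "Safari");
        // Skip Safari's phishing warning, so a test site with a
        // self-signed certificate does not break the automated run.
        caps.setCapability("safariIgnoreFraudWarning", true);

        // Default local Appium server address (a placeholder here).
        IOSDriver driver = new IOSDriver(new URL("http://127.0.0.1:4723/wd/hub"), caps);
        try {
            driver.get("https://staging.example.com"); // hypothetical test URL
        } finally {
            driver.quit();
        }
    }
}
```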



The talk by Igor Khrol (Toptal) about automated tests was perhaps the most hands-on: on a huge screen, he demonstrated the code of one test after another in real time, ran them, and recorded their performance in a table. Each subsequent test turned out to be faster than the previous one, making the "how many such tests can be run in five minutes" column ever more impressive.



Meanwhile, Vladimir Sitnikov (Netcracker) sorted out the intricacies of load testing using JMeter, a tool he knows inside out. For example, if "one request per minute" fires strictly at the start of each minute, that is no good: the picture is not representative. On the other hand, if you simply randomize the timing, it becomes unclear how to correctly compare measurements with each other, since their conditions obviously differ. Moreover, if the rate is set to "100 requests per hour" and the test runs for half an hour, you would like that to mean "50 requests per half hour", and plain randomness does not promise that. What to do with this tricky situation? According to Vladimir, build your own bicycle, and that is exactly what he did by writing a timer for JMeter. The timer produces a Poisson distribution starting from a given random seed (so that the same random delays can be replayed), and it also has a test-duration parameter that yields that "even half hour".
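
To make the idea concrete, here is a rough sketch of the principle behind such a timer (not Vladimir's actual JMeter plugin): a Poisson process has exponentially distributed inter-arrival delays, and a fixed seed makes the random schedule reproducible between runs:

```java
import java.util.Random;

public class PoissonDelays {
    public static void main(String[] args) {
        double ratePerSecond = 100.0 / 3600.0;   // "100 requests per hour"
        long testDurationSec = 1800;             // a half-hour test
        Random random = new Random(42L);         // fixed seed => repeatable delays

        double t = 0;
        int fired = 0;
        while (true) {
            // Inverse-transform sampling of the exponential distribution
            // gives the delay until the next request of a Poisson process.
            double delay = -Math.log(1 - random.nextDouble()) / ratePerSecond;
            t += delay;
            if (t > testDurationSec) {
                break;
            }
            fired++;
        }
        System.out.println("requests fired: " + fired);
    }
}
```

Note that this naive version only gives about 50 requests per half hour on average; making the count come out even over the test duration is exactly what the duration parameter of the real timer is for.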



The topics Sitnikov covered were developed further in the next slot in two rooms at once: while Alexey Ragozin (Deutsche Bank) spoke about load testing, Stanislav Bashkirtsev (EPAM) also turned to randomization, but in a different context. He suggests using Random Ext (for JavaScript) or his own Datagen library (for Java) to obtain randomized test data. Why does it matter? As an example, Stanislav showed a JIRA user's complaint about an error when trying to paste emoji into a text field: without randomization, such a scenario might not even come to mind, yet users, as it turns out, need it.
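
The actual Datagen API may look different, but the principle can be illustrated in plain Java: generate input that mixes ordinary characters with emoji, the kind of data that surfaced the JIRA bug:

```java
import java.util.Random;

public class RandomizedInput {
    // Seed the generator from a logged value so a failing run can be replayed.
    private static final Random RANDOM = new Random(42L);

    /** Returns a string mixing plain ASCII letters with emoji code points. */
    static String randomUnicodeString(int length) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < length; i++) {
            if (RANDOM.nextBoolean()) {
                sb.append((char) ('a' + RANDOM.nextInt(26)));
            } else {
                // Emoji live outside the Basic Multilingual Plane,
                // e.g. the U+1F600..U+1F64F "emoticons" block.
                sb.appendCodePoint(0x1F600 + RANDOM.nextInt(0x50));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Feed this into any text field under test.
        System.out.println(randomUnicodeString(10));
    }
}
```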

The ending of this talk also turned out to be memorable: it featured a list titled "Why randomize? 5 whys", and the fifth "why" stuck in the memory immediately.




Philipp Keks's talk differed sharply from the others: he spoke on the rather exotic topic of automating game testing (about which he had previously written a blog post) and urged the audience not to be afraid of creating their own tools for it. Along the way, Keks showed with live coding how Creative Mobile tests its mobile drag racing game Nitro Nation (more than 10 million installs on Google Play). He also showed a photo of the "Cthulhu" server running builds on a number of connected mobile devices; the server got its name from how the whole thing looks. All this impressed the audience so much that practically the first question after the talk was "Do you have any vacancies?"



Jan Jaap Cannegieter began his talk with the words "it's difficult for you to pronounce my name correctly, but that's okay, yours are difficult for me too" (so we will leave his name in the Latin alphabet, just in case). Unlike Ilari, he did not propose abolishing the division into manual and automated testing, but he agreed that the brain is the main tool in use. To demonstrate this, he started testing the Heisenbug website right on stage: "Take a website crawler tool and analyze the site. It reports a number of errors, but now a human is needed to figure out which of them are real. Let's follow this link: everything seems to be in order, although I don't understand what it is. And here is a genuine 404. Hooray, I found a bug!"
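
A toy version of that workflow might look like the sketch below (the URLs are hypothetical): the code only flags status codes, and a human still has to decide which failures are real bugs:

```java
import java.net.HttpURLConnection;
import java.net.URL;

public class LinkChecker {
    /** Returns the HTTP status code of the given URL via a HEAD request. */
    static int statusOf(String url) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("HEAD");   // headers are enough for a status check
        conn.setConnectTimeout(5000);
        conn.setReadTimeout(5000);
        return conn.getResponseCode();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical pages on the site under test.
        String[] urls = {
            "https://conference.example.com/",
            "https://conference.example.com/no-such-page"
        };
        for (String url : urls) {
            // A 404 here is only a candidate bug: a person still has to
            // check whether the link was supposed to work at all.
            System.out.println(statusOf(url) + "  " + url);
        }
    }
}
```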

In general, it is useful to hold testing conferences: along the way, the speakers will test everything for you.

And finally, the closing keynote came from Rex Black, a figure so significant in the testing world that some attendees took selfies with him. "I'm from Texas; maybe you expected me to wear a cowboy hat? We have an expression, 'all hat and no cattle', about those who dress like a cowboy without being one. I don't want to be all hat and no cattle, so I don't wear a hat."



His keynote, on how to avoid mistakes in the use of metrics, included an interesting example of the Hawthorne effect: sometimes the very fact of measuring something affects what is being measured, making it hard to get an accurate picture of the situation. This curiously echoed the name of the conference: a "heisenbug" is a bug that disappears or changes its properties when you attempt to detect it.

During that same closing keynote, the online broadcast was interrupted for a while, so the anti-bug conference did not go entirely without a small bug of its own. That, however, turned out to be the most serious problem of the whole day: experience from holding conferences on other topics helped. So it turns out that a first release can avoid large-scale problems after all, provided you have already got your hand in on something similar.

The question of what a conference about such a multifaceted phenomenon as testing should look like also found its logical answer: it should be just as diverse. Talks on the nuances of a specific product and talks about the activity as a whole, a talk on distributed systems based on Yandex's experience and a talk on games based on a hugely popular app, talks with code and talks with general reasoning: in the end, judging by the feedback, both testers and developers could take something away.

Following the conference, the Radio QA podcast released an episode about Heisenbug with members of the program committee. And the top 5 talks, as rated by the attendees, turned out to be:

  1. Philipp Keks (Creative Mobile) - How to teach robots to play games?
  2. Alexander Bayandin (Badoo) - ChromeDriver Jailbreak
  3. Dan Cuellar (Appium) - Appium: Automation for Apps
  4. Stanislav Bashkirtsev (EPAM) - Randomized testing
  5. Vladimir Sitnikov (Netcracker) - Pitfalls in load testing

Source: https://habr.com/ru/post/317938/

