Probabilistic Unit Testing (Chaos-Driven Unit Testing)

Every software system of any real complexity contains errors: if not its own, then errors introduced by the libraries it uses, or by an imperfect understanding of how the frameworks it relies on behave.
Unit tests are commonly used to test the system during development.

They let the programmer verify the system's behavior at control points and at boundary values.
Incorrect handling of boundary values is a frequent source of problems. Experienced programmers know this and take it into account when designing unit tests.

Unit tests are also convenient because, after changing the code, you expect predictable results and can run fully automatic tests against the existing scenarios to quickly spot problems introduced by the changes.

For example, you write code that must work on both Intel and PPC. You develop it on Intel, but keep byte order in mind. Then you run your unit tests, which compare the output with a reference, and find discrepancies: clearly you forgot to swap bytes somewhere. You fix it, and everything is fine.
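
As a rough illustration of such a test, the sketch below writes a 32-bit value in a fixed (big-endian) byte order and compares the bytes with a reference. The names put_u32_be, get_u32_be and test_u32_be are hypothetical and are not taken from the project described in this article.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store v in big-endian order, whatever the host byte order is. */
void put_u32_be(uint8_t out[4], uint32_t v) {
    assert(out != NULL);
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
}

/* Read a big-endian 32-bit value back from four bytes. */
uint32_t get_u32_be(const uint8_t in[4]) {
    assert(in != NULL);
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}

/* Unit test: returns 1 if the output matches the reference bytes. */
int test_u32_be(void) {
    static const uint8_t reference[4] = { 0x12, 0x34, 0x56, 0x78 };
    uint8_t buf[4];
    put_u32_be(buf, 0x12345678u);
    return memcmp(buf, reference, sizeof buf) == 0 &&
           get_u32_be(buf) == 0x12345678u;
}
```

Because the functions use shifts rather than copying memory, the same test gives the same answer on a little-endian and on a big-endian machine.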

However, every user brings an element of chance.

An experienced programmer also has some of a good tester's talent and can catch many mistakes before the program is released.

But if the program does more than print “Hello World!”, some hidden errors remain in any case.
These may be errors in logic as well.

The program compiles and all warnings are eliminated, yet sometimes something goes wrong on the user's side (and the user lives far away, in a house on an island in the Pacific Ocean, so there is no way to visit and investigate in person). The programmer has clicked through and tested everything possible and found no errors. What to do?

Any application can be viewed as a set of interrelated components C connected into a logical network.
Each component accepts arguments I as input and produces results O as output.
We write generators that produce random arguments I, feed them to the inputs of the components C, and check the outputs O; additional tests check the integrity of each component's internal state.

In this way we test each component with a random data set. The method can be extended to the complete network of components or to selected subnetworks.
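
The idea can be sketched on a toy component. Everything below is an assumption made for illustration: SortedSet stands in for a component C, rand() is the generator of arguments I, the return value of set_insert is the output O, and set_is_valid is the additional integrity test of the component's state.

```c
#include <assert.h>
#include <stdlib.h>

#define CAP 64

/* Hypothetical component C: a fixed-capacity set of ints kept in ascending order. */
typedef struct { int items[CAP]; int count; } SortedSet;

/* Operation under test. Returns 1 if v was inserted,
   0 if v was already present or the set is full. */
int set_insert(SortedSet *s, int v) {
    assert(s != NULL);
    int pos = 0;
    while (pos < s->count && s->items[pos] < v) pos++;
    if (pos < s->count && s->items[pos] == v) return 0;
    if (s->count == CAP) return 0;
    for (int i = s->count; i > pos; i--) s->items[i] = s->items[i - 1];
    s->items[pos] = v;
    s->count++;
    return 1;
}

/* Integrity check of the component state: size in range, strictly ascending. */
int set_is_valid(const SortedSet *s) {
    assert(s != NULL);
    if (s->count < 0 || s->count > CAP) return 0;
    for (int i = 1; i < s->count; i++)
        if (s->items[i - 1] >= s->items[i]) return 0;
    return 1;
}

/* Feed random inputs to the component. Returns the index of the first
   failing iteration, or -1 if no violation was found. */
int random_test(unsigned seed, int iterations) {
    SortedSet s = { {0}, 0 };
    srand(seed);
    for (int i = 0; i < iterations; i++) {
        int before = s.count;
        int v = rand() % 100;               /* input I; small range forces duplicates */
        int inserted = set_insert(&s, v);   /* output O */
        if (s.count != before + inserted) return i;  /* output disagrees with state */
        if (!set_is_valid(&s)) return i;             /* state integrity broken */
    }
    return -1;
}
```

A call such as random_test(1, 2000) returns -1 when every iteration passes; any other value is the iteration to investigate. The small value range is deliberate: it makes duplicates and the full-set case occur often.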

With this kind of stress testing, random data is likely to hit hidden boundary points and to exercise branches of your conditions that the tests you thought up never reached.

We run thousands of iterations on ever new random data, choosing random (valid) operations and requesting them from the components on that data.

Of course, at first you will have to tune the integrity monitoring subsystem, but after that everything runs like clockwork.
When a failure is detected, the probabilistic testing system saves the command log to a file, so instead of a random factor we get a very specific script that triggers the error and can be replayed.
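
A minimal sketch of such a command log, in plain C, might look as follows. The Command and CommandLog types, the one-line-per-command "op word" text format, and the assumption that words contain no whitespace are all choices made for this illustration, not the format used in the original project.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_CMDS 2000
#define MAX_WORD 32

typedef struct { int op; char word[MAX_WORD]; } Command;
typedef struct { Command cmds[MAX_CMDS]; int count; } CommandLog;

/* Record one executed command. Commands without an argument are stored
   with "-" as a placeholder. Returns 0 if the log is full. */
int log_append(CommandLog *log, int op, const char *word) {
    assert(log != NULL && word != NULL);
    if (log->count == MAX_CMDS) return 0;
    if (word[0] == '\0') word = "-";
    Command *c = &log->cmds[log->count++];
    c->op = op;
    strncpy(c->word, word, MAX_WORD - 1);
    c->word[MAX_WORD - 1] = '\0';
    return 1;
}

/* Write the log as text, one "op word" line per command. Returns 0 on I/O error. */
int log_save(const CommandLog *log, FILE *out) {
    assert(log != NULL && out != NULL);
    for (int i = 0; i < log->count; i++)
        if (fprintf(out, "%d %s\n", log->cmds[i].op, log->cmds[i].word) < 0)
            return 0;
    return 1;
}

/* Read a saved log back so it can be replayed. Returns the number of commands read. */
int log_load(CommandLog *log, FILE *in) {
    assert(log != NULL && in != NULL);
    int op;
    char word[MAX_WORD];
    log->count = 0;
    while (log->count < MAX_CMDS && fscanf(in, "%d %31s", &op, word) == 2)
        log_append(log, op, word);
    return log->count;
}
```

Replaying then amounts to loading the file and executing the commands in order against a fresh model, with no random generator involved.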

Before hunting for the error, we can try to reduce this scenario to the minimum length at which the error still shows up.
Once the error is fixed, the scenario must run without failures, and we have one more unit test in our collection for the future.
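
One simple way to shorten a failing scenario is to drop commands one at a time and keep each removal only if the failure is still reproduced. The sketch below does exactly that; it is a greedy reduction, not a full delta-debugging algorithm, and it is not necessarily what the original project used. The FailsFn callback and fails_example (a stand-in bug that appears whenever a command 1 is later followed by a command 2) are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Replays n commands against a fresh model; returns nonzero if the error shows up. */
typedef int (*FailsFn)(const int *cmds, int n);

/* Greedy reduction: try to remove each command in turn and keep the removal
   when the scenario still fails. Shrinks cmds in place, returns the new length. */
int minimize(int *cmds, int n, FailsFn fails) {
    assert(cmds != NULL && fails != NULL && n >= 0);
    int changed = 1;
    while (changed) {
        changed = 0;
        for (int i = 0; i < n; i++) {
            int removed = cmds[i];
            size_t tail = (size_t)(n - i - 1) * sizeof cmds[0];
            memmove(&cmds[i], &cmds[i + 1], tail);      /* drop command i */
            if (fails(cmds, n - 1)) {
                n--;                                    /* still fails: keep it out */
                i--;                                    /* re-examine the same slot */
                changed = 1;
            } else {
                memmove(&cmds[i + 1], &cmds[i], tail);  /* needed: put it back */
                cmds[i] = removed;
            }
        }
    }
    return n;
}

/* Stand-in for a real replay: "fails" if some 1 is followed later by a 2. */
int fails_example(const int *cmds, int n) {
    int seen_one = 0;
    for (int i = 0; i < n; i++) {
        if (cmds[i] == 1) seen_one = 1;
        else if (cmds[i] == 2 && seen_one) return 1;
    }
    return 0;
}
```

For the scenario {0, 1, 3, 0, 2, 3, 1} and fails_example, the reduction leaves the two commands {1, 2}. In a real setup the callback would replay the commands against a freshly created model, so each attempt costs one full replay.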

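As a practical example, a probabilistic testing system can be used to catch errors in code that builds dictionaries.
The following operations can be distinguished:

- add a new word to the model;
- set (update) an existing word;
- remove a word;
- check integrity.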

Testing code framework (Objective-C fragment):

 // Fragment: assumes <Foundation/Foundation.h>, <stdlib.h> and <time.h> are imported.
 srand((unsigned)time(NULL));

 NSMutableString *log = [NSMutableString string]; // executed commands, for replay
 int status = 0; // 0 = OK, nonzero = error code
 int prev = -1;
 unsigned i;
 #define ST_COUNT 2000

 id model = [SomeModelFactory createModelObject];

 for (i = 0; !status && i < ST_COUNT; i++)
 {
     int todo;
     do
     {
         todo = rand() % 4;
     }
     while (3 == todo && todo == prev); // never run two integrity checks in a row
     prev = todo;

     if (i + 1 == ST_COUNT) // last iteration
         todo = 3; // force an integrity check

     switch (todo)
     {
         case 0: // add new word to the model
         {
             ...
             break;
         }
         case 1: // set existing word
         {
             ...
             break;
         }
         case 2: // remove word
         {
             ...
             break;
         }
         case 3: // integrity check
         {
             if (i + 1 == ST_COUNT || rand() % 2)
             {
                 ...
                 status = 3; // set some error code if the check fails
             }
             break;
         }
     }
 }

 if (status)
 {
     [log writeToFile:@"/tmp/commands.log" atomically:YES encoding:NSUTF8StringEncoding error:NULL];
     exit(status);
 }


Argument Generator:

 char genChar(void)
 {
     // allowed characters
     static const char allowed[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890 /";
     return allowed[rand() % (sizeof(allowed) - 1)]; // -1 skips the terminating NUL
 }

 NSString *genWord(int min, int max)
 {
     NSMutableString *res = [NSMutableString string];

     if (max < min)
         max = min;

     int toGen = min + rand() % (max - min + 1); // random length in [min, max]

     int i;
     for (i = 0; i < toGen; i++)
         [res appendFormat:@"%c", genChar()];

     return res;
 }


All failing logs can be moved out of /tmp, for example into an issues folder with subfolders case-1, case-2, ...
The case number can then be used to re-run any of these checks in the future.

The probability of choosing each operation (from those listed above) can be adjusted.
For example, the integrity check need not run on every iteration; when a failure occurs, you can pinpoint its location more accurately by re-running with the check after each operation.
While performing operations, we maintain an independent copy of the system's state in an ordinary NSMutableDictionary. When checking integrity, we verify that the number of words in the control dictionary and in the dictionary under test match, that every word is present and has the same article text as in the control dictionary, and that a word's index can be obtained and the same word can be retrieved by that index.
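
The checks just listed can be sketched in plain C as follows. This is only an illustration under stated assumptions: ModelView is a hypothetical read-only interface to the dictionary under test, RefEntry plays the role of the control NSMutableDictionary, and ArrayModel is a trivial stand-in model that exists only to exercise the check.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical read-only view of the dictionary under test. */
typedef struct {
    void *self;
    int (*count)(void *self);
    const char *(*word_at)(void *self, int index);             /* NULL if out of range */
    int (*index_of)(void *self, const char *word);             /* -1 if absent */
    const char *(*article_for)(void *self, const char *word);  /* NULL if absent */
} ModelView;

/* One entry of the independently maintained control dictionary. */
typedef struct { const char *word; const char *article; } RefEntry;

/* Returns 0 if the model agrees with the control data,
   otherwise a code naming the first violated rule. */
int check_integrity(const ModelView *m, const RefEntry *ref, int ref_count) {
    assert(m != NULL && (ref != NULL || ref_count == 0));
    if (m->count(m->self) != ref_count) return 1;            /* word counts differ */
    for (int i = 0; i < ref_count; i++) {
        const char *article = m->article_for(m->self, ref[i].word);
        if (article == NULL) return 2;                       /* word is missing */
        if (strcmp(article, ref[i].article) != 0) return 3;  /* article text differs */
        int idx = m->index_of(m->self, ref[i].word);
        if (idx < 0) return 4;                               /* no index for the word */
        const char *back = m->word_at(m->self, idx);
        if (back == NULL || strcmp(back, ref[i].word) != 0) return 5; /* index mismatch */
    }
    return 0;
}

/* Trivial array-backed model, used only to try the check out. */
typedef struct { const RefEntry *entries; int count; } ArrayModel;

static int am_count(void *self) { return ((ArrayModel *)self)->count; }
static const char *am_word_at(void *self, int i) {
    ArrayModel *m = self;
    return (i >= 0 && i < m->count) ? m->entries[i].word : NULL;
}
static int am_index_of(void *self, const char *word) {
    ArrayModel *m = self;
    for (int i = 0; i < m->count; i++)
        if (strcmp(m->entries[i].word, word) == 0) return i;
    return -1;
}
static const char *am_article_for(void *self, const char *word) {
    int i = am_index_of(self, word);
    return i < 0 ? NULL : ((ArrayModel *)self)->entries[i].article;
}

ModelView array_model_view(ArrayModel *m) {
    ModelView v = { m, am_count, am_word_at, am_index_of, am_article_for };
    return v;
}
```

Returning a distinct code per rule is a design choice for this sketch: it tells you which invariant broke first without re-running the scenario.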

To steer the testing system, you can change the output of the argument generators, for instance forcing more collisions and thereby shifting the likelihood of certain events.
For the example above, we can generate words of 1 to 10 letters and articles of 1 to 100 characters.
If we change the conditions and generate words of 1 to 3 letters and articles of 1 to 10 characters, we move into a probabilistic field of different errors.
We can also adjust the probabilities of choosing the available operations and make the dictionary either grow sharply or shrink dramatically.
We can even change the selection probabilities themselves at random, like the wind changing direction...
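
Adjustable operation probabilities can be sketched with a small weight table. The operation names, the integer weights and the helper functions below are assumptions for illustration; the article's own fragment simply uses rand() % 4.

```c
#include <assert.h>
#include <stdlib.h>

enum { OP_ADD, OP_SET, OP_REMOVE, OP_CHECK, OP_COUNT };

/* Map a roll in [0, sum of weights) to an operation index.
   An operation with weight w is chosen for exactly w of the possible rolls. */
int pick_weighted(const int weights[], int n, int roll) {
    assert(weights != NULL && n > 0 && roll >= 0);
    for (int i = 0; i < n; i++) {
        if (roll < weights[i]) return i;
        roll -= weights[i];
    }
    return n - 1; /* reached only if roll was not below the sum of weights */
}

/* Choose the next operation at random according to the weight table. */
int next_operation(const int weights[OP_COUNT]) {
    int total = 0;
    for (int i = 0; i < OP_COUNT; i++) {
        assert(weights[i] >= 0);
        total += weights[i];
    }
    assert(total > 0);
    return pick_weighted(weights, OP_COUNT, rand() % total);
}
```

With weights {6, 2, 1, 1} additions dominate and the dictionary tends to grow; with {1, 2, 6, 1} removals dominate and it tends to shrink. Replacing the table every few hundred iterations with freshly generated weights gives the "changing wind" policy.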

In fact, it was only thanks to probabilistic testing that we caught 5 hidden and very subtle errors in our project, in an already tested engine that showed no visible signs of malfunction!

Probabilistic testing can bring us one step closer to imitating the end user and help us detect hidden defects.

Source: https://habr.com/ru/post/110007/

