
This article describes a number of issues related to software testing. It covers the difficulties a developer of resource-intensive applications may face and ways to overcome them. The terms "resource-intensive" and "64-bit" applications are treated as synonymous throughout the article.
The size of the average computer program grows steadily from year to year. Programs become more complicated and tangled, process more and more data, and acquire ever more functional and attractive graphical interfaces. Where a program of a few kilobytes with the simplest editing capabilities was once considered a full-fledged text editor, word processors now occupy tens and hundreds of megabytes while providing proportionally greater functionality. Naturally, the requirements for the performance of computing hardware grow just as fast.
The next step in increasing computing power was the transition to 64-bit microprocessor systems. This step cannot be called revolutionary, but it significantly expands the capabilities of computer systems. First of all, 64-bit systems overcame the 4 GB barrier, which had already begun to restrict many software developers: above all, the developers of numerical modeling packages, 3D editors, DBMSs, and games. A large amount of RAM significantly expands the capabilities of applications, allowing large amounts of data to be stored and accessed directly, without loading them from external data stores. Nor should we forget the higher performance of 64-bit versions of programs, which comes from the larger number of registers, extended floating-point arithmetic capabilities, and the ability to work with 64-bit numbers.
As software solutions become more complex, so does the task of maintaining and testing them. The impossibility of testing large software systems manually gave impetus to the development of test automation and program quality control systems. There are various approaches to ensuring the necessary quality of software; let us briefly recall them.
The oldest, most proven, and most reliable approach to finding defects is joint code review [1]. This technique is based on reading the code together while following a number of rules and recommendations, well described, for example, in Steve McConnell's "Code Complete" [2]. Unfortunately, this practice is not applicable to large-scale testing of modern software systems because of their sheer volume. Although this method gives the best results, it is not always used under modern software development life cycles, where development time and time to market are critical. Therefore, code review is often reduced to infrequent meetings whose purpose is to train new and less experienced employees to write high-quality code, rather than to verify the correctness of a number of modules. This is a very good way to improve programmers' skills, but it cannot be considered a full-fledged means of quality control for the program under development.
Static code analysis tools [3] come to the aid of developers who realize the need to review code regularly but do not have enough time for it. Their main task is to reduce the amount of code that requires human attention and thereby reduce review time. Static code analyzers form a rather large class of programs, implemented for various programming languages and offering a diverse set of functions, from the simplest checks of code alignment to complex analysis of potentially dangerous places. The systematic use of static analyzers can significantly improve code quality and uncover many errors. The static analysis approach has many adherents, and many interesting works are devoted to it (for example, [4], [5]). The advantage of this approach is that it does not depend on the complexity and size of the software solution being developed.
Selective testing is another noteworthy way to improve the quality of software products. It is based on the well-known and intuitive idea of testing only those parts of a software product that were directly affected by recent changes. The main problem with applying selective testing is obtaining a reliable list of all the affected parts of the product. Selective testing methods, supported for example by the Testing Relief software product, solve this problem.
The white box method [6]. By white box testing we mean executing the maximum available number of different code branches using a debugger or other means. The greater the code coverage achieved, the more complete the testing. White box testing is also sometimes understood as simple debugging of an application in order to find a known error. Full white-box testing of the entire program code has long been impossible because of the sheer volume of modern code. Nowadays the white box method is most conveniently applied at the stage when an error has been found and its cause needs to be understood. White box testing has opponents who deny the usefulness of debugging programs in real time. Their main argument is that the ability to observe program execution while simultaneously changing its state encourages an unacceptable programming style based on a large number of trial-and-error code corrections. We will not go into these disputes, but note that white box testing is in any case a very expensive way to improve the quality of large and complex software systems.
The black box method has proved much better [7]. Unit testing [8] also belongs here. The main idea of the method is to write a set of tests for individual modules and functions that check all the main modes of their operation. Some sources classify unit testing as a white box method, since it relies on knowledge of the program's internal structure. The author takes the position that the tested functions and modules should be treated as black boxes, because unit tests should not take the internal structure of a function into account. This is justified by the methodology in which tests are developed before the functions themselves are written, which helps to verify their functionality against the specification.
A great deal of literature has been devoted to unit testing, for example [9, 10]. Unit testing has proven itself in the development of both simple and complex projects. One of its advantages is that the correctness of changes made to the program can be verified easily right during development. Developers try to keep the full test run within a few minutes, so that whoever made a change to the code notices an error immediately and corrects it. If running all the tests at once is impossible, the long tests are usually separated out and run, for example, overnight. This also contributes to quick error detection, at least by the next morning.
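As an illustration, here is a minimal sketch of a unit test in plain C++; the function and test names are hypothetical and no particular test framework is assumed:
#include <cassert>
#include <cstddef>

// Hypothetical function under test: returns the index of the first
// occurrence of Value in Array, or Size if the value is not found.
size_t FindValue(const int *Array, size_t Size, int Value)
{
  for (size_t i = 0; i != Size; ++i)
    if (Array[i] == Value)
      return i;
  return Size;
}

// A minimal unit test checking the main modes of the function's work.
void TestFindValue()
{
  const int Data[] = { 3, 1, 4, 1, 5 };
  const size_t Size = sizeof(Data) / sizeof(Data[0]);
  assert(FindValue(Data, Size, 4) == 2);    // element is present
  assert(FindValue(Data, Size, 9) == Size); // element is absent
  assert(FindValue(Data, 0, 3) == 0);       // empty array
}

int main()
{
  TestFindValue();
  return 0;
}
Such a test runs in a fraction of a second, which is exactly why developers keep the data sets in unit tests small; this point will matter below when we turn to 64-bit data volumes.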
Manual testing. This is perhaps the final stage of any development, but it should not be regarded as a good and reliable method. Manual testing must exist, since it is impossible to detect all errors automatically or by reviewing the code, but it should not be relied on too heavily. If a program is of low quality and has a large number of internal defects, testing and correcting it may take a very long time, and even then proper quality cannot be ensured. The only way to get a quality program is quality code. Therefore we will not consider manual testing a full-fledged method for developing large projects either.
So, what is left that deserves the most attention when developing large software systems? Static analysis and unit testing. These approaches can significantly improve the quality and reliability of program code, and they deserve the closest attention, although of course the other methods should not be forgotten.
Now let's move on to the question of testing 64-bit programs, since applying the methods we have chosen runs into several unpleasant difficulties. Let's start with static code analyzers.
Strange as it may seem, despite all their enormous capabilities, long development history, and practical use, static analyzers turned out to be poorly prepared for finding errors in 64-bit programs. Let us consider the situation using C++ code analysis as an example, since this is the area where static analyzers are most widely used. Many static analyzers support a number of rules related to finding code that behaves incorrectly when ported to 64-bit systems, but they implement them in very disparate ways and very incompletely. This became especially evident after the mass development of applications for the 64-bit version of Windows began in the Microsoft Visual C++ 2005 environment.
The explanation is that most checks are based on fairly old materials researching the problems of porting programs to 64-bit systems from the point of view of the C language. As a result, a number of constructs that appeared in the C++ language received no attention from the standpoint of portability control and were not reflected in the analyzers. A number of other changes are not taken into account either, such as the significantly increased amount of RAM and the use of different data models in different compilers (LP64, LLP64, ILP64 [11]).
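A simple way to see which data model your compiler uses is to print the sizes of the basic types (a minimal sketch): on an LLP64 system (64-bit Windows) long remains 4 bytes while pointers are 8 bytes, whereas on an LP64 system both are 8 bytes.
#include <cstddef>
#include <iostream>

int main()
{
  // Typical results: LLP64 (Win64): int=4, long=4, pointer=8, size_t=8.
  //                  LP64 (64-bit Unix systems): int=4, long=8, pointer=8, size_t=8.
  std::cout << "sizeof(int)    = " << sizeof(int)    << std::endl;
  std::cout << "sizeof(long)   = " << sizeof(long)   << std::endl;
  std::cout << "sizeof(void *) = " << sizeof(void *) << std::endl;
  std::cout << "sizeof(size_t) = " << sizeof(size_t) << std::endl;
  return 0;
}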
For clarity, consider a couple of examples:
double *DoubleArray;
unsigned Index = 0;
while (...)
  DoubleArray[Index++] = 1.0; // the assigned value is not important here
You will not receive a warning for such code even from such powerful analyzers as Parasoft C++test and Gimpel Software PC-Lint. And no wonder: the code above raises no suspicion in the average developer, who is used to using variables of type int or unsigned as indices. Unfortunately, on a 64-bit system this code will not work if the size of the DoubleArray array being processed exceeds 4 billion elements. In that case the Index variable overflows and the program produces incorrect results. The correct option is to use the size_t type when programming for Windows x64 (the LLP64 data model) or size_t/unsigned long when programming for Linux (the LP64 data model).
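A corrected variant of this fragment might look as follows (a sketch; the function name and the assigned value are only illustrative):
#include <cstddef>

void FillArray(double *DoubleArray, size_t Count)
{
  // size_t is 64 bits wide on 64-bit systems, so the index does not
  // overflow even for arrays with more than 4 billion elements.
  for (size_t Index = 0; Index != Count; ++Index)
    DoubleArray[Index] = 1.0;
}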
The reason why static analyzers cannot diagnose the original code probably lies in the fact that when the issues of porting to 64-bit systems were first studied, hardly anyone imagined arrays of more than 4 billion items. And 4 billion elements of type double is 4,000,000,000 * 8 = 32 GB of memory for a single array: a huge volume, especially considering that this was 1993-1995, the period when most publications and discussions devoted to the use of 64-bit systems appeared.
As a result, almost nobody paid attention to possible incorrect indexing with the int type, and porting issues have rarely been investigated since. Consequently, almost no static analyzer will issue a warning for the code above. Perhaps the only exception is the Viva64 analyzer included in PVS-Studio. It was designed to close the gaps left by other analyzers in the diagnosis of 64-bit C/C++ code and is based on newly conducted research. But it has a significant drawback: it is not a general-purpose analyzer. It specializes only in errors that occur when porting code to 64-bit Windows systems and therefore should be used in combination with other analyzers to ensure proper code quality.
Consider another example:
char *p;
long g = (long)p;
With this simple example you can check which data models the static analyzer you use understands. The problem is that most of them are designed only for the LP64 data model. This, again, is due to the history of 64-bit systems: it is the LP64 data model that gained the most popularity in the early stages of 64-bit development and is now widespread in the Unix world. In that data model the long type is 8 bytes in size, so this code is perfectly correct. But 64-bit Windows systems implement the LLP64 data model, where long remains 4 bytes, and the code above is therefore incorrect. On Windows one should use, for example, LONG_PTR or ptrdiff_t instead.
Fortunately, this code is diagnosed as dangerous both by the Microsoft Visual C++ 2005 compiler itself and by the Viva64 analyzer. But you should always keep such pitfalls in mind when using static analyzers.
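A portable variant of the fragment (a minimal sketch) stores the pointer in a type that is wide enough in both the LP64 and LLP64 data models:
#include <cstddef> // ptrdiff_t

void Foo(char *p)
{
  // ptrdiff_t (or LONG_PTR on Windows, or intptr_t from <cstdint>)
  // can hold a pointer value in both the LP64 and LLP64 data models.
  ptrdiff_t g = (ptrdiff_t)p;
  (void)g; // silence the unused-variable warning in this sketch
}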
An interesting situation has resulted. The issue of porting programs to 64-bit systems was discussed in detail, various checking methods and rules were implemented in static analyzers, and then interest in the topic faded. Many years have passed and much has changed, but the rules by which the analysis is carried out remain unchanged and unmodified. Why this is so is hard to explain. Perhaps developers simply do not notice the changes, assuming that the question of testing 64-bit applications was resolved long ago. I would like you not to fall into that trap. Be careful. What was relevant 10 years ago may not be relevant now, while many new things have appeared. When using static analysis tools, make sure they are compatible with the 64-bit data model you are targeting. If the analyzer does not meet the necessary conditions, do not be lazy to look for another one, or fill the gap with the narrowly specialized Viva64 analyzer. The effort spent on this will more than pay off in increased program reliability and reduced debugging and testing time.
Now let's talk about unit tests. On 64-bit systems we also run into a number of unpleasant issues with them. In an effort to reduce test execution time, developers try to use small amounts of computation and data in tests. For example, when developing a test for a function that searches for an element in an array, it does not matter much whether the function processes 100 elements or 10,000,000. One hundred elements is enough, and compared with processing 10,000,000 elements the test runs significantly faster. But if you want full-fledged tests that exercise this function on a 64-bit system, you will need to process more than 4 billion elements! You may think that if the function works on 100 elements, it will work on billions. It will not. If you do not believe it, try the following example on a 64-bit system:
#include <iostream>
#include <cstdlib>
#include <tchar.h>

bool FooFind(char *Array, char Value, size_t Size)
{
  // The 32-bit counter is the defect demonstrated here: on Win64, where
  // Size may exceed UINT_MAX, 'i' wraps around and never reaches Size.
  for (unsigned i = 0; i != Size; ++i)
    if (i % 5 == 0 && Array[i] == Value)
      return true;
  return false;
}

#ifdef _WIN64
  const size_t BufSize = 5368709120ui64; // 5 GB, addressable only on Win64
#else
  const size_t BufSize = 5242880;        // 5 MB
#endif

int _tmain(int, _TCHAR *[])
{
  char *Array = (char *)calloc(BufSize, sizeof(char));
  if (Array == NULL)
  {
    std::cout << "Error allocating memory" << std::endl;
    return 1;
  }
  if (FooFind(Array, 33, BufSize))
    std::cout << "Find" << std::endl;
  free(Array);
  return 0;
}
As the example shows, if your program starts processing larger amounts of data on a 64-bit system, you cannot rely on the old sets of unit tests: on Win64 the 32-bit unsigned loop counter in FooFind wraps around and can never reach a Size value above UINT_MAX, so the loop does not terminate. The tests must necessarily be extended to take the processing of large amounts of data into account.
Unfortunately, writing new tests is not enough. Here we face the problem of the execution speed of a modified test suite that covers the processing of large amounts of data. The first consequence is that such tests cannot be added to the suite a developer runs during development. Adding them to the nightly tests can also be problematic: the total execution time of all the tests can grow by an order of magnitude or more, and the test run may easily exceed 24 hours. This should be kept in mind, and the reworking of tests for the 64-bit version of a program should be approached with all seriousness.
A way out of this situation is to split all the tests into several groups executed in parallel on several computers. You can also use multiprocessor systems. Of course, this somewhat complicates the testing system and requires additional hardware resources, but it is the most correct and, in the end, the simplest step towards building a working unit testing system.
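One possible way to separate the heavy large-data tests from the quick ones is sketched below; the environment variable name and the test entry points are arbitrary, chosen only for illustration.
#include <cstdlib>
#include <iostream>

// Hypothetical test groups.
void RunQuickTests()     { std::cout << "quick tests" << std::endl; }
void RunLargeDataTests() { std::cout << "large-data tests" << std::endl; }

int main()
{
  // Quick tests always run. The heavy tests that allocate many gigabytes
  // run only when the (arbitrary) RUN_LARGE_TESTS variable is set, for
  // example on a dedicated machine during the nightly test run.
  RunQuickTests();
  if (std::getenv("RUN_LARGE_TESTS") != NULL)
    RunLargeDataTests();
  return 0;
}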
Naturally, you will need an automated testing system that lets you organize the launch of tests on several machines. An example is the AutomatedQA TestComplete test automation system for Windows applications. With its help you can perform distributed testing of applications on several workstations and synchronize and collect the results [12].
Modern tools such as Intel Parallel Studio can also make life considerably easier. Since in resource-intensive applications part of the code may be written by a scientist who is not an expert in programming technique, it is better to search for errors in such code with a tool than without one.
In conclusion, I would like to return once more to white box testing, which we considered unacceptable for large systems. It should be added that when debugging 64-bit applications that process large data arrays, this method becomes completely inapplicable: debugging such applications takes far more time or is simply impractical on developer machines. It is therefore worth considering in advance the use of logging for debugging such applications, or other methods such as remote debugging.
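As an illustration, a minimal logging sketch might look like this (nothing more than the idea; real projects would use a full-featured logging library):
#include <fstream>
#include <ostream>

// A very simple logger: it writes messages to a file so that the behavior
// of a long run over a large data set can be analyzed afterwards instead
// of being stepped through in a debugger.
void LogMessage(const char *Message)
{
  static std::ofstream Log("debug.log", std::ios::app);
  Log << Message << std::endl;
}

int main()
{
  LogMessage("Processing started");
  // ... long computation over a large data set ...
  LogMessage("Processing finished");
  return 0;
}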
Summing up, I would like to say that you should not rely on any single method. A quality application can be developed only by combining several of the approaches considered.
Summarizing the problems of developing 64-bit systems, let me remind you of the key points once again:
- Be prepared for surprises when developing and testing 64-bit applications.
- Be prepared for the fact that debugging 64-bit applications with the white box method may become impossible or extremely difficult when large data sets are processed.
- Study the capabilities of your static analyzer carefully, and if it does not meet all the necessary requirements, do not be lazy to find another one or to use an additional specialized analyzer.
- Do not trust the old sets of unit tests. Be sure to review them and add new tests that take the features of 64-bit systems into account.
- Expect a significant slowdown of the unit test suites and arrange additional computers for running them in advance.
- Use a test automation system that supports distributed launching of tests, such as TestComplete, to keep test runs fast.
- The best result is achieved by combining different techniques.
It would be very interesting to read in the comments the opinions of people involved in developing resource-intensive applications.
References
- Wikipedia, " Code review ".
- Steve McConnell, "Code Complete, 2nd Edition" Microsoft Press, Paperback, 2nd Edition, Published June 2004, 914 pages, ISBN: 0-7356-1967-0.
- Wikipedia, " Static code analysis ".
- Scott Meyers, Martin Klaus " A First Look at C ++ Program Analyzers. ", 1997.
- Walter W. Schilling, Jr. and Mansoor Alam. " Integrate Static Analysis Software Development Process ", 01, 2006.
- Wikipedia, " White box testing ".
- Wikipedia, " Black box testing ".
- Wikipedia, " Unit testing ".
- Justin Gehtland, " 10 Ways to Make Your Code More Testable ", July, 2004.
- Paul Hamill, “Unit Test Frameworks”, November 2004, 212 pages, ISBN 10: 0-596-00689-6
- Andrew Josey, " Data Size Neutrality and 64-bit Support ".
- AutomatedQA, " TestComplete - Distributed Testing Support ".