
Three interviews about static code analyzers

Hello, dear readers!

The author presents three interviews with representatives of modern, large, and interesting projects about software development methodology and, in particular, the use of static code analyzers. The author hopes readers will find this text interesting. Participants: Acronis, AlternativaPlatform, NPO Echelon.

Respectfully,
Alexander Timofeev

Participants and article structure


For the interviews, the author turned to three companies:
- Acronis, with Acronis Backup, a product for creating backup copies of data and restoring them later;
- AlternativaPlatform, with Tanki Online, a multiplayer game;
- NPO Echelon, with a number of products for auditing security-related code.

The questions were the same for all companies, with minor changes for NPO Echelon so that the interview would better reflect the company's specifics.

Interview with Acronis


Speaker - Kirill Korotaev, vice president of product development, Acronis Backup

1) An overview of the company's main and most ambitious product/project: what the product does, what language it is written in, how many people work on it, the usual rate of changes to the project code (in lines or kilobytes of code, for example, per day/week/month), and which version control system is used.

The essence of the Acronis Backup product we are developing is to create backup copies of users' data on their computers, laptops, and servers, and to give people the ability to restore their data from these backups later. Recovery may be needed, for example, if a computer has failed, if an earlier version of a file or document is required, or if a file has been lost.

Our project is 99% written in C++. About 70 developers work on it. On average, between 100 and 300 changes (commits) are made to the project code per week. The version control system is SVN (Subversion).

2) Who analyzes the project code, and how? How is the testing cycle organized? How large is the team of testers? How does the company react when information about an error appears - is there an established protocol for such situations?

We have architects and team leads who know the code of the parts of the project they are responsible for; they analyze that code and know how to improve it. Each commit goes through code review - that is, any change is first analyzed by the people responsible for that section of the code.
At the moment the number of testers is comparable to the number of developers. Both automated and manual tests are used. We have, for example, build validation tests - a set of tests that checks each new build. Ideally, after each commit a new build should be produced and checked immediately.

The process of reacting to a found error is as follows. Any issue found by the testing department is logged in the Jira system (a more advanced, paid counterpart of Bugzilla). All of this is integrated with SVN: when, for example, a commit is made that fixes a specific issue, a link to that commit is added to Jira. Error reports can also come from users of our product; they first go to our technical support. If support reveals bugs that need to be analyzed, information about them again goes into Jira first, and the bugs are fixed in the next product updates.

3) Are static code analysis tools used? If so, which ones? If so, could the expert give an example of the most remarkable and interesting error that the analyzers helped to find? What are the usual results and statistics from using analyzers? How often and on what schedule are the checks carried out? What is the scenario for responding to an error found by an analyzer?

We use, or have used previously, different static analyzers - for example, both the free open-source Cppcheck and PVS-Studio. Of course, it makes sense to use them in any project. But code analyzers differ greatly from each other and catch different classes of errors - that is why I am in favor of diversity in the tools used.

There are interesting potential bugs. Among the more complex ones, for example, PVS-Studio finds incorrect use of the standard auto pointers (std::auto_ptr) from the STL. Or it finds another interesting mistake: if you multiply the sizeof of some structure or parameter by another sizeof, PVS-Studio correctly notes that, generally speaking, it is strange to multiply one sizeof by another - even the dimension is logically squared.
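
For illustration, here is a small hypothetical C++ sketch of the two patterns just described (the types and function names are invented for this article, not taken from the Acronis code base): a sizeof multiplied by another sizeof, and a misuse of the old std::auto_ptr.

    #include <cstddef>
    #include <cstring>
    #include <memory>

    struct Record { char name[32]; char data[96]; };

    void clear_records(Record* records, std::size_t count)
    {
        // Bug: sizeof(Record) is multiplied by another sizeof, so the result
        // is "bytes squared" times count; the intended size is count * sizeof(Record).
        std::memset(records, 0, count * sizeof(Record) * sizeof(records[0]));
    }

    void transfer_ownership()
    {
        // std::auto_ptr is deprecated (and removed in C++17); it is shown
        // here only to illustrate the old bug pattern an analyzer can flag.
        std::auto_ptr<Record> p(new Record());
        // Bug: copying an auto_ptr silently transfers ownership, so 'p'
        // becomes null here, and the next line dereferences a null pointer.
        std::auto_ptr<Record> q = p;
        p->data[0] = 0;
    }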

Sometimes static analyzers can detect that a pointer is not checked for null before use. But these are more complex checks, because it is not always obvious whether the pointer can actually be null at that point in the code. Running static analyzers over the code once a day is quite a good practice, and automatically filing the found bugs in the same Jira is useful for the product being developed.
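
Below is a minimal hypothetical sketch of the null-pointer situation meant here (again, the names are invented, not code from the product): the pointer is dereferenced before the check that is supposed to protect it, which is exactly the kind of inconsistency an analyzer can report.

    #include <cstring>

    struct Config { int timeout; };

    static Config network_config = { 30 };

    // Hypothetical lookup: returns a null pointer for unknown section names.
    Config* find_config(const char* name)
    {
        return (name && std::strcmp(name, "network") == 0) ? &network_config : 0;
    }

    int get_timeout(const char* name)
    {
        Config* cfg = find_config(name);
        // Bug: 'cfg' is dereferenced here, before the null check below,
        // so the check no longer protects against a null pointer.
        int timeout = cfg->timeout;
        if (cfg == 0)
            return -1;
        return timeout;
    }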

4) The expert's opinion on future methodologies for creating large software products. Separately: what does the expert expect and what would he like to see from static code analysis tools?

Automated tools are developing and will keep developing. For example, there is currently not a single automatic system that selects tests based on the changes made - that is, a system that picks exactly those tests that need to be run for a specific change in the code.

As for the future of static analyzers, I think the number of situations they can handle will grow. Static analyzers will shift towards more complex analysis, and even towards guarantees that the code conforms, for example, to some protocol.

5) The expert's message to colleagues and readers.

Write quality code, test it, and use a variety of techniques, including static analyzers.

Interview with AlternativaPlatform


Speaker - Alexey Quiring, Technical Director, Tanki Online LLC

1) An overview of the company's main and most ambitious product/project: what the product does, what language it is written in, how many people work on it, the usual rate of changes to the project code (in lines or kilobytes of code, for example, per day/week/month), and which version control system is used.

At the moment we have one such product - the online game Tanki Online. The server side is written in Java, the client side in AS3 (ActionScript 3). We have about 20 programmers. About 5K lines are added per week. We use Git as the version control system.

2) Who analyzes the project code, and how? How is the testing cycle organized? How large is the team of testers? How does the company react when information about an error appears - is there an established protocol for such situations?

We have a typical Git-based process. All code passes mandatory code review. Continuous integration is also in place: the build server constantly checks the code and runs the tests.

Testing takes place in several stages: first automated tests, then the developers themselves test by hand (by playing), then a team of testers. If everything is OK, testers from the community are brought in, and only after that do the changes go into production. We have a small team of testers - three people - but we actively use testers from the community; we have several dozen volunteer helpers.

If an error still sneaks into production, it is corrected immediately after detection. Usually all such errors are fixed within a couple of days.

3) Are static code analysis tools used? If so, which ones? If so, could the expert give an example of the most remarkable and interesting error that the analyzers helped to find? What are the usual results and statistics from using analyzers? How often and on what schedule are the checks carried out? What is the scenario for responding to an error found by an analyzer?

At the company level, we do not use such tools. In the past, out of interest, I ran a couple of analysis tools (the JetBrains IDEA inspections), but they found nothing fatal.

I think static analysis is very good for complex languages like C and C++. But for simpler ones, such as Java, its relevance is not that high. In Java, memory problems are absent as a class, the syntax is simple and straightforward, ambiguities are not allowed, and the compiler checks many things at compile time. Development environments provide convenient refactoring tools, which eliminates accidental errors when changing the code by hand.

There is one area in which I would use a static analyzer for Java: checking a program for the correctness of multi-threaded execution. But at the moment there simply are no such tools. In general, if a static analyzer is of high quality and really finds errors, it is a useful thing for a project.

4) The expert's opinion on future methodologies for creating large software products. Separately: what does the expert expect and what would he like to see from static code analysis tools?

The future lies in automated testing, continuous integration and code analyzers. From static analysis, I expect analysis of multithreaded applications and analysis of the correctness of architectural solutions.

5) The expert's message to colleagues and readers.

Do not be afraid to introduce new technologies into the production cycle. Learn from more experienced programmers. Revisit your old decisions. And everything will work out.

Interview with NPO Echelon


Speaker - Andrei Fadin (a.fadin@cnpo.ru), Chief Designer, NPO Echelon

1) An overview of your company and its software-related activities.

NPO Echelon is both a developer of security analysis tools and an active user of these products in projects for certifying information security tools and in commercial code audits.

Our company develops a number of security analysis tools.


The “Echelon” center for security code analysis and penetration testing is an association of qualified specialists in the field of information technology and information security, created on the personnel, research, and engineering base of “Echelon” JSC and the country's leading technical university, Bauman MSTU.

We work with the most popular programming languages, such as PHP, Java, C#, C/C++, Perl, Python, and JavaScript, including their latest standards.

An audit of program code conducted by the specialists of NPO Echelon helps to solve the following tasks:
- quality control of internal and external (outsourced) code, and detection of typical defects (coding or design errors);
- identification of intentional backdoors (software implants) planted in the code;
- control of borrowed code (analysis of the software's external dependencies on open-source and other third-party components).

For audited software, certification against information security requirements is possible in the testing laboratory of NPO Echelon.

2) An overview of how your experts work (non-confidential information only): who analyzes the project code and how, how the testing cycle is organized, and what the usual protocol is when something important is detected in the code?

The team of code auditors consists of two main types of specialists:

The first type is experts from the testing laboratory of NPO Echelon who have experience in organizing interaction with the developers of large software projects (operating systems, firewalls), as well as in collaborative review of large amounts of code.

The second type is developers (employees of the Echelon research and development departments) who have strong technical competencies in various programming languages, their frameworks, and standard libraries. Whenever possible we try to involve the developers of static analysis tools directly in the code audit; this lets them assess the usability of our analysis tools first-hand, through their own experience. In addition, since developers are more skilled at creating new signatures for static analyzers, it makes sense to bring them in to update the defect base in a timely manner if the specifics of the software project under investigation require it.

In general, the development and testing process consists of the following stages:
1. Decomposition of the project code into components (when a third-party project is being analyzed).
2. Building a threat model: analyzing these components and their interaction interfaces for criticality from the information security point of view.
3. Running static and dynamic analysis tools based on the results of stage 2.
4. Selective review of the code, based on the results of stages 2 and 3.
5. Preparation of a report on the identified potentially dangerous constructs and discussion of the results with the software project's development team.
Stages 3, 4, and 5 are usually repeated 3-4 times, because the analysis of each potentially dangerous construct, as a rule, ends in one of two ways: either the software project is modified to eliminate the defect (which means repeating the stages starting from stage 3), or the flagged code fragment turns out to be an expert's erroneous assumption or a false positive of the static analyzer (which means repeating the stages starting from stage 4).

3) Information about static analysis tools: which static analyzers are used; an example of the most remarkable and interesting error that the analyzers helped to find; what the usual results and statistics from using the analyzers are; and what the scenario is for responding to an issue found in the code by an analyzer.

In their work, the auditors use our own tools (AK-BC2, AppChecker), open-source tools (Cppcheck, PMD), as well as purchased third-party commercial products (CppCat CAPC).

The response scenario was described in the answer to question 2. As for statistics on the use of analyzers: as a rule, the proportion of false positives on large projects exceeds 50%, so compiling the list of identified potentially dangerous constructs still requires an expert one way or another. However, since the expert reviews not the full volume of code but only its critical areas - on average no more than 5% of the total code - a significant saving of code analysis time is achieved.

In order not to violate non-disclosure agreements, we unfortunately cannot tell you about the errors found in specific products. But, in our experience, most of the interesting errors were related to:


4) The expert's opinion on future methods of creating software products, and, separately, what the expert expects and would like to see from static code analysis tools.

In our opinion, software verification will in the future be more closely linked to the software development processes, both within continuous integration (CI) systems and in the context of continuous delivery (CD).

Tight integration with these systems will in the future allow full control over the development and delivery of software; within these processes the static analyzer begins to play the role of a kind of IPS, blocking, at the level of commits and releases, code that does not meet the quality requirements (a quality gate). From this point of view, any CI/CD system is also an interesting source of events for security information and event management (SIEM) systems.

Introducing static analyzers into the model-driven development paradigm also has great prospects: tight integration with CASE tools will make it possible to check for errors at the syntax level, at the level of software components and their interfaces, and even at the level of business requirements - so that, for example, an analyst can, while the system is still at the design stage, justify to customers the need to add a particular role to access control.

5) The expert's message to colleagues and readers.

Dear colleagues, in the past decade the emphasis in enterprise information security was placed first of all on network security, as well as on host and workstation security (endpoint security).

However, when it comes to tasks such as identifying zero-day vulnerabilities or finding backdoors and “implants” (code fragments and configurations embedded in software for state or industrial espionage), we are confronted with the fact that protection at the network or host level (intrusion detection systems, anti-virus tools) is not an effective way to deal with these threats.

Addressing these issues requires an integrated approach: on the one hand, centralizing information security management in the enterprise (SIEM systems), and on the other hand, structurally decomposing software into components with control over their origin, as well as static analysis of the contents of the components and the materials from which they are produced (including source code).

Conclusion


The author thanks the press services and experts of the participating companies for their efficient work and the completeness of their answers to the interview questions. He also thanks “Program Verification LLC”, the developer of the modern static code analyzer PVS-Studio, which sponsored this article and without whose assistance it would hardly have seen the light of day.

Source: https://habr.com/ru/post/238341/

