
In this article I will explain how to use PVS-Studio for static analysis of program code in C / C ++ using the example of the open-source Wireshark project. I will start with a brief description of the Wireshark network traffic analyzer and the PVS-Studio product. I will describe the pitfalls of the assembly process and the preparation of the project for static analysis. I will try to form an overall picture of the PVS-Studio product, its advantages and usability, giving analyzer warnings, code samples and own comments.
Network traffic analyzer Wireshark
To demonstrate the capabilities of PVS-Studio, it was required to find a well-known, useful and interesting open-source project, the analysis of which no one has done before. I stopped at Wireshark, because He himself is not indifferent to him, and if you still do not know about him, then perhaps after reading this article, share my feelings for him.
The rapid development of the Internet and a large number of films about hacker programmers have long attracted my attention to computer networks. And now, it seems to me that any qualified system administrator and security programmer is required to understand network technologies.
')
Networks are based on the reception and transmission of data over certain protocols. In order to conduct research of network applications and protocols, as well as to find problems in the network, and, importantly, to find out the causes of these problems, you must use the tools for capturing and analyzing network traffic or sniffers.
Wireshark is a well-known tool for capturing and analyzing network traffic with a graphical user interface. The program is based on the library for capturing network traffic Pcap, and allows you to parse the packets of the vast majority of known network protocols, displaying the value of each protocol field at any level.
Wireshark is a cross-platform tool distributed under the free GNU GPL license, designed to work on Windows and Linux. The GTK + and Qt libraries are used to form the graphical interface.
Documentation and source code of the program can be found on the
site .
PVS-Studio static code analyzer
Static code analysis allows you to find bugs in the software without running applications and regardless of the working environment. Using static analysis, you can improve the quality of a software product, reduce development and testing time, and ensure security.
PVS-Studio is a static code analyzer C / C ++ / C ++ 11 that supports the compilers MS Visual C ++, GNU GCC (MinGW), Clang, Borland C ++.
PVS-Studio contains the following diagnostics:
- general purpose diagnostics;
- diagnostics of 64-bit errors;
- diagnostics of possible optimizations.
Additional information about PVS-Studio can be found on the
website .
Building the Wireshark project
For static analysis, download the source code of the latest stable version of Wireshark 1.12.4. The build will be carried out in the Windows 7 operating system for the Win64 platform using the compiler included in Visual Studio 2013. Additionally, install the Qt SDK 5.4.1 and WinPcap 4.1.3 libraries.
We will build from the command line using nmake. For proper operation of the build scripts, install Cygwin and Python 2.7.9.
Additional information about the assembly can be found on the
website .
Despite the fact that the assembly was carried out in full accordance with the instructions, a number of errors occurred. To eliminate them, it took:
- Write the path to Cygwin in the PATH environment variable to make the bash command shell accessible from the console.
- Disable ACL access control for NTFS in Cygwin to give the owner rights to write, read, and run files.
- Install the optional dos2unix package in Cygwin. Since for compilation u2d utility was required.
- It was necessary to copy the Makefile.nmake file from “asn1 \ hnbap” to “asn1 \ kerberos” to get the “clean” cleaning command for nmake to work.
Conducting static analysis with PVS-Studio tools
I have installed PVS-Studio 5.25 with a license, but for an initial acquaintance with the program, you can download and install a
demo version .
In introductory mode, only warnings of the first level are available; you can make only 50 code transitions after installation and 50 transitions after sending information about yourself. After 100 transitions you will need a license, the terms of which you can find on the site. Of course, these 100 transitions are not enough to use, and they are given for the first acquaintance with the program. If you want to get to know the analyzer more closely, you can
write in support and get a registration key for a few days.
Since the Wireshark project is being built using nmake from the command line, we will need a monitoring system, which is included in the PVS-Studio package. It keeps track of compiler launches and collects information about their environment: working directory, command line, full path to the compiled file, process environment variables.
To monitor, launch “Start \ PVS-Studio \ PVS-Studio Standalone”, select the menu item “Tools \ Analyze Your Files ...” and press the “Start Monitoring” button. Next, from the command line, run the build of the project “nmake -f Makefile.nmake all”, as described above. Check that the build was successful and complete monitoring by clicking on the “Stop monitoring” button.
Reserve patience, tk. then the static analysis will automatically start. After it is completed, save the plog file of the report in order not to run the assembly and static analysis several times.
Already at this stage in the PVS-Studio Standalone program, you can start searching for errors. But in order to use the advanced features of IntelliSense code navigation, I recommend opening our report in Microsoft Visual Studio.
To do this, perform a series of actions:
- Create an empty Visual C ++ project in the Wireshark source folder.
- In Solution Explorer go to the file display mode.
- Add source to our project.
- Open the plog file of the report using the plugin: “PVS-Studio \ Open Analysis Report”.
Well, here we come to the most interesting - the search for errors.
Finding errors in the Wireshark project
Let's start searching for errors by looking at the PVS-Studio warnings and using IntelliSense navigation.
From the very beginning I was attracted by the comments in the code:
void decode_ex_CosNaming_NamingContext_NotFound(....) { .... (void)item; .... item = proto_tree_add_uint(....); .... }
Probably, the Wireshark project is already regularly checked by the static Coverity analyzer, which is used in projects requiring high reliability. Such projects include: software for medical devices, for nuclear stations, aviation, and more recently for embedded systems. So it’s curious to find bugs that Coverity missed.
In order to form an overall picture of the capabilities of PVS-Studio, we will look for errors of various types that are difficult to detect due to indeterminate behavior during testing, requiring advanced knowledge of the C / C ++ language and just interesting. To do this, we will be sufficient warnings of the first and a quick view of the warnings of the second level.
Code:
typedef struct AIRPDCAP_SEC_ASSOCIATION { .... AIRPDCAP_KEY_ITEM *key; .... }; void AirPDcapWepMng(....,AIRPDCAP_KEY_ITEM* key, AIRPDCAP_SEC_ASSOCIATION *sa, ....) { .... memcpy(key, &sa->key, sizeof(AIRPDCAP_KEY_ITEM)); .... }
Warning:
V512 A call of the memcpy function will lead to the '& sa-> key' buffer becoming out of range. airpdcap.c 1192
C / C ++ languages ​​provide efficient low-level working with RAM due to the lack of built-in checks for array boundaries when reading and writing. Errors filling, copying and comparing memory buffers can lead to undefined program behavior or segmentation errors that are hard to detect.
To fill the 'AIRPDCAP_KEY_ITEM' structure located at the 'key' address, not the address 'sa-> key' to the same structure is used, but the address of the pointer to it. To correct this error, it is enough to remove the extra operation of obtaining the address '&'.
Code:
typedef struct _h323_calls_info { e_guid_t *guid; .... } h323_calls_info_t; static const e_guid_t guid_allzero = {0, 0, 0, { 0, 0, 0, 0, 0, 0, 0, 0 } }; void q931_calls_packet(....) { h323_calls_info_t *tmp2_h323info; .... memcmp(&tmp2_h323info->guid, &guid_allzero, 16) == 0; .... }
Warning:
V512 A call of the "memcmp" function will lead you to tmp2_h323info-> guid. voip_calls.c 1570
Another example with incorrect use of the buffer. In one of the arguments of the 'memcmp ()' function, a pointer is passed to a pointer to the 'e_guid_t' structure, instead of a pointer to it.
Code:
#define ETHERCAT_MBOX_HEADER_LEN ((int) sizeof(ETHERCAT_MBOX_HEADER)) void dissect_ecat_datagram(....) { if (len >= sizeof(ETHERCAT_MBOX_HEADER_LEN) && ....) { .... } }
Warning:
V568 It's an odd that it is the operator of the sizeof () operator is the '(int) sizeof (ETHERCAT_MBOX_HEADER)' expression. packet-ethercat-datagram.c 519
When working with memory in C ++, the 'sizeof ()' operator is used, which returns the size of the object or buffer in bytes. In our case, 'sizeof ()' will return the size of the 'int' type in bytes, instead of the size of the 'ETHERCAT_MBOX_HEADER' structure. To correct the error, remove the extra operation 'sizeof ()'.
Code:
void Proto_new(....) { .... if (!name[0] || !desc[0]) luaL_argerror(L,WSLUA_ARG_Proto_new_NAME, "must not be an empty string"); .... if ( name ) { .... loname_a = g_ascii_strdown(name, -1); .... } .... }
Warning:
V595 The 'name' of the pointer was used before it was verified against nullptr. Check lines: 1499, 1502. wslua_proto.c 1499
Usually, in order to show that the pointer does not refer to an object, a special zero value is written into it, and additional checks are made before using the pointer. Static analysis allows you to find the missing checks that can violate security, and redundant checks that are heavily cluttering the code.
Checking the 'name' pointer is done after using 'name [0]'. On the one hand, this check is redundant if the pointer is not zero, and on the other hand, if it is zero, an error will still occur.
Code:
void create_byte_graph(....) { .... u_data->assoc=(sctp_assoc_info_t*)g_malloc( sizeof(sctp_assoc_info_t)); u_data->assoc=userdata->assoc; .... }
Warning:
V519 The 'u_data-> assoc' variable is assigned values ​​twice successively. Perhaps this is a mistake. Check lines: 1526, 1527. sctp_byte_graph_dlg.c 1527
In C / C ++ languages, allocating and freeing memory is done manually. Failure to allocate allocated errors can lead to memory leaks.
The 'g_malloc ()' function allocates a section of dynamic memory of the size 'sizeof (sctp_assoc_info_t)' bytes and returns a pointer to it. But after changing the value of the variable storing this pointer, we will not be able to either access this section or release it, which will lead to a memory leak.
Code:
PacketList::PacketList(QWidget *parent) { QMenu *submenu; .... submenu = new QMenu(tr("Colorize with Filter")); submenu = new QMenu(tr("Copy")); ctx_menu_.addMenu(submenu); .... }
Warning:
V519 The 'submenu' variable is assigned values ​​twice successively. Perhaps this is a mistake. Check lines: 287, 363. packet_list.cpp 363
In the constructor, the visual interface elements are dynamically created and added to the Qt object hierarchy. This allows recursive destruction of created objects when deleting a top-level object. However, one of the menu items was never added to the object hierarchy, which would lead to a memory leak.
Code:
void dissect_display_switch(gint offset, guint msg_len, ....) { .... if((address_byte&DISPLAY_WRITE_ADDRESS_LINE_FLAG) !=DISPLAY_WRITE_ADDRESS_LINE_FLAG) offset+=1;msg_len-=1; .... }
Warning:
V640 The code's operational logic does not correspond with its formatting. The second statement will always be executed. It is possible that curly brackets are missing. packet-unistim.c 1134
Incorrect placement of curly brackets '{}' to select blocks of conditional 'if' statements can lead to errors.
The body of the conditional operator 'if' will consist of one instruction, although the formatting and logic of the program require several. To correct the error, you need to enclose a few instructions in curly brackets '{}'.
Code:
void dissect_ssc_readposition (....) { .... switch (service_action) { .... case LONG_FORM: if (!(flags & MPU)) { .... } else break; .... } .... }
Warning:
V705 It is possible that 'else' block was forgotten or commented out, thus altering the program's operation logics. packet-scsi-ssc.c 831
It's funny, but just one comment can lead to a change in the logic of the program. Exit from the 'case LONG_FORM' block will be executed only when 'else' is triggered, which will inevitably lead to an error.
Code:
void set_has_console(gboolean set_has_console) { has_console = has_console; }
Warning:
V570 The 'has_console' variable is assigned to itself. console_win32.c 235
Also errors due to inattention. The 'set_has_console ()' function is supposed to change the value of 'has_console' to 'set_has_console', but this does not happen. To correct the error, you need to assign a value to the 'has_console' variable that is passed using the 'set_has_console' argument.
Code:
void dissect_dcc(tvbuff_t *tvb, packet_info *pinfo, proto_tree *tree, void *data _U_) { client_is_le = ( (tvb_get_guint8(tvb, offset+4) | tvb_get_guint8(tvb, offset+4)) &&(tvb_get_guint8(tvb, offset+8) | tvb_get_guint8(tvb, offset+9)) && (tvb_get_guint8(tvb, offset+12) | tvb_get_guint8(tvb, offset+13)) ); }
Warning:
V501 There are identical sub-expressions 'tvb_get_guint8 (tvb, offset + 4)' to the left operator. packet-dcc.c 272
The expression tvb_get_guint8 (tvb, offset + 4) is repeated. By analogy, we can assume that we planned to write tvb_get_guint8 (tvb, offset + 5).
There were other mistakes that I did not write about so as not to clutter up the article. The given examples should be enough to show the possibilities of static analysis and to draw people's attention to PVS-Studio. If you need to get a complete picture of the capabilities of PVS-Studio, you can find a
list of all possible warnings on the site. A more thorough analysis of Wireshark can be carried out by the developers themselves. It will be much easier for them to understand whether something is a mistake or not.
Conclusion
There were not many suspicious code sections. This is probably due to the use of the Coverity static analyzer, comments on which we have seen. Therefore, I recommend everyone to regularly use static analyzers in their projects in order to detect errors even before testing at the stage of writing code.
I wish you all success in programming and fewer errors.
If you want to share this article with an English-speaking audience, then please use the link to the translation: Andrey Kalashnikov.
Static Analysis of Wireshark by PVS-Studio .