⬆️ ⬇️

GNU Arm Embedded Toolchain appeared in PVS-Studio

GNU Arm Embedded Toolchain + PVS-Studio


Embedded systems have long been firmly established in our lives. The requirements for their stability and reliability are very high, and error correction is expensive. Therefore, for embedded developers it is especially important to regularly use specialized tools to ensure the quality of the source code. This article will tell about the appearance of support for GNU Arm Embedded Toolchain in the PVS-Studio analyzer and code defects found in the Mbed OS project.



Introduction



The PVS-Studio analyzer already supports several commercial compilers for embedded systems, for example:





Now another developer tool is added to the support - GNU Embedded Toolchain.



GNU Embedded Toolchain is a collection of compilers from Arm, based on the GNU Compiler Collection. The first official release took place in 2012, and since then the project has been developing along with the GCC.

')

The main purpose of GNU Embedded Toolchain is to generate code that runs on bare metal (bare metal), that is, directly on a processor without an interlayer in the form of an operating system. The package includes compilers for C and C ++, assembler, the GNU Binutils toolkit and the Newlib library. The source code of all components is fully open and distributed under the GNU GPL license. From the official site you can download versions for Windows, Linux and macOS.



Mbed os



To test the analyzer, you need as much source code as possible. Usually there are no problems with this, but when we are dealing with embedded development aimed primarily at devices included in IoT, it can be difficult to find a sufficient number of large projects. Fortunately, this problem was solved with the help of specialized operating systems, the source code of which is in most cases open. Then we will discuss one of them.



Med OS + PVS-Studio




Although the main purpose of the article is to talk about support for the GNU Embedded Toolchain, it’s hard to write a lot about it. Moreover, the readers of our articles are certainly waiting for the description of some interesting errors. Well, let's not deceive their expectations and run the analyzer on the Mbed OS project. This is an open source operating system that is being developed with Arm.



Official website: https://www.mbed.com/



Source code: https://github.com/ARMmbed/mbed-os



The choice on Mbed OS was not accidental, this is how the authors describe the project:



It is an open source operating system for the Internet of Things. It includes a microcontroller, including security, connectivity, an I / O devices.



This is an ideal build project using the GNU Embedded Toolchain, especially considering Arm’s participation in its development. I’ll just say that I didn’t have the goal of finding and showing as many errors as possible in a particular project, so the results of the review are briefly reviewed.



Errors



During the verification of the Mbed OS code, the PVS-Studio analyzer issued 693 warnings, 86 of them with high priority. I will not consider them all in detail, especially since many of them are repeated or do not represent much interest. For example, the analyzer issued many warnings V547 (Expression is always true / false) related to the same type of code fragments. The analyzer can be configured to significantly reduce the number of false and uninteresting positives, but there was no such task when writing an article. Those interested can see an example of such a configuration described in the article " Characteristics of the PVS-Studio analyzer on the example of EFL Core Libraries, 10-15% of false positives ."



For the article, I selected several interesting errors to demonstrate the operation of the analyzer.



Memory leaks



Let's start with a common class of errors in C and C ++ - memory leaks.



Analyzer Warning: V773 CWE-401 The function was exited without reading the 'read_buf' pointer. A memory leak is possible. cfstore_test.c 565



int32_t cfstore_test_init_1(void) { .... read_buf = (char*) malloc(max_len); if(read_buf == NULL) { CFSTORE_ERRLOG(....); return ret; } .... while(node->key_name != NULL) { .... ret = drv->Create(....); if(ret < ARM_DRIVER_OK){ CFSTORE_ERRLOG(....); return ret; // <= } .... free(read_buf); return ret; } 


The classic situation when working with dynamic memory. The buffer allocated with malloc is used only inside the function and is released before exiting. The problem is that this does not happen if the function stops working ahead of time. Note the same code in the if blocks. Most likely, the author copied the top fragment and forgot to add the free call.



Another example, similar to the previous one.



Analyzer Warning: V773 CWE-401 The function of the interface was pointer. A memory leak is possible. nanostackemacinterface.cpp 204



 nsapi_error_t Nanostack::add_ethernet_interface( EMAC &emac, bool default_if, Nanostack::EthernetInterface **interface_out, const uint8_t *mac_addr) { .... Nanostack::EthernetInterface *interface; interface = new (nothrow) Nanostack::EthernetInterface(*single_phy); if (!interface) { return NSAPI_ERROR_NO_MEMORY; } nsapi_error_t err = interface->initialize(); if (err) { return err; // <= } *interface_out = interface; return NSAPI_ERROR_OK; } 


A pointer to the allocated memory is returned via the output parameter, but only if the call to initialize is successful, and if an error occurs, a leak occurs because the local interface variable goes out of scope, and the pointer is simply lost. Here, you should either call delete , or at least return the address stored in the interface variable in any case, so that the calling code can take care of this.



Memset



Using the memset function often leads to errors, examples of related problems can be found in the article “ The most dangerous function in the C / C ++ world ”.



Consider the following analyzer warning:



V575 CWE-628 The 'memset' function processes '0' elements. Inspect the third argument. mbed_error.c 282



 mbed_error_status_t mbed_clear_all_errors(void) { .... //Clear the error and context capturing buffer memset(&last_error_ctx, sizeof(mbed_error_ctx), 0); //reset error count to 0 error_count = 0; .... } 


The programmer intended to reset the memory occupied by the last_error_ctx structure, but mixed up the second and third arguments. As a result, 0 bytes is filled with the sizeof (mbed_error_ctx) value.



Exactly the same error is present a hundred lines above:



V575 CWE-628 The 'memset' function processes '0' elements. Inspect the third argument. mbed_error.c 123



Unconditional 'return' statement in a loop



Analyzer Warning: V612 CWE-670 An unconditional 'return' within a loop. thread_network_data_storage.c 2348



 bool thread_nd_service_anycast_address_mapping_from_network_data ( thread_network_data_cache_entry_t *networkDataList, uint16_t *rlocAddress, uint8_t S_id) { ns_list_foreach(thread_network_data_service_cache_entry_t, curService, &networkDataList->service_list) { // Go through all services if (curService->S_id != S_id) { continue; } ns_list_foreach(thread_network_data_service_server_entry_t, curServiceServer, &curService->server_list) { *rlocAddress = curServiceServer->router_id; return true; // <= } } return false; } 


In this snippet, ns_list_foreach is a macro that expands into a for statement. The inner loop performs no more than one iteration because of the call to return immediately after the line in which the output parameter of the function is initialized. Perhaps this code works as intended, but using the inner loop looks rather strange in this context. Most likely, the initialization of rlocAddress and exit from the function should be performed according to the condition, or you can get rid of the internal loop.



Errors in conditions



As I said above, the analyzer issued a fairly large number of uninteresting warnings from the V547 , so I studied them briefly and wrote out only two cases for the article.



V547 CWE-570 Expression 'pcb-> state == LISTEN' is always false. lwip_tcp.c 689



 enum tcp_state { CLOSED = 0, LISTEN = 1, .... }; struct tcp_pcb * tcp_listen_with_backlog_and_err(struct tcp_pcb *pcb, u8_t backlog, err_t *err) { .... LWIP_ERROR("tcp_listen: pcb already connected", pcb->state == CLOSED, res = ERR_CLSD; goto done); /* already listening? */ if (pcb->state == LISTEN) { // <= lpcb = (struct tcp_pcb_listen*)pcb; res = ERR_ALREADY; goto done; } .... } 


The analyzer considers that the condition pcb-> state == LISTEN is always false, let's see why.



Before the if statement , the macro LWIP_ERROR is used , which, according to the logic of its work, resembles assert . His ad looks like this:



 #define LWIP_ERROR(message, expression, handler) do { if (!(expression)) { \ LWIP_PLATFORM_ERROR(message); handler;}} while(0) 


If the condition is false, the macro reports an error and executes the code passed through the handler parameter, in this code fragment it is an unconditional jump using goto .



In this example, the condition 'pcb-> state == CLOSED' is checked, that is, a transition to the done label occurs when pcb-> state has any other value. The if statement following the LWIP_ERROR call checks pcb-> state for LISTEN equality, but this condition is never met, because the state in this line can only contain the CLOSED value.



Consider another warning related to the conditions: V517 CWE-570 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a possibility of logical error presence. Check lines: 62, 65. libdhcpv6_server.c 62



 static void libdhcpv6_address_generate(....) { .... if (entry->linkType == DHCPV6_DUID_HARDWARE_EUI64_TYPE) // <= { memcpy(ptr, entry->linkId, 8); *ptr ^= 2; } else if (entry->linkType == DHCPV6_DUID_HARDWARE_EUI64_TYPE)// <= { *ptr++ = entry->linkId[0] ^ 2; *ptr++ = entry->linkId[1]; .... } } 


Here, if and else if check the same condition, as a result of which the code in the body of else if is never executed. Such errors often occur when writing code using the ' copy-paste ' method.



Ownerless expression



Finally, let's look at a funny code snippet.



Analyzer warning: V607 Ownerless expression '& discover_response_tlv'. thread_discovery.c 562



 static int thread_discovery_response_send( thread_discovery_class_t *class, thread_discovery_response_msg_t *msg_buffers) { .... thread_extension_discover_response_tlv_write( &discover_response_tlv, class->version, linkConfiguration->securityPolicy); .... } 


And now let's take a look at the thread_extension_discover_response_tlv_write macro declaration :



 #define thread_extension_discover_response_tlv_write \ ( data, version, extension_bit)\ (data) 


The macro is expanded into the data argument, that is, its call inside the thread_discovery_response_send function after preprocessing turns into the expression (& discover_response_tlv) .



Wait what




I have no comments. Probably, this is not a mistake, but such a code always introduces me to a state similar to the image in the picture :).



Conclusion



The list of compilers supported in PVS-Studio has increased. If you have a project that is designed to be built using the GNU Arm Embedded Toolchain, I suggest you try checking it out using our analyzer. Download the demo version here . Also pay attention to the free license option, which is suitable for some smaller projects.







If you want to share this article with an English-speaking audience, then please use the link to the translation: Yuri Minaev. PVS-Studio Now Supports GNU Arm Embedded Toolchain .

Source: https://habr.com/ru/post/427447/



All Articles