📜 ⬆️ ⬇️

We are testing the Linux version of PVS-Studio on the Linux Kernel



Since the release of the public Linux version of PVS-Studio, the article about re-checking the Linux kernel has remained only a matter of time. The project, which is written by professionals from around the world, which is used by a large number of people in various fields, which is regularly checked and tested with various tools - checking such a project will be a serious test for any static analyzer. What errors did PVS-Studio manage to find in such conditions?

How to check


We have already tested the Linux kernel. Since then, much has changed - now OS testing is as simple and convenient as checking any other project:

pvs-studio-analyzer trace - make pvs-studio-analyzer analyze -o /path/to/report.log -j8 

It took just a few months to adapt and test the analyzer in Linux, which until recently was available only for Windows. This time it was much easier to check the kernel.
')
Version 4.9-rc4 (commit bc33b0ca11e3df467777a4fa7639ba488c9d4911) was taken for verification.

For writing an article, only general-purpose diagnostics of 1-2 levels were viewed. It is worth noting the high quality of the code - the density of warnings indicating real flaws is extremely low.

For the article, I chose those warnings that are most likely to indicate real errors / typos. It should be understood that in addition to useful warnings, the analyzer also produces false positives or detects a badly designed or “foul-smelling” code, which, nevertheless, is not truly erroneous.

Unfortunately, the number of false positives under Linux is more than we would like. I think this is connected with the youth version of the analyzer designed for Linux systems. We have done and continue to do a great job of minimizing the number of false positives. The Linux code helped us to become better and make many useful changes to the analyzer, and now I would like to reciprocate.

Typos


The most common category of errors associated with the usual typos and Copy-Paste . If you have read our articles before, I think you have already seen this. They appear in any projects on any operating systems in any languages. Such errors reveal the potential of a static analyzer: it is much more difficult to find them with other tools. Let's see how things are with them in the Linux kernel:

PVS-Studio warning : V581 The conditional expressions of the 'if' are agreed alongside each other are identical. Check lines: 2384, 2390. debug.c 2390

 int dbg_check_nondata_nodes_order(....) { .... sa = container_of(cur, struct ubifs_scan_node, list); sb = container_of(cur->next, struct ubifs_scan_node, list); if (sa->type != UBIFS_INO_NODE && sa->type != UBIFS_DENT_NODE && sa->type != UBIFS_XENT_NODE) { ubifs_err(c, "bad node type %d", sa->type); ubifs_dump_node(c, sa->node); return -EINVAL; } if (sa->type != UBIFS_INO_NODE && sa->type != UBIFS_DENT_NODE && sa->type != UBIFS_XENT_NODE) { ubifs_err(c, "bad node type %d", sb->type); ubifs_dump_node(c, sb->node); return -EINVAL; } .... } 

The analyzer complains about two identical conditions in a row: apparently, in the second they forgot to change sa to sb . Well, who then will say that they are not copying in the coolest projects?

PVS-Studio warning : V666 Consider the third argument of the function 'strncmp'. It is possible that it’s not passed on. spectral.c 341

 static ssize_t write_file_spec_scan_ctl(struct file *file, const char __user *user_buf, size_t count, loff_t *ppos) { struct ath10k *ar = file->private_data; char buf[32]; ssize_t len; int res; len = min(count, sizeof(buf) - 1); if (copy_from_user(buf, user_buf, len)) return -EFAULT; buf[len] = '\0'; mutex_lock(&ar->conf_mutex); if (strncmp("trigger", buf, 7) == 0) { .... } else if (strncmp("background", buf, 9) == 0) { res = ath10k_spectral_scan_config(ar, SPECTRAL_BACKGROUND); } else if (strncmp("manual", buf, 6) == 0) { res = ath10k_spectral_scan_config(ar, SPECTRAL_MANUAL); } else if (strncmp("disable", buf, 7) == 0) { res = ath10k_spectral_scan_config(ar, SPECTRAL_DISABLED); } else { res = -EINVAL; } mutex_unlock(&ar->conf_mutex); if (res < 0) return res; return count; } 

The classic type of error: two arguments must be passed to the function: a pointer to a string and its length. Often, when the argument is a literal, the length of the count is lazy and write just a number. The human factor enters into the business: they are often mistaken.

See, the code has several strncmp in a row. A literal is passed to each of them. And in strncmp (“background”, buf, 9) the length was calculated incorrectly: the word “background” consists of 10, not 9 characters.

PVS-Studio warning : V666 Consider the third argument of the function 'memcpy'. This is not the case. dpt_i2o.c 403

 static void adpt_inquiry(adpt_hba* pHba) { .... memset(pHba->detail, 0, sizeof(pHba->detail)); memcpy(&(pHba->detail), "Vendor: Adaptec ", 16); memcpy(&(pHba->detail[16]), " Model: ", 8); memcpy(&(pHba->detail[24]), (u8*) &buf[16], 16); memcpy(&(pHba->detail[40]), " FW: ", 4); // <= memcpy(&(pHba->detail[44]), (u8*) &buf[32], 4); pHba->detail[48] = '\0'; /* precautionary */ .... } 

One more example. The length of the string "FW:" is 5, and not at all 4 characters.

How to get rid of such an error? In C, you can use macros like:

 #define str_len(S) (sizeof(S) / sizeof((S)[0])) 

But using such macros is dangerous in itself: it is better to add compiler-specific checks, of course, that the argument passed is really an array.

For readers writing in C ++, I can recommend std :: string_view, which finally appeared in C ++ 17. It is better not to pass a pointer-length pair to a function. But if you need to manually calculate the size of the array (for example, to pass it to the memcpy function), then you can use std :: size (array) or its equivalent: for literals, the size will be calculated in compile-time.

Avoid repeating the code and do not be lazy to use language tools (be it macros or templates) for compile time calculations!

PVS-Studio warning : V653 It is possible that a comma is missing. Consider inspecting this literal: "30min" "No timeout". lp8788-charger.c 657

 static ssize_t lp8788_show_eoc_time(struct device *dev, struct device_attribute *attr, char *buf) { struct lp8788_charger *pchg = dev_get_drvdata(dev); char *stime[] = { "400ms", "5min", "10min", "15min", "20min", "25min", "30min" "No timeout" }; .... } 

As is known, two consecutive literals are concatenated. This makes it convenient to use them, for example, in macros. The danger arises when we write an array of such literals: you can skip a comma and get an unexpected result.

In this case, the last two literals “stick together” and you get “30minNo timeout”. This is a double mistake. Firstly, the text is incorrect, and secondly, one element will be missed in the array, which can lead to an exit beyond the array boundary.

I advise you to use a different formatting method, such an error will become noticeable in it:

 char *stime[] = { "400ms" , "5min" , "10min" , "15min" , "20min" , "25min" , "30min" "No timeout" }; 

My colleague Andrey Karpov wrote in more detail about this method of tabular code design. I suggest to get acquainted with chapter N13 from his small book .

PVS-Studio warning : V764 Possible incorrect order of arguments passed to 'ahc_9005_subdevinfo_valid' function: 'device' and 'vendor'. aic7xxx_pci.c 695

 const struct ahc_pci_identity * ahc_find_pci_device(ahc_dev_softc_t pci) { .... if (ahc_get_pci_function(pci) > 0 && ahc_9005_subdevinfo_valid(device, vendor, // <= subdevice, subvendor) && SUBID_9005_MFUNCENB(subdevice) == 0) return (NULL); .... } 

Sometimes it is difficult to understand what the analyzer scolds for. By the way, this often happens: the person did not understand what the analyzer wrote to him, sent us a report with a “false positive”, and there really is a mistake. So it seemed to me here that this is a false positive: the function is defined a little higher in the code and there all the parameters are in place. Here's what she looks like:

 static int ahc_9005_subdevinfo_valid(uint16_t device, uint16_t vendor, uint16_t subdevice, uint16_t subvendor) { .... } 

What's the matter? It turns out that even higher is the declaration of this function, and that is where these arguments are confused. In fact, there is nothing wrong with the logic of the program, but it is better to fix it, so as not to embarrass anyone and not confuse.

 static int ahc_9005_subdevinfo_valid(uint16_t vendor, uint16_t device, uint16_t subvendor, uint16_t subdevice); 



But the funny thing is, the error was already here: the parameters were really confused, just forgot to correct the ad. It's good that the analyzer also found this place.

PVS-Studio warning : V549 The first argument of the memcpy function is equal to the second argument. wilc_wfi_cfgoperations.c 1345

 static int del_pmksa(struct wiphy *wiphy, struct net_device *netdev, struct cfg80211_pmksa *pmksa) { .... for (; i < (priv->pmkid_list.numpmkid - 1); i++) { memcpy(priv->pmkid_list.pmkidlist[i].bssid, priv->pmkid_list.pmkidlist[i + 1].bssid, ETH_ALEN); memcpy(priv->pmkid_list.pmkidlist[i].pmkid, priv->pmkid_list.pmkidlist[i].pmkid, PMKID_LEN); } .... } 

In the last memcpy, the pointers match. Perhaps they wanted to write by analogy with the previous expression:

 memcpy(priv->pmkid_list.pmkidlist[i].pmkid, priv->pmkid_list.pmkidlist[i + 1].pmkid, PMKID_LEN); 

Unused variables


PVS-Studio warning : V575 The 'strncasecmp' function processes '0' elements. Inspect the third argument. linux_wlan.c 1121

 static int mac_ioctl(struct net_device *ndev, struct ifreq *req, int cmd) { u8 *buff = NULL; s8 rssi; u32 size = 0, length = 0; struct wilc_vif *vif; s32 ret = 0; struct wilc *wilc; vif = netdev_priv(ndev); wilc = vif->wilc; if (!wilc->initialized) return 0; switch (cmd) { case SIOCSIWPRIV: { struct iwreq *wrq = (struct iwreq *)req; size = wrq->u.data.length; if (size && wrq->u.data.pointer) { buff = memdup_user(wrq->u.data.pointer, wrq->u.data.length); if (IS_ERR(buff)) return PTR_ERR(buff); if (strncasecmp(buff, "RSSI", length) == 0) { // <= .... } } } .... } done: kfree(buff); return ret; } 

0 was passed to the strncasecmp function as a length argument. There is no place in the code where the length variable would change, so its value will remain zero. Probably, it was necessary to use size .

PVS-Studio warning : V751 Parameter 'LCDheight' is not used inside function body. init.c 339

 static unsigned short SiS_GetModeID(int VGAEngine, unsigned int VBFlags, int HDisplay, int VDisplay, int Depth, bool FSTN, int LCDwidth, int LCDheight) { unsigned short ModeIndex = 0; switch(HDisplay) { case 320: if(VDisplay == 200) ModeIndex = ModeIndex_320x200[Depth]; else if(VDisplay == 240) { if((VBFlags & CRT2_LCD) && (FSTN)) ModeIndex = ModeIndex_320x240_FSTN[Depth]; else ModeIndex = ModeIndex_320x240[Depth]; } break; case 400: if((!(VBFlags & CRT1_LCDA)) || ((LCDwidth >= 800) && (LCDwidth >= 600))) { // <= if(VDisplay == 300) ModeIndex = ModeIndex_400x300[Depth]; } break; case 512: if((!(VBFlags & CRT1_LCDA)) || ((LCDwidth >= 1024) && (LCDwidth >= 768))) { // <= if(VDisplay == 384) ModeIndex = ModeIndex_512x384[Depth]; } break; .... } return ModeIndex; } 

Parameters that are not always used in a function are an error. In fairly old APIs, situations arise when the parameter is no longer needed and is overwritten or simply not used. But take a closer look at this fragment: here they forgot to compare the height. Instead, comparisons of the form '(A> 5) && (A> 3)' appeared , which are redundant in themselves.

Prioritization of operations confusion


PVS-Studio warning : V502 Perhaps the '?:' Operator works in a different way. The '?:' Operator has a lower than the '|' operator. core.c 1046

 static int nvme_pr_preempt(struct block_device *bdev, u64 old, u64 new, enum pr_type type, bool abort) { u32 cdw10 = nvme_pr_type(type) << 8 | abort ? 2 : 1; return nvme_pr_command(bdev, cdw10, old, new, nvme_cmd_resv_acquire); } 

A ternary operator in C is a very dangerous operator. Not just one of the first general-purpose diagnostics in PVS-Studio is dedicated to it. The fact is that it has a low priority, and in complex expressions it is easy to get confused and get a completely different order of calculations. Therefore it is better, when in doubt, to use brackets.

Suspicious checks


PVS-Studio warning : V517 The use of if (A) {...} else if (A) {...} 'pattern was detected. There is a possibility of logical error presence. Check lines: 375, 377. trx.c 375

 bool rtl92ee_rx_query_desc(struct ieee80211_hw *hw, struct rtl_stats *status, struct ieee80211_rx_status *rx_status, u8 *pdesc, struct sk_buff *skb) { struct rtl_priv *rtlpriv = rtl_priv(hw); struct rx_fwinfo *p_drvinfo; struct ieee80211_hdr *hdr; u32 phystatus = GET_RX_DESC_PHYST(pdesc); .... status->macid = GET_RX_DESC_MACID(pdesc); if (GET_RX_STATUS_DESC_MAGIC_MATCH(pdesc)) status->wake_match = BIT(2); else if (GET_RX_STATUS_DESC_MAGIC_MATCH(pdesc)) status->wake_match = BIT(1); else if (GET_RX_STATUS_DESC_UNICAST_MATCH(pdesc)) status->wake_match = BIT(0); else status->wake_match = 0; .... } 

From the side it is difficult to understand what is wrong here. The same test with the GET_RX_STATUS_DESC_MAGIC_MATCH macro takes place twice . If we look at its announcement, we will see two other macros:

 #define GET_RX_STATUS_DESC_PATTERN_MATCH(__pdesc) LE_BITS_TO_4BYTE(__pdesc+12, 29, 1) #define GET_RX_STATUS_DESC_UNICAST_MATCH(__pdesc) LE_BITS_TO_4BYTE(__pdesc+12, 30, 1) #define GET_RX_STATUS_DESC_MAGIC_MATCH(__pdesc) LE_BITS_TO_4BYTE(__pdesc+12, 31, 1) 

Perhaps they wanted to use the GET_RX_STATUS_DESC_PATTERN_MATCH missing in the source fragment. Otherwise, this test is simply meaningless.

PVS-Studio warning : V547 Expression '(ptr [3] & 0x1E)! = 0x03' is always true. sd.c 4115

 int ext_sd_send_cmd_get_rsp(struct rtsx_chip *chip, u8 cmd_idx, u32 arg, u8 rsp_type, u8 *rsp, int rsp_len, bool special_check) { int retval; int timeout = 100; u16 reg_addr; u8 *ptr; .... if (cmd_idx == SELECT_CARD) { if (rsp_type == SD_RSP_TYPE_R2) { if ((ptr[3] & 0x1E) != 0x04) { rtsx_trace(chip); return STATUS_FAIL; } } else if (rsp_type == SD_RSP_TYPE_R0) { if ((ptr[3] & 0x1E) != 0x03) { // <= rtsx_trace(chip); return STATUS_FAIL; } } } .... } 

The error is related to bit operations. The result of a bitwise conjunction with 0x1E due to one bit will never be equal to the value 0x03 :



PVS-Studio warning : V517 The use of if (A) {...} else if (A) {...} 'pattern was detected. There is a possibility of logical error presence. Check lines: 1277, 1282. ks_wlan_net.c 1277

 static int ks_wlan_set_power(struct net_device *dev, struct iw_request_info *info, struct iw_param *vwrq, char *extra) { struct ks_wlan_private *priv = (struct ks_wlan_private *)netdev_priv(dev); short enabled; if (priv->sleep_mode == SLP_SLEEP) { return -EPERM; } /* for SLEEP MODE */ enabled = vwrq->disabled ? 0 : 1; if (enabled == 0) { /* 0 */ priv->reg.powermgt = POWMGT_ACTIVE_MODE; } else if (enabled) { /* 1 */ if (priv->reg.operation_mode == MODE_INFRASTRUCTURE) priv->reg.powermgt = POWMGT_SAVE1_MODE; else return -EINVAL; } else if (enabled) { /* 2 */ if (priv->reg.operation_mode == MODE_INFRASTRUCTURE) priv->reg.powermgt = POWMGT_SAVE2_MODE; else return -EINVAL; } else return -EINVAL; hostif_sme_enqueue(priv, SME_POW_MNGMT_REQUEST); return 0; } 

Reduce the example to:

 enabled = vwrq->disabled ? 0 : 1; if (enabled == 0) { /* 0 */ .... } else if (enabled) { /* 1 */ .... } else if (enabled) { /* 2 */ .... } else .... 

This code looks very strange. It seems that the range of values ​​is clearly discussed by the expression above: enabled is 0 or 1 . But checked as many as 4 values. At the same time, comments only interfere: if the digits should have indicated the possible value of the variable, now they do not correspond to reality: the checks for 1 and 2 are written in the same way.

PVS-Studio warning : V517 The use of if (A) {...} else if (A) {...} 'pattern was detected. There is a possibility of logical error presence. Check lines: 422, 424. Hal8188ERateAdaptive.c 422

 static int odm_ARFBRefresh_8188E( struct odm_dm_struct *dm_odm, struct odm_ra_info *pRaInfo) { /* Wilson 2011/10/26 */ .... if (pRaInfo->HighestRate > 0x13) pRaInfo->PTModeSS = 3; else if (pRaInfo->HighestRate > 0x0b) pRaInfo->PTModeSS = 2; else if (pRaInfo->HighestRate > 0x0b) pRaInfo->PTModeSS = 1; else pRaInfo->PTModeSS = 0; .... return 0; } 

Another place where two conditions go in a row. Note that the bodies are different. It is difficult to say if there is a real mistake here, or is it just unused code: this is the task of the project developers. The task of the analyzer to pay attention to a suspicious place.

PVS-Studio warning : V734 An excessive check. “Interleaver” and “deinterleaver”. sst-atom-controls.c 1449

 static int sst_fill_widget_module_info( struct snd_soc_dapm_widget *w, struct snd_soc_platform *platform) { struct snd_kcontrol *kctl; int index, ret = 0; struct snd_card *card = platform->component.card->snd_card; char *idx; down_read(&card->controls_rwsem); list_for_each_entry(kctl, &card->controls, list) { .... } else if (strstr(kctl->id.name, "interleaver")) { struct sst_enum *e = (void *)kctl->private_value; e->w = w; } else if (strstr(kctl->id.name, "deinterleaver")) { struct sst_enum *e = (void *)kctl->private_value; e->w = w; } .... } up_read(&card->controls_rwsem); return 0; } 

In this fragment, the presence of several substrings in one string is sequentially checked. For clarity, I left only the substrings of interest to us. Suppose that we did not find an interleaver - then there is no point in looking for deinterleaver , because the substring interleaver is definitely not. Therefore, this piece of code will never work, but since the bodies of if and else are the same, this is not a problem. This is just a redundant code.

PVS-Studio warning : V547 Expression 'block' is always true. svclock.c 873

 void nlmsvc_grant_reply(struct nlm_cookie *cookie, __be32 status) { struct nlm_block *block; dprintk("grant_reply: looking for cookie %x, s=%d \n", *(unsigned int *)(cookie->data), status); if (!(block = nlmsvc_find_block(cookie))) return; if (block) { if (status == nlm_lck_denied_grace_period) { /* Try again in a couple of seconds */ nlmsvc_insert_block(block, 10 * HZ); } else { /* Lock is now held by client, or has been rejected. * In both cases, the block should be removed. */ nlmsvc_unlink_block(block); } } nlmsvc_release_block(block); } 

This example demonstrates why it is not enough for the static analyzer to perform a Pattern-based analysis, bypassing the AST . It is important to be able to perform Control flow analysis and Data flow analysis. At the moment when block == NULL occurs return , respectively, further down the code, we can definitely say that the pointer is nonzero. And when we meet the check on NULL , we understand for certain that something is wrong here.

It seems that the second pointer check here is just superfluous. However, suddenly they wanted to check another variable here? Who knows ... This code is obviously left to the developer for verification.

Similar situation:

PVS-Studio warning : V547 Expression 'sym' is always true. menu.c 498

 bool menu_is_visible(struct menu *menu) { struct menu *child; struct symbol *sym; .... if (!sym || sym_get_tristate_value(menu->sym) == no) // <= return false; for (child = menu->list; child; child = child->next) { if (menu_is_visible(child)) { if (sym) // <= sym->flags |= SYMBOL_DEF_USER; return true; } } return false; } 

Macro error


PVS-Studio warning : V733 Check expression: request-> rq_timeout + 5 * 1000. niobuf.c 637

 #define CFS_FAIL_TIMEOUT(id, secs) \ cfs_fail_timeout_set(id, 0, secs * 1000, CFS_FAIL_LOC_NOSET) #define OBD_FAIL_TIMEOUT(id, secs) \ CFS_FAIL_TIMEOUT(id, secs) int ptl_send_rpc(struct ptlrpc_request *request, int noreply) { .... OBD_FAIL_TIMEOUT(OBD_FAIL_PTLRPC_DELAY_SEND, request->rq_timeout + 5); .... } 

But such errors are very rare. Before that, I saw only one triggering of this diagnostic in a real project: noteworthy that it was FreeBSD . An error was made in the definition of a macro: it is best to surround all its parameters with parentheses. If you do not do this, then the following case is possible: if you substitute 'x + 5' into 'secs * 1000', you get 'x + 5 * 1000', and this is clearly not what the author expected.

Mindless memset


PVS-Studio warning : V597 The compiler could delete the memset function call, which is used to flush the 'ps' buffer. The memset_s () function should be used to erase the private data. atom.c 1383

 int amdgpu_atom_asic_init(struct atom_context *ctx) { int hwi = CU16(ctx->data_table + ATOM_DATA_FWI_PTR); uint32_t ps[16]; int ret; memset(ps, 0, 64); ps[0] = cpu_to_le32(CU32(hwi + ATOM_FWI_DEFSCLK_PTR)); ps[1] = cpu_to_le32(CU32(hwi + ATOM_FWI_DEFMCLK_PTR)); if (!ps[0] || !ps[1]) return 1; if (!CU16(ctx->cmd_table + 4 + 2 * ATOM_CMD_INIT)) return 1; ret = amdgpu_atom_execute_table(ctx, ATOM_CMD_INIT, ps); if (ret) return ret; memset(ps, 0, 64); // <= return ret; } 

It makes no sense to add a memset before return : the compiler, seeing that this operation does not change the visible state of the program (the array goes out of scope anyway) removes it. If you need to erase some important data, then you should use memset_s or write your own analogue.

This error by the way is actually a vulnerability. Data that should be erased is not overwritten. Details can be found in the description of diagnostics V597 . By the way, this is a very common vulnerability: proof .

Dangerous use of memcmp


PVS-Studio warning : V642 Saving the 'memcmp' function result inside the 'unsigned char' type variable is inappropriate. Breaking the program's logic. host.c 1789

 static void power_control_timeout(unsigned long data) { .... u8 other = memcmp(requester->frame_rcvd.iaf.sas_addr, iphy->frame_rcvd.iaf.sas_addr, sizeof(requester->frame_rcvd.iaf.sas_addr)); if (other == 0) { .... } .... } 

If you carefully read what the documentation says about the return value of memcmp , then we will see that there is no guarantee about any particular range: the function can return any number within its type. And this is not always -1 , 0 and 1 . Therefore, it is impossible to save its value in a variable of a smaller type: with the loss of the higher digits, the younger ones can be zero. A similar error led to a vulnerability in MySQL / MariaDB .

Conclusion




As already mentioned, Linux is a very high-quality, well-tested project. Finding a mistake in it, even the most insignificant, is a cause for pride. This is also a reason to think about how many errors could be found before debugging and testing: it is in this line of work that static analysis is valuable. You can verify this by trying PVS-Studio. For Linux, you can get a trial version of the analyzer by emailing us. And for your non-commercial projects, you can use PVS-Studio for free: it’s enough to get acquainted with this article and use the open and free utility how-to-use-pvs-studio-free .



If you want to share this article with an English-speaking audience, then please use the link to the translation: Pavel Belikov. Linux Kernel, tested by the Linux-version of PVS-Studio

Read the article and have a question?
Often our articles are asked the same questions. We collected answers to them here: Answers to questions from readers of articles about PVS-Studio, version 2015 . Please review the list.

Source: https://habr.com/ru/post/317374/


All Articles