
We bring to your attention a series of articles devoted to recommendations for writing high-quality code using the example of errors found in the Chromium project. This is the fifth part, which will be devoted to errors in the use of unverified or incorrectly verified data. A very large number of vulnerabilities exist due to the use of unverified data, which makes this topic interesting and relevant.
In fact, the cause of the vulnerability can be almost any type of error, even a common typo. Actually, if the error found is classified according to the
Common Weakness Enumeration , then it is a potential vulnerability.
Starting from version 6.21, the PVS-Studio analyzer has learned to classify the found errors according to the Common Weakness Enumeration and assign them the corresponding CWE ID.
Perhaps readers have already noticed that in previous articles, in addition to the warning number Vxxx, I also quoted a CWE ID. This means that the errors discussed earlier could theoretically cause vulnerabilities. The probability of this is small, but it is. Interestingly, we managed to match this or that CWE ID with almost every warning issued by PVS-Studio. This means that without planning it, we created an analyzer that can detect a large number of weaknesses :).
')
Conclusion. The PVS-Studio analyzer helps to prevent many types of vulnerabilities in advance. Publication on this topic: "
How can PVS-Studio help in searching for vulnerabilities? ".
In this article, I have compiled errors that could potentially lead to security problems. I warn you that the choice of errors is very conditional and subjective. It may well turn out that some kind of vulnerability is disguised as an error, which I called a commonplace typo in one of the previous articles.
So, let's consider what security defects I noticed during the analysis of the report issued by PVS-Studio for the Chromium project. As I wrote in the
introductory article , I looked at the report rather briefly, so there may be other errors that I did not notice. The task of the article in general terms is to show how some errors can lead to the fact that the program begins to process incorrect or unverified data. I have not yet decided how best to call such data, and for the time being I will use the term “unreliable data”.
Error examples
Chromium project.
InstallUtil::ConditionalDeleteResult InstallUtil::DeleteRegistryValueIf(....) { .... ConditionalDeleteResult delete_result = NOT_FOUND; .... if (....) { LONG result = key.DeleteValue(value_name); if (result != ERROR_SUCCESS) { .... delete_result = DELETE_FAILED; } delete_result = DELETED; } return delete_result; }
PVS-Studio
warning :
V519 CWE-563 The 'delete_result' variable is assigned values ​​twice successively. Perhaps this is a mistake. Check lines: 381, 383. install_util.cc 383
The function returns an incorrect status. As a result, other parts of the program will assume that the function has successfully deleted some value. The error is that the
DELETE_FAILED status is always replaced with the
DELETED status.
You can fix the error by adding the
else keyword:
if (result != ERROR_SUCCESS) { .... delete_result = DELETE_FAILED; } else { delete_result = DELETED; }
Perhaps, the considered error does not reflect the essence of inaccurate data very well. In this function, the creation of false data, and not their verification or use. Therefore, let's consider another, more appropriate error.
PDFium library (used in Chromium).
CPVT_WordRange Intersect(const CPVT_WordRange& that) const { if (that.EndPos < BeginPos || that.BeginPos > EndPos || EndPos < that.BeginPos || BeginPos > that.EndPos) { return CPVT_WordRange(); } return CPVT_WordRange(std::max(BeginPos, that.BeginPos), std::min(EndPos, that.EndPos)); }
PVS-Studio warnings:
- V501 CWE-570 There are identical sub-expressions 'that.BeginPos> EndPos' operator. cpvt_wordrange.h 46
- V501 CWE-570 There are identical sub-expressions' that. EndPos <| operator. cpvt_wordrange.h 46
The condition is written incorrectly. To make it easier to notice the error, we reduce the condition:
if (E2 < B1 || B2 > E1 || E1 < B2 || B1 > E2)
Notice that
(E2 <B1) and
(B1> E2) are one and the same. Similar to
(B2> E1), this is the same as
(E1 <B2) .
It turns out that not all the necessary checks are performed, and then an incorrect range can be generated, which, in turn, will affect the operation of the program.
Now let's look at a large and complex piece of code from the RE2 regular expression library (used in Chromium). To be honest, I don’t even understand what is happening here, but in the code there is definitely an anomalous check.
First you need to show how some types are declared. If this is not done, the code will not be very clear.
typedef signed int Rune; enum { UTFmax = 4, Runesync = 0x80, Runeself = 0x80, Runeerror = 0xFFFD, Runemax = 0x10FFFF, };
And now the function with anomaly.
char* utfrune(const char *s, Rune c) { long c1; Rune r; int n; if(c < Runesync) return strchr((char*)s, c); for(;;) { c1 = *(unsigned char*)s; if(c1 < Runeself) { if(c1 == 0) return 0; if(c1 == c)
The PVS-Studio analyzer issues a warning on the line, which I noted with the comment "// <=". Message:
V547 CWE-570 Expression 'c1 == c' is always false. rune.cc 247
Let's try to figure out why a condition is always false. First, notice these lines:
if(c < Runesync) return strchr((char*)s, c);
If the variable
c <0x80 , the function will stop its operation. If the function does not complete its work, but continues it, then we can say for sure that the variable
c> = 0x80 .
Now we look at the condition:
if(c1 < Runeself)
The condition
(c1 == c) , marked with the comment "// <=", is satisfied only if
c1 <0x80 .
So, this is what we know about the values ​​of variables:
It follows that the condition
c1 == c is always false. But it is very suspicious. It turns out that the
utfrune function in the regular expression library does not work as planned. The consequences of such an error are unpredictable.
LibVPX video codec (used in Chromium).
#define VP9_LEVELS 14 extern const Vp9LevelSpec vp9_level_defs[VP9_LEVELS]; typedef enum { .... LEVEL_MAX = 255 } VP9_LEVEL; static INLINE int log_tile_cols_from_picsize_level( uint32_t width, uint32_t height) { int i; const uint32_t pic_size = width * height; const uint32_t pic_breadth = VPXMAX(width, height); for (i = LEVEL_1; i < LEVEL_MAX; ++i) { if (vp9_level_defs[i].max_luma_picture_size >= pic_size && vp9_level_defs[i].max_luma_picture_breadth >= pic_breadth) { return get_msb(vp9_level_defs[i].max_col_tiles); } } return INT_MAX; }
PVS-Studio warnings:
- V557 CWE-119 Array overrun is possible. The value of 'i' index could reach 254. vp9_encoder.h 931
- V557 CWE-119 Array overrun is possible. The value of 'i' index could reach 254. vp9_encoder.h 932
- V557 CWE-119 Array overrun is possible. The value of 'i' index could reach 254. vp9_encoder.h 933
The
vp9_level_defs array consists of 14 elements. In the loop, the variable
i , used as the index of the array, varies from 0 to 254. The result: going beyond the array boundary.
Well, if this code leads to
Access Violation . But in practice, most likely, some random data located after the
vp9_level_defs array will be
processed .
Another similar error of using data outside the array was met in the SQLite library (used in Chromium).
First, notice that the
yy_shift_ofst array contains 455 elements.
static const short yy_shift_ofst[] = { 355, 888, 1021, 909, 1063, 1063, 1063, 1063, 20, -19, .... 1440, 1443, 1538, 1542, 1562, }
Also of interest to us are two macros:
#define YY_SHIFT_COUNT (454) #define YY_MIN_REDUCE 993
The macro
YY_SHIFT_COUNT defines the maximum index that can be used to access the elements of the
yy_shift_ofst array. It is not 455, but 454, since the numbering of elements comes with 0.
The macro
YY_MIN_REDUCE , equal to 993, has nothing to do with the size of the array
yy_shift_ofst .
Function containing weak validation:
static unsigned int yy_find_shift_action(....) { int i; int stateno = pParser->yytos->stateno; if( stateno>=YY_MIN_REDUCE ) return stateno;
PVS-Studio warning: V557 CWE-125 Array overrun is possible. The value of 'stateno' index could reach 992. sqlite3.c 138802
The function provides protection that the index when accessing an array should not be greater than a certain value. Because of a typo, or for some other reason, the wrong constant is used. It was necessary to use a constant equal to 454, and instead the value of the index is compared with 993.
As a result, it is possible to access the array beyond the boundary and read arbitrary invalid data.
Note. Below is the correct
assert , but it will not help in the release version.
Most likely, the check should be rewritten as follows:
if (stateno > YY_SHIFT_COUNT) { assert(false); return stateno; }
ICU project (used in Chromium).
UVector* ZoneMeta::createMetazoneMappings(const UnicodeString &tzid) { UVector *mzMappings = NULL; .... if (U_SUCCESS(status)) { .... if (U_SUCCESS(status)) { .... while (ures_hasNext(rb)) { .... if (mzMappings == NULL) { mzMappings = new UVector( deleteOlsonToMetaMappingEntry, NULL, status); if (U_FAILURE(status)) { delete mzMappings; uprv_free(entry); break; } } .... } .... } } ures_close(rb); return mzMappings; }
PVS-Studio
warning :
V774 CWE-416 The mzMappings' pointer was used after the memory was released. zonemeta.cpp 713
The code is complex, I find it difficult to say for sure if there is a real error here or not. However, as I understand it, a situation is possible when the function returns a pointer to an already freed memory block. The correct handler for an incorrect status should reset the pointer:
if (U_FAILURE(status)) { delete mzMappings; mzMappings = nullptr; uprv_free(entry); break; }
Now it turns out that the function returned a pointer to the freed memory. Anything in this memory can be anything and using this invalid pointer will lead to undefined program behavior.
The following Chromium project function incorrectly implements protection against negative values.
void AXPlatformNodeWin::HandleSpecialTextOffset(LONG* offset) { if (*offset == IA2_TEXT_OFFSET_LENGTH) { *offset = static_cast<LONG>(GetText().length()); } else if (*offset == IA2_TEXT_OFFSET_CARET) { int selection_start, selection_end; GetSelectionOffsets(&selection_start, &selection_end); if (selection_end < 0) *offset = 0; *offset = static_cast<LONG>(selection_end); } }
PVS-Studio warning: V519 CWE-563 The '* offset' variable is assigned values ​​twice successively. Perhaps this is a mistake. Check lines: 3543, 3544. ax_platform_node_win.cc 3544
If the value of the variable
selection_end is negative, then the function should return
0 . However, because of a typo,
0 is written in the wrong place. The correct code should be:
if (selection_end < 0) selection_end = 0; *offset = static_cast<LONG>(selection_end);
Because of this error, the function may return a negative number, although it should not. This is a negative number that can “leak” through the check, and there is inaccurate data.
Other errors
To be honest, I do not really like the examples that I cited in the previous section of the article. There are few of them, and they do not reflect very well the essence of the errors of using invalid data. I think, over time, I will make a separate article where I will show more vivid examples of mistakes, having collected them from various open source projects.
By the way, it was possible to include more examples of errors in the article, but I already “spent” them when writing previous articles, but I don’t want to repeat myself. For example, in the article “Chromium: typos,” there was such a fragment:
if(!posX->hasDirtyContents() || !posY->hasDirtyContents() || !posZ->hasDirtyContents() || !negX->hasDirtyContents() || !negY->hasDirtyContents() ||
Because of this typo, the object referenced by the
negZ pointer is not
checked . As a result, the program will work with unreliable data.
Also in this article I did not begin to consider situations where invalid (corrupted) data arise due to the lack of verification of the pointer returned by the
malloc function. If the
malloc function
returns NULL, this does not mean that only a null pointer dereference is possible. There are more insidious situations. Schematically, they look like this:
int *ptr = (int *)malloc(100 * sizeof(int)); ptr[1234567] = 42;
There will be no null pointer dereference here. Here the data will be recorded incomprehensibly where and the destruction of some data.
This is an interesting story and I will dedicate the following separate article to it.
Recommendations
To the emergence and use of unreliable (unverified, corrupted) data lead a variety of errors. There can be no universal council here. You can, of course, write: do not make mistakes in the code! But there is no use for such advice :).
So why then did I even write this article and single out this type of error?
So that you know about them. The knowledge of the existence of a problem in itself helps to prevent it. If someone does not know about a problem, it does not mean that it does not exist. This picture will be a good illustration:

What can you still advise:
- Update libraries used in your project. The new versions can be fixed various bugs that represent vulnerabilities. However, we must admit that the vulnerability may appear just in the new version, and absent in the old one. But still, a better solution would be to upgrade the libraries. More people know about old vulnerabilities than about new ones.
- Carefully check all input data, especially coming from outside. For example, all data coming from somewhere over the network should be very carefully checked.
- Use various code verification tools. For example, the Chromium project clearly lacks the use of the PVS-Studio static analyzer :).
- Explain to colleagues that "a simple error in coding does not mean a non-direst error ." If your team develops responsible applications, then you should focus as much as possible on the quality of the code and destroy everything, even if they are innocuous for the type of error.
Note about PVS-Studio
As I have already said, the PVS-Studio analyzer already helps prevent the appearance of a vulnerability by detecting errors at the stage of writing code. But we want more and soon we will significantly improve PVS-Studio by introducing the concept of “using unverified data” into Data Flow analysis.
We have even reserved a special number for this important diagnostic: V1010. Diagnostics will detect errors when data has been received from an unreliable source (for example, sent over the network) and used without proper verification. The absence of all necessary input data checks is often the cause of detection of vulnerabilities in applications. We recently wrote about this in more detail in the article "
PVS-Studio 2018: CWE, Java, RPG, macOS, Keil, IAR, MISRA " (see the section "Potential Vulnerabilities, CWE").
New diagnostics will significantly enhance the analyzer in identifying potential vulnerabilities. Most likely, the V1010 diagnostics will match the
CWE-20 identifier (Improper Input Validation).
Conclusion
I suggest you and your colleagues read the article "
42 recommendations " on our website. After it, the programmer will not turn into a security expert, but he will learn a lot of new and useful things. Especially these articles will be useful to developers who have only recently mastered the language of C or C ++ and who are not aware of how deep the rabbit hole they fall into.
I plan to update 42 tips and turn them into 50 tips. Therefore, I invite you to subscribe to my twitter
@Code_Analysis and our
RSS feed, so as not to miss this and other interesting articles in our blog.

If you want to share this article with an English-speaking audience, then please use the link to the translation: Andrey Karpov.
Chromium: Use of Untrusted Data .