Large projects check interesting. As a rule, they manage to find various interesting mistakes and tell people about them. It is also a good way to test our analyzer and improve its various aspects. I have long wanted to check 'Mono', and finally this opportunity has appeared. And, it should be said, the test justified itself, since it was possible to find a lot of interesting things. About what was found, what nuances arose when checking, and there will be this article.
about the project
Mono - a project to create a full-fledged implementation of the .NET Framework system based on free software. The main developer of the Mono project is
Xamarin Corporation, formerly Novell.
Mono includes the C # compiler, the .NET runtime — mono (with JIT support) and mint (without JIT support), a debugger, and a number of libraries, including the implementation of WinForms, ADO.NET and ASP.NET, and the smcs compilers (for creating applications for Moonlight) and vbc (for applications written in VB.NET).
The project also develops bindings for the GTK + graphics library on the .NET platform.
')
The source code is available for download in the
repository on GitHub . The number of lines of code when checking the repository loaded from GitHub was about 6.3 million (excluding blank lines). This amount of code base looks extremely attractive - probably somewhere in the code errors crept in. On the other hand, checking such a large project will also be useful for the analyzer, since it will serve as a good stress test.
Analysis tool, verification features
The static code analyzer
PVS-Studio acted as an analysis tool. Currently, the C # analyzer includes
over 100 diagnostic rules , each of which is accompanied by documentation containing information about the error, possible consequences and remedies. Since the release, it has been possible to check many different projects written in C #, such as
Roslyn ,
Xamarin.Forms ,
Space Engineers ,
CoreFX ,
Code Contracts , etc. (the full list can be found
here ).
The analyzer itself is available for
download at the link . The trial version should be enough for the first acquaintance with the analyzer. If you are interested in the tool,
write to us and we will provide a key for more intimate acquaintance and help with its setting.
I also want to note that the article did not cite errors from files containing some mention of Microsoft at the beginning of the file. This is done mainly in order to avoid possible intersections with errors mentioned in other articles. In any case, there was enough material already.
As always, the article does not provide all the errors, as this would greatly increase its volume. I tried to choose the most interesting places, but many still remained outside the scope of the article. Do not think that I want to reproach the authors of 'Mono' for something. The number of errors due to the large size of the project, and this is understandable. Nevertheless, it would be good to correct the existing errors and prevent the appearance of new ones. To help with this will help the introduction of static analysis in the development process. More on this is written below in the corresponding section.
A couple of words about why checking projects is not a trivial exercise.
In an ideal world, checking projects and writing an article is done according to the following scenario: I found a project -> I collected -> I checked it with the analyzer -> I found a sufficient number of errors -> I wrote an article. Everyone is happy, everyone is happy: we ticked a checked project, people read a new article, the developers learned about errors in the code, the author did a good job.
Alas, our world is not perfect. At certain stages, problems often arise, this time with the assembly. If there is a detailed assembly instruction, or if it is easy to do on your own, this is great! Then you can safely proceed to the verification of the project and writing the article. Otherwise, a headache begins. It happened with 'Mono'. The
net_4_x.sln solution, which combines C # projects, is not going out of the box (that is, immediately after downloading from the repository). One of the build scripts turned out to be inoperative (the path was incorrectly spelled (possibly due to the fact that the directory hierarchy changed over time)), but its correction also did not bear fruit.
Of course, I did not want to retreat, so I experimented with the assembly even during off-hours, but the result turned out to be a bit. As a result, after taking a fair amount of time with this, it was decided to write the article “as is”.
Periodically in the articles, I repeat that in order to fully validate the project, it should be compiled - with all dependencies, without errors and p. As a rule, I try to achieve this, but there are exceptions to the rules, as in this case.
Of course, checking an unassembled project is not good for several reasons:
- project analysis is not as good as it could be. It is a fact. What exactly the quality is reduced depends on the implementation of the diagnostic rule. Perhaps a false positive will appear, or vice versa, a useful trigger will not be issued;
- when viewing the log, you need to be an order of magnitude more attentive, since there arises (albeit small) the probability of the occurrence of false positives, which would not exist if the project was correctly assembled
- as some useful operations disappear, it is possible to miss interesting errors that could get into the article and which the developers would learn about (however, they can learn about these errors by checking the project themselves);
- you have to write sections "A couple of words about why checking projects ...".
Nevertheless, there were many suspicious places, some of which will be discussed below.
Test results
Recently, in the articles we are trying to provide statistics on a verified project: the total number of warnings, the number of false positives and real errors.
Unfortunately, this time I can not give such statistics. First, the code, as well as warnings, a lot. If the total number of warnings analyzed is several dozen, you can view them and give a rough estimate. When the count of warnings goes to hundreds, the task of analysis becomes nontrivial.
Secondly, on a fully compiled project, these statistics can change in any of the directions: a decrease or increase in the number of warnings. In the assembled project, more semantic information is available to the analyzer, therefore, it can conduct a deeper analysis (false positives will disappear, new warnings will appear). For those who are interested in how semantic information affects the analysis and by what principles it all works, I suggest reading the article "
Introduction to Roslyn. Using the development of static analysis tools ."
But you probably can't wait to see what interesting things you found in the project code? Well, let's look at some code snippets.
Same subexpressions within the same expression
One of the most common mistakes. The reasons for the occurrence are many. These include copy-paste, similar variable names, abuse of IntelliSense, and trivial inattention. Were distracted for a minute - made a mistake.
public int ExactInference (TypeSpec u, TypeSpec v) { .... var ac_u = (ArrayContainer) u; var ac_v = (ArrayContainer) v; .... var ga_u = u.TypeArguments; var ga_v = v.TypeArguments; .... if (u.TypeArguments.Length != u.TypeArguments.Length)
PVS-Studio warning : V3001 There are identical sub-expressions 'u.TypeArguments.Length' to the left and the right! generic.cs 3135
Now that the method code is simplified, it is easy to notice an error in the condition of the
if operator — the parameter
v , rather than
u , should act as the
TypeSpec instance in the second subexpression. Perhaps the mistake was made because the external characters
u and
v are similar, and without focusing on this expression, you can easily overlook the trick.
The rest of the code was given in order to indicate that these parameters are mainly used together, on the basis of which it is possible to conclude an error and the correct code version:
if (u.TypeArguments.Length != v.TypeArguments.Length) return 0;
Not less interesting case.
bool BetterFunction (....) { .... int j = 0; .... if (!candidate_params && !candidate_pd.FixedParameters [j - j].HasDefaultValue) {
PVS-Studio warning : V3001 There are no sub-expressions. ecore.cs 4832
In one of the index calculation expressions, the programmer made a mistake by writing the expression
j - j . Thus, access to the first element of the array will always be performed. If this were required, it would be more logical to use an integer literal equal to 0. The fact that this is really an error is confirmed by other index calls to the same array:
j - 1 . I suppose that, again, the error was not noticed due to some similarity in writing the characters
j and
1 , so that during a cursory review of the code this could be overlooked.
Note colleagues Andrei Karpov. When I read the draft of the article, I was about to make a note to Sergei that he apparently had written the wrong part of the code. I looked into the code and did not see the error. Just starting to read the description, I realized what was the matter. I confirm this typo is very difficult to notice. Our PVS-Studio is gorgeous!
Continue to "break eyes":
internal void SetRequestLine (string req) { .... if ((ic >= 'A' && ic <= 'Z') || (ic > 32 && c < 127 && c != '(' && c != ')' && c != '<' && c != '<' && c != '>' && c != '@' && c != ',' && c != ';' && c != ':' && c != '\\' && c != '"' && c != '/' && c != '[' && c != ']' && c != '?' && c != '=' && c != '{' && c != '}')) continue; .... }
PVS-Studio warning : V3001 There are identical sub-expressions 'c! =' <'' Operator. HttpListenerRequest.cs 99
Within the expression, the subexpression
c! = '<' Occurs twice. Probably just an extra comparison.
protected virtual bool ShouldSerializeLinkHoverColor () { return grid_style.LinkHoverColor != grid_style.LinkHoverColor; }
PVS-Studio warning : V3001 There are identical sub-expressions of the 'grid_style.LinkHoverColor' on the left! DataGrid.cs 2225
Simplify the method to highlight the error is not needed. In comparison, the same subexpressions are
involved -
grid_style.LinkHoverColor .
The correct code should look like this:
protected virtual bool ShouldSerializeLinkHoverColor () { return grid_style.LinkHoverColor != default_style.LinkHoverColor; }
Why so? Above there are a number of methods in which various properties of the
grid_style object
are compared with properties of the
default_style object. But in the latter case, relaxed and made a mistake. Hmm, "
last line effect "?
Well, this is a classic typos:
static bool AreEqual (VisualStyleElement value1, VisualStyleElement value2) { return value1.ClassName == value1.ClassName &&
PVS-Studio warning : V3001 There are identical sub-expressions 'value1.ClassName' to the left. ThemeVisualStyles.cs 2141
Accidentally compared the
value1.ClassName subexpression with itself. Of course, in the second case, the
value2 object should have been used.
I think if you make such a code using a table method, then the error will be much easier to notice. This is a good prevention of such typos:
static bool AreEqual (VisualStyleElement value1, VisualStyleElement value2) { return value1.ClassName == value1.ClassName && value1.Part == value2.Part && value1.State == value2.State; }
Code that is formatted this way is much easier to read, and it is easier to notice that there is something wrong with one column. For details on code alignment, see Chapter 13 from the book "
The Main Question of Programming, Refactoring, and All That ."
The remaining suspicious places detected by diagnostic rule V3001 are listed in the
file .
Same conditions in the 'else if' construct
enum TitleStyle { None = 0, Normal = 1, Tool = 2 } internal TitleStyle title_style; public Point MenuOrigin { get { .... if (this.title_style == TitleStyle.Normal) {
PVS-Studio warning : V3003 The use of if (A) {...} else if (A) {...} 'pattern was detected. There is a possibility of logical error presence. Check lines: 597, 599. Hwnd.cs 597
The same expression
this.title_style == TitleStyle.Normal is checked twice . Obviously, this code contains an error. After all, regardless of the value of the above expression, the expression
pt.Y + = tool_caption_height will never be executed. I will assume that the second case implied a comparison of the
title_style field with the constant
TitleStyle.Tool .
Same 'if-then' and 'if-else' bodies
public static void DrawText (....) { .... if (showNonPrint) TextRenderer.DrawTextInternal (g, text, font, new Rectangle (new Point ((int)x, (int)y), max_size), color, TextFormatFlags.NoPadding | TextFormatFlags.NoPrefix, false); else TextRenderer.DrawTextInternal (g, text, font, new Rectangle (new Point ((int)x, (int)y), max_size), color, TextFormatFlags.NoPadding | TextFormatFlags.NoPrefix, false); .... }
PVS-Studio warning : V3004 The 'then' statement is equivalent to the 'else' statement. System.Windows.Forms-net_4_x TextBoxTextRenderer.cs 79
Regardless of the value of the
showNonPrint variable, the static
DrawTextInternal method of the
TextRenderer class with the same arguments will be called. It is possible that the error was made due to the use of copy-paste. The method call was copied, but the arguments were forgotten.
The return value of the method is not used.
public override object ConvertTo(.... object value, Type destinationType) { .... if (destinationType == typeof(string)) { if (value == null) { return String.Empty; } else { value.ToString(); } } .... }
PVS-Studio warning : V3010 The return value of the function 'ToString' is required to be utilized. ColumnTypeConverter.cs 91
A very interesting mistake, apparently, with far-reaching consequences. From the code you can see that type checking is performed, and if the type is
string , a check is performed for
null equality. And then - the most interesting. If the
value reference is
null , an empty string is returned, otherwise ... Most likely, it was assumed that the string representation of the object will be returned, but the
return statement is missing. Therefore, the return value of the
ToString () method will not be used in any way, and the
ConvertTo method
will continue its execution. Thus, because of the forgotten operator
return , the logic of the program has changed. I assume that the correct version of the code should look like this:
if (value == null) { return String.Empty; } else { return value.ToString(); }
What error is described here, you will find out later

I usually simplify the methods so that the error is easier to notice. I propose to play the game. Find the error in the following method fragment. To make it more interesting, I will not write the type of error and simplify all the code (and so provide only part of the method).

Well, how are you doing? For some reason it seems to me that most did not even try. I will not torment you.
PVS-Studio warning : V3012 The '?:' Operator, regardless of its conditional expression, always returns one and the same value: Color.FromArgb (150, 179, 225). ProfessionalColorTable.cs 258
Here it is, the ill-fated ternary operator:
button_pressed_highlight = use_system_colors ? Color.FromArgb (150, 179, 225) : Color.FromArgb (150, 179, 225);
Regardless of the value of the use_system_colors variable, the button_pressed_highlight object will be assigned the same value. If you think that such errors are difficult to track, I suggest you look at the entire file (
ProfessionalColorTable.cs ) and understand that such errors are not just difficult to track on your own - this is simply unrealistic.
By the way, there are many such places (as many as 32), which even suggests that this is not a mistake at all, but such is the intention. Nevertheless, the code looks strange, and I would advise checking it out. Even if this is not an error, but the expected logic, it would be easier to use ordinary assignment than to write strange ternary operators that are confusing. The remaining warnings V3012 are given in the
file .
Using another cycle counter
public override bool Equals (object obj) { if (obj == null) return false; PermissionSet ps = (obj as PermissionSet); if (ps == null) return false; if (state != ps.state) return false; if (list.Count != ps.Count) return false; for (int i=0; i < list.Count; i++) { bool found = false; for (int j=0; i < ps.list.Count; j++) { if (list [i].Equals (ps.list [j])) { found = true; break; } } if (!found) return false; } return true; }
PVS-Studio warning : V3015 It is likely that a variable is being compared inside the 'for' operator. Consider reviewing 'i' corlib-net_4_x PermissionSet.cs 607
A suspicious condition to exit a nested loop looks like:
i <ps.list.Count . The variable
j acts as a loop counter, but the output condition uses the outer loop counter — variable
i .
The intentions of the author of the code are clear - check that the collections contain the same elements. But if it turns out that some element of the
list collection is not contained in
ps.list , the exit from the nested loop will not be performed using the
break statement. At the same time, within the same cycle, the variable
i does not change, i.e.
i <ps.list.Count expression will always be
true . As a result, the cycle will be executed until the collection overflows the boundary (due to the constant increment of the counter
j ).
Checking the wrong null reference after casting with the as operator
As it turned out, this is a typical error for the C # language. She
is in almost every project on which the article is written. Typically, diagnostic rule
V3019 detects cases of the following:
var derived = base as Derived; if (base == null) {
Checking
base == null will save only if the
base is indeed
null , and then it doesn’t matter to us whether the cast succeeded or not. But it is obvious that link verification was
derived . And then, if
base! = Null and the conversion failed, and then the method works with the members of the
derived object, an exception of type
NullReferenceException will be generated.
Moral: if you use this pattern, make sure that you check the correct link for
null equality.
But this is a theory. Let's see what we found in practice:
public override bool Equals (object o) { UrlMembershipCondition umc = (o as UrlMembershipCondition); if (o == null) return false; .... return (String.Compare (u, 0, umc.Url, ....) == 0);
PVS-Studio warning : V3019 Possibly an incorrect variable is compared to null after type conversion using 'as' keyword. Check variables 'o', 'umc'. UrlMembershipCondition.cs 111
The pattern is one to one described above. If the object type
o is incompatible with the
UrlMembershipCondition type
, and the object
o is not
null , then an attempt to access the
umc.Url property will generate an exception of type
NullReferenceException .
Accordingly, to correct the error, it is necessary to correct the check:
if (umc == null) return false;
Take a look at another blooper.
static bool QSortArrange (.... ref object v0, int hi, ref object v1, ....) { IComparable cmp; .... cmp = v1 as IComparable; if (v1 == null || cmp.CompareTo (v0) < 0) { .... } .... }
PVS-Studio warning : V3019 Possibly an incorrect variable is compared to null after type conversion using 'as' keyword. Check variables 'v1', 'cmp'. Array.cs 1487
The situation is similar to that described above. Unless in the case of an unsuccessful cast, an
NullReferenceException exception will be
thrown right there - right when checking the expression.
In general, the situation in other places is similar, so I’ll just give you another 12 warnings in a
text file .
Unqualified exception exception
public void ReadEmptyContent(XmlReader r, string name) { .... for (r.MoveToContent(); r.NodeType != XmlNodeType.EndElement; r.MoveToContent()) { if (r.NamespaceURI != DbmlNamespace) r.Skip(); throw UnexpectedItemError(r);
PVS-Studio warning : V3020 An unconditional 'throw' within a loop. System.Data.Linq-net_4_x XmlMappingSource.cs 180
At the first iteration of the loop, an exception of type
UnexpectedItemError will be generated. At least, it looks weird. By the way, Visual Studio highlights the object
r in the section for changing the loop counter with a hint about unreachable code. But, probably, the author of the code did not use Visual Studio, or simply did not notice the warning, which is why the error remained in the code.
Suspicious 'if' Operators
Periodically there are errors of this type, when there are 2 identical 'if' operators in the method, and the values ​​of the objects used in the conditional expressions of these operators do not change between them. If any of these conditional expressions are true, the method body will be exited. Thus, the second 'if' statement will never be executed. Let's look at a snippet of code in which a similar error was made:
public int LastIndexOfAny (char [] anyOf, int startIndex, int count) { .... if (this.m_stringLength == 0) return -1; .... if (this.m_stringLength == 0) return -1; .... }
PVS-Studio warning : V3021 There are two 'if' statements with identical conditional expressions. The first 'if' statement contains method return. This means that the statement is a senseless corlib-net_4_x String.cs 287
The execution of the method never reaches the second
if statement given in this fragment, since if
this.m_stringLength == 0 , the output will be made when the first conditional statement is executed. This code would be justified if the value of the
m_stringLength field were changed between
if statements , but this is not the case.
The consequences of the error depend on the reason for its occurrence:
- if both conditional expressions are correct (from the point of view of logic), and the second code is simply redundant - there is nothing terrible, but you should remove it in order not to confuse other people;
- if one of the operators was meant to test another expression, or if the expression was true, other actions had to be performed - this is a more serious problem, indicating the presence of an error in the logic of the program. Then it is worth taking the solution of the issue more seriously.
An example of a second, more serious error case can be seen in the following code snippet (click on image to enlarge):

Notice the error in this code is not difficult. Just kidding, just kidding, of course. But not for the analyzer. Let us resort to our standard code simplification method to make everything clearer:
private PaperKind GetPaperKind (int width, int height) { .... if (width == 1100 && height == 1700) return PaperKind.Standard11x17; .... if (width == 1100 && height == 1700) return PaperKind.Tabloid; .... }
PVS-Studio warning : V3021 There are two 'if' statements with identical conditional expressions. The first 'if' statement contains method return. If it’s a statement, it’s a senseless System.Drawing-net_4_x PrintingServicesUnix.cs 744
If the expression
width == 1100 is true
&& height == 1700 , only the first
if statement will be executed. However, the values ​​returned for the truth of this expression are different, so you cannot just say that the second
if statement is superfluous. Moreover - most likely in its condition there should be another expression. There is a violation of the logic of the program.
Finally, I would like to consider another piece of code with a similar error:
private void SerializeCore (SerializationStore store, object value, bool absolute) { if (value == null) throw new ArgumentNullException ("value"); if (store == null) throw new ArgumentNullException ("store"); CodeDomSerializationStore codeDomStore = store as CodeDomSerializationStore; if (store == null) throw new InvalidOperationException ("store type unsupported"); codeDomStore.AddObject (value, absolute); }
PVS-Studio warning : V3021 There are two 'if' statements with identical conditional expressions. The first 'if' statement contains method return. CodeDomComponentSerializationService.cs 562
This warning intersects with the warning V3019, since there is clearly a
null test pattern after the cast with the
as operator of the wrong link. But whatever warning is issued is a mistake.
There were other similar warnings:
- V3021 There are two 'if' statements with identical conditional expressions. The first 'if' statement contains method return. Mono.Data.Sqlite-net_4_x SQLiteDataReader.cs 270
- V3021 There are two 'if' statements with identical conditional expressions. The first 'if' statement contains method return. This means that the statement is a senseless system. Web-net_4_x HttpUtility.cs 220
- V3021 There are two 'if' statements with identical conditional expressions. The first 'if' statement contains method return. CodeDomComponentSerializationService.cs 562
- V3021 There are two 'if' statements with identical conditional expressions. The first 'if' statement contains method return. This means that the statement is a senseless Mono.Security.Providers.DotNet-net_4_x DotNetTlsProvider.cs 77
Suspicious format lines
Diagnostic rule
V3025 detects erroneously formatted strings. It is also a type of error that occurs in many verifiable projects. Situations of 2 types are detected:
- the format string is expecting more parameters than they are presented;
- The format string is expecting fewer parameters than they are presented.
In the first case, an exception of type
FormatException will be generated; in the second case, unused arguments will simply be ignored. Anyway, you should pay attention to such places and fix the code properly.
Of course, I would not recall this diagnostic rule, if such errors were not found in the code.
static IMessageSink GetClientChannelSinkChain(string url, ....) { .... if (url != null) { string msg = String.Format ( "Cannot create channel sink to connect to URL {0}. An appropriate channel has probably not been registered.", url); throw new RemotingException (msg); } else { string msg = String.Format ( "Cannot create channel sink to connect to the remote object. An appropriate channel has probably not been registered.", url); throw new RemotingException (msg); } .... }
PVS-Studio warning : V3025 Incorrect format. A different number of formatted items is expected while calling 'Format' function. Arguments not used: url. corlib-net_4_x RemotingServices.cs 700
I want to draw your attention to the second line of formatting. It is a string literal, in which argument substitution is not provided (as opposed to the format string above). However, the
Format method takes a
url object as its second argument. It follows from the above that the
url object will simply be ignored when forming a new line, and information about it will not be included in the exception text.
In C # 6.0,
interpolated strings were added, which in some cases will help avoid problems associated with the use of format strings, including with an inappropriate number of arguments.
Consider another piece of erroneous code.
public override string ToString () { return string.Format ("ListViewSubItem {{0}}", text); }
PVS-Studio warning : V3025 Incorrect format. A different number of formatted items is expected while calling 'Format' function. Arguments not used: text. System.Windows.Forms-net_4_x ListViewItem.cs 1287
Based on the format string, it can be concluded that the final line should have text in curly brackets. In fact, the resulting string will look like this:
"ListViewSubItem {{0}}"
To correct this error, the interpolated lines would be perfect, using which the method could be rewritten as follows:
public override string ToString () { return $"ListViewSubItem {{{text}}}",; }
Or, remaining true to the
String.Format method, it was worth adding one brace on each side. Then the format string would look like this:
"ListViewSubItem {{{0}}}"
And the last code snippet with formatting strings. As always, the most interesting thing is left for dessert:
void ReadEntropy () { if (reader.IsEmptyElement) throw new XmlException ( String.Format ("WS-Trust Entropy element is empty.{2}", LineInfo ())); .... }
PVS-Studio warning : V3025 Incorrect format. A different number of formatted items is expected while calling 'Format' function. Format items not used: {2}. Arguments not used: 1st. System.ServiceModel-net_4_x WSTrustMessageConverters.cs 147
I do not know how the formatting element with the index '2' got into the formatting string, but it leads to a very interesting error. It was planned to throw an exception with some text made up of a format string. And an exception will be generated. The exception is of type
FormatException , because for the current formatting line, 3 arguments are needed (since the third one is required), and only one is presented.
If in some way only the number of the requested argument is confused (for example, during refactoring), it will not be difficult to correct the error:
"WS-Trust Entropy element is empty.{0}"
Other suspicious places found by rule V3025 are listed in the file .
Null reference dereferencing
private bool IsContractMethod (string methodName, Method m, out TypeNode genericArgument) { .... return m.Name != null && m.Name == methodName && (m.DeclaringType.Equals (this.ContractClass)
PVS-Studio warning : V3027 The variable "m.DeclaringType" was used. Mono.CodeContracts-net_4_x ContractNodes.cs 211
Before accessing the Name property of the DeclaringType property , the programmer decided to play it safe and check DeclaringType for null inequality so that it does not accidentally dereference the null reference. The desire is clear and completely legitimate - everything is correct. But it still does not take action, because the instance method Equals is called above the DeclaringType property , which means that DeclaringType == null, an exception of type NullReferenceException will be thrown . To solve the problem, you can, for example, transfer the test to the null inequality above by code, or use the null-conditional operator ('?.'), Available in C # 6.0.
Another case.
Node leftSentinel; .... IEnumerator<T> GetInternalEnumerator () { Node curr = leftSentinel; while ((curr = curr.Nexts [0]) != rightSentinel && curr != null) { .... } }
PVS-Studio warning : V3027 The variable 'curr' was used in the same logical expression. Mono.Parallel-net_4_x ConcurrentSkipList.cs 306
And again the same situation. If curr == null , it will exit the loop. But if curr was initially null (i.e. at the time of executing the code leftSentinel == null ), we again get a NullReferenceException exception .
Redundant verification
In checked projects, periodically there are expressions of the following form or the like:
!aa || (aa && bb)
They can be simplified to the following expression:
!aa || bb
Perhaps in some cases you will get a performance gain (even if it is insignificant), but the second variant is also easier to read, despite the fact that it is logically equivalent to the first one (in case the aa expression does not change between calls).
It is possible that instead of aa there should have been another subexpression:
!aa || (cc && bb)
Then we are talking about this error. Anyway, in the PVS-Studio analyzer there is a diagnostic rule V3031 , which detects such cases. Consider a few code fragments that were found with its help:
public void Emit(OpCode opc) { Debug.Assert(opc != OpCodes.Ret || ( opc == OpCodes.Ret && stackHeight <= 1)); .... }
PVS-Studio Warning: V3031 Alert can be simplified. The '||'
operator is surrounded by opposite expressions. mcs-net_4_x ILGenerator.cs 456
Redundant code. Accessing the opc object does not change its value, so the expression can be simplified:
Debug.Assert(opc != OpCodes.Ret || stackHeight <= 1));
Another code snippet:
public bool Validate (bool checkAutoValidate) { if ((checkAutoValidate && (AutoValidate != AutoValidate.Disable)) || !checkAutoValidate) return Validate (); return true; }
PVS-Studio Warning: V3031 Alert can be simplified. The '||'
operator is surrounded by opposite expressions. System.Windows.Forms-net_4_x ContainerControl.cs 506
The situation is similar to the previous one. The expression is easily and painlessly simplified to the following form:
!checkAutoValidate || (AutoValidate != AutoValidate.Disable)
Some of the warnings I have selected are listed in the file .
Formatting code that does not comply with the logic.
When developing diagnostic rules like the V3033 , there were debates about how relevant they are. The fact is that diagnostics, tied to the formatting of the code, are very specific, since many editors / development environments (the same Visual Studio) themselves format the code in the course of its writing. Therefore, the probability of making such an error is very small. I rarely met such errors in the projects being checked, but a couple of them were found in 'Mono'.
public bool this [ string header ] { set { .... if (value) if (!fields.Contains (header)) fields.Add (header, true); else fields.Remove (header); } }
PVS-Studio warning : V3033 It is possible that this 'else' branch must apply to the previous 'if' statement. HttpCacheVaryByHeaders.cs 159
The code is formatted in such a way that it may seem as if else belongs to the first if statement . But the compiler is completely indifferent to how your code is formatted, so it will decrypt this fragment in its own way, associating else with the second if statement , as it should be. There is an interesting mistake. Code formatted according to a given logic should look like this:
if (value) if (!fields.Contains (header)) fields.Add (header, true); else fields.Remove (header);
A similar warning occurred again: V3033 It is possible that this "else" branch must apply. HttpCacheVaryByParams.cs 102
Another diagnostic rule, V3043, can be assigned to this category .
Error code:
public void yyerror (string message, string[] expected) { .... for (int n = 0; n < expected.Length; ++ n) ErrorOutput.Write (" "+expected[n]); ErrorOutput.WriteLine (); .... }
PVS-Studio Warning: V3043 The code's operational logic does not correspond with its formatting. The statement is indented. It is possible that curly brackets are missing.
cs-parser.cs 175
Judging by the formatting of the code (forgetting about the rules of a programming language), one might think that both method calls ( Write and WriteLine ) refer to the for operator . In fact, only the Write method will be called in the loop . This code is definitely wrong! If such logic was implied (it would seem, everything is logical - the elements are output, after which an empty line is inserted), why is the formatting misleading? On the other hand - skimming through the code, you can immediately and do not understand the true logic of the operator. Not just as programmers adhere to certain formatting styles.
private string BuildParameters () { .... if (result.Length > 0) result.Append (", "); if (p.Direction == TdsParameterDirection.InputOutput)
PVS-Studio Warning: V3043 The code's operational logic does not correspond with its formatting. The statement is indented. It is possible that curly brackets are missing.
Tds50.cs 379 The
second if statement does not relate to the first. Why then mislead people working with this code?
public void Restore () { while (saved_count < objects.Count) objects.Remove (objects.Last ().Key); referenced.Remove (objects.Last ().Key); saved_count = 0; referenced.RemoveRange (saved_referenced_count, referenced.Count - saved_referenced_count); saved_referenced_count = 0; }
PVS-Studio Warning: V3043 The code's operational logic does not correspond with its formatting. The statement is indented. It is possible that curly brackets are missing.
XamlNameResolver.cs 81
Apparently, it was planned to remove from the collections of objects and referenced values ​​corresponding to a particular key. But they forgot about braces, as a result only one value will be removed from the referenced collection . What is even more interesting is that one cannot get off with curly braces, because in this case, at each iteration of the cycle from the referenced collection, the object will be deleted not by the same key that was removed from the objects collection . This is due to the fact that at the time the Remove method is called for the referenced collection, the objects collection will already be changed, which means the Last methodwill return another item.
There were also warnings about errors with formatting that did not correspond to the logic of the program. Here are some of them:
- V3043 The code's operational logic does not correspond with its formatting. The statement is indented. It is possible that curly brackets are missing. ExpressionParser.cs 92
- V3043 The code's operational logic does not correspond with its formatting. The statement is indented. It is possible that curly brackets are missing. EcmaUrlParser.cs 80
- V3043 The code's operational logic does not correspond with its formatting. The statement is indented. It is possible that curly brackets are missing. ILParser.cs 167
Bringing an object to its own type / checking an object for compatibility with its own type
The diagnostic rule V3051 is responsible for finding similar situations . As a rule, it finds a redundant code of the form
String str; String str2 = str as String;
or
String str; if (str is String)
But there are (although rarely) much more interesting cases.
Consider the code snippet:
public string GenerateHttpGetMessage (Port port, OperationBinding obin, Operation oper, OperationMessage msg) { .... MimeXmlBinding mxb = (MimeXmlBinding) obin.Output .Extensions .Find (typeof(MimeXmlBinding)) as MimeXmlBinding; if (mxb == null) return req; .... }
PVS-Studio warning : V3051 An type cast. The object is already the mimeXmlBinding type. SampleGenerator.cs 232
It would seem - too redundant, so what? Below we check mxb for null equality , so if the type is incompatible, that's okay. It was not there. The Find method returns an instance of the Object type , after which it is explicitly cast to the MimeXmlBinding type and after that it is cast to the same type using the as operator . However, the operator of explicit, if the argument is incompatible type, does not return null (as opposed to the operator as), and throws an exception of type InvalidCastException . As a result, checking mxb == null does not help in case of unsuccessful type conversion.
The remaining warnings are not so interesting (for example, redundant casting), so I will list them in a text file .
The parameter is overwritten in the body of the method before the
fact that the method parameter is immediately overwritten is used, looks suspicious. After all, the value that came to the method is not used at all, but is simply ignored / lost. Ideally, you should fix the method signature and make the parameter a local variable. True, this approach is not always possible. For example, when implementing an interface or virtual functions.
internal static int SetErrorInfo (int dwReserved, IErrorInfo errorInfo) { int retVal = 0; errorInfo = null; .... retVal = _SetErrorInfo (dwReserved, errorInfo); .... }
PVS-Studio warning : V3061 Parameter 'errorInfo' is always rewritten. corlib-net_4_x Marshal.cs 1552
The value that came as an errorInfo parameter is not used at all. The parameter is immediately reset, and then passed to the method. In this case, it would be logical to make errorInfo a local variable (if it is possible to change the method signature).
The remaining places are similar, therefore, as before, I provide a list of warnings in a list in the file .
Invalid static member initialization
class ResXResourceWriter : IResourceWriter, IDisposable { .... public static readonly string ResourceSchema = schema; .... static string schema = ....; .... }
PVS-Studio warning : V3070 Uninitialized variable 'schema' used when initializing the 'ResourceSchema' variable. ResXResourceWriter.cs 59
The programmer wanted to set the value of the read-only public static field ResourceSchema to another static field - the schema . No error has not done. At the time of ResourceSchema initialization, the schema field will be initialized with a default value ( null in this case). It is unlikely that this is exactly what the developer wanted.
Wrong initialization of the static field, decorated with the attribute [ThreadStatic]
A rare and interesting error. Consider the code snippet:
static class Profiler { [ThreadStatic] private static Stopwatch timer = new Stopwatch(); .... }
PVS-Studio warning : V3089 Initializer of a field marked by [ThreadStatic] attribute will be called once. The field will have a default value on different threads. System.Data.Linq-net_4_x Profiler.cs 16
Decorating a field with the [ThreadStatic] attribute means that the value of this field in each thread will be unique. It seems to be nothing complicated. But the fact is that such fields cannot be initialized either during the declaration or in the static constructor. This is a great way to shoot yourself in the foot (or even both) and get a subtle mistake.
The point is, if initialization is performed during the declaration, the field will be initialized with this value only on the first accessing stream. For other threads, the field will contain a default value (in this case, null , since Stopwatch- reference type). The static constructor will also be called only once - when the first thread is accessed. Therefore, in other streams, the field will also be initialized with a default value.
The error is quite intricate, so I strongly recommend that you read the documentation for the diagnostic rule in order to prevent such situations and not to waste precious time on debugging.
Intersecting ranges
public void EmitLong (long l) { if (l >= int.MinValue && l <= int.MaxValue) { EmitIntConstant (unchecked ((int) l)); ig.Emit (OpCodes.Conv_I8); } else if (l >= 0 && l <= uint.MaxValue) { EmitIntConstant (unchecked ((int) l)); ig.Emit (OpCodes.Conv_U8); } else { ig.Emit (OpCodes.Ldc_I8, l); } }
PVS-Studio warning : V3092 Range intersections are possible within conditional expressions. Example: if (A> 0 && A <5) {...} else if (A> 3 && A <9) {...}.
mcs-net_4_x codegen.cs 742
The analyzer found the ranges intersection in expressions to be suspicious:
- l> = int.MinValue && l <= int.MaxValue
- l> = 0 && l <= uint.MaxValue
For both of these expressions, the range [0, Int32.MaxValue] is common, so if the variable l has a value lying in this range, the first condition will be satisfied, although the second one could also be satisfied.
Similar warnings:
- V3092 Range intersections are possible within conditional expressions. Example: if (A> 0 && A <5) {...} else if (A> 3 && A <9) {...}. I18N.CJK-net_4_x CP51932.cs 437
- V3092 Range intersections are possible within conditional expressions. Example: if (A> 0 && A <5) {...} else if (A> 3 && A <9) {...}. I18N.CJK-net_4_x CP932.cs 552
- V3092 Range intersections are possible within conditional expressions. Example: if (A> 0 && A <5) {...} else if (A> 3 && A <9) {...}. I18N.CJK-net_4_x ISO2022JP.cs 460
Access to a collection item by a constant index, carried out within the cycle.
The practice of giving loop counters with short names is distributed everywhere. There is nothing wrong with calling the cycle counter i or j - it is immediately clear what this variable serves as (with the exception of those cases where the counters require yet more meaningful names). But there are cases when a non-loop counter and a numeric literal are used as an index. It is clear that sometimes this is done on purpose, but sometimes it indicates an error. Diagnostic Rule V3102 looks for similar error spots.
public void ConvertGlobalAttributes ( TypeContainer member, NamespaceContainer currentNamespace, bool isGlobal) { var member_explicit_targets = member.ValidAttributeTargets; for (int i = 0; i < Attrs.Count; ++i) { var attr = Attrs[0]; if (attr.ExplicitTarget == null) continue; .... } }
PVS-Studio warning : V3102 Suspicious access to the element of the Attrs object. mcs-net_4_x attribute.cs 1272
At each iteration of the loop, the variable attr is initialized with the same value, Attrs [0] . Later, some work is done with it (properties are called, passed to the method). It is unlikely that at all iterations of the loop they wanted to work with the same value, so I suppose that the correct initialization should look like this:
var attr = Attrs[i];
Similar errors occurred in two places:
- V3102 Suspicious access to the element of the 'seq' object by a constant index inside a loop. System.Xml-net_4_x XmlQueryRuntime.cs 679
- Constant index inside a loop. System.Web-net_4_x Login.cs 1223
Insecure locks
In many verifiable projects one has to deal with the code of the form lock (this) or lock (typeof (....)) . This is not the best way to lock, as it can lead to deadlocks. But let's get everything in order.
First, look at the dangerous code:
public RegistryKey Ensure (....) { lock (typeof (KeyHandler)){ .... } }
PVS-Studio warning : V3090 Unsafe locking on a type. The type of object will be the same. corlib-net_4_x UnixRegistryApi.cs 245 A
common problem with the potential occurrence of a deadlock is that locking is performed on the same object. Therefore, the general advice that will allow to get rid of such problems is to use an object that is inaccessible from the outside as a lock object - a local variable or a private class field.
What is the problem with the typeof operator ? This statement always returns an instance of type Type, and the same for the same argument, therefore, violates the rule described above - there is the problem of blocking on the same object.
In the project code there are several such places:
- V3090 Unsafe locking on a type. The type of object will be the same. corlib-net_4_x UnixRegistryApi.cs 245
- V3090 Unsafe locking on a type. The type of object will be the same. corlib-net_4_x UnixRegistryApi.cs 261
- V3090 Unsafe locking on a type. The type of object will be the same. corlib-net_4_x UnixRegistryApi.cs 383
- V3090 Unsafe locking on a type. The type of object will be the same. corlib-net_4_x UnixRegistryApi.cs 404
- V3090 Unsafe locking on a type. All instances of a type will have the same 'Type' object. corlib-net_4_x UnixRegistryApi.cs 451
- V3090 Unsafe locking on a type. All instances of a type will have the same 'Type' object. corlib-net_4_x UnixRegistryApi.cs 469
- V3090 Unsafe locking on a type. All instances of a type will have the same 'Type' object. corlib-net_4_x UnixRegistryApi.cs 683
- V3090 Unsafe locking on a type. All instances of a type will have the same 'Type' object. corlib-net_4_x UnixRegistryApi.cs 698
- V3090 Unsafe locking on a type. All instances of a type will have the same 'Type' object. System-net_4_x NetworkChange.cs 66
- V3090 Unsafe locking on a type. All instances of a type will have the same 'Type' object. System-net_4_x NetworkChange.cs 74
- V3090 Unsafe locking on a type. The type of object will be the same. System-net_4_x NetworkChange.cs 85
- V3090 Unsafe locking on a type. The type of object will be the same. System-net_4_x NetworkChange.cs 93
The situation with the GetType () method is fundamentally no different from the one described above:
void ConfigureHttpChannel (HttpContext context) { lock (GetType()) { .... } }
PVS-Studio warning : V3090 Unsafe locking on a type. The type of object will be the same. System.Runtime.Remoting-net_4_x HttpRemotingHandlerFactory.cs 61
The GetType () method also returns an instance of the Type type , so if you block it elsewhere using the GetType () method or the typeof operator , for an object of the same type, there is a possibility of deadlock .
Separately, I would like to consider blocking on objects of type String , since in this case the situation becomes much more interesting, and errors more probable and difficult to detect.
const string Profiles_SettingsPropertyCollection = "Profiles.SettingsPropertyCollection"; .... static void InitProperties () { .... lock (Profiles_SettingsPropertyCollection) { if (_properties == null) _properties = properties; } }
PVS-Studio warning : V3090 Unsafe locking on an object of type 'String'. System.Web-net_4_x ProfileBase.cs 95
The essence of the problem remains unchanged - locking on the same object, but the details are more interesting. Objects of type String can be accessed even from another application domain (this is due to the string interning mechanism). Imagine how to debug a deadlock resulting from the fact that in different application domains used the same line as the object of the lock. The advice is brief - do not use objects of type String (and Thread, by the way, too). These and other problems associated with the use of synchronization mechanisms (as well as ways to avoid them) are described in the documentation for diagnostic rule V3090 .
Erroneous comparison
public bool Equals (CounterSample other) { return rawValue == other.rawValue && baseValue == other.counterFrequency && counterFrequency == other.counterFrequency && systemFrequency == other.systemFrequency && timeStamp == other.timeStamp && timeStamp100nSec == other.timeStamp100nSec && counterTimeStamp == other.counterTimeStamp && counterType == other.counterType; }
PVS-Studio Warning: V3112 An abnormality within similar comparisons. It is possible that a typo is currently inside the expression 'baseValue == other.counterFrequency'. System-net_4_x CounterSample.cs 139
I deliberately did not change the formatting of the code, leaving it in the same form as it was in the original source. The idea of ​​the method is clear and does not cause any problems for understanding - it simply compares the fields of the current object and the object that came as an argument. But even this seemingly simple method contains an error. I am sure that if the formatting of the code were performed better, it would be easier to notice it. This is the same code snippet, but formatted:
public bool Equals (CounterSample other) { return rawValue == other.rawValue && baseValue == other.counterFrequency &&
I think that with such formatting an error is much easier to notice - it is clear that the same field being compared is compared with different fields of the current object. As a result, the logic of the program is violated.
This code fragment is clearly visible - formatting plays an important role! And if the compiler doesn’t care how your code is formed, it’s important for programmers to avoid annoying errors and also simplify the perception of the code and understanding of the program’s logic.
How to rule all this?
Well, mistakes found. Many errors. If you count the errors described in the article, as well as those that were given in the files, it turns out 167 pieces! I want to note that these are not all erroneous / suspicious places - I just chose enough material for the article (which was not a problem). Many other errors, perhaps even the majority, left out of the scope of the article, because it was already quite voluminous.
There may be a reasonable question, how to fix it all? How to implement a static analyzer in the project?
It is unlikely that a separate team will be allocated, which will deal exclusively with the correction of errors (unless we ourselves will correct these errors, as was the case with the Unreal Engine ). It will be correct to fix the existing errors, which will be gradually corrected, and to prevent the emergence of new ones.
To solve the first problem will help the mechanism of mass suppression of warnings. This will allow differentiating old and new errors, observing the appearance and correction of errors only in the new code. Old errors are ruled separately.
Incremental analysis modeurged to solve the second problem. This mode starts the analysis on the developer's machine immediately after compilation, allowing you to detect more recent errors and correct them before entering the version control system (SLE).
Anyway, most likely a certain percentage of errors will fall into hard currency. To detect an erroneous code as quickly as possible after getting into SCR, a good solution would be to inject static analysis into nightly assemblies and use analysis results. How?
You can, for example, notify the person responsible for the project, as well as the programmer who allowed this error to get into the repository.
It is important to correct such errors immediately, until they are 'overgrown' with additional code, and the task of correction has not been complicated several times.
Combining the tips described above (introducing static analysis both on the build server and on the developers' machines) will reduce the cost of fixing errors, since with the proper organization of the process they will be corrected as quickly as possible (without reaching the testers and even more so without getting in releases).
More information about the introduction of static analysis in the development process can be found in the article " How to quickly implement static analysis in a large project? ".
Conclusion
In the comments to one of the articles they wrote: “You write that you check open-source products with your analyzer, but in fact you check your analyzer with open-source products!”. This is the truth.
Indeed, we are constantly working to improve the analyzer, and checking projects allows us to do it better — correct false positives, teach the analyzer to find new errors, etc. This side of the verification usually remains outside the scope of the article, as it is more interesting to the developers of the analyzer.
But we still check projects and, importantly, we find real mistakes there. We do not just find, but try to bring this information to the developers and all interested. And how to use this information - a personal matter of each. I think the benefits of our work for the open-source community are definitely there. Not so long ago, we passed over the mark of 10,000 errors found !
In general, our main goal is to popularize the methodology of static analysis in general and the PVS-Studio analyzer in particular. And our articles are a great way to demonstrate the benefits of this methodology.
Try PVS-Studio on your project: http://www.viva64.com/en/pvs-studio/
Of course, it is a little sad that the compiled project could not be verified, which is why a truly complete and in-depth analysis did not work. On the other hand, there was enough material for the article without it, so developers should think about correcting errors and introducing static analysis into the development process. And we are always happy to help them with this.
If you want to share this article with an English-speaking audience, then please use the link to the translation: Sergey Vasiliev. Searching for bugs in Mono: there are hundreds of them! .