
Now the PVS-Studio team is actively developing a static analyzer of C # code. We are planning to release the first version of the analyzer by the end of 2015. In the meantime, my task is to write several articles in order to interest C # programmers in advance with the PVS-Studio tool. Today I was given an updated installer. Now it became possible to install PVS-Studio with C # support and even check something. I was not slow to take advantage of this and decided to check the first thing that came to hand. The first turned up project Umbraco. Of course, so far the current version of the analyzer is not enough, but this is already enough for me to write a small article.
Umbraco
Umbraco is an open source content management system platform used to publish content to the world wide web. It is written in C #. Since the release of 4.5, the entire system has become available under the MIT license.
The project has an average size. However, the part written in C # is not very large. Most of the project is written in JavaScript. There are a total of about 3200 files with the extension ".cs", with a total size of 15 megabytes. Number of lines of C # code: 400 KLOC.
About PVS-Studio 6.00
The check will be carried out using the Alpha version of the PVS-Studio 6.00 analyzer. In this version there will be two big changes:
- The analysis of C # projects will be supported.
- The analyzer will no longer support VS2005, VS2008. For a small number of users working with VS2005 / 2008, we offer to continue to use version 5.31 or older versions 5.xx, if any errors are corrected.
Pricing policy will remain the same. We do not make a new product, but expand the capabilities of an existing one. We will simply support another programming language. Previously, you could buy PVS-Studio and use it to test projects written in languages: C, C ++,
C ++ / CLI ,
C ++ / CX . Now additionally there is an opportunity to check C # projects. It will not affect the price. Those who have already purchased the analyzer for C ++ can also check the C # code at the same time.
')
Why C #?
At conferences, I often said that doing a C # analyzer is not very interesting. Many of the errors that are present in C ++ are simply not possible in C #. It really is. For example, in C # there are no such functions as memset (), respectively, and there is no mass of problems (see examples related to memset ():
V511 ,
V512 ,
V575 ,
V579 ,
V597 ,
V598 ).
Gradually, however, I changed my mind. A large number of errors detected by PVS-Studio are not connected with some features of the programming language, but with the carelessness of programmers. I mean typos and unsuccessful code changes after copy-paste. This is exactly where the PVS-Studio analyzer for C ++ is strong, and we decided that these developments can also be applied to C #.
C # does not protect against confusing a variable name or the "
last line effect " associated with loss of attention.
Another important factor that prompted the decision to create a C # analyzer is the appearance of Roslyn. Without it, the work of creating an analyzer would be too expensive for us.
Roslyn is an open source compilation platform for C # and Visual Basic. Roslyn performs two basic actions: it builds a syntax tree (parsing) and compiles it. Additionally, it allows analyzing source code, recursively bypassing it, working with Visual Studio projects, and executing code on the fly.
What was interesting?
For C ++, I have a favorite
V501 diagnostic. Now there is its counterpart for C #: V3001. Let's start with this diagnosis.
N1 code snippetIn the code there is a “focal point” property:
[DataMember(Name = "focalPoint")] public ImageCropFocalPoint FocalPoint { get; set; }
This property is of the type 'ImageCropFocalPoint', which is defined below:
public class ImageCropFocalPoint { [DataMember(Name = "left")] public decimal Left { get; set; } [DataMember(Name = "top")] public decimal Top { get; set; } }
It would seem that when working with this property it is impossible to make a mistake. But no. In the HasFocalPoint () method, an annoying typo was made:
public bool HasFocalPoint() { return FocalPoint != null && FocalPoint.Top != 0.5m && FocalPoint.Top != 0.5m; }
'Top' is checked twice, but forgot about 'Left'.
Corresponding warning PVS-Studio: V3001 There are identical sub-expressions 'FocalPoint.Top! = 0.5m' operator. ImageCropDataSet.cs 58
N2 code snippet protected virtual void OnBeforeNodeRender(ref XmlTree sender, ref XmlTreeNode node, EventArgs e) { if (node != null && node != null) { if (BeforeNodeRender != null) BeforeNodeRender(ref sender, ref node, e); } }
PVS-Studio warning: V3001 There are no '&!' Operator. BaseTree.cs 503
The link 'node' is checked twice. Most likely they would also like to check the 'sender' link.
N3 code snippet public void Set (ExifTag key, string value) { if (items.ContainsKey (key)) items.Remove (key); if (key == ExifTag.WindowsTitle || <<<<---- key == ExifTag.WindowsTitle || <<<<---- key == ExifTag.WindowsComment || key == ExifTag.WindowsAuthor || key == ExifTag.WindowsKeywords || key == ExifTag.WindowsSubject) { items.Add (key, new WindowsByteString (key, value)); .... }
PVS-Studio Warning: V3001 There are identical sub-expressions' key == ExifTag.WindowsTitle operator. ExifPropertyCollection.cs 78
The key is compared twice with the 'ExifTag.WindowsTitle' constant. It's hard for me to judge how serious this mistake is. Perhaps one of the checks is just superfluous, and it can be removed. But perhaps there should be a comparison with some other constant.
N4 code snippetConsider another case where I am not at all sure that there is an error. However, this code should be checked once again.
There is an enumeration with 4 named constants:
public enum DBTypes { Integer, Date, Nvarchar, Ntext }
But the SetProperty () method for some reason only considers 3 options. Once again, I’m not saying that this is a mistake. However, the analyzer suggests paying attention to this code fragment and I completely agree with it.
public static Content SetProperty(....) { .... switch (((DefaultData)property.PropertyType. DataTypeDefinition.DataType.Data).DatabaseType) { case DBTypes.Ntext: case DBTypes.Nvarchar: property.Value = preValue.Id.ToString(); break; case DBTypes.Integer: property.Value = preValue.Id; break; } .... }
PVS-Studio Warning: V3002 The switch statement doesn’t cover all values ​​of the DBTypes enum: Date. ContentExtensions.cs 286
N5 code snippet public TinyMCE(IData Data, string Configuration) { .... if (p.Alias.StartsWith(".")) styles += p.Text + "=" + p.Alias; else styles += p.Text + "=" + p.Alias; .... }
PVS-Studio warning: V3004 The 'then' statement is equivalent to the 'else' statement. TinyMCE.cs 170
Code snippet N6, N7At the beginning of the article I said that C # does not protect against the "
last line effect ". And so we got to the appropriate example:
public void SavePassword(IMember member, string password) { .... member.RawPasswordValue = result.RawPasswordValue; member.LastPasswordChangeDate = result.LastPasswordChangeDate; member.UpdateDate = member.UpdateDate; }
PVS-Studio warning: V3005 The 'member.UpdateDate' variable is assigned to itself. MemberService.cs 114
The programmer copied the class members from the 'result' object into the 'member' object. But at the last moment, the person relaxed and copied the member member.UpdateDate into himself.
Additionally, it is alarming that the SavePassword () method works with passwords, and therefore requires special attention.
The exact same code snippet can be observed in the UserService.cs file (see line 269). Most likely it was simply copied there without checking it out.
N8 code snippet private bool ConvertPropertyValueByDataType(....) { if (string.IsNullOrEmpty(string.Format("{0}", result))) { result = false; return true; } .... return true; .... return true; .... return true; .... return true; .... .... return true; }
PVS-Studio Warning: V3009 This is the same way that it returns true. DynamicNode.cs 695
The method uses many 'if' statements and many 'return' statements. It is alarming that all the 'return' statements return the value 'true'. Is this a mistake? Perhaps it was necessary to return 'false' somewhere?
N9 code snippetI suggest that readers check the attention and find the error in the following code snippet. To do this, study the method, but do not read the text below. In order not to do this by accident, I inserted a separator (unicorn :).
public static string GetTreePathFromFilePath(string filePath) { List<string> treePath = new List<string>(); treePath.Add("-1"); treePath.Add("init"); string[] pathPaths = filePath.Split('/'); pathPaths.Reverse(); for (int p = 0; p < pathPaths.Length; p++) { treePath.Add( string.Join("/", pathPaths.Take(p + 1).ToArray())); } string sPath = string.Join(",", treePath.ToArray()); return sPath; }
Figure 1. Separates the code from the explanation of what the error is.PVS-Studio warning: V3010 The return value of the function 'Reverse' is required to be utilized. DeepLink.cs 19
By calling the Reverse () method, the programmer planned to change the array 'pathPaths'. Perhaps he was confused by the fact that such an operation is perfectly correct, for example, in the case of lists (
List <T> .Reverse ). However, as applied to arrays, the Reverse () method does not change the original array. For arrays, this method is implemented by the
Reverse () extension method of the 'Enumerable' class. This method does not perform a permutation in place, but returns a modified collection.
It will be correct to write:
string[] pathPaths = filePath.Split('/'); pathPaths = pathPaths.Reverse().ToArray();
Or even like this:
string[] pathPaths = filePath.Split('/').Reverse().ToArray();
N10 code snippetThe PVS-Studio analyzer issued several warnings V3013, about the suspicious coincidence of the bodies of some methods. In my opinion, all these warnings are false. It seems to me that only one of these warnings deserves attention:
public void GetAbsolutePathDecoded(string input, string expected) { var source = new Uri(input, UriKind.RelativeOrAbsolute); var output = source.GetSafeAbsolutePathDecoded(); Assert.AreEqual(expected, output); } public void GetSafeAbsolutePathDecoded(string input, string expected) { var source = new Uri(input, UriKind.RelativeOrAbsolute); var output = source.GetSafeAbsolutePathDecoded(); Assert.AreEqual(expected, output); }
PVS-Studio warning: V3013 It is odd that the body of the GetAbsolutePathDecoded function is fully equivalent to the body of the GetSafeAbsolutePathDecoded function. UriExtensionsTests.cs 141
Perhaps inside the GetAbsolutePathDecoded () method, you need to use not
source.GetSafeAbsolutePathDecoded()
but
source. GetAbsolutePathDecoded()
I'm not sure what's right, but this place deserves a check.
Answers on questions
The article is intended for a new audience. Therefore, I foresee a number of questions that may arise. I will try to answer in advance some of them.
Have you notified the project developers about the defects found?Yes, we always try to do it.
Do you use PVS-Studio to check the code of PVS-Studio itself?Yes.
PVS-Studio supports Mono?Not.
The answers to these and other questions are given in more detail in the note: "
Answers to questions from readers of articles about PVS-Studio ".
Conclusion
There were not so many mistakes in the project. Our readers from the C ++ audience already know why this is happening. But since we only have to conquer and seduce C # programmers, I will list a number of important points:
- Static analyzer is a tool for regular use. Its meaning is to find errors at a very early stage. There is no point in one-time checks of the project. As a rule, errors are detected that do not have a significant impact on the operation of the program or are located in rarely used sections of code. The reason is that real mistakes have been corrected by sweat and blood all this time. Programmers found them, debugging the code for hours , testers found them, and even worse, users found them. Many of these errors could be fixed immediately if you regularly use a static code analyzer. Consider PVS-Studio as an extension to the C # compiler warnings. I hope you see the list of warnings issued by the compiler more than once a year? All this is discussed in more detail in the article: " Leo Tolstoy and Static Code Analysis ".
- In the articles we mention only those code fragments that seemed interesting to us. As a rule, we don’t write about the code that the analyzer quite honestly considered suspicious, but it is clear that there is no real error. We call this "wrap code." When you use PVS-Studio, this code is worth checking. But to talk about such places in the article is inappropriate.
- This item is not for C ++, but it still exists for C #. We have implemented not so many diagnostics yet, but we are moving fast. Give our C # -unice a bit of a raise. Oh, he will show you then!
Thank you all for your attention and wish you all reckless programs.
If you want to share this article with an English-speaking audience, then please use the link to the translation: Andrey Karpov.
The First C # Project Analyzed .