Checking the source code of Roslyn

From time to time we return to projects that have already been tested with PVS-Studio and wrote articles about it. It is interesting to do this for two reasons. First, to understand how much better our analyzer has become. Secondly, in order to track whether the project authors paid attention to our article, as well as to the error report that we usually give them. Of course, errors can be corrected without our participation. But it is always nice when exactly our efforts help to make a project better. Roslyn was no exception. The previous article on the verification of this project dates back to December 23, 2015. This is quite a long time, taking into account the path that our analyzer has done in its development during this time. Roslyn is of additional interest to us personally because it is based on the C # kernel of the PVS-Studio analyzer. Thus, we are very interested in the quality of the code of this project. We will arrange for him to re-check and find out what is new and interesting (but let's hope that nothing substantial) will be able to find PVS-Studio there.

Roslyn (or the .NET Compiler Platform) is probably familiar to many of our readers. In short, this is a set of open source compilers and an API for analyzing code for C # and Visual Basic .NET from Microsoft. The source code for the project is available on GitHub .

I will not give a detailed description of this platform, and I recommend everyone interested in an article by my colleague Sergey Vasiliev " Introduction to Roslyn. Use for developing static analysis tools ." From this article, you can learn not only about the features of the Roslyn architecture, but also how exactly we use this platform.

As I mentioned earlier, since the writing of the last article by my colleague Andrei Karpov about checking Roslyn " PVS-Studio 6.00 New Year Release: checking Roslyn " more than three years have passed. During this time, the C # PVS-Studio analyzer has acquired many new features. In general, Andrei’s article was a kind of “trial ball”, because the C # analyzer was only added to PVS-Studio at that time. Despite this, even then, in an unconditionally qualitative project, Roslyn managed to find interesting errors. What has changed in the analyzer for C # code so far, which will potentially allow for a deeper analysis?
')
Since then, both the analyzer core and the infrastructure have evolved. Support for Visual Studio 2017 and Roslyn 2.0 has been added, as well as deep integration with MSBuild. You can read more about our approach to integration with MSBuild and about the reasons that prompted us to adopt it, see my colleague Pavel Yeremeyev’s article " Visual Studio 2017 and Roslyn 2.0 support in PVS-Studio: sometimes using ready-made solutions is not as easy as it seems at first glance . "

Now we are actively working on the transition to Roslyn 3.0 in the same way that we originally supported Visual Studio 2017, that is, through our own toolset, which comes in the PVS-Studio distribution with a “stub” in the form of an empty MSBuild.exe file. Despite the fact that it looks like a “crutch” (the MSBuild API is not very friendly to be reused in third-patry projects due to the low portability of the libraries), this approach has already helped us to survive several Roslyn updates relatively painlessly during the life of Visual Studio 2017, and now, albeit with a large number of overlays, survive the upgrade to Visual Studio 2019, and also maintain full backward compatibility and performance on systems with older versions of MSBuild.

The analyzer core has also undergone a number of improvements. One of the main innovations is a full-fledged interprocedural analysis taking into account the input and output values of the methods, with a calculation, depending on these parameters, on the reachability of the execution branches and cusps.

The task of tracking parameters inside methods is already close to completion, while auto-annotations are preserved for what happens to these parameters (for example, potentially dangerous dereference). This will allow for any diagnostics using the data-flow mechanism to take into account dangerous situations that occur when a parameter is passed to a method. Previously, when analyzing such dangerous places, a warning was not generated, since we could not know all the possible input values to such a method. Now we can detect danger, since in all places of the call of this method these input parameters will be taken into account.

Note: you can familiarize yourself with the basic analyzer mechanisms, such as data-flow and others, from the article " Technologies used in the PVS-Studio code analyzer for searching errors and potential vulnerabilities ."

Interprocedural analysis in PVS-Studio C # is not limited by either input parameters or depth. The only limitation is virtual methods in the classes that are not closed for inheritance and falling into recursion (we’ll stop when we saw on the stack the repeated call of the calculated method). In this case, the recursive method itself will ultimately be calculated assuming that the return value of its self-recursion is unknown.

Another major innovation in the C # analyzer was the consideration of the dereferencing of the potentially null pointer. Previously, the analyzer swore at a possible null reference exception if it was certain that in all execution branches the value of the variable would accept null. Of course, he was sometimes mistaken, so the V3080 diagnostics had previously been called the potential null reference.

Now the analyzer remembers that the variable could be null in one of the execution branches (for example, under a certain condition in if). If he sees access to such a variable without checking, he will display a message V3080, but at a lower level of importance than if he sees null in all branches. In combination with an improved interprocedural analysis, this mechanism allows finding very difficult to detect errors. An example is a long chain of method calls, the last of which is not familiar to you, and which, for example, in certain circumstances returns null in catch, but you did not protect against this, because you simply did not know. In this case, the analyzer swears only when it accurately sees the destination null. In our opinion, this qualitatively distinguishes our approach from such an innovation of C # 8.0 as the nullable reference type, which, in fact, boils down to setting null checks in each method. We offer an alternative - to do checks only where null can really come, and our analyzer is now able to look for such situations.

So, without delaying a debt, let's move on to the “debriefing” —analyzing the results of the Roslyn check. First, consider the errors found due to the innovations described above. In general, quite a lot of warnings were issued for the Roslyn code this time. I think this is due to the fact that the platform is actively developing (the volume of the code base at the moment is about 2,770,000 lines of code excluding empty ones), and we have not done an analysis of this project for a long time. Nevertheless, there are not so many critical errors, namely, they are of interest for the article. And yes, Roslyn has quite a lot of tests that I, as usual, excluded from checking.

I’ll start with the errors V3080, with the criticality level Medium, in which the analyzer found possible access via the zero reference, but not in all possible cases (code branches).

Possible null dereference - Medium

V3080 Possible null dereference. Consider inspecting 'current'. CSharpSyntaxTreeFactoryService.PositionalSyntaxReference.cs 70

private SyntaxNode GetNode(SyntaxNode root) { var current = root; .... while (current.FullSpan.Contains(....)) // <= { .... var nodeOrToken = current.ChildThatContainsPosition(....); .... current = nodeOrToken.AsNode(); // <= } .... } public SyntaxNode AsNode() { if (_token != null) { return null; } return _nodeOrParent; }

Consider the GetNode method. The analyzer considers that it is possible to access by zero reference in the condition of the while block. In the body of the while block, the current variable will be assigned a value - the result of the AsNode method. And this value in some cases will be null . A good example of interprocedural analysis in action.

Now consider a similar case in which the interprocedural analysis was performed through two method calls.

V3080 Possible null dereference. Consider inspecting the 'directory'. CommonCommandLineParser.cs 911

 private IEnumerable<CommandLineSourceFile> ExpandFileNamePattern(string path, string baseDirectory, ....) { string directory = PathUtilities.GetDirectoryName(path); .... var resolvedDirectoryPath = (directory.Length == 0) ? // <= baseDirectory : FileUtilities.ResolveRelativePath(directory, baseDirectory); .... } public static string GetDirectoryName(string path) { return GetDirectoryName(path, IsUnixLikePlatform); } internal static string GetDirectoryName(string path, bool isUnixLike) { if (path != null) { .... } return null; }

The variable directory in the body of the ExpandFileNamePattern method gets the value from the GetDirectoryName (string) method. That, in turn, will return the result of the overloaded GetDirectoryName (string, bool) method, the value of which can be null . Since the variable directory is used in the body of the ExpandFileNamePattern method without first checking for null equality, we can talk about the validity of issuing a warning by the analyzer. This is a potentially unsafe design.

Another piece of code with error V3080, more precisely, immediately with two errors issued for one line of code. Here an interprocedural analysis is not needed.

V3080 Possible null dereference. Consider inspecting 'spanStartLocation'. TestWorkspace.cs 574

V3080 Possible null dereference. Consider inspecting 'spanEndLocationExclusive'. TestWorkspace.cs 574

 private void MapMarkupSpans(....) { .... foreach (....) { .... foreach (....) { .... int? spanStartLocation = null; int? spanEndLocationExclusive = null; foreach (....) { if (....) { if (spanStartLocation == null && positionInMarkup <= markupSpanStart && ....) { .... spanStartLocation = ....; } if (spanEndLocationExclusive == null && positionInMarkup <= markupSpanEndExclusive && ....) { .... spanEndLocationExclusive = ....; break; } .... } .... } tempMappedMarkupSpans[key]. Add(new TextSpan( spanStartLocation.Value, // <= spanEndLocationExclusive.Value - // <= spanStartLocation.Value)); } } .... }

The variables spanStartLocation and spanEndLocationExclusive are of type nullable int and initialized to null . Further, in the code, they can be assigned values, but only under certain conditions. In some cases, their value will remain null . Further, in the code, these variables are accessed by reference without first checking for null equality, as indicated by the analyzer.

The Roslyn code contains quite a few such errors, more than 100. Often, the pattern of these errors is the same. There is some general method that potentially returns null . The result of this method is used in a large number of places, sometimes through dozens of intermediate method calls or additional checks. It is important to understand that these errors are not fatal, but they can potentially lead to access via a null link. And to detect such errors is very difficult. Therefore, in some cases, you should think about refactoring the code, in which an exception would be thrown if the null were returned. Otherwise, you can only secure your code with total checks, which is quite tedious and unreliable. Of course, in each specific case, the decision must be made on the basis of the project specifics.

Note. It happens that at the moment there are no situations (input data) in which the method returns null and there is no real error. However, this code is still not reliable, because everything can change when you make changes to the code.

To close the topic with the V3080 , let's take a look at the obvious errors with the High confidence level when access via the zero link is more likely or even unavoidable.

Possible null dereference - High

V3080 Possible null dereference. Consider inspecting 'collectionType.Type'. AbstractConvertForToForEachCodeRefactoringProvider.cs 137

 public override async Task ComputeRefactoringsAsync(CodeRefactoringContext context) { .... var collectionType = semanticModel.GetTypeInfo(....); if (collectionType.Type == null && collectionType.Type.TypeKind == TypeKind.Error) { return; } .... }

Due to a typo in the condition (instead of the operator || used && ), the code does not work as intended, and access to the collectionType.Type variable will be executed if it is null . The condition needs to be corrected as follows:

 if (collectionType.Type == null || collectionType.Type.TypeKind == TypeKind.Error) ....

By the way, the second scenario is possible: in the first part of the condition, the operators == and ! = Were confused. Then the corrected code would look like:

 if (collectionType.Type != null && collectionType.Type.TypeKind == TypeKind.Error) ....

This version of the code is less logical, but also corrects the error. The final decision for the authors of the project.

Another similar error.

V3080 Possible null dereference. Consider inspecting 'action'. TextViewWindow_InProc.cs 372

 private Func<IWpfTextView, Task> GetLightBulbApplicationAction(....) { .... if (action == null) { throw new InvalidOperationException( $"Unable to find FixAll in {fixAllScope.ToString()} code fix for suggested action '{action.DisplayText}'."); } .... }

An error was made when generating a message for an exception. This attempts to access the action.DisplayText property via the action variable, which is obviously null .

And the last error V3080 level High.

V3080 Possible null dereference. Consider inspecting 'type'. ObjectFormatterHelpers.cs 91

 private static bool IsApplicableAttribute( TypeInfo type, TypeInfo targetType, string targetTypeName) { return type != null && AreEquivalent(targetType, type) || targetTypeName != null && type.FullName == targetTypeName; }

The method is small, so I cite its entire code. The condition in the return block is incorrect. In some cases, NullReferenceException may be thrown when accessing type.FullName . I use brackets (here they will not change behavior) to clarify the situation:

 return (type != null && AreEquivalent(targetType, type)) || (targetTypeName != null && type.FullName == targetTypeName);

That is, in accordance with the priority of operations, this code will work. If the type variable is equal to null , we will end up in an else-check, where, after verifying that the targetTypeName variable is null , we use the null reference type . You can fix the code, for example, like this:

 return type != null && (AreEquivalent(targetType, type) || targetTypeName != null && type.FullName == targetTypeName);

I think this is where you can complete the study of the errors of V3080 and see what else was interesting in finding the PVS-Studio analyzer in Roslyn code.

Typo

V3005 The 'SourceCodeKind' variable is assigned to itself. DynamicFileInfo.cs 17

 internal sealed class DynamicFileInfo { .... public DynamicFileInfo( string filePath, SourceCodeKind sourceCodeKind, TextLoader textLoader, IDocumentServiceProvider documentServiceProvider) { FilePath = filePath; SourceCodeKind = SourceCodeKind; // <= TextLoader = textLoader; DocumentServiceProvider = documentServiceProvider; } .... }

Due to unsuccessful variable names, a typo was made in the constructor of the DynamicFileInfo class. The SourceCodeKind field is assigned its own value, instead of using the sourceCodeKind parameter. To minimize the likelihood of such errors, it is recommended to use an underscore prefix for parameter names in such cases. I will give an example of a revised version of the code:

 public DynamicFileInfo( string _filePath, SourceCodeKind _sourceCodeKind, TextLoader _textLoader, IDocumentServiceProvider _documentServiceProvider) { FilePath = _filePath; SourceCodeKind = _sourceCodeKind; TextLoader = _textLoader; DocumentServiceProvider = _documentServiceProvider; }

Inattention

V3006 The object was not used. The 'throw' keyword could be missing: new InvalidOperationException (FOO). ProjectBuildManager.cs 61

 ~ProjectBuildManager() { if (_batchBuildStarted) { new InvalidOperationException("ProjectBuilderManager.Stop() not called."); } }

Under certain conditions, the destructor should throw an exception, but this does not happen, and the exception object is simply created. The throw keyword was missing. Corrected code:

 ~ProjectBuildManager() { if (_batchBuildStarted) { throw new InvalidOperationException("ProjectBuilderManager.Stop() not called."); } }

The issue of working with destructors in C # and throwing exceptions out of them is a topic for a separate discussion that is beyond the scope of this article.

When the result is not important

A number of V3009 warnings were received for methods that in all cases return the same value. Sometimes this is not critical, or in the calling code the return value is simply not checked. I missed such warnings. But a few code snippets seemed suspicious to me. I will give one of them:

V3009 It's one of those true values. GoToDefinitionCommandHandler.cs 62

 internal bool TryExecuteCommand(....) { .... using (context.OperationContext.AddScope(....)) { if (....) { return true; } } .... return true; }

The TryExecuteCommand method returns only true , and nothing but true . In this case, in the calling code, the return value is involved in some checks:

 public bool ExecuteCommand(....) { .... if (caretPos.HasValue && TryExecuteCommand(....)) { .... } .... }

It is difficult to say how dangerous this behavior is. But if the result is not needed, it may be worth replacing the return type with void and make minimal edits to the calling method. This will make the code more understandable and safer.

Other similar warnings:

V3009 It's one of those true values. CommentUncommentSelectionCommandHandler.cs 86
V3009 It's one of those true values. RenameTrackingTaggerProvider.RenameTrackingCommitter.cs 99
V3009 It's one of those true values. JsonRpcClient.cs 138
V3009 It's one of those true values. AbstractFormatEngine.OperationApplier.cs 164
V3009 It’s odd that this method returns one. TriviaDataFactory.CodeShapeAnalyzer.cs 254
V3009 It's one of those true values. ObjectList.cs 173
V3009 It's one of those true values. ObjectList.cs 249

Not checked

V3019 Possibly is not variable after type conversion using 'as' keyword. Check variables 'value', 'valueToSerialize'. RoamingVisualStudioProfileOptionPersister.cs 277

 public bool TryPersist(OptionKey optionKey, object value) { .... var valueToSerialize = value as NamingStylePreferences; if (value != null) { value = valueToSerialize.CreateXElement().ToString(); } .... }

The value variable is cast to the NamingStylePreferences type. The problem is the next test. Even if the value variable is not null, it does not guarantee that the type conversion was successful and the valueToSerialize is not null . A NullReferenceException exception may be thrown . The code needs to be corrected as follows:

 var valueToSerialize = value as NamingStylePreferences; if (valueToSerialize != null) { value = valueToSerialize.CreateXElement().ToString(); }

And one more similar error.

V3019 Possibly is not variable after type conversion using 'as' keyword. Check variables 'columnState', 'columnState2'. StreamingFindUsagesPresenter.cs 181

 private void SetDefinitionGroupingPriority(....) { .... foreach (var columnState in ....) { var columnState2 = columnState as ColumnState2; if (columnState?.Name == // <= StandardTableColumnDefinitions2.Definition) { newColumns.Add(new ColumnState2( columnState2.Name, // <= ....)); } .... } .... }

The variable columnState is called ColumnState2 . However, the result of the operation, the variable columnState2 , is no longer checked for null equality. Instead, the columnState variable is checked with the conditional null operator. Why is this code dangerous? As in the previous example, type casting with the as operator can fail, and the variable columnState2 will be null , which will result in a further exception being thrown. By the way, there may be a typo. Pay attention to the condition in the if block. Perhaps instead of columnState? .Name, they wanted to write columnState2? .Name . This is very likely if you take into account the rather unsuccessful variable names columnState and columnState2.

Redundant checks

Quite a large number of warnings (over 100) were issued for non-critical, but potentially unsafe constructions associated with redundant checks. For example, give one of them.

V3022 Expression 'navInfo == null' is always false. AbstractSyncClassViewCommandHandler.cs 101

 public bool ExecuteCommand(....) { .... IVsNavInfo navInfo = null; if (symbol != null) { navInfo = libraryService.NavInfoFactory.CreateForSymbol(....); } if (navInfo == null) { navInfo = libraryService.NavInfoFactory.CreateForProject(....); } if (navInfo == null) // <= { return true; } .... } public IVsNavInfo CreateForSymbol(....) { .... return null; } public IVsNavInfo CreateForProject(....) { return new NavInfo(....); }

Perhaps there is no real error here. Just a good reason to demonstrate a bunch of technologies "interprocedural analysis + data flow analysis" in action. The analyzer considers that the second check navInfo == null is redundant. Indeed, before this, the value for the assignment of navInfo will be obtained from the libraryService.NavInfoFactory.CreateForProject method, which constructs and returns a new object of the NavInfo class. But not null . The question arises, why didn't the analyzer issue a warning for the first check navInfo == null ? There is an explanation. Firstly, if the variable symbol turns out to be null , then the value of navInfo will remain a null reference. Secondly, even if navInfo gets a value from the libraryService.NavInfoFactory.CreateForSymbol method, this value can be null . Thus, the first check navInfo == null is really necessary.

Not enough checks

And now the situation is the reverse of the above. Several warnings V3042 was received for the code, which can be accessed by zero link. In this case, only one or two small checks could fix everything.

Consider one interesting code snippet that contains two similar errors.

V3042 Possible NullReferenceException. The '?.' and '.' binder_Expressions.cs 7770

V3042 Possible NullReferenceException. The '?.' and '.' binder_Expressions.cs 7776 operators are used

 private BoundExpression GetReceiverForConditionalBinding( ExpressionSyntax binding, DiagnosticBag diagnostics) { .... BoundExpression receiver = this.ConditionalReceiverExpression; if (receiver?.Syntax != // <= GetConditionalReceiverSyntax(conditionalAccessNode)) { receiver = BindConditionalAccessReceiver(conditionalAccessNode, diagnostics); } var receiverType = receiver.Type; // <= if (receiverType?.IsNullableType() == true) { .... } receiver = new BoundConditionalReceiver(receiver.Syntax, 0, // <= receiverType ?? CreateErrorType(), hasErrors: receiver.HasErrors) // <= { WasCompilerGenerated = true }; return receiver; }

The variable receiver may be null . The author of the code knows this because it uses the conditional null operator in the if block condition to access the receiver? .Syntax . Further, in the code, the variable receiver is used without any checks to access receiver.Type , receiver.Syntax and receiver.HasErrors . These errors need to be corrected:

 private BoundExpression GetReceiverForConditionalBinding( ExpressionSyntax binding, DiagnosticBag diagnostics) { .... BoundExpression receiver = this.ConditionalReceiverExpression; if (receiver?.Syntax != GetConditionalReceiverSyntax(conditionalAccessNode)) { receiver = BindConditionalAccessReceiver(conditionalAccessNode, diagnostics); } var receiverType = receiver?.Type; if (receiverType?.IsNullableType() == true) { .... } receiver = new BoundConditionalReceiver(receiver?.Syntax, 0, receiverType ?? CreateErrorType(), hasErrors: receiver?.HasErrors) { WasCompilerGenerated = true }; return receiver; }

You also need to make sure that the BoundConditionalReceiver constructor supports getting null values for its parameters, or perform additional refactoring.

Other similar errors:

V3042 Possible NullReferenceException. The '?.' and '.' SyntaxGeneratorExtensions_Negate.cs 240 operators are used for accessing members of the 'containingType' object
V3042 Possible NullReferenceException. The '?.' and '.' ExpressionSyntaxExtensions.cs 349
V3042 Possible NullReferenceException. The '?.' and '.' ExpressionSyntaxExtensions.cs 349

Error in condition

V3057 The 'Substring' function could receive the '-1' value while non-negative value is expected. Inspect the second argument. CommonCommandLineParser.cs 109

 internal static bool TryParseOption(....) { .... if (colon >= 0) { name = arg.Substring(1, colon - 1); value = arg.Substring(colon + 1); } .... }

If the colon variable is 0, which is conditional in the code, the Substring method will throw an ArgumentOutOfRangeException . Correction required:

 if (colon > 0)

A typo is possible

V3065 Parameter 't2' is not utilized inside method's body. CSharpCodeGenerationHelpers.cs 84

 private static TypeDeclarationSyntax ReplaceUnterminatedConstructs(....) { .... var updatedToken = lastToken.ReplaceTrivia(lastToken.TrailingTrivia, (t1, t2) => { if (t1.Kind() == SyntaxKind.MultiLineCommentTrivia) { var text = t1.ToString(); .... } else if (t1.Kind() == SyntaxKind.SkippedTokensTrivia) { return ReplaceUnterminatedConstructs(t1); } return t1; }); .... }

Two parameters are passed to the lambda expression: t1 and t2. However, only t1 is used. It looks suspicious, considering how easy it is to make mistakes when using variables with such names.

Inattention

V3083 Unsafe invocation of event 'TagsChanged', NullReferenceException is possible. Consider assigning an event to a local variable before invoking it. PreviewUpdater.Tagger.cs 37

 public void OnTextBufferChanged() { if (PreviewUpdater.SpanToShow != default) { if (TagsChanged != null) { var span = _textBuffer.CurrentSnapshot.GetFullSpan(); TagsChanged(this, new SnapshotSpanEventArgs(span)); // <= } } }

The TagsChanged event is called insecure. Between checking for equality of null and calling the event, it can be unsubscribed, then an exception will be thrown. Moreover, in the body of the if block, some other operations are performed immediately before the event is called. I called this error “Inattention” because in other places in the code with this event they work more accurately, for example, like this:

 private void OnTrackingSpansChanged(bool leafChanged) { var handler = TagsChanged; if (handler != null) { var snapshot = _buffer.CurrentSnapshot; handler(this, new SnapshotSpanEventArgs(snapshot.GetFullSpan())); } }

Using the optional handler variable eliminates the problem. In the OnTextBufferChanged method, you need to make edits for the same safe work with the event.

Intersecting ranges

V3092 Range intersections are possible within conditional expressions. Example: if (A> 0 && A <5) {...} else if (A> 3 && A <9) {...}. ILBuilderEmit.cs 677

 internal void EmitLongConstant(long value) { if (value >= int.MinValue && value <= int.MaxValue) { .... } else if (value >= uint.MinValue && value <= uint.MaxValue) { .... } else { .... } }

For better understanding, I will rewrite this code snippet, replacing the names of constants with their actual values:

 internal void EmitLongConstant(long value) { if (value >= -2147483648 && value <= 2147483648) { .... } else if (value >= 0 && value <= 4294967295) { .... } else { .... } }

Probably there is no real error here, but the condition looks strange. The second part ( else if ) will be executed only for the value range from 2147483648 + 1 to 4294967295.

A couple of similar warnings:

V3092 Range intersections are possible within conditional expressions. Example: if (A> 0 && A <5) {...} else if (A> 3 && A <9) {...}. LocalRewriter_Literal.cs 109
V3092 Range intersections are possible within conditional expressions. Example: if (A> 0 && A <5) {...} else if (A> 3 && A <9) {...}. LocalRewriter_Literal.cs 66

More about tests for equality of null (or their absence)

Several errors V3095 about checking the variable for equality of null after use. The first is ambiguous, consider the code.

V3095 The 'displayName' object was used before it was verified against null. Check lines: 498, 503. FusionAssemblyIdentity.cs 498

 internal static IAssemblyName ToAssemblyNameObject(string displayName) { if (displayName.IndexOf('\0') >= 0) { return null; } Debug.Assert(displayName != null); .... }

It is assumed that the displayName reference may be null. To do this, check Debug.Assert . It is not clear why it comes after the use of the line. It should also be noted that for other configurations other than Debug, the compiler will remove Debug.Assert from the code altogether. Does this mean that only for Debug is it possible to get a null link? And if this is not the case, then why wasn’t a string.IsNullOrEmpty (string) check done , for example. These are questions to the authors of the code.

The following error is more obvious.

V3095 The 'scriptArgsOpt' object was used before it was verified against null. Check lines: 321, 325. CommonCommandLineParser.cs 321

 internal void FlattenArgs(...., List<string> scriptArgsOpt, ....) { .... while (args.Count > 0) { .... if (parsingScriptArgs) { scriptArgsOpt.Add(arg); // <= continue; } if (scriptArgsOpt != null) { .... } .... } }

I think this code snippet does not require explanations. I will give a corrected version:

 internal void FlattenArgs(...., List<string> scriptArgsOpt, ....) { .... while (args.Count > 0) { .... if (parsingScriptArgs) { scriptArgsOpt?.Add(arg); continue; } if (scriptArgsOpt != null) { .... } .... } }

In the Roslyn code there were 15 more similar errors:

V3095 The 'LocalFunctions' object was used before it was verified against null. Check lines: 289, 317. ControlFlowGraphBuilder.RegionBuilder.cs 289
V3095 The 'resolution.OverloadResolutionResult' object was used against null. Check lines: 579, 588. Binder_Invocation.cs 579
V3095 The 'resolution.MethodGroup' object was verified against it. Check lines: 592, 621. Binder_Invocation.cs 592
V3095 The 'touchedFilesLogger' object was verified against null. Check lines: 111, 126. CSharpCompiler.cs 111
V3095 The 'newExceptionRegionsOpt' object was verified against it. Check lines: 736, 743. AbstractEditAndContinueAnalyzer.cs 736
V3095 The 'symbol' object was used against it. Check lines: 422, 427. AbstractGenerateConstructorService.Editor.cs 422
V3095 The '_state.BaseTypeOrInterfaceOpt' object was verified against null. Check lines: 132, 140. AbstractGenerateTypeService.GenerateNamedType.cs 132
It was verified against null. Check lines: 232, 233. ProjectUtil.cs 232
V3095 The 'languages' object was verified against null. Check lines: 22, 28. ExportCodeCleanupProvider.cs 22
V3095 The 'memberType' object was used before it was verified against null. Check lines: 183, 184. SyntaxGeneratorExtensions_CreateGetHashCodeMethod.cs 183
V3095 The 'validTypeDeclarations' object was used before it was verified against null. Check lines: 223, 228. SyntaxTreeExtensions.cs 223
V3095 The 'text' object was used before it was verified against null. Check lines: 376, 385. MSBuildWorkspace.cs 376
V3095 The 'nameOrMemberAccessExpression' object was used before it was verified against null. Check lines: 206, 223. CSharpGenerateTypeService.cs 206
V3095 The 'simpleName' object was used before it was verified against null. Check lines: 83, 85. CSharpGenerateMethodService.cs 83
V3095 The 'option' object was used before it was verified against null. Check lines: 23, 28. OptionKey.cs 23

Consider the error V3105 . Here, the conditional null operator is used to initialize the variable, but later in the code the variable is used without checking for equality of null .

Two warnings signal the next error.

V3105 The 'documentId' variable was used after it was assigned through the null-conditional operator. NullReferenceException is possible. CodeLensReferencesService.cs 138

V3105 The 'documentId' variable was used through the null-conditional operator. NullReferenceException is possible. CodeLensReferencesService.cs 139

 private static async Task<ReferenceLocationDescriptor> GetDescriptorOfEnclosingSymbolAsync(....) { .... var documentId = solution.GetDocument(location.SourceTree)?.Id; return new ReferenceLocationDescriptor( .... documentId.ProjectId.Id, documentId.Id, ....); }

The documentId variable can be initialized to null . As a result, the creation of the ReferenceLocationDescriptor object will result in an exception being thrown. The code needs to be fixed:

 return new ReferenceLocationDescriptor( .... documentId?.ProjectId.Id, documentId?.Id, ....);

Also further in the code it is necessary to provide the possibility of equality of null variables passed to the constructor.

Other similar code errors:

V3105 The 'symbol' variable was used after it was assigned through the null-conditional operator. NullReferenceException is possible. SymbolFinder_Hierarchy.cs 44
V3105 The 'symbol' variable was used after it was assigned through the null-conditional operator. NullReferenceException is possible. SymbolFinder_Hierarchy.cs 51

Priorities and brackets

V3123 Perhaps the '?:' Operator works in a different way than it was expected. It is a priority. Edit.cs 70

 public bool Equals(Edit<TNode> other) { return _kind == other._kind && (_oldNode == null) ? other._oldNode == null : _oldNode.Equals(other._oldNode) && (_newNode == null) ? other._newNode == null : _newNode.Equals(other._newNode); }

The condition in the return block is not calculated exactly as the developer thought. It was assumed that the first condition would be _kind == other._kin d (therefore after this condition a line break is made), and then blocks of conditions with the operator " ? " Will be calculated sequentially . In fact, the first condition is _kind == other._kind && (_oldNode == null) . This is due to the fact that the && operator has a higher priority than the " ? " Operator . To correct the error, you need to bracket all the expressions of the operator " ? ":

 return _kind == other._kind && ((_oldNode == null) ? other._oldNode == null : _oldNode.Equals(other._oldNode)) && ((_newNode == null) ? other._newNode == null : _newNode.Equals(other._newNode));

This concludes the description of the errors found.

findings

Despite the significant number of errors that I was able to detect, in terms of the size of the Roslyn project code (2,770,000 lines), this would be a rather insignificant amount. Like Andrew in the previous article, I am also ready to recognize the high quality of this project.

I want to note that such episodic checks of the code have nothing to do with the methodology of static analysis and practically do not bring any benefit. Static analysis should be applied regularly, not occasionally. Then many errors will be corrected at the earliest stages, and, therefore, the cost of correcting them will be ten times lower. In more detail, this idea is presented in this small note , which I beg you to read.

You can search for any other errors yourself, either in the considered project or in any other. For this you just need to download and try our analyzer.

If you want to share this article with an English-speaking audience, then please use the link to the translation: Sergey Khrenov. Checking the Roslyn Source Code .

Source: https://habr.com/ru/post/446592/

All Articles

Checking the source code of Roslyn

More articles: