Checking a Microsoft Orleans project with PVS-Studio

Introduction

Good day to all.

Initially, a small Disclaimer for doubters: yes, for this post, I might get a license for PVS-Studio to check the open-source Microsoft Orleans project. And maybe I will not get it, as the chip will fall, sir. No, I was not directly connected with the company "Program Verification Company" in any way and wrote this post on my own initiative.

And now let's get to the bottom line.
')
PVS-Studio 6.0 , as stated by the official website of the company, is a static code analyzer focused on ease of use and searching for errors at the stage of writing code.

And relatively recently, the company has released a version that supports checking for C # projects. Than we actually will check the Microsoft Orleans project.

By the way, the PVS-Studio team also checked the Orleans project for detected errors, but I was a little ahead of them and they kindly provided me with their KDPV ("picture to attract attention") with the invariably pleasing unicorn.

PVS-Unicorn-In-Clouds

What is the project Orleans, Virtual actors and what is their profit.

Microsoft Orleans is a framework that is built on the concept of virtual actors. The Orleans terminology is slightly different from other similar frameworks (Akka.Net, Service Fabric, Erlang Actors): The actors are called Grains (aka grain), and the servers participating in the cluster are Silo .

The advantage of virtual actors is that at any moment your code can get a proxy from Orleans Runtime to access a specific Grain via its interface and Id. When calling a proxy method, a message is sent to a cluster of several servers, where it will be delivered to the Grain with the specified Id. Runtime guarantees that such a Grain will be created in a single copy on one of the servers and subsequent calls will be delivered to it. Runtime also automatically deletes from the memory (deactivates) Grains that did not receive calls for a specified time, thus constantly collecting garbage. If the server on which your grain was previously, went offline - Runtime will quickly raise an instance on another one and you will not notice anything, except for a small delay. If the call should come on the same server - Runtime optimizes this and the call will be local.

Actually, the profit of virtual actors is that it all scales very easily in the clouds:
Silo ping each other and determine when someone is not available, redistribute the grains of the "fallen comrade" among themselves and all that. The cluster happily lives and is available, as long as there is at least one active Silo, and when new members are connected, they will receive their range of "grains" and begin actively processing requests. And you write your code using the usual OOP approach. It turns out this, as it were, "Distributed C # /. NET".

Even the runtime itself guarantees that when the method is performed on grain, no other call will come to the same grain, i.e. Runtime guarantees single-threaded execution of your code, which allows you to think less about all non-thread-safe situations and focus more on writing useful business logic.

In general, there are still many other interesting things, such as replaceable storage providers, messaging, pub-sub and other useful things for developing applications for cloud (or distributed on-prem) platforms.
More information about the project can be found here - Microsoft Orleans @ GitHub . Microsoft Orleans has been well tested in large-scale projects - this framework is used on the back end of the games Halo 4 and Halo 5 - it collects information about all the games, aggregates statistics, etc.

Why did you want to check out the Orleans

Well, first of all, we ourselves use Orleans to create our cloud platform in Drawboard and want to rely on a convenient and reliable framework.

Secondly, the concepts and guarantees embodied in Orleans are rather non-trivial and the quality of the cluster depends on their implementation.

Thirdly - it was just interesting to try PVS-Studio with C # - the product team writes excellent articles about checking C ++ projects, but C # was somehow deprived of attention until recently.

Test results

The Microsoft Orleans project is developing very dynamically and, obviously, a one-time check will help to find questionable places - in the long run the overall quality will not be affected much.

So, as of the time of merge PR # 1288 4/02/2016 2:43:26 PM, Commit hash: 7c1e35466fde08fcf1c2caf64fa304d25e60e045 PVS-Studio (version 6.01.15638.1) issued:

18 High-severity alerts
7 Medium-severity warnings
58 Low-severity alerts

It is hard to say whether this is good or bad, because it cannot be compared with other projects tested by PVS-Studio - different complexity, amount of code, competence of developers and many other factors.
But in general - it does not look the worst result, the number of suspicious places is not expressed by a three or four-digit number - everything can be viewed and processed in a day.

Let's see whether PVS-Studio will find something serious.

And yet - here is a table that shows the separation of errors between the main code (Runtime and auxiliary projects) and the code in the tests:

Severity	Total	Runtime	Tests
High	18	12 `*`	6
Medium	7	four	3
Low	58	13	45

* - 4 warnings for improper use of Replace in one method, 3 warnings in less-used code. Total ~ 5 which should be noted.

I will walk, basically, on errors in Runtime, since test errors are usually less critical. Although they can create the illusion that everything is OK, but in reality in reality everything will not be so rosy ...

Critical warnings found in the project

Let's start with the most interesting and quite serious. The developers of the project, it turns out, also encountered a problem that goes back to this bug.

(this is from a gitter-chat project about a problem found):
... an embarrassing bug in SanitizeTableProperty that IIRC puzzled us recently.
... were sure the sanitization worked.

PVS-Studio issued the following warning:

V3010 The return value of the function 'Replace' is required to be utilized. AzureStorageUtils.cs 278,279,280,281

And at the specified address was (the bug has already been fixed) method:

 public static string SanitizeTableProperty(string key) { // Remove any characters that can't be used in Azure PartitionKey or RowKey values key.Replace('/', '_'); // Forward slash key.Replace('\\', '_'); // Backslash key.Replace('#', '_'); // Pound sign key.Replace('?', '_'); // Question mark if (key.Length >= 1024) throw new ArgumentException(string.Format("Key length {0} is too long to be an Azure table key. Key={1}", key.Length, key)); return key; }

Quite a cool bug, straight snake oil (... and here it was slowly an epiphany that this bug probably got out in our code a couple of times ....) .
The key parameter of type string is also passed to the method. Strings in .NET are immutable, so no matter how many times we call the key.Replace , the value will not change.

Next good find -

V3006 The object was not used. The 'throw' keyword could be missing: throw new InconsistentStateException (FOO). MemoryStorageGrain.cs 129

 if (currentETag != null) { string error = string.Format("Etag mismatch during {0} for grain {1}: Expected = {2} Received = null", operation, grainStoreKey, currentETag.ToString()); logger.Warn(0, error); new InconsistentStateException(error); }

A new exception is created, but not thrown. Those. if during the recording Etag = null is sent and this is the first record, no error occurs Slightly lower in the same method, another exception is still thrown — that is, we have a missed problem here.

Another warning:

V3005 The 'jsonSettings' variable is assigned to itself. AzureTableStorage.cs 104

When reading the configuration, the serialization settings in the jsonSettings variable are initialized 2 times:

 if (useJsonFormat) { jsonSettings = jsonSettings = OrleansJsonSerializer.SerializerSettings; }

Not very critical problem, for me it looks like the result of refactoring - in the "past life" one of these variables was most likely called differently. Then a separate variable for default settings was removed, and the assignment remained.

The following (false alarm in our case):

** V3022 Expression 'USE_DEBUG_CONTEXT_PARAMS && arguments! = Null && arguments.Length> 0' is always false. GrainReference.cs 480 ***

 [NonSerialized] private const bool USE_DEBUG_CONTEXT_PARAMS = false; ... if (USE_DEBUG_CONTEXT_PARAMS && arguments != null && arguments.Length > 0) { ... }

Debug flag, compiler optimizes this branch if. Not very critical, from the context it is clear what will happen. But to check such places is still useful - to make sure that the important and necessary code will not be "c-optimized".

This warning was found in little-used code, so this site did not cause any special problems, but it could:

V3025 Incorrect format. A different number of formatted items is expected while calling 'Format' function. Expected: 3. Present: 2. Program.cs 169,183

 WriteStatus(string.Format("**Calling DeleteGrain({0}, {1}, {2})", silo, grainId)); WriteStatus(string.Format("**Calling LookupGrain({0}, {1}, {2})", silo, grainId));

It may seem that this is not such a significant error, well, logger, well, something does not write. But no. In fact, there will fly -

FormatException: Index (zero based) must be greater than or equal to zero and less than the size of the argument list.

And the next warning can signal real problems in the logic of the code, in Orleans this, fortunately, is only logging -

V3033 It is possible that this "else" branch must apply. Interner.cs 251

 private void InternCacheCleanupTimerCallback(object state) { ... long numRemoved = numEntries - internCache.Count; if (numRemoved>0) if (logger.IsVerbose) logger.Verbose(ErrorCode.Runtime_Error_100296, "Removed {0} / {1} unused {2} entries in {3}", numRemoved, numEntries, internCacheName, clock.Elapsed); else if (logger.IsVerbose2) logger.Verbose2(ErrorCode.Runtime_Error_100296, "Removed {0} / {1} unused {2} entries in {3}", numRemoved, numEntries, internCacheName, clock.Elapsed); }

This is a stylistically complex code - it is not very clear what they wanted to do. Of course, you can figure it out, but this style not only breaks the eyes of the reader, but also opens up possibilities for all sorts of unpleasant bugs like Apple's surprise.

Apple certificate check bug:
if ((err = SSLHashSHA1.update (& hashCtx, & serverRandom))! = 0)
goto fail;
if ((err = SSLHashSHA1.update (& hashCtx, & signedParams))! = 0)
goto fail;
goto fail;

In order that such things do not appear - there is StyleCop and many other methods of "forcing the right style." And they are also useful to use.

Another warning, also minor in this case, is simply a redundant check. But if the check is computationally "expensive" - it is worth getting rid of it.

V3053 An excessive expression. Examine the substrings '/ bootstrap' and '/ boot'. ClientGenerator.cs 310

 else if (arg.StartsWith("/bootstrap") || arg.StartsWith("/boot")){...}

On this critical errors that are worthy of attention, end and ahead a little more Medium and more Low .

Found errors of moderate severity

The first is an example of a warning, which can be either a signal of a problem or a reason for a holivar.

V3054 Potentially unsafe double-checked locking. Use volatile variable (s) or synchronization primitives to avoid this. StreamImpl.cs 142, 144

Sample code is a bit cut, only the main sections are left with some comments.

 private readonly object initLock; // need the lock since the same code runs in the provider on the ... internal IAsyncBatchObserver<T> GetProducerInterface() { ->> // not so anonical double-checked locking, effectively doing the same, if (producerInterface != null) return producerInterface; lock (initLock) { if (producerInterface != null) return producerInterface; if (provider == null) provider = GetStreamProvider(); producerInterface = provider.GetProducerInterface<T>(this); } return producerInterface; } internal IInternalAsyncObservable<T> GetConsumerInterface() { ->> // Canonical double-checked locking if (consumerInterface == null) { lock (initLock) { if (consumerInterface == null) { if (provider == null) provider = GetStreamProvider(); consumerInterface = provider.GetConsumerInterface<T>(this); } } } return consumerInterface; }

Here we have two examples of applying the Double-checked locking pattern. Chuck Norris of the .NET world (aka Jon Skeet) in the article Implementing the Singleton Pattern in C # provides a more elegant and reliable implementation if you need a singleton.

I also wanted to write that:

But personally, this code gives me one more doubt: in all articles and examples, this pattern always uses lock on a static object, and here I am not very sure that it can be used on non- static locks with guaranteed reliable results ...

But, after talking with the developers and reading this article of Sayonara volatile by Joe Duffy , I agree that since In our case, this is not a singleton, it is acceptable to use a non-static field. And without volatile.

In general, studying this error, I discovered such a "can of worms" that I don’t even know if it’s a problem in the code or specifically with this diagnostics in PVS-Studio.

But the fact that the tool can catch such patterns in general is, imho, great. And I hope that in the future we will see more sophisticated warnings, good and different.

Let's move on to another good example, a bug that is very badly caught by a person, but it is easy with an analyzer:

V3022 Expression 'n1 == 0 && n1! = 0' is always false. Unsigned type value is always> = 0. Probably the '||' operator should be used here. UniqueKey.cs 113

 private static UniqueKey NewKey(ulong n0, ulong n1, Category category, long typeData, string keyExt) { // in the string representation of a key, we grab the least significant half of n1. // therefore, if n0 is non-zero and n1 is 0, then the string representation will always be // 0x0 and not useful for identification of the grain. if (n1 == 0 && n1 != 0) throw new ArgumentException("n0 cannot be zero unless n1 is non-zero.", "n0");

Here it was necessary to check n0! = 0, as written in the comments to this code, and in the current implementation, the check is always false. Again, a good Coding-style could help in this case - if the variables were not called n0 and n1 , and for example firstHalf and secondhHalf , the error would be more pronounced. Compare:

 if (firstHalf == 0 && secondHalf != 0) vs if (secondHalf == 0 && secondHalf != 0)

Warning about the same code in two different methods -

V3013 It is odd that the body of the 'IncrementMetric' function is fully equivalent to the body of the 'DecrementMetric' function (1079, line 1095). TraceLogger.cs 1079

Spaced Copy-Paste, between these 2 methods there is also the correct implementation
decrement, which reduces the metric.

 public override void IncrementMetric(string name, double value) { foreach (var tc in TelemetryConsumers.OfType<IMetricTelemetryConsumer>()) { ->> tc.IncrementMetric(name, value); } } ... public override void DecrementMetric(string name, double value) { foreach (var tc in TelemetryConsumers.OfType<IMetricTelemetryConsumer>()) { ->> tc.IncrementMetric(name, value); } }

The same, but in tests:

V3013 It is odd that the body of the StartTimer function is fully equivalent to the body of the StopTimer function (183, line 188). TimerOrleansTest.cs 183

 public Task StartTimer(string timerName) { if (persistant) return persistantGrain.StartTimer(timerName); else return grain.StartTimer(timerName); } public Task StopTimer(string timerName) { if (persistant) return persistantGrain.StartTimer(timerName); else return grain.StartTimer(timerName); }

This hello from Ctrl + C, Ctrl + V in tests can have worse consequences - the tests will be false-positive.

Difficult code in tests can produce one error per 10 or 100 launches, then such a test also causes irritation:

V3032 Waiting for Compiler may be optimized for some of the variables. Use volatile variable (s) or synchronization primitives to avoid this. LoggerTest.cs 468

 // Wait until the BulkMessageInterval time interval expires before wring the final log message - should cause bulk message flush while (stopwatch.Elapsed <= TraceLogger.BulkMessageInterval) { Thread.Sleep(10); }

Honestly, I myself do not very well understand how and why the cycle here can turn into an infinite, but an example in the description of this warning on the PVS-Studio website is more understandable V3032 .
Not a big problem, because the code in the tests, but even in theory, arbitrarily falling tests or hanging for a long time is not the most pleasant thing.

And again the strange code in the tests -

V3051 An excessive type check. The object is already of the 'Exception' type. PersistenceGrainTests.cs 178

 catch (AggregateException ae) { exceptionThrown = true; Exception e = ae.GetBaseException(); ->> if (e is Exception) { // Expected error } else { throw e; } }

The exception will never be thrown again, because the code will never fall into the else branch. Criticality depends on the context, in tests it may not be so scary, and in another code it is pure bug water.

Low-severity errors

PVS-Studio found quite a few comments in the tests, but after reviewing most of them, it can be concluded that there are no critical or high-impact problems among them. It is quite difficult to comprehensively test such a product, and often you have to go for different tricks to cause a certain system behavior.

For example, here:

V3008 The 'promise' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 91, 84. TestInternalGrains ErrorGrain.cs 91

 // the grain method returns OK, but leaves some unobserved promise -->>Task<long> promise = Task<long>.Factory.StartNew(() => { if (!doThrow) return 0; logger.Info("About to throw 1."); throw new ArgumentException("ErrorGrain left Immideate Unobserved Error 1."); }); -->>promise = null; GC.Collect(); GC.WaitForPendingFinalizers(); GC.Collect(); GC.WaitForPendingFinalizers(); return Task.FromResult(11);

The behavior of the code with the task that is compiled by the GC is checked.

From what else some attention should be paid to (these errors can lead to incorrect expectations during tests):

V3013 It is the odd that the body of the ProduceSequentialSeries' function is fully equivalent to the body of the 'ProduceParallelSeries' function (618, line 625). StreamingGrain.cs 618

 public override async Task ProduceSequentialSeries(int count) { await base.ProduceParallelSeries(count); State.Producers = _producers; await WriteStateAsync(); } public override async Task ProduceParallelSeries(int count) { await base.ProduceParallelSeries(count); State.Producers = _producers; await WriteStateAsync(); }

In a special test class, methods designed to generate sequential and parallel series of events always generate parallel ones. Maybe this is OK, but then the semantics inside the test is misleading.

V3013 It is an odd that the body of the TestInitialize function is fully equivalent to the body of the TestClean function (44, line 52). ConfigTests.cs 44

 [TestInitialize] public void TestInitialize() { TraceLogger.UnInitialize(); GrainClient.Uninitialize(); GrainClient.TestOnlyNoConnect = false; } [TestCleanup] public void TestCleanup() { TraceLogger.UnInitialize(); GrainClient.Uninitialize(); GrainClient.TestOnlyNoConnect = false; }

In the configuration tests, the initialization and primitives are carried out for some reason at initialization and at completion.

V3043 The code's operational logic does not correspond with its formatting. The statement is indented. It is possible that curly brackets are missing. RequestContextTest.cs 87,97,111,155,181

 if(msg.RequestContextData != null) foreach (var kvp in msg.RequestContextData) { headers.Add(kvp.Key, kvp.Value); };

And nevertheless, I am a big supporter of strict coding-style rules. Well, or at least automatic coercion to style, by StyleCop, Resharper or CodeFormatter. It costs nothing, but it allows you not to break your eyes.

V3052 The original exception object 'ae' was swallowed. Stack of original exception could be lost. GrainReferenceCastTests.cs 212

 catch (AggregateException ae) { Exception ex = ae.InnerException; while (ex is AggregateException) ex = ex.InnerException; throw ex; }

Losing the stack AggregatedException. I understand that for cases with other exceptions it is bad, but we in our project just "unwind" AggregatedException. I would write it in false-positive if the catch section is AggregatedException.

Evaluation of the criticality of found bugs by the project developers.

Of all the error reports sent, after 3 hours 5 marked as bug, 1 critical with Replace fixed. Pretty good catch for 20 minutes of static analysis in automatic mode.

Why Code analysis is not just another favor of the month.

The Orleans project is developing very intensively, the frequency of commits is very high, Gihub confirms this - Orleans-contributors . During the writing of the article, I checked the project several times, keeping the analysis logs. Including different versions of PVS Studio (6.0 and 6.01).

Here is a comparison of different logs:

Priority	28 Jan	Feb 2	4 Feb v6.01	Comments
High	nineteen	21	18	Two days -3 critical errors - which is good. But, for example, 2 of them were analogs of V3025 described above. One could not even check such code.
Medium	four	four	7	For the same 2 days, 3 warnings of moderate severity.
Low	52	46	58	And +12 new sloppy code areas.

As can be seen from the table, critical errors appear and disappear in a very short time. Where it is more convenient to catch them as a result of the analysis immediately after the build, and not after hours of debugging or the fall of production, it saves time and nerves of the programmer. This is, of course, a commonplace thing in the series “It’s better to be rich and healthy than poor and sick” , but if there are still programmers who don’t use at least some available static code analyzer - “You're doing it wrong” ...

In terms of ease of use, integration with PVS-Studio with Visual Studio 2015 is quite simple - there is an Analyze with PVS-Studio item with a clearly visible green icon in the context menu of the Solution Explorer on the C # file, project, or root solution. Pretty easy to find. If only in this context menu there are no 100,500 items from other extensions. It would be interesting to see some auxiliary nuget package that could easily run the analysis from the command line so that you can open the Package Manger Console or build.cmd and say PVS-Solution or PVS-Project Orleans , well, in the future it's all like built into the xproj mechanisms of the new CoreCLR.

However, Ctrl+Q, PVS ... Down, Down, Down, Down, Enter enough for a quick start :).

Conclusion

Modern realities of development set a very high level in the speed of development - everyone is trying to run forward very quickly to at least remain at the level of competitors in the industry. Naturally, with such a speed of development, you can miss something or write carelessly. An extensive set of tests helps in identifying problem areas and regressions, but this is also not a panacea - there are errors in tests, and the creation of such a set, as well as keeping it up to date, takes time and effort.

Any statistical analyzer, be it Visual Studio warnings, Resharper analyzers or PVS-Studio, is another tool in the developer’s collection and assistant in detecting potential problems with code. Use at least what is available for free. For example, on all our projects, the default mode is Enable Warnings as Errors, which helps to write code a little more disciplined. It costs the programmer nothing, but can save him from shots in his leg. Modern analyzers are very easy to use, in most cases everything works fine out of the box and the only thing that can somehow justify the non-use of analyzers is a matter of price.

NB: The author thanks the PVS-Studio team for providing a temporary license for testing.

All read - Happy and bug-free programming!

Source: https://habr.com/ru/post/276727/

All Articles