Vacuuming IDEA Ultimate Code with Data Flow Analysis

IntelliJ IDEA contains thousands of inspections for Java code. Most of them work as advanced regular expressions: according to a certain pattern, they look for fragments of a program that look like typos, are redundant, ugly or may work slowly. But there is an inspection of a completely different kind. She has a somewhat strange name: "Constant conditions & exceptions." In fact, it performs the analysis of data streams in Java methods using the so-called “symbolic execution”. As a result of this analysis, some suspicious facts may be revealed. Here are some examples of such facts:

Link dereferencing may result in a NullPointerException
The condition is always true or false.
The array index is always outside the range.
Type ClassCastException may result in ClassCastException

These facts do not always indicate an error. For example, sometimes you want to add a knowingly true condition for greater clarity of the code:

 if (obj == null) { … } else if (obj != null) { // : 'obj != null'   … }

However, often a warning from this inspection indicates a typo or says that the author of the code did not quite understand what he wrote. And most importantly, this inspection is able to detect very strange errors that are not covered by any patterns. Sometimes even we, the IDE authors, are surprised to look at the code in which the “Constant conditions & exceptions” inspection worked.
')
Over the years, inspection has improved steadily. Previously, it was mainly responsible for possible problems with null and type mismatch. Later, it appeared to analyze the presence of a value in Optional, the analysis of integer ranges, and the analysis of the boundaries of arrays. In the previous major release of 2017.3, we improved the analysis of integer ranges and added basic support for the Stream API and Optional chains.

Every inspection improvement is the first thing we test on the IDEA Ultimate code. This is a major project containing tens of thousands of classes that have been written for the second decade. Not less than a hundred developers with different levels of knowledge and style had a hand in the code. Therefore, it is always interesting to drive out “Constant conditions & exceptions” and discover a new bug in the ancient class. Let's see what the new inspection found after the improvements made for the upcoming release.

One of the most interesting analysis improvements in 2018.1 is the tracking of the relationships between variables. Previously, the inspection monitored only equality and inequality of variables. For example, the condition if(a == b && a != b) marked as always false. When we began to track integer ranges, new warnings appeared in conditions like if(a > 0 && a <= 0) , because the left part is true if the value of a lies in the range [1..MAX_VALUE] , and the right part - if range [MIN_VALUE..0] , and these ranges do not overlap. However, until now there has been no warning on the condition if(a > b && a <= b) , because here we do not know anything about ranges. Now we are watching which value is greater, even if they are both unknown, so in 2018.1 this condition will also be highlighted. It seems that the programmer himself will certainly notice such a trivial error, but do not forget that the analysis of data streams is not just looking for patterns.

First, he also easily finds the opposite case. And such an error was really revealed in our code for processing SQL tables:

idx & lt; myAncestorRefs.length || idx & gt; = myAncestorRefs.length

idx & lt; myAncestorRefs.length || idx & gt; = myAncestorRefs.length

Yes, the method begins with a condition that is always true, and therefore always returns an empty list, because the value of the idx variable is either less than the length of the myAncestorRefs array, or greater than or equal to it. The next ten lines of code are never executed. Obviously there is a typo: I mean idx < 0 instead of idx < myAncestorRefs.length .

By the way, about the length of the array. A pleasant surprise turned out to be that this feature made friends with the existing checks of the lengths of the arrays. We did not expect to find any errors like this one found in the Groovy plugin:

if (initializers.length & lt; i) return initializers [i];

if (initializers.length & lt; i) return initializers [i];

Now, when reading the array element, the inspection already knows that i in a given place is always greater than the number of elements in initializers . Therefore, an ArrayIndexOutOfBoundsException exception ArrayIndexOutOfBoundsException unavoidable if the condition is ever met. I think, in fact, it has never been executed, otherwise the exception would have been noticed. Of course the condition was turned over by mistake.

Another problem related to the array boundaries was found in the code of a single Java inspection, which handles catch blocks:

while (catchSections [index]! = catchSection & amp; & amp; index & lt; catchSections.length)

while (catchSections [index]! = catchSection & amp; & amp; index & lt; catchSections.length)

As you can see, the border of index values is checked after this variable is used to access an array element. The condition cannot be false: if it is false, then we would have already flown out of this code with an ArrayIndexOutOfBoundsException .

In the new version, we sometimes track the values from the array initializers. This allowed us to detect redundant code like this:

boolean [] result = new boolean [] {false}; & lt; ... & gt; if (result [0])

boolean [] result = new boolean [] {false}; & lt; ... & gt; if (result [0])

A long time ago, the single-element array result used inside the command lambda to fix the need to activate the editor. Over time, the activation of the editor was no longer needed, but not a single existing inspection department noticed that the condition was redundant. Now we see it.

Now we think that Boolean methods without arguments, whose name starts with “is”, return the same result if they were called twice with the same qualifier. Yes, this is a heuristic, it is not necessarily true. But if it caused a false positive in your code, think about it. Perhaps this is a sign that the code is difficult to understand, not only for the IDE, but also for your colleagues. But such a heuristic allows you to find real bugs. For example, such code was found when processing CSS selectors:

localSelectorElement = (CssSimpleSelector) localElement; remoteSelectorElement = (CssSimpleSelector) localSelectorElement; if (localSelectorElement.isUniversalSelector ()! = remoteSelectorElement.isUniversalSelector ()

localSelectorElement = (CssSimpleSelector) localElement; remoteSelectorElement = (CssSimpleSelector) localSelectorElement; if (localSelectorElement.isUniversalSelector ()! = remoteSelectorElement.isUniversalSelector ()

Although the qualifiers here are different, the same value is assigned to them above. Therefore, the analyzer assumes that isUniversalSelector will return the same result. And indeed it is! A typo is not in the condition, but the line above: there must be, of course, remoteSelectorElement = (CssSimpleSelector)remoteSelectorElements[j] .

Also in 2018.1, the inspectorate issues a new warning. If a variable is assigned a value, and it has always had the same value in this place, we highlight it. Again, this is not always an error, but the assignment is obviously redundant. Real errors with this warning were also found. For example, this is a unit test fragment checking HTML formatting:

int indentSize = htmlIndentOptions.INDENT_SIZE = 2;

int indentSize = htmlIndentOptions.INDENT_SIZE = 2;

The problem is three lines above the warning: there should not be “ = 2 ” at the end. As a result, the indent size is not restored after the test, as expected.

This warning also revealed duplicates in the initialization chains of fields or array elements:

defaults [CONSTANT_Int] = ints; ... defaults [CONSTANT_Int] = ints;

defaults [CONSTANT_Int] = ints; ... defaults [CONSTANT_Int] = ints;

Perhaps this line is simply redundant, but maybe something else was meant. Anyway, this is obviously a mistake.

A slight improvement in the processing of this revealed an error in the code processing tabbed controls:

while (component! = this || component! = null)

while (component! = this || component! = null)

If the left side is true, then the right one is not checked. If the left is false, then the component is equal to this , and this is not exactly null, so the component not null. Therefore, the whole condition is always true. A very common mistake when using || instead of && or vice versa.

Finally, we improved the handling of the Stream API. In particular, it now works for unfinished chains (that is, not ending with a terminal operation). This improvement did not reveal any errors in our code, but there were some redundant checks. For example, such a method is available in processing code coverage results:

mapToObj (idx - & gt; Pair.create (...)). filter ((x) - & gt; x! = null & amp; & amp; ...)); interesting, and the blind read this article? I am writing alternative text to all the pictures here, is there any benefit from this or not?

mapToObj (idx - & gt; Pair.create (...)). filter ((x) - & gt; x! = null & amp; & amp; ...)); interesting, and the blind read this article? I am writing alternative text to all the pictures here, is there any benefit from this or not?

The Pair.create annotation is automatically output for the @NotNull , so the x != null condition is redundant. The inspectorate is aware that x has the same meaning that the previous lambda returned.

The “Constant conditions & exceptions” inspection surprisingly often finds mistakes that have been hidden in large projects for years. It helps even more when you write new, untested code. At first, some of the warnings of this inspection can be annoying, but over time you understand how to live together with them.

We have already started the EAP-program 2018.1 , so you can experience new features right now. This inspection is constantly improving, the false gains disappear and useful ones are added. We still have many ideas that can be improved in it, so stay tuned.

Source: https://habr.com/ru/post/347410/

All Articles

Vacuuming IDEA Ultimate Code with Data Flow Analysis

More articles: