📜 ⬆️ ⬇️

Static analyzer HuntBugs: check IntelliJ IDEA

As many remember, for a while I developed the FindBugs static bytecode Java analyzer. However, there were so many problems in FindBugs that I decided that it would be easier to write a new bytecode analyzer. I haven't very creatively called it HuntBugs. Development is conducted on GitHub . It is still in the early development stage, sometimes buggy and covers about 35% of the diagnostics from FindBugs, but at the same time it adds its own interesting pieces. You can try on your Maven-project using the command mvn one.util:huntbugs-maven-plugin:huntbugs (the report is written in target/huntbugs/report.html ). Alternatively, you can manually build from the gita and launch the one.util.huntbugs.HuntBugs command line one.util.huntbugs.HuntBugs , which can be used as input to JAR files or directories with .class files.


Sometime later, when the project is a little older, I will tell about it in more detail. And in this article I will show you what interesting things HuntBugs found in IntelliJ IDEA Community Edition . I downloaded from the official site and installed the latest version of this IDE, and then set HuntBugs on the file lib/idea.jar , in which almost everything lies. I like to test static analysis on IDEA, because it is an IDE, in which there is a very good static analyzer itself and the developers clearly use it. It is interesting to see what remains after it.


The format of this article is not very different from what PVS-Studio does: errors, pieces of code, explanations. Of course, the article included only the most interesting.


Field is assigned to itself


As a rule, no one admits errors like this.field = this.field , not even the newest IDE usually warns about such. However, HuntBugs can look a little deeper. Here is the code snippet :


  private int myLastOffsetInNewTree; ... private int getNewOffset(ASTNode node){ int optimizedResult = haveNotCalculated; ... if (myLastNode == prev) { ... optimizedResult = myLastOffsetInNewTree; myLastNode = node; myLastOffsetInNewTree = optimizedResult; ... } } 

The field myLastOffsetInNewTree loaded into the local variable optimizedResult , and then for some reason again stored in the field, although during this time it could not change. The last grafting is strange, either it needs to be removed, or something else was meant.


An integer value is passed to the rounding method


Sometimes there are errors with untimely conversion of integer type to fractional. Not always it can be caught, but here it turned out :


  final int width = icon.getIconWidth(); final int height = icon.getIconHeight(); final int x = (int)Math.ceil((actionButton.getWidth() - width) / 2); final int y = (int)Math.ceil((actionButton.getHeight() - height) / 2); 

Here, rounding up ( Math.ceil ) is used twice, but the argument in both cases is an integer, since in Java the division of an integer into an integer yields an integer (rounded down). Probably meant to divide by 2.0 or otherwise switch to fractional numbers before dividing. (int)Math.ceil current behavior suits you, then (int)Math.ceil should be removed: this part of the code is useless.


Switch branch is unreachable due to expression range


A very curious piece of code , which, apparently, someone once automatically generated, and now no one understands whether this is correct and what should be there:


  int state = getState() & 0xF; tokenType = fixWrongTokenTypes(tokenType, state); if (...) { // TODO: do not know when this happens! switch (state) { case __XmlLexer.DOCTYPE: tokenType = XmlTokenType.XML_DECL_START; break; } } 

The __XmlLexer.DOCTYPE is 24, but state = getState() & 0xF is executed above, so the state value can only be from 0 to 15 and the switch branch is not guaranteed to be executed. Perhaps when the source file of the lexer was changed again, the constants were regenerated with different values, and this file was forgotten to be regenerated. Anyway, the code is very suspicious, as evidenced by the comment.


Synchronization on getClass () rather than class literal


This fragment of the MatcherImpl class is synchronized to getClass() . And this is done in a public non-final class, which actually has a subclass of Matcher . As a result, when executing this code from a subclass, synchronization will occur not according to MatcherImpl.class , but according to Matcher.class . The problem is aggravated by the fact that in the same class there is an explicit synchronization on MatcherImpl.class and both critical sections (which may not be mutually exclusive) update the same static field lastMatchData . As a result, the whole point of synchronization is lost. Normally synchronized(getClass()) is wrong; use the explicit synchronized(MatcherImpl.class) class literal synchronized(MatcherImpl.class) .


Exception created and dropped rather than thrown


Quite a common error in Java: an exception object was created, but not thrown. For example, here :


 public void remove() { new OperationNotSupportedException(); } 

IDEA itself also warns about such situations. Another similar place .


Invariant loop condition


Here is another auto-generated file . In principle, everything is probably okay, and nothing can be edited, but in hand-written code it would look suspicious:


 boolean r = ...; while (r) { if (!value(b, l + 1)) break; if (!empty_element_parsed_guard_(b, "json", c)) break; c = current_position_(b); } 

Here, a cycle with a condition on a local variable r , whose value does not change in a cycle, therefore either we don’t go into a cycle at all, or we never quit by a condition (only by break ). If this was really implied, then in such cases it is better to write if(r) { while(true) { ... } } to emphasize the intention to make an infinite loop.


The switch operator has identical branches


Duplicate branches of the switch sometimes look reasonable, but when there is a large piece of code in them, like here , you should take a closer look:


 switch ((((PsiWildcardType)x).isExtends() ? 0 : 1) + (((PsiWildcardType)y).isExtends() ? 0 : 2)) { case 0: /* ? super T1, ? super T2 */ if (constraints != null && xType != null && yType != null) { constraints.add(new Subtype(yType, xType)); } return balance(xType, yType, balancer, constraints); case 1: /* ? extends T1, ? super T2 */ if (constraints != null && xType != null && yType != null) { constraints.add(new Subtype(xType, yType)); } return balance(xType, yType, balancer, constraints); case 2: /* ? super T1, ? extends T2*/ return null; case 3: /* ? extends T1, ? extends T2*/ if (constraints != null && xType != null && yType != null) { constraints.add(new Subtype(xType, yType)); } return balance(xType, yType, balancer, constraints); } 

Not immediately noticeable, but case 1 and case 3 exactly the same (and differ from case 0). If this was meant, it may be wiser to combine case 1 and case 3 , so that it is easier to read and maintain the code.


The same condition is repeatedly checked


For some reason, the same condition is checked twice in this code :


 if (offsetToScroll < 0) { if (offsetToScroll < 0) { ... } } 

Maybe just the internal check should be removed, and maybe something else would be checked. Here is another similar case. Or another interesting case:


  return o instanceof PsiElement && ((PsiElement)o).isValid() && ((PsiElement)o).isPhysical() || o instanceof ProjectRootModificationTracker || o instanceof PsiModificationTracker || o == PsiModificationTracker.MODIFICATION_COUNT || o == PsiModificationTracker.OUT_OF_CODE_BLOCK_MODIFICATION_COUNT || // <<< o == PsiModificationTracker.OUT_OF_CODE_BLOCK_MODIFICATION_COUNT || // <<< o == PsiModificationTracker.JAVA_STRUCTURE_MODIFICATION_COUNT; 

And here repeated conditions are not very close and it is even more difficult to notice them:


 return SUPPORTED_TYPES.contains(token) || StdArrangementTokens.Regexp.NAME.equals(token) || StdArrangementTokens.Regexp.XML_NAMESPACE.equals(token) || KEEP.equals(token) || BY_NAME.equals(token) || SUPPORTED_TYPES.contains(token); 

The SUPPORTED_TYPES.contains(token) checked twice. Of course, HuntBugs carefully watches so that nothing changes between these conditions. If in intermediate conditions token reassigned, such a construction would have the right to exist.


Numeric comparison is always true or false


Here is just a redundant check rather than a real error:


 int size = myPanels.length; final Dimension preferredSize = super.getPreferredSize(); if (size >= 0 && size <= 20) { return preferredSize; } 

The size variable contains the length of the array, which can never be negative. It is not clear why to check size >= 0 . Even if there is no error, I believe that such checks should be deleted, because they confuse the reader. It is not known if the author may have meant size > 0 , then this is a mistake.


Chain of private methods is never called


Typically, IDEs easily find private methods that are never invoked, and offer to remove them. But such a case is not always determined:


 @Nullable private static JsonSchemaObject getChild(JsonSchemaObject current, String name) { JsonSchemaObject schema = current.getProperties().get(name); if (schema != null) return schema; schema = getChildFromList(name, current.getAnyOf()); // <<< if (schema != null) return schema; ... } @Nullable private static JsonSchemaObject getChildFromList(String name, List<JsonSchemaObject> of) { if (of == null) return null; JsonSchemaObject schema; for (JsonSchemaObject object : of) { schema = getChild(object, name); // <<< if (schema != null) return schema; } return null; } 

These two private methods call each other recursively, but nobody calls them outside. HuntBugs sees this situation and says that both methods are not really used.


Useless String.substring (0)


Honestly, I did not expect to see such a diagnosis in the production code, it is too trivial. But no, there are also trivial errors :


 String str = (String)value; if (str.startsWith("\"")) { str = str.substring(0); str = StringUtil.trimEnd(str, "\""); } 

Apparently, the author meant to delete the first character of a string, but for some reason, he wrote not substring(1) , but substring(0) (this call simply returns the original string). This is the second case (in addition to the dropped exception), when IDEA itself also highlights the problem area.


Result of integer multiplication promoted to long


This warning does not always lead to real danger, but nevertheless I want to set an example :


 final long length = myIndexStream.length(); long totalCount = length / INDEX_ENTRY_SIZE; // INDEX_ENTRY_SIZE = 26 for(int i=0; i<totalCount; i++) { final long indexOffset = length - (i + 1) * INDEX_ENTRY_SIZE; 

First, it is already suspicious that the loop variable is of type int , and not long (it is possible that a separate diagnosis should be made for such a situation). If totalCount exceeds 2 31 , then the loop will never end. But all right, this is only possible with a length of index length greater than 52 gigabytes, which is still quite a lot. However, the problems in this code will begin at a length of more than 2 gigabytes. Since i and INDEX_ENTRY_SIZE are of type int , multiplication will be performed on 32-bit signed integers and successfully overflow. Already after this, the result of the multiplication will be reduced to long and after performing the subtraction, the offset may well be more length. Probably, such large caches have never been there, but it will be unpleasant when they appear. The fix is ​​simply to declare the long loop variable.


And what about Kotlin?


It is known that the IntelliJ IDEA part is written in Kotlin, which is also compiled into Java bytecode. Static bytecode analyzers can formally analyze any language, but in fact, if the analyzer is sharpened in Java, then there will be a lot of false positives for other languages. Often they are taken because the language compiler generates some specific constructions (for example, implicit checks). Sometimes, however, such a false trigger is a reason to get accustomed to the compiler code generator. For example, the com.intellij.configurationStore.FileBasedStorageKt class. In the class itself there is the following line :


 private val XML_PROLOG = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>".toByteArray() 

In the java.lang.String class, the toByteArray() method is toByteArray() known to be known. This is Kotlin's extension-method , and the inline-method (which the compiler embeds directly at the place of use), by default it executes String.getBytes(Charsets.UTF_8) . Let's see what this line was compiled in Kotlin. I will not show directly baytkod, and I will transform it to more clear code on Java:


 String str = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"; Charset charset = null; int mask = 1; Object obj = null; //         —    //  - nop -   if(obj != null) { throw new UnsupportedOperationException("Super calls with default arguments not supported in this target, function: toByteArray"); } if(mask & 1 == 0) { charset = kotlin.text.Charsets.UTF_8; } //   - nop XML_PROLOG = kotlin.jvm.internal.Intrinsics.checkExpressionValueIsNotNull(((String)str).getBytes(charset), "(this as java.lang.String).getBytes(charset)"); 

It can be seen that the line has grown incredibly. The variable mask is associated with the transfer of the default parameter (Dmitry Jemerov told about this at JPoint - see slide 40 and below . Here, obviously, a lot of excess, and HuntBugs rightly swears at obj != null (comparing null with null ), and mask & 1 Although the author of the code is absolutely not guilty, I suppose, over time, Kotlin's compiler will be smarter and will generate less garbage.


Conclusion


Here you can write a plain text about the importance of static analysis, which Andrey2008 and his colleagues write after their articles, but you already know everything. Interestingly, even in the code that is developed using static analysis, we managed to find a lot of suspicious places just by checking it with a new tool. Of course, not everything got into the article. In addition to false triggers, there are quite a few messages important, but boring. Many messages about performance. For example, string concatenation in a loop is 59 pieces. Or bypassing Map values ​​through keySet() + get() , when faster through values() is 18 pieces. A large number of potential problems with multithreading. Let's say nonatomic volatile updates are 50 pieces. Or suspicious wait() / notify() usage scenarios - 8 pieces.


Use static analysis and follow the news!


')

Source: https://habr.com/ru/post/301626/


All Articles