More simple bugs [Hell language]

(Many) More Low Hanging Bugs

Translator's note. On Habré there are almost no publications related to the language of Hell. But this is a living language in which programs are written, for which static analysis tools are developed. In order to at least fill this informational vacuum on Habré, I decided to translate a small note related to this language. Why her? PVS-Studio is mentioned in it, and I am pleased :). Plus, perhaps the Russian developers in the Ada language will learn about the new toolkit for themselves and see that they are not at all alone, and life continues to boil in the world of Hell.

In the previous article, we talked about our experiments on creating compact and lightweight diagnostics for Ada source code based on the Libadalang technology. The two diagnostics described there found 12 errors in the code base of the tools developed by our team at AdaCore. In this note, we will talk about another 6 small checks, with the help of which 114 defects were detected. Of these 6 diagnostics, 4 found errors and poor-quality code fragments (check_deref_null, check_test_not_null, check_same_test, check_bad_unequal), the remaining 2 - possibilities for refactoring (check_same_then_else, check_useless_assign). Due to the fact that the run of each diagnostic on our code takes a few seconds, we were able to debug them without any problems until they stopped producing false alarms. At the moment, none of these diagnostics use the mechanism of semantic analysis, which appeared recently in Libadalang, however, in the future it will improve the accuracy of the analysis. When developing each of the checks, we were inspired by similar compact diagnostics in other tools, in particular, by the rules of PVS-Studio and its base of errors from real applications .

Diagnostics of pointer dereferencing before checking

The first diagnostic we have implemented is one of the most popular in many instruments. It searches for pointers that are dereferenced and then checked for zero. Such a situation is suspicious, since it is logical to expect a reverse order of action, and indicates either an error (the pointer should not be dereferenced without first checking for zero) or the problem of poor-quality code (checking for zero is superfluous). The code base of our tools found both problems. Here is an example of an error found in the GNAT compiler in the g-spipat.adb file, where the PP pointer in line 2088 is dereferenced in the Dump procedure:

procedure Dump (P : Pattern) is subtype Count is Ada.Text_IO.Count; Scol : Count; -- Used to keep track of column in dump output Refs : Ref_Array (1 .. PPIndex); -- We build a reference array whose N'th element points to the -- pattern element whose Index value is N.

and then, much later, on line 2156, is checked for zero:
')

 -- If uninitialized pattern, dump line and we are done if PP = null then Put_Line ("Uninitialized pattern value"); return; end if;

The solution to the problem is that the Refs array is now declared after the inequality of the PP pointer zero is precisely established. Here is an example of poor-quality code found also in the GNAT compiler on line 2797 of the g-comlin.adb file. The Line parameter is first declared and then checked for zero:

  Sections_List : Argument_List_Access := new Argument_List'(1 .. 1 => null); Found : Boolean; Old_Line : constant Argument_List := Line.all; Old_Sections : constant Argument_List := Sections.all; Old_Params : constant Argument_List := Params.all; Index : Natural; begin if Line = null then return; end if;

We fixed this code by declaring Line as a parameter of a non-zero access type. Sometimes dereferencing and checking occur in the same expression, as, for example, in the following code snippet of our GNATstack tool - line 97 of the dispatching_calls.adb file:

 if Static_Class.Derived.Contains (Explored_Method.Vtable_Entry.Class) and then Explored_Method /= null and then

The code was corrected in such a way that the check for the Explored_Method pointer zero is performed before it is dereferenced. A total of 11 errors and 9 fragments of poor-quality code fell on this diagnostic.

The second diagnostics in this category finds the code, in which the checking of the pointer to zero is followed by its dereference. As a rule, this is a sign of a logical error, especially in complex boolean expressions, as shown in the examples from the database of errors found by PVS-Studio . No errors of this type were found in our code, which may indicate a high quality of testing on our part. As a matter of fact, the execution of such a code in the Ada language would immediately lead to an exception.

Diagnostics of checked expressions

The first diagnostics in this series is looking for verification of identical subexpressions in the if-elsif operator chain. Such checks indicate either errors or poor-quality code. Below is an example of the error found in the GNAT compiler on line 7380 of the sem_ch4.adb file:

 if Nkind (Right_Opnd (N)) = N_Integer_Literal then Remove_Address_Interpretations (Second_Op); elsif Nkind (Right_Opnd (N)) = N_Integer_Literal then Remove_Address_Interpretations (First_Op); end if;

The edit was to replace Right_Opnd (N) with Left_Opnd (N) in the second check. In total, this diagnostics detected 3 errors and 7 fragments of poor-quality code.

The second diagnostics is looking for expressions like "A / = B or A / = C", where the literals B and C are different. Such conditions are always true. As a rule, instead of the “or” operator they should use “and”. Using this diagnostic, one error was found in our QGen code generator in line 675 of the himoco-blockdiagramcmg.adb file:

 if Code_Gen_Mode /= "Function" or else Code_Gen_Mode /= "Reusable function" then To_Flatten.Append (Obj); end if;

Duplicate Code Diagnostics

The first diagnostics of duplicated code searches for identical blocks of code in different branches of the if and case statements . Situations of this kind point to typos or logical errors, but in our case they only revealed fragments to be refactored. However, some of them contain over 20 duplicate lines of code, as in the following example - line 1023 of the be-checks.adb file in the CodePeer tool:

 elsif VN_Kind (VN) = Binexpr_VN and then Operator (VN) = Logical_And_Op and then Int_Sets.Is_In (Big_True, To_Int_Set_Part (Expect)) then -- Recurse to propagate check down to operands of "and" Do_Check_Sequence (Check_Kind, Split_Logical_Node (First_Operand (VN)), Srcpos, File_Name, First_Operand (VN), Expect, Check_Level, Callee, Callee_VN, Callee_Expect, Callee_Precondition_Index); Do_Check_Sequence (Check_Kind, Split_Logical_Node (Second_Operand (VN)), Srcpos, File_Name, Second_Operand (VN), Expect, Check_Level, Callee, Callee_VN, Callee_Expect, Callee_Precondition_Index); ... elsif VN_Kind (VN) = Binexpr_VN and then Operator (VN) = Logical_Or_Op and then Int_Sets.Is_In (Big_False, To_Int_Set_Part (Expect)) then -- Recurse to propagate check down to operands of "and" Do_Check_Sequence (Check_Kind, Split_Logical_Node (First_Operand (VN)), Srcpos, File_Name, First_Operand (VN), Expect, Check_Level, Callee, Callee_VN, Callee_Expect, Callee_Precondition_Index); Do_Check_Sequence (Check_Kind, Split_Logical_Node (Second_Operand (VN)), Srcpos, File_Name, Second_Operand (VN), Expect, Check_Level, Callee, Callee_VN, Callee_Expect, Callee_Precondition_Index);

or in line 545 of the soap-generator-skel.adb file in the GPRbuild tool:

 when WSDL.Types.K_Derived => if Output.Next = null then Text_IO.Put (Skel_Adb, WSDL.Parameters.To_SOAP (N.all, Object => "Result", Name => To_String (N.Name), Type_Name => T_Name)); else Text_IO.Put (Skel_Adb, WSDL.Parameters.To_SOAP (N.all, Object => "Result." & Format_Name (O, To_String (N.Name)), Name => To_String (N.Name), Type_Name => T_Name)); end if; when WSDL.Types.K_Enumeration => if Output.Next = null then Text_IO.Put (Skel_Adb, WSDL.Parameters.To_SOAP (N.all, Object => "Result", Name => To_String (N.Name), Type_Name => T_Name)); else Text_IO.Put (Skel_Adb, WSDL.Parameters.To_SOAP (N.all, Object => "Result." & Format_Name (O, To_String (N.Name)), Name => To_String (N.Name), Type_Name => T_Name)); end if;

In total, this diagnosis revealed 62 fragments of poor-quality code in our database.

The second diagnostics of this type searches for meaningless assignments of values to local variables, after which these values are never used. Such errors can be explicit, as, for example, here (line 1067 of the be-value_numbers-factory.adb file in the CodePeer tool):

 Global_Obj.Obj_Id_Number := Obj_Id_Number (New_Obj_Id); Global_Obj.Obj_Id_Number := Obj_Id_Number (New_Obj_Id);

and hidden, as in the example below (line 895 of the bt-xml-reader.adb file in the same CodePeer tool):

 if Next_Info.Sloc.Column = Msg_Loc.Column then Info := Next_Info; Elem := Next_Cursor; end if; Elem := Next_Cursor;

In total, this diagnostics found 9 fragments of poor-quality code.

Installation Instructions

Want to try out the above scripts on your code base? This can be done now using the latest version of GNAT Pro or the GPL version for community users and academic license holders! Following the instructions in the Libadalang repository , you can run scripts inside your Python2 interpreter.

Conclusion

A total of 8 diagnostics implemented at the moment (some of them are described in this note, some of them in the previous one ) allowed us to detect and correct 24 errors and 102 fragments of poor-quality code in our code base. These results clearly support the fact that such diagnostics should be included in static analyzers, and we hope that we will soon be able to implement all these and many other tests in our static CodePeer analyzer for Ada programs.

It is also noteworthy that, thanks to the powerful API for Python implemented in Libadalang, it took no more than a couple of hours to create each of these diagnostics. Although we had to write the template code to bypass the AST in different directions, as well as numerous patches to compensate for the lack of semantic analysis in the latest Libadalang version available at that time, it all took relatively little time, and we can then apply these developments in other similar diagnostics. Now we are looking forward to the release of the Libadalang version, in which the semantic analysis option will appear and which will allow us to develop even more powerful and useful checks.

[author of the photo - Courtney Lemon ]

About Yannick My

Yannik Moy is a senior development engineer at AdaCore and co-director of the ProofInUse collaborative laboratory. In AdaCore, he develops CodePeer and SPARK static code analyzers designed to detect errors and evaluate the security and security of a code. Yannick is a leading developer of the SPARK 2014 product, which he regularly talks about in his articles, at conferences, in high school and in blogs (in particular, on www.spark-2014.org ). Previously, Yannick was developing code analyzers at PolySpace (now The MathWorks) and the University of Paris-Sud.

Source: https://habr.com/ru/post/329270/

All Articles