⬆️ ⬇️

Checking the OpenJDK project with PVS-Studio

Coauthor: Roman Fomichev.



Currently, many projects open their source code and allow the developer community to make changes in it. We will check one of these projects - OpenJDK, and help developers improve their code.



Introduction



OpenJDK (Open Java Development Kit) - a project to create an implementation of the Java platform (Java SE), consisting entirely of free and open source. The project started in 2006 through the efforts of Sun. The project uses several languages ​​- C, C ++ and Java. We are interested in source codes written in C and C ++. For verification, take the 9th version of OpenJDK. The code for this Java implementation of the platform is available in the Mercurial repository .



The project was tested using the PVS-Studio static code analyzer. It implements many diagnostic rules that allow you to find a large number of errors made during programming, including those that are difficult to detect with a simple view of the code. Some of these errors do not affect the logic of the program, and some can lead to dire consequences during the execution of the program. On the analyzer website there are examples of errors found in other projects. Projects that use C, C ++, and C # are available for analysis. You can download a trial version of the analyzer by the link .

')

Errors in logical expressions







First, consider the errors in logical expressions:

int StubAssembler::call_RT(....) { #ifdef _LP64 // if there is any conflict use the stack if (arg1 == c_rarg2 || arg1 == c_rarg3 || arg2 == c_rarg1 || arg1 == c_rarg3 || arg3 == c_rarg1 || arg1 == c_rarg2) { .... } 


PVS-Studio warning : V501 There are identical sub-expressions 'arg1 == c_rarg3' operator. c1_Runtime1_x86.cpp 174



The analyzer reports a duplicate check arg1 == c_rarg3. In this snippet, there is either a redundant check or, much worse, a logical error. Perhaps, instead of the duplicated condition, something else should be checked. It makes sense for developers to take a closer look at this code.



In the same condition there is one more repeated expression arg1 == c_rarg2 :



PVS-Studio warning : V501 There are identical sub-expressions 'arg1 == c_rarg2' operator. c1_Runtime1_x86.cpp 174



These warnings particularly well demonstrate the need to use a static analyzer. In a large number of similar expressions it is very easy to make a mistake that is difficult to detect when superficially viewing the code.



In the following snippet, a “non-ideal” test was found in the condition of the Ideal method:

 Node *AddLNode::Ideal(PhaseGVN *phase, bool can_reshape) { .... if( op2 == Op_AddL && in2->in(1) == in1 && op1 != Op_ConL && 0 ) { .... } 


PVS-Studio warning : V560 A part of conditional expression is always false: 0. addnode.cpp 435



It is rather strange to use 0 in a logical expression. Most likely, this code fragment is still under development and the condition was made impossible to debug. There are no corresponding comments in the code, and in this case there is a high probability of forgetting to correct the code in the future. The result of this error will be that everything that is inside this condition will never be executed, since the result of evaluating a logical expression will always be false.



Priorities for operations



Quite often, programmers rely on their knowledge of priorities and do not single out the component parts of a complex expression in parentheses:

 int method_size() const { return sizeof(Method)/wordSize + is_native() ? 2 : 0; } 


PVS-Studio warning : V502 Perhaps the '?:' Operator works in a different way. The '?:' Operator has a lower limit than the '+' operator. method.hpp 249



In this case, I do not know the specifics of the code, but there is a suspicion that the function planned to choose the offset value '2' or '0' depending on the result of the is_native () function call, but the expression has a different order of evaluation. First, sizeof (Method) / wordSize + is_native () will be added , and then the result will be 2 or 0. Most likely, the code should look like this:

 { return sizeof(Method)/wordSize + (is_native() ? 2 : 0); } 


This is a very common error with priority operations. In the database of errors found by the analyzer, the most popular of them were identified and given in the article: Logical expressions in C / C ++. How wrong are the professionals .



Copy paste







The next characteristic error group is related to copying code. One cannot get away from this favorite method of programmers, so we explore the places where copy paste was applied:

 static int setImageHints(....) { .... if (dstCMP->isDefaultCompatCM) { hintP->allocDefaultDst = FALSE; hintP->cvtToDst = FALSE; } else if (dstCMP->isDefaultCompatCM) { hintP->allocDefaultDst = FALSE; hintP->cvtToDst = FALSE; } .... } 


PVS-Studio warning : V517 The use of if (A) {...} else if (A) {...} 'pattern was detected. There is a possibility of logical error presence. Check lines: 1873, 1877. awt_ImagingLib.c 1873



In this example, the conditions in if and else if are completely identical, and the code that must be executed. The second condition is completely meaningless, it will never be executed.



Another similar case:

 static int expandPackedBCR(JNIEnv *env, RasterS_t *rasterP, int component, unsigned char *outDataP) { .... /* Convert the all bands */ if (rasterP->numBands < 4) { /* Need to put in alpha */ for (y=0; y < rasterP->height; y++) { inP = lineInP; for (x=0; x < rasterP->width; x++) { for (c=0; c < rasterP->numBands; c++) { *outP++ = (unsigned char) (((*inP&rasterP->sppsm.maskArray[c]) >> roff[c]) <<loff[c]); } inP++; } lineInP += rasterP->scanlineStride; } } else { for (y=0; y < rasterP->height; y++) { inP = lineInP; for (x=0; x < rasterP->width; x++) { for (c=0; c < rasterP->numBands; c++) { *outP++ = (unsigned char) (((*inP&rasterP->sppsm.maskArray[c]) >> roff[c]) <<loff[c]); } inP++; } lineInP += rasterP->scanlineStride; } } .... } 


PVS-Studio warning : V523 The 'then' statement is equivalent to the 'else' statement. awt_ImagingLib.c 2927



The executable code in both blocks is identical, respectively, there is no difference what is calculated in the condition. It makes sense to look at this place and remove the unnecessary branch or, if different logic was meant, adjust the code to eliminate duplication.



There are two more places with identical duplication. Just point them out, without casting the code:



Well, the last interesting example associated with a possible copy paste error:



 Node* GraphKit::record_profiled_receiver_for_speculation(Node* n) { .... ciKlass* exact_kls = profile_has_unique_klass(); bool maybe_null = true; if (java_bc() == Bytecodes::_checkcast || java_bc() == Bytecodes::_instanceof || java_bc() == Bytecodes::_aastore) { ciProfileData* data = method()->method_data()->bci_to_data(bci()); bool maybe_null = data == NULL ? true : <== data->as_BitData()->null_seen(); } return record_profile_for_speculation(n, exact_kls, maybe_null); return n; } 


PVS-Studio Warning: V561 It's not better to declare it maybe. Previous declaration: graphKit.cpp, line 2170. graphKit.cpp 2175



What happens in this code? Before the if block, the variable bool maybe_null = true; . Then, when the code is executed in the if block , a variable with the same name is declared. After exiting the block, the value of this variable will be lost, and the call to the function using this variable will in any case be true . It is good if this variable was duplicated for debugging purposes. Otherwise, this code is executed incorrectly and requires modification:

 maybe_null = data == NULL ? true : data->as_BitData()->null_seen(); 

Work with pointers







Working with pointers requires care and accuracy, since incorrect use of pointers can cause errors that will be difficult to identify. As a rule, the main danger is the use of broken pointers or the use of pointers without checking for zero value.



First consider the case of explicit use of the null pointer:

 static jint JNICALL cbObjectTagInstance(....) { ClassInstancesData *data; /* Check data structure */ data = (ClassInstancesData*)user_data; if (data == NULL) { data->error = AGENT_ERROR_ILLEGAL_ARGUMENT; return JVMTI_VISIT_ABORT; } .... } 


PVS-Studio warning : V522 Dereferencing of the null pointer 'data' might take place. util.c 2424



A completely incomprehensible code using a null pointer may cause the program to crash. Perhaps, this thread never worked, thanks to which it was possible to avoid problems when the program is running. In the same file there were three more similar places:



But in the following cases, the possibility of using a null pointer is hidden deeper. This is a very common situation, such warnings are found in almost all checked projects:

 static jboolean visibleClasses(PacketInputStream *in, PacketOutputStream *out) { .... else { (void)outStream_writeInt(out, count); for (i = 0; i < count; i++) { jbyte tag; jclass clazz; clazz = classes[i]; <== tag = referenceTypeTag(clazz); (void)outStream_writeByte(out, tag); (void)outStream_writeObjectRef(env, out, clazz); } } if ( classes != NULL ) <== jvmtiDeallocate(classes); .... return JNI_TRUE; } 


PVS-Studio warning: V595 The 'classes' pointer was used before it was verified against nullptr. Check lines: 58, 66. ClassLoaderReferenceImpl.c 58



In the lower block, the pointer is checked for a zero value, which means that the programmer assumes the possibility that the pointer value will be zero. But in the block above, the pointer is used without checking. Accordingly, if the pointer value is zero, then such a check will not help us, and we will get a program crash. To correct this error, you need to check the pointer above both blocks.



I will give one more similar example:

 int InstructForm::needs_base_oop_edge(FormDict &globals) const { if( is_simple_chain_rule(globals) ) { const char *src = _matrule->_rChild->_opType; OperandForm *src_op = globals[src]->is_operand(); assert( src_op, "Not operand class of chain rule" ); return src_op->_matrule ? src_op->_matrule->needs_base_oop_edge() : 0; } // Else check instruction return _matrule ? _matrule->needs_base_oop_edge() : 0; } 


PVS-Studio warning : V595 The '_matrule' pointer was used against nullptr. Check lines: 3534, 3540. formssel.cpp 3534



Here, the pointer is checked from below in the ternary operator - _matrule? _matrule-> needs_base_oop_edge (): 0 ;. And the above is a simple reference to this pointer - const char * src = _matrule -> _ rChild -> _ opType ;. The recipe for correction is similar: you need to check the pointer before using it. There are quite a few such places, I will give them a list:



Sometimes programmers check pointers on the contrary, but they do it incorrectly:

 FileBuff::FileBuff( BufferedFile *fptr, ArchDesc& archDesc) : _fp(fptr), _AD(archDesc) { .... _bigbuf = new char[_bufferSize]; if( !_bigbuf ) { file_error(SEMERR, 0, "Buffer allocation failed\n"); exit(1); .... } 


PVS-Studio warning : V668 against using b b allocated allocated allocated allocated allocated allocated... The exception will be generated in the case of memory allocation error. filebuff.cpp 47



In this case, checking the _bigbuf pointer to zero after using new is meaningless. If the system fails to allocate memory, an exception will be generated and the function will stop. To correct the error, you can use several approaches. Make a memory allocation in a try catch block , or use the new (std :: nothrow) construction to allocate memory, which will not generate an exception in case of failure. There are several more such incorrect checks:



The last error in working with pointers was made when explicitly casting a pointer of one type to a pointer of another type:

 mlib_status mlib_convMxNext_f32(...) { mlib_d64 dspace[1024], *dsa = dspace; .... mlib_f32 *fsa; .... if (3 * wid_e + m > 1024) { dsa = mlib_malloc((3 * wid_e + m) * sizeof(mlib_d64)); if (dsa == NULL) return MLIB_FAILURE; } fsa = (mlib_f32 *) dsa; <== .... } 


PVS-Studio warning: V615 An odd explicit conversion from 'double *' type to 'float *' type. mlib_ImageConvMxN_Fp.c 294



A pointer to float mlib_f32 * fsa is attempted to assign a pointer to double mlib_d64 dspace [1024], * dsa = dspace . The types float and double have a different size and similar type conversion is most likely erroneous. The mismatch of the sizes of the types given will result in the fsa pointer pointing to the number format incorrect for the float type.



In another file there are two more such similar conversions, you need to carefully check this code and use the correct type conversions.

This concludes the consideration of errors associated with the use of pointers and examines the remaining warnings of the analyzer.



Mistakes







The following error is probably also the result of a failed copy of the code:

 static bool parse_bool (const char **pp, const char *end, unsigned int *pv) { .... /* CSS allows on/off as aliases 1/0. */ if (*pp - p == 2 || 0 == strncmp (p, "on", 2)) *pv = 1; else if (*pp - p == 3 || 0 == strncmp (p, "off", 2)) *pv = 0; else return false; return true; } 


PVS-Studio warning : V666 Consider the third argument of the function 'strncmp'. This is not the case. hb-shape.cc 104



Here is the case when the error does not affect the performance of the program. Instead of comparing three characters, only the first two are compared, I do not exclude that the author of the code can and specifically made such a check. Since the value in the buffer p can be on or off, it is sufficient to compare the first two characters. But for order, you can still adjust the code:

 else if (*pp - p == 3 || 0 == strncmp (p, "off", 3)) 


There are several places where errors are possible when implementing classes:

 class ProductionState { .... private: // Disable public use of constructor, copy-ctor, ... ProductionState( ) : _production(cmpstr, hashstr, Form::arena) { assert( false, "NotImplemented"); }; ProductionState( const ProductionState & ) : _production(cmpstr, hashstr, Form::arena) { assert( false, "NotImplemented"); }; // Deep-copy }; 


PVS-Studio warning : V690 Copy, it will be generated by the compiler. It is dangerous to use such a class. dfa.cpp 76



In this class, they tried to prohibit copying, but forgot to add a copy statement to the private area. It will be generated by default and will be available for use. Even if now this code is not used anywhere in the code, then there is no guarantee that in the future someone will accidentally call it. When such an operator is called, a post-entry copy will occur for a class that should not be copied. This can lead to various effects up to a program crash. In this case, add the statement "=" to the private area.



There are two more classes where there are similar problems; it is advisable to correct them in such a way as not to violate the “ Law of the Big Two ”.



The last error looks like a simple typo:

 bool os::start_debugging(char *buf, int buflen) { int len = (int)strlen(buf); char *p = &buf[len]; .... if (yes) { // yes, user asked VM to launch debugger jio_snprintf(buf, sizeof(buf), "gdb /proc/%d/exe %d", os::current_process_id(), os::current_process_id()); os::fork_and_exec(buf); yes = false; } return yes; } 


PVS-Studio warning : V579 It is possibly a mistake. Inspect the second argument. os_linux.cpp 6094



The programmer wanted to pass the length of the buffer, but did not take into account that it is not a locally declared array, but a pointer coming in the function argument. As a result of evaluating the expression sizeof (buf), we get not the length of the buffer, but the size of the pointer, which will be 4 or 8 bytes. It is easy to correct the error, as the above buffer length has already been obtained: int len ​​= (int) strlen (buf) ;. The correct option would be as follows:

 jio_snprintf(buf, len .... 

Conclusion







It is always interesting to check a project that uses a lot of people and monitors its quality. There was a sufficiently large number of errors, only a certain number of them were described in this article, the rest require more in-depth research. The errors found once again confirm the effectiveness of using a static analyzer, as it allows you to detect errors that are difficult to detect when viewed with the simple eye. It is best to use the analyzer on an ongoing basis, as it will save a lot of time that would have been spent debugging the program when searching for bugs. I remind you that you can try the work of the PVS-Studio static analyzer on your project by downloading its trial version .





If you want to share this article with an English-speaking audience, then please use the link to the translation: Svyatoslav Razmyslov. OpenJDK check by PVS-Studio .



Read the article and have a question?
Often our articles are asked the same questions. We collected answers to them here: Answers to questions from readers of articles about PVS-Studio, version 2015 . Please review the list.

Source: https://habr.com/ru/post/303534/



All Articles