
In December 2015, at the JSConf US conference, the developers announced that they were planning to open the source code of the key components of the Chakra JavaScript engine running in Microsoft Edge. Recently, the source code of ChackraCore under the MIT license was published in the corresponding repository on GitHub. In the article I will tell you what you managed to find interesting in the project with the help of the PVS-Studio static analyzer.
Introduction
ChakraCore is the basic component of Chakra, a high-performance JavaScript engine that runs Microsoft Edge and Windows applications written in HTML / CSS / JS. ChakraCore supports JavaScript jit compilation for x86 / x64 / ARM, garbage collection and a wide range of the latest JavaScript features.
PVS-Studio is a static analyzer for detecting errors in the source code of programs written in C, C ++ and C # languages. The PVS-Studio tool is designed for developers of modern applications and integrates into Visual Studio 2010-2015 environments.
In preparing the article about checking the next
open project , we provide in it information far from all warnings issued by the static analyzer. Therefore, we recommend that the authors of the project independently perform an analysis and study all the messages issued by the analyzer. We provide a key to open project developers for a time.
')
Mistakes
V501 There are identical sub-expressions 'this-> propId == Js :: PropertyIds :: _ superReferenceSymbol' to the left | operator. diagobjectmodel.cpp 123
IDiagObjectModelDisplay * ResolvedObject::CreateDisplay() { .... if (this->isConst || this->propId == Js::PropertyIds::_superReferenceSymbol || this->propId == Js::PropertyIds::_superReferenceSymbol) { pOMDisplay->SetDefaultTypeAttribute(....); } .... }
In the condition there are two identical checks. Perhaps, when writing code, the same constant was randomly chosen in the IntelliSense menu, for example, instead of “Js :: PropertyIds :: _superCtorReferenceSymbol”.
V501 There are identical sub-expressions 'GetVarSymID (srcIndexOpnd-> GetStackSym ()) to ==' operator. globopt.cpp 20795
void GlobOpt::EmitMemop(....) { .... IR::RegOpnd *srcBaseOpnd = nullptr; IR::RegOpnd *srcIndexOpnd = nullptr; IRType srcType; GetMemOpSrcInfo(...., srcBaseOpnd, srcIndexOpnd, srcType); Assert(GetVarSymID(srcIndexOpnd->GetStackSym()) ==
Two more identical comparisons. Most likely, we wanted to compare "srcIndexOpnd-> GetStackSym ()" with "srcBaseOpnd -> GetStackSym ()".
V517 The use of if (A) {...} else if (A) {...} 'pattern was detected. There is a possibility of logical error presence. Check lines: 3220, 3231. lower.cpp 3220
bool Lowerer::GenerateFastBrSrEq(...., IR::RegOpnd * srcReg1, IR::RegOpnd * srcReg2, ....) { if (srcReg2 && IsConstRegOpnd(srcReg2)) { .... } else if (srcReg1 && IsConstRegOpnd(srcReg1)) { .... } else if (srcReg2 && (srcReg2->m_sym->m_isStrConst)) { .... } else if (srcReg1 && (srcReg1->m_sym->m_isStrConst))
Two identical checks were found in the cascade of conditional statements, so that the code block in the last condition never gets control. The full code of the presented example is very large and it is difficult to notice a typo. This is a good example that shows how a static analyzer can be useful when working with the same type of code, from which the programmer quickly gets tired and loses attention.
Most likely, the last two conditions would be written like this:
.... else if (srcReg2 && (srcReg2->m_sym->m_isStrEmpty)) { .... } else if (srcReg1 && (srcReg1->m_sym-> m_isStrEmpty))
V713 The pointer to the scriptContext was used in the same logical expression. diaghelpermethodwrapper.cpp 214
template <bool doCheckParentInterpreterFrame> void HandleHelperOrLibraryMethodWrapperException(....) { .... if (!exceptionObject->IsDebuggerSkip() || exceptionObject == scriptContext->GetThreadContext()->.... || exceptionObject == scriptContext->GetThreadContext()->.... || !scriptContext)
The dereferencing of the “scriptContext” pointer takes place before its correctness is checked. Due to luck, such a mistake has not yet manifested itself and has not been noticed. Such errors can live very long in the code, and manifest themselves in rare atypical situations.
V570 The 'this-> isInlined' variable is assigned to itself. functioncodegenjittimedata.h 625
void SetupRecursiveInlineeChain( Recycler *const recycler, const ProfileId profiledCallSiteId) { if (!inlinees) { inlinees = RecyclerNewArrayZ(....); } inlinees[profiledCallSiteId] = this; inlineeCount++; this->isInlined = isInlined;
It is very suspicious that the same value is put in the boolean variable 'isInlined'. Most likely they wanted to write something else.
Another place to assign a variable to itself:
- V570 The 'sym-> m_isTaggableIntConst' variable is assigned to itself. linearscan.cpp 3170
V590 Consider inspecting the 'sub [i]! =' - '&& sub [i] ==' / '' expression. The expression is misprint. rl.cpp 1388
const char * stristr ( const char * str, const char * sub ) { .... for (i = 0; i < len; i++) { if (tolower(str[i]) != tolower(sub[i])) { if ((str[i] != '/' && str[i] != '-') || (sub[i] != '-' && sub[i] == '/')) { / <=
The analyzer found that the part of the conditional expression (sub [i]! = '-') does not affect the result of the test. To verify this, we construct a truth table. Most likely there is a typo, but what should be the correct code, I find it difficult to answer.
V603 The object was not used. If you wish to call the constructor, 'this-> StringCopyInfo :: StringCopyInfo (....)' should be used. stringcopyinfo.cpp 64
void StringCopyInfo::InstantiateForceInlinedMembers() { AnalysisAssert(false); StringCopyInfo copyInfo; JavascriptString *const string = nullptr; wchar_t *const buffer = nullptr; (StringCopyInfo());
Programmers often
make mistakes when trying to explicitly call a constructor to initialize an object. In this example, new, unnamed objects of type “StringCopyInfo” are created and immediately destroyed. As a result, the class fields remain uninitialized.
The correct option would be to create an initialization function and call it from constructors and in this place.
V610 Undefined behavior. Check the shift operator '<<'. The left operand '-1' is negative. constants.h 39
class Constants { public: .... static const int Int31MinValue = -1 << 30; .... };
According to the modern standard of the C ++ language, a shift in a negative number leads to undefined behavior.
V557 Array overrun is possible. The value of 'i' index could reach 8. rl.cpp 2375
enum TestInfoKind::_TIK_COUNT = 9 const char * const TestInfoEnvLstFmt[] = { " TESTFILE=\"%s\"", " BASELINE=\"%s\"", " CFLAGS=\"%s\"", " LFLAGS=\"%s\"", NULL, NULL, NULL, NULL
The analyzer detected an index out of array. The fact is that the for () loop performs 9 iterations, and in the array “TestInfoEnvLstFmt []” there are only 8 elements.
Most likely you forgot to add another NULL at the end:
const char * const TestInfoEnvLstFmt[] = { " TESTFILE=\"%s\"", " BASELINE=\"%s\"", " CFLAGS=\"%s\"", " LFLAGS=\"%s\"", NULL, NULL, NULL, NULL
But maybe they missed some line in the middle of the array!
Dangerous Signs
Diagnostics V595 finds code points where pointer dereferencing is performed before it is checked for zero. Usually in the checked projects there is always a certain number of such warnings. In our database of errors, it ranks first in the number of defects found (see
examples ). But the fact is that the V595 diagnostics are somewhat boring in order to write many examples from some project. Also, checking and dereferencing a pointer can be quite far in the function code: tens or even hundreds of lines from each other, which complicates the description of the error in the article.
Next, I will describe just a few of the shortest and most illustrative examples of code that most likely contain an error when working with pointers.
V595 The 'instrLd' pointer was used before it was verified against nullptr. Check lines: 1823, 1831. flowgraph.cpp 1823
IR::Instr * FlowGraph::PeepTypedCm(IR::Instr *instr) { .... if (instrLd && !instrLd->GetSrc1()->IsEqual(instr->GetDst())) { return nullptr; } if(instrLd2 && !instrLd2->GetSrc1()->IsEqual(instrLd->GetDst())) { return nullptr; } .... }
Notice the pointer named "instrLd". In the first condition, the dereference of this pointer is performed together with the check for zero, but in the second condition it is forgotten to do so, a situation may arise when the dereferencing of the null pointer occurs.
V595 The 'src2Val' pointer was used before it was verified against nullptr. Check lines: 9717, 9725. globopt.cpp 9717
bool GlobOpt::TypeSpecializeIntBinary(....) { .... bool isIntConstMissingItem = src2Val->GetValueInfo()->.... if(isIntConstMissingItem) { isIntConstMissingItem = Js::SparseArraySegment<int>::.... } if (!src2Val || !(src2Val->GetValueInfo()->IsLikelyInt()) || isIntConstMissingItem) { return false; } .... }
The “src2Val” pointer is used at the beginning of the function, but then the developers actively began to check whether the pointer is zero.
V595 The m_lastInstr 'pointer was used before it was verified against nullptr. Check lines: 214, 228. irbuilderasmjs.cpp 214
void IRBuilderAsmJs::AddInstr(IR::Instr * instr, uint32 offset) { m_lastInstr->InsertAfter(instr);
Another example of careless use of a pointer, which could potentially be null.
List of similar places:
- V595 The 'arrayData' pointer was used before it was verified against nullptr. Check lines: 868, 870. immutablelist.h 868
- V595 The 'pMembersList' pointer was used before it was verified against nullptr. Check lines: 2012, 2015. diagobjectmodel.cpp 2012
- V595 The 'walkerRef' pointer was used before it was verified against nullptr. Check lines: 3191, 3193. diagobjectmodel.cpp 3191
- V595 The 'block-> loop' pointer was used before it was verified against nullptr. Check lines: 981, 1002. globopt.cpp 981
- V595 The 'src2Val' pointer was used before it was verified against nullptr. Check lines: 12528, 12536. globopt.cpp 12528
- V595 The 'symDst' pointer was used before it was verified against nullptr. Check lines: 1966, 1967. irbuilderasmjs.cpp 1966
- V595 The 'symDst' pointer was used before it was verified against nullptr. Check lines: 2010, 2011. irbuilderasmjs.cpp 2010
- V595 The 'symDst' pointer was used before it was verified against nullptr. Check lines: 2076, 2077. irbuilderasmjs.cpp 2076
- V595 The 'symDst' pointer was used before it was verified against nullptr. Check lines: 3591, 3592. irbuilderasmjs.cpp 3591
- V595 The 'symDst' pointer was used before it was verified against nullptr. Check lines: 4113, 4114. irbuilderasmjs.cpp 4113
- V595 The 'symDst' pointer was used before it was verified against nullptr. Check lines: 4510, 4511. irbuilderasmjs.cpp 4510
- V595 The m_lastInstr 'pointer was used before it was verified against nullptr. Check lines: 1102, 1116. irbuilder.cpp 1102
The list contains the most simple and clear examples. To explore all these places, developers should study the analyzer’s report themselves.
V522 Dereferencing of the null pointer 'tempNumberTracker' might take place. backwardpass.cpp 578
void BackwardPass::MergeSuccBlocksInfo(BasicBlock * block) { TempNumberTracker * tempNumberTracker = nullptr;
An example of another diagnostic, but also about pointers. Here is a fragment of the function MergeSuccBlocksInfo (), it is quite long - 707 lines. But with the help of static analysis, it was possible to find the pointer “tempNumberTracker”, the initialization of which could potentially fail due to a number of conditions. As a result, in case of an unsuccessful set of circumstances, a null pointer will be dereferenced.
Stop it! Check out the Assert!
Assert, placed in a program, indicates that the developer assumes that some expression is true for a program that works correctly. But is it possible to trust such “successful” checks?
V547 Expression 'srcIndex - src-> left> = 0' is always true. Unsigned type value is always> = 0. sparsearraysegment.inl 355
class SparseArraySegmentBase { public: static const uint32 MaxLength; .... uint32 size; .... } template<typename T> SparseArraySegment<T>* SparseArraySegment<T>::CopySegment(...., uint32 srcIndex, ....) { .... AssertMsg(srcIndex - src->left >= 0,
Pay attention to the comparison "srcIndex - src-> left> = 0". The difference of two unsigned numbers will always be greater than or equal to zero. Further, this formula is used in functions for working with memory. The result may not be the same as the programmer expected.
V547 Expression is always true. Probably the '&&' operator should be used here. bytecodegenerator.cpp 805
void ByteCodeGenerator::AssignRegister(Symbol *sym) { AssertMsg(sym->GetDecl() == nullptr || sym->GetDecl()->nop != knopConstDecl ||
In this Assert, some values ​​are partially tested. If the assumption "sym-> GetDecl () == nullptr" is false, then the following conditions are always true. You can verify this by building a truth table:
V547 Expression 'callSiteId> = 0' is always true. Unsigned type value is always> = 0. inline.cpp 1181
typedef uint16 ProfileId; Func * Inline::BuildInlinee(Js::FunctionBody* funcBody, ....) { .... Js::ProfileId callSiteId = static_cast<Js::ProfileId>(....); Assert(callSiteId >= 0); .... }
In this and two other places, the analyzer detected an incorrect comparison of an unsigned number with zero:
- V547 Expression 'callSiteId> = 0' is always true. Unsigned type value is always> = 0. inline.cpp 2627
- V547 Expression 'callSiteId> = 0' is always true. Unsigned type value is always> = 0. inline.cpp 3657
Conclusion
Microsoft has noticed a positive trend to open its projects under free licenses. For us, this is additional testing of the analyzer on new projects, as well as the opportunity to demonstrate the usefulness and effectiveness of static analysis on projects of such a large and well-known software manufacturer.
You may be interested in a list of all
verified projects , including other projects from Microsoft, such as .NET CoreCLR, .NET CoreFX and Microsoft Code Contracts.
If you want to share this article with an English-speaking audience, then please use the link to the translation: Svyatoslav Razmyslov.
ChakraCore: analysis of JavaScript-engines for Microsoft Edge .