More than two years have passed since the last verification of the LLVM project code with the help of our PVS-Studio analyzer. Let's make sure that the PVS-Studio analyzer is still the leading tool for identifying errors and potential vulnerabilities. To do this, check and find new bugs in the release of LLVM 8.0.0.
Article to be written
To be honest, I did not want to write this article. It is not interesting to write about the project, which we have repeatedly checked (
1 ,
2 ,
3 ). It is better to write about something new, but I have no choice.
Every time a new version of LLVM comes out or the
Clang Static Analyzer is updated, we have the following type of questions in the mail:
See, the new version of Clang Static Analyzer has learned to find new errors! It seems to me that the relevance of using PVS-Studio is decreasing. Clang finds more errors than before and catches up with the capabilities of PVS-Studio. What do you think about it?')
To this I always want to answer something in the spirit:
We also do not sit idle! We have significantly improved the capabilities of the PVS-Studio analyzer. So do not worry, we continue to lead, as before.
Unfortunately, this is a bad answer. There is no proof. And that is why I am writing this article now. So, the LLVM project is once again checked and various errors are found in it. Those that seemed interesting to me, I will now demonstrate. These errors cannot be found by Clang Static Analyzer (or it is extremely inconvenient to do with it). And we can. And I found and wrote out all these errors in one evening.
But writing the article was delayed for several weeks. I could not force myself to put all this in the form of text :).
By the way, if you are wondering what technologies are used in the PVS-Studio analyzer to detect errors and potential vulnerabilities, then I suggest
reading this
note .
New and old diagnostics
As already noted, about two years ago, the LLVM project was once again checked, and the errors found were corrected. Now this article will present a new portion of errors. Why were new bugs found? There are 3 reasons for this:
- The LLVM project develops, the old code changes, and a new one appears. Naturally in the modified and written code there are new errors. This clearly demonstrates that static analysis should be applied regularly, and not occasionally. Our articles show well the capabilities of the PVS-Studio analyzer, but this has nothing to do with improving the quality of the code and reducing the cost of fixing errors. Use static code analyzer regularly!
- We are refining and improving existing diagnostics. Therefore, the analyzer can detect errors that were not noticed during previous checks.
- New diagnostics have appeared in PVS-Studio, which were absent 2 years ago. I decided to allocate them in a separate section in order to demonstrate the development of PVS-Studio.
Defects identified by diagnostics that existed 2 years ago
Fragment N1: Copy-Pastestatic bool ShouldUpgradeX86Intrinsic(Function *F, StringRef Name) { if (Name == "addcarryx.u32" ||
PVS-Studio
warning :
V501 [CWE-570] There are identical sub-expressions' Name.startswith ("avx512.mask.permvar.") operator. AutoUpgrade.cpp 73
It is checked twice that the name begins with the substring "avx512.mask.permvar.". In the second check, we obviously wanted to write something else, but forgot to correct the copied text.
Fragment N2: Typo enum CXNameRefFlags { CXNameRange_WantQualifier = 0x1, CXNameRange_WantTemplateArgs = 0x2, CXNameRange_WantSinglePiece = 0x4 }; void AnnotateTokensWorker::HandlePostPonedChildCursor( CXCursor Cursor, unsigned StartTokenIndex) { const auto flags = CXNameRange_WantQualifier | CXNameRange_WantQualifier; .... }
PVS-Studio warning: V501 There are identical sub-expressions 'CXNameRange_WantQualifier' to the left and the right of the '|' operator. CIndex.cpp 7245
Because of a typo, the same named constant
CXNameRange_WantQualifier is used twice .
Fragment N3: Confusion with the priorities of operators int PPCTTIImpl::getVectorInstrCost(unsigned Opcode, Type *Val, unsigned Index) { .... if (ISD == ISD::EXTRACT_VECTOR_ELT && Index == ST->isLittleEndian() ? 1 : 0) return 0; .... }
PVS-Studio
warning :
V502 [CWE-783] Perhaps the '?:' Operator works in a different way. The '?:' Operator has a lower limit than the '==' operator. PPCTargetTransformInfo.cpp 404
In my opinion, this is a very beautiful mistake. Yes, I know that I have strange ideas about beauty :).
Now, according to the
priorities of the operators , the expression is calculated as follows:
(ISD == ISD::EXTRACT_VECTOR_ELT && (Index == ST->isLittleEndian())) ? 1 : 0
From a practical point of view, this condition does not make sense, since it can be reduced to:
(ISD == ISD::EXTRACT_VECTOR_ELT && Index == ST->isLittleEndian())
This is an obvious mistake. Most likely, 0/1 would be compared with the
Index variable. To fix the code you need to add brackets around the ternary operator:
if (ISD == ISD::EXTRACT_VECTOR_ELT && Index == (ST->isLittleEndian() ? 1 : 0))
By the way, the ternary operator is very dangerous and provokes logical errors. Be very careful with him and do not be greedy to put parentheses. I covered this topic in more detail
here , in the chapter “Fear the Operator?: And enclose it in parentheses”.
Fragment N4, N5: Null Index Init *TGParser::ParseValue(Record *CurRec, RecTy *ItemType, IDParseMode Mode) { .... TypedInit *LHS = dyn_cast<TypedInit>(Result); .... LHS = dyn_cast<TypedInit>( UnOpInit::get(UnOpInit::CAST, LHS, StringRecTy::get()) ->Fold(CurRec)); if (!LHS) { Error(PasteLoc, Twine("can't cast '") + LHS->getAsString() + "' to string"); return nullptr; } .... }
PVS-Studio
warning :
V522 [CWE-476] Dereferencing of the null pointer 'LHS' might take place. TGParser.cpp 2152
If the
LHS appears to be zero, a warning should be issued. However, instead of this, this null pointer itself will be dereferenced:
LHS-> getAsString () .
This is a very typical situation when an error is hidden in the error handler, as no one tests them. Static analyzers check all reachable code, no matter how often it is used. This is a very good example of how static analysis complements other testing and error protection techniques.
A similar
RHS pointer processing error was made in the code just below: V522 [CWE-476] Dereferencing of the null pointer 'RHS' might take place. TGParser.cpp 2186
Fragment N6: Using the pointer after moving static Expected<bool> ExtractBlocks(....) { .... std::unique_ptr<Module> ProgClone = CloneModule(BD.getProgram(), VMap); .... BD.setNewProgram(std::move(ProgClone));
PVS-Studio warning: V522 [CWE-476] Dereferencing of the null pointer 'ProgClone' might take place. Miscompilation.cpp 601
At the beginning, the smart pointer
ProgClone ceases to own the object:
BD.setNewProgram(std::move(ProgClone));
In fact, now
ProgClone is a null pointer. Therefore, a null pointer dereference should occur just below:
Function *NewF = ProgClone->getFunction(MisCompFunctions[i].first);
But, in fact, this will not happen! Note that the loop is not actually executed.
At the beginning, the
MiscompiledFunctions container is cleared:
MiscompiledFunctions.clear();
Further, the size of this container is used in the loop condition:
for (unsigned i = 0, e = MisCompFunctions.size(); i != e; ++i) {
It is easy to see that the cycle does not start. I think this is also a mistake, and the code should be written in a different way.
It seems that we met the very famous parity of errors! One mistake masks another :).
Fragment N7: Using the pointer after moving static Expected<bool> TestOptimizer(BugDriver &BD, std::unique_ptr<Module> Test, std::unique_ptr<Module> Safe) { outs() << " Optimizing functions being tested: "; std::unique_ptr<Module> Optimized = BD.runPassesOn(Test.get(), BD.getPassesToRun()); if (!Optimized) { errs() << " Error running this sequence of passes" << " on the input program!\n"; BD.setNewProgram(std::move(Test));
PVS-Studio warning: V522 [CWE-476] Dereferencing of the null pointer 'Test' might take place. Miscompilation.cpp 709
Again the same situation. At the beginning, the contents of the object are moved, and then it is used as if nothing had happened. I increasingly encounter this situation in the program code, after the semantics of displacement appeared in C ++. That's why I love the C ++ language! There are more and more new ways to shoot yourself a leg. PVS-Studio analyzer will always have a job :).
Fragment N8: Null Index void FunctionDumper::dump(const PDBSymbolTypeFunctionArg &Symbol) { uint32_t TypeId = Symbol.getTypeId(); auto Type = Symbol.getSession().getSymbolById(TypeId); if (Type) Printer << "<unknown-type>"; else Type->dump(*this); }
PVS-Studio warning: V522 [CWE-476] Dereferencing of the null pointer 'Type' might take place. PrettyFunctionDumper.cpp 233
In addition to error handlers, debugging printout functions are usually not tested. Before us is just such a case. The function waits for the user, who, instead of solving his problems, will be forced to fix it.
Right:
if (Type) Type->dump(*this); else Printer << "<unknown-type>";
Fragment N9: Null Index void SearchableTableEmitter::collectTableEntries( GenericTable &Table, const std::vector<Record *> &Items) { .... RecTy *Ty = resolveTypes(Field.RecType, TI->getType()); if (!Ty)
PVS-Studio warning: V522 [CWE-476] Dereferencing of the null pointer 'Ty' might take place. SearchableTableEmitter.cpp 614
I think everything is clear and requires no explanation.
Fragment N10: Typo bool FormatTokenLexer::tryMergeCSharpNullConditionals() { .... auto &Identifier = *(Tokens.end() - 2); auto &Question = *(Tokens.end() - 1); .... Identifier->ColumnWidth += Question->ColumnWidth; Identifier->Type = Identifier->Type;
PVS-Studio
warning :
V570 The 'Identifier-> Type' variable is assigned to itself. FormatTokenLexer.cpp 249
It makes no sense to assign a variable to itself. Most likely they wanted to write:
Identifier->Type = Question->Type;
Fragment N11: Suspicious break void SystemZOperand::print(raw_ostream &OS) const { switch (Kind) { break; case KindToken: OS << "Token:" << getToken(); break; case KindReg: OS << "Reg:" << SystemZInstPrinter::getRegisterName(getReg()); break; .... }
PVS-Studio
warning :
V622 [CWE-478] Consider inspecting the 'switch' statement. It's possible that the operator is missing. SystemZAsmParser.cpp 652
At the beginning there is a very suspicious
break statement. Have you forgotten to write something else?
Fragment N12: Check pointer after dereference InlineCost AMDGPUInliner::getInlineCost(CallSite CS) { Function *Callee = CS.getCalledFunction(); Function *Caller = CS.getCaller(); TargetTransformInfo &TTI = TTIWP->getTTI(*Callee); if (!Callee || Callee->isDeclaration()) return llvm::InlineCost::getNever("undefined callee"); .... }
PVS-Studio
warning :
V595 [CWE-476] Check lines: 172, 174. AMDGPUInline.cpp 172
The
Callee pointer is first dereferenced at the time of the
getTTI function
call .
And then it turns out that this pointer should be checked for equality
nullptr :
if (!Callee || Callee->isDeclaration())
But it's' too late…
Fragment N13 - N ...: Check pointer after dereferenceThe situation described in the previous code snippet is not unique. She is found here:
static Value *optimizeDoubleFP(CallInst *CI, IRBuilder<> &B, bool isBinary, bool isPrecise = false) { .... Function *CalleeFn = CI->getCalledFunction(); StringRef CalleeNm = CalleeFn->getName();
PVS-Studio warning: V595 [CWE-476] Check lines: 1079, 1081. SimplifyLibCalls.cpp 1079
And here:
void Sema::InstantiateAttrs(const MultiLevelTemplateArgumentList &TemplateArgs, const Decl *Tmpl, Decl *New, LateInstantiatedAttrVec *LateAttrs, LocalInstantiationScope *OuterMostScope) { .... NamedDecl *ND = dyn_cast<NamedDecl>(New); CXXRecordDecl *ThisContext = dyn_cast_or_null<CXXRecordDecl>(ND->getDeclContext());
PVS-Studio warning: V595 [CWE-476] Check lines: 532, 534. SemaTemplateInstantiateDecl.cpp 532
And here:
- V595 [CWE-476] The 'U' pointer was used before it was verified against nullptr. Check lines: 404, 407. DWARFFormValue.cpp 404
- V595 [CWE-476] The 'ND' pointer was used against nullptr. Check lines: 2149, 2151. SemaTemplateInstantiate.cpp 2149
And then I was not interested in studying the warnings with the number V595. So I do not know if there are any more similar errors besides those listed here. Most likely there.
Fragment N17, N18: Suspicious Shift static inline bool processLogicalImmediate(uint64_t Imm, unsigned RegSize, uint64_t &Encoding) { .... unsigned Size = RegSize; .... uint64_t NImms = ~(Size-1) << 1; .... }
PVS-Studio
warning :
V629 [CWE-190] Consider inspecting the '~ (Size - 1) << 1' expression. Bit shifting of the 32-bit type. AArch64AddressingModes.h 260
Perhaps this is not an error, and the code works exactly as intended. But this is clearly a very suspicious place, and it needs to be checked.
Suppose the
Size variable is 16, and then the author of the code planned to get the value in the
NImms variable:
11111111111111111111111111111111111111111111111111111111100000
However, in fact, you get the value:
00000000000000000000000000000011111111111111111111111111100000
The point is that all calculations are performed using the 32-bit unsigned type. And only then, this 32-bit unsigned type will be implicitly extended to
uint64_t . In this case, the high bits will be zero.
You can fix the situation like this:
uint64_t NImms = ~static_cast<uint64_t>(Size-1) << 1;
Similar situation: V629 [CWE-190] Consider inspecting the 'Immr << 6' expression. Bit shifting of the 32-bit type. AArch64AddressingModes.h 269
Fragment N19: Missing the else keyword ? void AMDGPUAsmParser::cvtDPP(MCInst &Inst, const OperandVector &Operands) { .... if (Op.isReg() && Op.Reg.RegNo == AMDGPU::VCC) {
PVS-Studio
warning :
V646 [CWE-670] Consider inspecting the application's logic. It's possible that 'else' keyword is missing. AMDGPUAsmParser.cpp 5655
There is no error here. Since the then-block of the first
if ends in
continue , it does not matter
if the keyword is
else or not. In any case, the code will work the same way. However, missing
else makes the code more obscure and dangerous. If continue
continue disappears, the code will start to work in a completely different way. In my opinion, it is better to add
else .
Fragment N20: Four of the same type of typo LLVM_DUMP_METHOD void Symbol::dump(raw_ostream &OS) const { std::string Result; if (isUndefined()) Result += "(undef) "; if (isWeakDefined()) Result += "(weak-def) "; if (isWeakReferenced()) Result += "(weak-ref) "; if (isThreadLocalValue()) Result += "(tlv) "; switch (Kind) { case SymbolKind::GlobalSymbol: Result + Name.str();
PVS-Studio warnings:
- V655 [CWE-480] The strings were concatenated but are not utilized. Consider inspecting the 'Result + Name.str ()' expression. Symbol.cpp 32
- V655 [CWE-480] The strings were concatenated but are not utilized. Consider inspecting the 'Result + "(ObjC Class)" + Name.str ()' expression. Symbol.cpp 35
- V655 [CWE-480] The strings were concatenated but are not utilized. Consider inspecting the 'Result + "(ObjC Class EH)" + Name.str ()' expression. Symbol.cpp 38
- V655 [CWE-480] The strings were concatenated but are not utilized. Consider inspecting the 'Result + "(ObjC IVar)" + Name.str ()' expression. Symbol.cpp 41
Randomly, instead of the + operator =, the + operator is used. As a result, constructions are devoid of meaning.
Fragment N21: Uncertain behavior static void getReqFeatures(std::map<StringRef, int> &FeaturesMap, const std::vector<Record *> &ReqFeatures) { for (auto &R : ReqFeatures) { StringRef AsmCondString = R->getValueAsString("AssemblerCondString"); SmallVector<StringRef, 4> Ops; SplitString(AsmCondString, Ops, ","); assert(!Ops.empty() && "AssemblerCondString cannot be empty"); for (auto &Op : Ops) { assert(!Op.empty() && "Empty operator"); if (FeaturesMap.find(Op) == FeaturesMap.end()) FeaturesMap[Op] = FeaturesMap.size(); } } }
Try to find a dangerous code yourself. And this is a picture to divert attention, so as not to immediately look at the answer:
PVS-Studio
warning :
V708 [CWE-758] Dangerous construction is used: 'FeaturesMap [Op] = FeaturesMap.size ()', where 'FeaturesMap' is of 'map' class. This may lead to undefined behavior. RISCVCompressInstEmitter.cpp 490
Problem line:
FeaturesMap[Op] = FeaturesMap.size();
If the element
Op is not found, then a new element is created in the map and the number of elements in this map is recorded there. It’s just not known if the
size function will be called before or after adding a new element.
Fragment N22-N24: Reassignment Error MachOObjectFile::checkSymbolTable() const { .... } else { MachO::nlist STE = getSymbolTableEntry(SymDRI); NType = STE.n_type;
PVS-Studio
warning :
V519 [CWE-563] The 'NType' variable is assigned values ​​twice successively. Perhaps this is a mistake. Check lines: 1663, 1664. MachOObjectFile.cpp 1664
I think there is no real error here. Just an extra recurring assignment. But still a blooper.
Similarly:
- V519 [CWE-563] The 'B.NDesc' variable is assigned values ​​twice successively. Perhaps this is a mistake. Check lines: 1488, 1489. llvm-nm.cpp 1489
- V519 [CWE-563] The variable is assigned values ​​twice successively. Perhaps this is a mistake. Check lines: 59, 61. coff2yaml.cpp 61
Fragment N25-N27: More reassignmentsNow consider a slightly different reassignment option.
bool Vectorizer::vectorizeLoadChain( ArrayRef<Instruction *> Chain, SmallPtrSet<Instruction *, 16> *InstructionsProcessed) { .... unsigned Alignment = getAlignment(L0); .... unsigned NewAlign = getOrEnforceKnownAlignment(L0->getPointerOperand(), StackAdjustedAlignment, DL, L0, nullptr, &DT); if (NewAlign != 0) Alignment = NewAlign; Alignment = NewAlign; .... }
PVS-Studio warning: V519 [CWE-563] The 'Alignment' variable is assigned values ​​twice successively. Perhaps this is a mistake. Check lines: 1158, 1160. LoadStoreVectorizer.cpp 1160
This is a very strange code, which apparently contains a logical error. In the beginning, the variable
Alignment is assigned a value depending on the condition. And then assignment occurs again, but now without any verification.
Similar situations can be seen here:
- V519 [CWE-563] The 'Effects' variable is assigned values ​​twice successively. Perhaps this is a mistake. Check lines: 152, 165. WebAssemblyRegStackify.cpp 165
- V519 [CWE-563] The 'ExpectNoDerefChunk' variable is assigned values ​​twice successively. Perhaps this is a mistake. Check lines: 4970, 4973. SemaType.cpp 4973
Fragment N28: Always true condition static int readPrefixes(struct InternalInstruction* insn) { .... uint8_t byte = 0; uint8_t nextByte; .... if (byte == 0xf3 && (nextByte == 0x88 || nextByte == 0x89 || nextByte == 0xc6 || nextByte == 0xc7)) { insn->xAcquireRelease = true; if (nextByte != 0x90)
PVS-Studio
warning :
V547 [CWE-571] Expression 'nextByte! = 0x90' is always true. X86DisassemblerDecoder.cpp 379
Check does not make sense. The variable
nextByte is always not equal to the value
0x90 , which follows from the previous check. This is some kind of logical error.
Fragment N29 - N ...: Always true / false conditionsThe analyzer generates many warnings that the entire condition (
V547 ) or part of it (
V560 ) is always true or false. Often, these are not real errors, but simply inaccurate code, the result of macro deployment, and the like. However, it makes sense to look at all these warnings, as from time to time there are true logical errors. For example, this section of code is suspicious:
static DecodeStatus DecodeGPRPairRegisterClass(MCInst &Inst, unsigned RegNo, uint64_t Address, const void *Decoder) { DecodeStatus S = MCDisassembler::Success; if (RegNo > 13) return MCDisassembler::Fail; if ((RegNo & 1) || RegNo == 0xe) S = MCDisassembler::SoftFail; .... }
PVS-Studio
warning :
V560 [CWE-570] A part of the conditional expression is always false: RegNo == 0xe. ARMDisassembler.cpp 939
The constant 0xE is a value of 14 in the decimal system. Checking
RegNo == 0xe does not make sense, since if
RegNo is> 13 , then the function will complete its execution.
There were many other warnings with the identifier V547 and V560, but, as in the case of
V595 , I was not interested in studying these warnings. It was clear that I had enough material to write an article :). Therefore, it is not known how many errors of this type can be identified in LLVM with the help of PVS-Studio.
I will give an example why it is boring to study these triggers. The analyzer is absolutely right when issuing a warning to the following code. But this is not a mistake.
bool UnwrappedLineParser::parseBracedList(bool ContinueOnSemicolons, tok::TokenKind ClosingBraceKind) { bool HasError = false; .... HasError = true; if (!ContinueOnSemicolons) return !HasError; .... }
PVS-Studio warning: V547 [CWE-570] Expression '! HasError' is always false. UnwrappedLineParser.cpp 1635
Fragment N30: ​​Suspicious return static bool isImplicitlyDef(MachineRegisterInfo &MRI, unsigned Reg) { for (MachineRegisterInfo::def_instr_iterator It = MRI.def_instr_begin(Reg), E = MRI.def_instr_end(); It != E; ++It) { return (*It).isImplicitDef(); } .... }
PVS-Studio
warning :
V612 [CWE-670] An unconditional 'return' within a loop. R600OptimizeVectorRegisters.cpp 63
This is either an error or a specific technique that is intended to explain something to programmers who read the code. Such a construction does not explain anything to me and looks very suspicious. It’s better not to write like that :).
Are you tired? Then it's time to make tea or coffee.
Defects revealed by new diagnostics
I think 30 responses of old diagnostics are enough. Let's now see what is interesting can be found by new diagnostics, which appeared in the analyzer after the
previous test. In total, 66 general-purpose diagnostics were added to the C ++ analyzer.
Fragment N31: Unreachable Code Error CtorDtorRunner::run() { .... if (auto CtorDtorMap = ES.lookup(JITDylibSearchList({{&JD, true}}), std::move(Names), NoDependenciesToRegister, true)) { .... return Error::success(); } else return CtorDtorMap.takeError(); CtorDtorsByPriority.clear(); return Error::success(); }
PVS-Studio
warning :
V779 [CWE-561] Unreachable code detected. It is possible that an error is present. ExecutionUtils.cpp 146
As you can see, both branches of the
if statement end with a call to the
return statement . Accordingly, the
CtorDtorsByPriority container
will never be
emptied .
Fragment N32: Unreachable Code bool LLParser::ParseSummaryEntry() { .... switch (Lex.getKind()) { case lltok::kw_gv: return ParseGVEntry(SummaryID); case lltok::kw_module: return ParseModuleEntry(SummaryID); case lltok::kw_typeid: return ParseTypeIdEntry(SummaryID);
PVS-Studio warning: V779 [CWE-561] Unreachable code detected. It is possible that an error is present. LLParser.cpp 835
An interesting situation. Let's look at this place at the beginning:
return ParseTypeIdEntry(SummaryID); break;
At first glance it seems that there is no error here. It seems that the
break statement here is superfluous, and you can simply delete it. However, not all so simple.
The analyzer issues a warning on the lines:
Lex.setIgnoreColonInIdentifiers(false); return false;
And indeed, this code is unreachable. All cases in
switch end with a call to the
return statement . And now the senseless lonely
break does not look so harmless! Perhaps one of the branches should end in
break , not
return ?
Fragment N33: Random resetting of high bits unsigned getStubAlignment() override { if (Arch == Triple::systemz) return 8; else return 1; } Expected<unsigned> RuntimeDyldImpl::emitSection(const ObjectFile &Obj, const SectionRef &Section, bool IsCode) { .... uint64_t DataSize = Section.getSize(); .... if (StubBufSize > 0) DataSize &= ~(getStubAlignment() - 1); .... }
PVS-Studio
warning :
V784 The size of the bit mask is less than the size of the first operand. This will cause the loss of higher bits. RuntimeDyld.cpp 815
Notice that the
getStubAlignment function returns
unsigned type. We calculate the value of the expression, if we assume that the function returns the value 8:
~ (getStubAlignment () - 1)
~ (8u-1)
0xFFFFFFF8u
Now notice that the
DataSize variable is of a 64-bit unsigned type. It turns out that when the DataSize & 0xFFFFFFF8u operation is performed, all thirty two high bits will be reset. Most likely, this is not what the programmer wanted. I suspect that he wanted to calculate: DataSize & 0xFFFFFFFFFFFFFFF8u.
To correct the error, you should write this:
DataSize &= ~(static_cast<uint64_t>(getStubAlignment()) - 1);
Or so:
DataSize &= ~(getStubAlignment() - 1ULL);
Fragment N34: Failure of an explicit type conversion template <typename T> void scaleShuffleMask(int Scale, ArrayRef<T> Mask, SmallVectorImpl<T> &ScaledMask) { assert(0 < Scale && "Unexpected scaling factor"); int NumElts = Mask.size(); ScaledMask.assign(static_cast<size_t>(NumElts * Scale), -1); .... }
PVS-Studio
warning :
V1028 [CWE-190] Possible overflow. Consider casting operands for the 'NumElts * Scale' operator for the 'size_t' type, not the result. X86ISelLowering.h 1577
Explicit type casting is used to prevent overflow when multiplying variables of type
int . However, here the explicit type conversion does not protect against overflow. At the beginning, the variables will be multiplied, and only then the 32-bit result of the multiplication will be expanded to the
size_t type.
Fragment N35: Unsuccessful Copy-Paste Instruction *InstCombiner::visitFCmpInst(FCmpInst &I) { .... if (!match(Op0, m_PosZeroFP()) && isKnownNeverNaN(Op0, &TLI)) { I.setOperand(0, ConstantFP::getNullValue(Op0->getType())); return &I; } if (!match(Op1, m_PosZeroFP()) && isKnownNeverNaN(Op1, &TLI)) { I.setOperand(1, ConstantFP::getNullValue(Op0->getType()));
V778 [CWE-682] Two similar code fragments were found. Perhaps this is a typo and 'Op1' variable should be used instead of 'Op0'. InstCombineCompares.cpp 5507This new interesting diagnostics reveals situations when a code fragment was copied, and in it they began to change some names, but did not correct them in one place.Please note that in the second block Op0 was changed to Op1 . But in one place did not correct. Most likely, it should have been written like this: if (!match(Op1, m_PosZeroFP()) && isKnownNeverNaN(Op1, &TLI)) { I.setOperand(1, ConstantFP::getNullValue(Op1->getType())); return &I; }
Fragment N36: Confusion in the variables struct Status { unsigned Mask; unsigned Mode; Status() : Mask(0), Mode(0){}; Status(unsigned Mask, unsigned Mode) : Mask(Mask), Mode(Mode) { Mode &= Mask; }; .... };
PVS-Studio warning : V1001 [CWE-563] The 'Mode' variable is assigned. SIModeRegister.cpp 48It is very dangerous to give the arguments of functions the same names as members of a class. It is very easy to get confused. Before us is just such a case. This expression does not make sense: Mode &= Mask;
The function argument is changing. And that's all.
This argument is no longer used. Most likely, it was necessary to write this: Status(unsigned Mask, unsigned Mode) : Mask(Mask), Mode(Mode) { this->Mode &= Mask; };
Fragment N37: Confusion in the variables class SectionBase { .... uint64_t Size = 0; .... }; class SymbolTableSection : public SectionBase { .... }; void SymbolTableSection::addSymbol(Twine Name, uint8_t Bind, uint8_t Type, SectionBase *DefinedIn, uint64_t Value, uint8_t Visibility, uint16_t Shndx, uint64_t Size) { .... Sym.Value = Value; Sym.Visibility = Visibility; Sym.Size = Size; Sym.Index = Symbols.size(); Symbols.emplace_back(llvm::make_unique<Symbol>(Sym)); Size += this->EntrySize; }
PVS-Studio warning: V1001 [CWE-563] The size of the variable is assigned. Object.cpp 424The situation is similar to the previous one. It should be written: this->Size += this->EntrySize;
Fragment N38-N47: We forgot to check the pointer.Previously, we considered examples of how the V595 diagnostics work . Its essence is that the pointer at the beginning is dereferenced, and only then it is checked. Young diagnostics V1004 is the opposite of its meaning, but also detects a lot of errors. It reveals situations when the pointer was checked at the beginning and then forgot to do it. Consider such cases found inside LLVM. int getGEPCost(Type *PointeeType, const Value *Ptr, ArrayRef<const Value *> Operands) { .... if (Ptr != nullptr) {
PVS-Studio warning: V1004 [CWE-476] The 'Ptr' pointer was used unsafely after it was verified against nullptr. Check lines: 729, 738. TargetTransformInfoImpl.h 738The Ptr variable can be equal to nullptr , as evidenced by the check: if (Ptr != nullptr)
However, below this pointer is dereferenced without preliminary check: auto PtrSizeBits = DL.getPointerTypeSizeInBits(Ptr->getType());
Consider another similar case. llvm::DISubprogram *CGDebugInfo::getFunctionFwdDeclOrStub(GlobalDecl GD, bool Stub) { .... auto *FD = dyn_cast<FunctionDecl>(GD.getDecl()); SmallVector<QualType, 16> ArgTypes; if (FD)
PVS-Studio warning: V1004 [CWE-476] The “FD” pointer was used unsafely after it was verified against nullptr. Check lines: 3228, 3231. CGDebugInfo.cpp 3231Notice the FD pointer . I am sure that the problem is clearly visible and no special explanation is required.And further:
static void computePolynomialFromPointer(Value &Ptr, Polynomial &Result, Value *&BasePtr, const DataLayout &DL) { PointerType *PtrTy = dyn_cast<PointerType>(Ptr.getType()); if (!PtrTy) {
PVS-Studio warning: V1004 [CWE-476] The 'PtrTy' pointer was used unsafely after it was verified against nullptr. Check lines: 960, 965. InterleavedLoadCombinePass.cpp 965How to protect against such errors? Be attentive to the Code-Review and use the PVS-Studio static analyzer for regular code checking.There is no sense to cite other code fragments with errors of this type. I leave in the article only a list of warnings:- V1004 [CWE-476] The 'Expr' pointer was used unsafely after it was verified against nullptr. Check lines: 1049, 1078. DebugInfoMetadata.cpp 1078
- V1004 [CWE-476] The 'PI' pointer was used unsafely after it was verified against nullptr. Check lines: 733, 753. LegacyPassManager.cpp 753
- V1004 [CWE-476] The 'StatepointCall' pointer was used unsafely after it was verified against nullptr. Check lines: 4371, 4379. Verifier.cpp 4379
- V1004 [CWE-476] The 'RV' pointer was used unsafely after it was verified against nullptr. Check lines: 2263, 2268. TGParser.cpp 2268
- V1004 [CWE-476] The 'CalleeFn' pointer was used unsafely after it was verified against nullptr. Check lines: 1081, 1096. SimplifyLibCalls.cpp 1096
- V1004 [CWE-476] The 'TC' pointer was used unsafely after it was verified against nullptr. Check lines: 1819, 1824. Driver.cpp 1824
N48-N60: , ( ) std::unique_ptr<IRMutator> createISelMutator() { .... std::vector<std::unique_ptr<IRMutationStrategy>> Strategies; Strategies.emplace_back( new InjectorIRStrategy(InjectorIRStrategy::getDefaultOps())); .... }
PVS-Studio:
V1023 [CWE-460] A pointer without owner is added to the 'Strategies' container by the 'emplace_back' method. A memory leak will occur in case of an exception. llvm-isel-fuzzer.cpp 58
std::vector<std::unique_ptr<X>> xxx.push_back(new X) ,
X* std::unique_ptr<X> .
xxx.emplace_back(new X) , :
emplace_back .
It's not safe. If the vector is full, the memory is allotted. The memory reassignment operation may fail, with the result that the exception std :: bad_alloc will be generated . In this case, the pointer will be lost, and the created object will never be deleted.The safe solution is to create unique_ptr , which will own the pointer before the vector tries to re-allocate memory: xxx.push_back(std::unique_ptr<X>(new X))
Starting with C ++ 14, you can use 'std :: make_unique': xxx.push_back(std::make_unique<X>())
LLVM. , . ,
, , , .
, LLVM, PVS-Studio .
:
- V1023 [CWE-460] A pointer without owner is added to the 'Passes' container by the 'emplace_back' method. A memory leak will occur in case of an exception. PassManager.h 546
- V1023 [CWE-460] A pointer without owner is added to the 'AAs' container by the 'emplace_back' method. A memory leak will occur in case of an exception. AliasAnalysis.h 324
- V1023 [CWE-460] A pointer without owner is added to the 'Entries' container by the 'emplace_back' method. A memory leak will occur in case of an exception. DWARFDebugFrame.cpp 519
- V1023 [CWE-460] A pointer without owner is added to the 'AllEdges' container by the 'emplace_back' method. A memory leak will occur in case of an exception. CFGMST.h 268
- V1023 [CWE-460] A pointer without owner is added to the 'VMaps' container by the 'emplace_back' method. A memory leak will occur in case of an exception. SimpleLoopUnswitch.cpp 2012
- V1023 [CWE-460] A pointer without owner is added to the 'Records' container by the 'emplace_back' method. A memory leak will occur in case of an exception. FDRLogBuilder.h 30
- V1023 [CWE-460] A pointer without owner is added to the 'PendingSubmodules' container by the 'emplace_back' method. A memory leak will occur in case of an exception. ModuleMap.cpp 810
- V1023 [CWE-460] A pointer without owner is added to the 'Objects' container by the 'emplace_back' method. A memory leak will occur in case of an exception. DebugMap.cpp 88
- V1023 [CWE-460] A pointer without owner is added to the 'Strategies' container by the 'emplace_back' method. A memory leak will occur in case of an exception. llvm-isel-fuzzer.cpp 60
- V1023 [CWE-460] A pointer without owner is added to the 'Modifiers' container by the 'emplace_back' method. A memory leak will occur in case of an exception. llvm-stress.cpp 685
- V1023 [CWE-460] A pointer without owner is added to the 'Modifiers' container by the 'emplace_back' method. A memory leak will occur in case of an exception. llvm-stress.cpp 686
- V1023 [CWE-460] A pointer without owner is added to the 'Modifiers' container by the 'emplace_back' method. A memory leak will occur in case of an exception. llvm-stress.cpp 688
- V1023 [CWE-460] A pointer without owner is added to the 'Modifiers' container by the 'emplace_back' method. A memory leak will occur in case of an exception. llvm-stress.cpp 689
- V1023 [CWE-460] A pointer without owner is added to the 'Modifiers' container by the 'emplace_back' method. A memory leak will occur in case of an exception. llvm-stress.cpp 690
- V1023 [CWE-460] A pointer without owner is added to the 'Modifiers' container by the 'emplace_back' method. A memory leak will occur in case of an exception. llvm-stress.cpp 691
- V1023 [CWE-460] A pointer without owner is added to the 'Modifiers' container by the 'emplace_back' method. A memory leak will occur in case of an exception. llvm-stress.cpp 692
- V1023 [CWE-460] A pointer without owner is added to the 'Modifiers' container by the 'emplace_back' method. A memory leak will occur in case of an exception. llvm-stress.cpp 693
- V1023 [CWE-460] A pointer without owner is added to the 'Modifiers' container by the 'emplace_back' method. A memory leak will occur in case of an exception. llvm-stress.cpp 694
- V1023 [CWE-460] A pointer without owner is added to the 'Operands' container by the 'emplace_back' method. A memory leak will occur in case of an exception. GlobalISelEmitter.cpp 1911
- V1023 [CWE-460] A pointer without owner is added to the 'Stash' container by the 'emplace_back' method. A memory leak will occur in case of an exception. GlobalISelEmitter.cpp 2100
- V1023 [CWE-460] A pointer without owner is added to the 'Matchers' container by the 'emplace_back' method. A memory leak will occur in case of an exception. GlobalISelEmitter.cpp 2702
Conclusion
In total, I wrote out 60 warnings, after which I stopped. Are there any other defects that PVS-Studio analyzer detects in LLVM? Yes there is. However, when I wrote out the code snippets for the article, it was late evening, or rather, even night, and I decided it was time to round out.I hope you were interested, and you want to try the PVS-Studio analyzer.You can download the analyzer and get a trial key on this page .Most importantly, use static analysis regularly. One-time checks performed by us to popularize the methodology of static analysis and PVS-Studio are not a normal scenario.Good luck in improving the quality and reliability of the code!
If you want to share this article with an English-speaking audience, then please use the link to the translation: Andrey Karpov. Finding Bugs in LLVM 8 with PVS-Studio .