📜 ⬆️ ⬇️

Find bugs in LLVM 8 using the PVS-Studio analyzer

PVS-Studio and LLVM 8.0.0

More than two years have passed since the last verification of the LLVM project code with the help of our PVS-Studio analyzer. Let's make sure that the PVS-Studio analyzer is still the leading tool for identifying errors and potential vulnerabilities. To do this, check and find new bugs in the release of LLVM 8.0.0.

Article to be written


To be honest, I did not want to write this article. It is not interesting to write about the project, which we have repeatedly checked ( 1 , 2 , 3 ). It is better to write about something new, but I have no choice.

Every time a new version of LLVM comes out or the Clang Static Analyzer is updated, we have the following type of questions in the mail:

See, the new version of Clang Static Analyzer has learned to find new errors! It seems to me that the relevance of using PVS-Studio is decreasing. Clang finds more errors than before and catches up with the capabilities of PVS-Studio. What do you think about it?
')
To this I always want to answer something in the spirit:

We also do not sit idle! We have significantly improved the capabilities of the PVS-Studio analyzer. So do not worry, we continue to lead, as before.

Unfortunately, this is a bad answer. There is no proof. And that is why I am writing this article now. So, the LLVM project is once again checked and various errors are found in it. Those that seemed interesting to me, I will now demonstrate. These errors cannot be found by Clang Static Analyzer (or it is extremely inconvenient to do with it). And we can. And I found and wrote out all these errors in one evening.

But writing the article was delayed for several weeks. I could not force myself to put all this in the form of text :).

By the way, if you are wondering what technologies are used in the PVS-Studio analyzer to detect errors and potential vulnerabilities, then I suggest reading this note .

New and old diagnostics


As already noted, about two years ago, the LLVM project was once again checked, and the errors found were corrected. Now this article will present a new portion of errors. Why were new bugs found? There are 3 reasons for this:

  1. The LLVM project develops, the old code changes, and a new one appears. Naturally in the modified and written code there are new errors. This clearly demonstrates that static analysis should be applied regularly, and not occasionally. Our articles show well the capabilities of the PVS-Studio analyzer, but this has nothing to do with improving the quality of the code and reducing the cost of fixing errors. Use static code analyzer regularly!
  2. We are refining and improving existing diagnostics. Therefore, the analyzer can detect errors that were not noticed during previous checks.
  3. New diagnostics have appeared in PVS-Studio, which were absent 2 years ago. I decided to allocate them in a separate section in order to demonstrate the development of PVS-Studio.

Defects identified by diagnostics that existed 2 years ago


Fragment N1: Copy-Paste

static bool ShouldUpgradeX86Intrinsic(Function *F, StringRef Name) { if (Name == "addcarryx.u32" || // Added in 8.0 .... Name == "avx512.mask.cvtps2pd.128" || // Added in 7.0 Name == "avx512.mask.cvtps2pd.256" || // Added in 7.0 Name == "avx512.cvtusi2sd" || // Added in 7.0 Name.startswith("avx512.mask.permvar.") || // Added in 7.0 // <= Name.startswith("avx512.mask.permvar.") || // Added in 7.0 // <= Name == "sse2.pmulu.dq" || // Added in 7.0 Name == "sse41.pmuldq" || // Added in 7.0 Name == "avx2.pmulu.dq" || // Added in 7.0 .... } 

PVS-Studio warning : V501 [CWE-570] There are identical sub-expressions' Name.startswith ("avx512.mask.permvar.") operator. AutoUpgrade.cpp 73

It is checked twice that the name begins with the substring "avx512.mask.permvar.". In the second check, we obviously wanted to write something else, but forgot to correct the copied text.

Fragment N2: Typo

 enum CXNameRefFlags { CXNameRange_WantQualifier = 0x1, CXNameRange_WantTemplateArgs = 0x2, CXNameRange_WantSinglePiece = 0x4 }; void AnnotateTokensWorker::HandlePostPonedChildCursor( CXCursor Cursor, unsigned StartTokenIndex) { const auto flags = CXNameRange_WantQualifier | CXNameRange_WantQualifier; .... } 

PVS-Studio warning: V501 There are identical sub-expressions 'CXNameRange_WantQualifier' to the left and the right of the '|' operator. CIndex.cpp 7245

Because of a typo, the same named constant CXNameRange_WantQualifier is used twice .

Fragment N3: Confusion with the priorities of operators

 int PPCTTIImpl::getVectorInstrCost(unsigned Opcode, Type *Val, unsigned Index) { .... if (ISD == ISD::EXTRACT_VECTOR_ELT && Index == ST->isLittleEndian() ? 1 : 0) return 0; .... } 

PVS-Studio warning : V502 [CWE-783] Perhaps the '?:' Operator works in a different way. The '?:' Operator has a lower limit than the '==' operator. PPCTargetTransformInfo.cpp 404

In my opinion, this is a very beautiful mistake. Yes, I know that I have strange ideas about beauty :).

Now, according to the priorities of the operators , the expression is calculated as follows:

 (ISD == ISD::EXTRACT_VECTOR_ELT && (Index == ST->isLittleEndian())) ? 1 : 0 

From a practical point of view, this condition does not make sense, since it can be reduced to:

 (ISD == ISD::EXTRACT_VECTOR_ELT && Index == ST->isLittleEndian()) 

This is an obvious mistake. Most likely, 0/1 would be compared with the Index variable. To fix the code you need to add brackets around the ternary operator:

 if (ISD == ISD::EXTRACT_VECTOR_ELT && Index == (ST->isLittleEndian() ? 1 : 0)) 

By the way, the ternary operator is very dangerous and provokes logical errors. Be very careful with him and do not be greedy to put parentheses. I covered this topic in more detail here , in the chapter “Fear the Operator?: And enclose it in parentheses”.

Fragment N4, N5: Null Index

 Init *TGParser::ParseValue(Record *CurRec, RecTy *ItemType, IDParseMode Mode) { .... TypedInit *LHS = dyn_cast<TypedInit>(Result); .... LHS = dyn_cast<TypedInit>( UnOpInit::get(UnOpInit::CAST, LHS, StringRecTy::get()) ->Fold(CurRec)); if (!LHS) { Error(PasteLoc, Twine("can't cast '") + LHS->getAsString() + "' to string"); return nullptr; } .... } 

PVS-Studio warning : V522 [CWE-476] Dereferencing of the null pointer 'LHS' might take place. TGParser.cpp 2152

If the LHS appears to be zero, a warning should be issued. However, instead of this, this null pointer itself will be dereferenced: LHS-> getAsString () .

This is a very typical situation when an error is hidden in the error handler, as no one tests them. Static analyzers check all reachable code, no matter how often it is used. This is a very good example of how static analysis complements other testing and error protection techniques.

A similar RHS pointer processing error was made in the code just below: V522 [CWE-476] Dereferencing of the null pointer 'RHS' might take place. TGParser.cpp 2186

Fragment N6: Using the pointer after moving

 static Expected<bool> ExtractBlocks(....) { .... std::unique_ptr<Module> ProgClone = CloneModule(BD.getProgram(), VMap); .... BD.setNewProgram(std::move(ProgClone)); // <= MiscompiledFunctions.clear(); for (unsigned i = 0, e = MisCompFunctions.size(); i != e; ++i) { Function *NewF = ProgClone->getFunction(MisCompFunctions[i].first); // <= assert(NewF && "Function not found??"); MiscompiledFunctions.push_back(NewF); } .... } 

PVS-Studio warning: V522 [CWE-476] Dereferencing of the null pointer 'ProgClone' might take place. Miscompilation.cpp 601

At the beginning, the smart pointer ProgClone ceases to own the object:

 BD.setNewProgram(std::move(ProgClone)); 

In fact, now ProgClone is a null pointer. Therefore, a null pointer dereference should occur just below:

 Function *NewF = ProgClone->getFunction(MisCompFunctions[i].first); 

But, in fact, this will not happen! Note that the loop is not actually executed.

At the beginning, the MiscompiledFunctions container is cleared:

 MiscompiledFunctions.clear(); 

Further, the size of this container is used in the loop condition:

 for (unsigned i = 0, e = MisCompFunctions.size(); i != e; ++i) { 

It is easy to see that the cycle does not start. I think this is also a mistake, and the code should be written in a different way.

It seems that we met the very famous parity of errors! One mistake masks another :).

Fragment N7: Using the pointer after moving

 static Expected<bool> TestOptimizer(BugDriver &BD, std::unique_ptr<Module> Test, std::unique_ptr<Module> Safe) { outs() << " Optimizing functions being tested: "; std::unique_ptr<Module> Optimized = BD.runPassesOn(Test.get(), BD.getPassesToRun()); if (!Optimized) { errs() << " Error running this sequence of passes" << " on the input program!\n"; BD.setNewProgram(std::move(Test)); // <= BD.EmitProgressBitcode(*Test, "pass-error", false); // <= if (Error E = BD.debugOptimizerCrash()) return std::move(E); return false; } .... } 

PVS-Studio warning: V522 [CWE-476] Dereferencing of the null pointer 'Test' might take place. Miscompilation.cpp 709

Again the same situation. At the beginning, the contents of the object are moved, and then it is used as if nothing had happened. I increasingly encounter this situation in the program code, after the semantics of displacement appeared in C ++. That's why I love the C ++ language! There are more and more new ways to shoot yourself a leg. PVS-Studio analyzer will always have a job :).

Fragment N8: Null Index

 void FunctionDumper::dump(const PDBSymbolTypeFunctionArg &Symbol) { uint32_t TypeId = Symbol.getTypeId(); auto Type = Symbol.getSession().getSymbolById(TypeId); if (Type) Printer << "<unknown-type>"; else Type->dump(*this); } 

PVS-Studio warning: V522 [CWE-476] Dereferencing of the null pointer 'Type' might take place. PrettyFunctionDumper.cpp 233

In addition to error handlers, debugging printout functions are usually not tested. Before us is just such a case. The function waits for the user, who, instead of solving his problems, will be forced to fix it.

Right:

 if (Type) Type->dump(*this); else Printer << "<unknown-type>"; 

Fragment N9: Null Index

 void SearchableTableEmitter::collectTableEntries( GenericTable &Table, const std::vector<Record *> &Items) { .... RecTy *Ty = resolveTypes(Field.RecType, TI->getType()); if (!Ty) // <= PrintFatalError(Twine("Field '") + Field.Name + "' of table '" + Table.Name + "' has incompatible type: " + Ty->getAsString() + " vs. " + // <= TI->getType()->getAsString()); .... } 

PVS-Studio warning: V522 [CWE-476] Dereferencing of the null pointer 'Ty' might take place. SearchableTableEmitter.cpp 614

I think everything is clear and requires no explanation.

Fragment N10: Typo

 bool FormatTokenLexer::tryMergeCSharpNullConditionals() { .... auto &Identifier = *(Tokens.end() - 2); auto &Question = *(Tokens.end() - 1); .... Identifier->ColumnWidth += Question->ColumnWidth; Identifier->Type = Identifier->Type; // <= Tokens.erase(Tokens.end() - 1); return true; } 

PVS-Studio warning : V570 The 'Identifier-> Type' variable is assigned to itself. FormatTokenLexer.cpp 249

It makes no sense to assign a variable to itself. Most likely they wanted to write:

 Identifier->Type = Question->Type; 

Fragment N11: Suspicious break
 void SystemZOperand::print(raw_ostream &OS) const { switch (Kind) { break; case KindToken: OS << "Token:" << getToken(); break; case KindReg: OS << "Reg:" << SystemZInstPrinter::getRegisterName(getReg()); break; .... } 

PVS-Studio warning : V622 [CWE-478] Consider inspecting the 'switch' statement. It's possible that the operator is missing. SystemZAsmParser.cpp 652

At the beginning there is a very suspicious break statement. Have you forgotten to write something else?

Fragment N12: Check pointer after dereference

 InlineCost AMDGPUInliner::getInlineCost(CallSite CS) { Function *Callee = CS.getCalledFunction(); Function *Caller = CS.getCaller(); TargetTransformInfo &TTI = TTIWP->getTTI(*Callee); if (!Callee || Callee->isDeclaration()) return llvm::InlineCost::getNever("undefined callee"); .... } 

PVS-Studio warning : V595 [CWE-476] Check lines: 172, 174. AMDGPUInline.cpp 172

The Callee pointer is first dereferenced at the time of the getTTI function call .

And then it turns out that this pointer should be checked for equality nullptr :

 if (!Callee || Callee->isDeclaration()) 

But it's' too late…

Fragment N13 - N ...: Check pointer after dereference

The situation described in the previous code snippet is not unique. She is found here:

 static Value *optimizeDoubleFP(CallInst *CI, IRBuilder<> &B, bool isBinary, bool isPrecise = false) { .... Function *CalleeFn = CI->getCalledFunction(); StringRef CalleeNm = CalleeFn->getName(); // <= AttributeList CalleeAt = CalleeFn->getAttributes(); if (CalleeFn && !CalleeFn->isIntrinsic()) { // <= .... } 

PVS-Studio warning: V595 [CWE-476] Check lines: 1079, 1081. SimplifyLibCalls.cpp 1079

And here:

 void Sema::InstantiateAttrs(const MultiLevelTemplateArgumentList &TemplateArgs, const Decl *Tmpl, Decl *New, LateInstantiatedAttrVec *LateAttrs, LocalInstantiationScope *OuterMostScope) { .... NamedDecl *ND = dyn_cast<NamedDecl>(New); CXXRecordDecl *ThisContext = dyn_cast_or_null<CXXRecordDecl>(ND->getDeclContext()); // <= CXXThisScopeRAII ThisScope(*this, ThisContext, Qualifiers(), ND && ND->isCXXInstanceMember()); // <= .... } 

PVS-Studio warning: V595 [CWE-476] Check lines: 532, 534. SemaTemplateInstantiateDecl.cpp 532

And here:


And then I was not interested in studying the warnings with the number V595. So I do not know if there are any more similar errors besides those listed here. Most likely there.

Fragment N17, N18: Suspicious Shift

 static inline bool processLogicalImmediate(uint64_t Imm, unsigned RegSize, uint64_t &Encoding) { .... unsigned Size = RegSize; .... uint64_t NImms = ~(Size-1) << 1; .... } 

PVS-Studio warning : V629 [CWE-190] Consider inspecting the '~ (Size - 1) << 1' expression. Bit shifting of the 32-bit type. AArch64AddressingModes.h 260

Perhaps this is not an error, and the code works exactly as intended. But this is clearly a very suspicious place, and it needs to be checked.

Suppose the Size variable is 16, and then the author of the code planned to get the value in the NImms variable:

11111111111111111111111111111111111111111111111111111111100000

However, in fact, you get the value:

00000000000000000000000000000011111111111111111111111111100000

The point is that all calculations are performed using the 32-bit unsigned type. And only then, this 32-bit unsigned type will be implicitly extended to uint64_t . In this case, the high bits will be zero.

You can fix the situation like this:

 uint64_t NImms = ~static_cast<uint64_t>(Size-1) << 1; 

Similar situation: V629 [CWE-190] Consider inspecting the 'Immr << 6' expression. Bit shifting of the 32-bit type. AArch64AddressingModes.h 269

Fragment N19: Missing the else keyword ?

 void AMDGPUAsmParser::cvtDPP(MCInst &Inst, const OperandVector &Operands) { .... if (Op.isReg() && Op.Reg.RegNo == AMDGPU::VCC) { // VOP2b (v_add_u32, v_sub_u32 ...) dpp use "vcc" token. // Skip it. continue; } if (isRegOrImmWithInputMods(Desc, Inst.getNumOperands())) { // <= Op.addRegWithFPInputModsOperands(Inst, 2); } else if (Op.isDPPCtrl()) { Op.addImmOperands(Inst, 1); } else if (Op.isImm()) { // Handle optional arguments OptionalIdx[Op.getImmTy()] = I; } else { llvm_unreachable("Invalid operand type"); } .... } 

PVS-Studio warning : V646 [CWE-670] Consider inspecting the application's logic. It's possible that 'else' keyword is missing. AMDGPUAsmParser.cpp 5655

There is no error here. Since the then-block of the first if ends in continue , it does not matter if the keyword is else or not. In any case, the code will work the same way. However, missing else makes the code more obscure and dangerous. If continue continue disappears, the code will start to work in a completely different way. In my opinion, it is better to add else .

Fragment N20: Four of the same type of typo

 LLVM_DUMP_METHOD void Symbol::dump(raw_ostream &OS) const { std::string Result; if (isUndefined()) Result += "(undef) "; if (isWeakDefined()) Result += "(weak-def) "; if (isWeakReferenced()) Result += "(weak-ref) "; if (isThreadLocalValue()) Result += "(tlv) "; switch (Kind) { case SymbolKind::GlobalSymbol: Result + Name.str(); // <= break; case SymbolKind::ObjectiveCClass: Result + "(ObjC Class) " + Name.str(); // <= break; case SymbolKind::ObjectiveCClassEHType: Result + "(ObjC Class EH) " + Name.str(); // <= break; case SymbolKind::ObjectiveCInstanceVariable: Result + "(ObjC IVar) " + Name.str(); // <= break; } OS << Result; } 

PVS-Studio warnings:


Randomly, instead of the + operator =, the + operator is used. As a result, constructions are devoid of meaning.

Fragment N21: Uncertain behavior

 static void getReqFeatures(std::map<StringRef, int> &FeaturesMap, const std::vector<Record *> &ReqFeatures) { for (auto &R : ReqFeatures) { StringRef AsmCondString = R->getValueAsString("AssemblerCondString"); SmallVector<StringRef, 4> Ops; SplitString(AsmCondString, Ops, ","); assert(!Ops.empty() && "AssemblerCondString cannot be empty"); for (auto &Op : Ops) { assert(!Op.empty() && "Empty operator"); if (FeaturesMap.find(Op) == FeaturesMap.end()) FeaturesMap[Op] = FeaturesMap.size(); } } } 

Try to find a dangerous code yourself. And this is a picture to divert attention, so as not to immediately look at the answer:

Hmmm ...


PVS-Studio warning : V708 [CWE-758] Dangerous construction is used: 'FeaturesMap [Op] = FeaturesMap.size ()', where 'FeaturesMap' is of 'map' class. This may lead to undefined behavior. RISCVCompressInstEmitter.cpp 490

Problem line:

 FeaturesMap[Op] = FeaturesMap.size(); 

If the element Op is not found, then a new element is created in the map and the number of elements in this map is recorded there. It’s just not known if the size function will be called before or after adding a new element.

Fragment N22-N24: Reassignment

 Error MachOObjectFile::checkSymbolTable() const { .... } else { MachO::nlist STE = getSymbolTableEntry(SymDRI); NType = STE.n_type; // <= NType = STE.n_type; // <= NSect = STE.n_sect; NDesc = STE.n_desc; NStrx = STE.n_strx; NValue = STE.n_value; } .... } 

PVS-Studio warning : V519 [CWE-563] The 'NType' variable is assigned values ​​twice successively. Perhaps this is a mistake. Check lines: 1663, 1664. MachOObjectFile.cpp 1664

I think there is no real error here. Just an extra recurring assignment. But still a blooper.

Similarly:


Fragment N25-N27: More reassignments

Now consider a slightly different reassignment option.

 bool Vectorizer::vectorizeLoadChain( ArrayRef<Instruction *> Chain, SmallPtrSet<Instruction *, 16> *InstructionsProcessed) { .... unsigned Alignment = getAlignment(L0); .... unsigned NewAlign = getOrEnforceKnownAlignment(L0->getPointerOperand(), StackAdjustedAlignment, DL, L0, nullptr, &DT); if (NewAlign != 0) Alignment = NewAlign; Alignment = NewAlign; .... } 

PVS-Studio warning: V519 [CWE-563] The 'Alignment' variable is assigned values ​​twice successively. Perhaps this is a mistake. Check lines: 1158, 1160. LoadStoreVectorizer.cpp 1160

This is a very strange code, which apparently contains a logical error. In the beginning, the variable Alignment is assigned a value depending on the condition. And then assignment occurs again, but now without any verification.

Similar situations can be seen here:


Fragment N28: Always true condition

 static int readPrefixes(struct InternalInstruction* insn) { .... uint8_t byte = 0; uint8_t nextByte; .... if (byte == 0xf3 && (nextByte == 0x88 || nextByte == 0x89 || nextByte == 0xc6 || nextByte == 0xc7)) { insn->xAcquireRelease = true; if (nextByte != 0x90) // PAUSE instruction support // <= break; } .... } 

PVS-Studio warning : V547 [CWE-571] Expression 'nextByte! = 0x90' is always true. X86DisassemblerDecoder.cpp 379

Check does not make sense. The variable nextByte is always not equal to the value 0x90 , which follows from the previous check. This is some kind of logical error.

Fragment N29 - N ...: Always true / false conditions

The analyzer generates many warnings that the entire condition ( V547 ) or part of it ( V560 ) is always true or false. Often, these are not real errors, but simply inaccurate code, the result of macro deployment, and the like. However, it makes sense to look at all these warnings, as from time to time there are true logical errors. For example, this section of code is suspicious:

 static DecodeStatus DecodeGPRPairRegisterClass(MCInst &Inst, unsigned RegNo, uint64_t Address, const void *Decoder) { DecodeStatus S = MCDisassembler::Success; if (RegNo > 13) return MCDisassembler::Fail; if ((RegNo & 1) || RegNo == 0xe) S = MCDisassembler::SoftFail; .... } 

PVS-Studio warning : V560 [CWE-570] A part of the conditional expression is always false: RegNo == 0xe. ARMDisassembler.cpp 939

The constant 0xE is a value of 14 in the decimal system. Checking RegNo == 0xe does not make sense, since if RegNo is> 13 , then the function will complete its execution.

There were many other warnings with the identifier V547 and V560, but, as in the case of V595 , I was not interested in studying these warnings. It was clear that I had enough material to write an article :). Therefore, it is not known how many errors of this type can be identified in LLVM with the help of PVS-Studio.

I will give an example why it is boring to study these triggers. The analyzer is absolutely right when issuing a warning to the following code. But this is not a mistake.

 bool UnwrappedLineParser::parseBracedList(bool ContinueOnSemicolons, tok::TokenKind ClosingBraceKind) { bool HasError = false; .... HasError = true; if (!ContinueOnSemicolons) return !HasError; .... } 

PVS-Studio warning: V547 [CWE-570] Expression '! HasError' is always false. UnwrappedLineParser.cpp 1635

Fragment N30: ​​Suspicious return

 static bool isImplicitlyDef(MachineRegisterInfo &MRI, unsigned Reg) { for (MachineRegisterInfo::def_instr_iterator It = MRI.def_instr_begin(Reg), E = MRI.def_instr_end(); It != E; ++It) { return (*It).isImplicitDef(); } .... } 

PVS-Studio warning : V612 [CWE-670] An unconditional 'return' within a loop. R600OptimizeVectorRegisters.cpp 63

This is either an error or a specific technique that is intended to explain something to programmers who read the code. Such a construction does not explain anything to me and looks very suspicious. It’s better not to write like that :).

Are you tired? Then it's time to make tea or coffee.

Coffee


Defects revealed by new diagnostics


I think 30 responses of old diagnostics are enough. Let's now see what is interesting can be found by new diagnostics, which appeared in the analyzer after the previous test. In total, 66 general-purpose diagnostics were added to the C ++ analyzer.

Fragment N31: Unreachable Code

 Error CtorDtorRunner::run() { .... if (auto CtorDtorMap = ES.lookup(JITDylibSearchList({{&JD, true}}), std::move(Names), NoDependenciesToRegister, true)) { .... return Error::success(); } else return CtorDtorMap.takeError(); CtorDtorsByPriority.clear(); return Error::success(); } 

PVS-Studio warning : V779 [CWE-561] Unreachable code detected. It is possible that an error is present. ExecutionUtils.cpp 146

As you can see, both branches of the if statement end with a call to the return statement . Accordingly, the CtorDtorsByPriority container will never be emptied .

Fragment N32: Unreachable Code

 bool LLParser::ParseSummaryEntry() { .... switch (Lex.getKind()) { case lltok::kw_gv: return ParseGVEntry(SummaryID); case lltok::kw_module: return ParseModuleEntry(SummaryID); case lltok::kw_typeid: return ParseTypeIdEntry(SummaryID); // <= break; // <= default: return Error(Lex.getLoc(), "unexpected summary kind"); } Lex.setIgnoreColonInIdentifiers(false); // <= return false; } 

PVS-Studio warning: V779 [CWE-561] Unreachable code detected. It is possible that an error is present. LLParser.cpp 835

An interesting situation. Let's look at this place at the beginning:

 return ParseTypeIdEntry(SummaryID); break; 

At first glance it seems that there is no error here. It seems that the break statement here is superfluous, and you can simply delete it. However, not all so simple.

The analyzer issues a warning on the lines:

 Lex.setIgnoreColonInIdentifiers(false); return false; 

And indeed, this code is unreachable. All cases in switch end with a call to the return statement . And now the senseless lonely break does not look so harmless! Perhaps one of the branches should end in break , not return ?

Fragment N33: Random resetting of high bits

 unsigned getStubAlignment() override { if (Arch == Triple::systemz) return 8; else return 1; } Expected<unsigned> RuntimeDyldImpl::emitSection(const ObjectFile &Obj, const SectionRef &Section, bool IsCode) { .... uint64_t DataSize = Section.getSize(); .... if (StubBufSize > 0) DataSize &= ~(getStubAlignment() - 1); .... } 

PVS-Studio warning : V784 The size of the bit mask is less than the size of the first operand. This will cause the loss of higher bits. RuntimeDyld.cpp 815

Notice that the getStubAlignment function returns unsigned type. We calculate the value of the expression, if we assume that the function returns the value 8:

~ (getStubAlignment () - 1)

~ (8u-1)

0xFFFFFFF8u

Now notice that the DataSize variable is of a 64-bit unsigned type. It turns out that when the DataSize & 0xFFFFFFF8u operation is performed, all thirty two high bits will be reset. Most likely, this is not what the programmer wanted. I suspect that he wanted to calculate: DataSize & 0xFFFFFFFFFFFFFFF8u.

To correct the error, you should write this:

 DataSize &= ~(static_cast<uint64_t>(getStubAlignment()) - 1); 

Or so:

 DataSize &= ~(getStubAlignment() - 1ULL); 

Fragment N34: Failure of an explicit type conversion

 template <typename T> void scaleShuffleMask(int Scale, ArrayRef<T> Mask, SmallVectorImpl<T> &ScaledMask) { assert(0 < Scale && "Unexpected scaling factor"); int NumElts = Mask.size(); ScaledMask.assign(static_cast<size_t>(NumElts * Scale), -1); .... } 

PVS-Studio warning : V1028 [CWE-190] Possible overflow. Consider casting operands for the 'NumElts * Scale' operator for the 'size_t' type, not the result. X86ISelLowering.h 1577

Explicit type casting is used to prevent overflow when multiplying variables of type int . However, here the explicit type conversion does not protect against overflow. At the beginning, the variables will be multiplied, and only then the 32-bit result of the multiplication will be expanded to the size_t type.

Fragment N35: Unsuccessful Copy-Paste

 Instruction *InstCombiner::visitFCmpInst(FCmpInst &I) { .... if (!match(Op0, m_PosZeroFP()) && isKnownNeverNaN(Op0, &TLI)) { I.setOperand(0, ConstantFP::getNullValue(Op0->getType())); return &I; } if (!match(Op1, m_PosZeroFP()) && isKnownNeverNaN(Op1, &TLI)) { I.setOperand(1, ConstantFP::getNullValue(Op0->getType())); // <= return &I; } .... } 

V778 [CWE-682] Two similar code fragments were found. Perhaps this is a typo and 'Op1' variable should be used instead of 'Op0'. InstCombineCompares.cpp 5507

This new interesting diagnostics reveals situations when a code fragment was copied, and in it they began to change some names, but did not correct them in one place.

Please note that in the second block Op0 was changed to Op1 . But in one place did not correct. Most likely, it should have been written like this:

 if (!match(Op1, m_PosZeroFP()) && isKnownNeverNaN(Op1, &TLI)) { I.setOperand(1, ConstantFP::getNullValue(Op1->getType())); return &I; } 

Fragment N36: Confusion in the variables

 struct Status { unsigned Mask; unsigned Mode; Status() : Mask(0), Mode(0){}; Status(unsigned Mask, unsigned Mode) : Mask(Mask), Mode(Mode) { Mode &= Mask; }; .... }; 

PVS-Studio warning : V1001 [CWE-563] The 'Mode' variable is assigned. SIModeRegister.cpp 48

It is very dangerous to give the arguments of functions the same names as members of a class. It is very easy to get confused. Before us is just such a case. This expression does not make sense:

 Mode &= Mask; 

The function argument is changing. And that's all.This argument is no longer used. Most likely, it was necessary to write this:

 Status(unsigned Mask, unsigned Mode) : Mask(Mask), Mode(Mode) { this->Mode &= Mask; }; 

Fragment N37: Confusion in the variables

 class SectionBase { .... uint64_t Size = 0; .... }; class SymbolTableSection : public SectionBase { .... }; void SymbolTableSection::addSymbol(Twine Name, uint8_t Bind, uint8_t Type, SectionBase *DefinedIn, uint64_t Value, uint8_t Visibility, uint16_t Shndx, uint64_t Size) { .... Sym.Value = Value; Sym.Visibility = Visibility; Sym.Size = Size; Sym.Index = Symbols.size(); Symbols.emplace_back(llvm::make_unique<Symbol>(Sym)); Size += this->EntrySize; } 

PVS-Studio warning: V1001 [CWE-563] The size of the variable is assigned. Object.cpp 424

The situation is similar to the previous one. It should be written:

 this->Size += this->EntrySize; 

Fragment N38-N47: We forgot to check the pointer.

Previously, we considered examples of how the V595 diagnostics work . Its essence is that the pointer at the beginning is dereferenced, and only then it is checked. Young diagnostics V1004 is the opposite of its meaning, but also detects a lot of errors. It reveals situations when the pointer was checked at the beginning and then forgot to do it. Consider such cases found inside LLVM.

 int getGEPCost(Type *PointeeType, const Value *Ptr, ArrayRef<const Value *> Operands) { .... if (Ptr != nullptr) { // <= assert(....); BaseGV = dyn_cast<GlobalValue>(Ptr->stripPointerCasts()); } bool HasBaseReg = (BaseGV == nullptr); auto PtrSizeBits = DL.getPointerTypeSizeInBits(Ptr->getType()); // <= .... } 

PVS-Studio warning: V1004 [CWE-476] The 'Ptr' pointer was used unsafely after it was verified against nullptr. Check lines: 729, 738. TargetTransformInfoImpl.h 738

The Ptr variable can be equal to nullptr , as evidenced by the check:

 if (Ptr != nullptr) 

However, below this pointer is dereferenced without preliminary check:

 auto PtrSizeBits = DL.getPointerTypeSizeInBits(Ptr->getType()); 

Consider another similar case.

 llvm::DISubprogram *CGDebugInfo::getFunctionFwdDeclOrStub(GlobalDecl GD, bool Stub) { .... auto *FD = dyn_cast<FunctionDecl>(GD.getDecl()); SmallVector<QualType, 16> ArgTypes; if (FD) // <= for (const ParmVarDecl *Parm : FD->parameters()) ArgTypes.push_back(Parm->getType()); CallingConv CC = FD->getType()->castAs<FunctionType>()->getCallConv(); // <= .... } 

PVS-Studio warning: V1004 [CWE-476] The “FD” pointer was used unsafely after it was verified against nullptr. Check lines: 3228, 3231. CGDebugInfo.cpp 3231

Notice the FD pointer . I am sure that the problem is clearly visible and no special explanation is required.

And further:

 static void computePolynomialFromPointer(Value &Ptr, Polynomial &Result, Value *&BasePtr, const DataLayout &DL) { PointerType *PtrTy = dyn_cast<PointerType>(Ptr.getType()); if (!PtrTy) { // <= Result = Polynomial(); BasePtr = nullptr; } unsigned PointerBits = DL.getIndexSizeInBits(PtrTy->getPointerAddressSpace()); // <= .... } 

PVS-Studio warning: V1004 [CWE-476] The 'PtrTy' pointer was used unsafely after it was verified against nullptr. Check lines: 960, 965. InterleavedLoadCombinePass.cpp 965

How to protect against such errors? Be attentive to the Code-Review and use the PVS-Studio static analyzer for regular code checking.

There is no sense to cite other code fragments with errors of this type. I leave in the article only a list of warnings:


N48-N60: , ( )

 std::unique_ptr<IRMutator> createISelMutator() { .... std::vector<std::unique_ptr<IRMutationStrategy>> Strategies; Strategies.emplace_back( new InjectorIRStrategy(InjectorIRStrategy::getDefaultOps())); .... } 

PVS-Studio: V1023 [CWE-460] A pointer without owner is added to the 'Strategies' container by the 'emplace_back' method. A memory leak will occur in case of an exception. llvm-isel-fuzzer.cpp 58

std::vector<std::unique_ptr<X>> xxx.push_back(new X) , X* std::unique_ptr<X> .

xxx.emplace_back(new X) , : emplace_back .

It's not safe. If the vector is full, the memory is allotted. The memory reassignment operation may fail, with the result that the exception std :: bad_alloc will be generated . In this case, the pointer will be lost, and the created object will never be deleted.

The safe solution is to create unique_ptr , which will own the pointer before the vector tries to re-allocate memory:

 xxx.push_back(std::unique_ptr<X>(new X)) 

Starting with C ++ 14, you can use 'std :: make_unique':

 xxx.push_back(std::make_unique<X>()) 

LLVM. , . , , , , .

, LLVM, PVS-Studio .

:


Conclusion


In total, I wrote out 60 warnings, after which I stopped. Are there any other defects that PVS-Studio analyzer detects in LLVM? Yes there is. However, when I wrote out the code snippets for the article, it was late evening, or rather, even night, and I decided it was time to round out.

I hope you were interested, and you want to try the PVS-Studio analyzer.

You can download the analyzer and get a trial key on this page .

Most importantly, use static analysis regularly. One-time checks performed by us to popularize the methodology of static analysis and PVS-Studio are not a normal scenario.

Good luck in improving the quality and reliability of the code!



If you want to share this article with an English-speaking audience, then please use the link to the translation: Andrey Karpov. Finding Bugs in LLVM 8 with PVS-Studio .

Source: https://habr.com/ru/post/450008/


All Articles