
Now I will tell and show by examples why physicists should also use static code analysis tools. I would like this tool to be PVS-Studio. But, of course, any other tool will also be useful. Code analyzer will reduce the time for debugging applications and reduce headaches from blunt errors. It is better to think more about physics and less about errors in C ++ programs.
')
Sad background
In fact, this article was “by”. Even the one who searches for others' mistakes makes mistakes. :)
I overlooked. I have instructed to prepare a Geant4 project for testing a new young employee. It was necessary to download the source code, generate a project for Visual Studio and generally do everything necessary so that I could take and check. He did it all. I just took the first available ancient version of Geant4_9_4, which was described in the article about building the project in Windows. Eh! I learned about it when I wrote this article.However, this has a positive side. I wrote a new article, Continuing the Geant4 Review , where we analyze the latest source code (version 10.0-beta). Now you can find out which errors were corrected, which means that they could be found much earlier and easier with the help of static analysis. And you can see the errors that are still hiding.Of the sixteen warnings listed in the article in the new version:So, although this article is not relevant, it shows very well the capabilities of the PVS-Studio code analyzer. It is not so important that the old project is verified. It is important how many errors can be avoided at the stage of writing the code.
Introduction
I continue a series of notes on analyzing the code used for scientific purposes. Previous notes:
This time on turn the project Geant4. I will give a description of the project,
taken from Wikipedia :
Geant4 (eng. GEometry ANd Tracking - geometry and tracking) is a program for simulating the passage of elementary particles through matter using Monte Carlo methods. Developed at CERN in the object-oriented programming language C ++. It is a further development of previous versions of GEANT, substantially reworked and expanded.As stated on the project’s official website, “the areas of application include high energy physics and nuclear reaction research, medicine, particle accelerators, and space physical research.” The software is used in many research projects around the world.Project website:
http://geant4.org . The project has an average size: 76 megabytes of source code. For comparison:
- VirtualDub project - 13 megabytes;
- Apache HTTP Server project - 26 megabytes;
- Chromium project (along with additional libraries) - 710 megabytes.
For the analysis, the
PVS-Studio code analyzer was used. Since the Geant4 project is not small, the chance to find something interesting in it was very large. No errors can be found only in small projects (see
the non-linear error density ). Of course, there are also large projects where PVS-Studio does not find anything. It is a pity, but this is only an exception.
I want to apologize right away if I accidentally wrote some nonsense about physics. But note that I was able to find software errors without understanding the essence of partons, and indeed almost everything related to nuclear reactions!
Note. The article lists not all that was able to notice. A more complete list of warnings that I noticed: geant4_old.txt .Let's see what we managed to find interesting in the Geant4 code.
Copy-Paste and muons
About
muons, I only know that this is an elementary particle. But I understand much better what Copy-Paste is. Here is a beautiful example of an error when a line of code has been propagated. The copied lines have been modified, but not everywhere. Result:
void G4QMessenger::SetNewValue(G4UIcommand* aComm, G4String aS) { if(photoDir) { if (aComm==theSynchR) thePhoto->SetSynchRadOnOff(aS); else if(aComm==minGamSR) thePhoto->SetMinGammaSR(.... else if(aComm==theGamN) thePhoto->SetGammaNuclearOnOff(.... else if(aComm==theMuoN) thePhoto->SetElPosNuclearOnOff(.... else if(aComm==theMuoN) thePhoto->SetMuonNuclearOnOff(aS); else if(aComm==theMuoN) thePhoto->SetTauNuclearOnOff(aS); else if(aComm==biasPhotoN)thePhoto->SetPhotoNucBias(.... } .... }
PVS-Studio warning: V517 The use of if (A) {...} else if (A) {...} 'pattern was detected. There is a possibility of logical error presence. Check lines: 195, 196. G4phys_builders g4qmessenger.cc 195
Note the repeating check three times (aComm == theMuoN).
Note. Bug fixed in new version of Gint4. Or this code is deleted.Decay of baryons
It is difficult to study radioactive decay or try to detect the
proton decay . This is especially difficult to do when programs contain errors.
void G4QEnvironment::DecayBaryon(G4QHadron* qH) { .... else if(qM<mSzPi)
PVS-Studio warning: V517 The use of if (A) {...} else if (A) {...} 'pattern was detected. There is a possibility of logical error presence. Check lines: 8373, 8380. G4hadronic_body_ci g4qenvironment.cc 8373
The same condition is used twice (qM <mSzPi). This means that we will never get into the branch to which the comment "// Both Lambda + PiM & Sigma0 + PiM are possible".
Note. Bug fixed in new version of Gint4. Or this code is deleted.Calculate the number of partons
Parton is a point-like component of hadrons, which is manifested in experiments on deep inelastic scattering of hadrons on leptons and other hadrons.
It is a pity, but there is a problem with counting their number:
G4bool G4CollisionMesonBaryonElastic:: IsInCharge(const G4KineticTrack& trk1, const G4KineticTrack& trk2) const { G4bool result = false; G4ParticleDefinition * p1 = trk1.GetDefinition(); G4ParticleDefinition * p2 = trk2.GetDefinition(); if( (GetNumberOfPartons(p1) != 2 || GetNumberOfPartons(p2) != 3) ||(GetNumberOfPartons(p1) != 3 || GetNumberOfPartons(p2) != 2) ) { result = false; } .... }
PVS-Studio warning: V547 Expression is always true. Probably the '&&' operator should be used here. G4had_im_r_matrix g4collisionmesonbaryonelastic.cc 53
Immediately the essence of the error may be invisible. Therefore, I will simplify the expression:
A = GetNumberOfPartons(p1); B = GetNumberOfPartons(p2); if ( (A != 2 || B != 3) || (A != 3 || B != 2) )
Brackets for simplicity can also be discarded. We get:
if ( A != 2 || B != 3 || A != 3 || B != 2 )
This condition is always true. The variable 'A' is always not equal to 2 or not equal to 3. Similarly to the variable 'B'. Apparently, there was confusion somewhere. Most likely, the '&&' operator should be present here.
Note. Bug fixed in new version of Gint4. Or this code is deleted.Coulomb blockade and going beyond the boundaries of the array
Coulomb blockade - blocking the passage of electrons through a quantum dot connected between two tunneling contacts, due to the repulsion of electrons in the contacts from the electron at the point, as well as an additional Coulomb potential barrier, which creates an electron, sitting at the point.
There is something wrong with the SetCoulombEffects () function. The array of pointers 'sig' contains the addresses of nonexistent subarrays. Working with 'sig' will lead to undefined program behavior. Well, if the program ends in emergency mode. It is much worse if the program continues and randomly reads and writes into memory occupied by other arrays and variables.
enum { NENERGY=22, NANGLE=180 }; class G4LEpp : public G4HadronicInteraction { .... G4float * sig[NANGLE]; static G4float SigCoul[NENERGY][NANGLE]; .... }; G4LEpp::SetCoulombEffects(G4int State) { if (State) { for(G4int i=0; i<NANGLE; i++) { sig[i] = SigCoul[i]; } elab = ElabCoul; } .... }
PVS-Studio warning: V557 Array overrun is possible. The value of 'i' index could reach 179. g4lepp.cc 62
The 'sig' array contains 180 pointers. These pointers should reference the different lines of the two-dimensional 'SigCoul' array. But in the array 'SigCoul' only 22 lines. It turns out that most of the pointers in the 'sig' array will indicate incomprehensibly where.
It is difficult to say exactly what the error is. The SigCoul array may have been incorrectly declared. For example, you need to swap the number of rows and columns:
SigCoul [NENERGY] [NANGLE] - >> SigCoul [NANGLE] [NENERGY]
Note. The error is still present in the new version of Gint4.Typo and Twist Surfaces
void G4VTwistSurface::GetBoundaryLimit(G4int areacode, G4double limit[]) const { .... if (areacode & sC0Min1Max) { limit[0] = fAxisMin[0]; limit[1] = fAxisMin[1]; } else if (areacode & sC0Max1Min) { limit[0] = fAxisMax[0]; limit[1] = fAxisMin[1]; } else if (areacode & sC0Max1Max) { limit[0] = fAxisMax[0]; limit[1] = fAxisMax[1]; } else if (areacode & sC0Min1Max) { limit[0] = fAxisMin[0]; limit[1] = fAxisMax[1]; } .... }
PVS-Studio warning: V517 The use of if (A) {...} else if (A) {...} 'pattern was detected. There is a possibility of logical error presence. Check lines: 793, 802. G4specsolids g4vtwistsurface.cc 793
There are 4 variables:
- sC0Min1Max
- sC0Max1Min
- sC0Min1Min
- sC0Max1Max
When working with borders, only 3 of them are used. The check (areacode & sC0Min1Max) is performed 2 times. If you take a closer look, you can see that after the first check the minimums are selected: fAxisMin [0], fAxisMin [1]. Most likely, the first check should have been like this:
if (areacode & sC0Min1Min) { limit[0] = fAxisMin[0]; limit[1] = fAxisMin[1]; } <b>. Gint4. .</b>
X-rays and irregular IF
X-rays are electromagnetic waves whose photon energy lies on the scale of electromagnetic waves between ultraviolet radiation and gamma radiation.
As I understand it, the G4ForwardXrayTR class is related to X-Ray. Error crept into index comparison.
G4VParticleChange* G4ForwardXrayTR::PostStepDoIt(....) { .... if (iMat == jMat || ( (fMatIndex1 >= 0 && fMatIndex1 >= 0) && ( iMat != fMatIndex1 && iMat != fMatIndex2 ) && ( jMat != fMatIndex1 && jMat != fMatIndex2 ) ) .... }
PVS-Studio warning: V501 operator: fMatIndex1> = 0 && fMatIndex1> = 0 G4xrays g4forwardxraytr.cc 620
An index with the name 'fMatIndex1' is checked twice. And about 'fMatIndex2' forgotten. Probably should be written like this:
(fMatIndex1 >= 0 && fMatIndex2 >= 0)
Note. The error is still present in the new version of Gint4.Typo and neutrons
G4double G4MesonAbsorption:: GetTimeToAbsorption( const G4KineticTrack& trk1, const G4KineticTrack& trk2) { .... if(( trk1.GetDefinition() == G4Neutron::Neutron() || trk1.GetDefinition() == G4Neutron::Neutron() ) && sqrtS>1.91*GeV && pi*distance>maxChargedCrossSection) return time; .... }
PVS-Studio warning: V501 There are identical sub-expressions 'trk1.GetDefinition () == G4Neutron :: Neutron ()' operator. G4had_im_r_matrix g4mesonabsorption.cc 285
Yes, I do not know what this function does. But, as it seems to me, at the entrance it receives two trajectories of the movement of elementary particles. The function must handle the situation in a special way if at least one particle is a neutron. But, in fact, only the first particle is checked.
Apparently, in fact, they planned to write:
trk1.GetDefinition() == G4Neutron::Neutron() || trk2.GetDefinition() == G4Neutron::Neutron()
A similar typo can be found here: g4scatterer.cc 138
Note. Bug fixed in new version of Gint4. Or this code is deleted.Adding parton
There is a InsertParton () function that inserts a parton into a container. You can specify after which parton to insert a new element. If the insertion point is not specified, then you can probably insert it anywhere. But just this case is implemented incorrectly.
typedef std::vector<G4Parton *> G4PartonVector; inline void G4ExcitedString::InsertParton( G4Parton *aParton, const G4Parton * addafter) { G4PartonVector::iterator insert_index; .... if ( addafter != NULL ) { insert_index=std::find(thePartons.begin(), thePartons.end(), addafter); .... } thePartons.insert(insert_index+1, aParton); }
PVS-Studio warning: V614 Potentially uninitialized iterator 'insert_index' used. g4excitedstring.hh 193
If the 'addafter' pointer is zero, then the “insert_index” iterator remains uninitialized. As a result, the insertion of a new element can lead to unpredictable consequences.
Note. The error is still present in the new version of Gint4.Processing not all nucleons
Nucleons are the common name for protons and neutrons.
There is a packNucleons () function that does not handle all items. The fact is that the loop ends immediately after the first iteration. At the end of the loop body there is a 'break' operator, but there is no 'continue' operator anywhere.
void G4QMDGroundStateNucleus::packNucleons() { .... while ( nmTry < maxTrial ) { nmTry++; G4int i = 0; for ( i = 1 ; i < GetMassNumber() ; i++ ) { .... } if ( i == GetMassNumber() ) { areTheseMsOK = true; } break; } .... }
PVS-Studio warning: V612 An unconditional 'break' within a loop. g4qmdgroundstatenucleus.cc 274
It seems to me that the 'break' statement at the end is just superfluous and written by chance.
Note. The error is still present in the new version of Gint4.Lund string model and typo in the index
This is a phenomenological model of hadronization.
When you have to work with individual elements of arrays, it is very easy to make a typo. This is exactly what happened in the constructor of the G4LundStringFragmentation class. In the code below, the error is immediately visible. Twice we assign values to the same cell. However, this is a very large function, where many elements of the array are initialized. And to notice the error, considering the function, is an extremely difficult task. This is the case where
static code analysis is simply indispensable.
G4LundStringFragmentation::G4LundStringFragmentation() { .... BaryonWeight[0][1][2][2]=pspin_barion*0.5; .... BaryonWeight[0][1][2][2]=(1.-pspin_barion); .... }
PVS-Studio warning: V519 The 'BaryonWeight [0] [1] [2] [2]' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 205, 208. g4lundstringfragmentation.cc 208
Note I want to note that in the code I have met many places where a variable is assigned two different values in a row. Many places are quite harmless. For example, at the beginning, variables are assigned 0, and then the real value. However, many of these places may contain an error. Therefore, I recommend to Geant4 authors to carefully consider the diagnostic warnings
V519 . I myself looked at these warnings very superficially.
By the way, the approach is not very clear, when at first the variable is initialized with the default value and then with the desired value. Why do you need it? It is easier to immediately declare a variable where it is needed and initialize it with the necessary number.
Note. The error is still present in the new version of Gint4.Some other warnings of V519
The copy operator in the class G4KineticTrack looks suspicious:
const G4KineticTrack& G4KineticTrack::operator=( const G4KineticTrack& right) { .... the4Momentum = right.the4Momentum; the4Momentum = right.GetTrackingMomentum(); .... }
PVS-Studio warning: V519 The4Momentum variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 451, 452. g4kinetictrack.cc 452
Note. The error is still present in the new version of Gint4.By the way, the designers generally had a lot of warnings V519. Perhaps in this code there is a sense? For example, for debugging purposes. Unclear. So I think we should give a couple more examples of suspicious code:
void G4IonisParamMat::ComputeDensityEffect() { .... fX0density = 0.326*fCdensity-2.5 ; fX1density = 5.0 ; fMdensity = 3. ; while((icase > 0)&&(fCdensity < ClimiG[icase])) icase-- ; fX0density = X0valG[icase]; fX1density = X1valG[icase]; .... }
PVS-Studio warnings: V519 The 'fX0density' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 245, 247. g4ionisparammat.cc 247
V519 The 'fX1density' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 245, 247. g4ionisparammat.cc 247
Note. The error is still present in the new version of Gint4. void G4AdjointPhotoElectricModel::SampleSecondaries(....) { .... pre_step_AdjointCS = totAdjointCS; post_step_AdjointCS = AdjointCrossSection(aCouple, electronEnergy,IsScatProjToProjCase); post_step_AdjointCS = totAdjointCS; .... }
PVS-Studio warning: V519 The 'post_step_AdjointCS' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 76, 77. g4adjointphotoelectricmodel.cc 77
Note. The error is still present in the new version of Gint4.And the last suspicious fragment I noticed. Pay attention to the member 'erecrem'.
void G4Incl::processEventIncl(G4InclInput *input) { .... varntp->mzini = izrem; varntp->exini = esrem; varntp->pxrem = pxrem; varntp->pyrem = pyrem; varntp->pzrem = pzrem; varntp->mcorem = mcorem; varntp->erecrem = pcorem; varntp->erecrem = erecrem; .... }
PVS-Studio warning: V519 The 'varntp-> erecrem' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 896, 897. g4incl.cc 897
Note. Bug fixed in new version of Gint4. Or this code is deleted.Count of array elements from 1
It seems to me that here for a moment someone forgot that the elements of the arrays in C ++ are numbered from scratch. As a result, it may happen beyond the array boundary. Plus there is no comparison with the value 1.4, located at the very beginning of the array.
void G4HEInelastic::MediumEnergyClusterProduction(....) { .... G4double alem[] = {1.40, 2.30, 2.70, 3.00, 3.40, 4.60, 7.00}; .... for (j = 1; j < 8; j++) { if (alekw < alem[j]) { jmax = j; break; } } .... }
PVS-Studio warning: V557 Array overrun is possible. The value of 'j' index could reach 7. g4heinelastic.cc 4682
Note. The error is still present in the new version of Gint4.Physics and undefined behavior
C ++ is a cruel language. Slightly relaxed - and immediately shot himself a proton leg. Moreover, it can be immediately overlooked. Here is an example of an incorrectly implemented addition operator:
template <typename T> GMocrenDataPrimitive<T> & GMocrenDataPrimitive<T>::operator + (const GMocrenDataPrimitive<T> & _right) { GMocrenDataPrimitive<T> rprim; .... return rprim; }
PVS-Studio warning: V558 Function returns the current local object: rprim. G4GMocren g4gmocrenio.cc 131
The function returns a reference to the local object. Attempting to work with this link will result in undefined behavior.
Note. The error is still present in the new version of Gint4.Time to wrap up
Unfortunately, it's time to end the excursion into the world of physics. The fact is that this is still an article, and not a multi-page report. It is a pity that I did not have time to tell about many other mistakes. You can get acquainted with the suspicious code, which I managed to notice when looking at the PVS-Studio diagnostic warnings, in this file -
geant4_old.txt .
Just I ask you not to conclude that these are all the errors that PVS-Studio could find. I looked at the report rather superficially and could miss a lot of things. I ask the project developers to check the code with PVS-Studio on their own. Write to us and we will allocate a key for this for some time.
And of course, as always, I repeat that it is more efficient to use static code analysis regularly, and not occasionally. To realize the importance of regular use, please read
here and
here .
As I said, the file contains a lot more suspicious code fragments than in this article. I think all situations are clear enough. As a last resort, I want to remind you that for each error code in the
documentation there is a detailed description with examples.
The only thing I want to briefly explain the diagnosis of V636 and V624. Sometimes they may indicate inaccuracies in the calculations. It seems to me that these diagnostics are important when it comes to computing programs.
Diagnostic Example V636: G4double G4XAqmTotal::CrossSection( const G4KineticTrack& trk1, const G4KineticTrack& trk2) const { .... G4int sTrk1 = ....; G4int qTrk1 = ....; G4double sRatio1 = 0.; if (qTrk1 != 0) sRatio1 = sTrk1 / qTrk1; .... }
PVS-Studio warning: V636 The 'sTrk1 / qTrk1' expression was implicitly casted from 'int' type to 'double' type. Consider using a fractional part. An example: double A = (double) (X) / Y ;. g4xaqmtotal.cc 103
The result of dividing “double X = 3/2” is 1, not 1.5, as it may seem due to negligence. In the beginning, integer division occurs, and only then the result is converted to the type 'double'. Considering unfamiliar code, it is difficult to say whether you really need to use integer division or is it a mistake. I think such places in Geant4 deserve attention.
Note. If integer division is required, such places should be accompanied by a comment.Diagnostic Example V624: dSigPodT = HadrTot*HadrTot*(1+HadrReIm*HadrReIm)* (....)/16/3.1416*2.568;
PVS-Studio warning: V624 The constant 3.1416 is being utilized. The resulting value could be inaccurate. Consider using the M_PI constant from <math.h>. g4elastichadrnucleushe.cc 750
It is not clear why rigid constants are used in mass terms to denote the number Pi, Pi / 2, and so on. Ready to believe that accuracy is enough. But still, this is not an explanation. Such constants are another reason for typos and inaccuracies. There is a mass of predefined constants, such as M_PI, M_PI_4, M_LN2. PVS-Studio makes appropriate recommendations for replacing hard constants.
I almost forgot. In the
geant4_old.txt file
, I also quoted messages related to micro-optimizations. For example, regarding the increment of iterators:
class G4PhysicsTable : public std::vector<G4PhysicsVector*> { .... }; typedef G4PhysicsTable::iterator G4PhysicsTableIterator; inline void G4PhysicsTable::insertAt (....) { G4PhysicsTableIterator itr=begin(); for (size_t i=0; i<idx; ++i) { itr++; } .... }
PVS-Studio warning: V803 Decreased performance. In the case of it it is an iterator Replace iterator ++ with ++ iterator. g4physicstable.icc 83
Similarly, why this should be changed is described in the article:
Does it make practical sense to use the prefix increment operator ++ it for iterators instead of the postfix it ++ .
Conclusion
Accept that mistakes and typographical errors are made by
everyone . And yes, even you, dear readers. It's unavoidable. Using static code analysis tools, you can fix a lot of errors at the earliest stages. This will reduce the time required to search for defects and will allow more attention to be paid to the problem being solved directly.
Additional links
- Andrey Karpov. Myths about static analysis. The second myth - professional developers do not make stupid mistakes .
- Andrey Karpov. Answers to questions that are often asked after reading our articles .
- News about C ++ language, interesting articles and information about the projects we tested: @Code_Analysis .
- First acquaintance with the analyzer: PVS-Studio for Visual C ++ .