
After hot discussions about the “
Big Calculator ”, I wanted to check out some of the research related projects. The first thing that was found was the open-source project OpenMS associated with protein mass spectrometry. This project turned out to be written with a serious approach. When developing it is used at least Cppcheck. So there was nothing sensational to wait for. However, there was interest, what errors can PVS-Studio find after Cppcheck. Interested invite to continue reading the article.
')
There is an OpenMS project. For what it is, I do not undertake to retell in my own words. More lyapnu stupidity. Just to quote a fragment of the description from the site Wikipedia:
This is an open-source 2-clause BSD license. OpenMS has been used for data processing, feature (s), visualization in 1D (spectra or chromatogram), 2D and 3D, map mapping and peptide identification. It supports label-free and isotopic-label based quantification (such as iTRAQ and TMT and SILAC). Furthermore, it also supports metabolomics workflows and DIA / SWATH targeted analysis.Original Source: Wikipedia. OpenMSThe project is medium in size but very complex. The size of the source code is more than 20 megabytes. Plus a large number of third-party libraries (Boost, Qt, Zlib, and so on). The project is very actively used templates. The source code of the project can be
downloaded from the site SourceForge .
When developing an OpenMS project, static code analysis is used. The presence of "cppcheck.cmake" and comments in the spirit of:
if (i != peptide.size())
indicates that at least Cppcheck is used. There is also a mention
Cpplint and there is a file "cpplint.py". Professional approach. Well done.
Now let's see what can be found with the help of the
PVS-Studio analyzer.
Note. For some reason, C ++ files have the extension '* .c'. So do not be embarrassed when you see a sample code in C ++ located in the '* .c' file.1. Shortcomings with OpenMP
I very rarely come across projects that use
OpenMP technology. In general, I sometimes attend thoughts, remove these diagnostics from the analyzer. Therefore, I was genuinely surprised when I saw the warnings related to OpenMP. Over the past year, having checked dozens of projects, I have never seen a single warning on that topic. Wow, someone still uses this technology.
Among the warnings issued are false positives, but there were warnings in the case.
DoubleReal ILPDCWrapper::compute(....) const { .... DoubleReal score = 0; .... #pragma omp parallel for schedule(dynamic, 1) for (SignedSize i = 0; i < (SignedSize)bins.size(); ++i) { score += computeSlice_(fm, pairs, bins[i].first, bins[i].second, verbose_level); } return score; }
PVS-Studio warning: V1205 Data race risk. Unprotected concurrent operation with the 'score' variable. ilpdcwrapper.c 213
Incorrectly counted amount. The variable 'score' is not protected from simultaneous use in different streams.
Other comments are not so critical, but still they should pay attention. Inside parallel sections, all exceptions must be intercepted. Exceeding the exception outside the parallel section is most likely to lead to the emergency termination of the program. You can read more about this topic in the notes: "
OpenMP and exceptions ", "
Exception handling inside parallel sections ".
An exception can be generated explicitly using the throw operator. It can also occur when calling the operator new (std :: bad_alloc).
First option. The getTheoreticalmaxPosition () function may throw an exception.
Size getTheoreticalmaxPosition() const { if (!this->size()) { throw Exception::Precondition(__FILE__, __LINE__, __PRETTY_FUNCTION__, "There must be at least one trace to ......"); } .... } virtual void run() { .... #pragma omp parallel for for (SignedSize i = 0; i < (SignedSize)seeds.size(); ++i) { .... f.setMZ( traces[traces.getTheoreticalmaxPosition()].getAvgMZ()); .... } .... }
PVS-Studio warning: V1301 The 'throw' keyword cant be try..catch block in a parallel section. featurefinderalgorithmpickedhelperstructs.h 199
The second option. Calling the 'new' operator may result in an exception.
TraceFitter<PeakType>* chooseTraceFitter_(double& tau) {
PVS-Studio warning: V1302 Try it out. featurefinderalgorithmpicked.h 1926
Other similar warnings:
- V1301 featurefinderalgorithmpicked.h 1261
- V1301 mzmlfile.h 114
- V1301 rawmssignalsimulation.c 598
- V1301 rawmssignalsimulation.c 1152
- V1301 chromatogramextractor.h 103
- V1301 chromatogramextractor.h 118
- V1302 featurefinderalgorithmpicked.h 1931
- V1302 rawmssignalsimulation.c 592
- V1302 rawmssignalsimulation.c 601
- V1302 openswathanalyzer.c 246
2. Typos
std::vector< std::pair<std::string, long> > spectra_offsets; std::vector< std::pair<std::string, long> > chromatograms_offsets; template <typename MapType> void MzMLHandler<MapType>::writeFooter_(std::ostream& os) { .... int indexlists; if (spectra_offsets.empty() && spectra_offsets.empty() ) { indexlists = 0; } else if (!spectra_offsets.empty() && !spectra_offsets.empty() ) { indexlists = 2; } else { indexlists = 1; } .... }
PVS-Studio warnings:
V501 There are identical sub-expressions 'spectra_offsets.empty ()' to the left. mzmlhandler.h 5288
V501 There are identical sub-expressions "! Spectra_offsets.empty ()" mzmlhandler.h 5292
Very strange checks. The container 'spectra_offsets' is checked twice. Most likely, this is a typo and you should check two different containers: spectra_offsets and chromatograms_offsets.
template <typename MapType> void MzMLHandler<MapType>::characters( const XMLCh* const chars, const XMLSize_t) { .... if (optionalAttributeAsString_(data_processing_ref, attributes, s_data_processing_ref)) { data_.back().meta.setDataProcessing( processing_[data_processing_ref]); } else { data_.back().meta.setDataProcessing( processing_[data_processing_ref]); } .... }
PVS-Studio warning: V523 The 'then' statement is equivalent to the 'else' statement. mzmlhandler.h 534
If you look at similar code fragments, you can conclude that it was necessary to use:
- processing_ [data_processing_ref]
- processing_ [default_processing_]
Many typos were associated with the generation of exceptions. Errors are commonplace. The 'throw' keyword is forgotten. It simply creates a temporary object and is immediately destroyed. An example of such an error:
inline UInt asUInt_(const String & in) { UInt res = 0; try { Int tmp = in.toInt(); if (tmp < 0) { Exception::ConversionError( __FILE__, __LINE__, __PRETTY_FUNCTION__, ""); } res = UInt(tmp); } catch (Exception::ConversionError) { error(LOAD, String("UInt conversion error of \"") + in + "\""); } return res; }
PVS-Studio warning: V596 The object was not created. The 'throw' keyword could be missing: throw ConversionError (FOO); xmlhandler.h 247
Identical typos here:
- inclusionexclusionlist.c 281
- inclusionexclusionlist.c 285
- precursorionselectionpreprocessing.c 257
- modificationsdb.c 419
- modificationsdb.c 442
- svmtheoreticalspectrumgeneratorset.c 103
- logconfighandler.c 285
- logconfighandler.c 315
- suffixarraytrypticcompressed.c 488
- tooldescription.c 147
- tofcalibration.c 147
And the last typo I noticed:
inline typename Value<Pipe>::Type const & operator*() { tmp.i1 = *in.in1; tmp.i2 = *in.in2; tmp.i3 = *in.in2; return tmp; }
PVS-Studio warning: V525 The code containing the collection of similar blocks. Check items 'in1', 'in2', 'in2' in lines 112, 113, 114. pipe_joiner.h 112
It should be written:
tmp.i1 = *in.in1; tmp.i2 = *in.in2; tmp.i3 = *in.in3;
3. Strange condition
CompressedInputSource::CompressedInputSource( const String & file_path, const char * header, MemoryManager * const manager) : xercesc::InputSource(manager) { if (sizeof(header) / sizeof(char) > 1) { head_[0] = header[0]; head_[1] = header[1]; } else { head_[0] = '\0'; head_[1] = '\0'; } .... }
PVS-Studio warning: V514 Dividing sizeof a pointer 'sizeof (header)' by another value. There is a possibility of logical error presence. compressedinputsource.c 52
If we divide the size of the pointer by the size of a byte, then we always get a value greater than one. At least I don’t know the quaint architecture where it’s not. So there is some kind of mistake.
A similar strange test is available here: compressedinputsource.c 104
4. Returning a link to a local object
template <typename TStringSet, typename TSpec> inline Iter<TStringSet, ConcatVirtual<TSpec> > const & operator++(Iter<TStringSet, ConcatVirtual<TSpec> > & me, int) { Iter<TStringSet, ConcatVirtual<TSpec> > before = me; goNext(me); return before; }
PVS-Studio warning: V558 Function returns temporary reference object: before. iter_concat_virtual.h 277
The function returns a reference to the temporary variable 'before'. When exiting a function, this variable will be destroyed. Using a reference to a destroyed object will lead to unpredictable consequences.
I think it was necessary not to create a new iterator, but to save the link:
Iter<TStringSet, ConcatVirtual<TSpec> > &before = me;
[PS Wrong. See the comments.]
The operator has a similar problem: '-': iter_concat_virtual.h 310
5. Rough calculations
typedef size_t Size; typedef double DoubleReal; void updateMeanEstimate(const DoubleReal & x_t, DoubleReal & mean_t, Size t) { DoubleReal tmp(mean_t); tmp = mean_t + (1 / (t + 1)) * (x_t - mean_t); mean_t = tmp; }
PVS-Studio warning: V636 The '1 / (t + 1)' expression was implicitly casted from 'int' type to 'double' type. Consider using a fractional part. An example: double A = (double) (X) / Y ;. masstracedetection.c 129
The expression "(1 / (t + 1))" is always zero or one. This is due to the fact that this expression is an integer. Perhaps it was intended to get a completely different result. I do not know the logic of the program, but I think the following was meant:
tmp = mean_t + (1.0 / (t + 1)) * (x_t - mean_t);
I also did not like that instead of the constant M_PI, explicit values are used, and not very accurate ones. This is certainly not a mistake, but not very good. Example:
bool PosteriorErrorProbabilityModel::fit( std::vector<double> & search_engine_scores) { .... incorrectly_assigned_fit_param_.A = 1 / sqrt(2 * 3.14159 * pow(incorrectly_assigned_fit_param_.sigma, 2)); .... }
PVS-Studio warning: V624 The constant 3.14159 is being utilized. The resulting value could be inaccurate. Consider using the M_PI constant from <math.h>. posteriorerrorprobabilitymodel.c 92
Similarly:
- posteriorerrorprobabilitymodel.c 101
- posteriorerrorprobabilitymodel.c 110
- posteriorerrorprobabilitymodel.c 155
- posteriorerrorprobabilitymodel.c 162
6. Going beyond the array boundary
static const Int CHANNELS_FOURPLEX[4][1]; static const Int CHANNELS_EIGHTPLEX[8][1]; ExitCodes main_(int, const char **) { .... if (itraq_type == ItraqQuantifier::FOURPLEX) { for (Size i = 0; i < 4; ++i) { std::vector<std::pair<String, DoubleReal> > one_label; one_label.push_back(std::make_pair<String, DoubleReal>( String("Channel ") + String(ItraqConstants::CHANNELS_FOURPLEX[i][0]), DoubleReal(ItraqConstants::CHANNELS_FOURPLEX[i][0]))); labels.push_back(one_label); } } else
PVS-Studio warning: V557 Array overrun is possible. Itraqanalyzer.c 232
In fact, this error could be attributed to Copy-Paste errors. But let it be "going beyond the border of the array." It sounds more frightening. Anyway, this division is very conditional. The same error can be classified in different ways.
Here, in the 'else' branch, we had to work with the 'CHANNELS_EIGHTPLEX' arrays. This is evidenced by the comment:
else
However, the copied code snippet is not completely changed. As a result, it uses the CHANNELS_FOURPLEX array, which is smaller in size.
A similar error here (another Copy-Paste): tmtanalyzer.c 225
Consider another example.
DoubleReal masse_[255];
PVS-Studio warning: V557 Array overrun is possible. The value of 'i' index could reach 255. edwardslippertiterator.c 134
The copy constructor does not work correctly with the masse_ array. It consists of 255 elements. And copied 256 elements.
The correct cycle:
for (Size i = 0; i < 255; i++) { masse_[i] = source.masse_[i]; }
Better yet, never use magic constants.
7. Obsolete call to the 'new' operator
svm_problem * LibSVMEncoder::encodeLibSVMProblem(....) { .... node_vectors = new svm_node *[problem->l]; if (node_vectors == NULL) { delete[] problem->y; delete problem; return NULL; } .... }
PVS-Studio warning: V668 it was no use. The exception will be generated in the case of memory allocation error. libsvmencoder.c 177
The “if (node_vectors == NULL)” check does not make sense. If you fail to allocate memory, an exception will be raised. As a result, the program will behave differently than the programmer planned. For example, a memory leak will occur.
Similar outdated revisions:
- file_page.h 728
- libsvmencoder.c 160
Conclusion
I think it will be useful to use not only Cppcheck, Cpplint, but PVS-Studio. Especially if you do it regularly. I invite project developers to write to us at
support@viva64.com . For a while we will allocate a key for more detailed verification of OpenMS.