⬆️ ⬇️

Docotic.Pdf: What problems will PVS-Studio detect in a mature project?

Docotic.Pdf and PVS-studio


Quality is important to us. And we have heard about PVS-Studio. All this led to the desire to check Docotic.Pdf and find out what else can be improved.



Docotic.Pdf is a general purpose library for working with PDF files. Written in C #, no unsafe code, no external dependencies other than .NET runtime. It works both under .NET 4+ and under .NET Standard 2+.



The library has been in development for a little over 10 years and it has 110 thousand lines of code without taking tests, examples and other things into account. For static analysis, we constantly use Code Analysis and StyleCop. Several thousand automatic tests protect us from regressions. Our clients from different countries and different industries trust the quality of the library.



What problems will PVS-Studio detect?

')

Installation and first impression



I downloaded the trial version from the PVS-Studio website. Pleasantly surprised by the small size of the installer. Installed with the default settings: analysis engines, a separate PVS-Studio environment, integration into Visual Studio 2017.



After installation, nothing started, and two shortcuts with identical icons were added to the Start menu: Standalone and PVS-Studio. For a moment I wondered what to launch. I launched Standalone and was unpleasantly surprised by the interface. The 200% scale set for my Windows is supported crookedly. Part of the text is too small, part of the text does not fit in the space reserved for it. The name, Unicorn, and Actions list are cropped at any window size. Even with full screen.







Well, okay, I decided to open my project file. Suddenly, the File menu did not find such an option. There I was offered only to open separate files. Thank you, I thought, I’ll try another option. I launched PVS-Studio - they showed me a window with blurred text. Scale 200% again made itself felt. Text reported: look for me in “Three Crowns”; look for the PVS-Studio menu in Visual Studio. Well, opened the Studio.



Opened the solution. Indeed, there is a PVS-Studio menu, and in it there is an opportunity to check “Current Project”. I made the project I needed current and launched the check. Below in the Studio a window appeared with the results of the analysis. A window with the progress of the checkout appeared in the background, but I did not immediately find it. At first there was a feeling that the test did not start or immediately ended.



The result of the first check



The analyzer checked all 1253 project files in approximately 9 minutes and 30 seconds. By the end of the check, the file counter did not change as quickly as at the beginning. Perhaps there is some non-linear dependence of the scan duration on the number of files to be scanned.



Information about 81 High, 109 Medium and 175 Low warnings appeared in the results window. If you count the frequency, you get 0.06 High warnings / file, 0.09 Medium warnings / file, and 0.14 Low warnings / per file. Or

0.74 High warnings per thousand lines of code, 0.99 Medium warnings per thousand lines of code, and 1.59 Low warnings per thousand lines of code.



Here in this article it is indicated that in CruiseControl.NET with its 256 thousand lines of code, the analyzer found 15 High, 151 Medium and 32 Low warnings.



It turns out that as a percentage for Docotic.Pdf, more warnings were issued in each group.



What is found?



Warnings Low I decided to disregard at this stage.



I sorted out warnings on the Code column and it turned out that V3022 "Expression is always true / false" and V3063 "If it is evaluated", became the absolute frequency record holder. In my opinion, they are about one thing. In total, these two warnings give 92 cases out of 190. Relative frequency = 48%.



The logic of division into High and Medium is not entirely clear. I expected V3072 "The 'A' class containing IDisposable members doesn’t itself implement IDisposable" and V3073 "Not all IDisposable members are properly disposed. Call 'Dispose' when disposing 'A' class' in the High group, for example. But it tastes, of course.



I was surprised that V3095 “The object was used before it was verified against null. Check lines: N1, N2 ”is marked twice as High and once as Medium. Bug







Trust but check



It is time to check whether the received warnings are justified. Are any real errors found? Are there any incorrect warnings?



I divided the warnings I found into groups below.



Important warnings



Their correction increased the stability of work, solved problems with memory leaks, etc. Real errors / imperfections.



There were 16 of them, which is about 8% of all warnings.



I will give some examples.



V3019 "Possibly incorrectly variable compared to null after type conversion using 'as' keyword. Check variables 'color', 'indexed' »



public override bool IsCompatible(ColorImpl color) { IndexedColorImpl indexed = color as IndexedColorImpl; if (color == null) return false; return indexed.ColorSpace.Equals(this); } 


As you can see, instead of indexed, the variable color is compared with null. This is wrong and can lead to NRE.



V3080 “Possible null dereference. Consider inspecting 'cstr_index.tile_index' »



A small fragment for illustration:



 if (cstr_index.tile_index == null) { if (cstr_index.tile_index[0].tp_index == null) { // .. } } 


Obviously, the first condition implied! = Null. In the current view, the code will call NRE each time it is called.



V3083 “Unsafe invocation of event 'OnProgress', NullReferenceException is possible. Consider assigning an event to a local variable before invoking it. ”



 public void Updated() { if (OnProgress != null) OnProgress(this, new EventArgs()); } 


The warning helped correct a potential exception. Why can it occur? There is a good explanation on Stackoverflow.



V3106 “Possibly index is out of bounds. The '0' index is pointing beyond 'v' bound »



 var result = new List<FontStringPair>(); for (int i = 0; i < text.Length; ++i) { var v = new List<FontStringPair>(); createPairs(text[i].ToString(CultureInfo.InvariantCulture)); result.Add(v[0]); } 


The error is that the result of createPairs is ignored, and instead an appeal is made to the empty list. Apparently, initially createPairs took the list as a parameter, but an error was made during the method change process.



V3117 "Constructor parameter 'validateType' is not used



An alert has been issued for code similar to



 public SomeClass(IDocument document, bool validateType = true) : base(document, true) { m_provider = document; } 


The warning itself does not seem important. But the problem is more serious than it seems at first glance. In the process of adding the optional parameter validateType, the call to the constructor of the base class was forgotten fixed.



V3127 „Two similar code fragments were found. Perhaps this is a typo and a range of variable should be used instead of a domain



 private void fillTransferFunction(PdfStreamImpl function) { // .. PdfArrayImpl domain = new PdfArrayImpl(); domain.AddReal(0); domain.AddReal(1); function.Add(Field.Domain, domain); PdfArrayImpl range = new PdfArrayImpl(); range.AddReal(0); range.AddReal(1); function.Add(Field.Range, domain); // .... } 


Perhaps a warning will not be issued if parts of the code are slightly different. But in this case, an error was found when using copy-paste.



Theoretical / formal warnings



They are either correct, but fixing them does not fix any specific errors and does not affect the readability of the code. Or they point to places where there could be a mistake, but there is none. For example, the order of the parameters is intentionally changed. For such alerts do not need to change anything in the program.



There were 57 of them, which is about 30% of all warnings. I will give examples for cases that deserve attention.



V3013 “It is the odd that the body of the BeginText function is fully equivalent to the body of the EndText function (166, line 171)“



 public override void BeginText() { m_state.ResetTextParameters(); } public override void EndText() { m_state.ResetTextParameters(); } 


Both body functions are actually the same. But it is right. And is it really strange if the bodies of functions from one line coincide?



V3106 „Possible negative index value. The value of 'c1' index could reach -1 “



 freq[256] = 1; // .... c1 = -1; v = 1000000000L; for (i = 0; i <= 256; i++) { if (freq[i] != 0 && freq[i] <= v) { v = freq[i]; c1 = i; } } // .... freq[c1] += freq[c2]; 


It agree, I resulted a slice of not the most clear algorithm. But, in my opinion, in this case, the analyzer suffers in vain.



V3107 "Identical expression 'neighsum' to the left and to the right of compound assignment"



The warning is caused by a completely banal code:



 neighsum += neighsum; 


Yes, it can be rewritten via multiplication. But there is no error.



V3109 „The 'l_cblk.data_current_size' The sub-expression is present on both sides of the operator. The expression is incorrect or it can be simplified. “



 /* Check possible overflow on size */ if ((l_cblk.data_current_size + l_seg.newlen) < l_cblk.data_current_size) { // ... } 


A comment in the code explains the intent. False alarm again.



Reasonable warnings



Their correction had a positive effect on the readability of the code. That is, reduced unnecessary conditions, checks, etc. The impact on how the code works is not obvious.



There were 103 of them, which is about 54% of all warnings.



V3008 „The 'l_mct_deco_data' variable is assigned values ​​twice successively. Perhaps this is a mistake “



 if (m_nb_mct_records == m_nb_max_mct_records) { ResizeMctRecords(); l_mct_deco_data = (OPJ_INT32)m_nb_mct_records; } l_mct_deco_data = (OPJ_INT32)m_nb_mct_records; 


Rights analyzer: assignment inside if is not necessary.



V3009 „It’s odd that this method always returns one and the same value“



 private static bool opj_dwt_decode_tile(opj_tcd_tilecomp_t tilec, OPJ_UINT32 numres) { if (numres == 1U) return true; // ... return true; } 


On the advice of the analyzer, the method has been changed and returns nothing more.



V3022 „Expression '! Add' is always true“



 private void addToFields(PdfDictionaryImpl controlDict, bool add) { // ... if (add) { // .. return; } if (!add) { // ... } // ... } 


Indeed, there is no point in the second if. The condition will always be true.



V3029 "The conditional expression of the" if "statements followed alongside each other are identical"



 if (stroke) extGState.OpacityStroke = opacity; if (stroke) state.AddReal(Field.CA, opacity); 


It is not clear how such a code came about. But now we fixed it.



V3031 Anonymous check can be simplified. The '||' operator is surrounded by opposite expressions “



This is a nightmare condition:



 if (!(cp.m_enc.m_tp_on != 0 && ((!opj_codec_t.OPJ_IS_CINEMA(cp.rsiz) && (t2_mode == J2K_T2_MODE.FINAL_PASS)) || opj_codec_t.OPJ_IS_CINEMA(cp.rsiz)))) { // ... } 


After the change is much better



 if (!(cp.m_enc.m_tp_on != 0 && (opj_codec_t.OPJ_IS_CINEMA(cp.rsiz) || t2_mode == J2K_T2_MODE.FINAL_PASS))) { // ... } 


V3063 „A part of conditional expression is always true if it is evaluated: x! = Null“

V3022 “Expression 'x! = Null' is always true“



Here I took the warning that checking for null does not make sense. Is it right to do so - the question is ambiguous. Below I have described the essence of the issue.



Unwarranted Warnings



False positives. Due to errors in the implementation of a specific test or some lack of the analyzer.



14 of them were issued, which is about 7% of all warnings.



V3081 „The 'i' counter is not used inside a nested loop. Consider inspecting usage of 'j' counter “



A slightly simplified version of the code for which this warning was issued:



 for (uint i = 0; i < initialGlyphsCount - 1; ++i) { for (int j = 0; j < initialGlyphsCount - i - 1; ++j) { // ... } } 


Obviously, i is used in a nested loop.



V3125 „The object was used after it was verified against null“



The code for which this warning is issued:



 private static int Compare_SecondGreater(cmapEncodingRecord er1, cmapEncodingRecord er2) { if (er1 == er2) return 0; if (er1 != null && er2 == null) return 1; if (er1 == null && er2 != null) return -1; return er1.CompareTo(er2); } 


er1 cannot be null at the time of calling CompareTo ().



Another code for which this warning is issued:



 private static void realloc(ref int[][] table, int newSize) { int[][] newTable = new int[newSize][]; int existingSize = 0; if (table != null) existingSize = table.Length; for (int i = 0; i < existingSize; i++) newTable[i] = table[i]; for (int i = existingSize; i < newSize; i++) newTable[i] = new int[4]; table = newTable; } 


table cannot be null in a loop.



V3134 „Shift by [32.255] bits is greater than the size of 'UInt32' type of expression '(uint) 1'“



A piece of code for which this warning is issued:



 byte bitPos = (byte)(numBits & 0x1F); uint mask = (uint)1 << bitPos; 


It can be seen that bitPos can have a value from the range [0..31], but the analyzer believes that it can have a value from the range [0..255], which is incorrect.



I will not give other similar cases, as they are equivalent.



Additional thoughts about some checks



It seemed undesirable to warn that 'x! = Null' is always true in cases where x is the result of calling some method. Example:



 private X GetX(int y) { if (y > 0) return new X(...); if (y == 0) return new X(...); throw new ArgumentOutOfRangeException(nameof(x)); } private Method() { // ... X x = GetX(..); if (x != null) { // ... } } 


Yes, formally the analyzer is right: x will always not be null, because GetX will either return a full instance, or throw an exception. But will the code improve removing null checks? What if GetX changes later? Is Method required to know the implementation of GetX?



Inside the team opinions are divided. It has been suggested that the current method has a contract in which it should not return null. And it makes no sense to write redundant code "for the future" with each call. And if the contract changes - it is necessary to update the calling code.



In support of this opinion, the following argument was cited: checking for null is like wrapping up every call in try / catch in case the method starts throwing exceptions in the future.



As a result, according to the principle of YAGNI , they decided not to hold on to the checks and deleted them. All warnings were moved from theoretical / formal to reasonable.



I would be glad to read your opinion in the comments.



findings



Static analysis is a useful thing. PVS-Studio allows you to find real errors.



Yes, there are unfounded / incorrect warnings. Still, PVS-Studio found real errors in the project, where Code Analysis is already used. Our product is fairly well covered with tests, our clients test it in one way or another, but static analysis still benefits the robots do it better .



Finally, some statistics.



Top 3 unfounded warnings



V3081 „The 'X' counter is not used inside a nested loop. Consider inspecting usage of 'Y' counter “

1 of 1 is considered unreasonable



V3125 „The object was used against it. Check lines: N1, N2 “

9 out of 10 found to be unfounded



V3134 „Shift by N bits is greater than the size of type“

4 out of 5 were found to be unfounded



Top 3 Important Warnings



V3083 Unsafe invocation of event, NullReferenceException is possible. Consider assigning an event to a local variable before invoking it “

5 out of 5 recognized as important



V3020 „An unconditional 'break / continue / return / goto' within a loop“

V3080 „Possible null dereference“

2 out of 2 recognized as important



V3019 "It is possible that you are not correct after the type conversion using 'as' keyword"

V3127 „Two similar code fragments were found. Perhaps this is a typo and 'X' variable should be used instead of 'Y' “

1 of 1 recognized as important

Source: https://habr.com/ru/post/425741/



All Articles