
The most dangerous function in the C/C++ world

memset ()
Having checked various C/C++ projects for many years, I can declare that the most unfortunate and dangerous function is memset(). More errors are made when using memset() than when using any other function. I realize that this conclusion is unlikely to shake the foundations of the universe or to be incredibly valuable. Still, I think readers will be interested to know how I arrived at it.

Hello


My name is Andrey Karpov. I hold many positions and do many things, but my main job is telling programmers about the benefits of static code analysis. Naturally, I have a selfish motive: I want to get readers interested in the PVS-Studio analyzer. That does not make my articles any less interesting or useful.

The only kind of advertising that can pierce programmers' scaly armor is a demonstration of the errors PVS-Studio can find. To this end, I check a large number of open source projects and write articles about the results. Everybody benefits: open projects get a little better, and our company gets new customers.
Now you will see where I am going with this. While checking open projects, I have accumulated a large database of error examples, and now I can look for interesting patterns in it.

For example, one interesting observation was that programmers most often make Copy-Paste mistakes at the very end of the pasted block. I wrote about this in the article "The Last Line Effect".

New observation


Now I have another interesting observation. Programmers can make a mistake when using almost any function, but the probability of making one depends on the function. In other words, some functions provoke errors and others do not.

So, I am ready to name the function whose use is most likely to land you in trouble.

The winner in the bugginess category is the memset() function!

It is hard to say how this came about. Apparently it has a poor interface. On top of that, using it is fairly laborious, and it is easy to make a mistake when computing the actual arguments.

The honorable second place goes to the printf() function and its variants. I doubt this will surprise anyone: only the lazy have not written about the dangers of printf(). Perhaps it is precisely because its problems are so well known that printf() ended up only in second place.

My database contains 9055 errors in total. These are errors that the PVS-Studio analyzer is able to find; obviously, it cannot find everything. Still, the large number of errors gives me confidence in my conclusions. I counted 329 errors that involve the memset() function.

In total, about 3.6% of the errors in the database are associated with memset(). That's a lot!

Examples


Let's look at some typical examples. Having seen them, I think you will agree that something is wrong with the memset() function. It attracts evil.

First, let's recall how this function is declared:

    void *memset(void *ptr, int value, size_t num);

Example N1 (ReactOS project)
    void Mapdesc::identify( REAL dest[MAXCOORDS][MAXCOORDS] )
    {
        memset( dest, 0, sizeof( dest ) );
        for( int i=0; i != hcoords; i++ )
            dest[i][i] = 1.0;
    }

The error is that in C and C++ arrays cannot be passed by value. The argument 'dest' is nothing more than an ordinary pointer, so the sizeof() operator yields the size of a pointer, not of the array.

It might seem that memset() is not to blame here. On the other hand, the function will zero only 4 or 8 bytes (exotic architectures aside). The error is real, and it occurred in a call to memset().
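
One way to keep the array size visible inside the function is to take the array by reference. The sketch below is only an illustration under my own assumptions: MAXCOORDS is given an arbitrary value, REAL is assumed to be float, hcoords is passed as a parameter instead of being a class member, and size_seen_through_pointer() is a hypothetical helper that shows what sizeof() sees once the array has decayed to a pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Assumed stand-ins for the names used in the ReactOS snippet.
constexpr std::size_t MAXCOORDS = 5;
using REAL = float;

// Decayed form: the parameter is really 'REAL (*)[MAXCOORDS]',
// so sizeof(dest) is the size of a pointer.
std::size_t size_seen_through_pointer(REAL (*dest)[MAXCOORDS]) {
    return sizeof(dest);
}

// Taking the array by reference keeps its full type,
// so sizeof(dest) is the size of the whole matrix.
void identify(REAL (&dest)[MAXCOORDS][MAXCOORDS], std::size_t hcoords) {
    assert(hcoords <= MAXCOORDS);
    std::memset(dest, 0, sizeof(dest));
    for (std::size_t i = 0; i != hcoords; ++i)
        dest[i][i] = 1.0f;
}
```

With this signature a caller can only pass a real MAXCOORDS x MAXCOORDS array, and a mismatched size becomes a compile-time error instead of a silent partial fill.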

Example N2 (Wolfenstein 3D project)

    typedef struct cvar_s {
        char *name;
        ...
        struct cvar_s *hashNext;
    } cvar_t;

    void Cvar_Restart_f( void ) {
        cvar_t *var;
        ...
        memset( var, 0, sizeof( var ) );
        ...
    }

A similar error, most likely made through inattention. The variable 'var' is a pointer, so memset() will zero only part of the structure. In practice, only the 'name' member will be zeroed.
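
For comparison, here is a minimal sketch of the call the author of that code presumably intended. The structure is a shortened, hypothetical version of cvar_t (the 'flags' member is my own addition), and clear_cvar() is an illustrative name, not a function from the project.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical, shortened version of the structure from the snippet.
struct cvar_t {
    char   *name;
    int     flags;
    cvar_t *hashNext;
};

// sizeof(*var) is the size of the pointed-to object, not of the pointer,
// so the whole structure is cleared.
void clear_cvar(cvar_t *var) {
    assert(var != nullptr);
    std::memset(var, 0, sizeof(*var));
}
```

Writing sizeof(*var) rather than sizeof(cvar_t) has the small advantage that the size keeps following the variable if its type ever changes.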

Example N3 (SMTP Client project)

    void MD5::finalize () {
        ...
        uint1 buffer[64];
        ...
        // Zeroize sensitive information
        memset (buffer, 0, sizeof(*buffer));
        ...
    }

A very common error pattern that few programmers are aware of. The memset() call will be removed by the compiler: the buffer is not used after the call, so the compiler eliminates the call as an optimization. From the point of view of the C/C++ language, removing it has no effect on the program's behavior, and that is true: the fact that private data stays in memory does not affect how the program works.

This is not a compiler bug, and it is not my imagination. The compiler really does remove memset() calls. Every time I describe this vulnerability, I get emails from people who start arguing with me, and I am tired of answering them. So I ask all doubters to study the available material on this issue carefully before starting a discussion.
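
A commonly used portable workaround is to write the zeros through a pointer to volatile, because such writes count as observable behavior and compilers leave them in place in practice. The sketch below is a fallback under that assumption; secure_zero() is a hypothetical name, not a standard function.

```cpp
#include <cassert>
#include <cstddef>

// Fallback sketch: every store goes through a volatile lvalue,
// so the optimizer is not expected to drop the loop as a dead store.
void secure_zero(void *ptr, std::size_t len) {
    assert(ptr != nullptr || len == 0);
    volatile unsigned char *p = static_cast<volatile unsigned char *>(ptr);
    while (len--)
        *p++ = 0;
}
```

A call would look like secure_zero(buffer, sizeof(buffer)); note that sizeof(*buffer), as written in the snippet above, covers only a single element. Where they are available, facilities designed for this job, such as memset_s() (optional C11 Annex K), SecureZeroMemory() on Windows, or explicit_bzero() in glibc and the BSDs, are usually the better choice.
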
Example N4 (Notepad++ project)

    #define CONT_MAP_MAX 50
    int _iContMap[CONT_MAP_MAX];
    ...
    DockingManager::DockingManager()
    {
        ...
        memset(_iContMap, -1, CONT_MAP_MAX);
        ...
    }

It is often forgotten that the third argument of memset() is not the number of elements but the size of the buffer in bytes. That is exactly what happened in the snippet above. As a result, only a quarter of the buffer is filled (assuming 'int' is 4 bytes).
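
To make the difference between elements and bytes concrete, here is a small sketch with two possible corrections. The function names are hypothetical; CONT_MAP_MAX keeps the value from the snippet. The first variant uses std::fill, which counts in elements; the second keeps memset() but multiplies by the element size.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <iterator>

constexpr std::size_t CONT_MAP_MAX = 50;

// std::fill works in elements, so no byte arithmetic is needed.
void reset_cont_map(int (&map)[CONT_MAP_MAX]) {
    std::fill(std::begin(map), std::end(map), -1);
}

// memset variant: the third argument must be a size in bytes.
// Filling with -1 works only because every byte becomes 0xFF.
void reset_with_memset(int *map, std::size_t count) {
    assert(map != nullptr || count == 0);
    std::memset(map, -1, count * sizeof(*map));
}
```

The std::fill form also works for values such as 1 or 42, where a byte-wise memset() would produce something quite different from what was intended.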

Example N5 (Newton Game Dynamics project)
    dgCollisionCompoundBreakable::dgCollisionCompoundBreakable(....)
    {
        ...
        dgInt32 faceOffsetHitogram[256];
        dgSubMesh* mainSegmenst[256];
        ...
        memset(faceOffsetHitogram, 0, sizeof(faceOffsetHitogram));
        memset(mainSegmenst, 0, sizeof(faceOffsetHitogram));
        ...
    }

This is a typo. Most likely someone was too lazy to type the memset() call twice and duplicated the line. 'faceOffsetHitogram' was replaced with 'mainSegmenst' in one place but not in the other.

As a result, sizeof() computes the size of the wrong array, not the one being zeroed. Again, memset() itself does not seem to be at fault, yet it will work incorrectly.
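
One way to rule out this kind of mismatch is to name the array only once and let the compiler deduce its size. The helper below, zero_array(), is a hypothetical name of my own; it is a sketch, not something taken from the project.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// The array is named once; its size comes from its type, so the buffer
// and the byte count can never refer to two different arrays.
template <typename T, std::size_t N>
void zero_array(T (&arr)[N]) {
    std::memset(arr, 0, sizeof(arr));
}
```

With such a helper the two calls become zero_array(faceOffsetHitogram) and zero_array(mainSegmenst), and a Copy-Paste slip can no longer produce a wrong size. The helper is meant for arrays of plain data such as integers and raw pointers.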

Example N6 (CxImage project)
    static jpc_enc_tcmpt_t *tcmpt_create(....)
    {
        ...
        memset(tcmpt->stepsizes, 0,
               sizeof(tcmpt->numstepsizes * sizeof(uint_fast16_t)));
        ...
    }

There is an extra sizeof() operator here. The correct size is calculated as follows:

    tcmpt->numstepsizes * sizeof(uint_fast16_t)

But an extra sizeof() was added, and the result is nonsense:

    sizeof(tcmpt->numstepsizes * sizeof(uint_fast16_t))

Here the sizeof() operator yields the size of the type size_t, because that is the type of the expression.

I know you want to object: this is not the first error that is really about the sizeof() operator, that is, the programmer miscalculated the buffer size. However, the root cause of these errors is still memset(). It is designed so that you have to perform all these calculations, in which it is so easy to slip.
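
The difference between the two expressions is easy to check at compile time. In the sketch below the element count is an assumed value standing in for tcmpt->numstepsizes, and bytes_for() is a hypothetical helper that keeps the byte-count arithmetic in one place.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed element count standing in for tcmpt->numstepsizes.
constexpr std::size_t numstepsizes = 100;

// Extra sizeof: the size of the *type* of the product, i.e. of std::size_t.
constexpr std::size_t wrong_size =
    sizeof(numstepsizes * sizeof(std::uint_fast16_t));

// Intended value: the number of bytes occupied by all elements.
constexpr std::size_t right_size =
    numstepsizes * sizeof(std::uint_fast16_t);

// Hypothetical helper: computes count * elem_size and checks for overflow.
std::size_t bytes_for(std::size_t count, std::size_t elem_size) {
    assert(elem_size == 0 || count <= SIZE_MAX / elem_size);
    return count * elem_size;
}
```

On a typical 64-bit platform wrong_size is 8, so only the first few elements of the array would be zeroed.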

Example N7 (WinSCP project)
    TForm * __fastcall TMessageForm::Create(....)
    {
        ....
        LOGFONT AFont;
        ....
        memset(&AFont, sizeof(AFont), 0);
        ....
    }

The memset() function is omnivorous, so it reacts calmly when the 2nd and 3rd arguments are swapped. That is exactly what happened here: this call fills 0 bytes.
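
In C++ a plain structure can often be zeroed without memset() at all, which removes the argument order from the picture. The sketch below uses a hypothetical FontInfo structure as a stand-in for LOGFONT; both function names are my own.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for a plain Win32-style structure such as LOGFONT.
struct FontInfo {
    long height;
    long weight;
    char face_name[32];
};

// Empty braces zero every member; there is no size or value to mix up.
FontInfo make_zeroed_font() {
    FontInfo font{};
    return font;
}

// If memset() has to stay, labeling the arguments makes a swap easier to spot.
void clear_font(FontInfo *font) {
    assert(font != nullptr);
    std::memset(font, /*value=*/0, /*num=*/sizeof(*font));
}
```

In the WinSCP snippet the same idea would presumably read LOGFONT AFont = {}; but that is an assumption about code I have not compiled.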

Example N8 (Multi Theft Auto project)

And here is another similar error. The Win32 API developers must have been joking when they created this macro:

    #define RtlFillMemory(Destination,Length,Fill) \
        memset((Destination),(Fill),(Length))

In meaning, this is an alternative to memset(). But you have to be careful: note that the 2nd and 3rd arguments are swapped.

When programmers start using RtlFillMemory(), they treat it like memset() and assume the parameters come in the same order. As a result, errors occur.

    #define FillMemory RtlFillMemory

    LPCTSTR __stdcall GetFaultReason ( EXCEPTION_POINTERS * pExPtrs )
    {
        ....
        PIMAGEHLP_SYMBOL pSym = (PIMAGEHLP_SYMBOL)&g_stSymbol ;
        FillMemory ( pSym , NULL , SYM_BUFF_SIZE ) ;
        ....
    }

NULL is nothing other than 0, so the memset() function filled 0 bytes.

Example N9 (IPP Samples project)

I think you understand that I could go on giving examples for a long time. That would not be very interesting, though, because they would be monotonous and similar to those already shown. Still, let's consider one more case.

Although some of the errors above were found in C++ code, they have nothing to do with C++ as such. In other words, they come from programming in the C style.

The next error is related to incorrect use of memset() in a C++ program. The example is rather long, so you do not have to study it closely. Read the description below and everything will become clear.
    class _MediaDataEx {
      ...
      virtual bool TryStrongCasting(
        pDynamicCastFunction pCandidateFunction) const;
      virtual bool TryWeakCasting(
        pDynamicCastFunction pCandidateFunction) const;
    };

    Status VC1Splitter::Init(SplitterParams& rInit)
    {
      MediaDataEx::_MediaDataEx *m_stCodes;
      ...
      m_stCodes = (MediaDataEx::_MediaDataEx *)
        ippsMalloc_8u(START_CODE_NUMBER*2*sizeof(Ipp32s)+
                      sizeof(MediaDataEx::_MediaDataEx));
      ...
      memset(m_stCodes, 0,
        (START_CODE_NUMBER*2*sizeof(Ipp32s)+
        sizeof(MediaDataEx::_MediaDataEx)));
      ...
    }

The memset() function is used to initialize an array of class objects. The biggest problem is that the class contains virtual functions. Consequently, memset() zeroes not only the class fields but also the pointer to the virtual method table (vptr). What this will lead to is unknown, but nothing good can come of it. Classes must not be handled this way.
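
A compile-time guard can catch this class of mistake before the program ever runs. The sketch below wraps memset() in a hypothetical zero_object() helper that refuses to compile for types that are not trivially copyable, which includes any class with virtual functions. PlainCodes and PolymorphicCodes are made-up types for illustration only.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// memset-based clearing that is rejected at compile time for types
// where raw byte writes are unsafe, such as polymorphic classes.
template <typename T>
void zero_object(T *obj) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "zero_object: T must be trivially copyable");
    assert(obj != nullptr);
    std::memset(obj, 0, sizeof(T));
}

// Made-up types for illustration.
struct PlainCodes {
    int      count;
    unsigned offsets[8];
};

struct PolymorphicCodes {
    virtual ~PolymorphicCodes() = default;
    int count = 0;
};
```

Here zero_object() accepts a PlainCodes*, while passing a PolymorphicCodes* stops the build with the message from the static_assert. Objects of such classes are better initialized by their constructors.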

Conclusion


As you can see, memset() has a very poor interface. As a result, it provokes errors more than any other function. Be careful!

I am not yet ready to say how this observation can be put to use. But I hope you found the article interesting. Perhaps you will now be more attentive when using memset(), and that is a good thing.

Thank you all for your attention, and subscribe to my Twitter, @Code_Analysis.

I also run the C++ Hints resource, where I share various useful tips that come to mind. Subscribe.

Source: https://habr.com/ru/post/272243/

