
Once again I have become convinced that programmers write programs in an utterly careless way. And these programs work not on their merits, but thanks to luck and the care of the compiler developers at Microsoft or Intel. Yes, it is they who look after us and, at the right moment, prop up our crooked programs with crutches.
Pray, pray for the compilers and their developers. They put in enormous effort so that our programs work despite many shortcomings and even outright mistakes. Their work is hard and invisible. They are the noble knights of coding and the guardian angels of us all.
I knew that Microsoft has a department dedicated to keeping new versions of its operating systems as compatible as possible with old applications. Its database holds more than 10,000 of the best-known old programs that absolutely must work in new versions of Windows. Thanks to these efforts, I was recently able to play Heroes of Might and Magic II (a 1996 game) without any problems under 64-bit Windows Vista. I think the game will run just as well on Windows 7. Alexey Pakhunov has written some interesting notes on compatibility [1, 2, 3], and I highly recommend reading them.
But apparently there are also departments whose job is to help our terrible C/C++ code keep working, and working, and working. Let me tell this story from the beginning.
I am involved in developing PVS-Studio, a tool for analyzing application source code. Easy, comrades, easy: this is not an advertisement. This time it really is a good cause, because we have started building a free general-purpose static analyzer. Even an alpha version is still a long way off, but the work is slowly moving forward, and someday I will write a post about this analyzer on Habrahabr. I mention it because we have started collecting the most interesting typical errors and learning how to diagnose them.
Many errors are related to the use of ellipses in programs. A bit of theory:
Some functions cannot specify the number and types of all their valid parameters in the declaration. In that case the list of formal parameters ends with an ellipsis (...), which means "and possibly some more arguments". For example: int printf(const char* ...);
One unpleasant but easily diagnosed error of this kind is passing an object of class type, instead of a pointer to a string, to a function with a variable number of arguments. Here is an example of this error:
wchar_t buf[100];
std::wstring ws(L"12345");
swprintf(buf, L"%s", ws);
Such code will leave garbage in the buffer or crash the program. In a real program the code will of course be more tangled, so please do not write comments saying that, unlike Visual C++, the GCC compiler checks the arguments and issues a warning. Format strings may come from resources or from other functions, and then nothing gets checked. Here, though, the diagnosis is simple: a class object is passed to a string-formatting function, and that is an error.
The correct code should look like this:
wchar_t buf[100];
std::wstring ws(L"12345");
swprintf(buf, L"%s", ws.c_str());
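For readers who are not on Visual C++, the same fix can be sketched in portable form. This is only an illustration under my own assumptions: the helper name format_wide is hypothetical, the standard std::swprintf takes the buffer size as its second argument, and "%ls" is the specifier the C standard defines for wide strings (Visual C++ also treats "%s" as a wide string in the wide-character functions).

```cpp
#include <cassert>
#include <cwchar>
#include <string>

// Hypothetical helper: formats a std::wstring through swprintf by passing
// a pointer to its characters, never the object itself.
std::wstring format_wide(const std::wstring& ws) {
    wchar_t buf[100];
    // Standard swprintf takes the buffer size in wide characters.
    // "%ls" is the standard specifier for a const wchar_t* argument.
    int written = std::swprintf(buf, sizeof buf / sizeof buf[0],
                                L"%ls", ws.c_str());
    assert(written >= 0);  // a negative result means the text did not fit
    return std::wstring(buf);
}
```

Calling format_wide(L"12345") goes through the same c_str() path as the corrected snippet above.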
It is precisely because anything at all can be passed to a function with a variable number of arguments that almost every book on C++ programming advises against using them. Safer mechanisms, such as boost::format, are suggested instead. But recommendations are only recommendations; there is a huge amount of code that uses the various printf, sprintf, and CString::Format functions, and we will have to live with it for a very long time. That is why we implemented a diagnostic rule that detects such dangerous constructs.
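To show what a "safe mechanism" looks like in principle, here is a minimal sketch built on standard streams rather than boost::format (I am swapping in std::wostringstream so that the example needs no external library). The names append_all and concat are hypothetical. Each argument keeps its static type, so a std::wstring goes through its own operator<< and there is no format specifier to mismatch.

```cpp
#include <sstream>
#include <string>

// Base case of the recursion: nothing left to append.
inline void append_all(std::wostringstream&) {}

// Appends the first argument using the operator<< chosen for its real type,
// then recurses over the remaining arguments.
template <typename First, typename... Rest>
void append_all(std::wostringstream& out, const First& first,
                const Rest&... rest) {
    out << first;
    append_all(out, rest...);
}

// Hypothetical type-safe replacement for a printf-style call:
// concatenates the text representation of every argument.
template <typename... Args>
std::wstring concat(const Args&... args) {
    std::wostringstream out;
    append_all(out, args...);
    return out.str();
}
```

For example, concat(L"Value: ", std::wstring(L"12345"), L", count: ", 3) builds the text without a single "%s", and passing a type that has no operator<< is a compile-time error instead of garbage at run time.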
Let's look at the theory of why the code above is wrong. It turns out to be wrong twice over.
- The argument does not match the specified format. Since we specify "%s", we must pass a pointer to a string. In theory, though, we could write our own sprintf function that knows it has been given an object of class std::wstring and prints it correctly. However, that is impossible too, because of the second reason.
- Only a POD type may be passed as an argument for the ellipsis "...". And std::wstring is not a POD type.
A bit of theory about POD types:
POD stands for "Plain Old Data", which can be read as "simple data in the style of C". POD types include:
- all built-in arithmetic types (including wchar_t and bool);
- types declared with the enum keyword;
- pointers;
- POD structures (struct or class) and POD unions (union) that satisfy the requirements below:
  - do not contain user-declared constructors, a destructor, or a copy assignment operator;
  - do not have base classes;
  - do not contain virtual functions;
  - do not contain protected or private non-static data members;
  - do not contain non-static data members of non-POD types (or arrays of such types), or references.
Accordingly, the class std::wstring is not a POD type, since it has constructors, a base class, and so on.
Moreover, passing an object of a non-POD type to an ellipsis leads to undefined behavior. So, at least in theory, there is no correct way to pass an object of type std::wstring as an ellipsis argument.
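The classification above can be checked by the compiler itself. The sketch below is an approximation under my own assumptions: it uses the C++11 type traits, where "POD" is defined as "trivial and standard-layout", which is close to but not word-for-word the same as the list of requirements given above. The struct Point and the helper is_pod_like are hypothetical names.

```cpp
#include <string>
#include <type_traits>

// Hypothetical C-style structure: public data only, no constructors.
struct Point {
    int x;
    int y;
};

// Approximates the POD notion via C++11 traits (trivial + standard-layout).
template <typename T>
constexpr bool is_pod_like() {
    return std::is_trivial<T>::value && std::is_standard_layout<T>::value;
}

static_assert(is_pod_like<int>(), "arithmetic types are POD");
static_assert(is_pod_like<const wchar_t*>(), "pointers are POD");
static_assert(is_pod_like<Point>(), "a plain struct is POD");
static_assert(!is_pod_like<std::wstring>(), "std::wstring is not POD");
```

The last assertion is the important one: a pointer obtained from c_str() passes the check, while the string object itself does not.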
The same picture should be seen with the Format function of the CString class. An incorrect version of the code:
CString s;
CString arg(L"OK");
s.Format(L"Test CString: %s\n", arg);
The correct code is:
s.Format(L"Test CString: %s\n", arg.GetString());
Or, as suggested in MSDN [4], you can obtain a pointer to the string with the explicit cast operator LPCTSTR implemented in the CString class. An example of correct code from MSDN:
CString kindOfFruit = "bananas";
int howmany = 25;
printf("You have %d %s\n", howmany, (LPCTSTR)kindOfFruit);
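Why the explicit cast matters can be sketched without MFC. The class MiniString below is a hypothetical stand-in, not the real CString; I assume only that it has a conversion operator to a character pointer, as the LPCTSTR operator of CString does. For an ellipsis argument the compiler knows no target type, so it never applies the conversion operator on its own; the cast is what forces the call.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical stand-in for a string class with a conversion operator.
// It is NOT the real CString; it only mimics the operator LPCTSTR idea.
class MiniString {
public:
    explicit MiniString(const char* text) : text_(text) {}
    operator const char*() const { return text_.c_str(); }
private:
    std::string text_;
};

// Builds the MSDN-style message. The static_cast invokes the conversion
// operator, so the printf-style function receives a plain pointer.
std::string describe_fruit(int how_many, const MiniString& kind) {
    char buf[64];
    int written = std::snprintf(buf, sizeof buf, "You have %d %s\n",
                                how_many, static_cast<const char*>(kind));
    assert(written >= 0);  // negative means an encoding error
    return std::string(buf);
}
```

Without the static_cast, the MiniString object itself would be handed to the ellipsis, which is exactly the error discussed above.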
So everything seems transparent and clear. How to write the rule is clear too: we will detect misprints made when calling functions with a variable number of arguments.
So we did it. And the result shocked me. It turns out that most developers never think about these problems at all and calmly write code like this:
class CRuleDesc
{
  CString GetProtocol();
  CString GetSrcIp();
  CString GetDestIp();
  CString GetSrcPort();
  CString GetIpDesc(CString strIp);
...
CString CRuleDesc::GetRuleDesc()
{
  CString strDesc;
  strDesc.Format(
    _T("%s all network traffic from <br>%s "
       "on %s<br>to %s on %s <br>for the %s"),
    GetAction(), GetSrcIp(), GetSrcPort(),
    GetDestIp(), GetDestPort(), GetProtocol());
  return strDesc;
}
//---------------
CString strText;
CString _strProcName(L"");
...
strText.Format(_T("%s"), _strProcName);
//---------------
CString m_strDriverDosName;
CString m_strDriverName;
...
m_strDriverDosName.Format(
  _T("\\\\.\\%s"), m_strDriverName);
//---------------
CString __stdcall GetResString(UINT dwStringID);
...
_stprintf(acBuf, _T("%s"),
  GetResString(IDS_SV_SERVERINFO));
//---------------
// I think it is clear that
// the examples could go on and on.
And some developers do think about it, but then forget. That is why the following code looks so touching:
CString sAddr;
CString m_sName;
CString sTo = GetNick(hContact);
sAddr.Format(_T("\\\\%s\\mailslot\\%s"),
  sTo, (LPCTSTR)m_sName);
And there were so many examples like these in the projects we use to test PVS-Studio that I could not understand how this was possible. And yet it all works perfectly, as I verified by writing a test program and trying various ways of using CString.
What is going on? Apparently the compiler developers got tired of the endless questions about why sloppily written programs that use CString do not work, and of the accusations that "the compiler is buggy and handles strings incorrectly". So they quietly performed the sacred rite of exorcism and cast the evil out of CString. They made the impossible possible. Namely, the CString class is implemented in a special, tricky way so that it can be passed to functions such as printf and Format.
This is done quite cleverly. Anyone interested can read the source code of the CStringT class and the detailed discussion "Pass CString to printf?" [5]. I will not go into details and will note only one important point. The special implementation of CString is not enough by itself, because in theory passing a non-POD type leads to unpredictable behavior. So the developers of Visual C++, and of Intel C++ along with them, arranged things so that the unpredictable behavior always turns out to be the correct result. :) After all, correct operation of a program is a perfectly valid subset of unpredictable behavior. :)
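The general idea of the trick can be sketched as follows. This is my simplified model, not the actual CStringT source: I assume only that the object's sole data member is a pointer to the characters, so the object is exactly one pointer in size and its bytes are the bytes of that pointer. The names TrickyString and pointer_seen_by_varargs are hypothetical, and the sketch deliberately inspects the bytes with memcpy instead of really passing the object through an ellipsis.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <cwchar>

// Hypothetical model of a "printf-friendly" string class. Copying is
// disabled only to keep the sketch short; a real string class is copyable.
class TrickyString {
public:
    explicit TrickyString(const wchar_t* text) {
        assert(text != nullptr);
        std::size_t len = std::wcslen(text);
        data_ = new wchar_t[len + 1];
        std::wmemcpy(data_, text, len + 1);  // copies the terminator too
    }
    TrickyString(const TrickyString&) = delete;
    TrickyString& operator=(const TrickyString&) = delete;
    ~TrickyString() { delete[] data_; }  // the destructor makes it non-POD
    const wchar_t* c_str() const { return data_; }
private:
    wchar_t* data_;  // the one and only data member
};

static_assert(sizeof(TrickyString) == sizeof(const wchar_t*),
              "the object must be exactly one pointer wide");

// Reinterprets the raw bytes of the object as a character pointer, which is
// what a printf-style function would effectively read for "%s".
const wchar_t* pointer_seen_by_varargs(const TrickyString& s) {
    const wchar_t* p = nullptr;
    std::memcpy(&p, static_cast<const void*>(&s), sizeof p);
    return p;
}
```

In this model the bytes of the object and the pointer returned by c_str() coincide, which is why such an object can "accidentally" work when it is pushed to a printf-style function as is. The language still gives no guarantee here; it works only because the compiler copies the object's bytes as they are.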
And now I am starting to wonder about some strange features of compiler behavior when building 64-bit programs. I suspect that the compiler developers deliberately make a program behave not theoretically but practically (that is, workably) in those simple cases where they recognize a certain pattern. The clearest example is the loop pattern. An example of incorrect code:
size_t n = BigValue;
for (unsigned i = 0; i < n; i++) { ... }
Theoretically, if n is greater than UINT_MAX, an infinite loop should occur. In the Release build, however, it does not, because a 64-bit register is used for the variable "i". Of course, if the code is more complicated, the infinite loop will occur, but at least in some cases the program gets lucky. I wrote more about this in the article "64-bit horse that can count" [6].
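The theoretical behavior is easy to reproduce on a scaled-down model. In the sketch below, which is my own illustration, an 8-bit counter plays the role of "unsigned" and the size_t bound plays the role of n, so the wrap-around happens at 255 instead of UINT_MAX. The function names and the max_steps safety limit are hypothetical; the limit exists only so that the demonstration itself terminates.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Models "for (narrow i = 0; i < n; i++)" with an 8-bit counter.
// Returns true if the loop condition ever becomes false, or false if the
// loop is still running after max_steps iterations.
bool narrow_counter_loop_exits(std::size_t n, std::size_t max_steps) {
    assert(max_steps > 0);
    std::uint8_t i = 0;
    for (std::size_t step = 0; step < max_steps; ++step) {
        if (!(static_cast<std::size_t>(i) < n)) {
            return true;  // the condition i < n failed: normal loop exit
        }
        ++i;  // wraps around from 255 to 0
    }
    return false;  // the counter can never reach n
}

// The fix: a counter of the same width as the bound.
std::size_t count_iterations(std::size_t n) {
    std::size_t iterations = 0;
    for (std::size_t i = 0; i < n; ++i) {
        ++iterations;
    }
    return iterations;
}
```

With a bound of 200 the narrow loop exits normally, while with a bound of 300 the counter wraps before reaching it and the loop never ends by itself. Declaring the counter as size_t removes the problem regardless of what the optimizer does.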
I used to think that such unexpectedly lucky program behavior was due solely to the way release builds are optimized. Now I am not so sure. Perhaps it is a deliberate attempt to make a broken program work, at least sometimes. Of course, I do not know whether the cause is optimization or the care of Big Brother, but it is a good excuse to philosophize. :) And whoever does know is unlikely to tell. :)
I am sure there are other cases where the compiler lends a hand to our lame programs. If you come across something interesting, be sure to tell me.
I wish you bug-free code!
References
1. Alexey Pakhunov's blog. Backward compatibility is serious. http://www.viva64.com/go.php?url=390
2. Alexey Pakhunov's blog. AppCompat. http://www.viva64.com/go.php?url=391
3. Alexey Pakhunov's blog. Is Windows 3.x alive? http://www.viva64.com/go.php?url=392
4. MSDN. CString Operations Relating to C-Style Strings. Topic: Using CString Objects with Variable Argument Functions. http://www.viva64.com/go.php?url=393
5. Discussion on eggheadcafe.com. Pass CString to printf? http://www.viva64.com/go.php?url=394
6. Andrey Karpov. 64-bit horse that can count. http://www.viva64.com/art-1-1-1064884779.html