Authors: Andrei Karpov, Bjarne Stroustrup.
Cfront is a C++ compiler that has been around since 1983 and was developed by Bjarne Stroustrup. At the time, the language was still known as "C with Classes". Cfront had a full parser and symbol tables, and it built a tree for each class, function, and so on. Cfront was based on Cpre. Cfront determined the development of the language until about 1990. Many obscure corners of C++ trace back to limitations of the Cfront implementation, because Cfront translated C++ into C. In short, Cfront is a sacred artifact for any C++ programmer, and I just could not pass by without checking this project.
Introduction
The idea to check Cfront came from a note dedicated to the 30th anniversary of the first release of this compiler: "30 YEARS OF C++". We contacted Bjarne Stroustrup to get the Cfront source code. For some reason I thought that getting it would be a whole story. It turned out to be simple. The sources are publicly available at http://www.softwarepreservation.org/projects/c_plus_plus/ and anyone can download them.
The first commercial version of Cfront, released in October 1985, was chosen for the check. After all, it is the one that turned 30.
Bjarne warned us that the check might not be so simple: this is software written for a machine with 1MB of memory, and it was developed as only a part of his full-time job.
And he was right. We could not simply take the project and check it. For example, in those days a class name was separated from a function name not by a double colon (::) but by a single period (.). Example:
inline Pptr type.addrof() { return new ptr(PTR,this,0); }
The PVS-Studio analyzer was not ready for this. I had to bring in a colleague, a student, who went through the sources manually and corrected such places. It helped, though not completely. In many places PVS-Studio still gets confused and refuses to analyze the code. Nevertheless, we somehow managed to check the project.
I must say that I did not find anything grand. There were no serious bugs, I think for 3 reasons:
- The project is small. Only 100 KLOC in 143 files.
- The code is of high quality.
- The PVS-Studio analyzer still could not check everything.
"Talk is cheap. Show me the code." Linus Torvalds
But enough words. Our readers have gathered here to see at least one mistake made by Stroustrup himself. Let's look at the code.
First fragment
typedef class classdef * Pclass;
#define PERM(p) p->permanent=1

Pexpr expr.typ(Ptable tbl)
{
  ....
  Pclass cl;
  ....
  cl = (Pclass) nn->tp;
  PERM(cl);
  if (cl == 0) error('i',"%k %s'sT missing",CLASS,s);
  ....
}
PVS-Studio warning: V595 The 'cl' pointer was used before it was verified against nullptr. Check lines: 927, 928. expr.c 927
The 'cl' pointer can be NULL, as the check if (cl == 0) shows. The trouble is that the pointer is dereferenced before this check, inside the PERM macro.
That is, if we expand the macro, we get:
cl = (Pclass) nn->tp;
cl->permanent=1;
if (cl == 0) error('i',"%k %s'sT missing",CLASS,s);
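The usual cure for this kind of warning is simply to move the check in front of the first dereference. Here is a minimal sketch in modern C++. The names ClassDef and mark_permanent are hypothetical stand-ins of mine, not Cfront code, and returning false is just one possible way to report the problem.

```cpp
#include <cassert>

// Hypothetical stand-in for Cfront's classdef; only the flag matters here.
struct ClassDef {
    int permanent = 0;
};

// Marks the class as permanent, but only after the null check.
// Returns false instead of dereferencing a null pointer.
bool mark_permanent(ClassDef* cl) {
    if (cl == nullptr) {
        return false;      // report the problem before touching *cl
    }
    cl->permanent = 1;     // safe: cl was verified above
    return true;
}
```

With this order of operations, a call such as mark_permanent(nullptr) reports failure instead of writing through a null pointer.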
Second fragment
The same story. The pointer is dereferenced first and only then checked:
Pname name.normalize(Pbase b, Pblock bl, bit cast)
{
  ....
  Pname n;
  Pname nn;
  TOK stc = b->b_sto;
  bit tpdf = b->b_typedef;
  bit inli = b->b_inline;
  bit virt = b->b_virtual;
  Pfct f;
  Pname nx;
  if (b == 0) error('i',"%d->N.normalize(0)",this);
  ....
}
PVS-Studio warning: V595 The 'b' pointer was used before it was verified against nullptr. Check lines: 608, 615. norm.c 608
Third fragment
int error(int t, loc* lc, char* s ...)
{
  ....
  if (in_error++)
    if (t!='t' || 4<in_error) {
      fprintf(stderr,"\nUPS!, error while handling error\n");
      ext(13);
    }
  else if (t == 't')
    t = 'i';
  ....
}
PVS-Studio warning: V563 It is possible that this 'else' branch must apply to the previous 'if' statement. error.c 164
I don't know whether there is an error here or not, but the code is formatted misleadingly. An 'else' belongs to the nearest 'if', so the code does not work the way it looks. If you format it correctly, you get:
if (in_error++)
  if (t!='t' || 4<in_error) {
    fprintf(stderr,"\nUPS!, error while handling error\n");
    ext(13);
  }
  else if (t == 't')
    t = 'i';
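To see that the two readings really behave differently, here is a small sketch with explicit braces. The function names and the 'fatal' flag are my own inventions for illustration; the original prints a message and calls ext(13) at that point.

```cpp
#include <cassert>

// How the compiler reads the original: 'else' pairs with the inner 'if'.
// Returns the (possibly changed) error class; 'fatal' is set when the
// "error while handling error" path is taken.
char as_parsed(int& in_error, char t, bool& fatal) {
    fatal = false;
    if (in_error++) {
        if (t != 't' || 4 < in_error) {
            fatal = true;
        } else if (t == 't') {
            t = 'i';
        }
    }
    return t;
}

// What the indentation suggested: 'else' pairs with the outer 'if'.
char as_indented(int& in_error, char t, bool& fatal) {
    fatal = false;
    if (in_error++) {
        if (t != 't' || 4 < in_error) {
            fatal = true;
        }
    } else if (t == 't') {
        t = 'i';
    }
    return t;
}
```

For the first error of class 't' (in_error equal to 0), as_parsed leaves 't' untouched while as_indented turns it into 'i'. For a nested 't' error the roles are swapped. Writing the braces explicitly removes the ambiguity for the reader.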
Fourth fragment
extern genericerror(int n, char* s)
{
  fprintf(stderr,"%s\n", s?s:"error in generic library function",n);
  abort(111);
  return 0;
};
PVS-Studio warning: V576 Incorrect format. A different number of actual arguments is expected while calling the 'fprintf' function. Expected: 3. Present: 4. generic.c 8
Note the format string: "%s\n". Only a string is printed, and the variable 'n' is left unused.
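I do not know whether the author meant to print 'n' at all, so the following is only one possible fix: give every argument a matching specifier. The function name format_generic_error and the message layout are hypothetical.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical rewrite: both arguments now have a matching specifier.
std::string format_generic_error(int n, const char* s) {
    char buf[128];
    std::snprintf(buf, sizeof buf, "%s (code %d)",
                  s ? s : "error in generic library function", n);
    return std::string(buf);
}
```

The other equally valid fix would be to drop 'n' from the argument list if it was never supposed to be printed.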
Other
Unfortunately (or fortunately), I can't show anything else that looks like a real mistake. The analyzer issued a number of other warnings for code that deserves attention but is not dangerous. For example, the analyzer does not like the names of the following global variables:
extern int Nspy, Nn, Nbt, Nt, Ne, Ns, Nstr, Nc, Nl;
PVS-Studio warning: V707 Giving short names for global variables. It is suggested to rename 'Nn' variable. cfront.h 50
Or, for example, the fprintf() function is called with the "%i" specifier to print pointer values. The modern version of the language has "%p" for this. But as I understand it, 30 years ago there was no "%p", and the code is absolutely correct.
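For comparison, this is roughly how a pointer is printed today. A small sketch with a hypothetical helper name, pointer_to_text; the exact text produced by "%p" is implementation-defined, so I do not show any expected output.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Formats a pointer with %p, which expects an argument of type void*.
// The resulting text is implementation-defined.
std::string pointer_to_text(void* p) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%p", p);
    return std::string(buf);
}
```

Passing a pointer to "%i", as the old code does, is not portable on platforms where a pointer is wider than an int.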
Interesting observations
The 'this' pointer
I noticed that in the old days 'this' was handled far more boldly and roughly than it is now. A couple of examples on this topic:
expr.expr(TOK ba, Pexpr a, Pexpr b)
{
  register Pexpr p;
  if (this) goto ret;
  ....
  this = p;
  ....
}

inline toknode.~toknode()
{
  next = free_toks;
  free_toks = this;
  this = 0;
}
As you can see, in those days changing the value of 'this' was not considered something forbidden. Now it is not only forbidden to change the pointer, but even comparing 'this' with nullptr has lost its meaning.
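As far as I know, assigning to 'this' was the early way for a class to manage its own memory, and class-specific operator new and operator delete later took over that role. Below is a sketch of the free-list idea from the toknode destructor written in that later style. TokNode and FreeLink are hypothetical names of mine, not Cfront code, and the sketch is not thread-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Link stored inside a released node while it waits on the free list.
struct FreeLink {
    FreeLink* next;
};

// Hypothetical token node (not Cfront's toknode) that recycles storage.
struct TokNode final {
    int tok = 0;
    TokNode* next = nullptr;

    static FreeLink* free_list;   // released nodes available for reuse

    static void* operator new(std::size_t size) {
        if (free_list != nullptr && size == sizeof(TokNode)) {
            FreeLink* link = free_list;   // take the most recently freed node
            free_list = link->next;
            return link;
        }
        return ::operator new(size);      // free list empty: use the heap
    }

    static void operator delete(void* p) noexcept {
        if (p == nullptr) return;
        // Keep the storage (it is never returned to the heap in this
        // sketch) and push it onto the free list.
        free_list = ::new (p) FreeLink{free_list};
    }
};

static_assert(sizeof(TokNode) >= sizeof(FreeLink),
              "a released node must be able to hold a FreeLink");

FreeLink* TokNode::free_list = nullptr;
```

With this in place, the node created by new TokNode right after a delete reuses the storage of the node that was just released, which is what the old destructor achieved by hand.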
A place for paranoia
As they say, you can't be sure of anything. I liked this snippet of code that I came across:
if (this == 0) error('i',"0->Cdef.dcl(%d)",tbl);
if (base != CLASS) error('i',"Cdef.dcl(%d)",base);
if (cname == 0) error('i',"unNdC");
if (cname->tp != this) error('i',"badCdef");
if (tbl == 0) error('i',"Cdef.dcl(%n,0)",cname);
if (tbl->base != TABLE) error('i',"Cdef.dcl(%n,tbl=%d)",
                              cname,tbl->base);
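The same habit carries over to modern code without any tricks around 'this'. Here is a minimal sketch of such "paranoid" precondition checks. All the names (Kind, Table, ClassDefNode, internal_error, declare) are hypothetical, and throwing an exception is my assumption for the sketch; it is not how Cfront's error() behaves.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical node tags and types; the values are illustrative only.
enum Kind { CLASS = 1, TABLE = 2 };
struct Table { Kind base; };
struct ClassDefNode { Kind base; };

// Reports an "impossible" state. Throwing is an assumption of this sketch.
[[noreturn]] void internal_error(const std::string& what) {
    throw std::logic_error("internal error: " + what);
}

// Checks every precondition before doing any real work.
void declare(const ClassDefNode* def, const Table* tbl) {
    if (def == nullptr)     internal_error("null class definition");
    if (def->base != CLASS) internal_error("node is not a class");
    if (tbl == nullptr)     internal_error("null table");
    if (tbl->base != TABLE) internal_error("node is not a table");
    // ... the actual declaration processing would go here ...
}
```

Each check comes before the first use of the pointer it guards, so a corrupted argument is reported instead of silently dereferenced.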
Commentary by Bjarne Stroustrup
- Cfront was based on Cpre, but completely rewritten. There is not a single line from Cpre in the Cfront code.
- The use-before-test-of-0 bugs (a pointer used before it is checked against 0) are, of course, bad. Curiously, though, the configuration I mostly worked on (a DEC machine running Research Unix) had page zero write-protected, so this bug could not have been triggered without being detected.
- The if-then-else bug (if it really is a bug) is an odd one. I looked at the source code: this is not just a typo but an actual error. Interestingly, though, it does not affect the result in any way: the only difference is a slightly different error message printed before termination. No wonder I did not notice it.
- Yes, I should have used more readable names. Initially I did not expect other people to maintain the program for many years (and I am also a poor typist).
- Yes, there was no %p specifier at that time.
- Yes, the rules for 'this' have changed.
- The paranoid test is in the main loop of the compiler. My reasoning was that if something went wrong with the software or the hardware, one of these tests would fail. At least once it caught the effect of a bug in the code generator that was used to build Cfront. I believe that all serious applications should have such "paranoid tests" to catch "impossible" errors.
Conclusion
The value of Cfront is hard to overestimate. It influenced the development of a whole branch of programming and gave the world the ever-living, ever-evolving C++ language. I thank Bjarne for all the work he has done in creating C++. Thank you. For my part, I was pleased to at least "stand side by side" with Cfront.
Thanks to all readers, and I wish you fewer bugs.