On the thirtieth anniversary of the first C++ compiler: looking for errors in Cfront

Bjarne Stroustrup
Authors: Andrei Karpov, Bjarne Stroustrup.

Cfront is a C++ compiler that appeared in 1983 and was developed by Bjarne Stroustrup. At the time, the language was known as "C with Classes". Cfront had a full parser and symbol tables, and built a tree for each class, function, and so on. Cfront was based on CPre. It determined the development of the language until about 1990. Many obscure corners of C++ are related to the limitations of the Cfront implementation. The reason is that Cfront translated C++ into C. In short, Cfront is a sacred artifact for any C++ programmer, and I simply could not pass by without checking this project.

Introduction


The idea to check Cfront came from a note dedicated to the 30th anniversary of the first release of this compiler: "30 YEARS OF C++". We contacted Bjarne Stroustrup to get the Cfront source code. For some reason I expected that obtaining it would turn into a whole saga. It turned out to be simple: the sources are publicly available to anyone at http://www.softwarepreservation.org/projects/c_plus_plus/.

The first commercial version of Cfront, released in October 1985, was chosen for the check. After all, it is the one that has just turned 30.
Bjarne warned us that the check might not be so simple:

M machine * M M M M M M M M M M 1MB It is also a part of my full time job.

And indeed, it turned out to be impossible to simply take the project and check it. For example, in those days the class name was separated from the function name not by a double colon (::) but by a plain period (.). Example:
inline Pptr type.addrof() { return new ptr(PTR,this,0); } 

The PVS-Studio analyzer was not ready for this. I had to bring in a colleague, a student, who went through the sources manually and fixed them. It helped, though not completely. In many places PVS-Studio was still baffled and refused to analyze the code. Nevertheless, we somehow managed to check the project.

I must say right away that I did not find anything grand. There were no serious bugs, I think for three reasons:
  1. The project is small: only 100 KLOC in 143 files.
  2. The code is of high quality.
  3. The PVS-Studio analyzer still could not parse everything.

“Talk is cheap. Show me the code.” Linus Torvalds


But enough talk. Our readers have gathered here to see at least one mistake made by Stroustrup himself. Let's look at the code.

First fragment
    typedef class classdef * Pclass;

    #define PERM(p) p->permanent=1

    Pexpr expr.typ(Ptable tbl)
    {
      ....
      Pclass cl;
      ....
      cl = (Pclass) nn->tp;
      PERM(cl);
      if (cl == 0) error('i',"%k %s'sT missing",CLASS,s);
      ....
    }

PVS-Studio warning: V595 The 'cl' pointer was used before it was verified against nullptr. Check lines: 927, 928. expr.c 927

The 'cl' pointer can be NULL, as the check if (cl == 0) shows. The trouble is that the pointer is dereferenced before that check, inside the PERM macro.

That is, if we expand the macro, we get:

    cl = (Pclass) nn->tp;
    cl->permanent=1;
    if (cl == 0) error('i',"%k %s'sT missing",CLASS,s);
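The usual remedy for this class of defect is to move the check in front of the first dereference. The sketch below is not Cfront code: `classdef` is reduced to the single field that PERM touches, and `mark_permanent` is a hypothetical helper that returns `false` where the original would report an internal error.

```cpp
#include <cassert>

// Simplified stand-in for Cfront's classdef; only the field used by PERM.
struct classdef {
    int permanent = 0;
};
typedef classdef* Pclass;

// Hypothetical helper: validates the pointer BEFORE dereferencing it.
// Returns false where the original code would call error().
bool mark_permanent(Pclass cl) {
    if (cl == nullptr) {
        return false;      // the check comes first
    }
    cl->permanent = 1;     // safe: cl is known to be non-null here
    return true;
}

// Usage example with self-checks.
void demo_mark_permanent() {
    classdef c;
    assert(mark_permanent(&c) && c.permanent == 1);
    assert(!mark_permanent(nullptr));
}
```

With this ordering the null check is no longer dead code: a null `cl` is reported instead of being dereferenced.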

Second fragment

The same story: the pointer is dereferenced first and only then checked:
    Pname name.normalize(Pbase b, Pblock bl, bit cast)
    {
      ....
      Pname n;
      Pname nn;
      TOK stc = b->b_sto;
      bit tpdf = b->b_typedef;
      bit inli = b->b_inline;
      bit virt = b->b_virtual;
      Pfct f;
      Pname nx;
      if (b == 0) error('i',"%d->N.normalize(0)",this);
      ....
    }

PVS-Studio warning: V595 The 'b' pointer was used before it was verified against nullptr. Check lines: 608, 615. norm.c 608

Third fragment
    int error(int t, loc* lc, char* s ...)
    {
      ....
      if (in_error++)
        if (t!='t' || 4<in_error) {
          fprintf(stderr,"\nUPS!, error while handling error\n");
          ext(13);
        }
      else if (t == 't')
        t = 'i';
      ....
    }

PVS-Studio warning: V563 It is possible that this 'else' branch must apply to the previous 'if' statement. error.c 164

I don't know whether this is an actual error, but the code is formatted misleadingly. An 'else' always binds to the nearest 'if', so the code does not work the way it looks. If you format it correctly, you get:

    if (in_error++)
      if (t!='t' || 4<in_error) {
        fprintf(stderr,"\nUPS!, error while handling error\n");
        ext(13);
      }
      else if (t == 't')
        t = 'i';
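The binding rule is easy to check on a toy example. In the sketch below the function names and return values are invented for illustration: `misleading` repeats the structure of the `error()` fragment, while `braced` uses explicit braces so that the `else` really belongs to the outer `if`. Some compilers will warn about the first function, which is exactly the point.

```cpp
#include <cassert>

// Same structure as the Cfront fragment: 'else' is indented under the
// outer 'if', but the compiler attaches it to the nearest (inner) 'if'.
int misleading(bool outer, bool inner) {
    int result = 0;
    if (outer)
        if (inner)
            result = 1;
    else               // actually pairs with 'if (inner)'
        result = 2;
    return result;
}

// Explicit braces make 'else' belong to the outer 'if'.
int braced(bool outer, bool inner) {
    int result = 0;
    if (outer) {
        if (inner)
            result = 1;
    } else {
        result = 2;
    }
    return result;
}

// Usage example with self-checks: the two functions disagree.
void demo_dangling_else() {
    assert(misleading(true, false) == 2);
    assert(misleading(false, true) == 0);
    assert(braced(true, false) == 0);
    assert(braced(false, true) == 2);
}
```

Bracing every nested `if` removes the ambiguity regardless of indentation.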

Fourth fragment
    extern genericerror(int n, char* s)
    {
      fprintf(stderr,"%s\n",
              s?s:"error in generic library function",n);
      abort(111);
      return 0;
    };

PVS-Studio warning: V576 Incorrect format. A different number of actual arguments is expected while calling the 'fprintf' function. Expected: 3. Present: 4. generic.c 8

Note the format string: "%s\n". Only a string is printed, and the variable 'n' is left unused.
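For comparison, here is a sketch in which every argument has a matching specifier. `format_generic_error` is a hypothetical helper, and the "(code %d)" wording is purely an assumption: what the author originally intended to do with `n` is not known.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: builds the message so that every argument has a
// matching format specifier. The "(code %d)" wording is an assumption.
std::string format_generic_error(int n, const char* s) {
    char buf[256];
    std::snprintf(buf, sizeof buf, "%s (code %d)\n",
                  s ? s : "error in generic library function", n);
    return std::string(buf);
}

// Usage example with self-checks.
void demo_format_generic_error() {
    assert(format_generic_error(3, "bad index") == "bad index (code 3)\n");
    assert(format_generic_error(7, nullptr) ==
           "error in generic library function (code 7)\n");
}
```

Returning a string instead of writing to stderr keeps the sketch easy to test; the formatting call itself is the part that matters.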

Other

Unfortunately (or fortunately), I cannot show anything else that looks like a real mistake. The analyzer issued a number of warnings on code that deserves attention but is not dangerous. For example, the analyzer does not like the names of the following global variables:
 extern int Nspy, Nn, Nbt, Nt, Ne, Ns, Nstr, Nc, Nl; 

PVS-Studio warning: V707 Giving short names for global variables. It is suggested to rename 'Nn' variable. cfront.h 50

Another example: the fprintf() function is called with the "%i" specifier to print pointer values. Modern versions of the language use "%p" for that. But as far as I understand, "%p" did not exist 30 years ago, so the code was perfectly correct for its time.
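A minimal sketch of the modern spelling follows. `pointer_to_string` is a hypothetical helper; the only firm facts it relies on are that "%p" expects a `void*` argument and that the resulting text is implementation-defined, so the example makes no claim about what the output looks like.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: formats a pointer with %p.
// %p expects a void* argument; the produced text is implementation-defined.
std::string pointer_to_string(void* p) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%p", p);
    return std::string(buf);
}

// Usage example with self-checks that do not depend on the exact format.
void demo_pointer_to_string() {
    int x = 0;
    assert(!pointer_to_string(&x).empty());
    assert(pointer_to_string(&x) == pointer_to_string(&x));
}
```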

Interesting observations


The 'this' pointer

I noticed that 'this' used to be handled far more boldly and freely than it is today. A couple of examples:
    expr.expr(TOK ba, Pexpr a, Pexpr b)
    {
      register Pexpr p;

      if (this) goto ret;
      ....
      this = p;
      ....
    }

    inline toknode.~toknode()
    {
      next = free_toks;
      free_toks = this;
      this = 0;
    }

As you can see, in those days changing the value of 'this' was not considered forbidden. Today it is not only forbidden to modify the pointer; even comparing 'this' with nullptr has lost its meaning.
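The `toknode` destructor above pushes the object onto the `free_toks` list and then zeroes `this`. My understanding is that in early C++ assigning to `this` was the way a class took over its own memory management; treat that as an assumption rather than a statement about Cfront internals. A rough modern counterpart is a class-specific `operator new` and `operator delete`. The `toknode` below is a simplified stand-in with invented fields, and `FreeNode` and `free_list` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Simplified stand-in for Cfront's toknode; the fields are assumptions.
struct toknode final {
    int tok = 0;
    toknode* next = nullptr;

    // Link stored inside released storage while it sits on the free list.
    struct FreeNode { FreeNode* next; };
    static FreeNode* free_list;

    // Reuse released storage if any is available, otherwise allocate.
    static void* operator new(std::size_t size) {
        if (free_list != nullptr) {
            void* mem = free_list;
            free_list = free_list->next;
            return mem;
        }
        return ::operator new(size);
    }

    // Instead of "free_toks = this; this = 0;" the storage is pushed
    // onto the free list here and never returned to the system.
    static void operator delete(void* mem) noexcept {
        free_list = new (mem) FreeNode{free_list};
    }
};

toknode::FreeNode* toknode::free_list = nullptr;

// Usage example: storage released by delete is handed out again.
void demo_toknode() {
    toknode* a = new toknode;
    void* addr = a;
    delete a;
    toknode* b = new toknode;
    assert(static_cast<void*>(b) == addr);
    delete b;
}
```

The effect is the same recycling of nodes, but the language now gives it a dedicated mechanism, so 'this' never has to be touched.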

This is the place for paranoia

As they say, you can never be sure of anything. I liked this code snippet that I came across:
    /* this is the place for paranoia */
    if (this == 0) error('i',"0->Cdef.dcl(%d)",tbl);
    if (base != CLASS) error('i',"Cdef.dcl(%d)",base);
    if (cname == 0) error('i',"unNdC");
    if (cname->tp != this) error('i',"badCdef");
    if (tbl == 0) error('i',"Cdef.dcl(%n,0)",cname);
    if (tbl->base != TABLE) error('i',"Cdef.dcl(%n,tbl=%d)",
                                  cname,tbl->base);

Commentary by Bjarne Stroustrup



Conclusion

It is hard to overestimate the significance of Cfront. It influenced the development of a whole branch of programming and gave the world the ever-living, ever-evolving C++ language. I am grateful to Bjarne for all the work he has done in creating C++. Thank you. I, in turn, was glad to at least "stand next to" Cfront for a while.

Thanks to all our readers, and I wish you fewer bugs.

Source: https://habr.com/ru/post/270191/
