
How to read declarations in C

Even completely green C programmers have no trouble reading declarations like these:
int foo[5]; // foo is an array of 5 ints
char *foo; // foo is a pointer to char
double foo(); // foo is a function returning double

But as soon as a declaration becomes a bit more complicated, it is hard to say exactly what it declares. For example:
char *(*(**foo[][8])())[];


It turns out that the rules for reading arbitrarily complex declarations are easy to learn, even for novice programmers (although actually using a variable declared this way may be another matter).

Basic and derived types


In addition to the variable name, a declaration consists of exactly one basic type and zero or more derived types, and understanding the difference between the two is the key to reading declarations.

Basic types:
• char
• signed char
• unsigned char
• short
• unsigned short
• int
• unsigned int
• long
• unsigned long
• float
• double
• long double
• void
• struct tag
• union tag
• enum tag
• long long
• unsigned long long
A declaration can contain only one basic type, and it is always on the far left of the expression. Basic types are complemented by derived types, of which C has three:

1) * - pointer to ...
It is denoted by the * symbol, and it is important to understand that a pointer always points to something.

2) [] - array of ...
An array can be unsized ([]) or sized ([10]). The size plays no significant role when reading a declaration, although we usually include it in the description. It should be clear that an array is always an "array of something".

3) () - function returning ...
It is usually written as a pair of parentheses (), but the parentheses may also contain parameter prototypes. The parameter list (if any) does not play a significant role when reading a declaration, and we usually ignore it. Note that the parentheses that denote a function are different from the parentheses used for grouping: grouping parentheses surround the variable name, whereas function parentheses stand to its right. A function makes no sense unless it returns something (when a function is declared with the return type void, we simply pretend that it "returns void").

A derived type always modifies something, be it the basic type or another derived type, and to read a declaration correctly you must always include the preposition (“to”, “of”, “returning”). If you say “pointer” instead of “pointer to” while reading, you will certainly read the declaration incorrectly.
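
As a minimal sketch of the three derived types applied to one basic type (int), the fragment below declares one object of each kind. All names (value, ptr, arr, seven, derived_sum) are invented for illustration and do not come from the article.

```c
#include <assert.h>

static int value = 5;                  /* basic type only: value is an int */
static int *ptr = &value;              /* ptr is a pointer to int */
static int arr[3] = {1, 2, 3};         /* arr is an array of 3 ints */
static int seven(void) { return 7; }   /* seven is a function returning int */

/* Compile-time sanity check: the array holds exactly 3 ints. */
static_assert(sizeof arr == 3 * sizeof(int), "arr is an array of 3 int");

/* Uses each derived type once: dereference, index, call. */
int derived_sum(void)
{
    return *ptr + arr[2] + seven();    /* 5 + 3 + 7 */
}
```

Each expression in derived_sum mirrors its declaration: the operator used to get at the int is the same token that appears in the declarator.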

Operator precedence


Almost every C programmer is familiar with operator precedence tables, which state that (for example) multiplication and division have higher precedence (are performed earlier) than addition and subtraction, and that grouping parentheses are used to change this order. This seems natural for “ordinary” expressions, but the same rules apply to declarations: they are simply type expressions rather than computational ones.
The “array of” [] and “function returning” () operators have higher precedence than “pointer to”, which leads to a fairly simple decoding rule:
Always start with the variable name:

foo is .....

And finish the decoding with the basic type:

... of type int

What lies in between is usually harder to make out, but the rule can be stated as:
go right when you can, go left when you must
Starting with the variable name and following the precedence rules, move right as far as possible, consuming derived-type tokens, until you hit a grouping parenthesis. Then move left to the matching parenthesis.
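
The effect of precedence and of grouping parentheses can be checked with the compiler itself. The following is only a sketch; the names a, b, f, g, one and call_through_g are made up for this illustration.

```c
#include <assert.h>

int *a[3];        /* [] binds first: a is an array of 3 pointers to int */
int (*b)[3];      /* parentheses force * first: b is a pointer to an array of 3 ints */
int *f(void);     /* () binds first: f is a function returning a pointer to int */
int (*g)(void);   /* g is a pointer to a function returning int */

static_assert(sizeof a == 3 * sizeof(int *), "a holds 3 pointers");
static_assert(sizeof *b == 3 * sizeof(int), "b points to a whole array of 3 ints");
static_assert(sizeof b == sizeof(int (*)[3]), "b itself is a single pointer");

static int one(void) { return 1; }

/* g is an object, so it can be assigned a function and called through. */
int call_through_g(void)
{
    g = one;
    return g();
}
```

The only textual difference between a and b (or f and g) is the pair of grouping parentheses, yet the resulting types are quite different.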

A simple example


Let's start with a simple example:
-> long ** foo [7];

Let's work through it step by step, focusing on one or two tokens at a time and treating the tokens we have already dealt with as crossed out.

-> long ** foo [7];

We start with the variable name and end with the main type:
foo is ... of type long

We analyze further:
-> long ** foo [7] ;

At this point the variable name is adjacent to the “array of 7” token on its right and a “pointer to” token on its left. Following the rule, we go right and append “array of 7” to our description:
foo is an array of 7 ... of type long

-> long * * foo [7] ;

There is nothing left on the right, so the nearest token is a “pointer to”. Add it:
foo is an array of 7 pointers to ... of type long

-> long * * foo [7] ;

The next nearest token is another “pointer to”, so we add it as well:
foo is an array of 7 pointers to pointers to a value of type long

And that is all.
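
To see that the reading matches what the compiler thinks, here is a small sketch that declares the same foo locally and walks through it. The helper names foo_length and read_through_foo, and the value stored, are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Number of elements in foo: the "array of 7" part of the reading. */
size_t foo_length(void)
{
    long **foo[7];
    return sizeof foo / sizeof foo[0];
}

/* Follows the reading in order: index the array, then dereference twice. */
long read_through_foo(void)
{
    long value = 123456L;
    long *p = &value;              /* pointer to long */
    long **foo[7] = { &p };        /* element 0 is a pointer to a pointer to long */

    static_assert(sizeof foo[0] == sizeof(long **), "each element is a long **");
    return **foo[0];               /* [] binds tighter than *, so this is **(foo[0]) */
}
```

The expression that reaches the long uses the tokens in the same order as the English description: first [], then * twice.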

A difficult example



To test our skills, let's try to read a very complex declaration that will never occur in real life (in fact, we thought for a long time about how such a declaration could be used). It does show, however, that the rules work even for very complex declarations.

-> char * (* (** foo [] [8]) ()) [];
Every reading starts from the template “variable name ... basic type”:
foo is ... of type char;

-> char * (* (** foo [] [8]) ()) [];
The name is adjacent to a “pointer to” on the left and an “array of” on the right, so we go right:
foo is an array of ... of type char;

-> char * (* (** foo [] [8] ) ()) [];
We could pick either the right or the left neighbouring token, but the rule says to keep moving right as long as there is something adjacent inside the grouping parentheses, so we go right again:
foo is an array of arrays of 8 ... of type char;

-> char * (* (* * foo [] [8] ) ()) [];
We have reached the closing grouping parenthesis and cannot move any further right, so we move left, consuming tokens until we reach the matching opening parenthesis:
foo is an array of arrays of 8 pointers to ... of type char;

-> char * (* ( * * foo [] [8] ) ()) [];
Again we move left and add another “pointer to”:
foo is an array of arrays of 8 pointers to pointers to ... of type char;

-> char * (* (** foo [] [8]) () ) [];
With the “pointer to” added in the previous step we have reached the matching opening parenthesis, so the whole parenthesized group is done and we continue with its neighbours. It is adjoined by a “function returning” on the right and a “pointer to” on the left. We go right:
foo is an array of arrays of 8 pointers to pointers to functions returning ... of type char;

-> char * ( * (** foo [] [8]) () ) [];
We have run into a grouping parenthesis again, so we go back to the left:
foo is an array of arrays of 8 pointers to pointers to functions returning pointers to ... of type char;

-> char * (* (** foo [] [8]) ()) [] ;
Stepping outside the grouping parentheses, we see that the tokens consumed so far have an “array of” on the right and a “pointer to” on the left. The “array of” on the right takes precedence, so we add it:
foo is an array of arrays of 8 pointers to pointers to functions returning pointers to an array of ... of type char;

-> char * (* (** foo [] [8]) ()) [] ;
Finally, we add the last token:
foo is an array of arrays of 8 pointers to pointers to functions returning pointers to an array of pointers to char;

We really do not know how one would use it, but the type description is correct.
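
One way to double-check a reading like this is to rebuild the type one step at a time with typedefs and let the compiler compare the result with the original declaration. This is only a sketch: the typedef names T1 to T7, the objects foo_check and names, and the functions get_names and second_name_initial are all hypothetical.

```c
#include <assert.h>

typedef char *T1[];   /* array of pointers to char */
typedef T1 *T2;       /* pointer to T1 */
typedef T2 T3();      /* function returning T2 */
typedef T3 *T4;       /* pointer to T3 */
typedef T4 *T5;       /* pointer to T4 */
typedef T5 T6[8];     /* array of 8 T5 */
typedef T6 T7[];      /* array of T6 */

/* Two declarations of one object: the compiler rejects this pair
   if the hand-built type T7 is not compatible with the original declaration. */
extern char *(*(**foo_check[][8])())[];
extern T7 foo_check;

static_assert(sizeof(T6) == 8 * sizeof(T5), "one row holds 8 pointers");

static char *names[] = { "alpha", "beta" };
static char *(*get_names(void))[] { return &names; }   /* compatible with T3 */

char second_name_initial(void)
{
    char *(*(*fp)())[] = get_names;             /* T4: pointer to function */
    char *(*(**foo[1][8])())[] = { { &fp } };   /* the declaration from the text, sized 1 x 8 */

    /* Unwind the reading: index twice, dereference twice, call,
       dereference the returned pointer to array, index it, index the string. */
    return (*(**foo[0][0])())[1][0];
}
```

The typedef chain reads in the opposite order to the English description, because each typedef wraps the previous one: T1 is the innermost "array of pointers to char" and T7 is the outermost "array of".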

Abstract declarators



The C standard allows abstract declarators, which are used when a type has to be described without being associated with a variable name. They appear in casts and as the argument of sizeof, and they can look terrifying:

int (* (*) ()) () ;

Naturally, the question is where to start. The answer is: find the place where the variable name would go, and then read it as an ordinary declaration. There is only one such place, and finding it is actually quite simple. Using the syntax rules we already know, the name must be:

• to the right of all “pointer to” tokens
• to the left of all “array of” tokens
• to the left of all “function returning” tokens
• inside all grouping parentheses

Now let's look at the example. The rightmost “pointer to” token sets one boundary and the leftmost “function returning” token sets the other.
int (* (* •) • ()) () ;

The • marks show the two places where a variable name could go, but only the left one satisfies the last condition (inside all grouping parentheses). So what declaration do we end up with? This one:

int (* (* foo) ()) () ;
which our rules read as:
foo is a pointer to a function returning a pointer to a function returning a value of type int
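
The two places where abstract declarators appear, sizeof and casts, can be tried out in a few lines. In this sketch the functions answer, get_answer and call_via_cast and the variable generic are assumptions introduced for the example; only the type of foo comes from the text.

```c
#include <assert.h>

static int answer(void) { return 42; }   /* function returning int */
static int (*get_answer(void))(void)     /* function returning a pointer to such a function */
{
    return answer;
}

/* The abstract declarator is the declaration of foo with the name removed. */
int (*(*foo)())();
static_assert(sizeof(int (*(*)())()) == sizeof foo, "abstract declarator names foo's type");

int call_via_cast(void)
{
    void (*generic)(void) = (void (*)(void))get_answer;   /* type-erased function pointer */

    foo = (int (*(*)())())generic;   /* cast back using the abstract declarator */
    return foo()();                  /* call foo, then call the function it returns */
}
```

The cast is only needed because the pointer was first stored in a generic function-pointer type; assigning get_answer to foo directly would need no cast at all.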

Semantic constraints / Notes


Not all combinations of derived types are allowed. It is possible to write declarations that fit the syntax rules perfectly but are nevertheless invalid (syntactically correct, but semantically wrong). For example:

• An array of functions cannot be created
But you can use an array of pointers to functions.

• A function cannot return a function
But it can return a pointer to a function.

• A function cannot return an array
Again, a function can return a pointer to an array.

• In arrays, only the leftmost [] may be empty
C supports multi-dimensional arrays (for example, foo [1] [2] [3] [4]), which is a very simple data structure. However, when an array has more than one dimension, only the first pair of brackets may be empty. char foo [] and char foo [] [5] are valid, but char foo [5] [] is prohibited.

• The type void is restricted
The type void is a pseudo-type, and it can only be combined with “pointer to” and “function returning”. An “array of void” and plain variables of type void are forbidden (more precisely, impossible).
void * foo; // allowed
void foo (); // allowed
void foo; // prohibited
void foo []; // prohibited
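
The permitted alternatives from the list above can be sketched as follows. The forbidden forms are shown only in comments, since they would not compile; every name here (add, sub, ops, pick_op, table, get_table, grid and the three accessor functions) is invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

static int add(int a, int b) { return a + b; }
static int sub(int a, int b) { return a - b; }

/* Not allowed: int ops[2](int, int);  (array of functions) */
static int (*ops[2])(int, int) = { add, sub };   /* array of 2 pointers to functions */

/* Not allowed: int pick_op(int i)(int, int);  (function returning a function) */
static int (*pick_op(int i))(int, int) { return ops[i]; }

/* Not allowed: int get_table(void)[3];  (function returning an array) */
static int table[3] = { 10, 20, 30 };
static int (*get_table(void))[3] { return &table; }

/* Only the leftmost dimension may be left empty; it is deduced from the initializer. */
static char grid[][5] = { "abcd", "efgh" };
static_assert(sizeof grid == 2 * 5, "grid has 2 rows of 5 chars");

int apply_op(int i, int a, int b)   { return pick_op(i)(a, b); }
int table_entry(size_t i)           { return (*get_table())[i]; }
char grid_cell(size_t r, size_t c)  { return grid[r][c]; }
```

In each case the fix is the same: insert a “pointer to” between the two derived types that may not be combined directly.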

Adding calling-convention types


When developing for the Windows platform, a calling convention is often added to a function declaration. It tells the compiler which mechanism to use to call the function in question, and it must match what the function itself expects. Here is what it looks like:

extern int __cdecl main(int argc, char **argv);

extern BOOL __stdcall DrvQueryDriverInfo(DWORD dwMode, PVOID pBuffer,
DWORD cbBuf, PDWORD pcbNeeded);


Such annotations are very common in Win32 development and are quite easy to understand. More information can be found in the article “Using Win32 calling conventions”.

Things get somewhat more complicated when the calling convention has to be included in a pointer declaration (including a typedef), because the token does not seem to fit the normal pattern. This often comes up when working with the LoadLibrary() and GetProcAddress() API to call a function from a newly loaded library.
It is commonly seen with a typedef:
typedef BOOL (__stdcall *PFNDRVQUERYDRIVERINFO)(
DWORD dwMode,
PVOID pBuffer,
DWORD cbBuf,
PDWORD pcbNeeded
);

...

/* get the function address from the DLL */
pfnDrvQueryDriverInfo = (PFNDRVQUERYDRIVERINFO)
GetProcAddress(hDll, "DrvQueryDriverInfo");


The calling convention is an attribute of the function, not of the pointer, so although it is written in front of the * inside the grouping parentheses, it is read as part of the function:

BOOL (__stdcall * foo) (...);

Read:
foo is a pointer to a __stdcall function returning BOOL.
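
The same pattern can be tried without a DLL. The sketch below is not Win32 code from the article: DEMO_CALL, PFNDEMOADD, demo_add and call_demo_add are hypothetical names, and the macro expands to nothing outside Windows so that the example builds with any C compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: DEMO_CALL is a hypothetical macro so the sketch also builds off Windows. */
#if defined(_WIN32)
#  define DEMO_CALL __stdcall
#else
#  define DEMO_CALL
#endif

/* PFNDEMOADD is a pointer to a DEMO_CALL function returning int. */
typedef int (DEMO_CALL *PFNDEMOADD)(int a, int b);

/* Stand-in for a function that would normally come from GetProcAddress(). */
static int DEMO_CALL demo_add(int a, int b) { return a + b; }

int call_demo_add(int a, int b)
{
    PFNDEMOADD pfn = demo_add;   /* conventions match, so no cast is needed */

    assert(pfn != NULL);         /* a real lookup can fail and return NULL */
    return pfn(a, b);
}
```

With a real GetProcAddress() call the cast to the typedef is required, and it is the programmer's job to make sure the convention in the typedef matches the one the library function was built with.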

P.S. Please report any inaccuracies by private message.

Source: https://habr.com/ru/post/116255/

