📜 ⬆️ ⬇️

GridStack - A practical example of flex + bison

Recently , several articles devoted to the grammatical analysis of expressions have appeared on Habré.
And that's great! In my humble opinion, every programmer should at least once in his life write an expression parsing. I will try and make my contribution to the common cause.

There are many methods of analysis (I recommend the following review by Dick Grune, Ceriel JH Jacobs - Parsing Techniques: A Practical Guide, ISBN 0-13-651431-6 ). Moreover, implementations of methods range from fully manual to using automated generators, such as bison, antlr, lemon, and others.
While manual writing of lexical and syntactic (I will call from lexer and parser) allows you to achieve maximum speed and control (especially over errors and ways to overcome them), the use of generators allows you to focus directly on the task, facilitates the modification of grammar and saves time. The ability to own such tools allows more often to resort to DSL (Domain Specific Language) and generally see the possibility of their use.

I want to give an example of using bison (parser) and flex (lexer) in real life: from the origin of the problem, to its solution.
')

There is a wonderful program GridMove , which allows you to organize application windows by any kind of binding grid. Binding grids are specified in a regular ini-file, in which the trigger areas and the corresponding window sizes and positions are indicated. Together, many remarkable meshes have been developed. But there was a problem: they didn’t fit my layouts, creating several grids manually, I wanted to somehow automate the process.

After analyzing the grids I needed, I realized that they can be described by a simple recursive structure.
A region is either a summary window, or a vertical region, or a horizontal region. The vertical region consists of regions that have the same width and divide the height in the specified shares. The horizontal region, respectively, consists of regions that have the same height and divide the width in the specified shares.

Consider a simple example.
window splitting example
The entire monitor is represented as a horizontal region, which consists of subregions A and B. Region A is a vertical region of subregions C and D. Region C is simply a window that is no longer divided. Region D is a horizontal region of three simple windows. Region B is a vertical region of three simple windows.
This can all be described as follows on the DSL invented by me:
Monitor 1
HStack # All Window
(
VStack 3 # Region A
(
Window 2 # Region C
HStack # Region D
(
Window # Region E
Window # Region F
Window # Region G
)
)
VStack # Region B
(
Window # Region H
Window # Region I
Window # Region J
)
)

Window, HStack, VStack is a description of the regions. Numbers in the region - relative weight, the proportion of the region in the superregion (default == 1). If the number is accompanied by the “!” Sign, then this is a hard-coded pixel size.

When I realized that the arrangement of the windows on the screen can be described as follows, I scribbled something above on a piece of paper, only in Russian. Then I decided that it would be more convenient for me to create a DSL, disassemble it and process it.
Let's build a grammar. Non-terminal symbols are italicized.
start :: = monitor_list
monitor_list :: = monitor | monitor monitor_list
monitor :: = MONITOR DIMENSION region
region_list :: = region | region region_list
region :: = window | hstack | vstack
window :: = WINDOW DIMENSION | WINDOW FIX_DIMENTION | WINDOW
hstack :: = HSTACK OPEN region_list CLOSE | HSTACK DIMENSION OPEN region_list CLOSE | HSTACK FIX_DIMENTION OPEN region_list CLOSE
vstack :: = VSTACK OPEN region_list CLOSE | VSTACK DIMENSION OPEN region_list CLOSE | VSTACK FIX_DIMENTION OPEN region_list CLOSE

There are very few terminal characters for which you need to build a lexical analyzer: MONITOR, HSTACK, VSTACK, WINDOW, OPEN, CLOSE, DIMENSION and FIX_DIMENTION. For such a case, you can, of course, write a lexical analyzer manually, but even for such a case it is easier to use flex. I will provide the finished GridStack.lex file. Please do not pay special attention to how the case-insensitive vocabulary is implemented. It was possible to call flex with the -i option , but I did case-insensitively that way. I started the terminals for the brackets, as I was not sure about the final format.
/*<br/>
* GridStack.lex<br/>
*/
<br/>
<br/>
% { <br/>
#include <stdio.h> <br/>
#include <string.h> <br/>
#include <malloc.h> <br/>
#include "common.h" <br/>
#include "GridStack.tab.h" <br/>
/*"GridStack.tab.h" */ <br/>
/* yylval, */ <br/>
/* bison */ <br/>
<br/>
<br/>
#undef ECHO <br/>
#define ECHO <br/>
% } <br/>
<br/>
digit [ 0 - 9 ] <br/>
<br/>
int_constant { digit } + <br/>
<br/>
% option noyywrap <--- yywrap <br/>
<br/>
%% <br/>
[ Mm ] [ Oo ] [ Nn ] [ Ii ] [ Tt ] [ Oo ] [ Rr ] { <br/>
yylval. IntVal = MONITOR;<br/>
return MONITOR;<br/>
} <br/>
[ Hh ] [ Ss ] [ Tt ] [ Aa ] [ Cc ] [ Kk ] { <br/>
yylval. IntVal = HSTACK;<br/>
return HSTACK;<br/>
} <br/>
[ Vv ] [ Ss ] [ Tt ] [ Aa ] [ Cc ] [ Kk ] { <br/>
yylval. IntVal = VSTACK;<br/>
return VSTACK;<br/>
} <br/>
[ Ww ] [ Ii ] [ Nn ] [ Dd ] [ Oo ] [ Ww ] { <br/>
yylval. IntVal = WINDOW;<br/>
return WINDOW;<br/>
} <br/>
{ int_constant } ! { <br/>
sscanf ( yytext , "%d!" ,& yylval. IntVal ) ;<br/>
return FIX_DIMENTION;<br/>
} <br/>
{ int_constant } { <br/>
sscanf ( yytext , "%d" ,& yylval. IntVal ) ;<br/>
return DIMENSION;<br/>
} <br/>
\ ( { <br/>
yylval. IntVal = OPEN;<br/>
return OPEN;<br/>
} <br/>
\ ) { <br/>
yylval. IntVal = CLOSE;<br/>
return CLOSE;<br/>
} <br/>
#.*$ /* */ <br/>
[ \n\r\t ] + /* */ <br/>
. { <br/>
return * yytext;<br/>
} <br/>
%% <br/>
<br/>


When writing a parser, there are usually two approaches:
  1. The implementation of all logic immediately in place.
  2. Data collection and subsequent processing. In this case, the structure is usually built called AST (Abstract Syntax Tree).

I am a supporter of the second method, so all I did was collect all the data. An additional advantage is that now you can bring the code entirely, without clearing the excess logic of calculating the dimensions of the regions.
Here is a file GridStack.y with comments.

% {
#include <stdio.h>
#include <malloc.h>
#include "common.h"
% }
% union / * This is the definition of the yylval join type, which bison and flex * /
{ / * used to transfer the values ​​of tokens. And bison for data transfer * /
int intVal; / * between the rules. Field names are types that can be assigned * /
TRegion * Region; / * tokens and nonterminals. * /
TListCage * RegionList; / * Below, the type of tokens is set with% token, and with% type - * /
TMonitor * Monitor; / * non-terminals. When used in rules of structures of the form $ 1, .., n and $$ * /
} / * the corresponding yylval fields are used. * /

% token < IntVal > MONITOR WINDOW HSTACK VSTACK OPEN CLOSE
% token < IntVal > DIMENSION FIX_DIMENTION
% type < Region > window hstack vstack region
% type < RegionList > region_list
% type < Monitor > monitor monitor_list

%%
start : monitor_list
{
Monitors = $ 1 ;
}
;
monitor_list : monitor
{
$$ = $ 1 ;
}
| monitor monitor_list
{
$ 1 -> Next = $ 2 ;
$$ = $ 1 ;
}
;
monitor : MONITOR DIMENSION region
{
TMonitor * Monitor = ( TMonitor * ) malloc ( sizeof ( TMonitor ) ) ;
Monitor -> Region = $ 3 ;
Monitor -> Monitor = $ 2 ;
Monitor -> Next = NULL ;
$$ = Monitor;
}
;
region_list : region
{
TListCage * Cage = ( TListCage * ) malloc ( sizeof ( TListCage ) ) ;
Cage -> Region = $ 1 ;
Cage -> Next = NULL ;
$$ = Cage;
}
| region region_list
{
TListCage * Cage = ( TListCage * ) malloc ( sizeof ( TListCage ) ) ;
Cage -> Region = $ 1 ;
Cage -> Next = $ 2 ;
$$ = Cage;
}
;
region : window
| hstack
| vstack
;
window : WINDOW DIMENSION
{
TRegion * Window = ( TRegion * ) malloc ( sizeof ( TRegion ) ) ;
Window -> RegionType = rtWindow;
Window -> isFixed = 0 ;
Window -> Dimension = $ 2 ;
Window -> RegionList = NULL ;
$$ = Window;
}
| WINDOW FIX_DIMENTION
{
TRegion * Window = ( TRegion * ) malloc ( sizeof ( TRegion ) ) ;
Window -> RegionType = rtWindow;
Window -> isFixed = 1 ;
Window -> Dimension = $ 2 ;
Window -> RegionList = NULL ;
$$ = Window;
}
| WINDOW
{
TRegion * Window = ( TRegion * ) malloc ( sizeof ( TRegion ) ) ;
Window -> RegionType = rtWindow;
Window -> isFixed = 0 ;
Window -> Dimension = 1 ;
Window -> RegionList = NULL ;
$$ = Window;
}
;
hstack : HSTACK OPEN region_list CLOSE
{
TRegion * Window = ( TRegion * ) malloc ( sizeof ( TRegion ) ) ;
Window -> RegionType = rtHorizontal;
Window -> isFixed = 0 ;
Window -> Dimension = 1 ;
Window -> RegionList = $ 3 ;
$$ = Window;
}
| HSTACK DIMENSION OPEN region_list CLOSE
{
TRegion * Window = ( TRegion * ) malloc ( sizeof ( TRegion ) ) ;
Window -> RegionType = rtHorizontal;
Window -> isFixed = 0 ;
Window -> Dimension = $ 2 ;
Window -> RegionList = $ 4 ;
$$ = Window;
}
| HSTACK FIX_DIMENTION OPEN region_list CLOSE
{
TRegion * Window = ( TRegion * ) malloc ( sizeof ( TRegion ) ) ;
Window -> RegionType = rtHorizontal;
Window -> isFixed = 1 ;
Window -> Dimension = $ 2 ;
Window -> RegionList = $ 4 ;
$$ = Window;
}
;
vstack : VSTACK OPEN region_list CLOSE
{
TRegion * Window = ( TRegion * ) malloc ( sizeof ( TRegion ) ) ;
Window -> RegionType = rtVertical;
Window -> isFixed = 0 ;
Window -> Dimension = 1 ;
Window <font color

Source: https://habr.com/ru/post/50509/


All Articles