A couple of years ago I was going to write a Prolog interpreter in Delphi. I decided to start by creating a parser. Writing an analyzer specifically for Prolog seemed terribly complicated, it seemed easier to write a universal analyzer and the Prolog syntax to it. Well, since this is all incredibly difficult and long, I abandoned this idea. But the parser remained. Here I will tell about his writing.
Objective : to write a parser that supports
context-free grammars . Also, the parser can perform some actions (associated, for example, with interpretation) in the analysis process - so-called. "User part" of the parser.
Let the syntax be stored in memory in this way: there is an initial syntax non-terminal, from which the syntax analysis begins; Let also each character * has a list of characters from which to continue the analysis after this character. If the list is empty, then the analysis can be considered successfully completed.
* syntax symbols are of two types - terminal and non-terminal
Fig. one |
The algorithm is recursive, each iteration works not with one term, but
with the current list of terms :
')
:
, :
.. , ;
;
;
:
.. , :
.... ;
.... ;
.... // " "
.... ,
...... , ;
.... ;
.. ;
:
.. ;
.. :
.... ;
.... // " "
.... ,
...... , ;
.... ;
.. ;
;
// ,
;
, , .
(here used an intuitive human language ... or improvisation on his subject)
In case of unsuccessful analysis, only a part of the first syntax error to the end of the text remains from the source text. In order to implement the search for several errors, it is enough to do one simple thing: remove the system commands from the user part and provide an error message in it, intercept the parsing result and change it from 'failure' to 'success'.
In more detail about the user part:
- It so happens that you need to make a character that represents any string of English characters or numbers. This will have to encode thirty different terms (which is difficult / long). To avoid this, it is enough to make a simple check of the string in the “user part” of the parser and change the preliminary result of parsing to the result of the check.
- Also in the user part it is possible to provide for the execution of the system commands of the language (when interpreting), for example, if the Brainfuck language syntax is modeled and the '+' symbol is found in the text, the user part increases the value of the current cell by one.
That's all with the writing, it remains to implement this algorithm and try it out.
You can also read about this in a Wikipedia article
Parse .
PS Here there is an archive with source codes (on Delphi) and a parser DLL, its import in the form of a project on Delphi and right there a test program. You can write the syntax in text form (how to do this, written in the readme.txt) and analyze the input lines on it. The input windows that appear imitate the user part of the parser: the text in the field is the remainder of the text being analyzed (it can be modified in the user part), the text over the text contains a literal (if it is), in the heading True or False are assumed user part) results of the analysis on the current term - Success or Failure, respectively, and the Ok and Cancel buttons - the future result of the analysis on the term returned by the user part. At the end, the sound will be played - the sound of an error or a window of information indicating the result of the parsing. The input line in the input box will also be changed.
EDIT : Edited the article in accordance with reality.