Introduction to DSL. Part 1 - Problems of design and coding

For several decades, the challenge has been to find a repeatable, predictable process or methodology that would improve the productivity, quality, and reliability of a design. Some tried to systematize and formalize this seemingly unpredictable process. Others applied project management techniques and software engineering techniques to it. Still others believed that without constant control by the customer, software development is out of control, which entails an increase in time and financial costs.
Computer science, as a scientific discipline, offers a technology for developing reliable software based on structured programming. It relies on testing and on verification by proof-based programming methods, which systematically analyze the correctness of algorithms so that programs can be developed without algorithmic errors.
This methodology is aimed at solving problems on a computer. A similar technology for developing algorithms and programs is used in programming contests by Russian students and programmers, and testing combined with structured pseudocode for documenting programs has been in use at IBM since the 1970s.
The methodology of structured software design can be applied with various languages and programming tools to develop reliable programs for any purpose.
However, the classical approach to development runs into the following problems:

Lack of transparency. At any given moment it is difficult to say what state the project is in and what percentage of it is complete. This problem arises when the structure (or architecture) of the future software product is insufficiently planned, which is often the result of insufficient project funding or underqualified developers.
Lack of control. Without an accurate assessment of the development process, schedules slip and budgets are exceeded. It is difficult to estimate how much work has been completed and how much remains. This problem arises when a project that is more than half complete continues to be developed with additional funding but without an assessment of its degree of completion.
Lack of monitoring. Without a way to track the progress of the project, development cannot be observed in real time. With suitable tools, project managers can make decisions based on real-time data. This problem arises when the cost of training managers to use such tools is comparable to the cost of developing the program itself.
Uncontrolled changes. The customer constantly comes up with new ideas about the software being developed. Changes can significantly affect the architecture of the project, so it is important to evaluate proposed changes and implement only the approved ones, controlling this process with software tools. This problem arises from the end user's unwillingness to use certain software environments. For example, when a client-server system is created, the customer places requirements not only on the operating system of the client computers but also on that of the server.
Insufficient reliability. The hardest part of development is finding and fixing errors in programs. Since the number of errors in a program is not known in advance, neither the duration of debugging nor the absence of remaining errors can be guaranteed. A proof-based approach to software design makes it possible to detect errors in a program before it is executed. Professor Wirth, in developing Pascal and Oberon, aimed through the rigor of their syntax at making the completeness and correctness of programs written in these languages mathematically provable. Donald Knuth made an especially major contribution to the discipline of programming. His multivolume “The Art of Computer Programming” is a necessary book for every serious programmer.
Lack of guarantees. The quality and reliability of programs cannot be guaranteed, because the absence of errors in software products cannot be ensured right up to the formal delivery of the programs to customers.

To solve the problems discussed above, it is proposed to introduce the following innovations:

Systematic reuse. The most important approach is to identify families of products whose components vary. Product lines are developed on the basis of these families. Products designed as members of a family reuse requirements, architecture, frameworks, components, tests, etc.
Automated assembly. Makes it easier to assemble independently developed components. Automating assembly brings a number of innovations:
- platform-independent protocols;
- self-description (reduces architectural mismatches by means of contracts and specifications);
- deferred encapsulation (reduces architectural mismatches by weaving adaptations into published components);
- architecture-driven development (predictions about performance can be made from the software architecture).
Model-Driven Development (MDD). This approach proposes using the model as source code rather than as documentation. For this, the model must be precise, and the modeling language must be precise and designed for a specific purpose. A modeling language is a system for specifying model-based programs. It raises the level of abstraction and moves the implementation into the vocabulary of the domain.
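As a rough illustration of treating the model as source code, the sketch below generates a class from a small declarative model and then uses it. The model layout, the names ORDER_MODEL, generate_class_source, and build_class, and the Order entity are all assumptions made for this example; they do not come from any particular MDD tool.

```python
# Hypothetical model: here the model, not hand-written code, is the source of truth.
ORDER_MODEL = {
    "entity": "Order",
    "fields": [("number", "int"), ("customer", "str"), ("total", "float")],
}


def generate_class_source(model: dict) -> str:
    """Translate the model into Python source for a plain data class."""
    lines = [
        "from dataclasses import dataclass",
        "",
        "@dataclass",
        f"class {model['entity']}:",
    ]
    for name, type_name in model["fields"]:
        lines.append(f"    {name}: {type_name}")
    return "\n".join(lines) + "\n"


def build_class(model: dict) -> type:
    """Execute the generated source and return the resulting class."""
    namespace = {"__name__": "generated_model"}
    exec(generate_class_source(model), namespace)
    return namespace[model["entity"]]


Order = build_class(ORDER_MODEL)
order = Order(number=1, customer="ACME", total=99.5)
print(generate_class_source(ORDER_MODEL))
print(order)
```

Changing the model (for example, adding a field) changes the generated class without touching any hand-written code, which is the point of the approach. A real modeling language would of course be far richer than a dictionary.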

Technologies for modeling domain knowledge

Application Programming Interface (API) - a group of system services focused on solving common problems.
Component technologies - a set of software modules with a standardized interface, focused on solving common problems.
Architectural patterns are design solutions that describe the architecture of a software system based on a certain concept.
GoF patterns are design solutions that describe aspects of the implementation of a software system for solving certain programming tasks.
XML-based languages provide structured descriptions of data together with transformation mechanisms.
SQL is a structured query language for a DBMS.
An ontology is a representation of a field of knowledge or a part of the real world, used for semantic analysis of texts.
A domain-specific language (DSL) models concepts identified in a particular domain. A well-designed DSL is a powerful modeling language that is less ambiguous than a general-purpose modeling language.
A DSL is a programming language designed to solve a specific range of tasks, as opposed to general-purpose programming languages. There are three main types of DSL:

internal DSL: a DSL expressed in the syntax of the main language of the software application;
external DSL: a DSL written in a language different from the main language of the software application;
language workbench: a development environment for DSLs.

An external DSL has its own syntax, separate from that of the main language and close to natural language. It requires an external compiler, interpreter, or postprocessor, and is therefore processed in a separate translation step, unlike an internal DSL. An external DSL often has a custom syntax, but a common alternative is to borrow the syntax of another language, such as XML tags. Unix systems have a long tradition of such “little languages”. Early examples of external DSLs include regular expressions, SQL, and awk, as well as the XML configuration files used in systems like Struts and Hibernate. The biggest advantage of external DSLs is that they can take whatever form the developer wishes. In other words, the subject area can be expressed in the simplest, most readable and editable form. The format of such a DSL is limited only by the ability to write a translator that can read the DSL file and produce executable code in the main language of the application. This is also the main drawback of an external DSL: a dedicated translator has to be written.
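To make the idea of a translator for an external DSL concrete, here is a minimal sketch. The pricing-rule syntax, the names RULES_TEXT, Rule, parse_rules, and apply_rules are all invented for this example; a real external DSL would normally live in its own file and usually needs a proper grammar rather than a single regular expression.

```python
import re
from typing import NamedTuple

# Hypothetical external DSL text; in practice it would be read from a separate file.
RULES_TEXT = """
# pricing rules
discount 10 percent when total over 100
discount 5 percent when total over 50
"""

# The whole "grammar" of this toy language is one line shape.
RULE_PATTERN = re.compile(r"^discount (\d+) percent when total over (\d+)$")


class Rule(NamedTuple):
    percent: int
    threshold: int


def parse_rules(text: str) -> list[Rule]:
    """Translate DSL text into Rule objects, skipping blank lines and comments."""
    rules = []
    for line_number, raw_line in enumerate(text.splitlines(), start=1):
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        match = RULE_PATTERN.match(line)
        if match is None:
            raise SyntaxError(f"line {line_number}: cannot parse {line!r}")
        rules.append(Rule(percent=int(match.group(1)), threshold=int(match.group(2))))
    return rules


def apply_rules(rules: list[Rule], total: float) -> float:
    """Apply the first rule whose threshold the total exceeds."""
    for rule in rules:
        if total > rule.threshold:
            return total * (100 - rule.percent) / 100
    return total


rules = parse_rules(RULES_TEXT)
print(apply_rules(rules, 200))
print(apply_rules(rules, 80))
```

The rules can be edited by someone who does not know Python at all, but every new kind of rule means extending the translator, which is exactly the trade-off described above.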
An internal DSL uses a subset of the syntax of a general-purpose programming language to express certain aspects of the application in a form close to natural language. It does not require a third-party compiler: it is executed together with the main program code written in that language. Internal DSLs are used in some general-purpose languages to extend the capabilities of programs, but they are restricted to a substantially limited subset of the host language's constructs. Classic examples of languages used for internal DSLs are Lisp and Ruby.
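For comparison, an internal DSL can be sketched with nothing but ordinary host-language constructs. The example below uses method chaining (a fluent interface) in Python; the Select class and its methods are hypothetical and exist only for this illustration.

```python
class Select:
    """Hypothetical fluent builder: each method returns self so calls can be chained."""

    def __init__(self, *columns: str) -> None:
        self._columns = columns or ("*",)
        self._table = ""
        self._conditions = []
        self._order = ""

    def from_(self, table: str) -> "Select":
        self._table = table
        return self

    def where(self, condition: str) -> "Select":
        self._conditions.append(condition)
        return self

    def order_by(self, column: str) -> "Select":
        self._order = column
        return self

    def to_sql(self) -> str:
        """Render the accumulated clauses as a single SQL string."""
        parts = [f"SELECT {', '.join(self._columns)}", f"FROM {self._table}"]
        if self._conditions:
            parts.append("WHERE " + " AND ".join(self._conditions))
        if self._order:
            parts.append(f"ORDER BY {self._order}")
        return " ".join(parts)


query = Select("name", "email").from_("users").where("age > 18").order_by("name")
print(query.to_sql())
```

No separate translator is involved: the chain is plain Python and runs with the rest of the program. The price is that the notation is bounded by what Python syntax allows, hence the trailing underscore in from_, since from is a reserved word.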
A language workbench is an integrated development environment (IDE) for creating DSLs. It provides editors and generators for defining the abstract syntax of a language, much as a modern IDE does for developing programs.
In general, DSLs, supplemented by metaprogramming, are an effective means of automating software development and are currently widely used in information technology.
The generalized algorithm for developing a new DSL is as follows:
1. Define the syntax in terms of the implementation language.
2. Use DSL patterns to implement the new DSL.
3. Use metaprogramming tools to implement the DSL within the host language.
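The three steps can be sketched on a toy markup DSL in Python. This is only one possible reading of the algorithm: the TagBuilder class and the choice of Python's __getattr__ hook as the metaprogramming tool are assumptions made for this example.

```python
from html import escape


class TagBuilder:
    """Hypothetical builder in which any attribute name becomes a markup tag."""

    # Step 1: the syntax is defined in host-language terms: a tag is a
    # function call, children are positional arguments, and tag attributes
    # are keyword arguments.
    # Step 2: the DSL pattern used is nested function calls.
    def __getattr__(self, tag_name: str):
        # Step 3: metaprogramming. __getattr__ is called only for names that
        # are not defined on the class, so tag functions are created on demand.
        if tag_name.startswith("_"):
            raise AttributeError(tag_name)

        def make_tag(*children: str, **attributes: str) -> str:
            # A trailing underscore lets callers write class_ for "class".
            attrs = "".join(
                f' {name.rstrip("_")}="{escape(value)}"'
                for name, value in attributes.items()
            )
            # Children are inserted as-is because they may be nested markup.
            return f"<{tag_name}{attrs}>{''.join(children)}</{tag_name}>"

        return make_tag


tags = TagBuilder()
page = tags.div(tags.p("Hello"), tags.p("World"), class_="box")
print(page)
```

Nothing in the class lists div or p; the vocabulary of the DSL is produced by the hook at run time. The sketch does not escape child text, so it is not suitable for untrusted input as written.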
Using a DSL has several advantages:

At the design stage, solutions can be expressed in terms of the domain, so domain specialists can create and modify DSL programs themselves.
When designing for a similar subject area, an existing DSL can be reused.
Problems are solved at an appropriate level of abstraction, which allows subject matter experts to understand and verify DSL programs.
Programs written in a DSL are concise. Because a DSL is written in domain terms, the program remains easy to read later on.
Reliability, efficiency, and quality of maintenance improve. Operations are simpler at the model level, so they are more efficient and less error-prone than the same operations at the code level.
A DSL allows optimization and validation at the level of abstraction of the domain.
A domain description at one level of abstraction can be converted to a lower level with more detail, so the model can be refined at different stages of development.

The disadvantages include the following:

the cost of design, implementation, and maintenance is high;
users need to be trained;
the scope of a DSL is difficult to determine;
it is difficult to maintain a balance between DSL-specific constructs and the constructs of a general-purpose programming language.

This concludes the first (introductory) part.
Thanks to those who read to the end. I would like to hear your opinion on the problems of design in general, and on domain-description languages in particular as one solution to the difficulties encountered.

Source: https://habr.com/ru/post/94259/
