This is a continuation of the article “Why Racket? Why Lisp?”, which I wrote about a year after I discovered Racket. As a novice, I could not make sense of the praise that was heaped on Lisp from all sides. I did not know what to think. What was I supposed to make of the claim that Lisp would eventually bring "deep enlightenment"? Okay, whatever you say, bro.
I had a simple question: what's in it for me? In that article I tried to answer it by summarizing the reasons why someone would want to learn Lisp or, in particular, Racket.
I have compiled a list of nine features of the language that are most valuable to me as a newcomer to Racket. For example, feature number 5, "creating new programming languages." This technique is also called language-oriented programming, or LOP.
Since then, LOP has become my favorite part of Racket, and I shared my enthusiasm in the online book “Beautiful Racket”, which explains the LOP technique and Racket as a tool for it.
One of the examples from my own work is Pollen. I wrote this programming language for convenient typographic design of my online books. In Pollen, the paragraph about the nine features is programmed as follows:
    #lang pollen
    I have compiled a list of ◊link["https://beautifulracket.com/appendix/why-racket-why-lisp.html#so-really-whats-in-it-for-me-now"]{nine features of the language} that are most valuable to me as a newcomer to Racket. For example, feature number 5, "creating new programming languages." This technique is also called ◊em{language-oriented programming}, or ◊em{LOP}.
Another example is brag, a parser generator (in the lex/yacc style) that accepts a BNF grammar as source code. A simple example for the bf language:
    #lang brag
    bf-program : (bf-op | bf-loop)*
    bf-op      : ">" | "<" | "+" | "-" | "." | ","
    bf-loop    : "[" (bf-op | bf-loop)* "]"
Both languages are implemented in Racket and can be run with the usual Racket interpreter or inside the Racket IDE (called DrRacket).
The main questions
And yet... Even though the book has prompted thousands of people to start exploring Racket, it sometimes seems to me that I am treading the same shaky ground as the Lisp fans I once criticized.
If LOP is so cool, why should it take several days of reading a book to see that? Right? I should be able to explain it briefly, without further ado. Two simple questions need to be answered:
- What problems are best suited to language-oriented programming?
- Why is Racket the best choice for creating languages?
The second question is simple. The first is not. I have been asked it many times. I often fell back on Justice Potter Stewart's famous phrase: you will know it when you see it. That answer is good enough for those who are already interested. But not for those who are standing on the sidelines and would like to hear meaningful arguments.
So I will try. Keep in mind that I am not a professor of computer science, and I cannot argue about programming-language theory. Rather, I use Racket and domain-specific languages (DSLs) for practical purposes: my daily work depends on them. So I will focus on the practical aspects.
Short answer
- LOP is really an interface design method. It is ideal for tasks that require minimal notation while maintaining maximum precision. Minimal notation means only the notation you need. Nothing extra. Maximum precision means that the meaning of this notation is exactly what you say. No ambiguity and no boilerplate. LOP gets to the point like nothing else.
(The impatient can skip ahead to the specific categories of tasks that benefit from LOP.)
- Racket is ideal for LOP because of its macro system. Macros work in the style of a compiler, which simplifies code transformation. Racket's macros are better than any others.
At this point, half the readers of this article will want to post anonymous comments attacking my claims. But please keep in mind: I win either way. LOP and Racket have enormously increased my programming productivity. I am happy to share this knowledge so that you can enjoy these benefits too. But I will be just as happy if these tools remain my secret weapon. In that case, I will stay in the top 0.01% of the most productive programmers, producing more impressive and more profitable work than the other 99.99%.
So the choice is yours.
Long answer
If you think about the most important questions, they come down to one meta-question: why is it difficult to explain the advantages of LOP?
Perhaps it is because when we speak of languages, the term is loaded with expectations about what a language is and what it does. While we stay inside this paradigm, it is difficult to see the value of language-oriented programming.
But if you zoom out and consider languages as part of the wider category of human-computer interfaces, it is easier to see the specific advantages of LOP. So let's do that.
General purpose and domain-specific languages
First, a bit of terminology. Language-oriented programming (LOP) is the idea of solving a programming problem by creating a new language and then writing the program in it. Such “little languages” are often called domain-specific languages (DSLs).
As the name implies, a domain-specific language is tailored to the tasks of a specific area. For example, PostScript, SQL, make, regular expressions, .htaccess and HTML are considered domain-specific languages. They do not try to do everything. Rather, they focus on doing one thing well.
At the other end of the spectrum are general-purpose languages. Here we find C, Pascal, Perl, Java, Python, Ruby, Racket, and so on. Why aren't they considered domain-specific? Because they present themselves as suitable for a wide range of computational tasks.
In practice, general-purpose languages often specialize in one area. For example, C is best suited to systems programming. Perl, to system administration scripts. Python stands out as a language for beginners. Racket, for language-oriented programming. In each case, this is what the language was originally designed for.
The line between DSLs and general-purpose languages is a fine one. For example, Ruby was created as a general-purpose language, but became popular mainly for web applications through its association with Ruby on Rails. JavaScript, on the other hand, was originally a domain-specific language for scripting web browsers. But it mutated like a virus and has since grown far beyond its original task.
What is a language?
If this whole wide spectrum counts as languages, then what are the defining features of a language?
I know what you are thinking: “You've got this wrong. HTML is not a language. It is just markup. It can't describe an algorithm.” Or: “Regular expressions are not a language. They do not work by themselves. They are just syntax for another language.”
I once thought so too. But the closer I looked, the blurrier these distinctions became. Hence my first basic claim (of three): a programming language is, in essence, a medium of exchange, a notation that people and computers can both understand.
"Notation" means that the language has a syntax. "Understand" means that the syntax conveys a meaning (or semantics, if you like fancy words). This definition covers all general-purpose programming languages. And all DSLs. (But not every data stream, as discussed in more detail later.)
(By the way, although “programming” and “language” are words idiomatically used together, these languages are not used only by people to program computers. Sometimes computers use them to communicate with us (for example, S-expressions), and sometimes to communicate with each other (for example, XML, JSON, HTML). It would certainly seem wrong to exclude these possibilities. But in practice, yes: what we usually do with a programming language is, in fact, programming.)
Consider HTML: a way to tell a computer, in particular a web browser, how to draw a web page. It is a notation (angle brackets, tags, attributes, and so on) that both a person and a computer can understand (the charset attribute indicates the character encoding, the p tag contains a paragraph, and so on).
Here is a small HTML page:
    <!DOCTYPE html>
    <html>
    <head>
    <meta charset="UTF-8">
    <title>My web page</title>
    </head>
    <body>
    <p>Hello <strong>world</strong></p>
    </body>
    </html>
Suppose you do not agree that HTML is a programming language. Fine. Let's generate our page in Python. That is a real programming language, right?
    print("<!DOCTYPE html>")
    print("<html>")
    print("<head>")
    print("<meta charset=\"UTF-8\">")
    print("<title>My web page</title>")
    print("</head>")
    print("<body>")
    print("<p>Hello <strong>world</strong></p>")
    print("</body>")
    print("</html>")
If Python is a programming language but HTML is not, then this Python sample is a program, and the HTML sample is not.
Obviously, this is a contrived distinction. Here, the Pythonization adds nothing but complexity and boilerplate. The most piquant part is that the only interesting semantic content in the Python program, from the point of view of controlling the web browser, is what is embedded in the HTML (perhaps HTML tags such as DOCTYPE, meta and strong can be thought of as functions that take arguments). Logic leads us to conclude that HTML, although simpler and less flexible, is still a programming language.
Embedded languages
The example with HTML and Python was contrived. But embedding a DSL in another language is ubiquitous. Languages used in this way are called embedded languages. They represent the most common form of language-oriented programming. As a programmer, you have relied on LOP for many years, even if you didn't know its name.
Take regular expressions, for example (other examples: printf for formatting strings, CLDR for date/time patterns, SQL). We may not think of regular expressions as an independent language. But every programmer knows what this is:
    ^fo+(bar)*$
Moreover, you can probably type this regular expression into your favorite programming language, and it will just work. Such consistent behavior is possible only because regular expression notation is an embedded language that is defined externally (by POSIX).
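To make the embedding concrete, here is a minimal sketch in Python (my choice of host language for illustration; any language with a regex library would do). The pattern travels as an ordinary string into the standard `re` module. The helper name `matches` and the sample strings are made up for this example, and Python's regex dialect is not strictly POSIX, although this particular pattern behaves the same way in both.

```python
import re

# The pattern from the text, embedded as a string in the host language.
PATTERN = re.compile(r"^fo+(bar)*$")


def matches(text: str) -> bool:
    """Return True when the whole string fits the pattern."""
    return PATTERN.match(text) is not None


if __name__ == "__main__":
    # Hypothetical sample inputs, chosen only to exercise the pattern.
    for sample in ["fo", "fooo", "foobarbar", "f", "foobaz"]:
        print(sample, matches(sample))
```

The host language contributes nothing to the meaning of the pattern. It only carries the notation to the engine that understands it.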
As with HTML, we could write an equivalent expression in the notation of the host language. For example, Racket supports Scheme regular expressions (SRE): regular expressions in S-expression notation. The pattern above would be written like this:
    (seq bos "f" (+ "o") (* (submatch "bar")) eos)
But Racket programmers rarely use SREs. They are too long and hard to remember.
Another ubiquitous example of an embedded DSL is mathematical expressions. Every programmer knows what this means:
    (1 + 2) * (3 / 4) - 5
By themselves, mathematical expressions do not make interesting programs. We need to combine them with other language constructs. But, as with regular expressions, this is an ergonomic and practical notation. Mathematical expressions have their own notation and meaning that both people and computers can understand, so they qualify as a separate embedded language.
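One way to see that arithmetic has its own syntax and its own semantics is to separate the two. The sketch below is my own illustration in Python, not something from the Racket world: it borrows Python's parser (the standard `ast` module) for the syntax and supplies the meaning of each operator by hand. The names `evaluate` and `OPERATORS` are hypothetical, and only numbers, `+`, `-`, `*`, `/` and parentheses are supported.

```python
import ast
import operator

# Semantics: what each operator node in the syntax tree means.
OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(expression: str) -> float:
    """Evaluate an arithmetic expression made of numbers, + - * / and parentheses."""
    tree = ast.parse(expression, mode="eval")  # syntax: text -> tree
    return _eval_node(tree.body)


def _eval_node(node):
    # A numeric literal evaluates to itself.
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    # A binary operation evaluates both sides, then applies the operator.
    if isinstance(node, ast.BinOp) and type(node.op) in OPERATORS:
        left = _eval_node(node.left)
        right = _eval_node(node.right)
        return OPERATORS[type(node.op)](left, right)
    raise ValueError("unsupported syntax in expression")


if __name__ == "__main__":
    print(evaluate("(1 + 2) * (3 / 4) - 5"))  # prints -2.75
```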
Are you seriously claiming that HTML is programming?
Yes, I am. I argue that HTML (like regular expressions and mathematical expressions) qualifies as a rudimentary programming language. This means that writing HTML (or regular expressions, or mathematical expressions) qualifies as rudimentary programming.
Please do not panic. Of course, a “programmer” on LinkedIn who knows only HTML and arithmetic is nonsense (although within a week he will probably land a $180,000 job). But what “programmer” means in the job market is a separate question. We are not talking about that here.
The Turing completeness trap
If this definition of programming languages still annoys you, perhaps you think that a real programming language should be able to express all possible algorithms, that is, it should be Turing complete.
I understand why that thought is intuitive. Every general-purpose programming language is Turing complete.
But the problem is that this is a low bar. Turing completeness is a technical criterion that does not correspond to how languages are used in the real world. For example, regular expressions are not Turing complete, but they are useful for expressing many computations with minimal notation. HTML is not Turing complete either, but it is a useful way to control a browser. In contrast, the bf language is Turing complete, but even the most trivial tasks require miles of impenetrable code.
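To give a feel for that, here is a small bf interpreter sketched in Python, together with a bf program whose only job is to print the letter "A". The interpreter is my own illustration and makes a few assumptions that bf implementations commonly make but that are not fixed by this article: a 30,000-cell tape, 8-bit cells that wrap around, and a zero byte when input runs out.

```python
def run_bf(program: str, input_text: str = "") -> str:
    """Interpret a bf program and return everything it prints."""
    # Pre-compute the matching position for every bracket.
    jumps = {}
    stack = []
    for position, command in enumerate(program):
        if command == "[":
            stack.append(position)
        elif command == "]":
            if not stack:
                raise SyntaxError("unmatched ]")
            start = stack.pop()
            jumps[start] = position
            jumps[position] = start
    if stack:
        raise SyntaxError("unmatched [")

    tape = [0] * 30000  # assumed tape size
    pointer = 0         # current cell
    pc = 0              # current position in the program
    inputs = iter(input_text)
    output = []
    while pc < len(program):
        command = program[pc]
        if command == ">":
            pointer += 1
        elif command == "<":
            pointer -= 1
        elif command == "+":
            tape[pointer] = (tape[pointer] + 1) % 256
        elif command == "-":
            tape[pointer] = (tape[pointer] - 1) % 256
        elif command == ".":
            output.append(chr(tape[pointer]))
        elif command == ",":
            # Assumed convention: reading past the end of input yields 0.
            tape[pointer] = ord(next(inputs, "\0")) % 256
        elif command == "[" and tape[pointer] == 0:
            pc = jumps[pc]  # skip the loop body
        elif command == "]" and tape[pointer] != 0:
            pc = jumps[pc]  # go back to the start of the loop
        pc += 1
    return "".join(output)


# Prints "A" (character code 65): 8 * 8 in a loop, plus 1.
PRINT_A = "++++++++[>++++++++<-]>+."

if __name__ == "__main__":
    print(run_bf(PRINT_A))
```

Twenty-four characters of bf to emit a single letter: Turing complete, and still not a notation anyone would choose for everyday work.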
The limits of language
Does anything at all qualify as a language under my definition? No.
- Binary data formats are not languages. For example, a jpeg file. Although a computer can understand it, a human cannot. Or PDF: if you crack one open, there are some parts inside that a human can read. But that is a side effect of how PDF works. There is no point in writing down ideas using PDF constructs.
- Text files are not languages. Suppose we have a file containing Homer's Iliad. We humans can read and understand it. A computer can trivially process the file, say, by printing its contents, but the text inside is incomprehensible to the computer.
- Graphical user interfaces are not languages. Yes, they are notation systems (which rely on text and images). But they are understandable only to people. Computers draw GUIs, but do not understand them.
Languages as interfaces
Above, I described a programming language as a “medium of exchange” between people and computers. Languages therefore fit into a wider category that we call interfaces.
This leads to my second basic claim (of three): language-oriented programming is basically a method of interface design. If you like thinking about interfaces, you will like LOP. If not, you will still love LOP for making possible interfaces that are otherwise unattainable.
One of my favorite examples of a language as an interface is brag, a parser generator language created with Racket. If you have ever used the lex/yacc toolchain, you know that the goal is often to generate a parser from a BNF grammar. For example, for the bf language, the grammar looks like this:
    bf-program : (bf-op | bf-loop)*
    bf-op      : ">" | "<" | "+" | "-" | "." | ","
    bf-loop    : "[" (bf-op | bf-loop)* "]"
To make a parser in a general-purpose language, you need to translate this grammar into a pile of hand-written code. This is tedious work. And pointless: haven't we already written down the grammar? Why do it again?
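For a sense of what that translation involves, here is a hand-written recursive-descent parser for the same three rules, sketched in Python. It is only an illustration: the tree shape (tuples tagged with the rule name) and the helper names are my own assumptions, not what brag produces, and the parser accepts exactly the characters in the grammar, with no whitespace or comments.

```python
BF_OPS = set("><+-.,")


def parse(source: str):
    """Parse bf source into a nested tree that mirrors the grammar.

    bf-program : (bf-op | bf-loop)*
    bf-op      : one of > < + - . ,
    bf-loop    : "[" (bf-op | bf-loop)* "]"
    """
    items, position = _parse_items(source, 0)
    if position != len(source):
        # _parse_items stopped early, which only happens at a stray "]".
        raise SyntaxError(f"unexpected {source[position]!r} at {position}")
    return ("bf-program", items)


def _parse_items(source, position):
    """Parse (bf-op | bf-loop)* from position; stop at "]" or at the end."""
    items = []
    while position < len(source):
        char = source[position]
        if char in BF_OPS:
            items.append(("bf-op", char))
            position += 1
        elif char == "[":
            body, position = _parse_items(source, position + 1)
            if position >= len(source) or source[position] != "]":
                raise SyntaxError("missing ]")
            items.append(("bf-loop", body))
            position += 1  # consume the closing bracket
        elif char == "]":
            break  # let the caller decide whether this bracket is expected
        else:
            raise SyntaxError(f"unexpected {char!r} at {position}")
    return items, position


if __name__ == "__main__":
    print(parse("+[-]"))
```

Even at this size, every rule of the grammar has been restated a second time, by hand, in a different notation.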
However, brag grants our wish. To make the parser, we simply add the line #lang brag to the top of the file, which magically turns the BNF grammar into brag source code:
    #lang brag
    bf-program : (bf-op | bf-loop)*
    bf-op      : ">" | "<" | "+" | "-" | "." | ","
    bf-loop    : "[" (bf-op | bf-loop)* "]"
Done! When compiled, this file exports a parse function that implements this BNF grammar.
This is one of my favorite examples because it is undeniably superior to the alternatives. Moreover, with a general-purpose language, such an interface is almost impossible.
But an LOP programmer makes interfaces like this all the time.
Where language is the best interface
This brings me to my third and last basic claim: languages have unique advantages among interfaces. Of course, the categories below are neither exhaustive nor mutually exclusive. But I have found that LOP has a lot to offer in situations like these:
1. When you want to create an interface for less experienced programmers, or non-programmers, or lazy programmers (do not underestimate the size of that last category).
For example, Racket has an elaborate web application library. But a simple web server can also be launched quickly using the web-server/insta language:
    #lang web-server/insta
    (define (start request)
      (response/xexpr
       '(html (body "Hello LOP World"))))
Matthew Flatt, in the article "Creating Languages in Racket", demonstrates a language that generates playable text adventures. Like brag, it looks more like a specification than a program, but it works:
    #lang txtadv
    ===VERBS===
    north, n
     "go north"
    south, s
     "go south"
    get _, grab _, take _
     "get"
    ===THINGS===
    ---cactus---
    get
     "Ouch!"
    ===PLACES===
    ---desert---
    "You're in a desert. There is nothing for miles around."
    [cactus, key]
    north
     meadow
    south
     desert
2. When you want to simplify the notation. One example is regular expressions. Another example is my domain-specific language, Pollen, for writing online books. Pollen is similar to Racket, except that you start in text mode and use a special character to mark Racket commands embedded in the content (Pollen is based on Racket's documentation language, Scribble, which does most of the heavy lifting). So the beginning of this paragraph is programmed as follows:
    When you want to simplify the notation. One example is regular expressions. Another example is my domain-specific language, ◊link["https://pollenpub.com/"]{Pollen}, for writing online books.
Pollen takes care of inserting all the necessary tags and turning them into correct HTML. I still have all the advantages of hand-written markup (full control over the page), but none of the drawbacks (for example, I can't accidentally leave a tag unclosed).
Another example of simplified notation is lindenmayer, a language for generating and drawing Lindenmayer-system (L-system) fractals.
In ordinary Racket, a Lindenmayer program might look like this:
    #lang racket/base
    (require lindenmayer/simple/compile)
    (define (finish val) (newline))
    (define (A value) (display 'A))
    (define (B value) (display 'B))
    (lindenmayer-system
     (void)
     finish
     3
     (A)
     (A -> AB)
     (B -> A))
But you can use a simplified notation just by changing the #lang line at the top of the file:
    #lang lindenmayer/simple
    ## axiom ##
    A
    ## rules ##
    A -> AB
    B -> A
    ## variables ##
    n=3
The language assumes that you are already familiar with L-systems. But the simplified notation makes it easy to turn what you want into a program that does it.
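For readers who have not met L-systems, the rewriting that this notation describes can be sketched in a few lines of Python. This is my own illustration of the idea, not the lindenmayer implementation: start from the axiom and, on each iteration, replace every symbol in parallel according to the rules. The function name `expand` is hypothetical.

```python
def expand(axiom: str, rules: dict, iterations: int) -> str:
    """Rewrite every symbol in parallel, `iterations` times."""
    current = axiom
    for _ in range(iterations):
        # Symbols without a rule are copied unchanged.
        current = "".join(rules.get(symbol, symbol) for symbol in current)
    return current


# The same axiom, rules and n as in the example above.
RULES = {"A": "AB", "B": "A"}

if __name__ == "__main__":
    print(expand("A", RULES, 3))  # A -> AB -> ABA -> ABAAB
```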
3. When you want to work with an existing notation. Above, we saw how brag uses a BNF grammar as source code:
    #lang brag
    bf-program : (bf-op | bf-loop)*
    bf-op      : ">" | "<" | "+" | "-" | "." | ","
    bf-loop    : "[" (bf-op | bf-loop)* "]"
Another example. People who tried Pollen said, “Yes, that's great, but I prefer Markdown.” No problem: pollen/markdown is a Pollen dialect that offers Pollen semantics but accepts ordinary Markdown notation:
    When you want to simplify the notation. One example is regular expressions. Another example is my domain-specific language, [Pollen](https://pollenpub.com/), for writing online books.
The best part? I wrote this dialect in about an hour, by combining a Markdown parser with the existing code.
4. When you want to create an intermediate target for other languages. JSON, YAML, S-expressions, and XML are all domain-specific languages that define data formats for machines to read and write.
In “Beautiful Racket,” one of the tutorial languages is called jsonic. It lets you insert Racket expressions into JSON, thereby making JSON programmable. The source code looks like this:
    #lang jsonic
    // a line comment
    [
      @$ 'null $@,
      @$ (* 6 7) $@,
      @$ (= 2 (+ 1 1)) $@,
      @$ (list "array" "of" "strings") $@,
      @$ (hash 'key-1 'null
               'key-2 (even? 3)
               'key-3 (hash 'subkey 21)) $@
    ]
It compiles to ordinary JSON:
    [
      null,
      42,
      true,
      ["array","of","strings"],
      {"key-1":null,"key-3":{"subkey":21},"key-2":false}
    ]
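The idea of a "programmable JSON" can be imitated in Python to show the moving parts. The sketch below is an analogy I made up, not how jsonic is implemented: it reuses the `@$ ... $@` delimiters, but the embedded expressions are Python rather than Racket, comments are stripped naively (a `//` inside a string would be cut too), and `eval` is used only because the template is assumed to be trusted. The name `expand_template` is hypothetical.

```python
import json
import re

# Spans of embedded code and line comments, as in the jsonic example.
EMBEDDED = re.compile(r"@\$(.*?)\$@", re.DOTALL)
LINE_COMMENT = re.compile(r"//.*$", re.MULTILINE)


def expand_template(source: str) -> str:
    """Replace each @$ ... $@ span with the JSON form of its evaluated value."""
    without_comments = LINE_COMMENT.sub("", source)

    def substitute(match):
        # Assumption: the template is trusted. eval() is unsafe otherwise.
        value = eval(match.group(1).strip(), {"__builtins__": {}}, {})
        return json.dumps(value)

    expanded = EMBEDDED.sub(substitute, without_comments)
    # Round-trip through the json module to confirm the result is valid JSON.
    return json.dumps(json.loads(expanded))


TEMPLATE = """
// a line comment
[
  @$ None $@,
  @$ 6 * 7 $@,
  @$ 2 == 1 + 1 $@,
  @$ ["array", "of", "strings"] $@,
  @$ {"key-1": None, "key-2": 3 % 2 == 0} $@
]
"""

if __name__ == "__main__":
    print(expand_template(TEMPLATE))
```

The difference from a real `#lang` is where the work happens: here the template is a string processed at run time, whereas a Racket DSL such as jsonic is read and compiled as a program in its own right.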
5. When the main part of the program is configuration. For example, dotfiles can be described as DSLs. A more sophisticated example from the Racket world is Riposte by Jesse Alama, a language for testing JSON-based HTTP APIs:
    #lang riposte
    $productId := 41966
    $qty := 5
    $campaignId := 1
    $payload := {
      "product_id": $productId,
      "campaign_id": $campaignId,
      "qty": $qty
    }
    POST $payload cart/{uuid}/items responds with 200
    $itemId := /items/0/cart_item_id
    GET cart responds with 200
As a miniature scripting language, Riposte is much smarter than the average dotfile. It hides all the boilerplate needed for HTTP transactions and lets the user focus on writing tests. It is still housekeeping. But at least you can focus on the housekeeping you care about.
Why Racket?
Critics of LOP often ask: “Why make a domain-specific language? Wouldn't it be easier to write a library in the host language?”
No, not easier, if you have the right tool. Racket is unusual: it was designed from the ground up specifically for LOP.
DSL Racket
source-to-source , DSL Racket. Racket DSL , C. Racket DSL. , . DSL , .
, — , DSL , . , , , .
DSL Racket, , DSL Racket.
. Racket.
Racket , , , .
Racket. :
- Racket , . . , , Racket , , , , , . ( . « » ).
- Racket macros are hygienic, that is, by default, macro-generated code preserves the lexical context in which the macro was defined. In practice, this eliminates a huge amount of the busywork that DSLs usually require (for more details, see the chapter "Hygiene").
Is it possible to implement a DSL in, say, Python? Of course.
In fact, I wrote my first DSL in Python, and I still use it in my type design work. But once was enough. Since then I have been using Racket.
Conclusion: Winning with LOP
At this point, you may have one of two reactions:- « , , » . , . , , . . . , . .
- «, , c Racket » . , - Riposte , — ( ):
[ ] - Racket. , , - … : « API , ?» : « Riposte». , , [DSL], , . «» Racket. DSL , .
« Racket? Lisp?» , Lisp « , ».
, . , Racket, .