
The alphabet ... reflections on the topic ... Full

1. Alphabet. Associative links.

So much has been said about the alphabet that, to begin with, I will simply quote Karl Bühler's Theory of Language:
“The alphabet is an associative chain (a mechanical sequence) and nothing more; but everyone has learned it and knows it. Therefore, mapping sequences of any objects onto the alphabet is a convenient correlation, and we constantly use it in practice to put things in order. It would not be difficult to prove that the system of signs making up a natural language contains many associative chains and interweavings which, from a psychological point of view, stand at the same level as the alphabetic chain, and which render us the same service in the comprehensive task of ordering our knowledge about objects and communicating this knowledge to others.”

Therefore each element of the alphabet can and should be assigned a value (which is what is actually done now), and the question could be closed. However, it is not that simple. It is logical to assume that the alphabet is a set of associations. If i is not equal to j and ai, aj: A (here and below, x: X means that x belongs to X), then the intersection of ai and aj is empty.
The mapping T of the set of signs S (objects of the class Symbol) to the alphabet of a language l: L (where L is the set of languages) is denoted by Tl. The signs are a kind of source of associations (through the relation Tl). The set of associations generated by the mapping Tl is denoted by Al; this is the alphabet proper of the language l. The set Al is finite, so it can be numbered, and that number is the association code, or alphabet code; it would be categorically wrong to call it the sign code. The association generated by the sign s in the language l is denoted by Tl (s). Clearly, Tl (s): Al.
Two points follow. First, several signs can generate the same association: the capital letter “A” and the small “a”, for example. So there may be s1: S and s2: S such that the intersection of Tl (s1) and Tl (s2) is not empty. Second, the association depends on the environment; each language has its own set of associations. That is, in English, French and Russian environments the same sign can evoke different associations. Usually a table of associations (which is the relation Tl itself) is enough to implement the mapping Tl, but more complex algorithms can be used, as in musical notation, cartography, or the drawing of electrical circuits.
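The table-of-associations idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the language keys, the association labels and the function names `associate` and `association_code` are invented here, and look-alike Latin and Cyrillic glyphs are deliberately treated as one sign, as the text argues they should be.

```python
from typing import Optional

# Hypothetical association tables, one per language (the relation Tl).
# Keys are signs (glyphs); values are invented association labels.
# The same glyph "P"/"p" is used in both tables on purpose: it is one sign
# that evokes different associations in different environments.
ASSOCIATIONS = {
    "en": {"A": "a", "a": "a", "P": "p", "p": "p"},
    "ru": {"A": "a", "a": "a", "P": "r", "p": "r"},
}

def associate(sign: str, language: str) -> Optional[str]:
    """Return Tl(sign), or None if the sign has no association in that language."""
    return ASSOCIATIONS[language].get(sign)

def association_code(association: str, language: str) -> int:
    """Number the finite set Al; this is an association code, not a sign code."""
    alphabet = sorted(set(ASSOCIATIONS[language].values()))
    return alphabet.index(association)
```

Here `associate("A", "en")` and `associate("a", "en")` give the same association, while `associate("P", "en")` and `associate("P", "ru")` differ, which is exactly the pair of observations made above.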
2. Grouping characters.

However, not every category of signs needs its own table of associations. There are signs that evoke associations common to all languages. In fact, a stronger statement holds: only the objects we usually call letters evoke different associations in different languages. To emphasize this property we will unite them in the Letter group. And since a division into groups has already begun, we divide the set of signs into several more groups.
The number group (Digital group) contains the signs 0, 1, 2, 3, 4, 5, 6, 7, 8 and 9, which evoke obvious and identical associations in all modern languages.
The control character group (Command group) contains all control and format characters. Modern standards use the associations of this group but provide no signs that map directly onto the control associations. By some strange logic the mapping T is absent for the characters of this group; for a graphical representation, signs with an empty association are used, and only in the inverse mapping does such a sign imitate the corresponding control association. That is, the sign that maps directly to the “line feed” association (code 10 in the KOI8 standard) is missing, while the sign depicting a line feed is represented by the “sign association” with code 182 in the KOI8 standard.
The Mark group includes brackets, quotation marks, commas, and so on. The principal difference of this group is that its signs do not take part in lexical analysis but are independent lexemes. That is, they do not form a lexeme as part of some sequence of signs; they take part directly in syntactic analysis by their presence or absence.
The Letter group has one more feature, which is the main purpose of its signs: a sequence of these signs forms an association of its own, which we call a lexeme. Lexemes are separated from each other by signs of other groups, one of which is the space.
Once the signs are divided into groups, their associations can be divided into corresponding groups as well. Note that for any i and j: L, if s does not belong to Letter, then Ti (s) = Tj (s) = As. This fact makes it possible to create a single table of associations for the signs of these groups, shared by all languages. Besides, for all groups except Letter the case property (capital or small) makes no sense.
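The division into groups can be approximated with the general categories that Unicode already assigns to characters. The sketch below is an assumption-laden illustration, not the author's classification: the function name `sign_group` is invented, and the mapping from Unicode categories to the Letter, Digital, Command and Mark groups is a rough guess (the space, for instance, simply falls into Mark here).

```python
import unicodedata

def sign_group(sign: str) -> str:
    """Classify one character into the article's groups (approximate mapping)."""
    if sign in "0123456789":
        return "Digital"
    category = unicodedata.category(sign)
    if category.startswith("L"):      # Lu, Ll, Lt, Lm, Lo: letters of any script
        return "Letter"
    if category in ("Cc", "Cf"):      # control and format characters
        return "Command"
    return "Mark"                     # brackets, quotes, commas and the rest

def has_case(sign: str) -> bool:
    """Case is meaningful only when upper and lower forms differ."""
    return sign.upper() != sign.lower()
```

With this mapping only members of the Letter group can report `has_case(...)` as true, which matches the remark that case makes no sense for the other groups.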
3. Data entry.

All the groups listed above are united by one controversial quality: they can all be entered from the keyboard. That is what the keyboard was invented for, along with the standards for character encodings and keyboard layouts. We emphasize that it is signs that are entered from the keyboard, not associations, and attempts to bring keyboard codes and association codes into agreement meet the stubborn resistance of the real state of affairs. Of course, it is impossible to provide for the entry of every sign needed in practice, so we are simply obliged to extend the capabilities of the keyboard. The purpose of the keyboard is to match a pressed key directly to a sign, but doing so in every case would mean enlarging the keyboard beyond reason. Hence key combinations (pressing several keys simultaneously), which are convenient for the control group of signs. The question then arises: is it not time to extend the standard set of signs to cover functions that have already become standard (copy, delete, etc.) and to add new signs for editing (new section, note, etc.)? I think this is quite realistic. It requires a graphic image for each of them, which in many cases has already been invented. Signs can also be combined from special sequences, the so-called composite signs. This technique is widely used for emoticons (a sequence of colons and parentheses), for the ® sign, and so on. Using composite signs for input opens up ample opportunities for custom signs and emoticons.
I consider it a good idea to change the correspondence between signs and keys depending on the environment. Lebedev's keyboard, which can change the images shown on its keys, fits our principles well.
4. Signs as objects.

4.1. Encoding. Character sets.

This goes back to the times when monitors and printers had no real graphics capabilities and each letter was encoded in hardware. As a result, the letter “a”, for example, is encoded differently as an English letter and as a Russian one, although it is one sign. There are many such examples, and I do not see much sense in it. Moreover, many signs that are really needed are missing. In addition, the conventional division into English and Russian letters clearly does not hold water once we use languages other than Russian and English, such as Latin or French. Who will tell me which language the letter “p” belongs to? My point is that it is just a sign, and if this sign is considered an object, then it has no “Language” property. This is fundamental. Reasoning logically, the “Language” property must belong to the Word object, that is, to a sequence (group) of letters. And even then the property is not uniquely determined by the sequence of signs: one sequence of signs may have different meanings in different languages, or the same meaning. Example: “petitio principii”. What letters are these, and what language is it? Or, to make it quite obvious: which language does the space “ ” or the colon “:” belong to? Only one thing follows: a sign must be just a sign. If, by virtue of its exclusivity, it belongs to the alphabet of some language, that is a fact and nothing more. Usually, when we use the adjective “English” or “Russian”, we apply it not to the sign but to the sound, in order to make the image of the sign concrete. It follows that the generally accepted approach, with different character encodings for different languages, is not correct. Sets of signs that differ from or coincide with the writing of other languages concern only the Letter group; therefore they can be encoded within the group by allocating a byte for this and having multilingual sets. Perhaps a single set is enough for the whole group of Indo-European languages.

4.2. Capital letters. Sorting.

The first monitors and printers were limited in graphics capabilities, so capital letters were given a separate encoding and the question was closed. The value of this code serves as a parameter for comparison operations, and the problems of case-sensitive and case-insensitive sorting remain. If we consider a sign as an object, the question arises: what to do with case? Is it a property, or is it a new sign? In other words, should a capital letter be considered the same sign as the small one or a separate sign? Answer: they are different signs denoting the same thing. It goes without saying that they can be grouped together. The question is what exactly we are grouping. Here again we must ask what a sign is. Is it only the image? Or are we interested in the meaning it carries? Most likely the meaning. Then the variants of the image should be combined under that meaning. And if it is the image after all? Then we have to get from the image to the meaning. So the sign proper is the image and nothing more (recall hieroglyphs). What meaning we put into this image is a secondary question. And if each image is a sign, then creating a new object is quite logical, and the creators of the encoding standards did the right thing in defining a separate code for each capital and small sign.
Now let us think about the meaning of this property. Or is it not a property but a separate sign? After all, we cannot say of a single letter whether it is large or small unless it is distinguished by its graphic image. That is, large or small is a relative notion that depends on the adjacent letters or on the default font size. Therefore it is an independent property which obviously affects the graphic image (and sorting) relative to the font size or the adjacent letters. Today the size of a letter can be changed at will, so there is no need for a separate encoding for capital letters, but letters must be given such a property (case). Note that neither the control signs, nor the signs of the Mark group, nor, even less, custom signs have this property; it makes no sense for them. It is time to ask whether letters are a class that inherits from signs.
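One possible reading of "case as a property, letters as a class inherited from signs" is sketched below. The class names `Sign` and `Letter` and their fields are assumptions made for this illustration; nothing here is taken from an existing system.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Sign:
    association: str              # what the sign denotes, e.g. "a" or "("

    def glyph(self) -> str:
        return self.association

@dataclass(frozen=True)
class Letter(Sign):
    upper: bool = False           # case is a property, not a separate code

    def glyph(self) -> str:
        # The property affects only the graphic image, not the association.
        return self.association.upper() if self.upper else self.association

small_a = Letter("a")
capital_a = Letter("a", upper=True)
bracket = Sign("(")               # Mark-group sign: no case property at all
```

Both letters share one association and differ only in the image they produce, while a plain `Sign` has no case to speak of.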

4.3. Sorting.

The need to sort signs is usually not questioned. Yet it raises questions. First, if we consider a sign as an object, then sorting signs is a special case of sorting objects, and in essence it comes down to defining comparison operations on a class of objects. Suppose we have settled that; we still need a criterion for defining the comparison operations. At present the character code serves as that criterion. The same code is a parameter for drawing the image, and these are completely different functions.
Second, the signs themselves are already sorted. As things stand now, the character code is the ordinal number of the sign's position in the table associated with it. In our case the Symbol object is defined within some group and also has its own position index, which can hardly serve as an argument for comparison operations.
Thus the character code performs three different functions: it is an array index, one of the parameters for drawing the image, and one of the parameters for a comparison operation. If signs are formed dynamically and by the user, these are three different properties, not one. Therefore, for comparison operations to associations
Apart from the question of how to compare with or without regard to case, there is another tricky question. If a letter has no “Language” property, how can letters from different languages be compared? The answer suggests itself. The need for sorting arises when sorting lexemes; sorting individual letters makes no sense. A lexeme does have a language property, and a lexeme belongs to some class. So we come to the conclusion that sorting applies not to signs but to meanings. The same can be said about the Digital group: we sort not by signs but by values.
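The difference between sorting by signs and sorting by values is easy to see in a small Python experiment. The sample data is invented; `str.casefold` stands in here for "compare lexemes without regard to case" and is only one possible comparison criterion.

```python
# Digital group: the same lexemes ordered by character codes and by value.
numbers = ["10", "9", "2"]
by_sign = sorted(numbers)              # compares character codes
by_value = sorted(numbers, key=int)    # compares the values they denote

# Letter group: character codes put all capitals before all small letters.
words = ["beta", "Alpha", "alpha", "Beta"]
by_code = sorted(words)
by_lexeme = sorted(words, key=str.casefold)   # stable, case-insensitive

print(by_sign)    # ['10', '2', '9']
print(by_value)   # ['2', '9', '10']
print(by_code)    # ['Alpha', 'Beta', 'alpha', 'beta']
print(by_lexeme)  # ['Alpha', 'alpha', 'beta', 'Beta']
```

In both cases the useful order comes from a key computed from the whole lexeme, not from the codes of its individual signs.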

5. Inverse mapping.

The inverse mapping goes from an association to a sign. As we have noted, one association can be generated by different signs. But an association as such serves only internal needs and is always generated by some sign. With the modern approach the need for an inverse mapping arises when control associations must be used, since they have no sign of their own. That is, where a line break is needed or the output must be formatted in some way, one has to write something like Chr(10) in VB or \n in C. If control associations had signs of their own, we could simply put the necessary sign where it is required, and, as expected, its appearance would change the text. Only one remark is needed: if the control sign is part of a text constant, its control function does not act. That is, a control sign enclosed in text quotes loses its control properties.
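Python offers a loose analogy for both halves of this paragraph: the control association has to be spelled indirectly (`chr(10)` or an escape sequence), and a raw string shows what it means for the same characters to sit in a text constant without acting as a control. This is an analogy only, not the mechanism proposed above.

```python
newline = chr(10)                       # the "line feed" association, same as "\n"
text = "first" + newline + "second"     # the control acts: two lines
literal = r"first\nsecond"              # raw string: "\" and "n" stay ordinary signs

print(len(text.splitlines()))     # 2
print(len(literal.splitlines()))  # 1
```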

6. Group Mark.

6.1. The "=" sign.

The history of this problem is an old one, and I do not understand why standards have not yet appeared that provide different signs for logical relations and for assignment. In principle, a translator can be built that distinguishes the various uses of this sign by context. But that entails both an overhead in resources during translation and restrictions on some syntactic constructions, so that the translator can identify the context unambiguously. Different languages solve this issue in different ways. In C-like languages assignment is usually written “=” and the logical relation with a double sign, “==”. In Pascal-like languages assignment is written with the double sign “:=” and the logical relation with a single “=”. VB uses the same “=” sign for both purposes by accepting some restrictions and defaults, one of which is the separation of statements on a single line with a colon.
I think we could have made do with this if it were not for a third purpose of the same sign. The question arose in connection with the uniform format adopted in the Lada system and the further development of the object paradigm. The “=” sign used to define the value (or even the expression) of an object's property has a completely different meaning from the operation of assigning a value to a dynamically created object (the Dim operator). The difference is this. When we create an object in the editor (as graphical wizards do, except that here the assignment has a textual form) and assign it a property, for example Top = 2, the value 2 is saved in the storage format of the created element, and when the element is loaded the value is already in the corresponding memory location. There is no need for a Move command to put the value there, that is, no need to execute the assignment Top = 2 directly and, consequently, no need for the translator to generate the corresponding code. We meet a similar situation in HTML. There an assignment does not turn into an assignment command but is simply a value. If we assume that tags create objects there, then the translator generates no commands for assigning their properties. The values are simply already in the appropriate places in the binary format after translation.
In the .Net concept the closest thing is a kind of static property, that is, a property that can be accessed without creating an object. But the .Net concept is about programs, i.e. only about objects created by classes. Programming languages do not create objects in the sense in which programs are written. In the Lada concept, creating an object is the same process as writing a program (a program is also an object). Moreover, objects can exist inside the program during writing, debugging and execution alike. This is very useful for maintenance, both while the program is being designed and while it is in operation. Let us explain by example.

Example 1. Creating the “Program Deadline” object in the program text.

Tag “Program Deadline”: Date
{
Value = Pointer "Arrival of TZ": Data. ¤Value + 2Month
}
Suppose that somewhere there is a tag with the arrival date of the TZ (the technical specification). In this example the value of the program deadline is assigned. The assignment takes place at the translation stage (when objects are created); that is when the addition is performed, and its result is stored in the Value property of the “Program Deadline” object. This value serves as information for project management. There is no need to run the program and create classes in order to access the “Program Deadline” object; it is enough to keep the source text in the place where the project is maintained. Clearly, such an assignment requires no generated addition and assignment commands, no addresses formed for the data, and no space allocated for them when the object is created, as is done in a program. The new date value is formed and assigned during translation, not during execution of the program. Not to mention that the object being created may not be a program at all. And here is how something similar happens in a program.

Example 2. Calculation of the date of delivery of the program.

Dim DateTZ: Date
……
Dim Value: Date
Value = Date. ¤Value + 2Month

This example shows a completely different picture, and a common one. Somewhere there is a variable holding the arrival date of the TZ (DateTZ). Translation produces Dim objects, assignment objects and expressions with operand addresses. When the program is executed, the data area is allocated (by the Dim operators), and when the time comes to execute this sequence of commands, the addition and then the assignment are performed.
Again, the task of identifying the kind of assignment could be left to the translator's contextual analysis, but then we lose the ability to detect invalid assignments in static classes. One could go further and abandon the paradigm of static classes altogether. Their actual role is to perform functions or methods over some domain of definition. I see no reason to give up more intuitive categories only to generate a new concept and then have to explain its purpose.
Thus we are simply forced to introduce new signs in order to distinguish the three different purposes of the “=” sign. I would like to have two arrows for the two kinds of assignment and leave the “=” sign for logical relations.
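The contrast between the two examples can be imitated, very loosely, in Python. A class body is evaluated once, when the definition is processed, which plays the role of the translation-time assignment; a function body is evaluated on every call, which plays the role of the run-time assignment. Everything below is an assumption made for illustration: the names `add_months`, `TZ_ARRIVAL`, `ProgramDeadline` and `compute_deadline`, the sample date, and the crude rule of clamping the day to 28.

```python
from datetime import date

def add_months(d: date, months: int) -> date:
    """Shift a date by whole months (day clamped to 28 so the result is valid)."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, min(d.day, 28))

TZ_ARRIVAL = date(2008, 3, 17)   # hypothetical arrival date of the TZ

class ProgramDeadline:
    # Evaluated once, while the definition is being processed: the result is
    # simply stored in the object, like Value in Example 1.
    value = add_months(TZ_ARRIVAL, 2)

def compute_deadline(date_tz: date) -> date:
    # Evaluated each time the code runs, like the Dim and assignment
    # sequence in Example 2.
    value = add_months(date_tz, 2)
    return value
```

`ProgramDeadline.value` can be read without running anything further, whereas `compute_deadline` does its addition and assignment only when called.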
For example:

AB {Location, Size}

6.2. Exchange sign. "><".

There is no such sign; we constructed it from the two signs “>” and “<”. But the operator is useful. To exchange the values of two variables, a third variable has to be defined to hold the intermediate value. Of course, there is no such machine command. But we have long known that the format of a procedure (and of a program) includes a reserve area used precisely for this purpose (saving intermediate results) when expressions are evaluated. Why not use it for this operation? Yes, the lack of an appropriate sign is a big problem. A bidirectional arrow, or something like it, would suit very well.
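For comparison, some existing languages already let the translator manage the intermediate value. Python has no exchange sign either, but tuple assignment gives the same effect without a named third variable; the helper `exchange` below is a hypothetical name used only for this sketch.

```python
def exchange(values: list, i: int, j: int) -> None:
    """Swap two elements in place; the temporary storage is implicit."""
    values[i], values[j] = values[j], values[i]

a, b = 1, 2
a, b = b, a          # the closest Python gets to "a >< b"

items = ["x", "y", "z"]
exchange(items, 0, 2)
```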

6.3. Relational operations. "=>", "<=", "!=".

Simply no words. Some kind of construction kit. It is high time to come up with appropriate signs. I particularly admire the approach of the C language, where the exclamation mark stands for negation. Reading texts overloaded with signs that do not match their meaning is a very exciting experience. Are graphics capabilities really that poor? After all, notation has long been established: “less than or equal” is the less-than sign with a bar beneath it (≤), “greater than or equal” is the greater-than sign with a bar beneath it (≥), and inequality is a crossed-out equals sign (≠).
The so-called composite signs are appropriate here. If signs are treated as objects, then during lexical analysis (or even earlier, during editing) composite signs can be merged into one: for example, the signs “:” and “=” into “:=”, or three dots “.” into “...”. Emoticons and other graphic elements, which are signs in one way or another, can be obtained in the same way. Composite signs extend the alphabet of the group, and they can take part in parsing on a par with the standard ones.
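Merging composite signs before or during lexical analysis can be sketched as a longest-match substitution pass. The table `COMPOSITES`, the choice of target glyphs and the function name `combine` are assumptions for this illustration.

```python
from typing import List

# Hypothetical table of composite signs: a sequence of plain signs on the
# left, the single sign it should become on the right.
COMPOSITES = {
    ":=": "≔",
    "...": "…",
    "<=": "≤",
    ">=": "≥",
    "!=": "≠",
    "(r)": "®",
}

def combine(text: str) -> List[str]:
    """Split text into signs, merging composite sequences (longest match first)."""
    keys = sorted(COMPOSITES, key=len, reverse=True)
    signs: List[str] = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                signs.append(COMPOSITES[key])
                i += len(key)
                break
        else:
            # No composite starts here: the character is a sign by itself.
            signs.append(text[i])
            i += 1
    return signs
```

After this pass the parser sees `≔` or `≤` as one sign of the Mark group instead of two adjacent ones.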

6.4. The signs & and |.


6.5. Progressive operations.

Standard notation has long existed for many operations: summation and product over an index, the integral, the differential, and many others. Performing these operations by writing them with the established signs is not a big problem if the language supports higher-order functions. Ours does.

Example 3. Progressive Summation.

$\sum_{i=n}^{m} A[i]^2$

Example 4. Progressive product.

I = 0n-1A [I2 or 

The integral can be handled in the same way, not forgetting the differential and the rest.
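With higher-order functions, summation and product over an index reduce to one small function each. The sketch below assumes inclusive index bounds, as in the usual mathematical notation; the names `progressive_sum` and `progressive_product` and the sample array are invented for the example.

```python
from math import prod

def progressive_sum(term, first: int, last: int):
    """Sum term(i) for i = first .. last (both bounds inclusive)."""
    return sum(term(i) for i in range(first, last + 1))

def progressive_product(term, first: int, last: int):
    """Multiply term(i) for i = first .. last (both bounds inclusive)."""
    return prod(term(i) for i in range(first, last + 1))

A = [1, 2, 3, 4]
n = len(A)
sum_of_squares = progressive_sum(lambda i: A[i] ** 2, 0, n - 1)
product_of_squares = progressive_product(lambda i: A[i] ** 2, 0, n - 1)

print(sum_of_squares)      # 30
print(product_of_squares)  # 576
```

The term is passed as a function of the index, so the same two helpers cover any expression under the summation or product sign.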

7. Group Mark.

About the space. There were times when it was a sign with its own position and standard dimensions, to which all letters were fitted. But now it is quite possible to have not a sign but an amount of unoccupied space, with a minimum value for each font and font size.
I would also think about a space that does not allow line breaks or formatting, for writing an expression whose formatting is prohibited. For example, a formula or an example that must look a particular way, where formatting could distort the author's idea by changing the look (and maybe the meaning) of the text. It would also be useful where we now use the “_” sign to denote a single meaning expressed in several words, for example Access_to_function. In essence this is the designation of a single concept or object. Of course, it would be more convenient to use, instead of the underscore, a space which, like a letter, does not allow this sequence of signs to be broken during formatting. That is, I would like to have a space which is really a letter but looks like a space (that is, looks like nothing).
Thus four spaces are proposed. The first performs the function of tabulation. The second serves as a word separator and allows formatting (stretching the space upward from a certain minimum value). The third allows resizing only during editing and does not change during formatting. The fourth is similar to the third but is not a separator. Functionally different spaces can be distinguished by background color.
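Part of this proposal already has a rough counterpart in Unicode: the no-break space (U+00A0) looks like a space but is not a line-break opportunity. The sketch below uses it as a stand-in for the "space that behaves like a letter" and relies on Python's `textwrap`, which breaks lines only at ordinary ASCII whitespace. The constant names and the sample sentence are assumptions; the mapping onto the four proposed spaces is approximate.

```python
import textwrap

TAB = "\t"                   # first kind: tabulation
SPACE = " "                  # second kind: word separator, may be broken or stretched
NO_BREAK_SPACE = "\u00a0"    # stand-in for a space that formatting must not break

# One concept written in several words, without resorting to underscores.
name = NO_BREAK_SPACE.join(["Access", "to", "function"])

# The wrapper may break at the ordinary spaces around the name,
# but never inside it.
lines = textwrap.wrap("call" + SPACE + name + SPACE + "here", width=20)
print(lines)
```

The three words of `name` stay on one line however the surrounding text is wrapped, which is the behavior asked of the underscore replacement.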

Source: https://habr.com/ru/post/21313/

