
What is the main difficulty in formalizing natural language? The fact that we are used to formalizing it by means of that same language, which leads to a bad infinity, an endless regress. Language is itself a means of formalization, one that mankind has long and unsuccessfully used.
We take the first definition:
Flight - an independent movement of an object in a gaseous medium or vacuum.
It contains six terms that, in turn, require definition:
- independent,
- movement,
- object,
- gaseous,
- medium,
- vacuum.
Each of these terms has its own definition, which introduces new terms that require definitions of their own, and so on. In the end it turns out that every term used in a definition has already appeared earlier: we have gone around in a circle. That is hardly what we were hoping for. We have to stop at one of the iterations, but when? What should serve as the stopping criterion? Those are the damned questions.
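The circle described above is easy to demonstrate on a toy dictionary. The sketch below is only an illustration: the dictionary contents and the function name find_cycle are my assumptions, not part of any real dictionary. It follows the terms used in each definition until it meets a term it has already passed through.

```python
# Hypothetical toy dictionary: each term maps to the terms used in its definition.
definitions = {
    "flight": ["movement", "object", "medium"],
    "movement": ["object", "location"],
    "object": ["thing"],
    "thing": ["object"],      # "object" and "thing" define each other
    "medium": ["object"],
    "location": [],           # treated here as needing no definition
}

def find_cycle(term, definitions, path=()):
    """Return the first chain of terms that leads back to itself, or None."""
    if term in path:
        # The term was already visited on this chain: cut out the loop.
        return path[path.index(term):] + (term,)
    for used in definitions.get(term, []):
        cycle = find_cycle(used, definitions, path + (term,))
        if cycle:
            return cycle
    return None

print(find_cycle("flight", definitions))   # ('object', 'thing', 'object')
```

A term with an empty definition list plays the role of a stopping point here, which is exactly the criterion the rest of the text tries to pin down.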
Recall why we need language at all: to reflect the reality around us correctly in the process of communication. That reality, by the way, consists of physical objects. We are unable to perceive anything else: physical objects are what a person perceives in the first place.
From this we can conclude that the chain of definitions should stop at terms denoting physical objects. The principle is: we report what we see.
A logical trap lurks here too: to understand what someone is trying to communicate, the term still has to be defined.
Suppose someone, pointing a finger, exclaims in amazement:
- Hare!
“Hare” requires either knowledge of the subject or a definition, so we find ourselves in the same unenviable position. But the difficulty disappears if we take something elementary, for example, by shouting:
- Something white!
What is white? Not a thing, of course, but a characteristic of a thing: its color. It is a term that requires no definition: white is white. The number of colors the human eye can distinguish is limited, and so, accordingly, is the number of terms that require no definition.
It is commonly held that a person has five senses (sometimes more are named, but that does not matter here):
- sight,
- touch,
- smell,
- hearing,
- taste.
The result of each sense organ's work is a certain sensation, whose values require no definition, since they correspond to indefinable elements of the surrounding reality.
The question is: which elements of the surrounding reality can be characterized simply as a set of five sensations? Elementary objects! The idea is that a complex object, the same “hare”, is decomposed into elementary parts, each of which is characterized by sensations. Putting the parts back together, we get the assembled “hare”: an object with a formal and, most importantly, complete verbal definition.
Let's see how this is possible.
Here is a physical object. Note that only a specific, that is, individual, object can be observed (more precisely, sensed, since an object can be perceived not only by sight but also through the other senses). When I see a hare, it is a very specific hare: this one, not hares in general.
As a rule, an object is individualized by being given a name, but, as with the hare, not always (only when there is a need for it). Thus, depending on the situation, the term "hare" can mean:
- a particular hare,
- the name of the class to which all hares belong.
These nuances must be distinguished, say, by abbreviations of “individual” (we take the first letter, “i”) and “class” (we take the last letter, “s”, because “c” in brackets is associated with the copyright sign):
Hare (i).
Text equivalent: hare, the name of an individual;
Hare (s).
Text equivalent: hare, the name of a class.
If the hare had a unique name, things would be more obvious:
Stepashka (i).
“Stepashka” cannot be the name of a class, but it does require an indication of which class it belongs to. Who knows what else might go by that name? So class membership has to be stated. We use the symbol "∈" for this:
Stepashka (i) ∈ Hare (s).
Now it is established that Stepashka is one of the hares, but not what hares are. As mentioned earlier, the “hare” must be decomposed into component parts, each of which is described by characteristics corresponding to the sensations a person perceives.
This is very difficult, mainly because the component parts of an object are three-dimensional, so it can only be done schematically. But in principle it can be done.
Suppose that the hare consists of a head, a torso, paws and a tail, and that these objects are elementary (in fact, of course, they are not). Then, using the symbol "⊂" to denote that a component belongs to a material whole, we get:
head (s) && torso (s) && 4 * paw (s) && tail (s) ⊂ hare (s).
Text equivalent: a head, a torso, 4 paws and a tail make up a hare.
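To make the two relations concrete, here is a minimal Python sketch of individuals, classes, membership ("∈") and composition ("⊂"). The names WordClass, Individual and is_part_of are my own inventions for the illustration; nothing here is a finished design.

```python
from dataclasses import dataclass, field

@dataclass
class WordClass:
    """A term of type (s): a class, optionally composed of part classes."""
    name: str
    parts: dict = field(default_factory=dict)   # part class name -> count

@dataclass
class Individual:
    """A term of type (i): a named object that belongs to a class."""
    name: str
    member_of: WordClass                        # the "∈" relation

head, torso, paw, tail = (WordClass(n) for n in ("head", "torso", "paw", "tail"))
hare = WordClass("hare", parts={"head": 1, "torso": 1, "paw": 4, "tail": 1})
stepashka = Individual("Stepashka", member_of=hare)

def is_part_of(part: WordClass, whole: WordClass) -> bool:
    """The "⊂" relation: does the part occur in the whole?"""
    return part.name in whole.parts

print(stepashka.member_of.name)   # hare
print(is_part_of(paw, hare))      # True
print(hare.parts["paw"])          # 4
```

Storing the count next to the part name keeps the "4 * paw (s)" multiplier from the formula without needing a separate notation for it.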
Since the objects are assumed to be elementary, characteristics can be specified for them, one for each of the sensations. Because sensations act on a person in combination, characteristics in space and time may also be required.
We get an approximate set of characteristics:
• color,
• shape,
• smell,
• taste,
• surface (the result of touch),
• sound,
• location (a spatial coordinate),
• movement (the difference between two locations),
• moment of time,
• duration (the difference between two moments of time),
• speed (the quotient of movement and duration).
The set, as I said, is approximate: only the characteristics that correspond to sensations are beyond dispute; the rest are open to discussion. For example, it is clear that a person does not perceive time as such: it can be read off the display of a device or judged from the position of the sun in the sky, but not sensed directly. Likewise, location is not given absolutely but relative to other objects.
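The last few characteristics in the list are derived from the others by simple arithmetic, which a short sketch makes explicit. I assume here, purely for illustration, that location is a single coordinate in metres and a moment of time is a number of seconds; the class name Observation is hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Observation:
    location: float   # position along one axis, in metres (an assumption)
    moment: float     # time of the observation, in seconds (an assumption)

def movement(a: Observation, b: Observation) -> float:
    """Difference between two locations."""
    return b.location - a.location

def duration(a: Observation, b: Observation) -> float:
    """Difference between two moments of time."""
    return b.moment - a.moment

def speed(a: Observation, b: Observation) -> float:
    """Quotient of movement and duration."""
    return movement(a, b) / duration(a, b)

first = Observation(location=2.0, moment=0.0)
second = Observation(location=8.0, moment=3.0)
print(movement(first, second), duration(first, second), speed(first, second))  # 6.0 3.0 2.0
```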
Now I will try to characterize the "head":
- shape: round,
- surface: hard.
The remaining characteristics are left undefined.
That is, the head, if we provisionally treat it as an elementary object, is something round and hard. Provisionally, of course, only provisionally. After all, language as a means of formalization also gives approximate results: how, for example, would you describe in words a spot of complex geometric shape? You cannot describe it exactly. So in our provisional example the head is approximately round and approximately hard, and that is that.
If you agree, let's write this in braces:
head (s) {shape: round; surface: hard}.
That is, the specified object has the specified characteristics.
Of course, heads can be not only round but other shapes too: Vovochka, from a hoary Soviet-era joke, has a square head. Nothing prevents us from introducing logical operators into the notation, in particular the operator “or”:
head (s) {shape: round || square; surface: hard}.
But the hare has a round head, not a square one like Vovochka's! Well, never mind the pair of them, we introduce implication:
head (s) {shape: round} if head (s) ⊂ hare (s).
Instead of hares in general we could name a specific hare, Stepashka, thereby setting his individual characteristics:
head (s) {shape: round} if head (s) ⊂ Stepashka (i).
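One possible way to store such records, sketched in Python: alternatives joined by "or" become a set of allowed values, and each "if ... ⊂ ..." condition becomes a rule keyed by the owner of the head. The names head_general, head_rules and head_characteristic are assumptions made for this illustration, and the Vovochka rule is added only to mirror the joke.

```python
# "shape: round || square" becomes a set of alternatives.
head_general = {"shape": {"round", "square"}, "surface": {"hard"}}

# Conditional rules, read as "head {shape: round} if head ⊂ owner".
head_rules = {
    ("hare", "shape"): "round",
    ("Vovochka", "shape"): "square",   # illustrative rule, not from a real dictionary
}

def head_characteristic(name, owner=None):
    """Return the possible values, narrowed down if a rule exists for the owner."""
    specific = head_rules.get((owner, name))
    if specific is not None:
        return {specific}
    return head_general.get(name, set())

print(head_characteristic("shape", owner="hare"))   # {'round'}
print(head_characteristic("surface"))               # {'hard'}
```

Without an owner the lookup falls back to the general record, so asking for the shape of a head in general still yields both alternatives.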
The terms used in the characteristics (“round”, “square”, “hard”, etc.) are indefinable: we sense them directly, so no verbal definition is required.
I denote this type of word by the symbol “a”, from “attribute”, like this:
round (a).
Note that individual objects and classes are nouns (they are entities!), whereas characteristics are adjectives (they are, after all, characteristics!). As far as the correspondence between types and parts of speech goes, everything is in order.
The adjective “round” is an indefinable characteristic, but the adjective “hare-like”, say, corresponds to none of a person's sensations and so does not qualify as an attribute.
Obviously, “hare-like” should be defined through “hare”, which I have already defined (by decomposing the “hare” into its component parts). That is, the noun “hare” appeared first, and the adjective “hare-like” was then derived from it, meaning: relating to a hare, similar to a hare.
We get a new type, denoted by the symbol “d”, from “dependence”. Specifying the type is, of course, not enough: a reference to the parent term is needed. We introduce the symbol "=>" to denote dependence:
hare-like (d) => hare (s).
Now the term "hare-like" is defined through its parent noun "hare".
We have defined a dependent adjective through its parent noun. It can also be the other way round, with a dependent noun formed from a parent adjective. For example, “square” as an adjective denotes the shape of an object. In light of the above, it is clear that the noun “square” derives from the adjective “square”, and not the adjective from the noun:
square (d) => square (a).
Thus, in each group of terms sharing a root there is a parent term from which all the others derive.
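Finding the parent term of a group amounts to following the "=>" links until there is nowhere left to go. A small sketch, assuming terms are stored as (word, type) pairs; the label "hare-like" for the adjective derived from "hare" and the function name root_term are my own choices for the example.

```python
# Dependent term -> parent term, i.e. the "=>" relation.
parents = {
    ("hare-like", "d"): ("hare", "s"),
    ("square", "d"): ("square", "a"),
}

def root_term(term):
    """Follow "=>" links until a term with no parent is reached."""
    seen = set()
    while term in parents:
        if term in seen:
            # Guard against a dictionary that accidentally loops.
            raise ValueError(f"circular dependency at {term}")
        seen.add(term)
        term = parents[term]
    return term

print(root_term(("hare-like", "d")))   # ('hare', 's')
print(root_term(("square", "d")))      # ('square', 'a')
print(root_term(("round", "a")))       # ('round', 'a')
```

Keeping the type letter in the key is what lets the noun "square" and the adjective "square" coexist as separate entries.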
Have I now managed to derive all terms from the original undefined ones? Not yet: a significant group of terms remains uncovered, namely the concepts that can be derived by means of formulas.
Take a verb, for example “move”: we have not dealt with verbs yet. What is "move"? I will use not an academic definition but one that, from my point of view, reflects the essence of the matter:
“Move” is when an object changes its location under the influence of another object.
The formula is:
X (i) 1 # move (f) X (i) 3 {move: nonzero (a)}.
I hasten to give the necessary explanations.
The formula consists of three parts, denoting the subject, the action and the object:
- X (i) 1 is the subject. "X" means any subject, "(i)" says that it is an individual, and 1 is its sequence number.
- # move (f) is the action. "f" stands for “formula”. The hash sign marks the word being defined (in this example it is redundant, but it could be needed when a specific subject or object is named).
- X (i) 3 is the object. Everything else is the same as for the subject.
The curly brackets give the characteristic that has changed as a result of the subject's action.
The rules are flexible, and new concepts are easily built with them. We take the general unfilled structure (subject - action - object):
X (i) 1 X (f) 2 X (i) 3.
The necessary elements are replaced with specific terms, characteristics are specified, the element being defined is marked with a hash sign, and logical operators are used where needed.
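The fill-in-the-template procedure can be sketched as a tiny data structure plus a renderer that prints the notation back out. Slot and render_formula are hypothetical names, and the rendering rules (placeholders carry a sequence number, named terms do not) are my reading of the examples above rather than a fixed grammar.

```python
from dataclasses import dataclass, field

@dataclass
class Slot:
    term: str = "X"          # "X" is the unfilled placeholder
    kind: str = "i"          # one of i, s, a, d, f
    defined: bool = False    # True for the element marked with "#"
    characteristics: dict = field(default_factory=dict)

    def render(self, index: int) -> str:
        mark = "# " if self.defined else ""
        text = f"{mark}{self.term} ({self.kind})"
        if self.term == "X":
            text += f" {index}"          # placeholders keep their sequence number
        if self.characteristics:
            inner = "; ".join(f"{k}: {v}" for k, v in self.characteristics.items())
            text += " {" + inner + "}"
        return text

def render_formula(subject: Slot, action: Slot, obj: Slot) -> str:
    slots = (subject, action, obj)
    return " ".join(s.render(i) for i, s in enumerate(slots, start=1)) + "."

template = render_formula(Slot(), Slot(kind="f"), Slot())
move = render_formula(
    Slot(),
    Slot(term="move", kind="f", defined=True),
    Slot(characteristics={"move": "nonzero (a)"}),
)
print(template)   # X (i) 1 X (f) 2 X (i) 3.
print(move)       # X (i) 1 # move (f) X (i) 3 {move: nonzero (a)}.
```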
Let's practice a little, for example on adverbs, which can also be expressed as formula concepts.
Take the adverb "carefully", which in my view is the parent term in its group of same-root words ("careful", "carefully", "take care of"). The word denotes a characteristic, not of an object but of an action. I will give this deliberately primitive definition:
“Carefully” is when someone moves a thing slowly.
Things have been defined; "slowly" depends on "slow", which characterizes objects with respect to speed.
slowly (d) => slow (a).
And the term "move" has already been dealt with. So we have everything needed to define the term “carefully”:
X (i) 1 move (f) {# carefully (f)} X (i) 3 {speed: slow (a)}.
Here “carefully” is defined through “move” and “slowly” and, like any adverb, refers to an action.
By such rules, new formula concepts can be defined from those already obtained, and so on, using implications and possibly other logical devices. The more complex the abstract concept, the more complex and deeply nested its formula will be. We will be able to obtain a formal definition of any term; how accurate it turns out to be depends on us.
Naturally, the proposed language can be extended; there is more than enough room for it. For example, a notation for synonyms suggests itself:
hippopotamus (s) = hippo (s).
Nothing has been said about other parts of speech: those used to give a sentence emotional coloring (interjections) or for various technical needs (conjunctions).
And who knows what else! What matters, however, is the direction of thought; the syntax of such a language is a purely secondary, practical matter.
To summarize.
We have the following types of words:
- i - individual objects: defined by membership in a class; they are nouns;
- s - classes: defined by decomposition into components, down to the level of elementary classes. Elementary classes are defined by characteristics. Both kinds are nouns;
- a - characteristics of objects and classes; they are adjectives;
- d - dependent terms: formed from a parent term; may be any part of speech;
- f - formula concepts: they are nouns, verbs, adjectives or adverbs.
And the following sequence of word formation:
- At the lowest level are the characteristics of elementary objects and, through them, of classes: red, hard, round, etc.
- A combination of basic characteristics makes it possible to give an elementary object a name: for example, all round, red objects that grow on trees can be called apples. The result is a term that can denote both a class (apples in general) and an individual object (this particular apple).
- Having individual objects allows us to give them unique names (the hare Stepashka).
- The initial terms are coined arbitrarily, as the need arises (this animal was named a hare, but it could just as well have been a “rebbit”; nothing would change).
- From the initial terms, dependent terms are formed, which as a rule are other parts of speech.
- On the basis of terms that already have definitions, formulas can be composed to define further terms, with complex logical conditions.
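The sequence above implies a check that can be run over any such dictionary: every term should reduce, sooner or later, to attributes of type (a), with no gaps and no circles. The sketch below uses an abridged, hypothetical lexicon (a hare here has only a head) and a function name, is_grounded, that I made up for the purpose.

```python
# Hypothetical mini-dictionary: term -> (type, terms it is defined through).
lexicon = {
    "round": ("a", []),
    "hard": ("a", []),
    "slow": ("a", []),
    "head": ("s", ["round", "hard"]),
    "hare": ("s", ["head"]),
    "Stepashka": ("i", ["hare"]),
    "slowly": ("d", ["slow"]),
    "move": ("f", ["location"]),   # "location" is missing on purpose
}

def is_grounded(term, visiting=frozenset()):
    """True if the term reduces to attributes (a) without gaps or cycles."""
    if term not in lexicon or term in visiting:
        return False
    kind, used = lexicon[term]
    if kind == "a":
        return True                # attributes need no definition
    return bool(used) and all(is_grounded(u, visiting | {term}) for u in used)

print([t for t in lexicon if not is_grounded(t)])   # ['move']
```

The one term reported is the formula concept whose definition mentions a word nobody has entered yet, which is the kind of gap such a check is meant to expose.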
Readers are probably already asking: what is all this for?
I ran into this problem while building a chatbot. The testers I managed to recruit (just a few people!) were, in my opinion, all equally crazy: they asked questions and wanted answers to them. How naive! As if they did not know that before questions can be asked, the information has to be entered into the database. But even with that obstacle overcome, it proved very hard, given the variability of human speech, to foresee the form a question would take.
It costs nothing to enter a text into the database:
"Birds fly".
Then a question about it can be answered. It is much harder to expect an answer to a more convoluted question that is essentially a variation of the first:
"Do cockatoos flap their wings when flying?"
For this you need to know many of the relationships between words, namely:
- cockatoos belong to the class of birds;
- “fly” and “flight” are words with the same root;
- flying is done with the help of wings;
- wings are flapped.
In principle, the first two points can be loaded into a chatbot from dictionaries (although, as I discovered with inconsolable sadness, such dictionaries are all but impossible to find). But there is simply nowhere to get the last two points from. In our heads the information is stored in just the form required, but it cannot be pulled out of there. And dictionaries with the required content do not exist, practically by definition: the meager ones that occasionally turn up offer verbal formulations, whereas we need strict formal ones.
So I asked myself: how can speech be formalized so that, starting from undefined terms, all the remaining terms can be defined and a full-fledged dictionary of lexical links can be built? If this succeeds, the chatbot will be able to answer the question of whether a cockatoo flaps its wings in flight.
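To show what such a dictionary of lexical links would buy, here is a deliberately naive sketch of answering the cockatoo question from four stored links. The triple format, the relation names (member_of, can, derived_from, uses, is_moved_by) and the function names are all assumptions of mine; a real chatbot would need far more than this.

```python
# Hypothetical lexical links, written as (subject, relation, object) triples.
links = {
    ("cockatoo", "member_of", "bird"),
    ("bird", "can", "fly"),
    ("flight", "derived_from", "fly"),
    ("fly", "uses", "wing"),
    ("wing", "is_moved_by", "flap"),
}

def classes_of(term):
    """The term itself plus every class reachable through member_of links."""
    found, queue = {term}, [term]
    while queue:
        current = queue.pop()
        for s, r, o in links:
            if s == current and r == "member_of" and o not in found:
                found.add(o)
                queue.append(o)
    return found

def root(word):
    """Map a same-root word to its parent term, e.g. "flight" -> "fly"."""
    for s, r, o in links:
        if s == word and r == "derived_from":
            return o
    return word

def flaps_during(term, activity):
    """Does the term flap its wings during the given activity?"""
    verb = root(activity)
    able = any((c, "can", verb) in links for c in classes_of(term))
    return (able
            and (verb, "uses", "wing") in links
            and ("wing", "is_moved_by", "flap") in links)

print(flaps_during("cockatoo", "flight"))   # True
print(flaps_during("hare", "flight"))       # False
```

Each of the four points from the list maps onto exactly one kind of link, so the answer falls out of lookups alone once the links exist.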
Pointing the way is all I can manage at my current stage of life. Nor am I sure that the ideas expressed here are entirely original: attempts to formalize speech in the course of AI development have been made, and certainly more than once. However, the point of my proposal is not simply to fill a database with phrases in a natural or artificial language (filling the database with content is outside the topic under discussion), but to define every subsequent term from a limited number of undefined concepts. I know of no attempts to implement this idea.
That is all, really.
[joke]
Could you tell me, is the Nobel Prize paid by check, bank transfer or cash?
[/joke]