Concepts: set, type, attribute

Mathematicians rarely bother to explain in everyday language what, say, a real number is. A layperson finds it hard to read a mathematician's notation because the meaning of the symbols is not clear. As a result, there is a gap between theory and practice. Mathematical theory knows very well what object types and attributes are, but in practice few practitioners understand them. There are many intuitive notions, but each of them is more like religious dogma than knowledge. In this article I try to close the gap between mathematicians and applied specialists by explaining the basics of set theory in simple language, without complicated notation. For example, are you familiar with a definition of an attribute? I struggled with this myself because I could not find a formal definition. Only later did Igor Katrichek send me a link to E. Kindler's book “Modeling Languages” (1979, translation 1985), which defines the attribute:

In this article I will give my own, more general definition of an attribute, one that is easy to picture.

In the previous article, “Structural Modeling. Requirements for the modeler”, I said that several objects conceived as a whole exist in our minds, but we are not clearly aware of this. Mathematicians realized it and made it explicit by introducing the concept of a set. I also recalled that the concept of a set and the concept of an object are axiomatic: they cannot be derived from other concepts. The concept of an object is familiar to us, and we have enough experience working with objects. Sets, however, we meet only at university while studying the foundations of mathematics, and our idea of them is less obvious. For those looking for a clearer way to picture a set, I pointed out where a good image can be found: in the representation of structures. In this article I continue the story about sets and explain what a type and an attribute are from the point of view of set theory. Most importantly, I explain how these concepts are reflected in the models we build.

Sets in mathematics and physics

We perceive the world either as space or as time, but we cannot picture both at once. This imposes limitations on the language we use and on the models we build.
For example, a mathematical set does not exist in time, and neither do operations on it. This means one cannot say that the composition of a set changes over time.

To me this seems a counterintuitive and non-obvious requirement, but without it we could not perform operations on sets or compare them. It means that if we want to describe the set of grains of sand in a heap, we have two options: introduce a new set for each new composition of grains, or consider the set of temporal parts of the grains in the heap under study. A temporal part of a grain of sand is the part of the grain bounded in time. It has two attributes, a beginning and an end, which model the presence of that grain in the heap. Such a set of temporal parts is also called a 4D representation, built in the 4D paradigm. The composition of the heap at a specific moment is obtained from this set by a time slice: select only those temporal parts that are “current” at the given moment, that is, those that appeared in the heap before the chosen moment and left it after.
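The time slice described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `TemporalPart`, `HEAP_4D`, and `time_slice`, the sample grains, and the use of plain numbers for time are all hypothetical and not taken from any 4D modeling standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TemporalPart:
    """A temporal part of a grain of sand: its stay in the heap (hypothetical model)."""
    grain: str
    begin: float  # moment the grain entered the heap
    end: float    # moment the grain left the heap


# The set of temporal parts is fixed: it does not change over time.
HEAP_4D = frozenset({
    TemporalPart("g1", 0.0, 10.0),
    TemporalPart("g2", 3.0, 5.0),
    TemporalPart("g3", 6.0, 12.0),
})


def time_slice(parts, t):
    """Return the grains whose temporal part began before t and ended after t."""
    return {p.grain for p in parts if p.begin < t < p.end}


slice_at_4 = time_slice(HEAP_4D, 4.0)
slice_at_7 = time_slice(HEAP_4D, 7.0)
```

Note that `HEAP_4D` itself never changes. Only the moment passed to `time_slice` varies, which is how a fixed set can describe a heap whose composition changes.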

This is how the composition of real “physical” sets is modeled. For the current article, however, such a representation would be too complicated, so I return to the usual representation of simple sets “frozen” in time, that is, sets that exist “outside of time”.

Determining the composition of the set

A set is a multitude conceived as a whole, and that multitude is the composition of the set. Let us consider how the composition of a set can be defined. As we know, it can be defined in two ways:

By directly listing the objects that belong to the set.
By giving a rule for identifying the objects that belong to the set.

For example, suppose there are two objects in the room, among other objects: a white plate and a green marker.

The set A consisting of these two elements can be specified by enumeration: the white plate is included in set A and the green marker is included in set A. Nothing else in the room is included in set A.
You can do otherwise. You can stick a yellow sticker to the plate and the marker and make sure that there are no other stickers in the room. Then we can say that those and only those objects in this room that have a yellow sticker are part of set A.

The first way of defining the composition is enumeration.
The second way is specifying an identification condition.

During the discussion of the last article, I realized that not everyone clearly understands the difference between these two ways of determining the composition of a set. Therefore, I will tell you more about them.

The first method is based on a series of statements:

The plate is part of the set A
The marker is part of set A
Nothing else is part of set A

The second method is a statement in terms of a predicate:

Those and only those objects in the room that have a yellow sticker are part of set A.

In the first method, object models of any kind can take part in the description of the composition. In the second method, the object models must share a common attribute whose value determines whether an object belongs to the set. That is, if the object models have no common attribute, it is impossible to build an identification condition.
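To make the difference concrete, here is a minimal Python sketch of both methods for the plate and the marker. The dictionary-based object models, the `sticker` attribute name, and the extra chair are assumptions made for illustration only.

```python
# Hypothetical object models for things in the room.
# Every model carries the common attribute "sticker" (None means no sticker).
room = [
    {"name": "white plate", "sticker": "yellow"},
    {"name": "green marker", "sticker": "yellow"},
    {"name": "chair", "sticker": None},
]

# Method 1: enumeration. A series of statements "X is part of set A";
# no common attribute is required.
set_a_by_listing = frozenset({"white plate", "green marker"})


# Method 2: identification condition. A predicate over the common attribute.
def has_yellow_sticker(model):
    return model["sticker"] == "yellow"


set_a_by_rule = frozenset(m["name"] for m in room if has_yellow_sticker(m))
```

Both definitions give the same composition here, but only the second one can decide membership for an object that was never listed, provided that object has the `sticker` attribute.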

During the discussion of the article, it was suggested that membership established by enumeration could itself be made an attribute: “is included in set A”. The objects included in set A would then have the value “yes” for this attribute. It was then proposed to use this attribute as the identification condition for the same set A: those objects whose value is “yes” are part of set A. The author of this idea did not notice that these two statements together yield two tautologies:

Set A includes those and only those objects that are part of set A, and

An object is part of set A if and only if it is part of set A.

These obvious statements carry no information about specific objects or about set A. If I pick up a plate, these statements will not let me determine whether it belongs to set A.

Therefore, the enumeration and the rule are two fundamentally different ways of describing the composition of a set, and in mathematics they are listed as two basic and completely different ways of determining the composition of a set.

By the way, there was once a long dispute about how to define a function. It arose because no one could agree on which identification rules to consider valid and which not. In the end, Dirichlet's idea was accepted: any rule is considered valid. That is why I do not aim to classify all rules and will consider only the few that we need here.

In textbooks, the identification rule is often called the selection rule. The term “selection rule” is misleading because it implies some sort of selection operation, which hints that elements can be added to the set. They cannot: a set has a fixed composition. Therefore, it is better to speak not of selection but of identification. We do not select elements into a set, we identify them as elements of the set.

Determining the composition of a subset

Let's see how we can define the composition of the set of African elephants. I counted four different ways to do this.

You can define them by listing.
You can put a sticker on some elephants and say that the elephants with a sticker are considered African. This defines the composition of the set through an attribute: the presence or absence of a sticker.
You can determine the composition through the intersection of two sets: the set of elephants and the set of animals living in Africa.
You can introduce the concept of “African elephant”.

Using OWL in our work, we can implement the first three of the methods listed above for defining a subset:

Explicitly list the objects included in the subset.
Define an identification rule through any conditions on any attributes, using different operations: from the mere fact that an attribute has a value to the value falling within a certain range.
Apply set operations to other sets: for example, set A includes only those objects that are part of set B and are not part of set C.
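The three methods above can be sketched with plain Python sets. This is not OWL syntax, only an illustration. The animal records, the identifiers, and the `weight_kg` attribute with its range are made-up assumptions.

```python
# Hypothetical object models; identifiers and attribute values are made up.
animals = [
    {"id": "e1", "species": "elephant", "habitat": "Africa", "weight_kg": 5200},
    {"id": "e2", "species": "elephant", "habitat": "Asia", "weight_kg": 4100},
    {"id": "z1", "species": "zebra", "habitat": "Africa", "weight_kg": 350},
]

# 1. Explicit listing of the members.
by_listing = frozenset({"e1"})

# 2. Identification rule: here, an attribute value falling within a range.
heavy = frozenset(a["id"] for a in animals if 4000 <= a["weight_kg"] <= 6000)

# 3. Operations on other sets.
elephants = frozenset(a["id"] for a in animals if a["species"] == "elephant")
african = frozenset(a["id"] for a in animals if a["habitat"] == "Africa")
african_elephants = elephants & african       # part of both sets
non_african_elephants = elephants - african   # part of B and not part of C
```

In this sample data the listing and the intersection happen to define the same subset, which is the point of the elephant example: different ways of defining the composition can lead to the same set.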

To understand whether we can implement the fourth method of identification using the type of objects, consider it in more detail.

Modeling a subset using a type

To define the type “African elephant” we need:

A group of objects from which we select the objects of the subtype. In this case the group has a name: elephants.
A unique property by which objects of the type differ from other objects of the group: they live in Africa.
A unique name for objects of this type.

Alternatively, you can take the animals living in Africa as the group. Then the unique property that distinguishes African elephants from other African animals is that they are elephants.

In sum, to define a type it is necessary to:

Specify the superset of objects.
Specify the distinctive features (differential properties) that set objects of this type apart from the other objects of the superset.
Specify the name of objects of this type.

Additionally, you can specify:

The reasons why this type of object was needed (differential functional properties of objects of this type)
Benefit from the introduction of this type of objects
Term history
Etc.

Objects of the same type differ from other objects of the superset by some unique property. This unique property can be modeled through any conditions on any attributes. But it is not necessary that all values of all attributes coincide, or that the composition of attributes of all objects of the same type be the same.

Knowing what a type is, you might think that the fourth way to select subsets is the same as the second. However, to determine the type, we need to additionally, at a minimum, specify a specialized name, and as an option, specify other attributes of the type, for example, indicate the reasons for identifying this type of objects, history of the term, and so on. With the second method it is impossible to do this. Therefore, the fourth method is different from the second and is not implemented yet in the modeling standards that I know of.

Concept of type

So, from the point of view of set theory:

A type is a method of extracting a subset from a superset and assigning a new name to the objects of this subset.
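One possible way to picture this definition is a small Python record that bundles the three required parts (superset, identification rule, name) with optional extra information. The class `ObjectType`, its field names, and the sample elephants are hypothetical. This is a sketch of the idea, not an implementation from any modeling standard.

```python
from dataclasses import dataclass, field
from typing import Callable, Sequence


@dataclass
class ObjectType:
    """A type: a superset, a rule identifying a subset, and a name for its objects."""
    name: str                      # name given to objects of this type
    superset: Sequence[dict]       # group the objects are identified in
    rule: Callable[[dict], bool]   # differential property
    extra: dict = field(default_factory=dict)  # optional: rationale, term history, ...

    def members(self):
        """Objects of the superset identified as objects of this type."""
        return [obj for obj in self.superset if self.rule(obj)]


# Hypothetical superset: the group of elephants.
elephants = [
    {"id": "e1", "habitat": "Africa"},
    {"id": "e2", "habitat": "Asia"},
]

african_elephant = ObjectType(
    name="African elephant",
    superset=elephants,
    rule=lambda obj: obj["habitat"] == "Africa",
    extra={"term_history": "(not modeled in this sketch)"},
)

member_ids = [obj["id"] for obj in african_elephant.members()]
```

The rule alone would only reproduce the second way of defining a subset. What makes this a type in the sense used here is that the rule travels together with a name and other information about the type.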

If there is no superset, the type is considered axiomatic, non-derivable. As I said earlier, the concept of an object and the concept of a set are non-derivable because it is impossible to specify a superset for them.

The difference between a type of objects and a set of objects

From the discussions of the article, I realized that there are people who believe that the type of objects and the set of objects are either related concepts or the same thing. I will try to explain why this is not so. A type is both a rule for identifying objects and the name of these objects. That is, the type simultaneously serves the specialization (or selection) of a subset from the set, and gives a new name to specialized objects.

Each type determines the composition of a set, but not each set has a type that defines its composition, for example, when we speak of a set whose composition is given by an enumeration of its elements, or when we speak of a set whose elements have no name.

It is clear that the rule specifying the set is not the set itself.

It seems to me that from all this, it is clear how the concept “type of objects” differs from the concept “set of objects”.

Modeling objects of the same type

Information systems (IS) often model objects of the same type using models that contain the same set of attributes. You can now see that this restriction is redundant, since objects of the same type can have different sets of attributes. The restriction stems from technical features of the implementation, not from the requirements of the domain.

In an IS, the list of objects of the same type keeps growing. This suggests that the sets we model have a variable composition. They do not. The list of objects registered in the IS is not an exhaustive list of the set's elements: the IS stores models not of all elements of the set, but only of those registered so far. Therefore, a query means this: give me all objects of this type that are currently registered in the IS.

Object life cycle

In addition to the fact that an object can be referred to a certain type of objects, there are two more points about which we should not forget:

Classification (the assignment of an object to a particular class, or type of objects) is always subjective. The same object from a different point of view may look different. If we build an extensible domain model, the use of which implies the presence of different stakeholders, then it should be possible to model the context and different points of view. In this case, from different points of view, the same object can be assigned to different types.
Accounting for the life cycle of an object involves taking into account not only changes to the object, but also changes in our perception of it, since alongside synthesis and analysis a process of objectification and de-objectification is going on.

The process of objectification and de-objectification looks like this:

Objectification

Having an idea of types, we try to find objects of these types in the world around us. The objects we find usually belong to the broadest types at first. For example, in an enterprise the objects found at the first step may be classified as operations, functions, and objects. With plants, we first divide them into trees, grasses, and shrubs. Next, the type of the object is refined by testing various hypotheses. During refinement we try to find a type that tells us enough about the object for this knowledge to be used effectively in practice (that is, a narrower type to which the object can be assigned). As the type is refined, the object model acquires new details. In parallel, we use our knowledge of the object in practice. If applying this knowledge is successful, the object is considered correctly identified and correctly classified (its type was chosen correctly).

De-objectification

However, everything changes: our ideas about the world change, new knowledge appears, and so on. As a result, the object model may cease to satisfy the requirement of usefulness, and then the object's overly narrow specialization becomes its own enemy. The object is reclassified (for instance, an application becomes a requirement), and sometimes it is eliminated altogether, as happened with the ether and caloric. Then the cycle begins anew: objects are identified, knowledge about them is refined, and so on.

Examples from practice:

Objectification:

Suppose a client comes to submit an application. Until the application is fulfilled, we can know its type only with some probability. Therefore, an application of the broadest type is registered first. Then, as details are clarified and the application is processed, its model acquires new attributes. After some time it becomes clear which type the application belongs to, and its final classification takes place.

De-objectification:

Suppose we have a typical scenario for searching for information on the Internet. It says that whenever you need to find information, you should use a certain search engine, a program for finding the information you need. We use this program many times, each time performing a search operation. Many such operations accumulate over the program's lifetime, and all of them are classified as “information retrieval” operations. After some time it turns out that the search program performs spyware functions, leaking user data to parties interested in it. The operations performed with this search engine are then reclassified from information retrieval operations into operations of transferring data to interested parties. It may well be that we later learn something else about this program, and then we will have to review other operations in which it took part.

Requirements for a modeler that models types

Let us formulate the requirements for a modeler intended for modeling types:

It must be possible to model objects of the same type whose sets of attributes do not match.
It must be possible to model the rules that assign objects to a type.
It must be possible to model other attributes of the type: the name of objects of this type, the history of this name, and so on.
It must be possible to model different points of view on the same object.
It must be possible to model the life cycle of an object.
It must be possible to model changes in our perception of an object over time.

How can these requirements be implemented in IT without touching the structure of the database? How, without touching the data structure, can we take different points of view into account, add new types of objects, refine the type of an object, and reclassify objects when necessary?

Object Modeling with OWL

OWL has one limitation: it does not distinguish between a set of objects and a type of objects. Because of this, its functionality for modeling object types is limited. Still, it is much wider than what other modeling methods give us, because we have the following possibilities:

Adding a new set of objects to OWL is no different from adding a new object.
You can require that, if the type of an object is known, its model be created with a predefined set of attributes. After creation, attributes can still be added or deleted. Example: when creating an application model, we may require values for the attributes application number, application date, applicant, and addressee. One need only remember that in OWL these attributes exist separately from object types, and one attribute can be used when modeling objects of different sets. This is a fundamental difference from common programming languages, where an attribute exists only within one object type, and an attribute in another type, even with the same name, is a different attribute.
You can also require the opposite: determine the subset to which a modeled object belongs from the attributes of its model and its membership in a superset. A rule then states that if the model of an object belonging to a particular superset contains certain attributes and their values satisfy certain conditions, the object is automatically assigned to a specific subset. Rules thus implement the so-called “duck classification”. For example, if an application model has a value for the “Telephone number” attribute and the value of the “Type of work” attribute is “Connection”, the application is automatically classified as a telephone number connection request.
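The classification rule from the last example can be sketched in plain Python. This is not OWL or a rule language, only an illustration of the logic. The function name `classify_application`, the dictionary-based models, and the sample values are assumptions.

```python
def classify_application(model):
    """Assign an application model to a subset based on its attributes.

    Hypothetical rule from the example: a "Telephone number" value is present
    and "Type of work" equals "Connection".
    """
    has_phone = model.get("Telephone number") is not None
    if has_phone and model.get("Type of work") == "Connection":
        return "Telephone number connection request"
    return "Application"  # stays in the broadest type (the superset)


# Sample application models; attribute values are made up.
request = {
    "Application number": 17,
    "Telephone number": "555-0100",
    "Type of work": "Connection",
}
other = {"Application number": 18, "Type of work": "Repair"}

request_type = classify_application(request)
other_type = classify_application(other)
```

The object is never told its type. The type follows from which attributes the model has and what their values are.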

Dividing a set into subsets

Suppose there is a set of objects, and the task is to divide it into seven subsets, each with its own color: “red objects”, “yellow objects”, and so on.

The division of a set into subsets can be done in different ways.

You can divide a set into disjoint subsets by distributing objects into subsets by enumerating them. Create seven subsets and list the objects that belong to each of the subsets.
For each subset, you can create your own subtype. Then the whole set can be divided into seven subsets by entering seven subtypes: “Type of red objects”, “Type of yellow objects”, etc. Each object can be attributed to one of the listed types and say, for example, like this: the object refers to the type of red objects.
You can divide the superset using an attribute and its values. For example, you can introduce the attribute “Color” with seven values: “Red”, “Yellow”, and so on. The name of the color then becomes an adjective for the object: a red object, a yellow object, and so on.

The first method in OWL is implemented by creating seven different classes and specifying the objects that belong to them.

The second method can be implemented in three different ways:

By creating separate subsets, united by one type, but the types themselves, as I said earlier, are not modeled. This method is no different from the separation method of listing.
With the help of a reference list “Types of color objects”, whose values are objects that model the types: “Red objects”, “Yellow objects”, etc.
With the help of an attribute with the name “object type”, the values of which will have a text form: “Type of red objects”, “Type of yellow objects”, etc.

In an information system, the third way of dividing a set into subsets is modeled in two ways:

With the help of a reference list “Colors”, whose values are objects that model the attribute values: red, yellow, etc.
With the help of an attribute named “Color”, whose values are text: “red”, “yellow”, and so on.

It can be seen that division by types and division by attributes are modeled in the same way in two cases and differ only in name. Indeed, in OWL the possession of an attribute value is modeled by the following triple:

#object #attribute #value

Membership in a class is modeled like this:

#object rdf:type #class

In other words, class membership is simply expressed with a special service attribute defined in the standard: rdf:type.
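The similarity between the two kinds of triples can be shown with plain Python tuples. This sketch does not use any RDF library, and the identifiers `#plate`, `#color`, `#white`, and `#Tableware` are made up for illustration.

```python
RDF_TYPE = "rdf:type"  # the service attribute expressing class membership

# Hypothetical triples in the form (object, attribute, value).
triples = {
    ("#plate", "#color", "#white"),       # possession of an attribute value
    ("#plate", RDF_TYPE, "#Tableware"),   # membership in a class
}


def objects_with(store, attribute, value):
    """Return all objects that have the given value for the given attribute."""
    return {s for (s, p, o) in store if p == attribute and o == value}


white_things = objects_with(triples, "#color", "#white")
tableware = objects_with(triples, RDF_TYPE, "#Tableware")
```

The same query function answers both questions, which illustrates why class membership can be treated as one more attribute.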

The concept of an attribute

Let us formulate a definition:

An attribute is a way of separating a set of objects into subsets. At the same time, each attribute value corresponds to a certain subset whose objects have an attribute with such a value.
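Read this way, an attribute induces a partition of the set: grouping objects by attribute value yields one subset per value. A minimal Python sketch follows. The function name `partition_by` and the sample objects are assumptions.

```python
from collections import defaultdict


def partition_by(objects, attribute):
    """Group object ids into subsets, one subset per value of the attribute."""
    subsets = defaultdict(set)
    for obj in objects:
        subsets[obj[attribute]].add(obj["id"])
    return dict(subsets)


# Hypothetical objects that all have the attribute "Color".
objects = [
    {"id": "a", "Color": "Red"},
    {"id": "b", "Color": "Yellow"},
    {"id": "c", "Color": "Red"},
]

by_color = partition_by(objects, "Color")
```

Each key of `by_color` is an attribute value, and the set under it is the subset that the value points to.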

Modeling Subsets with the Attribute

Each of the three previously listed methods for modeling subsets has its advantages and disadvantages, depending on the context and the chosen implementation.

If there are only a few subsets, you can choose any of the listed methods of division and any implementation.

If there are many subsets (in the limit, infinitely many, for example when each subset groups objects of the same length), then formally there remain:

the third way of modeling a type, and
the second way of modeling an attribute.

However, I wrote earlier that each type should be given a name. If there are very many (infinitely many) subsets, it is unrealistic to name each of them. Therefore, we do not model such a division using types. We model it only with an attribute whose value domain is one of the most common sets: the set of real numbers, the set that models the time scale, the set of natural numbers, the set of strings of finite length, and so on. Do you recognize the data types?

How a function is introduced on a set of subsets, and more besides, you can read here.

Implementing an attribute with text values is good because it can model a huge number of subsets (there are very many ways to write a string), but it is bad because it is unclear how to tell that objects belong to the same subset: are “Red”, “red”, and “Re)d_” values denoting the same subset or different ones?
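The problem with text values can be made concrete in Python. Compared as raw strings, three spellings give three different subsets. The `normalize` function below is only one assumed convention (lowercase, Latin letters only), not something the article prescribes, and whether the spellings should be merged at all is a modeling decision.

```python
import re


def normalize(value):
    """One possible, assumed normalization: lowercase, keep only letters a-z."""
    return re.sub(r"[^a-z]", "", value.lower())


# Hypothetical spellings of the same color entered as free text.
raw_values = ["Red", "red", "Re)d_"]

distinct_raw = set(raw_values)                              # compared as written
distinct_normalized = {normalize(v) for v in raw_values}    # compared after normalization
```

Without an agreed rule of this kind, nothing in the text values themselves says whether they denote one subset or three.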

There is a vast literature on how best to model subsets, and I will not repeat it here. Just remember that an attribute is a model of subsets, and a value is a pointer to a subset.

Source: https://habr.com/ru/post/330196/
