
In this post I will try to explain the key points in plain terms and make the case for the RDF model.
The concept of the Semantic Web, of which RDF is a part, has been evolving for more than 10 years. It has been the subject of controversy and discussion, and today it is increasingly supported by the community and used in applications.
However, for many it is still not at all clear:
- Why all this?
- How to work with it?
- What will it give me?
Most of you have seen, at least in passing, the famous Semantic Web "layer cake" diagram.

There are so many specifications, technologies, and concepts that your eyes glaze over. The lowest layer is as old as the world. Academics are still fighting over the top layer, trying to find a simple and universal way to teach applications to assess how far statements obtained from the network can be trusted. Ordinary developers need not worry about that yet and can wait another 5 years. The people at the W3C tirelessly polish the standards just above the middle. Some are already polished, such as RDF, and plenty of tools have been written for them on all major platforms and in all major languages, so you can start using them right away. These are the ones you most often meet in real applications, first of all the RDF model.
It would also be very good to understand why we need ontologies, what they give us, and how to work with them, but that is a topic for another time.
So, let's try to figure out what we get from using the model:
- Logical inference of new facts
- Support for semantic search
- Data model flexibility
- Extreme ease of data exchange between systems
If you have never studied formal logic, just remember this: from formally described facts you can automatically derive new facts that were not stated explicitly. This topic deserves separate attention, so we will not go into it here.
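Without going into the theory, a toy sketch can at least show the flavor of deriving facts that were never written down. It assumes facts are stored as plain (subject, predicate, object) tuples and applies a single subclass rule until nothing new appears. The names `vasya`, `HabrUser`, `Person`, `Agent` and the function `infer_types` are made up for illustration; real reasoners work with full vocabularies such as RDFS and OWL.

```python
# Facts are (subject, predicate, object) tuples; all names are illustrative.
facts = {
    ("vasya", "type", "HabrUser"),
    ("HabrUser", "subClassOf", "Person"),
    ("Person", "subClassOf", "Agent"),
}


def infer_types(facts):
    """Apply one rule until no new facts appear:
    (x type C) and (C subClassOf D)  =>  (x type D)."""
    known = set(facts)
    while True:
        new = {
            (x, "type", d)
            for (x, p1, c) in known if p1 == "type"
            for (c2, p2, d) in known if p2 == "subClassOf" and c2 == c
        } - known
        if not new:
            return known
        known |= new


# Only the facts that were derived, not the ones we stated ourselves.
inferred = infer_types(facts) - facts
print(sorted(inferred))
```

Nobody stated that `vasya` is a `Person` or an `Agent`; both facts follow from the three statements we started with.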
Why semantic search when there is Google?
Yes, you can type in "list of Habr users", and you will get anything but the set of users you wanted. Why? Because Google looks for the words of the query in the text of documents, and it returns documents, not facts.
If instead it understood that we want the objects “user” of the resource “Habr”, and formal descriptions of those objects were available for indexing in the RDF model (for example, as RDFa markup on the page), it could return exactly the set of objects we were looking for.
Many will object: “I can look through a couple of links, make a couple of refining queries, and find what I need, so why all this?” The answer is the same reason we no longer use paper card files and prefer to type a few keywords into a search box: getting the information immediately is obviously better. We go to work by car and not on horseback for the same reason: it is more convenient.
To the question “How does RDF provide semantic search?” the answer is: the RDF model provides formal descriptions, and where there is a formal description, a search agent can search for facts and knowledge.
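To make "searching for facts" a little more concrete, here is a minimal sketch that assumes the formal descriptions have already been collected as (subject, predicate, object) tuples. The prefixed names (`ex:User`, `ex:memberOf`, `ex:Habr`) and the helper functions `match` and `users_of` are hypothetical. A real system would more likely use a triple store and a query language such as SPARQL.

```python
# Illustrative facts; the vocabulary (ex:User, ex:memberOf, ...) is made up.
facts = [
    ("http://example.org/people#vasya", "rdf:type", "ex:User"),
    ("http://example.org/people#vasya", "ex:memberOf", "ex:Habr"),
    ("http://example.org/people#petya", "rdf:type", "ex:User"),
    ("http://example.org/people#petya", "ex:memberOf", "ex:OtherSite"),
    ("http://example.org/posts#1", "rdf:type", "ex:Post"),
]


def match(facts, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (fs, fp, fo)
        for (fs, fp, fo) in facts
        if s in (None, fs) and p in (None, fp) and o in (None, fo)
    ]


def users_of(facts, site):
    """Find subjects that are users AND members of the given site."""
    users = {s for (s, _, _) in match(facts, p="rdf:type", o="ex:User")}
    members = {s for (s, _, _) in match(facts, p="ex:memberOf", o=site)}
    return sorted(users & members)


habr_users = users_of(facts, "ex:Habr")
print(habr_users)
```

The result is a list of objects, not a list of documents that happen to contain the words "user" and "Habr".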
Google does not search this way today, so why should I bother with it now? First, to get the advantages described below, and second, so as not to be late: our industry is “be quick or be dead”.
Let's talk in more detail about the other two benefits; I think they are much easier to understand. But first, let's clarify a couple more points.
What is the RDF model?
The first thing to understand is that RDF is a model: abstract, very simple, a bit "in a vacuum". It is just a directed graph with a few additions and caveats. It can be written down in different ways, though. The choice usually falls on one of N3, N-Triples, Turtle, RDF/XML, or RDFa, and you will have to study the specification of whichever you use.
What is described: with RDF you can describe documents, individual pieces of knowledge inside a document, and objects of the real world, for example a specific living person (at this point some IT people fall into a stupor).
Everything is identified by a URI. A URI looks like an ordinary URL link but is slightly different: for example, you can define a resource that is a real person and give it the URI “http://example.org/people#Vasya Pupkin”.
Yes, the name can be written in Russian, since Unicode is allowed, but what you need to understand is that it is not a URL: you cannot paste it into the browser and get a person. Science has not come that far yet.
Let's try to figure out where the flexibility of the model and the ease of data exchange between systems come from.
Let's see how people communicate:
They write down (formalize) their thoughts in some language, in writing, orally, or in many other ways. The information then circulates through different systems: it is passed from mouth to mouth, stored, aggregated, and so on. In the end it is read by a person who can interpret the language and recover the thought. How exactly the formalized thought travels from one node to another along this chain does not really matter to anyone.
If you do not know Chinese, that will not stop you from doing ctrl-c ctrl-v on Chinese text from one place to another.
RDF works very similarly.
Information needs to be interpreted only when the application formalizes it in RDF and when the agent at the other end reads it. Between these two stages, anyone can process it and exchange it in any way, without necessarily knowing what it means, or while understanding only part of the statements.
For example, an RDF statement (a subject-predicate-object triple):
<http://www.example.org/index.html> <http://www.example.org/terms/creation-date> "August 16, 1999" .
To make it easier to understand, it can be rephrased as:
“http://www.example.org/index.html has the property http://www.example.org/terms/creation-date whose value is August 16, 1999”
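The same statement can be sketched in code. The snippet below is only an illustration: it assumes the URIs carry the `http://` scheme, stores the triple as a Python tuple, writes it out as one N-Triples line, and shows the single place where the application gives the predicate a meaning. The function names `to_ntriples_line` and `interpret` are invented for this example.

```python
from datetime import datetime

# Assumed full form of the predicate URI from the example statement.
CREATION_DATE = "http://www.example.org/terms/creation-date"

# A statement is just (subject, predicate, object).
triple = (
    "http://www.example.org/index.html",
    CREATION_DATE,
    "August 16, 1999",
)


def to_ntriples_line(subject, predicate, literal):
    """Write a statement with a plain-literal object as one N-Triples line."""
    escaped = literal.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{subject}> <{predicate}> "{escaped}" .'


def interpret(triple):
    """Application logic: only here does the predicate get a meaning."""
    subject, predicate, value = triple
    if predicate == CREATION_DATE:
        # This application decides that the value is a calendar date.
        return subject, datetime.strptime(value, "%B %d, %Y").date()
    # Predicates we do not know are passed through uninterpreted.
    return subject, value


line = to_ntriples_line(*triple)
page, created = interpret(triple)
print(line)
print(page, created.isoformat())
```

Anything between the writer and the reader can copy or forward `line` without knowing what `creation-date` means. Only `interpret` needs that knowledge.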
So if I am writing an application, I must describe how to interpret http://www.example.org/terms/creation-date in the application logic. An attentive reader will immediately ask: since everyone creates their own predicates that mean the same thing, do I have to add synonyms to the application logic for every application I want to integrate with?
The answer is NO! First, it is recommended and advocated at every corner to use generally accepted vocabularies wherever possible; for most tasks they have already been developed and are in active use. Second, with ontologies you can make statements such as owl:sameAs, which establish a relationship between two entities, so the needed synonym can be logically inferred and the application does not have to be rewritten. An external service can be used for this purpose.
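A rough sketch of the synonym idea, under stated assumptions: facts are plain tuples, the second site and its predicate `other:created` are hypothetical, and the mapping is resolved only one step deep (no transitivity or symmetry, which a real reasoner would handle). For predicates specifically, OWL also offers `owl:equivalentProperty`; `owl:sameAs` is used here because it is the statement mentioned above.

```python
SAME_AS = "owl:sameAs"

# Data from two systems plus ONE statement linking their vocabularies.
facts = [
    ("http://www.example.org/index.html", "ex:creation-date", "August 16, 1999"),
    ("http://other.example/page", "other:created", "May 1, 2005"),
    ("other:created", SAME_AS, "ex:creation-date"),
]


def canonical_map(facts):
    """Map each synonym to the term it is declared to be the same as."""
    return {s: o for (s, p, o) in facts if p == SAME_AS}


def normalize(facts):
    """Rewrite synonymous predicates to the term the application knows."""
    mapping = canonical_map(facts)
    return [
        (s, mapping.get(p, p), o)
        for (s, p, o) in facts
        if p != SAME_AS
    ]


# The application logic only knows about ex:creation-date.
dates = {s: o for (s, p, o) in normalize(facts) if p == "ex:creation-date"}
print(dates)
```

The application code that reads `ex:creation-date` is untouched; integrating the second system took one extra statement in the data.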
This way we can exchange data with anyone without any extra effort, including integrating with the (n + 1)-th system with no more work than it takes to show our website to the (n + 1)-th user! Nothing needs to be programmed for this. That system will be able to obtain and interpret all the concepts it needs from yours.
Compare that with today: the fields of one system have to be linked explicitly to those of another (often each to each) using XML and XSLT, or fields obtained from an API, and many of us know what a headache that is.
We have come to understand that data can be independent of the model of any particular application, i.e. a set of facts lives on its own. We can add facts, delete them, query them, and interpret them, but they remain logically independent.
This has the following important advantage: ease of changing the model.
Imagine what you have to do in an application that relies on a relational model. For drama, let us imagine this has to be done after launch. Suppose you need to change the data model, for example to associate some new entity with the user object, say an address (which, by the way, is not a single string but has separate fields for house, city, street, etc.). What do you need to do with the database? Right: alter a couple of tables, maybe create new ones, add a relationship between them, change the data access procedures, fix the web service, and everything is ready. And of course you need to change the user interface.
A little inconvenient, don't you think?
How much easier it would be if all you had to do was add a couple of fields to the interface and add one new statement for each new field (and sometimes even this minimum can be avoided if the interface is designed more generically). A couple of lines of code...
With the RDF model underneath, the operation looks exactly like that. After all, everything that is stored is just a huge number of subject-predicate-object statements. So changing the data model is no longer something that spoils your mood the moment you think about it. Isn't that great?
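As a rough illustration of the address example, the sketch below assumes the whole store is an in-memory set of tuples. The vocabulary (`ex:address`, `ex:city`, `ex:street`, `ex:house`), the sample values, and the helpers `add_address` and `describe` are hypothetical. A real application would use a triple store, but the shape of the change is the same: new statements, no schema migration.

```python
# The "database": nothing but subject-predicate-object statements.
store = {
    ("ex:vasya", "rdf:type", "ex:User"),
    ("ex:vasya", "ex:name", "Vasya Pupkin"),
}


def add_address(store, user, address_id, city, street, house):
    """Attach a structured address to a user by adding statements only."""
    store.update({
        (user, "ex:address", address_id),
        (address_id, "ex:city", city),
        (address_id, "ex:street", street),
        (address_id, "ex:house", house),
    })


def describe(store, subject):
    """Collect all properties of one subject into a dict."""
    return {p: o for (s, p, o) in store if s == subject}


# The "model change": one new statement per new field.
add_address(store, "ex:vasya", "ex:vasya-address", "Moscow", "Tverskaya", "1")

user = describe(store, "ex:vasya")
address = describe(store, user["ex:address"])
print(address)
```

Existing statements and the code that reads them are unaffected; code that does not know about `ex:address` simply never asks for it.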