The Semantic Web and Linked Data are like near-Earth space: there is no life there. As for going there for any length of time... I don't know what you were told as a child in response to "I want to become an astronaut." But you can watch what is going on while staying on Earth; becoming an amateur astronomer, or even a professional one, is much easier.
This article discusses fresh trends, no more than a few months old, from the world of RDF stores. The metaphor in the first paragraph was inspired by the epic size of the advertising picture under the cut.
GraphQL is said to aspire to become a universal language for database access. What about using GraphQL to access RDF?
The following provide this capability out of the box:
If a store does not provide it, you can implement it yourself by writing a suitable resolver. That is what was done, for example, in the French project DataTourisme. Or you can write nothing at all and simply take HyperGraphQL.
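To make the idea of a hand-written resolver more concrete, here is a minimal sketch of its core step: turning a flat GraphQL-style field selection into a SPARQL query. This is not how DataTourisme or HyperGraphQL do it. The function name `build_sparql`, the `FIELD_TO_PREDICATE` mapping and the schema.org IRIs are assumptions chosen purely for illustration.

```python
# Hypothetical mapping from GraphQL field names to RDF predicates
# (an assumption for this sketch, not taken from any real schema).
FIELD_TO_PREDICATE = {
    "name": "http://schema.org/name",
    "birthDate": "http://schema.org/birthDate",
}


def build_sparql(type_iri: str, fields: list[str]) -> str:
    """Translate a flat selection such as `{ Person { name birthDate } }`
    into a SPARQL SELECT over resources of the given class."""
    unknown = [f for f in fields if f not in FIELD_TO_PREDICATE]
    if unknown:
        raise KeyError(f"no predicate mapping for fields: {unknown}")
    patterns = [f"?s a <{type_iri}> ."]
    for field in fields:
        # OPTIONAL mirrors nullable GraphQL fields: a missing value
        # should not drop the whole row.
        patterns.append(
            f"OPTIONAL {{ ?s <{FIELD_TO_PREDICATE[field]}> ?{field} . }}"
        )
    projection = " ".join(f"?{f}" for f in fields)
    body = "\n  ".join(patterns)
    return f"SELECT ?s {projection} WHERE {{\n  {body}\n}}"


query = build_sparql("http://schema.org/Person", ["name", "birthDate"])
print(query)
```

A real resolver would also have to handle nested selections, arguments and result shaping; the sketch covers only the query-generation step.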
From the standpoint of an orthodox follower of the Semantic Web and Linked Data, all this is of course sad, because it seems designed for integrations built around yet another data silo rather than around the platforms suited to the job (RDF stores, naturally).
Impressions of comparing GraphQL with SPARQL are twofold.
This trend is complementary to the previous one.
Speaking more broadly about adapters to JSON sources, which make it possible to present JSON as RDF more or less on the fly, it is worth recalling the long-established SPARQL-Generate, which can be plugged into, for example, Apache Jena.
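What "presenting JSON as RDF on the fly" amounts to can be sketched in a few lines. The code below is not SPARQL-Generate and does not use Jena; it is a naive illustration in which the namespace `BASE`, the function name `json_to_triples` and the way child subjects are minted are all assumptions.

```python
import json

BASE = "http://example.org/"  # hypothetical namespace for this sketch


def json_to_triples(node, subject, triples=None):
    """Walk a decoded JSON object and emit (subject, predicate, object) tuples.

    Nested objects become fresh child subjects; scalars become literal values.
    Lists nested directly inside lists are not handled specially.
    """
    if triples is None:
        triples = []
    for key, value in node.items():
        predicate = BASE + key
        values = value if isinstance(value, list) else [value]
        for index, item in enumerate(values):
            if isinstance(item, dict):
                child = f"{subject}/{key}/{index}"
                triples.append((subject, predicate, child))
                json_to_triples(item, child, triples)
            else:
                triples.append((subject, predicate, item))
    return triples


doc = json.loads('{"name": "Bob", "knows": [{"name": "Alice"}]}')
triples = json_to_triples(doc, BASE + "bob")
for triple in triples:
    print(triple)
```

Real adapters let you declare the mapping (which keys become which predicates, how IRIs are built) instead of hard-coding it as done here.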
Summing up the first two trends, RDF stores demonstrate full readiness for integration and for operating under polyglot persistence. It is well known, however, that the latter is no longer in vogue and that multi-model is coming to replace it. So what about multi-model in the world of RDF stores?
In short, nothing. I would like to devote a separate article to multi-model DBMSs. For now, it can be noted that there is currently no multi-model DBMS whose primary model is a graph one (of which RDF can be considered a variety). The modest multi-model support that RDF stores offer for the alternative graph model, LPG, is discussed in section V.
However, Gartner writes that multi-model is a sine qua non primarily for operational DBMSs. This is understandable: under polyglot persistence, the main problems arise with transactionality.
But where do RDF stores sit on the OLTP/OLAP scale? I would say: neither here nor there. Some third abbreviation is needed to denote what they are intended for. As an option, I would suggest OLIP, Online Intellectual Processing.
However, all the same:
Now let me introduce a new player on the market: AnzoGraph, from the creators of IBM Netezza and Amazon Redshift. An advertising picture for a product based on it was placed at the beginning of the article. AnzoGraph positions itself as a GOLAP solution. How do you like SPARQL with window functions?
SELECT ?month (COUNT(?event) OVER (PARTITION BY ?month) AS ?events) WHERE { … }
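For readers who have not met window functions, the sketch below emulates what the query above would compute, assuming AnzoGraph follows the usual SQL window semantics (an assumption; check the product documentation). The sample bindings and the function name `count_over_partition` are made up for illustration.

```python
from collections import Counter

# Hypothetical (event, month) bindings standing in for the solutions
# of the elided WHERE clause.
rows = [("e1", "2019-03"), ("e2", "2019-03"), ("e3", "2019-04")]


def count_over_partition(rows):
    """Emulate COUNT(?event) OVER (PARTITION BY ?month).

    Every input row is kept and annotated with the size of its partition,
    unlike GROUP BY, which would collapse each month into a single row.
    """
    per_month = Counter(month for _event, month in rows)
    return [(month, per_month[month]) for _event, month in rows]


result = count_over_partition(rows)
print(result)
```

The point to notice is that the result has as many rows as the input: three here, not two.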
A reference to the Stardog 7 Beta announcement was already given above; it said that Stardog was going to use RocksDB as its underlying storage system. RocksDB is a key-value store, Facebook's fork of Google's LevelDB. Why is it worth speaking of a trend here?
First, judging by the Wikipedia article, RDF stores are not the only ones being moved onto RocksDB. There are projects to use RocksDB as a storage engine in ArangoDB, MongoDB, MySQL and MariaDB, and Cassandra.
Second, projects (that is, not products) in the relevant domain are being built on RocksDB.
For example, eBay uses RocksDB in the platform for its “knowledge graph”. By the way, it is amusing to read: SPARQL. As in the joke: no matter what kind of knowledge graph we build, it still comes out as RDF.
Another example is the Wikidata History Query Service, which appeared a few months ago. Before it appeared, historical information had to be fetched from the standard MediaWiki API via MWAPI. Now a lot can be done in pure SPARQL. RocksDB is under the hood there too. Incidentally, WDHQS was made, it seems, by the person who imported Freebase into the Google Knowledge Graph.
Let me remind you of the main difference between LPG graphs and RDF graphs.
In LPG, scalar properties can be attached to edge instances, whereas in RDF properties can be attached only to edge “types” (though not just scalar properties, but ordinary links as well). This limitation of RDF relative to LPG is overcome by one modeling technique or another. The limitations of LPG relative to RDF are harder to overcome, but LPG graphs look more like the pictures in Harary's textbook than RDF graphs do, so people want them.
Obviously, the LPG support task falls into two parts:
There are several possible approaches.
V.1.1. Singleton property
The most literal approach to reconciling RDF and LPG is probably the singleton property:
Instead of a predicate such as :isMarriedTo, predicates :isMarriedTo1, etc. are used.
These predicates then become subjects of new triples: :isMarriedTo1 :since "2013-09-13"^^xsd:date, etc.
Their link to the general predicate is stated with triples of the form :isMarriedTo1 rdf:singletonPropertyOf :isMarriedTo.
Evidently, rdf:singletonPropertyOf rdfs:subPropertyOf rdf:type, but think about why you should not simply write :isMarriedTo1 rdf:type :isMarriedTo.
Here the LPG support task is solved at the RDFS level. Such a solution would need to be written into the relevant standard. Some changes may be required from RDF stores that support entailment, but for now the singleton property can be regarded as just another modeling technique.
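The singleton property pattern is mechanical enough to sketch as a small conversion routine. The code below is only an illustration with triples as plain tuples of strings; the function name `singleton_triples` and the numbering parameter `n` are assumptions, and no RDF library is involved.

```python
def singleton_triples(subject, predicate, obj, edge_props, n):
    """Encode one LPG-style edge with scalar properties as plain triples
    using the singleton property pattern. `n` makes the predicate unique."""
    single = f"{predicate}{n}"  # e.g. :isMarriedTo1
    triples = [
        (subject, single, obj),                          # the edge itself
        (single, "rdf:singletonPropertyOf", predicate),  # link to the generic predicate
    ]
    # The unique predicate becomes the subject of the edge's own properties.
    triples += [(single, key, value) for key, value in edge_props.items()]
    return triples


triples = singleton_triples(
    ":bob", ":isMarriedTo", ":alice", {":since": "2013-09-13"}, 1
)
for triple in triples:
    print(triple)
```

Note that the generic triple (:bob, :isMarriedTo, :alice) is not emitted; whether it can be derived depends on the entailment support mentioned above.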
V.1.2. Reification Done Right
Less naive approaches stem from the realization that property instances are fully identified by triples. If we are able to say something about triples, we are able to talk about property instances.
The most solid of these approaches is RDF*, also known as RDR, born in the depths of Blazegraph. AnzoGraph, too, chose it for itself from the very beginning. The solidity of the approach comes from the fact that it proposes corresponding changes to RDF Semantics. The gist, however, is extremely simple. In the Turtle serialization of RDF you can now write something like this:
<<:bob :isMarriedTo :alice>> :since "2013-09-13"^^xsd:date .
V.1.3. Other approaches
You can skip formal semantics and simply assume that triples have identifiers, which are of course URIs, and build new triples with these URIs. All that is then needed is to expose these URIs in SPARQL. This is what Stardog does.
AllegroGraph took an intermediate path. It is known that triple identifiers exist in AllegroGraph, but its implementation of triple attributes does not expose them. Formal semantics, however, is a long way off here as well. Notably, the attributes of triples are not URIs, and their values can only be literals. LPG adepts get exactly what they want. In the specially invented NQX format, an example similar to the one given above for RDF* looks like this:
:bob :marriedTo :alice {"since" : "2013-09-13"}
Having supported LPG at the model level in one way or another, a store must also make it possible to query data in that model.
SELECT * { <<:bob :isMarriedTo ?wife>> :since ?since }
SELECT * { BIND (stardog:identifier(:bob, :isMarriedTo, ?wife) AS ?id) ?id :since ?since }
SELECT * { ("since" ?since) franz:attributesNameValue ( :bob :marriedTo ?wife ) }
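The RDF* statement and the first of the queries above can be mimicked in plain Python to show the underlying idea: a triple is allowed to stand in the subject position of another triple. This is a toy model, not an implementation of RDF* semantics; the list `graph`, the extra :name triple and the function name `since_of_marriages` are assumptions made for illustration.

```python
# A quoted triple is modelled as a tuple used as the subject of another tuple.
graph = [
    ((":bob", ":isMarriedTo", ":alice"), ":since", "2013-09-13"),
    (":bob", ":name", "Bob"),
]


def since_of_marriages(graph, husband):
    """Return (wife, since) pairs, mirroring a pattern of the form
    << husband :isMarriedTo ?wife >> :since ?since."""
    results = []
    for s, p, o in graph:
        is_quoted_marriage = (
            isinstance(s, tuple) and s[0] == husband and s[1] == ":isMarriedTo"
        )
        if p == ":since" and is_quoted_marriage:
            results.append((s[2], o))
    return results


matches = since_of_marriages(graph, ":bob")
print(matches)
```

The other two queries reach the same answer by different routes: through a triple identifier in Stardog, and through attribute name/value pairs in AllegroGraph.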
Incidentally, GraphDB at one time supported TinkerPop/Gremlin without supporting LPG, but dropped it in version 8.0 or 8.1.
The intersection of the "triplestore of choice" and "open source triplestore" sets has seen no recent additions. The new open source RDF stores are far from being a good choice for everyday use, and the source code of the new triplestores I would like to use (AnzoGraph, for one) is closed. If anything, we can speak of losses...
Of course, what was open source has not been closed, but some open source stores are gradually ceasing to be seen as worth choosing. Virtuoso, which has an open source edition, is, in my opinion, drowning in bugs. Blazegraph was bought by AWS and formed the basis of Amazon Neptune; it is now unclear whether there will be even one more release. Only Jena remains...
If open source is not that important and you just want to try things out, the picture is also less rosy than before. For example:
In general, space is becoming less and less accessible to the ordinary IT citizen; exploring it is becoming the province of corporations.
Source: https://habr.com/ru/post/451206/