Intelligent Information Networks and the Semantic Web

Intelligent information networks, the Semantic Web, Web 3.0, AI... These terms appear more and more often in our everyday life.

The era of the universal Internet is ending. It began to change before we had time to notice. The barely established term Web 2.0 is already being replaced by another, incomprehensible and mysterious at first glance: Web 3.0, or simply the “Semantic Web”.

In this article I want to talk about what it is and where our Internet is heading.

The network is becoming personalized: the Internet knows more and more about us. In part, we contribute to this ourselves by sharing personal information on social networks, using search engines, and logging in to services.

This means that soon a user who types “I want to get a cheap haircut” into the search bar will receive a clear answer to a clear question: the barber shop nearest to their location. We will no longer need to follow 10, 20, or 50 links from the results of different search engines, getting upset once again that the next open tab is yet another expensive salon promoted by SEO specialists.

This applies to many spheres of human life and activity, from everyday matters to larger ones: buying a car or an apartment, searching for a job, and so on.

Moreover, the search engine will be able to determine what kind of car a user needs based on which test drives interest them most and which car sites they visit, in which area and price range they want to find an apartment, whether they are hungry, what food they prefer, and so on.

As the Semantic Web develops, the technology will make it possible to build a socio-demographic portrait of a user once certain data about them has been collected. Computers will interpret the collected user data as a portrait of the person.

In many ways this trend is driven by the desire to simplify services and make it easier for users to reach content. Authorization via social networks (Vkontakte, Facebook) and special services (OpenID, OAuth), as well as commenting via social network widgets, have recently become fashionable.

Our cellular networks also tie personal information together.

Information is what will play a decisive role in the future Internet!

NFC technology, promoted by major market players, makes it possible to pay for purchases with a mobile phone (including, for example, subway fares). It increasingly connects our SIM cards, phones, and bank cards, pulling our personal information into a single point.

Let's try to figure everything out, but let's start small and take things in order. To begin with, let's consider intelligent information systems (IIS).

Intelligent Information Systems

An IIS (intelligent information system) is an information system based on the concept of using a knowledge base to generate algorithms for solving problems of various classes, depending on the specific information needs of users.

Features and characteristics of intelligent IS

Any information system (IS) performs the following functions:

accepts information requests entered by the user and the necessary source data;
processes the data entered and stored in the system in accordance with a known algorithm and generates the required output information.

From the point of view of these functions, an IS can be regarded as a factory that produces information: the order is an information request, the raw material is the source data, the product is the required information, and the tool (equipment) is the knowledge by which the data is converted into information.

The communication skills of an IIS characterize the way the end user interacts with the system (the interface).

Intellectual tasks are those related to the development of algorithms for solving previously unsolved problems of a certain type.

Intelligence is a universal algorithm capable of developing algorithms for solving specific problems.

If during the operation of an IS it becomes clear that one of the two components of the program needs to be modified, the program has to be rewritten. This is because only the IS developer has complete knowledge of the problem domain, and the program serves as an “unthinking executor” of the developer's knowledge. Intelligent information systems eliminate this shortcoming.

Shortcomings of IS and how IIS eliminate them

Weak adaptability to the information needs of the user.
Inability to solve poorly formalized problems.

These shortcomings are eliminated in IIS, which have the following characteristics:

developed communication skills;
the ability to solve complex, poorly formalized tasks (such tasks are characterized by a semi-qualitative and quantitative description, whereas well formalized tasks have a fully quantitative description);
ability to develop and self-learn.

Classification of IIS

Class I: systems with an intelligent interface (communication skills):

Intelligent databases;
Natural language interface;
Hypertext systems;
Contextual systems;
Cognitive graphics.

Class II: expert systems (solution of complex problems):

Classification systems;
Pre-defining systems;
Transforming systems;
Multi-agent systems.

Class III: self-learning systems (self-learning ability):

Inductive systems;
Neural networks;
Case-based systems;
Information storage.

Intelligent DB

Intelligent databases differ from ordinary ones in that they can return, on request, information that is not stored explicitly but can be derived from the data in the database (for example, a list of products whose price is higher than the industry average).
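The idea of derived information can be illustrated with a minimal sketch. It assumes a hypothetical `products` table with made-up names and prices, and uses an ordinary SQL subquery (via Python's standard `sqlite3` module) as a stand-in for the richer derivation rules a real intelligent database would have. The industry average is never stored; it is computed when the question is asked.

```python
import sqlite3

# Hypothetical product table; names, industries and prices are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, industry TEXT, price REAL)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [
        ("Drill A", "tools", 80.0),
        ("Drill B", "tools", 120.0),
        ("Drill C", "tools", 100.0),
        ("Phone X", "phones", 500.0),
        ("Phone Y", "phones", 300.0),
    ],
)


def above_industry_average(connection):
    """Return (name, price) rows priced above the average of their industry.

    The average itself is not stored in the table; it is derived per query.
    """
    query = """
        SELECT p.name, p.price
        FROM products AS p
        WHERE p.price > (
            SELECT AVG(q.price) FROM products AS q
            WHERE q.industry = p.industry
        )
        ORDER BY p.name
    """
    return connection.execute(query).fetchall()


expensive = above_industry_average(conn)
print(expensive)
```

Here "Drill C" is excluded because its price equals the average of its industry rather than exceeding it.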

A natural language interface translates natural language constructions into the machine-level knowledge representation. To do this, it recognizes and checks written words using dictionaries and syntactic rules. Such an interface simplifies access to intelligent databases as well as voice input of commands in control systems.

Hypertext systems are designed to search text databases by keywords.

Contextual help systems are a special case of hypertext and natural language systems.

Cognitive graphics systems allow users to interact with IIS using graphic images.

Semantic Web

An HTML page describes how to present information visually in a web browser and is difficult for computers to analyze semantically. With it, even such trivial tasks as finding people, projects, or programs on the Internet cannot be automated.

Semantic Web technology allows a computer to interpret information on the Web on a par with people. For this purpose the RDF (Resource Description Framework) graph model for describing resources was developed; it is a W3C specification.

With RDF, you can make any statements about any resources.

RDF graph model

Statements about resources in the RDF model consist of triples: a subject, a predicate (property), and an object.

Resources and properties are represented as URIs, and literals as Unicode strings. URIs make it possible to identify resources on the Web uniquely, and Unicode solves the problem of multilingualism.
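As a rough illustration of the triple model, the sketch below keeps a tiny graph as a Python set of (subject, predicate, object) tuples. The `http://example.org/` namespace and the people in it are hypothetical; `foaf:name` and `foaf:knows` are real FOAF properties. For simplicity, URIs and literals are both plain strings here, which a real RDF library would keep distinct.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # URI of the resource being described
    predicate: str  # URI of the property
    obj: str        # URI of another resource, or a literal value


FOAF = "http://xmlns.com/foaf/0.1/"
EX = "http://example.org/"  # hypothetical namespace for the example

graph = {
    Triple(EX + "ivan", FOAF + "name", "Ivan Petrov"),
    Triple(EX + "ivan", FOAF + "knows", EX + "anna"),
    Triple(EX + "anna", FOAF + "name", "Анна"),  # literals are Unicode
}


def objects(triples, subject, predicate):
    """Return all objects stated for the given subject and predicate."""
    return sorted(
        t.obj for t in triples
        if t.subject == subject and t.predicate == predicate
    )


names = objects(graph, EX + "ivan", FOAF + "name")
friends = objects(graph, EX + "ivan", FOAF + "knows")
print(names, friends)
```

Because every statement has the same three-part shape, new facts about any resource can be added without changing the structure of the data.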

RDF Schema is not an XML Schema

An RDF Schema is itself described with RDF statements. Unlike an XML Schema, it defines the resources (terms) of the subject area and does not restrict the structure of the RDF.

The semantics of the RDF Schema resources is fixed in the W3C specification.

RDF Schema example

An example of an RDF Schema described using RDF

Data semantics - what is it?

By data semantics we mean the possibility of formally describing the meaning of the transmitted data, which makes the data independent of applications. This is especially important given the prospects for the development of the Internet that we are considering: whoever has the data wins. There may be many applications, sites, and services, but by themselves they will mean very little. Those who can provide their content in whatever form is convenient for the user will win.

Can data be used independently of the services in which it lives today: data from databases, XML documents, applications in social networks? No, because its semantics is hard-wired into program logic and/or described informally in specifications. Only data supplied with explicit semantics can be made truly independent of applications!

Why do you need RDF? What's wrong with XML?

Nested XML tags carry only syntax, not semantics. Consider several possible ways of expressing the statement “Ivan Petrov teaches computer science” in XML:

 <course name="Computer science"> <lecturer>Ivan Petrov</lecturer> </course>

 <lecturer name="Ivan Petrov"> <teaches>Computer science</teaches> </lecturer>

 <teachingOffering> <lecturer>Ivan Petrov</lecturer> <course>Computer science</course> </teachingOffering>

An application that uses the first format cannot understand the other two formats and vice versa. Therefore, XML is only good as a format (syntax) for data exchange, but not as a model for describing data semantics! The same can be said about other popular formats (JSON, for example).
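A small sketch makes the problem concrete. It assumes three XML layouts that all encode “Ivan Petrov teaches computer science”, and a hypothetical reader function written only for the first layout, using Python's standard `xml.etree.ElementTree`. The reader extracts the fact from the first layout and gets nothing from the other two, although all three say the same thing.

```python
import xml.etree.ElementTree as ET

# Three layouts carrying the same statement (values assumed from the sentence).
FORM_1 = '<course name="Computer science"><lecturer>Ivan Petrov</lecturer></course>'
FORM_2 = '<lecturer name="Ivan Petrov"><teaches>Computer science</teaches></lecturer>'
FORM_3 = (
    "<teachingOffering><lecturer>Ivan Petrov</lecturer>"
    "<course>Computer science</course></teachingOffering>"
)


def read_form_1(xml_text):
    """Hypothetical reader that knows only the <course name=...> layout.

    Returns a (subject, predicate, object) tuple, or None when the
    layout is not the one this reader was written for.
    """
    root = ET.fromstring(xml_text)
    lecturer = root.find("lecturer")
    if root.tag != "course" or lecturer is None:
        return None
    return (lecturer.text, "teaches", root.get("name"))


results = [read_form_1(text) for text in (FORM_1, FORM_2, FORM_3)]
print(results)
```

The meaning lives in the reader's code, not in the documents: supporting the other two layouts means writing two more readers by hand.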

Where are the semantics in RDF?

At the level of the RDF model, semantics arises through the use of OWL (Web Ontology Language) ontologies. Thanks to them, a computer can understand how a resource or property it knows is related to another resource or property it does not know, and can draw other logical conclusions from RDF statements.

Ontologies are based on the mathematical apparatus of formal logic (description logic, DL), a small subset of which is covered by RDF Schema. DL is a decidable subset of first-order logic.

An example of using semantics

How will the following statements be interpreted by an application that understands only the resources of the FOAF vocabulary?

 <Pugofka:rybmyas_day#30032011> <Pugofka:semantic#Lector> “ ” .
 <Pugofka:semantic#Lector> <rdfs:subClassOf> <foaf:Person> .

It will understand that Pugofka:semantic#Lector is a foaf:Person and will derive a new statement:

 <Pugofka:rybmyas_day#30032011> <foaf:Person> “ ”
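The kind of reasoning involved can be sketched in a few lines. Note the assumptions: this is the standard RDFS subclass rule, which works on `rdf:type` statements and is therefore written slightly differently from the shorthand used in the example above; the `ex:speaker` resource is hypothetical; and the "triple store" is just a Python set. The fact that `foaf:Person` is a subclass of `foaf:Agent` comes from the FOAF vocabulary.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def infer_types(triples):
    """Apply the RDFS subclass rule until no new triples appear.

    Rule: (x rdf:type C) and (C rdfs:subClassOf D)  =>  (x rdf:type D)
    """
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        subclass_pairs = [(s, o) for s, p, o in inferred if p == SUBCLASS_OF]
        for s, p, o in list(inferred):
            if p != RDF_TYPE:
                continue
            for sub, sup in subclass_pairs:
                if o == sub and (s, RDF_TYPE, sup) not in inferred:
                    inferred.add((s, RDF_TYPE, sup))
                    changed = True
    return inferred


data = {
    ("ex:speaker", RDF_TYPE, "Pugofka:semantic#Lector"),  # hypothetical resource
    ("Pugofka:semantic#Lector", SUBCLASS_OF, "foaf:Person"),
    ("foaf:Person", SUBCLASS_OF, "foaf:Agent"),
}

closure = infer_types(data)
print(sorted(closure))
```

An application that knows only FOAF can now treat `ex:speaker` as a `foaf:Person`, even though the original data never said so directly.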

Semantic repositories

It is assumed that large volumes of RDF data will be stored in semantic repositories and accessed with the SPARQL query language, an analogue of SQL.

An example of the query “display all projects created by Pugofka” in SPARQL:

 PREFIX dc: <http://purl.org/dc/elements/1.1/>
 PREFIX foaf: <http://xmlns.com/foaf/0.1/>
 SELECT ?title
 WHERE {
   ?project foaf:name "Pugofka" .
   ?project dc:title ?title
 }
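To show roughly what a SPARQL engine does with such a query, here is a simplified sketch of matching triple patterns against an in-memory list of triples. It is not a SPARQL implementation: the `match_pattern` and `select` helpers and the sample projects are hypothetical, prefixed names are treated as opaque strings, and only the basic idea of binding variables (terms starting with `?`) across patterns is shown.

```python
def match_pattern(pattern, triple, bindings):
    """Try to unify one triple pattern with one triple.

    Terms starting with "?" are variables. Returns the extended
    bindings, or None if the triple does not fit the pattern.
    """
    result = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # Bind the variable, or check it against an earlier binding.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def select(variables, patterns, triples):
    """Return sorted value tuples for the variables over all solutions."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for bindings in solutions:
            for triple in triples:
                extended = match_pattern(pattern, triple, bindings)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return sorted(tuple(s[v] for v in variables) for s in solutions)


# Hypothetical data shaped to fit the query above.
triples = [
    ("ex:p1", "foaf:name", "Pugofka"),
    ("ex:p1", "dc:title", "Semantic day"),
    ("ex:p2", "foaf:name", "Pugofka"),
    ("ex:p2", "dc:title", "RDF notes"),
    ("ex:p3", "foaf:name", "Someone else"),
    ("ex:p3", "dc:title", "Unrelated"),
]
patterns = [
    ("?project", "foaf:name", "Pugofka"),
    ("?project", "dc:title", "?title"),
]

titles = select(["?title"], patterns, triples)
print(titles)
```

The shared variable `?project` is what joins the two patterns, much as a join condition does in SQL.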

New projects show the direction in which the field is developing. For example, Clark & Parsia ( http://clarkparsia.com/ ) already has several serious projects in the field of the Semantic Web, and the beta test of its RDF database, Stardog, is scheduled for the first days of April.

Semantic Web Levels

Evolutionary approach

The Semantic Web is not a replacement for the existing Internet, only its evolutionary development. RDF/XML is either embedded in HTML or accessible via a URL.

Following this principle, RDF data based on the RSS, FOAF (Friend Of A Friend), and DOAP (Description Of A Project) vocabularies is already widely used on the WWW.

Sample FOAF code on LiveJournal user page

Semantic Web - Goals, Tasks, Examples

Semantic Web technology successfully solves the following tasks:

data independence from applications;
semantic data integration;
laying the foundation for the ubiquitous use of computer agents (services);
Data Mining;
Expert systems;
Single authorization issues *.

* If a resource offers several authorization methods and a site account has third-party accounts tied to it (VK, FB, Twitter, OpenID, OAuth...), then we can learn to establish unambiguously that all of them belong to the same user and to link together all the information about that user.

The Semantic Web is not being created from scratch. It builds on these foundations:

graph model of semi-structured data representation (OEM, Lore);
formal logic (first order logic, knowledge base, frames);
WWW architecture (URI, Unicode, XML, HTTP);
public key cryptography.

Technologies that are involved in the Semantic Web

semantic search;
question-answer systems;
agents;
knowledge pooling (database integration);
ubiquitous / pervasive computing

Examples of software supporting the technology

libraries for interpreting the RDF language stack for all popular programming languages (Jena, Redland, RDFLib);
ontology editors (Protégé);
reasoning systems on ontologies (Racer, KAON, FaCT);
semantic repositories (Sesame, Kowari, YARS);
semantic browsers (Simile, Piggy Bank, Gnowsis, Haystack);
semantic data searchers (Swoogle);
converters from different data formats to / from RDF / XML (Aperture, RDFizers, D2R);
application programs (Bibster, FOAF Explorer);
Stardog, the RDF database;

Research areas

Foundations
1. Knowledge Engineering and Ontology Engineering
2. Knowledge Representation and Reasoning
3. Information management
4. Basic Web Information technologies
5. Agents
6. Natural language processing
Semantic Web Core topics
1. Infrastructure
2. Resource Description Framework and RDF Schema
3. Languages
4. Ontologies
5. Rules and Logic
6. Proof
7. Security and trust and privacy
8. Applications
Semantic Web Special Topics
1. Natural language processing and human language technologies
2. Social impact of the Semantic Web
3. Social networks and Semantic Web
4. Peer-to-peer and Semantic Web
5. Agents and Semantic Web
6. Semantic grid
7. Outreach to industry
8. Benchmarking and scalability

Tasks and problems of the Semantic Web:

indexing and searching for information;
development and support of metadata;
development and support of annotation techniques;
representing the Web as a large, interoperable database;
organization of machine data mining;
discovery and provision of web-based services;
research in the field of intelligent software agents.

Conclusion

The Semantic Web is a dynamic, constantly evolving concept, and not a set of comprehensive, working systems.

Web 3.0 is a multifaceted concept that has not yet fully taken shape. It can be viewed from different angles.

For example, from the point of view of machine data processing, the Semantic Web is the idea of storing data so that it is specific and interlinked and can then be processed automatically, integrated, and reused across various services, applications, and so on.

From the point of view of intelligent agents, the goal is a more “machine-oriented” Web, so that search spiders (agents) can be used as effectively as possible for finding and processing information.

From the point of view of distributed databases and knowledge bases, the concept of the Semantic Web is to describe data and add meta information that makes it possible to identify and compare information unambiguously.

The concept of Web 3.0 implies a whole infrastructure.

From the point of view of serving users (content consumers), the idea of Web 3.0 is to minimize the user's actions and answer a request directly, taking into account not only the request itself but also the user's entire history, characteristics (socio-psychological portrait), tastes, interests, and many other factors.

From the point of view of search quality, search works not only by keywords or context but also by content, giving an accurate answer to the user's request. In many ways, the search engine is used as an expert system.

From a web services perspective, the Semantic Web provides access not only to existing static sites but also to dynamic applications, services, and other resources containing useful content.

Source: https://habr.com/ru/post/116574/
