
Search yesterday, today, tomorrow ...

If I may, I will skip the introduction and the backstory.

A search engine today (an Internet search engine first of all) is a program built on a mathematical apparatus: statistical, probabilistic, and other methods. In any case, it computes. It counts links, calculates relevance, gathers conversion statistics, and takes many factors into account (location, age, and other situational information). All of this ultimately narrows and filters the results. What sits underneath is a huge, inevitably multilevel and, to date, fundamentally quite complex index over a database of information collected from the Internet. The information base itself also has a rather complicated, multilevel structure, which is understandable today but does not change the essence. There are, of course, caches, redundancy, parallelization, and so on, all of which gives each of us access to what I consider a very important resource. Just try to imagine today's Internet without search. I am even ready to argue that achievements in information retrieval are the main factor driving the growth of the Internet as such.
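To make the idea of an index over collected documents concrete, here is a minimal sketch of an inverted index. It is a toy illustration under stated assumptions, not how any real search engine is built: the function names `build_index` and `search` and the three sample documents are hypothetical, and there is no ranking, caching, or parallelization.

```python
from collections import defaultdict

def build_index(docs):
    """Map each lowercase word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def search(index, query):
    """Return the sorted ids of documents that contain every query word."""
    words = query.lower().split()
    if not words:
        return []
    # Start from the documents matching the first word, then narrow down.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return sorted(result)

# Hypothetical sample documents, invented for this illustration.
docs = {
    1: "search engines index the web",
    2: "the web grows every day",
    3: "semantic search needs meaning",
}
index = build_index(docs)
print(search(index, "search"))   # documents 1 and 3
print(search(index, "the web"))  # documents 1 and 2
```

Each extra query word intersects the candidate set, which is the "narrowing and filtering" in its simplest form.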

But what is a search engine? A search engine is a mediator between the person who published something and those who want to see it; between the thoughts of one person, converted into the digital form of an electronic document, and the thoughts of another, expressed as a query. In this sense the search engine is a communication channel with its own interaction protocol, a channel of interaction between people. This fact is extremely important: we are talking about a tool, a colossal one of course, but in the vast majority of cases a tool of human interaction.

The other day I came across a four-year-old article, habrahabr.ru/post/31600, which considered the problem, or more precisely the idea, of semantic search, and in connection with which objections, questions, and answers arose.
1. Search quality today. What is its level? What are the prospects?
Theoretically, the maximum achievable search quality with today's technologies is when, in response to my query, I get the single most relevant article as the answer. That is, taking the maximum possible number of factors into account, the mathematical apparatus of the search engine calculates this correspondence. At the same time, we must understand that the search engine can only show what someone has published. Once our communication channel (the search engine) reaches this theoretical level, a second question arises: to what extent is a mathematically computed answer an answer from the point of view of reason? After all, we get a perfect answer only if the returned result was in fact written as an answer to someone who asked exactly our question. For my purposes, the level of today's search is quite sufficient: I find the information that interests me comfortably and quickly. As far as I know, relevance in the currently used architecture is improved mainly by increasing the number of parameters involved in the process, including as much of the available data as possible in the query so that the output is differentiated more finely.
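The effect of adding parameters can be sketched as a weighted score over a few signals. This is only an illustration under assumptions: the signals (term matches, inbound links, user region), the weights, the sample documents, and the names `score` and `best_answer` are all hypothetical and are not taken from any real search engine.

```python
def score(doc, query, user_region, weights):
    """Weighted sum of three illustrative signals for one document."""
    query_words = query.lower().split()
    doc_words = doc["text"].lower().split()
    # How many times the query words occur in the document text.
    term_hits = sum(doc_words.count(w) for w in query_words)
    # Situational factor: does the document belong to the user's region?
    same_region = 1.0 if doc["region"] == user_region else 0.0
    return (weights["terms"] * term_hits
            + weights["links"] * doc["inbound_links"]
            + weights["region"] * same_region)

def best_answer(docs, query, user_region, weights):
    """Return the single highest-scoring document for the query."""
    return max(docs, key=lambda d: score(d, query, user_region, weights))

# Hypothetical documents and weights, invented for this illustration.
docs = [
    {"id": "a", "text": "cheap pizza delivery", "inbound_links": 10, "region": "Moscow"},
    {"id": "b", "text": "pizza pizza recipes", "inbound_links": 2, "region": "Kazan"},
]
weights = {"terms": 1.0, "links": 0.1, "region": 2.0}

print(best_answer(docs, "pizza", "Kazan", weights)["id"])
print(best_answer(docs, "pizza", "Moscow", weights)["id"])
```

The same query yields a different single "best" document once the user's region is part of the input, which is the kind of differentiation the paragraph above describes. The result is still a computed correspondence, not an understanding of the question.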

2. Semantic search - what is it?
Is it search by content or search by meaning? I will not argue about definitions, but searching content with an understanding of its meaning is a completely different technological platform, a completely different architecture, one where the system plays the role: “I study, I understand, I am asked a question, I form an answer, I answer”. All that I see now is search for information in question-and-answer format, which is, again, communication between people. That reduces the functions of the search engine to the same mathematics.

This problem lies within my sphere of interest: I do research in this and related fields and have achieved some results. At Kibikom we are running the answer project, in which the results obtained are tested as applied to search. However, search is far from the only field that requires a different approach.

Working in this direction, I have rethought many things, right down to the very concept of information and the principles of its organization and processing. I do not like the idea of presenting information in a special, machine-oriented way. This will not lead us to a "smart" computer, but rather will require a lot of programmers, as has happened with today's programming (which I would like to discuss separately).

I am sure that the search of tomorrow is already human <-> machine communication, where the machine is a completely different technological platform for which information ceases to be a meaningless array of bytes. I would like not only to live to see those times, but also to put maximum effort into bringing them closer!

Source: https://habr.com/ru/post/176399/

