I would like to introduce NLPub, a small knowledge base dedicated to computational linguistics in Russia.
Devices and applications that understand and speak human language no longer surprise anyone. They rely on natural language processing methods, a field at the intersection of linguistics and artificial intelligence.
Why do the vast majority of devices, applications and services not work with the Russian language?
I often have to repeat this, but the reason is simple and tragic. Natural language processing tasks are solved with specialized programs called analyzers, and these depend heavily on information resources: dictionaries, corpora and thesauri. Without such resources they cannot do their job.
Practically none of these resources exist in Russia, which paralyzes the work of commercial enterprises and academic groups, forcing them to reinvent the wheel or simply abandon linguistic technologies.
The most useful thing that can be done right now is to help interested people get started faster with the few technologies that are currently available.
To do this, we need to compile a catalog of available software with descriptions of its functionality, write training materials, and provide links to data, manuals and other information resources. That is why I created NLPub, and I invite everyone to join its development.
What information does NLPub collect?
Special attention is paid to the following topics:
- text processing tools available for both commercial and non-commercial use: tokenizers, morphological analyzers, syntactic parsers, sentiment analysis tools;
- resources: dictionaries, thesauri and text corpora necessary for solving fundamental and applied problems;
- events: thematic conferences and seminars for researchers and developers;
- education: educational institutions and professional retraining courses in the field of natural language processing and data analysis.
How can I help the project?
I see three ways:
- expand the knowledge base, providing readers with high-quality, correct and relevant material on the state of Russian computational linguistics;
- correct inaccuracies as the knowledge base is compiled and developed;
- talk about NLPub in thematic communities to increase public interest in natural language processing (at the very least, write about it in a blog, as I did).
Who does the project belong to?
NLPub is a non-profit project and has no affiliation with commercial companies. This in no way closes it to commercial companies. On the contrary, information about their products is very welcome alongside open and free solutions. The list of tools already includes many commercial products.
I will be happy to answer any questions and comments, whether posted in the comments here or sent through more private channels of communication.