
The language barrier and NLP: why don't chatbots understand us?

People have long wanted to teach machines to understand humans. Only now, however, have we come a little closer to the plots of science-fiction films: we can ask Alice to turn down the volume, Google Assistant to order a taxi, or Siri to set an alarm. Language processing technologies are in demand wherever artificial intelligence is being built: in search engines, for fact extraction, sentiment analysis, machine translation, and dialogue systems.





We will focus on the last two areas: they have a rich history and have had a significant impact on language processing. We will also go over the basic capabilities of natural language processing for building a chatbot, together with the speaker of our AI Weekend course, computational linguist Anna Vlasova.



How did it all begin?



The first conversations about computer processing of natural language began in the 1930s with the philosophical reasoning of Alfred Ayer, who proposed distinguishing an intelligent human from an unintelligent machine by means of an empirical test. In 1950, Alan Turing proposed such a test in the philosophical journal Mind: a judge must determine whether they are talking to a person or to a computer. The test set criteria for evaluating the work of artificial intelligence; the possibility of building one was not questioned. The test has many limitations and shortcomings, but it had a significant influence on the development of chatbots.

The first field where language processing was successfully applied was machine translation. In 1954, Georgetown University, together with IBM, demonstrated a program that translated from Russian into English using a dictionary of 250 words and a set of six grammatical rules. The program was far from anything that could truly be called machine translation, and it translated only 49 pre-selected sentences at the demonstration. Until the mid-1960s, many attempts were made to create a fully working translation program, but in 1966 ALPAC (the Automatic Language Processing Advisory Committee) declared machine translation a dead end. Government funding stopped for a while and public interest in machine translation declined, but research did not end there.






In parallel with attempts to teach the computer to translate text, scientists and entire universities thought about creating a robot capable of imitating human speech behavior. The first successful chatbot was the virtual interlocutor ELIZA, written in 1966 by Joseph Weizenbaum. ELIZA parodied the behavior of a psychotherapist by picking out significant words from the interlocutor's phrase and asking a counter-question. It can be considered the first rule-based chatbot, and it marked the beginning of a whole class of such systems. Without ELIZA there would be no Cleverbot, WeChat's XiaoIce, Eugene Goostman, which formally passed the Turing test in 2014, or even Siri, Jarvis, and Alexa.
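
To get a feel for how little machinery a rule-based bot of this kind needs, here is a toy ELIZA-style exchange in Python (the two patterns and the fallback reply are invented for this sketch; Weizenbaum's original used a much larger script of keyword rules):

    import re

    # Each rule pairs a pattern that captures a significant fragment
    # with a template that turns it into a counter-question.
    RULES = [
        (re.compile(r"\bI feel (.+)", re.I), "Why do you feel {0}?"),
        (re.compile(r"\bmy (\w+)", re.I), "Tell me more about your {0}."),
    ]

    def reply(utterance):
        for pattern, template in RULES:
            match = pattern.search(utterance)
            if match:
                return template.format(*match.groups())
        return "Please, go on."  # fallback when no rule matches

    print(reply("I feel lost in my new job"))
    # -> Why do you feel lost in my new job?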

In 1968, Terry Winograd developed the SHRDLU program in Lisp. It moved simple objects (cones, cubes, balls) on command and could maintain context: it understood exactly which element to move if it had been mentioned earlier. The next step in the development of chatbots was the ALICE program of 1995, for which Richard Wallace developed a special markup language, AIML (Artificial Intelligence Markup Language). Expectations for the chatbot were too high: people thought ALICE would turn out even smarter than a human. Of course, the chatbot could not be smarter, business was disappointed in chatbots for a while, and investors long avoided the topic of virtual assistants.



Language matters



Today chatbots still work on the basis of rules and behavior scenarios, but natural language is fuzzy and ambiguous: one thought can be expressed in many ways. The commercial success of dialogue systems therefore depends on solving language processing problems. The machine must be taught to classify the whole variety of incoming questions accurately and to interpret them unambiguously.

All languages are structured differently, and this matters for parsing. In terms of morphology, the meaningful elements of a word can attach to the root one after another, as in the Turkic languages, or can break the root apart, as in Arabic and Hebrew. In terms of syntax, some languages allow free word order in a phrase, while others are organized more rigidly. In classical systems, word order plays a significant role. For modern statistical NLP methods it matters less, since processing happens not at the level of individual words but at the level of whole sentences.

Other difficulties in developing chatbots come from the growth of multilingual communication. People now often communicate in non-native languages and use words incorrectly. In the phrase "I have shipped two days ago, but goods did not come", the vocabulary suggests the delivery of physical goods, not the electronic money transfer that the speaker actually means by these words in their native language. In real communication a human will understand the interlocutor correctly, but a chatbot may have problems. In certain domains, such as investment, banking, or IT, people also often switch into another language mid-conversation, and a chatbot trained on a single language is unlikely to understand what is being discussed.



Success Story: Machine Translators



Before the advent of voice assistants and the large-scale spread of chatbots, the most popular intelligent task requiring natural language processing was machine translation. Neural networks and deep learning were already being discussed in the 1990s, and the first neurocomputer, the Mark I, appeared back in 1958. But they could not be used widely because of the low performance of computers and the lack of sufficiently large language corpora. Only large research teams could afford to do research in the field of neural networks.

In the middle of the 20th century, machine translators were far from Google Translate and Yandex.Translate, but each new translation method introduced ideas that are applied in one form or another even today.

1970s. Rule-based machine translation (RBMT) was the first attempt to teach a machine to translate. The output read like the work of a fifth-grader with a dictionary, but in one form or another rules are still used in machine translators and chatbots today (a toy sketch of the approach follows this list).

1984. Example-based machine translation (EBMT) made it possible to translate even between completely dissimilar languages, where writing explicit rules was useless. All modern machine translators and chatbots use ready-made examples and templates.

1990s. Statistical machine translation (SMT), in the era of the growing Internet, made it possible to use not only ready-made language corpora but also books and freely available translated articles. More available data improved translation quality. Statistical methods are still actively used in language processing.
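
As a rough illustration of the RBMT idea, here is a toy word-for-word translator in Python (the transliterated Russian lexicon is invented for this sketch; real RBMT systems added morphological analysis and thousands of syntactic transfer rules):

    # A toy RBMT-style translator: plain dictionary lookup, word by word.
    LEXICON = {
        "ya": "I",
        "lyublyu": "love",
        "koshek": "cats",
    }

    def translate(sentence):
        words = sentence.lower().split()
        # Words missing from the dictionary are marked as untranslated.
        return " ".join(LEXICON.get(w, "<" + w + ">") for w in words)

    print(translate("Ya lyublyu koshek"))  # -> I love cats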



Neural networks in the service of NLP



As natural language processing developed, many problems were solved with classical statistical methods and sets of rules, but that did not remove the vagueness and ambiguity of language. If we say "bow" without any context, even a live interlocutor is unlikely to understand what is meant. The semantics of a word in a text are defined by its neighboring words. But how do we explain that to a machine, which understands only numerical representations? This is how the statistical method of text analysis word2vec (word to vector) was born.





Figure: the vectors bow_1 and bow_2 are parallel, so they represent the same word, while bow_3 is a homonym.



The idea is quite obvious from the name: represent a word as a vector with coordinates (x1, x2, ..., xn). To deal with homonymy, identical word forms are given tags: "bow_1", "bow_2", and so on. If the vectors "bow_n" and "bow_m" are parallel, they can be treated as one word; otherwise the words are homonyms. As a result, each word gets its own vector representation in a multidimensional space (the dimensionality of the vector space can vary from 50 to 1000).
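
A minimal sketch of the idea with the gensim library (the three-sentence corpus and the pre-assigned homonym tags are invented for this example; real models are trained on millions of sentences, and on such a toy corpus the similarity values mean little):

    from gensim.models import Word2Vec

    # Toy corpus in which homonyms have already been tagged, as described above.
    sentences = [
        ["she", "tied", "the", "bow_1", "on", "the", "gift"],
        ["a", "red", "bow_1", "decorated", "the", "box"],
        ["he", "drew", "the", "bow_3", "and", "shot", "an", "arrow"],
    ]

    model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, epochs=100)

    print(model.wv["bow_1"].shape)  # (50,) - a point in 50-dimensional space

    # Cosine similarity is the practical test for "parallel" vectors:
    # values close to 1.0 mean the words occur in similar contexts.
    print(model.wv.similarity("bow_1", "bow_3"))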







It remains an open question which type of neural network to use when training a hypothetical chatbot. In human speech, sequence is important: we draw conclusions and make decisions based on what was said in the previous sentence or even a previous paragraph. A recurrent neural network (RNN) fits these criteria well, but as the distance between related parts of the text grows, the RNN has to grow too, which degrades the quality of processing. This problem is solved by the LSTM network (Long Short-Term Memory). It has one important feature, the cell state, which can stay constant or change when necessary. Information in the chain is thus not lost, which is crucial for natural language processing.
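
A minimal sketch of an LSTM layer in PyTorch (all sizes are arbitrary, and random noise stands in for a sequence of word vectors):

    import torch
    import torch.nn as nn

    # Hypothetical sizes: 100-dimensional word vectors, 128-dimensional hidden state.
    lstm = nn.LSTM(input_size=100, hidden_size=128, batch_first=True)

    # A batch of one "sentence" made of 12 word vectors (random for illustration).
    sentence = torch.randn(1, 12, 100)

    output, (hidden, cell) = lstm(sentence)
    # "cell" is the cell state mentioned above: it can carry information
    # across long distances in the sequence without being overwritten.
    print(output.shape)  # torch.Size([1, 12, 128])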

There is a huge variety of natural language processing libraries today. For Python, which is often used for data analysis, there are NLTK and spaCy. Large companies are also involved in developing libraries for NLP, such as NLP Architect from Intel, or PyTorch from researchers at Facebook. Despite this great interest in neural network methods of language processing from large companies, coherent dialogues are still built mainly on classical methods, with the neural network playing a supporting role in preprocessing and classifying speech.
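
For instance, a few lines with spaCy (this assumes the small English model has been installed beforehand with "python -m spacy download en_core_web_sm"):

    import spacy

    # Load a small pre-trained English pipeline (installed separately).
    nlp = spacy.load("en_core_web_sm")

    doc = nlp("Siri, set an alarm for 7 a.m. tomorrow.")
    for token in doc:
        # Tokenization, lemmas and part-of-speech tags come out of the box.
        print(token.text, token.lemma_, token.pos_)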



How can NLP be applied in business?



The most obvious applications of natural language processing are machine translators, chatbots, and voice assistants, things we encounter every day. Most call-center employees could be replaced by virtual assistants, since about 80% of customer calls to banks concern fairly typical questions. A chatbot can also calmly handle a candidate's initial screening and schedule them for an in-person interview. Oddly enough, jurisprudence is a fairly precise field, so even there a chatbot can become a successful consultant.







The b2c area is not the only place where chatbots can be used. In large companies, employee turnover is quite active, so everyone has to help newcomers adapt to the new environment. Since a new employee's questions are fairly typical, the whole process is easy to automate. There is no need to look for someone to explain how to refill the printer or whom to contact about a given issue. An internal company chatbot does a great job of this.



With the help of NLP, you can measure customer satisfaction with a new product quite accurately by analyzing reviews on the Internet. If the program identifies a review as negative, a report is automatically sent to the appropriate department, where real people work with it.
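
A minimal sketch of such a filter with NLTK's VADER sentiment analyzer (the threshold and the routing step are invented for this example, and VADER only works for English text):

    import nltk
    from nltk.sentiment import SentimentIntensityAnalyzer

    nltk.download("vader_lexicon")  # one-time download of the sentiment lexicon
    sia = SentimentIntensityAnalyzer()

    review = "The kettle broke after two days, very disappointed."
    scores = sia.polarity_scores(review)  # includes a 'compound' score in [-1, 1]

    if scores["compound"] < -0.3:  # hypothetical negativity threshold
        print("Route to the support team:", review)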



The possibilities of language processing will only expand, and with them its areas of application. If your company's call center employs 40 people, it is worth considering whether it might be better to replace them with a team of programmers who will build you a chatbot.



You can learn more about the possibilities of language processing in our AI Weekend course, where Anna Vlasova will talk in detail about chatbots within the topic of artificial intelligence.

Source: https://habr.com/ru/post/422609/


