
What do neural networks really hide?

A few days ago an article appeared on Habr, What do neural networks hide?. It is a free retelling of the English article The Flaw Lurking In Every Deep Neural Net, which in turn discusses a specific study of some properties of neural networks (Intriguing properties of neural networks).

In the article describing the study, the authors chose a somewhat sensational approach to presenting the material and wrote the text in the spirit of "a serious problem has been found in neural networks" and "we cannot trust neural networks in security-related problems". Many of my acquaintances shared the link to the Habr post, and several discussions on the topic started on Facebook at once. At the same time, I got the impression that over two retellings some of the information from the original research was lost, and that many questions arose about neural networks that were not considered in the original text. So there seems to be a need to describe in more detail what was actually done in the study and, along the way, to try to answer those questions. The Facebook format does not suit such long texts at all, so I decided to try to put my thoughts into a post on Habr.

Content of the original article

The original article, "Intriguing properties of neural networks", was written by a group of seven researchers, three of whom work in Google's neural network research department. The article discusses two non-obvious properties of neural networks:

  1. There is no evidence that individual neurons of a deep network are responsible for separate, human-interpretable properties of the input: random linear combinations of neurons turn out to be just as "meaningful" as individual neurons.
  2. For almost any correctly classified example one can find a visually almost indistinguishable distorted version (a "blind spot") that the network classifies incorrectly.
Let's try to figure out in more detail what these two properties mean.
The meaning of specific neurons

Let's try to deal with the first property quickly.

There is an assumption, popular among neural network enthusiasts, that a neural network internally decomposes the input data into separate comprehensible features, and that at the deep layers of the network each neuron is responsible for some specific property of the original object.

This claim is usually checked visually:

  1. A neuron in a trained network is selected.
  2. Images from a test set that activate this neuron are selected.
  3. A person looks through the selected images and concludes that they all share some common property.

What the researchers did in the paper: instead of examining individual neurons, they examined linear combinations of neurons and searched for images that maximally activate a particular combination. It turned out that such sets of images also share common semantic properties. From this the authors conclude that knowledge about the subject area is stored not in specific neurons but in the overall configuration of the network.
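As an illustration of the kind of check involved, here is a minimal NumPy sketch (the activation matrix and the unit index are placeholders, not data from the paper): rank test images by the response of a single neuron and by the response along a random linear combination of neurons, then inspect the top images in each case.

```python
# A minimal sketch of comparing a "natural basis" direction (one neuron)
# with a random direction in activation space. `activations` is assumed
# to be a precomputed matrix of hidden-layer activations for a test set,
# shape (n_images, n_units); how you obtain it depends on your network.
import numpy as np

rng = np.random.default_rng(0)
n_images, n_units = 10_000, 512
activations = rng.standard_normal((n_images, n_units))  # placeholder data

def top_images_for_direction(activations, direction, k=8):
    """Indices of the k images that respond most strongly along `direction`."""
    scores = activations @ direction
    return np.argsort(scores)[::-1][:k]

# Single neuron: a one-hot direction.
unit = 42
top_single = top_images_for_direction(activations, np.eye(n_units)[unit])

# Random linear combination of neurons: a random unit-length direction.
random_direction = rng.standard_normal(n_units)
random_direction /= np.linalg.norm(random_direction)
top_random = top_images_for_direction(activations, random_direction)

# The paper's observation: the images behind `top_random` look just as
# semantically coherent to a human as the images behind `top_single`.
print(top_single, top_random)
```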

Generally speaking, I don't really want to discuss this part of the article seriously, because it belongs more to the realm of religion than to science.

The initial claim that specific neurons are responsible for specific features comes from the very abstract argument that a neural network should resemble the human brain in how it works. I could not find anywhere where the claim that the learned features should be understandable to a human actually comes from. Verifying this claim is a very strange exercise, because it is easy to find a common feature in a small collection of arbitrary images if one wants to, and a statistically significant check on a large volume of images is impossible, since the process cannot be automated. The result is therefore quite logical: if common features can be found for one set of images, the same can be done for any other set.


An example of images with the same property from the original article.

At the same time, the overall conclusion of this part of the article seems reasonable: knowledge about the subject area is contained in the entire architecture of the network and the parameters of all its neurons rather than in each particular neuron separately.

Blind spots in the network

The researchers conducted the following experiment: they set out to find objects that the network classifies incorrectly and that lie as close as possible to objects of the training set. To search for them, the authors developed a special optimization algorithm that moves away from the original image in the direction that worsens the network's responses until the classification of the object breaks.
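This is not the authors' exact procedure (the paper uses a box-constrained L-BFGS optimization); below is only a minimal PyTorch sketch of the same idea, where `model`, `image` and `label` are assumptions about your setup: start from a correctly classified image and take small gradient steps that increase the loss until the predicted class changes.

```python
import torch
import torch.nn.functional as F

def find_blind_spot(model, image, label, step=1e-2, max_iters=100):
    """Iteratively distort `image` (a tensor in [0, 1]) until `model`
    stops predicting `label`, or give up after `max_iters` steps."""
    model.eval()
    x = image.clone().detach().requires_grad_(True)
    for _ in range(max_iters):
        logits = model(x.unsqueeze(0))
        if logits.argmax(dim=1).item() != label:
            return x.detach()              # classification has broken
        loss = F.cross_entropy(logits, torch.tensor([label]))
        loss.backward()
        with torch.no_grad():
            x += step * x.grad.sign()      # step in the direction that worsens the answer
            x.clamp_(0.0, 1.0)             # stay inside the valid pixel range
        x.grad.zero_()
    return None                            # no blind spot found within the budget
```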

The experiment gave the following results:

  1. For virtually any object of the training or test set, it is possible to find a distorted version that is visually almost indistinguishable from the original but is classified incorrectly.
  2. The same distortions keep causing errors when applied to networks with a different architecture or networks trained on a different training set.
It is these blind spots that the whole discussion is basically about, so let's try to answer the questions that arise. But first, let's look at a few basic objections that occur to people reading the description of the study:



What is the real news here?

The fact that a neural network may have blind spots next to objects of the training set is not really big news. The point is that nobody ever promised local accuracy for neural networks.

There are classification methods (for example, Support Vector Machines) whose training explicitly maximizes the separation between the objects of the training set and the class boundaries. Neural networks have no requirement of this kind; moreover, because of their complexity, the resulting partitioning of the input space usually defies normal interpretation and investigation. So the fact that areas of local instability can be found in networks is not news, but confirmation of a fact that was already well known.
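For reference, this maximum-margin idea can be written down explicitly; the hard-margin SVM formulation below is standard textbook material, not something taken from the discussed paper:

```latex
% Hard-margin SVM: maximize the margin between training points (x_i, y_i),
% y_i in {-1, +1}, and the separating hyperplane w^T x + b = 0.
\min_{w,\,b} \ \frac{1}{2}\lVert w \rVert^2
\quad \text{subject to} \quad
y_i \,\bigl(w^\top x_i + b\bigr) \ge 1, \qquad i = 1, \dots, n.
% A standard neural-network training objective contains no comparable
% constraint, which is why local instability near training points is
% not surprising.
```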

What is really new here is that the distortions that lead to errors retain their properties when switching to a different network architecture or a different training set. This is indeed a very unexpected discovery, and I hope the authors will find an explanation for it in future work.
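To make the claim concrete, here is a rough sketch (all names are assumptions, not the authors' code) of how one could check the effect: distortions generated against one model are fed to a completely different model, and its error rate on them is compared with its ordinary test error.

```python
import torch

def transfer_error_rate(other_model, distorted_images, true_labels):
    """Fraction of distorted images (built against some *other* network)
    that this model also misclassifies."""
    other_model.eval()
    with torch.no_grad():
        preds = other_model(distorted_images).argmax(dim=1)
    return (preds != true_labels).float().mean().item()

# If this rate is far above the model's ordinary test error, the blind
# spots found for the first network "transfer" to the second one.
```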

Are neural networks a dead end?

No, neural networks are not a dead end. They are a very powerful tool that solves a certain set of very specific problems.

The popularity of neural networks is based on two ideas:


Therefore, a neural network is a means of quickly obtaining acceptable solutions to very complex recognition problems. Nobody ever promised anything more than that from neural networks (although there have been many attempts). The key words here are "quickly" and "complex problems":

Can you trust neural networks?

The main conclusion of the article discussing the original research was: "Until this happens, we cannot rely on neural networks where safety is crucial...". Then, in the ensuing discussions, the Google self-driving car kept popping up for some reason (apparently because of where the authors work and the picture of a car in the article).

In fact, neural networks can be trusted, and there are several reasons for this:

  1. What matters to the user of a neural network (as opposed to the researcher) is not where exactly it makes mistakes, but how often. Believe me, you will not care at all whether your self-driving car failed to recognize a truck that was in its training set or one it had never seen before. The entire study is devoted to finding errors in specific areas near the training set, while the overall quality of neural networks (and the ways to evaluate it) is not called into question.
  2. No recognition system ever works at 100%; there are always errors. One of the first principles of robotics in recognition is that you should never take action based on a single sensor reading: you should always use a sliding window of values and discard the outliers (see the sketch right after this list). The same holds for any critical system: in any real task there is a stream of data, and even if the system fails at some moment, the neighboring data will correct the situation.
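A minimal sketch of the "sliding window" idea from point 2 (the window size and the voting rule are my own assumptions, not something from the article):

```python
from collections import Counter, deque

class SmoothedClassifier:
    """Wraps a per-frame classifier and acts on a majority vote over the
    last few frames, so that a single blind-spot error is outvoted by
    its neighbours in the data stream."""

    def __init__(self, classify, window=5):
        self.classify = classify              # any function: frame -> class label
        self.recent = deque(maxlen=window)

    def __call__(self, frame):
        self.recent.append(self.classify(frame))
        # Majority vote over the window; one wrong answer cannot flip
        # the decision once the window is full.
        return Counter(self.recent).most_common(1)[0][0]
```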




So, in any critical system a neural network should be treated as just another type of sensor: it generally gives correct data, but sometimes makes mistakes, and those mistakes have to be accounted for.

What is important in this article?

It would seem that, if there are no great revelations in the article, why write about it at all?

In my opinion, the article has one main result: a well-thought-out method for noticeably improving the quality of a neural network during training. When training recognition systems, a standard trick is often used: in addition to the original objects of the training set, the same objects with added noise are also used for training.

The authors of the article showed that instead one can use objects with the distortions that lead to neural network errors; this removes the errors on those distortions and at the same time improves the quality of the whole network on the test set. This is an important result for practical work with neural networks.
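As a rough sketch of how such augmentation could look in practice (the authors' exact training schedule is not reproduced here; `make_adversarial` could be, for example, the find_blind_spot sketch from earlier applied image by image, and every name below is an assumption):

```python
import torch
import torch.nn.functional as F

def train_step(model, optimizer, images, labels, make_adversarial):
    """One training step on a batch plus its distorted, error-inducing copies."""
    adv_images = make_adversarial(model, images, labels)   # distortions that currently fool the net
    x = torch.cat([images, adv_images])
    y = torch.cat([labels, labels])
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```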

In the end, I can only recommend not reading articles with "sensational" headlines, but instead finding the primary sources and reading them: everything there is much more interesting.

Source: https://habr.com/ru/post/225349/

