
Algorithms of the Mind, or What New Learning Machines Reveal About Ourselves



"Science often follows technology, because discoveries give us new ways of thinking about the world and new phenomena that need explanation."

So says Aram Harrow, professor of physics at the Massachusetts Institute of Technology, in his article "Why now is the right time to study quantum computing".
He believes that the scientific idea of entropy could not be fully comprehended until steam engine technology demanded the development of thermodynamics. In a similar way, quantum computing emerged from attempts to simulate quantum mechanics on ordinary computers.

So what does all this have to do with machine learning?

Like steam engines, machine learning is a technology designed to solve specific classes of problems. Nevertheless, the results emerging from this field give us intriguing - perhaps fundamental - scientific hints about how the human brain functions, how it perceives the world, and how it learns. Machine learning gives us new ways to think about the science of human thought ... and imagination.

Not machine recognition, but machine imagination



Five years ago, Geoff Hinton, a pioneer of deep learning who currently splits his time between the University of Toronto and Google, published the following video.



Hinton trained a five-layer neural network to recognize handwritten digits from bitmap images. This was a form of machine object recognition, one that makes handwriting machine-readable.

But unlike previous work in the field (where the goal was simply to recognize the digits), Hinton's network could also run the process in reverse. That is, starting from the concept of a symbol, it could recreate an image corresponding to that concept.



We see the machine, in the literal sense of the word, imagine an image of the concept "8".

The magic is encoded in the layers between input and output. These layers act as a kind of associative memory, performing the mapping in both directions (from image to concept and back) within a single neural network.
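To make this concrete, below is a minimal sketch of such a bidirectional network in PyTorch. The class name, layer sizes, and the recognize/imagine methods are invented for illustration; this is not Hinton's actual model, which was a deep belief network built from stacked restricted Boltzmann machines. Training is omitted, so an untrained instance will "imagine" only noise.

import torch
import torch.nn as nn

class BidirectionalDigitNet(nn.Module):
    """A toy image <-> concept mapper (illustrative, not Hinton's network)."""
    def __init__(self, image_dim=28 * 28, hidden_dim=256, n_concepts=10):
        super().__init__()
        # Recognition direction: bitmap -> concept scores.
        self.encoder = nn.Sequential(
            nn.Linear(image_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, n_concepts),
        )
        # Imagination direction: concept -> bitmap.
        self.decoder = nn.Sequential(
            nn.Linear(n_concepts, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, image_dim), nn.Sigmoid(),
        )

    def recognize(self, image):
        return self.encoder(image)           # concept logits

    def imagine(self, concept_onehot):
        return self.decoder(concept_onehot)  # pixel intensities in [0, 1]

net = BidirectionalDigitNet()
eight = torch.zeros(1, 10)
eight[0, 8] = 1.0                            # the concept "8"
imagined = net.imagine(eight)                # an "imagined" 28x28 bitmap
print(imagined.view(28, 28).shape)           # torch.Size([28, 28])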

Can human imagination work like that?

But behind this simplified technology, built on a model of the human brain, lies a broader scientific question: does human imagination (mental imagery, visualization) work this way? If so, that is a significant discovery.

After all, isn't this what our brains do quite naturally? When we see the digit 4, we think of the concept "4". And conversely, when someone says "8", we can picture the figure 8 in our imagination.

Isn't all this a kind of "reverse process" by which the mind goes from a concept to an image (or a sound, a smell, a feeling, and so on) by means of information embedded in its layers? Haven't we just witnessed this network create new images - and couldn't an improved version create new internal connections as well?

Concepts and intuitions


If visual recognition and imagination really are just a bidirectional mapping between images and concepts, what happens between those layers? Can deep neural networks perceive by means of intuition, or some similar process?

Let us first look to the past: 234 years ago, Immanuel Kant published his philosophical work, the Critique of Pure Reason, in which he asserts that intuition is nothing but the representation of an appearance.



Kant opposed the idea that human knowledge could be the result of purely empirical or purely rational thinking. He argued that knowledge through intuition must also be taken into account. By "intuitions" he meant representations obtained through sensory perception, while "concepts" serve as descriptions of empirical objects or sensations. Together they form human knowledge.

Two centuries later, Professor Alyosha Efros of the Computer Science Department at the University of California, Berkeley, who specializes in visual understanding, observed that there are far more things in the visual world than there are words to describe them. According to Efros, using words to train our models imposes a language-based limitation on our technologies. Unnamed intuitions far outnumber our words.

Here we see an intriguing correspondence between labels in machine learning and human concepts, and between encodings in machine learning and human intuitions.



In deep learning, we find that activations in successive layers progress from lower conceptual levels to higher ones, as shown by the "cat recognition" study led by Quoc Le, who works at Google and Stanford University. An image recognition network encodes raw bitmaps in the lowest layer; the next layer encodes visible corners and edges; the layer after that, simple shapes; and so on. These intermediate layers are not required to have any activations corresponding to high-level concepts such as "cat" or "dog", yet they still encode a distributed representation of the input sensory information. Only the final output layer is constrained to correspond to labels assigned by a person.
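As a rough illustration, here is a sketch of such a feature hierarchy, again in PyTorch. The layer names ("edges", "shapes", "parts") are chosen purely for exposition, and the architecture is invented; it is not the network from Quoc Le's study, which was an unsupervised sparse autoencoder trained on millions of YouTube frames. Note that only the final linear head is tied to human labels.

import torch
import torch.nn as nn

# Successive layers build increasingly abstract distributed representations.
layers = nn.ModuleDict({
    "edges":  nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU()),
    "shapes": nn.Sequential(nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU()),
    "parts":  nn.Sequential(nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU()),
})
head = nn.Linear(64, 2)  # only this layer must match human labels ("cat"/"dog")

x = torch.randn(1, 3, 64, 64)          # a stand-in bitmap
for name, layer in layers.items():
    x = layer(x)
    print(f"{name:6s} activation: {tuple(x.shape)}")
logits = head(x.mean(dim=(2, 3)))      # global average pool, then the label layer
print("label scores:", tuple(logits.shape))   # (1, 2)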

Is this not intuition?

Consequently, the encodings and labels discussed above appear to be what Kant called "intuitions" and "concepts".
This is another example of how machine learning technology helps us understand the principles of human thinking. The network scheme presented above makes one wonder whether it is a greatly simplified architecture of intuition.

The controversy around the Sapir-Whorf hypothesis


Efros's observation raises a question: if there are far more phenomena in the world than words to describe them, are our thoughts limited by words? This question lies at the heart of the Sapir-Whorf hypothesis of linguistic relativity and the controversy over whether language fully defines the boundaries of our knowledge, or whether we are naturally able to comprehend any phenomenon regardless of the language we speak.

In its strong form, this hypothesis states that the structure and vocabulary of a language shape how a person perceives and understands the world.

Some of the most striking results come from a color discrimination test. If asked to find the square whose shade differs from all the others, members of the Himba ethnic group of northern Namibia, whose language has distinct names for the two shades involved, find it almost immediately.



The rest of us, meanwhile, struggle with the task.

The theory is that once there are words for different shades, our brain learns to distinguish between them, and over time these differences become more "obvious". Since visual perception happens in the brain rather than in the eyes, language shapes what we perceive.

We see with our brains, not with our eyes.

We see something similar in machine learning. In supervised learning, we train our models to determine the appropriate labels or categories of images (or other inputs - text, audio, and so on) as accurately as possible. By definition, these models learn to distinguish labeled categories far more effectively than categories for which no labels were provided. Viewed from the perspective of supervised machine learning, this outcome is unsurprising. So perhaps we should not be too surprised by the results of the test above, either. Language really does affect our perception of the world, just as labels in supervised learning affect a model's ability to distinguish categories.
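The analogy fits in a few lines of code. In this hedged sketch (PyTorch again; the toy "shade" data is made up), a classifier is trained only on two named categories, and it is precisely those categories it learns to tell apart:

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(3, 32), nn.ReLU(), nn.Linear(32, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.5)
loss_fn = nn.CrossEntropyLoss()

# Toy data: RGB colors representing two subtly different shades of green.
colors = torch.tensor([[0.28, 0.60, 0.20], [0.30, 0.62, 0.20],
                       [0.28, 0.66, 0.20], [0.30, 0.68, 0.20]])
labels = torch.tensor([0, 0, 1, 1])   # the "names" the model is taught

for step in range(1000):
    optimizer.zero_grad()
    loss = loss_fn(model(colors), labels)
    loss.backward()
    optimizer.step()

print(model(colors).argmax(dim=1))    # typically tensor([0, 0, 1, 1])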

But at the same time, we know that labels are not a prerequisite for the ability to distinguish categories. In Google's cat recognition project, the neural network ultimately discovered the concepts of cat, dog, and so on without the algorithm ever being given labels. After this unsupervised training, whenever the network received an image from a certain category (for example, "cats"), the same corresponding set of neurons was activated. From a large number of training images, the network extracted the features typical of each category, as well as the differences between categories.
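Here is a complementary sketch of the unsupervised side of the coin: a clustering algorithm recovers category structure without ever seeing a label. scikit-learn's KMeans on made-up 2-D points is used as a stand-in; the actual Google experiment trained a deep sparse autoencoder on millions of YouTube frames.

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Two unnamed "categories" of points; no labels are ever provided.
group_a = rng.normal(loc=0.0, scale=0.5, size=(50, 2))
group_b = rng.normal(loc=3.0, scale=0.5, size=(50, 2))
data = np.vstack([group_a, group_b])

clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(data)
# The algorithm discovers the two groups on its own; only a human attaches
# words like "cat" and "dog" to them afterwards.
print(clusters[:5], clusters[-5:])    # e.g. [0 0 0 0 0] [1 1 1 1 1]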

In the same way, a baby who is repeatedly shown a paper cup will soon be able to recognize it by sight, even before learning the words "paper cup" to associate with the image. In this sense, the strong form of the Sapir-Whorf hypothesis cannot be entirely correct: we can form concepts without words to describe them, and we do so all the time.

In this controversy, supervised and unsupervised machine learning turn out to be two sides of the same coin. And if we recognize them as such, the Sapir-Whorf hypothesis ceases to be a point of contention and becomes instead a reflection of human learning with and without a teacher.

I find this analogy endlessly fascinating - and we have only begun to understand the issue. Philosophers, psychologists, linguists, and neuroscientists have long studied this topic. Meanwhile, by processing enormous amounts of text, images, and audio, the latest deep learning architectures demonstrate human-level or better results in image classification, language translation, and speech recognition.

Each new discovery in machine learning gives us the opportunity to learn something new about the processes occurring in the human brain. And when speaking of our own minds, we have more and more reason to refer to machine learning.

P.S. Can you see which square differs from the rest? Write your answer in the comments, using the clean picture for the "awareness" test and the picture below to determine the number of the square:



P.P.S. There is a suggestion that different monitors may visually "point" to a different square, and I (the translator) agree with it. For the most inquisitive minds, let me note that there are purely technical means of checking such assumptions.

Source: https://habr.com/ru/post/294212/

