If you want to build artificial intelligence into every product, you have to retrain your army of programmers.
Carson Holgate is training to become a ninja. Not in hand-to-hand combat; that she has already mastered. She is 26 and holds a second-degree black belt in taekwondo. This time she is training in algorithms: for several weeks now she has been working through a regimen that will give her a kind of power no martial art can match. That power is machine learning (ML). Holgate works as a programmer at Google, in the Android division, and is one of 18 programmers taking part this year in the Machine Learning Ninja program, which pulls talented coders from their teams and, Ender's Game style, drops them into an intensive course. The program teaches them techniques for building AI into their products, even at the cost of complicating their code.

“Our slogan is: ‘Do you want to be a machine learning ninja?’” says Christine Robson, a product manager for Google's internal courses, who helped implement the program. “We invite people from around Google to spend six months embedded with the machine learning team, sitting next to a mentor, working on machine learning for half a year, doing a project, shipping it, and learning along the way.”
For Holgate, who came to Google almost four years ago with degrees in computer science and mathematics, the program is a chance to master the hottest paradigm in software: using learning algorithms and large amounts of data to "teach" programs to perform tasks. For many years machine learning was a specialty of a small elite. That era is over: it is now widely believed that ML, powered by neural networks that emulate the workings of the biological brain, is the true path to endowing computers with human, and sometimes superhuman, capabilities. Google intends to grow that elite within the company, and hopes that this knowledge will eventually become the norm. For programmers like Holgate, the program is a chance to move to the forefront of the field and learn from the best of the best. "These people build incredible models and have PhDs," she says, not hiding her admiration. She has even gotten used to being in a program that calls its students ninjas. "I winced at first, but I got over it."
Given the company's enormous workforce (programmers make up nearly half of its roughly 60,000 employees), the project is tiny. But it symbolizes a cognitive shift. Although machine learning has long been part of Google's technology (and the company has become a leader in hiring experts in the field), in 2016 Google seems practically obsessed with it. On an earnings call late last year, CEO Sundar Pichai spelled out the corporation's intentions: "Machine learning is a core, transformative way by which we're rethinking how we're doing everything. We are thoughtfully applying it across all our products, be it search, ads, YouTube, or Play. We're in the early days, but you will see us in a systematic way apply machine learning in all these areas."
Obviously, if Google is going to embed machine learning in all its products, it needs programmers who know these technologies, which represent a sharp break from traditional programming. As Pedro Domingos, author of the popular ML manifesto "The Master Algorithm", writes: "Machine learning is something completely new: a technology that builds itself." Writing such systems successfully means identifying the right data set, choosing the right algorithmic approach, and setting the right conditions. And then, hardest of all for a programmer, you have to trust the system to do the work.
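As a toy illustration of that shift (nothing from Google's codebase; the data, learning rate, and line-separation task are all invented for this sketch), here is a one-neuron "network" that is never hand-coded with a rule. It infers the rule from labelled examples, which is exactly the trust-the-system workflow described above:

```python
# A minimal perceptron: instead of writing the rule "point lies above
# the line y = x" by hand, we show the program labelled examples and
# let it fit weights that encode the rule itself.

def train(examples, epochs=100, lr=0.1):
    """Learn weights and bias from (x, y, label) examples."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for x, y, label in examples:
            pred = 1 if w[0] * x + w[1] * y + b > 0 else 0
            err = label - pred          # 0 when correct, +/-1 when wrong
            w[0] += lr * err * x        # nudge weights toward the answer
            w[1] += lr * err * y
            b += lr * err
    return w, b

def predict(w, b, x, y):
    return 1 if w[0] * x + w[1] * y + b > 0 else 0

# label = 1 when the point lies above the line y = x
data = [(1, 2, 1), (2, 3, 1), (0, 1, 1), (3, 1, 0), (2, 0, 0), (4, 2, 0)]
w, b = train(data)
print(predict(w, b, 1, 5))  # a point well above the line -> 1
print(predict(w, b, 5, 1))  # a point well below the line -> 0
```

The rule never appears in the code; it lives in the learned numbers `w` and `b`, which is what makes such systems powerful and, to a traditional programmer, unsettling.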
"The more people who think about solving problems in this way, the better we'll be," says Jeff Dean, the company's machine learning leader, who occupies roughly the same place among Google's software engineers as Tom Brady [https://en.wikipedia.org/wiki/Tom_Brady] does among NFL quarterbacks. By his estimate, of Google's 25,000 programmers, only "a few thousand" are proficient in machine learning. Maybe ten percent. He wants that figure to reach one hundred percent. "It would be great if every programmer had at least some knowledge of machine learning," he says.
Will it ever happen?
“We're going to try,” he says.
* * *
For many years, John Giannandrea has been Google's chief promoter of machine learning. Fittingly, he has just taken over as head of search, eloquent evidence of the company's new course. But when he came to the company in 2010 (as part of its acquisition of Metaweb, whose database of people, places, and things is now integrated into Google search as the Knowledge Graph), he had little experience with ML or neural networks. In 2011 he was struck by the news coming out of the Neural Information Processing Systems (NIPS) conference. It seemed that every year at NIPS some new team announced results achieved with machine learning that far surpassed previous approaches, be it in translation, voice recognition, or machine vision. Something amazing was happening. "At first it seemed like the things being discussed at this conference were completely obscure," he says. "But this area at the intersection of research and industry has simply skyrocketed over the past three years. Last year, I think, 6,000 people attended."

Jeff Dean
Improved neural-network algorithms, coupled with ever-greater computational power courtesy of Moore's law and the exponential growth of data on user behavior collected by companies like Google and Facebook, launched a new era of machine learning's dominance. Giannandrea joined the group of people convinced that ML had to become the company's central technology. That group includes Dean, co-founder of Google Brain, the neural-network project that originated in the company's long-range research division Google X (now known simply as X).
Google's bear hug of machine learning is not simply about a shift in programming technique. It is a commitment to technologies that will give computers unprecedented powers. At the leading edge are "deep learning" algorithms built on neural networks inspired by the architecture of the brain. Google Brain is one deep learning effort; another is DeepMind, the AI company Google bought in January 2014 for about half a billion dollars. It was DeepMind that created AlphaGo, the system that defeated a champion at the game of Go, exceeding what computers were expected to be capable of and stirring unease among those who fear smart machines and killer robots.
Giannandrea does not believe that "robots are going to kill us all," but he does maintain that ML-based systems will change everything, from medical diagnoses to driving our cars. And although these systems will not replace humans, they will change humanity.
As an example of what machine learning can do, he cites Google Photos, whose defining feature is an uncanny, almost unsettling ability to find images of whatever the user specifies. Show me pictures of border collies. "When people see that for the first time, they feel like something completely different is happening: the computer isn't just calculating preferences or suggesting videos to watch," says Giannandrea. "It understands what is in the picture." Through training, the computer has "learned" what a border collie looks like, and it will find photos of collie puppies, old collies, long-haired and close-trimmed collies. A person can do that, too. But no human can sift through a million examples and recognize ten thousand breeds at once. A machine learning system can. If it learns one breed, it can apply the same technique to identify the other 9,999. "That is what's really new here. For narrow domains like this, learning systems show capabilities that some people describe as superhuman."
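To make the learn-one-breed-generalize idea concrete, here is a toy sketch (in no way Google Photos' real pipeline; the three-number "embeddings" are invented stand-ins for what a trained vision network would output): once a network can turn any image into a vector, recognizing a new category takes only a handful of labelled examples, averaged into a prototype and matched by similarity.

```python
# Classify "images" by comparing embedding vectors: the numeric
# summaries a trained network produces. Learning a new breed = averaging
# a few example embeddings and matching new photos by cosine similarity.
import math

def centroid(vectors):
    """Average of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def classify(embedding, class_centroids):
    """Return the label whose prototype is most similar to the embedding."""
    return max(class_centroids,
               key=lambda label: cosine(embedding, class_centroids[label]))

# Pretend embeddings produced by a vision network (made up for illustration):
examples = {
    "border collie": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "beagle":        [[0.1, 0.9, 0.2], [0.2, 0.8, 0.1]],
}
centroids = {label: centroid(vs) for label, vs in examples.items()}

# A new photo, say a close-trimmed border collie, embeds near its breed:
print(classify([0.9, 0.2, 0.0], centroids))  # -> border collie
```

The heavy lifting in a real system happens inside the network that produces the embeddings; the point of the sketch is that once that representation exists, adding breed number 10,000 is cheap.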
Of course, machine learning is nothing new at Google. Its founders have been lifelong believers in the power of AI, and ML is already built into many of the company's products, although not every version of it uses neural networks (earlier efforts relied on simpler statistical approaches).
The company has run internal ML courses for more than a decade. In 2005, Peter Norvig, who was then in charge of search, suggested that researcher David Pablo Cohn look into whether Google could adopt a machine learning course like the one taught at Carnegie Mellon University. Cohn concluded that only Google's own employees could teach such a course, since the company operated at a scale unlike anyone else's (with the possible exception of the Department of Defense). He booked a large room in Building 43 (then the headquarters of the search team) and gave two lectures every Wednesday. Even Jeff Dean dropped in on a couple of them. "It was the best course in the world," says Cohn. "They were all much better programmers than I was." The course was so popular it began to get out of hand: people in the Bangalore office were staying at work overnight to follow the lectures remotely. After a couple of years, employees condensed the lectures into short videos, and the live sessions stopped. Cohn believes the course could count as a forerunner of massive open online courses (MOOCs). Over the following years the company ran other ML courses, irregular and smaller in scale. Cohn left the company in 2013, just before machine learning at Google "suddenly became so important to everyone."
Nevertheless, this understanding waited in the wings until 2012, when Giannandrea had the idea of "gathering up a bunch of people doing this stuff" and putting them all in one building. Google Brain, which had graduated from the X division, joined the get-together. "We pulled a lot of teams together into one building and bought them a great new coffee machine," he says. "People who had been working on computer perception, on sound and speech recognition, were now talking to the people working on language."
Gradually, the work of these ML programmers began to show up in popular Google services. Since the field's main application areas so far are vision, speech recognition, speech synthesis, and translation, it is no surprise that the technology is now part of voice search, Translate, and Photos. More striking is the effort to push machine learning everywhere. Jeff Dean says that as he and his team have come to understand ML better, they keep raising the bar for how to exploit it. "Before, we might use machine learning in a few sub-components of a system," he says. "Now we actually use machine learning to replace entire sets of systems, rather than trying to build a better model for each of the pieces." If he were rewriting Google's infrastructure today, says Dean, who is known as a co-creator of such field-shaping systems as Bigtable and MapReduce, much of it would not be coded but learned.
Greg Corrado, co-founder of Google Brain

Machine learning makes it possible to give products features that were hard to imagine before. One example is Smart Reply in Gmail, launched in November 2015. It began with a conversation between Greg Corrado, a co-founder of Google Brain, and Gmail engineer Bálint Miklós. Corrado was already working with the Gmail team on spam detection and message classification when Miklós suggested something radical: what if machine learning could automatically generate replies to messages, sparing users the trouble of pecking out text on tiny keyboards? "I was flabbergasted; the suggestion sounded crazy," says Corrado. "But then I thought that with the predictive power of our neural networks, it might just be feasible. And if there was even a slim chance, we had to try."
Google improved its chances by keeping Corrado and his team in close, constant contact with the Gmail group. That approach is being used more and more often: ML experts are seeded into product teams across the company. "Machine learning is as much art as science," says Corrado. "It's like cooking: yes, chemistry plays a role, but to do something really interesting, you have to learn how to combine the ingredients in front of you."
Traditional AI approaches to language required building the rules of the language into the system, but in this project, in the modern ML style, the system was fed enough data to learn on its own, the way children do. "I didn't learn to talk from a linguist; I learned to talk from hearing other people talk," says Corrado. What really made the project feasible was the ability to define success crisply. They did not want to build a virtual Scarlett Johansson double that would flirt in chat; they needed plausible replies to real messages. "Success was the machine producing a reply that people found useful enough to actually send," he says. Accordingly, the system was trained on whether users chose one of its proposed answers.
Once testing began, users noticed an odd quirk: the system kept proposing inappropriately affectionate replies. "One of its failure modes was a really comical tendency to suggest 'I love you' whenever the algorithm got confused," says Corrado. "It wasn't a bug in the software; the mistake was in what we asked it to do." The program had picked up a subtle aspect of human behavior: "If you're in a bind, saying 'I love you' is a pretty good defensive strategy." Corrado helped the team dial back the romantic fervor.
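A crude sketch of that feedback loop (entirely invented; the real Smart Reply generates candidate sentences with neural networks rather than keeping a fixed score table) shows the mechanism: suggestions whose scores rise only when users actually send them, so an over-eager default gradually sinks.

```python
# Candidate replies carry scores; the top three are shown; a score
# moves up only when a user actually sends that suggestion -- the
# "useful enough to use" success signal described above.
scores = {"Sounds good!": 1.0, "I'll take a look.": 0.9,
          "Thanks!": 0.8, "I love you": 0.7}

def suggest(n=3):
    """Top-n candidates by current score."""
    return sorted(scores, key=scores.get, reverse=True)[:n]

def record_send(reply, lr=0.1):
    """User sent this suggestion: pull its score up, others down."""
    for candidate in scores:
        target = 1.0 if candidate == reply else 0.0
        scores[candidate] += lr * (target - scores[candidate])

print(suggest())            # before feedback: the three highest scores
for _ in range(5):          # users keep choosing "Thanks!"
    record_send("Thanks!")
print(suggest())            # "Thanks!" has climbed to the top
```

In this toy, "I love you" never reaches the top three unless users keep sending it, which mirrors how the training signal, not the software, decides what gets suggested.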
Smart Reply, released last November, has become a hit: Gmail's Inbox users now routinely see three suggested replies to a message that they can send with a single tap. Often they are uncannily on point. One in ten replies sent from the mobile version of Inbox is machine-generated. "I'm still kind of surprised it works," Corrado laughs.
Smart Reply is only one node in the dense graph of Google projects where machine learning has proved itself. Perhaps the real turning point came when ML became part of search, Google's flagship product and the source of virtually all its revenue. Search has always had AI in its bones, but for years the sacred algorithms producing the "ten blue links" in response to a query were considered too important to hand over to learning systems. "Because search is such a large part of the company, ranking is heavily evolved, and people were skeptical that you could move the needle very much," says Giannandrea.
Part of it was cultural resistance: control-loving master hackers were reluctant to accept machine learning's Zen-like, hands-off approach. Amit Singhal, the long-time head of search, was himself a disciple of Gerald Salton, the legendary computer scientist whose pioneering work on document retrieval had inspired Singhal to help rework the Brin-and-Page-era code into something that could scale to the demands of the new computing era. (That put him squarely in the "retrievers" camp.) He got remarkable results out of those 20th-century methods and was wary of admitting learning systems into the intricate machinery at the heart of Google. "My first couple of years at Google, I was on the search quality team, trying to use machine learning to improve ranking," says David Pablo Cohn. "It turned out that Amit's intuition was the best in the world, and we did better by hard-coding whatever was in his head. We couldn't find anything as good as his approach."
By early 2014, Google's machine learning specialists decided that this had to change. "We had a series of conversations with the ranking team," says Dean. "We said we should at least try it and see whether there's any gain to be had." The experiment they settled on was central to search: how well a document in the results matches the query (as measured by user clicks). "We sort of just said, let's try computing this extra score with a neural net and see if it's a useful score."
It turned out to be, and the system is now part of search under the name RankBrain. It launched in April 2015. Google, as usual, is cagey about exactly how it improves search (something to do with long-tail queries? better interpretation of ambiguous ones?), but Dean says RankBrain is involved "in every query" and affects the rankings of "probably not every query, but a lot of them." And it is remarkably effective. Of the hundreds of "signals" the search engine uses to compute rank (a signal is a parameter like geolocation, or whether the query matches a page's title), RankBrain is now rated the third most important.
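A stylized picture of how a learned score can sit alongside hand-built signals (all names, weights, and the `learned_relevance` stand-in here are invented; Google's actual signals and weighting are not public):

```python
# A document's rank score blends many weighted "signals". Here a made-up
# learned score slots in beside hand-built ones, the way RankBrain is
# described as one signal among hundreds.

def learned_relevance(query, doc):
    """Stand-in for a neural model's query-document score (invented:
    crude word overlap between query and title)."""
    q = set(query.lower().split())
    d = set(doc["title"].lower().split())
    return len(q & d) / len(q)

SIGNAL_WEIGHTS = {"term_match": 0.5, "freshness": 0.2, "learned": 0.3}

def rank_score(query, doc):
    signals = {
        "term_match": doc["term_match"],
        "freshness": doc["freshness"],
        "learned": learned_relevance(query, doc),
    }
    return sum(SIGNAL_WEIGHTS[name] * value
               for name, value in signals.items())

docs = [
    {"title": "border collie training guide", "term_match": 0.6, "freshness": 0.9},
    {"title": "collie photos", "term_match": 0.7, "freshness": 0.4},
]
query = "border collie training"
ranked = sorted(docs, key=lambda d: rank_score(query, d), reverse=True)
print(ranked[0]["title"])  # -> border collie training guide
```

The design point is that the learned signal does not replace the ranking machinery; it is weighted in beside the existing signals, which is why it could be adopted without handing the whole system over to a neural network.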
"It was significant for the company that we could make search better with machine learning," says Giannandrea. "A lot of people watched our work." Pedro Domingos, the University of Washington professor who wrote "The Master Algorithm", puts it differently: "There was always a battle between the retrievers and the machine learning people. The machine learners have finally won it."
Google's new challenge is a paradigm shift in the minds of its programmers, so that everyone, if not a specialist, is at least conversant with machine learning. Trained ML experts are scarce and fiercely fought over: companies like Facebook compete for the same small pool of talent, so Google has concluded that it must grow its own.
One major step was open-sourcing TensorFlow, the machine learning toolkit developed inside Google Brain. Rajat Monga of Google Brain, who leads TensorFlow engineering, says releasing it lets Google's way of doing machine learning spread beyond the company. TensorFlow was open-sourced in November 2015, after Facebook had already open-sourced its own deep learning code for Torch earlier that year, and it quickly became popular: TensorFlow has been downloaded more than 75,000 times.
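The core idea TensorFlow popularized, describing a computation as a graph of operations and only then executing it with concrete inputs, can be sketched in a few lines of plain Python (this mimics the concept only, not TensorFlow's real API):

```python
# A toy dataflow graph: nodes are operations, edges are data.
# Build the graph first, then run it with values fed into the inputs.
class Node:
    def __init__(self, op, inputs=()):
        self.op, self.inputs = op, inputs

    def run(self, feed):
        """Evaluate this node given a {input_node: value} feed dict."""
        if self.op == "input":
            return feed[self]
        args = [n.run(feed) for n in self.inputs]
        return {"add": lambda a, b: a + b,
                "mul": lambda a, b: a * b}[self.op](*args)

# y = (x * w) + b, expressed as a graph rather than direct arithmetic
x, w, b = Node("input"), Node("input"), Node("input")
y = Node("add", (Node("mul", (x, w)), b))

print(y.run({x: 3.0, w: 2.0, b: 1.0}))  # -> 7.0
```

Separating graph construction from execution is what lets a real toolkit optimize the graph, differentiate through it for training, and run it on specialized hardware.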
Google is also building its own machine learning hardware. This spring the company revealed the Tensor Processing Unit, a custom chip tailored to the demands of machine learning, much as Graphics Processing Units were once tailored to graphics. TPUs already run in Google's data centers, where among other things they help power RankBrain.
To spread the skills, Google leans on education: internal courses, including trainings on TensorFlow, circulate through the company alongside the ninja program. For researchers from outside there is the Google Brain Residency, a program that brings promising people into Google Brain for a year; its first class numbers 27 residents, and the hope is that many of them will stay on and seed machine learning knowledge across Google.
Among the techniques in the curriculum are LSTMs ("Long Short-Term Memory" networks), the kind of neural network behind Smart Reply. As for Carson Holgate, she is back in the Android division, putting what she has learned to work building machine learning into its products. Her ninja training is only beginning.