
How to track every shot in the NBA



As a child, Kirk Goldsberry was an ardent basketball fan. But in the 1980s he lived near Penn State, far enough from Philadelphia that he could not watch the 76ers' games on TV. So, pondering which team to support, he settled on Dominique Wilkins and his Atlanta Hawks. They were 750 miles away, but thanks to the miracle of the TBS channel, Goldsberry could follow their games as if he were a Georgia native.

Goldsberry received a bachelor’s degree in geoscience from Penn State, and then a master’s and a doctorate in geography from the University of California at Santa Barbara, where he wrote a dissertation on interactive road maps for the Internet. He was looking for a way to display data about movement in space and time visually, to make the numbers visible. Maps and space shaped Goldsberry's worldview. More precisely: maps, space, and basketball.

Throughout his studies, Goldsberry did not just watch basketball; he also played in pickup games to stay in shape. And as he played, he began to think about basketball and how it differs from other sports. At the time, analytics, the statistical analysis of games and their outcomes, was beginning to complement more traditional methods of evaluating games and training, such as watching video or physical conditioning.

This revolution began in baseball, as Michael Lewis recounts in his book “Moneyball”. Statistically speaking, baseball is, by and large, a fairly simple game. It is built on a uniform sequence of one-on-one confrontations between batter and pitcher, and every play has defined starting and ending positions (a statistician would call each of these a “state”). Given this, and the abundance of play-by-play data available to researchers, it is possible to run the numbers for each of these situations and estimate the probability of the next event. If a team has a runner on first base with one out, the probability of scoring a run in that inning is 28%, and so on.
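The state-based reasoning can be illustrated with a few lines of Python. This is only a minimal sketch: the observations are made up for illustration, and the state encoding (a string for the occupied bases plus the number of outs) is an assumption, not something taken from the book.

```python
from collections import defaultdict

# Hypothetical play-by-play sample: (bases occupied, outs, did a run score
# later in the same inning?). "1--" means a runner on first base only.
observations = [
    ("1--", 1, True),
    ("1--", 1, False),
    ("1--", 1, False),
    ("1--", 1, False),
    ("---", 0, False),
    ("---", 0, True),
]

def scoring_probability(obs):
    """Return {(bases, outs): share of occurrences in which a run scored}."""
    counts = defaultdict(lambda: [0, 0])  # state -> [times scored, times seen]
    for bases, outs, scored in obs:
        counts[(bases, outs)][0] += int(scored)
        counts[(bases, outs)][1] += 1
    return {state: scored / seen for state, (scored, seen) in counts.items()}

probs = scoring_probability(observations)
print(probs[("1--", 1)])  # 0.25 for this toy sample
```

With real play-by-play data, the same count-and-divide step is what produces figures like the 28% quoted above.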


Excerpted from Mark McClusky’s book “Faster, Higher, Stronger: How Sports Science Is Creating a New Generation of Superathletes, and What We Can Learn from Them”

But Goldsberry realized that the principles of Moneyball do not carry over to the basketball court. Unlike baseball, with its static plays and successive states, basketball is a continuous game. Players flow from offense to defense, from posting up under the basket to a double team. In baseball, if a player is the left fielder, you know roughly what area he will cover on defense. In basketball, a player listed as a forward may be anywhere on the court at any moment. The game has no discrete states, so it seemed impossible to compute the probability of a particular outcome statistically. Analysts believed that evaluating each individual event, as they had in baseball, simply could not be done.

In other words, basketball resembled one of the maps Goldsberry made: a complex, intricate stream of information with no beginning or end. But that did not mean it could not be analyzed. On the contrary, Goldsberry guessed that all he needed was the right kind of data. “From my own basketball experience, I know that I have strengths and weaknesses that vary depending on where I am on the court, and I think other players feel the same way,” he says. Instead of focusing on the kind of numbers that define a state in baseball, Goldsberry began to pay attention to the location and movement of objects, in particular the players and the ball. The task was to display this as a map.

With this perspective and a large amount of data, he was able to do more than just describe how people imagined the game. He was able to uncover the hidden patterns of basketball, illuminating dark corners no one even knew existed. To understand baseball, you probably need a statistician who understands correlations and probabilities. But to understand basketball, you also need strong spatial thinking. You need a cartographer. More precisely, you need Kirk Goldsberry.

In 2011, when Goldsberry had a little free time from teaching at Michigan State and Harvard, he began building his own mapping system. But getting relevant data was a problem: tracking the constant movement of 10 players is not easy. He began browsing fan sites and sports shows, and eventually found data on every shot taken in the NBA. The data was sparse: only who took the shot, from where, and whether it went in. But it was a start.

The data was not exactly confidential, but it was not openly published either: Goldsberry dug it up on the Web. To be precise, on ESPN.com he found a shot chart accompanying each game's box score. Then he found the source files and extracted all the information. “They put these data sets out there, but they weren't using the opportunities I saw in them,” Goldsberry says.

As a result, he assembled a database of the spatial coordinates of every shot taken from 2006 to 2011: more than 700,000 shots. And here Goldsberry the cartographer and Goldsberry the basketball fan began to work together. “I wanted to find a way to make this data tell us something new, for example, where Kobe is good and where he is not,” he said. Goldsberry wanted to do more than just analyze. He wanted to show it to people, “to share it with players, fans and the media.”

Shots from all over the court


By recording the location and frequency of every shot in the NBA, Kirk Goldsberry can map any player's offensive strengths and weaknesses. Below are shot charts for two likely future Hall of Famers, Ray Allen and Dirk Nowitzki.


Midrange shots are not a strength for most players, with the exception of Dirk Nowitzki, who prefers to shoot from the right baseline.


Even the best shooter in history has relatively weak spots, for example, shots from the left side.

He divided the 1,284 square feet from which shots are taken (more precisely, the area around and inside the three-point line) into cells, like the grid in a strategy video game. Then he used the data to create maps showing where a player shot from, how often, and how accurately.
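The gridding step might look roughly like the following Python sketch. The cell size of one foot, the coordinate convention, and the sample shots are assumptions made for illustration; the article does not say how the actual cells were defined.

```python
from collections import defaultdict

CELL_FT = 1.0  # assumed cell size; the real grid resolution is not stated

def shot_map(shots, cell_ft=CELL_FT):
    """shots: iterable of (x_ft, y_ft, made).
    Returns {cell: (attempts, share of attempts made)} per grid cell."""
    tally = defaultdict(lambda: [0, 0])  # cell -> [attempts, makes]
    for x, y, made in shots:
        cell = (int(x // cell_ft), int(y // cell_ft))
        tally[cell][0] += 1
        tally[cell][1] += int(made)
    return {cell: (att, makes / att) for cell, (att, makes) in tally.items()}

# Hypothetical shots for one player: court position in feet, and the result.
sample = [
    (22.4, 3.1, True),
    (22.9, 3.8, False),
    (22.1, 3.5, True),
    (5.2, 10.0, False),
]
chart = shot_map(sample)
print(chart[(22, 3)])  # three attempts from this cell, two of them made
```

Coloring each cell by accuracy and sizing it by number of attempts would then give a map of where a player shoots, how often, and how well.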

Goldsberry called his system CourtVision, and it revealed differences between players that no one had noticed before. Ray Allen, one of the league's top shooters, has some very “hot” zones behind the three-point line and takes very few midrange jump shots. Kobe Bryant, the star of the Los Angeles Lakers, took a lot of shots from all over the court, but there are places where he is clearly less effective (for example, the baseline, from which it is harder to shoot). What Goldsberry had built was nothing less than a visual profile of offensive basketball, simple and easy to understand. It went far beyond what an analyst or coach could only guess at from the sidelines. The longer the CourtVision maps were studied, the more valuable the findings became.

Goldsberry presented his work in 2012 at the MIT Sloan Sports Analytics Conference, an annual gathering of statisticians and coaches at the Massachusetts Institute of Technology, and stunned the basketball world. For the first time, fans could see which kinds of shots their favorite players took, as well as the relative value of those shots. CourtVision did not take into account, for example, who the defender was or what else was happening on the court, but it nevertheless gave team management a potentially effective way to evaluate players, letting them track a player's productivity and fit with their style of play. After the talk, Mark Cuban, owner of the Dallas Mavericks, and R.C. Buford, general manager of the San Antonio Spurs, approached Goldsberry and asked to hear more. He said: “It was one of those moments when you think: ‘Wow! If I do everything right, I can turn this into something more than a hobby I work on at night and on weekends.’”

Goldsberry's work caught the attention of Brian Kopp, then executive vice president of Stats, a company based near Chicago. Stats was founded in the 1980s by a group of baseball researchers who compiled the most complete game statistics available. Today it is a giant that supplies statistics on professional sports in the United States to teams, leagues, and the media. In 2012, Stats was also moving into basketball, working on a new data collection method called SportVU. Shortly after the 2012 conference presentation, Kopp contacted Goldsberry and asked whether he would like to take a look at something.

SportVU is built on computer vision technology developed by Israeli scientists to track missiles. In 2005, the Israelis found a use for it in sports, mounting three cameras above a soccer field to watch the game and feed data to a central computer. Using the parallax effect and other image-processing tricks, the system could track every object on the field, from the players and the ball to the referees, and determine their positions in three-dimensional space 25 times per second. In 2008, Stats acquired SportVU with the aim of developing a six-camera setup for basketball.

The setup was not cheap: every NBA team that wanted the data had to pay about $100,000 to install the cameras and computers in its arena. By the end of the 2012–2013 season only 15 teams had bought it, and the data was incomplete: only about half of all games were recorded. But the data seemed to have enormous potential. In September 2013, the NBA signed an agreement to install the system in every arena in the league.

“Brian called me and asked, ‘Do you want to work with this data?’” says Goldsberry. “I got the lucky chance to access data that few people outside the NBA had seen.” It was a jackpot, a goldmine: data far more detailed than what he had pulled from ESPN.com, giving a complete description of every moment of every possession, including where and how the players moved to get a shot. Once he had it, he could answer all sorts of questions. Want to know how far a player runs in a game? Trivial. Wondering who the best passer on your team is? Easy. How does the efficiency of your pick-and-rolls (a two-man play in which one offensive player sets a screen, the pick, for the ball handler and then, once both defenders commit to the ball handler, cuts to the basket, the roll, to receive a pass for an open shot) differ from the league average when the play starts less than 15 seconds before the buzzer? SportVU could answer that question too.
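To see why the distance question becomes trivial with tracking data, consider a minimal Python sketch. It assumes a player's positions arrive as a list of (x, y) coordinates in feet sampled 25 times per second, as described above; the data layout and the sample track are hypothetical and do not reflect the actual SportVU format.

```python
import math

FRAMES_PER_SECOND = 25  # sampling rate mentioned in the article

def distance_covered(track):
    """Sum the straight-line distances between consecutive positions.
    track: list of (x_ft, y_ft) samples taken at a fixed rate."""
    return sum(math.dist(a, b) for a, b in zip(track, track[1:]))

# Hypothetical four-frame track for one player (positions in feet).
track = [(0.0, 0.0), (0.3, 0.4), (0.6, 0.8), (0.6, 0.8)]

total_ft = distance_covered(track)
elapsed_s = (len(track) - 1) / FRAMES_PER_SECOND
avg_speed_ft_s = total_ft / elapsed_s
print(round(total_ft, 3), round(avg_speed_ft_s, 3))
```

Summed over every frame of a game, the same calculation gives the total distance a player covers.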

But one of the main discoveries Goldsberry was so eager to share was a way to understand one of the thorniest aspects of the sport: defense. For decades, teams relied on simple counts, steals and blocked shots, to determine a player's defensive value. SportVU offered a fuller picture. Goldsberry could now determine objectively the best way to defend an opponent's pick-and-rolls, or which players were especially good at stealing the ball and disrupting an attack.

A year after his first presentation, Goldsberry returned to MIT with SportVU data and a new way of looking at basketball defense. This time the room was packed, not only with fellow academics but also with executives from all over the NBA.

Protecting the rim


The most important area of the court for a defense is the area near the basket, but some players defend it more effectively than others. Using spatial data on the defender's location, Goldsberry can determine who lowers opponents' shooting percentage and who cannot stop them.






NBA players make an average of 49.7% of their shots in this area when a defender is there to meet them.

Goldsberry started from the observation that the area right around the basket must be protected above all else. This is the zone where offensive players score the most. So he looked at how well defenders could keep opponents from scoring within a five-foot radius of the basket. The average NBA defender allowed opponents to score from that range 49.7% of the time.

He identified two types of defense. In the first, defenders blocked or altered the opponent's shot, that is, they reduced “shot efficiency”. By this measure, Indiana Pacers center Roy Hibbert and Milwaukee Bucks center Larry Sanders led the way, holding opponents to 38% shooting. Meanwhile Luis Scola, then with the Houston Rockets and now with the Phoenix Suns, and David Lee of the Golden State Warriors defended poorly, allowing opponents to score 61% and 62% of the time, respectively. This was interesting but not shocking. In a sense, it was the mirror image of the offensive data he had presented a year earlier.

The second kind of defense was subtler and more surprising. As it turned out, some players reduced the frequency of their opponents' shots, not just their efficiency. Only Goldsberry's data could show this: by comparing the average share of shots taken in a zone with the same figure when a particular defender was guarding it, he could see when the number of attempts dropped. The leader by this measure was Dwight Howard, whose presence led opposing teams to take 9% fewer shots at the basket. Goldsberry called it the Dwight Effect (which, incidentally, was also the title of his talk). According to Goldsberry, when Howard protected the rim, opponents shot less from close range and settled for far more midrange shots, the least efficient type of shot in the NBA.
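The two measures, efficiency and frequency, can be separated in a short Python sketch. It assumes each shot is recorded as its distance from the basket and whether it went in, and that shots can be split into a league-wide baseline and those taken against one defender. The sample numbers are invented, and this is not the method from Goldsberry's talk, only an illustration of the distinction.

```python
def close_shot_profile(shots, radius_ft=5.0):
    """shots: list of (distance_from_basket_ft, made).
    Returns (share of all shots taken within radius_ft,
             share of those close shots that were made)."""
    close = [made for dist_ft, made in shots if dist_ft <= radius_ft]
    frequency = len(close) / len(shots)
    efficiency = sum(close) / len(close) if close else float("nan")
    return frequency, efficiency

# Hypothetical samples: a league-wide baseline, and shots taken while one
# particular defender was protecting the basket.
league = [(2, True), (3, False), (4, True), (18, False),
          (24, True), (1, True), (16, False), (3, False)]
vs_defender = [(2, False), (17, False), (19, True), (24, False),
               (3, True), (15, False), (20, False), (4, False)]

base_freq, base_eff = close_shot_profile(league)
def_freq, def_eff = close_shot_profile(vs_defender)
print(base_freq, base_eff)  # baseline: how often and how well close shots go
print(def_freq, def_eff)    # same two numbers against the defender
```

A drop in the second number is the first type of defense; a drop in the first number, with opponents pushed out to longer shots, is the second.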

One of the NBA executives who attended Goldsberry's talk was Daryl Morey. Morey is the general manager of the Houston Rockets, a team he has turned into one of the most forward-thinking organizations in the league by investing a great deal of time and energy in analytics and sports science. Morey is also a graduate of the MIT Sloan school and one of the founders of the event; he is still one of the conference chairs. Maybe it is a coincidence, maybe not, but it is worth noting that four months after Kirk Goldsberry's talk, Morey signed Dwight Howard to a long-term contract.

Every discussion of statistical analysis in sports is pulled, as if by gravity, toward the book “Moneyball”. That is partly because the book is simply terrific and its hero, Oakland Athletics general manager Billy Beane, is a great character, and partly because Michael Lewis's skill as a storyteller made the statistics easy for readers to grasp. Moneyball is the story that explained the principles of sports analytics to the general public.

Nevertheless, the statistical methods behind the Moneyball effect did not originate with Beane. From early researchers like F.C. Lane in the 1910s and Allan Roth in the 1940s to Earnshaw Cook and his seminal 1964 book “Percentage Baseball”, the game has a strong tradition of analyzing information. And starting in the mid-1970s, Bill James, a former security guard at a cannery, systematized knowledge of the game in his self-published “Bill James Baseball Abstract”.

So Beane's genius lay not so much in statistics as in action. For the first time, he built an organization that profited from long-established, well-known statistical insights. The competitive advantage came not from new theoretical principles but from putting them to use.

Today, as new technologies pile up terabytes of data about players and tactics, the next big competitive advantage will go to those with the computing power and the analysts to make sense of all the signals. Just look at the statistical tsunami that SportVU is generating in the NBA. “It is no exaggeration to say that 85% of teams do not know how to use this data,” says Goldsberry. “This will lead to radical changes in the NBA, but I’m not sure it will happen anytime soon, not until teams understand the importance of fields like machine learning and visualization.”

Who are the team executives in the 15% who do know how to use this data? They are the next Billy Beanes. This year, at the MIT Sloan conference, Goldsberry completed a three-peat of his own, presenting for the third year in a row. In essence, he broke basketball games down into individual moments and then carried out the same kind of analysis that earlier generations of sports analysts had applied to states in baseball. As a result, Goldsberry and his team could now determine the value, measured in points, of any action on the court, from an entry pass into the post to a drive off the dribble.
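The idea of valuing an action in points can be illustrated with a deliberately crude Python sketch. It assumes each possession has been reduced to a handful of coarse, hypothetical states ("perimeter", "post", "drive") plus the points scored at the end, and it values a move as the change in average points between two states. This is a toy illustration of the principle, not the model Goldsberry's team built.

```python
from collections import defaultdict

# Hypothetical possessions: the coarse states visited, and the points scored.
possessions = [
    (["perimeter", "post"], 2),
    (["perimeter", "post"], 0),
    (["perimeter"], 3),
    (["perimeter"], 0),
    (["perimeter", "drive"], 2),
    (["perimeter", "drive"], 2),
]

def expected_points(data):
    """Average points scored on possessions that passed through each state."""
    totals = defaultdict(lambda: [0, 0])  # state -> [points, possessions]
    for states, points in data:
        for state in set(states):
            totals[state][0] += points
            totals[state][1] += 1
    return {state: pts / n for state, (pts, n) in totals.items()}

def action_value(ep, before, after):
    """Value of moving from one state to another, in expected points."""
    return ep[after] - ep[before]

ep = expected_points(possessions)
print(action_value(ep, "perimeter", "drive"))  # 0.5 in this toy sample
print(action_value(ep, "perimeter", "post"))   # -0.5 in this toy sample
```

In this made-up sample a drive adds half a point of expected value and a post entry costs half a point; real tracking data would call for far finer states and a proper statistical model.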

This type of analysis opens up new ways to evaluate everything a player does. “You can see which players help the team and which hurt it,” says Goldsberry. “It’s like a new microeconomics for basketball.”

For Goldsberry, this is no longer just a hobby. He has drawn on his work in analytical articles for the sports website Grantland, and although he will not confirm it, several NBA teams are rumored to have consulted him. He still works at Harvard, where he set up a group of students who call themselves XY Hoops, after the mathematical notation for a coordinate system. “The idea isn't mine, it came from my students,” Goldsberry admits. “It's as if I'm the Foo Fighters and they're the hot new band. I'm almost old news.”

The main work of Goldsberry and his team was the paper “A Multiresolution Stochastic Process Model for Predicting Basketball Possession Outcomes”. But to the general public, it is better known under the title “Databall”.

Source: https://habr.com/ru/post/366215/

