Kaggle organizes competitions for people involved in data mining. Two contests are currently open. One, with a $3 million prize, aims to improve the prediction of hospital admissions. The second, with a prize pool of $3,000, aims to improve the measurement of the ellipticity of galaxies, which will allow better measurement of the so-called "dark matter" in the universe.
About two dozen contests have already been held. Participants share their experiences on the Kaggle blog. Below is a translation of a post by Tim Salimans about his experience participating in a contest to predict the results of chess games based on past results.
The participants of the competition were given the results of more than 1.84 million games between more than 54,000 players. The task was to predict the results of 100,000 games between the same players over the following three months. 189 teams took part in the competition.
The rest of the post is written in the first person.
My name is Tim Salimans, and I am a graduate student in econometrics at Erasmus University Rotterdam. In my work I deal with data, models, and algorithms every day, and Kaggle contests have proven to be an interesting way to use these skills in a social and competitive environment. The competition run by Kaggle, Deloitte, and FIDE on predicting the results of chess games was the first I took part in, and I was very fortunate to win first place. At the same time, I used the Kaggle-in-Class platform to run a competition for an econometrics course I was a teaching assistant for. Both competitions were very interesting. In this post I will not go into the technical details of the chess competition; if you are interested in them, including my code, see my web page.
Chess rating
Before you start solving a new problem, it is a good idea to see what other people have done before you. Since Kaggle had already held a competition on improving chess rating systems, it made sense to read the winners' blog posts. After reading those posts and various academic literature, I realized that chess rating systems assume that each player's strength can be described by a single number. The predicted result of a match between two players is then a function of the difference between their ratings. Yannis Sismanis, the winner of the first competition, used the logistic curve for this purpose and estimated the ratings by minimizing a regularized version of the model (for details of his approach, see his article posted on arXiv). Jeremy Howard, the runner-up, instead used the TrueSkill model, which uses the normal distribution function and estimates the ratings via Bayesian inference.
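To make the idea of a rating-difference curve concrete, here is a minimal sketch of an Elo-style logistic expected score. The 400-point scale and base-10 exponent are the conventional Elo choices, used here purely for illustration; they are not the exact parameterization either winner used.

```python
def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected score of player A against player B under a logistic curve.

    The 400-point `scale` is the conventional Elo choice, used here only
    for illustration; the competition models parameterized this differently.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


# A 100-point rating advantage translates into roughly a 64% expected score.
print(round(expected_score(2500.0, 2400.0), 2))  # 0.64
```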
I decided to start with TrueSkill and extend it by tying each player's rating to the ratings of his recent opponents, much as Yannis Sismanis did in the first competition. In addition, I introduced weights into the algorithm, which allowed more recent matches to count more heavily toward the rating. After several experiments with the excellent Infer.NET package from Microsoft, I wrote all the code in Matlab.
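The post does not say how the recency weights were chosen, so the following is only a sketch of one plausible scheme: an exponential decay with a hypothetical 12-month half-life, written in Python rather than the Matlab used for the actual solution.

```python
import numpy as np


def recency_weights(age_in_months: np.ndarray, half_life: float = 12.0) -> np.ndarray:
    """Exponentially down-weight older matches.

    `half_life` (in months) is a hypothetical value; the original solution
    only states that more recent matches counted more toward the rating.
    """
    return 0.5 ** (age_in_months / half_life)


# Matches played 0, 6 and 24 months ago receive weights 1.0, ~0.71 and 0.25.
print(recency_weights(np.array([0.0, 6.0, 24.0])).round(2))
```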
Using the match schedule
The predictions of my main model placed me high on the competition leaderboard, but not high enough to take first place. At this point I realized that the match schedule itself also contains useful information for predicting results; some other participants noticed this too. In chess, most tournaments are played under the Swiss system, in which players in each round are paired with other players who have similar results in the previous rounds. Under the Swiss system, if player A has faced stronger opponents than player B, it probably means that player A has won more matches in that tournament than player B.
To use this information from the schedule, I generated predictions for the last 1.5 years of data, using a 3-month window. I then took two post-processing steps using these predictions and the actual match results. The first step used standard logistic regression, and the second step used a locally weighted variant of logistic regression (a rough sketch follows the list below). The most important variables in the post-processing were:
- predictions of the base model
- each player's rating
- the number of matches each player had played
- the rating of each player's opponents
- the variance in the quality of the opponents faced
- the average predicted winning percentage across all of a player's matches in a given month
- random forest predictions based on these variables
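As a rough illustration of the post-processing described above, the sketch below (Python with scikit-learn, using synthetic stand-ins for the variables listed; the actual solution was written in Matlab and also included a locally weighted logistic regression step that is omitted here) feeds random forest predictions back in as an extra feature and blends everything with a logistic regression.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 1000

# Synthetic stand-ins for the variables listed above (hypothetical values).
X = np.column_stack([
    rng.uniform(0.0, 1.0, n),       # base model prediction
    rng.normal(2200.0, 200.0, n),   # player rating
    rng.integers(5, 200, n),        # number of matches the player has played
    rng.normal(2200.0, 200.0, n),   # average rating of the player's opponents
    rng.uniform(0.0, 1.0, n),       # spread in the quality of opponents faced
])
y = (rng.uniform(0.0, 1.0, n) < X[:, 0]).astype(int)  # synthetic match outcomes

# Stage 1: random forest predictions become an additional input feature.
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
X_aug = np.column_stack([X, rf.predict_proba(X)[:, 1]])

# Stage 2: a (standard, not locally weighted) logistic regression blends
# the original features and the forest's predictions into a final probability.
blender = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
blender.fit(X_aug, y)
final_probs = blender.predict_proba(X_aug)[:, 1]
print(final_probs[:5].round(3))
```

In a real setup the first-stage predictions would be produced out-of-sample, for example with the rolling 3-month window described above, so that the outcomes do not leak into the blend.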
This post-processing significantly improved my position in the competition and put me well ahead of the other participants. Later, other participants made similar improvements, and the final weeks of the competition were very exciting. After a long weekend, I went to the leaderboard page and found that the PlanetThanet team had overtaken me. After tweaking my approach slightly, I managed to regain first place. Then I had to travel to a conference in the USA. Upon arrival, I found out that I had been overtaken again, this time by Shang Tsung. Only by submitting my final prediction from a hotel room in St. Louis was I finally able to secure first place.
Conclusion
The greatest contribution to the victory came from using the match schedule. Although interesting in itself, this is not ideal for the original purpose of the competition: improving the rating system for chess players. To my relief, the data set that Jeff Sonas released later showed that my model makes good predictions even without using this information. In conclusion, I would like to thank both the organizers and the participants for an excellent competition.