This summer I was fortunate enough to take part in the Vision and Sports Summer School 2015 (VS3 2015) in Prague. In this article I want to share my impressions and perhaps motivate someone to apply to the school.
First, a few words about myself. I have been a doctoral student at the Department of Computer Graphics and Multimedia of the Technical University in Brno for a year now. My research is connected with computer vision and service robotics. Specifically, I study algorithms for recognizing objects in 3D scenes using so-called “point clouds” obtained from cameras such as the Microsoft Kinect. For me, the school was a good opportunity to deepen my knowledge of the fundamentals of computer vision and its applications in various areas of life, as I am just starting out in this area of IT. I learned about the summer school this spring from an announcement by one of its organizers, who gave a lecture at our faculty. The organizers also send invitations to universities (at least they sent one to ours).
The participants came from different countries: Great Britain, France, Ukraine, Slovenia, and the Czech Republic. Most were graduate students in specialized fields, although there were undergraduates as well. There were also participants from Russia.
About the Summer School
For the fifth year now, VS3 has been organized by the Center for Machine Perception, which is part of the Department of Cybernetics of the Faculty of Electrical Engineering at the Czech Technical University in Prague. The organizers are Ondrej Chum and Jiri Matas (both from the Czech Technical University) and Vittorio Ferrari (University of Edinburgh, United Kingdom).

In the photo: Vittorio Ferrari, in a black T-shirt, is on the left; Ondrej Chum is in the center.
The school ran for a week (six days, to be precise), from August 17 to August 22. Lectures were held in the building of the Faculty of Civil Engineering. All the main venues of the event can be viewed here.

The goal of the school is to introduce people who work with computer vision in one way or another to current achievements and open problems in the field. The school is also a pleasant opportunity to meet well-known experts and practitioners and to compete with them in sports. Information about the school can be found on the official page.
Computer vision and machine learning specialists from renowned universities around the world were invited as lecturers:
- Jiri Matas, Czech Technical University (Prague, Czech Republic);
- Krystian Mikolajczyk, University of Surrey (Guildford, UK);
- Vittorio Ferrari, School of Informatics, University of Edinburgh (UK);
- Raquel Urtasun, University of Toronto (Toronto, Canada);
- Christoph Lampert, Institute of Science and Technology Austria (IST Austria);
- Carsten Rother, TU Dresden (Dresden, Germany);
- Daniel Cremers, TU München (Munich, Germany);
- Ondrej Chum, Czech Technical University (Prague, Czech Republic).
The school follows a format of daily lectures and a few practical classes (in the morning) and sports activities (in the afternoon). On Saturday there was a workshop at which each lecturer presented current achievements and trends in their field (benchmarks). Every day of the summer school was thus packed from 9 am to 6 pm; only on Saturday did everything end at 2 pm. The full school program for this year is available here.
In addition to lectures and sporting events, a barbecue was organized on Wednesday outside the campus.
Registration and participation
To register for the school, you fill out the form in the Registration section of the website and, after receiving confirmation of participation from the organizers, pay the registration fee. As a PhD student I paid 10,175 Czech crowns (CZK). The registration and payment deadlines are listed on the official school page. The registration form for the current year can be found here, along with the amount of the registration fee.
You can ask your department to support your participation by reimbursing the registration fee. Several colleagues from my faculty and I took advantage of this opportunity.
The summer school offers several hotel accommodation options for the duration of the event. You can see this year's options here, or choose your own; fortunately, prices for hotels and hostels in this part of Prague are reasonable.
How the training went
The first day began with registration at 8 o'clock. Each participant received a confirmation of participation, the school program for each day, lunch vouchers for the canteen of the Czech Technical University (ČVUT), and a personal branded water bottle (for the sports events).


The lectures began with opening remarks from the organizers, Ondrej Chum and Vittorio Ferrari. After that, a series of lectures on various computer vision topics ran throughout the week.
Krystian Mikolajczyk talked about extracting and matching local features for various computer vision tasks, from object recognition and panorama stitching to SLAM-based localization in robotics. Special attention was paid to detector invariance under various types of transformation, in particular affine transformations and scaling.


Ondrej Chum talked about searching for similar images based on a bag-of-words representation of image features, in particular about using nearest-neighbor search and k-means clustering in the feature space. At the end of the lecture he showed an interesting project, developed by his colleagues at the Czech Technical University (ČVUT), that searches for images similar to a given image fragment. The application lets the user draw a frame around a detail of interest in an image (for example, a sculpture on the facade of a cathedral) and returns all relevant images that may contain this detail: from the same perspective, from different vantage points, at different scales, and even in more detail. The developers also managed to perform 3D reconstruction of architectural objects from a collection of images for a better and “smarter” similarity search.


In the lecture “MRF / CRF for Computer Vision”, Carsten Rother talked about random fields (Markov random fields and conditional random fields) and their application to interactive image segmentation, denoising, and stereo matching.


Christoph Lampert talked about structured prediction models and covered standard regression, probabilistic graphical models such as factor graphs, probabilistic inference, and structured SVMs.
Vittorio Ferrari talked about weakly supervised learning (WSL) techniques for training visual models for semantic segmentation and about using convolutional neural networks in WSL, and compared weakly supervised learning with fully supervised learning.
Daniel Cremers, in the lecture “Variational Methods & Geometric Reconstruction”, described how variational methods can be used to solve some computer vision problems as optimization problems, using object segmentation in an image, 3D reconstruction, and map building with SLAM techniques as examples.


Jiri Matas talked about visual tracking in video: various methods for detecting the object to be tracked, the tracking itself, and learning techniques used during tracking.
Raquel Urtasun talked about the fundamentals of deep structured learning, described convolutional neural networks and their application to classification, object localization, and semantic segmentation, and also discussed combining graphical models (CRF, MRF) with CNNs.
The program of the Saturday workshop is available at the link. The talks I remember most are those by Krystian Mikolajczyk and Raquel Urtasun. Krystian Mikolajczyk talked about automatic annotation of tennis matches by tracking the trajectory of the ball and recognizing the players' actions. Raquel Urtasun talked about the latest projects in autonomous driving: vehicle localization, path planning, and 3D reconstruction of city streets from stereo cameras. Here are some photos from the workshop lectures.





As for hands-on work, there were two practical sessions.
The first session was devoted to the topic of Carsten Rother's lecture, MRF / CRF for Computer Vision. It was held in a computer lab, on machines running Windows Server 2012, and the work was done in MATLAB. The tasks were as follows. Task 1 was devoted to interactive image segmentation. We had to study the logic of an image segmentation script and explore how the algorithm's parameters affect the result. The script took an image annotated with blue and red brush strokes marking the background and the object, respectively (the algorithm used the colors of the pixels under the strokes). Task 1 also required modifying the script to get the best possible segmentation. Task 2 was about noise reduction and required finding the parameter values that gave the best result. The session not only gave a basic understanding of how the theory of random fields applies to practical problems, but also provided some experience with MATLAB.
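To make the denoising task a little more concrete, here is a minimal Python sketch of denoising a binary image with a pairwise Markov random field. It is not the MATLAB script from the session. The optimizer used in class is not described here, so this sketch assumes iterated conditional modes (ICM), a simple local method chosen only for brevity. The names `icm_denoise`, `smoothness`, and `sweeps` are hypothetical.

```python
def icm_denoise(noisy, smoothness=1.0, sweeps=5):
    """Denoise a binary image (lists of 0/1) with a pairwise MRF using ICM.

    Assumed energy: a data term of 1 for every pixel whose label differs
    from the noisy observation, plus `smoothness` for every pair of
    4-connected neighbours that carry different labels.
    """
    height, width = len(noisy), len(noisy[0])
    labels = [row[:] for row in noisy]  # start from the noisy image
    for _ in range(sweeps):
        for y in range(height):
            for x in range(width):
                best_label, best_cost = labels[y][x], float("inf")
                for candidate in (0, 1):
                    # Data term: penalty for disagreeing with the observation.
                    cost = float(candidate != noisy[y][x])
                    # Smoothness term: penalty for disagreeing with neighbours.
                    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < height and 0 <= nx < width:
                            cost += smoothness * (candidate != labels[ny][nx])
                    # Strict comparison: ties are resolved in favour of label 0.
                    if cost < best_cost:
                        best_label, best_cost = candidate, cost
                labels[y][x] = best_label
    return labels


# Toy example: a single "salt" pixel on a black background.
noisy = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
denoised = icm_denoise(noisy)
```

The `smoothness` parameter plays the role of the kind of parameter we had to tune in Task 2: a larger value removes more isolated pixels but also erases thin structures.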




The second practical session was devoted to feature extraction and using features to search for similar images. The class was led by Ondrej Chum. There were two tasks. The first concerned bag of words and inverted files, which had been presented in Monday's lecture. We had to implement, on our own, a script that searches for images in a given database represented as a matrix: each row is the bag-of-words representation of one document (the values of its features), and each column corresponds to one visual word (feature).

The assignment came with instructions on which steps to perform in sequence and which MATLAB functions and data types to use. Participants were also given the lecture slides for reference. First, we had to build the database matrix from the existing data structure and the weights of all words. An interesting point was computing the idf parameter, the weight of each visual word, using the formula:
idf(X) = log(# documents / # documents containing X)
This required counting the number of documents that contain the word X.
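The formula above can be illustrated with a few lines of Python (the session itself used MATLAB). This is only a sketch: the function name `idf_weights` and the toy matrix are my own assumptions for illustration, and giving weight 0 to a word that occurs in no document is a choice made here, not something from the assignment.

```python
import math


def idf_weights(bow_rows):
    """Return idf(X) = log(N / n_X) for every visual word X (one per column).

    `bow_rows` is a list of rows; each row holds the word counts of one document.
    """
    n_docs = len(bow_rows)
    n_words = len(bow_rows[0])
    weights = []
    for word in range(n_words):
        # Count the documents in which this word occurs at least once.
        containing = sum(1 for row in bow_rows if row[word] > 0)
        # Assumption of this sketch: a word seen in no document gets weight 0.
        weights.append(math.log(n_docs / containing) if containing else 0.0)
    return weights


# Toy database: 4 documents (rows) x 3 visual words (columns).
database = [
    [2, 0, 1],
    [1, 0, 0],
    [3, 1, 0],
    [1, 0, 0],
]
weights = idf_weights(database)
```

In this toy database the first word occurs in every document, so its weight is log(4 / 4) = 0, while the two rare words each get log(4 / 1).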
Once the database matrix was built, we had to run a search query for images similar to a given fragment. The result of the query was as follows:

Thus, the tasks did not just involve running existing algorithms and studying their logic; they also required some algorithmic skill to work out a solution.
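For readers who want to try the search step themselves, here is a small Python sketch of a query against an inverted file. It is not the MATLAB solution from the session: the names `build_inverted_file` and `search` are hypothetical, and the score is assumed to be a plain tf-idf weighted dot product without vector normalization, which may differ from what was used in class.

```python
import math
from collections import defaultdict


def build_inverted_file(bow_rows):
    """Map each visual word to the list of (document id, count) pairs containing it."""
    inverted = defaultdict(list)
    for doc_id, row in enumerate(bow_rows):
        for word, count in enumerate(row):
            if count > 0:
                inverted[word].append((doc_id, count))
    return inverted


def search(query_bow, inverted, n_docs):
    """Rank documents by a tf-idf weighted dot product with the query."""
    scores = defaultdict(float)
    for word, q_count in enumerate(query_bow):
        postings = inverted.get(word, [])
        if q_count == 0 or not postings:
            continue  # only words present in the query and the database matter
        idf = math.log(n_docs / len(postings))
        for doc_id, d_count in postings:
            scores[doc_id] += (q_count * idf) * (d_count * idf)
    # Highest score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy database: 4 documents (rows) x 3 visual words (columns).
database = [
    [2, 0, 1],
    [1, 0, 0],
    [3, 1, 0],
    [1, 0, 0],
]
inverted = build_inverted_file(database)
ranking = search([0, 2, 1], inverted, n_docs=len(database))
```

The point of the inverted file is that a query only touches the documents that share at least one visual word with it, instead of scanning every row of the database matrix.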
Sports events
During the last month before the school, the organizers sent each participant a message asking them to choose the sports they were interested in. When drawing up the schedule, the organizers divided the participants into groups so that several sports groups were active each day. My plan was: Monday - badminton, Tuesday - archery, Wednesday - table tennis, Thursday - volleyball, Friday - soccer (football).
This is what the gym for football, badminton, and volleyball looks like.

Conclusion
The school is over, but the many emotions and memories will stay with me for a long time. What would I like to say in conclusion? As the lecture program showed, machine learning is becoming a prominent trend in computer vision: from graphical models (CRF and MRF) to deep learning, with the rapidly growing popularity of convolutional networks. I was particularly pleased by the growing number of developments related to stereo vision, such as 3D reconstruction and visual navigation for autonomous cars. In my opinion, there was not enough practical training. Nevertheless, the practical sessions introduced us to MATLAB, a fairly rich programming language with powerful and practical features such as the construction of sparse matrices. I also learned about several good books on computer vision that interested me and that I recommend to readers:
- David A. Forsyth, Jean Ponce - Computer Vision: A Modern Approach;
- Kenichi Kanatani - Understanding Geometric Algebra: Hamilton, Grassmann, and Clifford for Computer Vision and Graphics;
- Richard Szeliski - Computer Vision: Algorithms and Applications.
Every participant will find something useful in this school, and I am sure that taking part is not a waste of time. Information about each year's school becomes available in April. As soon as information about the Vision and Sports Summer School 2016 appears, I will write a short announcement. Thank you for your attention, and good luck to everyone who decides to apply to future VS3 schools!
P.S. There may be some inaccuracies in my condensed descriptions of the lectures, since I am not strong in machine learning techniques and could have misunderstood something in the lecture notes.