
ImageNet 2013 Image Recognition Competition

In December 2013, the annual ImageNet Large Scale Visual Recognition Challenge 2013 (ILSVRC2013) concluded. The competition is organized by the ImageNet project, a very large image database that currently holds more than 14 million images.
Competitors worked on three tasks, described below.
Task 1. Detection of objects from 200 different categories in images of real scenes. The training set consisted of images in which, for every depicted object belonging to one of the 200 categories, the class and the bounding box are known (an example of such an image is shown in the figure below).


The training set for this task consisted of 395,909 images, and the test set of 40,152 images. Class labels and bounding boxes for the test objects are withheld and used only at the evaluation stage.
Detection quality was assessed by counting objects that were both correctly recognized and correctly localized: the overlap between the known bounding box and the box proposed by the algorithm for that object had to exceed 50%. An algorithm was penalized if an object was not detected in the image or was detected more than once.
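The overlap criterion and the penalty for missed or duplicate detections can be sketched as follows. This is a minimal illustration, not the official evaluation code: it assumes that overlap is measured as intersection over union (IoU), that boxes are given as (x_min, y_min, x_max, y_max), and that predictions arrive sorted by decreasing confidence. The names `iou` and `match_detections` are hypothetical.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection area divided by union area of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_detections(
    predicted: List[Box], truth: List[Box], threshold: float = 0.5
) -> Tuple[int, int, int]:
    """Return (true positives, false positives, missed objects) for one class in one image.

    Each ground-truth box can be matched at most once, so a second
    detection of the same object counts as a false positive.
    """
    matched = [False] * len(truth)
    true_pos = false_pos = 0
    for box in predicted:
        best_index, best_overlap = -1, threshold
        for j, truth_box in enumerate(truth):
            overlap = iou(box, truth_box)
            if not matched[j] and overlap > best_overlap:
                best_index, best_overlap = j, overlap
        if best_index >= 0:
            matched[best_index] = True
            true_pos += 1
        else:
            false_pos += 1
    return true_pos, false_pos, matched.count(False)
```

For example, with one ground-truth box and three predictions, of which two cover the same object, only the first prediction counts as correct and the duplicate is penalized along with the stray detection.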
The winner was the team whose algorithm achieved the best accuracy in the largest number of categories. First place went to UvA-Euvision (a joint team of the University of Amsterdam and the company Euvision), which won in 130 of the 200 categories. Second place went to NEC-MU (NEC together with the University of Missouri) with 25 categories. A presentation of this team's results is available here.

Task 2. Classification of objects into 1000 categories. The training set consisted of 1.2 million images, and the test set of 150 thousand images. For each test image, the recognition algorithm had to output 5 class labels in descending order of confidence. The error calculation checked whether any of these labels matched the label of the object actually present in the image, which is known for each image. Allowing 5 labels is meant to avoid "punishing" the algorithm when it recognizes objects of other classes that are also present in the image but are not annotated (see the figure below for an example).
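The five-label scoring rule can be illustrated with a short sketch. It assumes the usual top-5 convention, in which an image counts as correctly classified if its true label appears anywhere among the five submitted labels; the function name `top_k_error` and the integer label encoding are hypothetical.

```python
from typing import Sequence


def top_k_error(
    predictions: Sequence[Sequence[int]], truth: Sequence[int], k: int = 5
) -> float:
    """Fraction of images whose true label is not among the first k predicted labels.

    predictions[i] holds the labels for image i, ordered by decreasing confidence.
    """
    if len(predictions) != len(truth):
        raise ValueError("predictions and truth must have the same length")
    if not truth:
        raise ValueError("at least one image is required")
    misses = sum(
        1 for labels, true_label in zip(predictions, truth)
        if true_label not in labels[:k]
    )
    return misses / len(truth)
```

Calling the same function with `k=1` gives the stricter error in which only the most confident label counts, which shows how much the five-label allowance relaxes the score.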

The winner of this task was Matthew Zeiler, a student of Geoffrey Hinton. Second place went to the NUS team from the National University of Singapore, and third to the ZF team, consisting of Matthew Zeiler and Rob Fergus.
Matthew Zeiler set out to understand what exactly affects recognition quality when training convolutional neural networks (CNNs). He developed the concept and technique of Deconvolutional Neural Networks to visualize and analyze how convolutional networks operate. Using it, he analyzed the neural network that won the previous year's competition, ILSVRC2012. From this analysis he derived recommendations for designing the network architecture, and the resulting network won the 1000-category classification task. For those interested: a video presentation by Matthew Zeiler and the website of Matthew Zeiler.

Task 3. Classification and localization of objects of the same 1000 classes. For each image, in addition to 5 class labels, the algorithm had to output a bounding box for each proposed label. Only two teams took part in this task: OverFeat - NYU, which took first place (one of its members was the legendary Yann LeCun), and VGG (Visual Geometry Group, University of Oxford), which took second place. Note that OverFeat - NYU placed only 4th in task 2, and its results in task 1 were not counted because it used additional external images when training its classifier. OverFeat - NYU also used a convolutional neural network as its classifier. The team's presentation is available here.
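A per-image check for this task might look like the sketch below. It rests on assumptions the text does not spell out: each image has a single annotated object, a guess counts only if its label matches and its box overlaps the annotated box by more than 50% measured as intersection over union, and one correct guess among the five is enough. The names `iou` and `localized_correctly` are hypothetical.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]
Guess = Tuple[int, Box]  # (class label, bounding box)


def iou(a: Box, b: Box) -> float:
    """Intersection area divided by union area of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def localized_correctly(
    guesses: List[Guess], true_label: int, true_box: Box, threshold: float = 0.5
) -> bool:
    """True if any of the first five guesses has the right label and enough box overlap."""
    return any(
        label == true_label and iou(box, true_box) > threshold
        for label, box in guesses[:5]
    )
```

Under these assumptions a correct label with a badly placed box is still an error, which is what separates this task from plain classification.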

Source: https://habr.com/ru/post/206342/

