
GoTo Data Science Challenge 2: Summer School Grants

We are announcing a grant competition for the data analysis and machine learning track of the GoTo summer schools. We invite school students and early-year university students to participate. The assignment is the Quora competition on Kaggle, in which participants build a model that identifies duplicate questions.




Below the cut: a description of the problem, links to useful materials, and an example of a simple solution.



A model that detects questions that are essentially the same can be used on forums, in technical support, in online consultations, and so on, for example to avoid creating duplicate topics or to answer popular questions automatically. All in all, it is quite a useful task.


As a first approximation, this problem can be posed as binary classification: given a pair of questions, learn to predict whether they are duplicates or not. This puts us in the standard machine learning setting of supervised learning. Labeled pairs for training are provided by the competition organizers, so we only need to perform two steps: generate features for the pairs of questions, then select among them and train a classifier.
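The two steps can be sketched roughly as follows. This is only an illustration under stated assumptions, not the competition baseline: the toy pairs in TRAIN, the two hand-made features (word-set overlap and relative length gap), and the helper names pair_features, train_logistic and predict_proba are all hypothetical, and the classifier is a tiny logistic regression written with the standard library only.

```python
import math
import re

def tokenize(text):
    """Lowercase a question and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

def pair_features(q1, q2):
    """Step 1: turn a pair of questions into a small numeric feature vector."""
    a, b = tokenize(q1), tokenize(q2)
    sa, sb = set(a), set(b)
    union = sa | sb
    jaccard = len(sa & sb) / len(union) if union else 0.0      # share of common words
    length_gap = abs(len(a) - len(b)) / max(len(a), len(b), 1)  # relative length difference
    return [jaccard, length_gap]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=1.0, epochs=2000):
    """Step 2: fit logistic regression with plain batch gradient descent."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_w = [g + err * xj for g, xj in zip(grad_w, xi)]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(model, q1, q2):
    """Estimated probability that the two questions are duplicates."""
    w, b = model
    x = pair_features(q1, q2)
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Hypothetical labeled pairs (1 = duplicate, 0 = not a duplicate).
TRAIN = [
    ("How do I learn Python?", "How can I learn Python?", 1),
    ("What is machine learning?", "What is machine learning exactly?", 1),
    ("How do I learn Python?", "What is the capital of France?", 0),
    ("What is machine learning?", "How do I cook rice?", 0),
]

X = [pair_features(q1, q2) for q1, q2, _ in TRAIN]
y = [label for _, _, label in TRAIN]
model = train_logistic(X, y)
```

On real competition data one would read the organizers' labeled pairs instead of TRAIN, add many more features, and most likely use a library classifier; the structure of the two steps stays the same.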


One of the simplest solutions is to assume that questions are duplicates if they consist of almost the same words (the bag-of-words model). The feature description of a single question is then a vector of word occurrence frequencies.
An example of a solution with such features and logistic regression can be found here.
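To make the bag-of-words description concrete, here is a minimal sketch that builds word-count vectors for two questions over their shared vocabulary. The example questions and the helper names are hypothetical, and comparing the vectors with cosine similarity is an assumption added for illustration; it is not necessarily what the solution mentioned above does.

```python
import math
import re
from collections import Counter

def bag_of_words(question):
    """Count how many times each word occurs in a question."""
    return Counter(re.findall(r"\w+", question.lower()))

def to_vector(bag, vocabulary):
    """Lay the counts out as a vector in a fixed vocabulary order."""
    return [bag[word] for word in vocabulary]

def cosine_similarity(u, v):
    """Cosine of the angle between two count vectors (0.0 if either is empty)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Hypothetical pair of questions.
q1 = "How do I learn Python?"
q2 = "How can I learn Python?"

bag1, bag2 = bag_of_words(q1), bag_of_words(q2)
vocabulary = sorted(set(bag1) | set(bag2))
v1 = to_vector(bag1, vocabulary)
v2 = to_vector(bag2, vocabulary)
similarity = cosine_similarity(v1, v2)

print(vocabulary)
print(v1, v2, similarity)
```

A similarity close to 1 means the two questions use almost the same words. A classifier such as logistic regression can take this score, or the count vectors themselves, as input features.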


Further development of the solution is limited only by your imagination:



More examples can be found below:



To apply for a grant you need to perform the following steps:



Based on the results, several participants will receive full grants (free participation), and those who show decent results will receive partial grants. More details are in the letter that will be sent after registration.


About schools:




On June 13-26, July 1-14, and August 16-29, GoTo summer project schools will be held 100 km from Moscow for high school students and early-year university students interested in applied programming, data analysis, bioinformatics, information security, and the Internet of Things with robotics. Each participant gets the opportunity to implement a project or conduct research; the projects are supervised by lecturers from the best universities and experts from leading companies.
As part of the selection, competitions for free participation are held in each track: applied programming, hardware, data analysis, information security, and bioinformatics. Announcements of the remaining competitions will be published soon.


Questions and suggestions can be sent to school@goto.msk.ru.



Source: https://habr.com/ru/post/327206/

