Today we have prepared a brief news note about new projects by scientists and programmers at ITMO University. We will focus on social media mining and on the task of determining the geographic preferences of users of popular social networks.
Objective: to identify locations (museums, restaurants, cafes, attractions and recreation spots) that are mainly of interest to local residents. The results can then be used to expand the list of the city's most popular and attractive places and to diversify tourist guides.
Instagram was chosen as the data source for this social media mining task. The project team, made up of employees of the Institute of High-Tech Computer Technologies (NII NKT), explains its choice by the network's fairly active user base and by transparent behavioral patterns that make it possible to refine the results of the analysis.
One of the first steps in the data analysis was to build a tourist profile and screen out the users who match it. Among the main signs of "tourist" behavior, the following were singled out: a concentration of Instagram posts in the central part of the city (in St. Petersburg, for example, tourists mainly publish photographs taken along Nevsky Prospekt) and a limited time window of presence in the city (according to official tourism statistics, guests usually stay no longer than a couple of weeks).
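The two signs described above can be turned into a simple filter. The following is only a minimal sketch, not the team's actual method: the function name, the input format and both thresholds (14 days, 70% of posts in the center) are assumptions made for illustration, as is the decision to require both signs at once.

```python
from datetime import date

# Hypothetical thresholds: the article only mentions "a couple of weeks"
# and posts coming "from the central part of the city".
MAX_STAY_DAYS = 14
MIN_CENTRAL_SHARE = 0.7


def is_tourist(posts, max_stay_days=MAX_STAY_DAYS,
               min_central_share=MIN_CENTRAL_SHARE):
    """Guess whether a user looks like a tourist in one city.

    posts: list of (post_date, in_center) tuples, where in_center is True
    if the post was geotagged in the central part of the city.
    """
    if not posts:
        return False
    days = [post_date for post_date, _ in posts]
    # Length of the window between the first and the last post, inclusive.
    stay = (max(days) - min(days)).days + 1
    central_share = sum(1 for _, in_center in posts if in_center) / len(posts)
    # Assumption: a user must show both signs to be treated as a tourist.
    return stay <= max_stay_days and central_share >= min_central_share


visitor = [(date(2016, 7, 1), True), (date(2016, 7, 2), True),
           (date(2016, 7, 4), True), (date(2016, 7, 5), False)]
resident = [(date(2016, 1, 10), True), (date(2016, 5, 3), False),
            (date(2016, 9, 21), False)]
```

Here `is_tourist(visitor)` is true (a five-day window with most posts in the center), while `is_tourist(resident)` is false because the posts span many months.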
The task was to find places that tourists know practically nothing about. To obtain such “insider” information, it was therefore decided to discard the best-known tourist locations. The popularity and attendance of places such as the Kazan Cathedral, the Hermitage and Pulkovo Airport are beyond doubt, so these and other places in demand among tourists were deliberately excluded from the study.
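Once tourists and well-known landmarks are set aside, the remaining places can be ranked by how many distinct local users posted from them. The article does not say how the team ranked places, so the sketch below is an assumption: `local_favorites`, its input format and the sample data are all hypothetical.

```python
from collections import Counter


def local_favorites(checkins, tourists, excluded_places, top_n=3):
    """Rank places by the number of distinct non-tourist users.

    checkins: iterable of (user_id, place) pairs
    tourists: set of user ids already classified as tourists
    excluded_places: set of well-known tourist locations to discard
    """
    visitors = {}
    for user, place in checkins:
        if user in tourists or place in excluded_places:
            continue
        visitors.setdefault(place, set()).add(user)
    counts = Counter({place: len(users) for place, users in visitors.items()})
    return counts.most_common(top_n)


checkins = [("a", "Hermitage"), ("a", "Cafe X"), ("b", "Cafe X"),
            ("t", "Cafe X"), ("b", "Bar Y")]
ranking = local_favorites(checkins, tourists={"t"},
                          excluded_places={"Hermitage"})
```

In this toy data, "Cafe X" comes first with two local visitors and "Bar Y" second with one; the Hermitage and the posts of tourist "t" are ignored.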
It is worth noting that the results of the analysis (a Yandex map of popular places in St. Petersburg by category) were presented at a specialized conference and published as a research paper in the journal Procedia Computer Science.
Predicting users' geographic preferences using Twitter, Instagram and Foursquare
Objective: to recommend locations to users through cross-analysis of information from three social networks at once.
To accomplish the task, the group of scientists chose a supervised learning model. It had to take into account not only geotags reflecting the specific places recommended by Foursquare users, but also text data (Twitter) and visual preferences derived from Instagram.
In the course of the work, the team made it possible to refine the recommendations using behavioral information from the most similar users. Users were profiled by clustering on a multi-layer graph that included data from all three social networks.
In simple terms, such a system can recommend the most suitable sports facilities to a user who is interested in sports and publishes relevant tweets or Instagram posts.
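The idea of borrowing recommendations from the most similar users can be illustrated with a much simpler stand-in than the multi-layer graph clustering used in the study. The sketch below uses plain cosine similarity over a combined interest profile; the feature names (prefixed `tw:` and `ig:` for the source network), the `recommend` function and the sample users are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(target, profiles, visits, k=2, top_n=3):
    """Suggest venues visited by the k users most similar to target.

    profiles: user -> {feature: weight}, features drawn from several networks
    visits: user -> set of venue names (for example, Foursquare check-ins)
    """
    neighbours = sorted(
        (u for u in profiles if u != target),
        key=lambda u: cosine(profiles[target], profiles[u]),
        reverse=True,
    )[:k]
    scores = {}
    for u in neighbours:
        sim = cosine(profiles[target], profiles[u])
        # Only venues the target user has not visited yet are candidates.
        for venue in visits.get(u, set()) - visits.get(target, set()):
            scores[venue] = scores.get(venue, 0.0) + sim
    return sorted(scores, key=lambda v: (-scores[v], v))[:top_n]


profiles = {
    "ann": {"tw:sports": 1.0, "ig:stadium": 1.0},
    "bob": {"tw:sports": 1.0, "ig:stadium": 0.5},
    "eve": {"tw:art": 1.0},
}
visits = {"ann": {"Gym A"}, "bob": {"Gym A", "Stadium B"},
          "eve": {"Gallery C"}}
suggestions = recommend("ann", profiles, visits, k=1)
```

With `k=1`, the only neighbour of "ann" is the sports-minded "bob", so the suggestion is "Stadium B", the one venue he visited that she has not.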
The work was carried out by ITMO University scientists together with colleagues from Singapore. A dataset covering residents of New York, Singapore and London was compiled for this purpose; the results of the study were presented at the International ACM Conference and are described in the article “Cross-Domain Recommendation”.