A network of 350,000 bots was accidentally discovered on Twitter. The botmaster was given away by geotags

According to official data, Twitter’s active audience exceeds 313 million people. It would be more accurate to say 313 million users, because it is not known for certain how many of those users are real people. About 500 million messages are posted on Twitter every day, and it is very difficult to monitor and analyze such a huge flow of information for suspicious activity.

Twitter has attracted the attention of scientists for many years. Earlier, researchers studied the Twitter user graph and tried to build a model of individual user influence. Real-time analysis of the tweet stream is considered a very promising research area: it can be used to predict trends, public sentiment, and election results, and to quickly identify important events, outbreaks of viral infections, earthquakes, and typhoons.

A Twitter bot is a Twitter account that operates with minimal or no human participation. A botnet is a group of such accounts created and managed by one person, called the botmaster. It is a decent business: a botmaster can sell services such as spreading spam, selling followers, and manipulating public opinion. In previous years, researchers studied in detail how bots build up influence (warming up before being put to work), how they infiltrate the Twitter environment to establish connections with real users and blend into the mass of users, and how bots are used for propaganda. Twitter bots are traded on the black market, as are Facebook, YouTube, Gmail, LinkedIn, and other accounts. A thousand Twitter accounts (email-confirmed, with avatars, a customized theme, and a biography, that is, fully kitted out) cost $60 at the Russian broker buyaccs.com.

What are Twitter bots used for?


Spam. Sending large numbers of sponsored links to other users and distributing malware.

Distribution of fake "trending" news. Because Twitter’s algorithms count bots the same way as the accounts of real people, bots are also counted when “trending” topics and hashtags are calculated. This makes it possible to create fake trends: topics that were never actually popular on Twitter make it into the list of trends, and from there reach real bloggers and the media.

Manipulation of public opinion. If a bot network is not identified in time, the botmaster can post a large number of positive or negative messages on a specific topic, which distorts the results of public opinion studies that researchers, companies, and government organizations conduct on Twitter.

Astroturfing. Creating artificial public opinion by posting numerous tweets made to look like the fully independent opinions of individuals, thereby hiding the sponsor of the campaign.

Fake followers. For a fee, thousands of bots follow a client's account at the botmaster's command, making the account look more important because of its larger follower count.

Pollution of the Twitter Streaming API. There are suspicions that bot messages can be arranged so that they land in the filtered sample of the Twitter Streaming API, which many people use for data mining, with a probability of up to 82% instead of the expected 1%.

Network of over 350,000 bots on Twitter


Twitter itself and independent researchers have developed a number of advanced techniques for identifying bots in the social network, including machine learning, computing the Levenshtein distance between tweets, and so on.
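As an illustration of the Levenshtein distance between tweets mentioned above, here is a minimal sketch in plain Python. It is not the detector used by Twitter or by the researchers; the `similarity` helper and the idea of normalizing by the longer tweet's length are assumptions made for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of single-character insertions,
    deletions, and substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute or keep
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Hypothetical helper: 1.0 for identical texts, 0.0 for fully different."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest
```

Pairs of tweets from different accounts with a similarity close to 1.0 are candidates for templated, automated posting; the threshold itself would have to be tuned on real data.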

Most studies of Twitter botnets used datasets sampled either by a random walk over the user graph or from the Twitter Streaming API. Both kinds of samples are biased: the first toward users with a large number of friends and followers, the second toward more active users.
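The bias of graph-walk sampling is easy to reproduce on a toy example. The sketch below is an illustration only: the graph (19 ordinary accounts on a ring plus one hub connected to everyone) and the function names are invented for the example and have nothing to do with real Twitter data.

```python
import random
from collections import Counter


def build_graph(n_ordinary=19):
    """Toy undirected graph: node 0 is a hub linked to every ordinary node;
    ordinary nodes 1..n are also linked in a ring."""
    graph = {0: set()}
    for node in range(1, n_ordinary + 1):
        neighbor = node % n_ordinary + 1  # next node on the ring
        graph.setdefault(node, set()).add(neighbor)
        graph.setdefault(neighbor, set()).add(node)
        graph[node].add(0)
        graph[0].add(node)
    return graph


def random_walk_visits(graph, steps, seed=0):
    """Count how often each node is visited by a simple random walk."""
    rng = random.Random(seed)
    current = rng.choice(sorted(graph))
    visits = Counter()
    for _ in range(steps):
        current = rng.choice(sorted(graph[current]))
        visits[current] += 1
    return visits


graph = build_graph()
visits = random_walk_visits(graph, steps=10_000)
print("hub visits:", visits[0])
print("busiest ordinary node:", max(visits[n] for n in range(1, 20)))
```

In a uniform sample every one of the 20 accounts would appear about equally often, whereas the walk lands on the hub far more often than on any ordinary account, because a random walk visits nodes in proportion to their number of connections.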

Instead, researchers from the Department of Computer Science at University College London built a sample based on Twitter account identifiers (IDs) (pdf). For the study they took 1% of Twitter users, that is, every hundredth account. Profiles for all of them were retrieved through the API, and non-English profiles were then filtered out. This left a sample of 6 million English-language accounts.
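Sampling by account ID can be sketched roughly as follows. This is an assumption-laden illustration, not the researchers' code: drawing IDs uniformly at random is just one way to get an unbiased 1% sample, `fetch_profile` is a hypothetical callable standing in for a real API lookup, and the `lang` field name is likewise assumed.

```python
import random


def sample_user_ids(max_id, fraction, seed=0):
    """Draw a uniform random sample of IDs from 1..max_id without repeats."""
    rng = random.Random(seed)
    k = int(max_id * fraction)
    return rng.sample(range(1, max_id + 1), k)


def english_profiles(user_ids, fetch_profile):
    """Keep only existing profiles whose language is English.

    fetch_profile is a hypothetical function: it takes an ID and returns
    a dict for an existing account or None if the ID is unused.
    """
    kept = []
    for uid in user_ids:
        profile = fetch_profile(uid)
        if profile is not None and profile.get("lang") == "en":
            kept.append(profile)
    return kept
```

Because every ID has the same chance of being picked, such a sample does not favor popular or active accounts, unlike the graph-walk and Streaming API samples described earlier.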

Studying this dataset gave a very interesting result. Accounts in the sample had published 843 million tweets, of which about 20 million were geotagged. The locations of the geotags generally correlate with population density, except for two large rectangular areas, one over Europe/Africa and one over North America, that are evenly filled with a non-zero number of geotagged tweets everywhere, including seas, deserts, and permafrost regions. Within both rectangles the geotagged tweets are distributed completely uniformly, and they are split evenly: 50% were published in North America and 50% in Europe.


Color corresponds to the number of tweets. The locations of geotagged tweets generally correlate with world population density, except for two large rectangular areas over Europe/Africa and North America that are evenly filled with a non-zero number of tweets, including seas, deserts, and permafrost regions.
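The anomaly described here, a rectangle in which every patch of the map contains tweets, can be checked with a simple grid count. The sketch below is only an illustration: the function name, the grid size, and the bounding box in the example are made up, and the real coordinates of the two rectangles are not reproduced here.

```python
def occupied_fraction(points, box, rows=10, cols=10):
    """Share of grid cells inside box that contain at least one point.

    points: iterable of (lat, lon) pairs
    box:    (south, west, north, east) bounding rectangle
    """
    south, west, north, east = box
    occupied = set()
    for lat, lon in points:
        if not (south <= lat < north and west <= lon < east):
            continue  # ignore points outside the rectangle
        row = min(rows - 1, int((lat - south) / (north - south) * rows))
        col = min(cols - 1, int((lon - west) / (east - west) * cols))
        occupied.add((row, col))
    return len(occupied) / (rows * cols)


box = (0.0, 0.0, 10.0, 10.0)  # hypothetical rectangle
evenly_spread = [(r + 0.5, c + 0.5) for r in range(10) for c in range(10)]
one_city = [(5.5, 5.5)] * 100

print(occupied_fraction(evenly_spread, box))
print(occupied_fraction(one_city, box))
```

Tweets from real people cluster in cities, so over a large region that includes sea and desert most cells stay empty. A fraction close to 1.0 means the points cover the whole rectangle, which is the pattern that gave the botnet away.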

And all these tweets belong to a specific range of Twitter IDs, as shown in the graph.



This botnet was named Star Wars because the bots actively posted quotes from the movie "Star Wars".



Another characteristic feature of the botnet is that tweets are published only from Windows Phone smartphones (or the library or software used for posting identifies itself as a Windows Phone smartphone).



To further study this and other botnets, the researchers ask Twitter users to report bots they detect at thatisabot.com.

Source: https://habr.com/ru/post/400951/

