Researchers from Microsoft are working on an ambitious new project, Strider Search Defender, whose goal is to automatically track down and neutralize search engine spammers.
The experimental project combines two earlier projects with a similar focus, Strider HoneyMonkey and Strider URL Tracer, and implements fundamentally new methods for neutralizing spammers.
Search spam is a relatively new threat that has become especially dangerous in recent years, as its volume has grown beyond all acceptable limits. For example, according to statistics from the Automattic Kismet service, about 93% of all blog comments are created by spammers. The new Microsoft project is meant to help fight this scourge. Spammers will be identified using context analysis and URL tracing.
On sites with high rankings, spammers create so-called doorway pages: pages optimized specifically for search engines and targeted at particular keywords. These doorways are promoted through fake blog comments and spam blogs. The spammers' goal is to push the doorway as high as possible in search results. A doorway either redirects the user to another site or contains pay-per-click advertising links. The spammers' scheme is shown in the diagram.
The new Strider Search Defender system tries to detect spammers before the search crawler does. It analyzes traffic on sites and automatically identifies doorway pages. The program works as follows. First, it is "fed" a list of known doorways. Then a special module, Spam Hunter, sends the corresponding queries to search engines to find forums and blogs that contain links to these doorway pages. These forums and blogs serve as bait for catching the new spam postings the program needs for analysis. The program collects the other links published on these forums and blogs and sends them for checking.
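The harvesting step described above can be sketched roughly as follows. This is only an illustration, not Microsoft's implementation: the function names, the callback-based design, and the `example` URLs are all assumptions, and the search query and page fetch are replaced by in-memory lookups so that the sketch runs without network access.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return all link targets found in the given HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


def harvest_suspicious_links(known_doorways, find_pages_linking_to, fetch_page):
    """Return URLs published next to known doorways on bait pages.

    Both callbacks are hypothetical stand-ins:
    find_pages_linking_to(url) -> iterable of forum/blog page URLs
    fetch_page(url)            -> HTML of that page as a string
    """
    known = set(known_doorways)
    suspicious = set()
    for doorway in known_doorways:
        for page_url in find_pages_linking_to(doorway):
            for link in extract_links(fetch_page(page_url)):
                # Keep only absolute http(s) links that are not already known.
                if urlparse(link).scheme in ("http", "https") and link not in known:
                    suspicious.add(link)
    return suspicious


# Made-up data standing in for a search engine and a forum page.
SEARCH_INDEX = {"http://doorway-a.example/": ["http://forum.example/thread1"]}
PAGES = {
    "http://forum.example/thread1": (
        '<a href="http://doorway-a.example/">cheap pills</a> '
        '<a href="http://doorway-b.example/">free ringtones</a> '
        '<a href="/about">about</a>'
    )
}

found = harvest_suspicious_links(
    list(SEARCH_INDEX),
    lambda url: SEARCH_INDEX.get(url, []),
    lambda url: PAGES.get(url, ""),
)
print(sorted(found))  # ['http://doorway-b.example/']
```

Passing the search and fetch steps in as callbacks keeps the sketch independent of any particular search engine API; the links it returns are only candidates, which would still need to be checked.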
The "suspicious" links that are found are passed to the Strider URL Tracer program, which emulates the behavior of an ordinary browser. It visits these links and records any redirects. After this automated scan, researchers can determine which sites are associated with a large number of doorways.
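To illustrate how recorded redirects can point to the sites behind many doorways, here is a minimal sketch of the aggregation step. It assumes, hypothetically, that each scan yields a list of the URLs visited, starting with the doorway itself; the data format, function names, and `example` hosts are invented for illustration and say nothing about how Strider URL Tracer actually stores its results.

```python
from collections import Counter
from urllib.parse import urlparse


def rank_redirect_targets(redirect_chains, top_n=25):
    """Count how many doorways end up at each target host.

    redirect_chains maps a doorway URL to the list of URLs visited,
    starting with the doorway itself (assumed format). A chain of
    length 1 means the page did not redirect anywhere.
    """
    counts = Counter()
    for doorway, chain in redirect_chains.items():
        if len(chain) < 2:
            continue  # no redirect recorded for this page
        target = urlparse(chain[-1]).hostname
        # Ignore redirects that stay on the doorway's own host.
        if target and target != urlparse(doorway).hostname:
            counts[target] += 1
    return counts.most_common(top_n)


# Made-up scan results for four pages.
chains = {
    "http://a.blog.example/": ["http://a.blog.example/", "http://ads.example/land?id=1"],
    "http://b.blog.example/": [
        "http://b.blog.example/",
        "http://hop.example/",
        "http://ads.example/land?id=2",
    ],
    "http://c.blog.example/": ["http://c.blog.example/"],
    "http://d.blog.example/": ["http://d.blog.example/", "http://other.example/"],
}

print(rank_redirect_targets(chains))
# [('ads.example', 2), ('other.example', 1)]
```

Counting by final host rather than by full URL groups together doorways that land on different pages of the same site, which is what makes the largest recipients of redirected traffic stand out.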
During testing, Spam Hunter collected more than 17,000 blog addresses on the BlogSpot platform and sent them to URL Tracer for checking. The result was a list of the 25 largest spammers on BlogSpot (large page). These are the sites that doorways redirect to most often. It is these sites, the customers who pay for the spam, that need to be blocked in search engines. It also turned out that 45% of the sites on the BlogSpot platform redirect to one of six resources: se-arch.com, speedsearcher.net, abcsearcher.com, eash.info, paysefeed.net or veryfastsearch.com.
According to the researchers, the new system has already been used to automatically remove spam content from the MSN Search index.