Yandex has long offered a free service for detecting spam in messages, called “Yandex.Clean Web”, yet it remains little known.
In this post, I will demonstrate the basic methods of working with the Yandex.Clean Web API using the example of a simple PHP class.
The service supports four methods: spam detection, getting a CAPTCHA, checking an entered CAPTCHA, and appealing a spam detector decision. We will look at the first three.
For convenience, we arrange all this in the form of a simple static class.
    class YandexCW
    {
        // Replace with your own Clean Web API key.
        public static $api_key = '12345';

        const check_data_url    = 'http://cleanweb-api.yandex.ru/1.0/check-spam';
        const get_captcha_url   = 'http://cleanweb-api.yandex.ru/1.0/get-captcha';
        const check_captcha_url = 'http://cleanweb-api.yandex.ru/1.0/check-captcha';

        // The methods described below go here.
    }
Now for the class methods. The Clean Web API accepts GET and POST requests, depending on the method being called, and returns the result as XML. So we start with a simple private method for sending requests and reading responses. We will use SimpleXML to read the responses, but we will not need cURL: fortunately, the standard file_get_contents function can make both GET and POST requests using stream contexts.
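    // Sends a request to the API and returns the parsed XML, or false on failure.
    // Declared static because the other methods call it as self::xml_query().
    private static function xml_query($url, $parameters = array(), $post = false)
    {
        // Add the API key unless the caller supplied one.
        if (!isset($parameters['key'])) {
            $parameters['key'] = self::$api_key;
        }
        $parameters_query = http_build_query($parameters);

        if ($post) {
            // POST: pass the encoded parameters through a stream context.
            $http_options = array(
                'http' => array(
                    'method'  => 'POST',
                    'header'  => 'Content-Type: application/x-www-form-urlencoded',
                    'content' => $parameters_query
                )
            );
            $context = stream_context_create($http_options);
            $contents = file_get_contents($url, false, $context);
        } else {
            // GET: append the parameters to the URL.
            $contents = file_get_contents($url . '?' . $parameters_query);
        }

        if (!$contents) {
            return false;
        }
        return new SimpleXMLElement($contents);
    }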
This method simplifies our work with the API considerably: it inserts the key automatically, builds the context for file_get_contents when a POST request is needed, and returns the response as a SimpleXML object. The code should not need further commentary, so let's move on to the methods that work with the API itself.
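To make the return value more concrete, here is a small offline sketch of how a SimpleXML object exposes elements and attributes. The XML is a made-up sample, shaped only to match the fields the class reads later in this post (id, text with a spam-flag attribute, links); the link element and its attributes in particular are assumptions, so check the real response layout against the Yandex documentation.

```php
<?php
// Hypothetical sample response; the element names mirror the fields used
// later in this post, not a verified copy of the API output.
$sample_xml = <<<XML
<check-spam-result>
  <id>abc123</id>
  <text spam-flag="yes" />
  <links>
    <link href="http://example.com/" spam-flag="yes" />
  </links>
</check-spam-result>
XML;

$xml = new SimpleXMLElement($sample_xml);

// Child elements are read as properties, attributes as array keys.
// Both come back as SimpleXMLElement objects, so cast to get plain strings.
$request_id = (string) $xml->id;
$spam_flag  = (string) $xml->text['spam-flag'];

// Iterating over a child name walks all elements with that name.
$spam_links = array();
foreach ($xml->links->link as $link) {
    if ((string) $link['spam-flag'] === 'yes') {
        $spam_links[] = (string) $link['href'];
    }
}
```

The string casts matter whenever a value is compared strictly or passed on to another function.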
Checking messages for spam
First we will implement a method that sends the message content to Yandex and checks it for spam. Before showing the code, though, one point needs clarifying. According to the description of the check-spam method, it accepts the following parameters describing the message:
- ip - The IP address of the sender.
- email - The email address of the sender.
- name - The sender's name as displayed in message signatures.
- login - The user's account name on the site.
- realname - The user's real name, taken, for example, from registration data.
- subject-plain - The subject of the post in text/plain format.
- subject-html - The subject of the post in text/html format.
- subject-bbcode - The subject of the post in BBCode format.
- body-plain - The content (body) of the comment or post in text/plain format.
- body-html - The content (body) of the comment or post in text/html format.
- body-bbcode - The content (body) of the comment or post in BBCode format.
The set of data sent for checking can be arbitrary, except that within the body and subject parameter families only one type can be specified: either plain, html, or bbcode. There are no required parameters either. So we will pass all this data to our method not as a sequence of arguments but as a single array with an arbitrary set of keys.
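    // Checks a message for spam. $message_data is an array keyed by the
    // parameter names listed above.
    public static function is_spam($message_data, $return_full_data = false)
    {
        // Default to the current visitor's IP address.
        if (!isset($message_data['ip'])) {
            $message_data['ip'] = $_SERVER['REMOTE_ADDR'];
        }
        $response = self::xml_query(self::check_data_url, $message_data, true);
        $spam_detected = (isset($response->text['spam-flag'])
            && (string) $response->text['spam-flag'] == 'yes');

        if (!$return_full_data) {
            return $spam_detected;
        }
        return array(
            'detected'   => $spam_detected,
            // Cast to string so the ID can be passed back as a request parameter.
            'request_id' => isset($response->id) ? (string) $response->id : null,
            'spam_links' => isset($response->links) ? $response->links : array()
        );
    }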
This method lets us send data for checking, filling in the user's IP address automatically. Depending on the second parameter, it returns either just true or false, or an array with detailed information: a list of links suspected of being spam and the request id generated by Yandex. That id, by the way, will come in handy later.
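The rule about parameter families can be checked before a message is sent. The sketch below reads the rule as "at most one format per family" and uses a hypothetical helper, cw_has_single_format, that is not part of the original class; the sample data is invented.

```php
<?php
/**
 * Hypothetical helper (not part of the original class): returns true when
 * the message data uses at most one format per parameter family.
 */
function cw_has_single_format(array $message_data)
{
    $formats = array('plain', 'html', 'bbcode');
    foreach (array('subject', 'body') as $family) {
        $used = 0;
        foreach ($formats as $format) {
            if (isset($message_data[$family . '-' . $format])) {
                $used++;
            }
        }
        if ($used > 1) {
            return false;
        }
    }
    return true;
}

// Invented sample data: one subject format and one body format.
$valid = array(
    'ip'            => '192.0.2.10',
    'email'         => 'user@example.com',
    'subject-plain' => 'Hello',
    'body-html'     => '<p>Buy now!</p>',
);

// Adding a second body format breaks the rule.
$invalid = $valid + array('body-plain' => 'Buy now!');
```

Running such a check first avoids spending a request on data the API is documented not to accept.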
Getting a CAPTCHA
Yandex lets us use its own CAPTCHA, and this has clear advantages: first, the load on our server is reduced, and second, keeping the CAPTCHA resistant to cracking becomes Yandex's problem. The method is extremely simple:
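    // Requests a new CAPTCHA, optionally tied to a spam-check request ID.
    public static function get_captcha($id = null)
    {
        $response = self::xml_query(self::get_captcha_url, array('id' => $id));
        if (!$response || !isset($response->captcha)) {
            return false;
        }
        // Cast to strings so the values can be used directly in HTML or requests.
        return array(
            'captcha_id'  => (string) $response->captcha,
            'captcha_url' => (string) $response->url
        );
    }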
As the return statement shows, the method returns the CAPTCHA ID and a link to the image itself.
The link usually has the following form:
u.captcha.yandex.net/image?key=<CAPTCHA ID>
It is better to use both output parameters so that the protection does not break if Yandex changes something in the link format.
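As an illustration of using both returned values, here is a minimal sketch of rendering the CAPTCHA in a form. The helper cw_render_captcha and the field names captcha_id and captcha_value are assumptions made for this example, not part of the original class or the API.

```php
<?php
/**
 * Hypothetical helper: renders the CAPTCHA image, a hidden field carrying
 * the CAPTCHA ID, and a text input for the user's answer.
 * $captcha is an array shaped like the one get_captcha() returns.
 */
function cw_render_captcha(array $captcha)
{
    // Escape both values: they come from a remote service.
    $id  = htmlspecialchars((string) $captcha['captcha_id'], ENT_QUOTES, 'UTF-8');
    $url = htmlspecialchars((string) $captcha['captcha_url'], ENT_QUOTES, 'UTF-8');

    return '<img src="' . $url . '" alt="CAPTCHA">' . "\n"
         . '<input type="hidden" name="captcha_id" value="' . $id . '">' . "\n"
         . '<input type="text" name="captcha_value" autocomplete="off">';
}

// Made-up values standing in for a real get_captcha() result.
$html = cw_render_captcha(array(
    'captcha_id'  => 'abc123',
    'captcha_url' => 'http://u.captcha.yandex.net/image?key=abc123',
));
```

The image URL is taken as returned rather than rebuilt from the ID, so a change in the link format would not break the form.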
Checking the CAPTCHA
Finally, the third class method validates the CAPTCHA value entered by the user.
To use it, we pass the CAPTCHA id issued by the previous method, along with what the user entered. It also helps to send the id of the request we received when the message was checked for spam, but this is optional.
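    // Returns true if the user's answer matches the CAPTCHA.
    public static function check_captcha($captcha_id, $captcha_value, $id = null)
    {
        $parameters = array(
            'captcha' => $captcha_id,
            'value'   => $captcha_value,
            'id'      => $id
        );
        $response = self::xml_query(self::check_captcha_url, $parameters);
        // The answer counts as correct when the response has an <ok> element.
        return isset($response->ok);
    }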
Usage examples
To try out the Clean Web system in full, you can download a simple demo script. Before testing, do not forget to get your Clean Web API key and specify it in the script!
You can also download the class separately or see its full code in the browser.
Verifying Form Content:
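A minimal sketch of how such a check might be wired together. Everything here is an assumption made for illustration: the function cw_handle_form, the form field names, and the status values. The three callables stand in for the class methods so that the flow can be tried without network access.

```php
<?php
/**
 * Hypothetical form handler. The three callables stand in for
 * YandexCW::is_spam, YandexCW::get_captcha and YandexCW::check_captcha.
 * Returns an array whose 'status' key is 'accepted', 'captcha_required'
 * or 'rejected'.
 */
function cw_handle_form(array $post, $is_spam, $get_captcha, $check_captcha)
{
    // A CAPTCHA answer was submitted: verify it first.
    if (isset($post['captcha_id'], $post['captcha_value'])) {
        $request_id = isset($post['request_id']) ? $post['request_id'] : null;
        $ok = call_user_func($check_captcha, $post['captcha_id'], $post['captcha_value'], $request_id);
        return array('status' => $ok ? 'accepted' : 'rejected');
    }

    // Otherwise run the spam check on the comment text.
    $result = call_user_func($is_spam, array('body-plain' => $post['comment']), true);
    if (!$result['detected']) {
        return array('status' => 'accepted');
    }

    // Suspected spam: ask the user to solve a CAPTCHA tied to this request.
    $captcha = call_user_func($get_captcha, $result['request_id']);
    return array(
        'status'     => 'captcha_required',
        'request_id' => $result['request_id'],
        'captcha'    => $captcha,
    );
}

// Stubs for demonstration only; real code would pass the class methods.
$is_spam = function ($data, $full) {
    $detected = strpos($data['body-plain'], 'buy now') !== false;
    return array('detected' => $detected, 'request_id' => 'req-1', 'spam_links' => array());
};
$get_captcha = function ($id) {
    return array('captcha_id' => 'cap-1', 'captcha_url' => 'http://example.com/captcha.png');
};
$check_captcha = function ($captcha_id, $value, $id) {
    return $value === '12345';
};

$clean = cw_handle_form(array('comment' => 'Nice post'), $is_spam, $get_captcha, $check_captcha);
$spam  = cw_handle_form(array('comment' => 'buy now!!!'), $is_spam, $get_captcha, $check_captcha);
$retry = cw_handle_form(
    array('captcha_id' => 'cap-1', 'captcha_value' => '12345', 'request_id' => 'req-1'),
    $is_spam, $get_captcha, $check_captcha
);
```

With the class from this post, the callables would be 'YandexCW::is_spam', 'YandexCW::get_captcha' and 'YandexCW::check_captcha', assuming the methods are declared static, as the static-class design implies.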
Special features
Most parameters of the API methods are optional.
For example, you can skip the spam check entirely and simply plug in the Yandex CAPTCHA, the same way reCAPTCHA is connected.
Read more at api.yandex.ru.