⬆️ ⬇️

Yandex Search API for .Net

Good day!

In the process of working on a single .Net project, it became necessary to obtain Yandex search results for a specific query. In this article I would like to tell you about the experience with the Yandex API from the .Net environment.

Of course, initially I didn’t want to implement the Yandex API wrapper myself (I was hoping that someone had already done it before me), so I spent about 20 minutes trying to figure out that ready-made solutions that would suit me , not. As a result, there was nothing for me to do but write it myself, since the Yandex API documentation is publicly available and quite detailed.



For my needs, the Yandex.XML service was perfect, which accepts a search query and returns an XML response with the search results. In order to use it, you must have a standard registration on Yandex and get an access key. It should be noted that the key is strictly tied to the IP of the user to whom it was received. Limiting the number of requests per day was initially 10, but after confirmation of the phone number rises to 1000.

After receiving the key, the task was divided into 2 stages: sending a request and receiving a response from Yandex.XML, and parsing this answer.



Query structure



In order to make a request to Yandex. XML it is necessary to understand what elements this request consists of.



< query > - .

< sortby > - :

< rlv > - ,

< tm > - .

< maxpassages > - ( - 5, -2).

< groupings > - < groupby > .

< groupby > - , :

< attr = > - , :

< d > - .

<> - .

< mode = > - :

< flat > - .

< deep > - .

< groups-on-page = > - ( 100).

< docs-in-group = > - .



* This source code was highlighted with Source Code Highlighter .


')

With the structure of the request sorted out, now is the time to send it. There are two ways to do this: the POST method and the GET method. I will tell about each separately.



POST method



The essence of the post method is that the request is formed in XML format, written to the stream and sent to the service. So as not to be unfounded, I will give the code in which the above described happens:



ServicePointManager.Expect100Continue = false ;



/* , IP,

API.*/

string url = @"http://xmlsearch.yandex.ru/xmlsearch?

user=**********&

key=**********************************"
;



// XML

string command =

@"<?xml version=" "1.0" " encoding=" "UTF-8" "?>

<request>

<query>- </query>

<groupings>

<groupby attr="
"d" "

mode="
"deep" "

groups-on-page="
"10" "

docs-in-group="
"1" " />

</groupings>

</request>"
;



byte [] bytes = Encoding .UTF8.GetBytes(command);

// , .

HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);

request.Method = "POST" ;

request.ContentLength = bytes.Length;

request.ContentType = "text/xml" ;

// XML-

using ( Stream requestStream = request.GetRequestStream())

{

requestStream.Write(bytes, 0, bytes.Length);

}



//

HttpWebResponse response =(HttpWebResponse)request.GetResponse();



* This source code was highlighted with Source Code Highlighter .




The first line is necessary because the HttpWebRequest object, when sending a request by the POST method, adds the HTTP header “Expect: 100-Continue” to it, which misleads some services (including Yandex.XML) and causes an error. ”(417) Expectation Failed ".



GET method



A GET method differs from a POST method in that the request is a simple string, not an XML format string. When it is received, the service generates XML from the request text (converting each element of a string request to an XML attribute) and sends the answer (in XML format). Code example:



//, IP.

string key = "***********************************" ;

// .

string user = "*****************" ;

// .

string url = @"http://xmlsearch.yandex.ru/xmlsearch?

query={0}&

groupby=attr%3Dd.

mode%3Ddeep.

groups-on-page%3D10.

docs-in-group%3D1&

user={1}&

key={2}"
;

// .

string completeUrl = String .Format(url, searchQuery, user, key);

//, .

HttpWebRequest request = (HttpWebRequest)WebRequest.Create(completeUrl);

// .

HttpWebResponse response = (HttpWebResponse)request.GetResponse();



* This source code was highlighted with Source Code Highlighter .




So, the response from the service received. Now you need to extract from it all the necessary information about the search results. There is a powerful mechanism for working with XML documents in the .Net environment, so there are no special problems with this. To begin with, we will create an object through which we will work with the XML response based on the response received by the HttpWebResponse:



XmlReader xmlReader = XmlReader.Create(response.GetResponseStream());

XDocument xmlResponse = XDocument .Load(xmlReader);



* This source code was highlighted with Source Code Highlighter .




It remains only to parse the received XML response. But before you do this, you need to know the structure of this answer. It is fully described in the documentation for Yandex.XML . I can only say that I did not need general information about the search results in the project, and it was planned to work with separate search results as with separate structures. In each search result I was interested in the following:



• < url > - URL

• < title > -

• < headline > -

• < modtime > -

• < saved-copy-url > -



* This source code was highlighted with Source Code Highlighter .




First, a structure was prepared in which the search results were recorded:



public struct YaSearchResult

{

//url

public string DisplayUrl,

//saved-copy-url

CacheUrl,

//title

Title,

//headline

Description,

//modtime

IndexedTime;



public YaSearchResult( string url,

string cacheUrl,

string title,

string description,

string indexedTime)

{

this .DisplayUrl = url;

this .CacheUrl = cacheUrl;

this .Title = title;

this .Description = description;

this .IndexedTime = indexedTime;

}

}



* This source code was highlighted with Source Code Highlighter .




In the XML document itself, each search result is called a group and has the following structure:



<group>

<categ attr= "" name= "" />

<doccount> </doccount>

<relevance priority= "" />

-<doc id= "" >

<relevance priority= "" />

<url> </url>

<domain> </domain>

<title> </title>

<modtime> </modtime>

<size> </size>

<charset> </charset>

+<passages>

+<properties>

<mime-type> </mime-type>

<saved-copy-url> </saved-copy-url>

</doc>

</group>



* This source code was highlighted with Source Code Highlighter .




And, finally, the parsing method itself (+ at the beginning, an auxiliary method of pulling out the values ​​of their XML):



// doc,

// GetValue , ,

public static string GetValue( XElement group, string name)

{

try

{

return group.Element(«doc»).Element(name).Value;

}

// ,

// .

catch

{

return string .Empty;

}

}




* This source code was highlighted with Source Code Highlighter .




public static List <YaSearchResult> Search( string searchQuery)

{



// YaSearchResult, .

List <YaSearchResult> ret = new List <YaSearchResult>();



// XML' "group" -

var groupQuery = from gr in response.Elements().

Elements( "response" ).

Elements( "results" ).

Elements( "grouping" ).

Elements( "group" )

select gr;



// group SearchResult

for ( int i = 0; i < groupQuery.Count(); i++)

{

string urlQuery = GetValue(groupQuery.ElementAt(i), "url" );

string titleQuery = GetValue(groupQuery.ElementAt(i), "title" );

string descriptionQuery = GetValue(groupQuery.ElementAt(i), "headline" );

string indexedTimeQuery = GetValue(groupQuery.ElementAt(i), "modtime" );

string cacheUrlQuery = GetValue(groupQuery.ElementAt(i),

"saved-copy-url" );

ret.Add( new YaSearchResult(urlQuery, cacheUrlQuery, titleQuery, descriptionQuery, indexedTimeQuery));

}



return ret;

}



* This source code was highlighted with Source Code Highlighter .




Sheet with search results received!



Conclusion



If someone is interested in the topic of working with Yandex API from .Net, then in the next article I can write how to work with the API of Yandex geocoding / reverse geocoding.



Finally I would like to say that the plans are to write a beautiful full wrapper for Yandex API for .Net, not only Yandex.XML but also for other services.

Source: https://habr.com/ru/post/116479/



All Articles