Estimating the competitiveness of search queries from variations in search results

Estimating the degree of competition for a keyword query is one of the sacred tasks of search engine optimization. The hope of finding a query with good conversion that competitors have overlooked is akin to the search for the philosopher's stone. We will try to make our own contribution to this alchemical pursuit.

It turns out that the degree of competition can be assessed almost instantly just by comparing the search results for two related queries, without analyzing site authority metrics, competing sites, or cost-per-click statistics, and without shoveling through mountains of information.


Background

The difficulty of reaching the TOP for a given keyword query is rightly associated with the number of competitors fighting for the first positions. There are several ways to assess this “difficulty”. Here are the most common ones.

Analysis of ranking factors. This method consists of analyzing the search results and the sites included in them: the total number of sites, the amount of contextual advertising, the average number of links, the degree of optimization of each site, and so on. This information is then reduced by a weight function to a single KEI (Keyword Effectiveness Index) indicator, by which queries are compared. The main problems in applying this method are the choice of indicators, the methods of measuring them (automatically), and the weights. Suffice it to note that Yandex uses 800 non-trivially calculated parameters for ranking.

Analysis of contextual advertising bids. This method boils down to estimating and comparing auction bids for the cost of a click in contextual advertising systems (Yandex Direct, Google AdWords). The connection is clear: the “more interesting” the query, the more advertisers compete for the first positions, and the higher the bid. But pricing principles differ between contextual advertising and search engine optimization, which can affect the accuracy of the estimates.

Comparison with competitors' budgets. This information is available in numerous automatic promotion systems (SeoPult, Rookee) as statistics on their users. The problem is that for medium- and low-frequency queries such statistics may be insufficient, so the cost estimate shown is often just the standard minimum amount. In addition, the main (if not the only) component of such a budget is the link budget, and links now play a lesser role.

Idea

But there is another interesting method, based on certain properties of the search engines' query language. This language usually distinguishes between broad and exact queries. A broad query returns matches in all word forms and with any word order, while an exact query returns matches only in exactly the form in which the query is written.
For example, in the notation of the Yandex query language, a broad query would look like

  [buy a car]

and an exact one like

  ["!buy !a !car"]
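The two query forms can be produced mechanically from a plain phrase. The sketch below is only an illustration: the function names are hypothetical, and it assumes the exact form is built by quoting the phrase and prefixing every word with `!`, following the per-word pattern of the query list given later in the article.

```python
def broad_query(phrase: str) -> str:
    """Broad form: the phrase as typed (whitespace normalized).

    Any word forms and any word order may match.
    """
    return " ".join(phrase.split())


def exact_query(phrase: str) -> str:
    """Exact form (assumed construction): quotes around the phrase,
    and a '!' in front of every word."""
    return '"' + " ".join("!" + word for word in phrase.split()) + '"'


if __name__ == "__main__":
    phrase = "mitsubishi outlander"
    print(broad_query(phrase))   # mitsubishi outlander
    print(exact_query(phrase))   # "!mitsubishi !outlander"
```

Sending both strings to the search engine and keeping the ordered lists of result sites gives the two rankings that the rest of the method compares.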

The main requirements of search engine optimization are that the exact search query occurs in the site's texts and markup, and that it is used as anchor text for external and internal links. As a result, a site optimized for a specific query differs from a non-optimized one precisely by these exact occurrences of the search query.

If the query is highly competitive, the results for its broad and exact forms will not differ significantly, since there are plenty of optimized sites for the search algorithm to choose from. If the query has low competition, the search algorithm will compensate for the lack of optimized sites with others: those where it can find words close to the search query, but perhaps in a different morphological form and in a different order.
As a result, the lower the competition, the more diverse and “looser” the search results are, and the easier it is for a new candidate to get in. The higher the competition, the less variety there is, and the greater the likelihood that the results are filled with the same players; squeezing in between them will not be easy.

As an example, let's take several different queries on the automotive topic (deliberately chosen to have different degrees of competition, of course) and see what happens to them in Yandex.
The broad and exact forms of these queries, together with the frequency of the broad queries, are as follows:

 [crossover] ["!crossover"] (239,714)
 [Mitsubishi Outlander] ["!Mitsubishi !Outlander"] (73,760)
 [Mitsubishi Outlander] ["!Mitsubishi !Outlander"] (68,149)
 [mitzubishi outlander] ["!mitzubishi !outlander"] (128)
 [mitsubishi outlander] ["!mitsubishi !outlander"] (41,392)

Here is what the search results for these queries look like. Color marks the sites that appear in the results for both the broad and the exact form.

The difference between [mitsubishi outlander] (41,392) and [mitzubishi outlander] (128), that is, between queries with obviously different competition, is immediately visible to the naked eye.

Calculation

Visually, the idea is clear: the greater the difference between the search results for the broad and exact forms of a query, the lower the competition. The lower the variety, the higher the competition. But how do we calculate it? How do we measure the extent of this diversity?

For this, let us use the expression for the distance between rankings obtained in the paper "Assessment of variability of search results".
Examples of calculations for this expression can be found here.

So, we have the following formula for calculating the weighted relative distance d between two rankings R' and R''

Here:
N is the ranking length (TOP5, TOP10, etc.);
|S| is the number of elements in the set S = R' ∪ R'', that is, the total number of unique objects in the two rankings;
n'_i and n''_i are the positions of the i-th element in the rankings R' and R'', respectively; if an object is absent from a ranking, its position in that ranking is taken to be N + 1.

The more varied the search results, the greater the distance between the rankings. Therefore, as the degree of competition we will use the complement of the distance:

Cn = 1 - d
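To make the relation Cn = 1 - d concrete, here is a minimal Python sketch. Note the assumption: it does not implement the weighted distance from the cited paper. As a stand-in it uses a plain (unweighted) Spearman footrule distance for top-N lists, in which a missing site gets position N + 1 as described above, normalized by its maximum value N(N + 1). The function name and the sample site lists are hypothetical.

```python
def competition_degree(broad, exact, n):
    """Return Cn = 1 - d for two TOP-n result lists.

    `broad` and `exact` are ordered lists of site identifiers
    (each site is assumed to appear at most once per list).

    ASSUMPTION: d is a stand-in, the footrule distance with
    missing sites placed at position n + 1, divided by its
    maximum n * (n + 1). It is not the paper's weighted formula.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    # 1-based positions of each site within the TOP-n of each list.
    pos_broad = {site: i for i, site in enumerate(broad[:n], start=1)}
    pos_exact = {site: i for i, site in enumerate(exact[:n], start=1)}
    # S = R' U R'': every unique site seen in either ranking.
    union = set(pos_broad) | set(pos_exact)
    total = sum(
        abs(pos_broad.get(site, n + 1) - pos_exact.get(site, n + 1))
        for site in union
    )
    d = total / (n * (n + 1))
    return 1.0 - d


if __name__ == "__main__":
    broad = ["a.example", "b.example", "c.example"]
    exact = ["a.example", "c.example", "d.example"]
    print(f"Cn = {competition_degree(broad, exact, 3):.1%}")
```

With this stand-in, identical rankings give Cn = 100% and rankings with no sites in common give Cn = 0%; the actual weighted formula may distribute intermediate values differently.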

For our queries, we obtain the following values of the degree of competition (as a percentage) over the TOP100 ranking.

The question now is how deep into the search results we need to look. The dependence of the degree of competition on the ranking length, shown in the following graph, will help answer it.

As the graph shows, fairly accurate estimates of the degree of competition can already be obtained at the TOP20-TOP30 level.
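The dependence on ranking length can be explored by recomputing the estimate at several depths. The sketch below is illustrative only: it reuses an unweighted footrule distance (missing sites at position N + 1, normalized by N(N + 1)) as a stand-in for the paper's weighted formula, and both the function names and the synthetic result lists are hypothetical.

```python
def footrule_competition(broad, exact, n):
    """Cn = 1 - d at depth n, using a stand-in footrule distance
    (ASSUMPTION: not the weighted formula from the cited paper)."""
    pos_b = {s: i for i, s in enumerate(broad[:n], start=1)}
    pos_e = {s: i for i, s in enumerate(exact[:n], start=1)}
    total = sum(
        abs(pos_b.get(s, n + 1) - pos_e.get(s, n + 1))
        for s in set(pos_b) | set(pos_e)
    )
    return 1.0 - total / (n * (n + 1))


def competition_by_depth(broad, exact, depths=(10, 20, 30, 50, 100)):
    """Map each ranking length to the competition estimate at that depth."""
    return {n: footrule_competition(broad, exact, n) for n in depths}


if __name__ == "__main__":
    # Synthetic data: the exact-form results are the broad-form
    # results with every adjacent pair of sites swapped.
    broad = [f"site{i}.example" for i in range(100)]
    exact = [broad[i ^ 1] for i in range(100)]
    for depth, cn in competition_by_depth(broad, exact).items():
        print(f"TOP{depth}: Cn = {cn:.1%}")
```

If the values stop changing noticeably beyond some depth, fetching deeper results adds little, which is the practical meaning of the TOP20-TOP30 observation.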

Validation

Which method of assessing the degree of competition most accurately reflects the cost of reaching the top of the search results? I do not know the answer to this question. I do not even know the answer to simpler questions: how to calculate the actual cost of SEO in general, or at least the actual cost of optimizing for a specific individual keyword query.
How, then, can we estimate the accuracy of predicting a quantity that we cannot measure?

The situation is like that in dietetics: you can come up with a bunch of diets, but it is hard to realistically assess what actually influenced life expectancy. And just as in dietetics, an answer can probably be obtained only from the experience of many generations.

In the meantime, the only way to check the accuracy of the proposed method is common sense and comparison with analogues. As for simplicity and convenience, it has no competitors. After all, it can produce estimates even for newly emerged key phrases (take note, those who make money on trends)!

As for the analogues, comparative results of estimating the degree of competition with some of the methods listed earlier are presented below.

Here:
Cn is the estimate of the degree of competition obtained by our method.
AVG Context is the estimate obtained from cost-per-click statistics in contextual advertising systems (Yandex Direct and Google AdWords). The values are averaged and normalized (the maximum cost per click is taken as 100%).
AVG Links is the estimate obtained from automatic promotion systems (SeoPult, SeoPult PRO, Rookee) based on the values of the recommended budget. The values are averaged and normalized (the maximum budget is taken as 100%).
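The averaging and normalization used for AVG Context and AVG Links can be sketched as follows. The order of operations is an assumption (average across sources first, then scale so that the maximum becomes 100%), and the function names and all numbers are hypothetical placeholders, not data from the article.

```python
def average_sources(*sources):
    """Average per-query values across several sources.

    Each source is a dict {query: value}; every source is
    assumed to contain the same queries.
    """
    queries = sources[0].keys()
    return {q: sum(src[q] for src in sources) / len(sources) for q in queries}


def normalize_to_max(values):
    """Scale the values so that the largest one becomes 100 (percent)."""
    peak = max(values.values())
    return {q: 100.0 * v / peak for q, v in values.items()}


if __name__ == "__main__":
    # Hypothetical cost-per-click figures for two advertising systems.
    system_a = {"query one": 10.0, "query two": 30.0}
    system_b = {"query one": 20.0, "query two": 50.0}
    avg_context = normalize_to_max(average_sources(system_a, system_b))
    for query, value in avg_context.items():
        print(f"{query}: {value:.1f}%")
```

Normalizing each source separately before averaging is an equally plausible reading and would give somewhat different percentages.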

Although the values differ, the main thing is that the ranking order given by the Cn and AVG Links estimates is the same. It is difficult to say which of these two estimates is more accurate, but the SeoPult PRO source data had no statistics for three of the five queries (the system suggested its minimum possible budget). So there is every reason to believe that our algorithm handles this task better.
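Checking whether two estimates rank the queries in the same order takes only a few lines. This is a minimal sketch with hypothetical function names and made-up values; it only illustrates the kind of comparison described above.

```python
def ranking_order(scores):
    """Queries sorted from the highest estimate to the lowest.

    Ties keep the insertion order of the dict (sorted() is stable).
    """
    return sorted(scores, key=scores.get, reverse=True)


def same_order(first, second):
    """True if both estimates rank the queries identically."""
    return ranking_order(first) == ranking_order(second)


if __name__ == "__main__":
    # Made-up percentages for three placeholder queries.
    cn = {"q1": 80.0, "q2": 55.0, "q3": 20.0}
    avg_links = {"q1": 100.0, "q2": 40.0, "q3": 35.0}
    avg_context = {"q1": 30.0, "q2": 100.0, "q3": 60.0}
    print(same_order(cn, avg_links))    # True
    print(same_order(cn, avg_context))  # False
```

Agreement in order says nothing about which estimate is closer to the true cost; it only shows that the two methods would prioritize queries the same way.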

As for AVG Context, the estimate based on contextual advertising bids, it clearly falls outside the general trend. Use this method with great care.

Conclusion

The simplicity of the proposed method is obvious. The accuracy of the calculations seems to be quite good. On top of that, it can produce estimates in situations where other methods are powerless.
What else do you need to meet ~~old age~~ competitors with dignity?

Source: https://habr.com/ru/post/240795/
