For a long time we (my wife Katya and I) shuffled the numbers back and forth looking for the perfect formula, but everywhere we ran into the human factor. And really, which matters more, the number of votes or the rating, and how should the two be weighed against each other?
In the end we decided to approach the question not mathematically but philosophically:
- People vote more for heavily advertised films, which are usually big-budget films. The higher the budget, the higher the technical quality of the film.
- Yes, a film's rating is subjective in general, but we want a Top for everyone, so we should focus on popularity (popular meaning common, generally understood).
- The more votes a film has, the more objective we consider its rating.
As a result, we concluded that the rating and the number of votes are roughly equally important factors, and the solution suggested itself.
To make these two values comparable, you must bring them to the same scale. The maximum number of votes is always different, but the rating (oh yes!) runs from 1 to N. To keep things simple, we take N = 10. The task is thus reduced to expressing each film's vote count as a share of the maximum vote count among all films.
Next I will describe the implementation of this approach in MySQL, since the mathematicians have already solved the problem for themselves, and the rest, I hope, would like to try something ready-made.
So, create a table:
CREATE TABLE IF NOT EXISTS `films` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(255) NOT NULL,
  `raiting` float NOT NULL,
  `count_votes` int(11) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=1;
Add the 4 entries from the examples:

You have probably already guessed that to calculate the share of votes we need the MAX function and a logarithm.
If not, it is all very simple. Since we have accepted that the number of votes and the rating are equally important and decided to bring the votes to a scale from 1 to 10, it is enough to use a logarithmic scale:
ru.wikipedia.org/wiki/Логарифмический_масштаб
So:
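-- Base of the logarithm: the 10th root of the largest vote count
SELECT @a := POW(MAX(count_votes), 1/10) FROM films;
-- Average of the rating and the vote count mapped onto the 10-point scale
SELECT id, name, raiting, count_votes,
       (LOG(@a, count_votes) + raiting) / 2 AS actual_raiting
FROM films
ORDER BY actual_raiting DESC;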
First we take the 10th root of the maximum number of votes and use it as the base of the logarithm. The logarithm of a film's vote count to that base equals 10 for the film with the most votes and is smaller for every other film, which places the vote count on the same 10-point scale as the rating. We then add the average rating and divide by 2, which gives the two factors equal weight.
Here are the results:
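To see the arithmetic outside the database, here is a minimal Python sketch of the same calculation. The function name `combined_score` and the four sample rows are hypothetical and are not the article's data. It relies on the identity that the logarithm of the votes to the base max_votes^(1/10) equals 10 * ln(votes) / ln(max_votes), and it assumes every film has at least one vote and the maximum is greater than 1.

```python
import math

def combined_score(rating, votes, max_votes, scale=10):
    """Average of the rating and the vote count mapped onto a log scale.

    Mirrors (LOG(@a, count_votes) + raiting) / 2 from the query,
    where @a = POW(max_votes, 1/scale).
    Assumes votes >= 1 and max_votes > 1.
    """
    votes_on_scale = scale * math.log(votes) / math.log(max_votes)
    return (rating + votes_on_scale) / 2

# Hypothetical rows (name, rating, votes), for illustration only.
films = [
    ("A", 9.0, 10),
    ("B", 7.0, 1000),
    ("C", 8.0, 1000),
    ("D", 6.0, 100),
]

max_votes = max(votes for _, _, votes in films)
ranked = sorted(
    films,
    key=lambda f: combined_score(f[1], f[2], max_votes),
    reverse=True,
)

for name, rating, votes in ranked:
    print(name, rating, votes, round(combined_score(rating, votes, max_votes), 2))
```

In this made-up sample, films B and C have the same number of votes, so they are ordered by rating alone, while film A, with a high rating but few votes, drops to the bottom.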

The film with two votes has the least objective rating and ends up below the first two, while the film with 500 votes is simply too poorly rated. Film number 4, despite its small number of votes, is well ahead of its rivals on rating.
Thus, we have created weight classes for films (by popularity), and within each weight class we sort them by rating. Films with roughly the same number of votes end up sorted purely by rating.
So we got rid of the threshold and made the films compete fairly, while giving less advertised but great films a chance to rise within their weight class.
Chef's Compliment: Actor Rating
It's time to explain the article's lead image. It is an actor rating.
Once the first problem was solved, the solution here was found instantly and follows the same rules: the more often an actor appears in big films, the more popular he is and the better he acts. His acting also affects the film's rating.
So we took the maximum number of films for any single actor, counted each actor's number of films and the average rating of those films, and applied the same formula.
This is what the Top Actors list came to look like:
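The aggregation step can be sketched in Python as follows. This is only an illustration under assumptions: the actor names, the ratings, and the helper `actor_score` are hypothetical, and the film count takes the place that the vote count had in the film formula.

```python
import math
from collections import defaultdict

# Hypothetical (actor, film rating) pairs, for illustration only.
appearances = [
    ("Actor A", 8.0), ("Actor A", 6.0), ("Actor A", 7.0), ("Actor A", 7.0),
    ("Actor B", 9.0), ("Actor B", 9.0),
    ("Actor C", 9.5),
]

# Collect the ratings of every film each actor appeared in.
ratings_by_actor = defaultdict(list)
for actor, film_rating in appearances:
    ratings_by_actor[actor].append(film_rating)

# Largest number of films credited to a single actor.
max_films = max(len(ratings) for ratings in ratings_by_actor.values())

def actor_score(film_ratings, max_films, scale=10):
    """Average of the mean film rating and the film count on a log scale.

    Assumes max_films > 1 so that the logarithm base is valid.
    """
    average_rating = sum(film_ratings) / len(film_ratings)
    films_on_scale = scale * math.log(len(film_ratings)) / math.log(max_films)
    return (average_rating + films_on_scale) / 2

top_actors = sorted(
    ratings_by_actor,
    key=lambda actor: actor_score(ratings_by_actor[actor], max_films),
    reverse=True,
)
print(top_actors)
```

With these made-up numbers, the actor with the most films comes first even though his average film rating is the lowest, which is the same popularity-first behaviour as in the film ranking.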

For those interested in seeing the result:
http://vk.com/droptv