
Computer algorithm selects the most significant authors of the past



January 1 is not only the day when we all suddenly find ourselves in a new year (a transition that, for some, comes with the customary headache). It is also the day when the works of many authors enter the public domain. For 50 or 70 years after the author's death, depending on the country, the rights to a work belong to the author, the author's heirs, or the publisher. After that period the work can be used freely: reprinted, digitized, and even altered. Here "authors" means writers, poets, artists, and other figures from the art world.

So every year thousands, even tens of thousands, of works become available to everyone, yet only a small share of the printed ones are digitized. The capacity of digitization teams is limited, and deciding which authors matter most is very difficult.
To simplify the selection, Allen Riddell of Dartmouth College (New Hampshire) created a computer program, an algorithm that estimates the significance of various authors. To run it, you enter a year, and the machine selects the most significant authors (by its own estimate) whose works are no longer under copyright as of that year.

In other words, the algorithm helps choose which authors and works should be digitized first. Significance is estimated from a large number of factors, including mentions of the author on Wikipedia, how often the author is quoted, the number of views of Wikipedia articles about the author's biography or works, and other data.
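The article does not say how these factors are combined, so the following is only a minimal sketch of one plausible approach: a weighted sum of log-scaled counts, converted into a rank where 1 means most significant. The weights, the signal names (`mentions`, `citations`, `page_views`), and the toy numbers are all assumptions made up for illustration; they are not the algorithm's actual formula or real Wikipedia statistics.

```python
import math

# Hypothetical weights: the article does not say how the factors are combined.
WEIGHTS = {"mentions": 1.0, "citations": 1.0, "page_views": 0.5}


def significance_score(signals):
    """Weighted sum of log-scaled counts; a higher score means more significant."""
    return sum(w * math.log1p(signals.get(name, 0)) for name, w in WEIGHTS.items())


def rank_authors(signals_by_author):
    """Return {author: rank}, where rank 1 is the most significant author."""
    ordered = sorted(
        signals_by_author,
        key=lambda author: significance_score(signals_by_author[author]),
        reverse=True,
    )
    return {author: position for position, author in enumerate(ordered, start=1)}


# Invented toy numbers, not real Wikipedia statistics.
toy_signals = {
    "Author A": {"mentions": 1200, "citations": 300, "page_views": 50000},
    "Author B": {"mentions": 15, "citations": 2, "page_views": 400},
    "Author C": {"mentions": 200, "citations": 40, "page_views": 9000},
}

ranks = rank_authors(toy_signals)
print(ranks)
```

Log scaling is used here so that one very large count, such as page views, does not drown out the other signals; any other monotonic transformation would serve the same illustrative purpose.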

The algorithm uses two databases. The first is a list of millions of books from the University of Pennsylvania. The second is Wikipedia, as mentioned above.

The author called his scoring system the “Public Domain Rank”; the algorithm can be tried out on the project's site. It can rank every author mentioned in the English-language Wikipedia. The results are interesting. For example, the writer Virginia Woolf gets 1,081 points out of a possible 1,011,304, while the artist Giuseppe Amisani, who died in the same year as Virginia Woolf, gets 580,363. The lower the number, the more significant the author.

By this measure, organizations such as Project Gutenberg should digitize Virginia Woolf's works first and the artist's paintings afterwards.
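The workflow described above (enter a year, get the copyright-free authors in order of significance) can be sketched roughly as follows. This is an illustration, not the program's real code: the `Author` class and both function names are hypothetical, and the sketch assumes that the term is counted from the author's death and that works become free on January 1 of the year after it ends. The two ranks are the figures quoted in the article; 1941 is the year both authors died.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Author:
    name: str
    death_year: int
    rank: int  # lower rank = more significant


def public_domain_year(death_year, term_years):
    """First year in which the works are free to use.

    Assumption: the term runs to the end of the calendar year,
    so the works become free on January 1 of the following year.
    """
    return death_year + term_years + 1


def digitization_queue(authors, year, term_years=70):
    """Authors already out of copyright in `year`, most significant first."""
    eligible = [
        a for a in authors
        if public_domain_year(a.death_year, term_years) <= year
    ]
    return sorted(eligible, key=lambda a: a.rank)


authors = [
    Author("Giuseppe Amisani", 1941, 580_363),
    Author("Virginia Woolf", 1941, 1_081),
]

queue = digitization_queue(authors, year=2012, term_years=70)
print([a.name for a in queue])  # ['Virginia Woolf', 'Giuseppe Amisani']
```

With a 70-year term both authors become eligible in 2012, and sorting by rank puts Virginia Woolf ahead of Giuseppe Amisani, matching the order suggested above.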

Among the most significant authors whose works become available for digitization on January 1, 2016 in countries with a 50-year term, first place, according to the algorithm, goes to Thomas Stearns Eliot. The works of Winston Churchill, Malcolm X, and several other famous people will also become available.

According to the algorithm's developer, the machine's rankings often agree with human judgments. Of course, the ranking should not be taken as gospel, especially since it draws on only one source of information about each author: Wikipedia.

By the way, many years ago I read a science fiction story about a similar problem. In it, a writer recognized by no one built a machine for evaluating writers and poets. Naturally, he hoped his own poem would take first place. But no: first place went to the “Handbook of Radio Engineering”. Machines will be machines.

Source: https://habr.com/ru/post/363241/

