
Restoration of deleted Wikipedia articles has begun

In September last year, I reported that the staff of the open wiki project "ALL" intended to start recovering data deleted from the Russian Wikipedia. Until then, deleted articles, images, templates, and some other pages of interest had been downloaded to independent repositories.



Since ALL is an encyclopedic project, its administrators do not keep spam or users' self-promotion there, only articles about real people, events, and companies. To begin with, a selection of articles was made using a specific algorithm that filters out obvious vandalism.





For example, many articles about fictional universes have been restored.

According to the bot's programmers, the selection algorithm was as follows. An article was selected if:

* No article with the same title currently exists in Wikipedia (that is, it has not been recreated as a standalone article, at most as a redirect);

* The deletion comment contains none of the keywords such as "vandalism" or "copyright infringement", which indicate that the page most likely has no value.
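
The two conditions above can be sketched as a simple filter. This is only an illustration, not the bot's actual code: the `DeletedPage` record, the `is_candidate` and `select_candidates` functions, the two title sets, and the keyword list (limited to the two examples named in the text) are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical keyword list: the article names only these two as examples.
REJECT_KEYWORDS = ("vandalism", "copyright infringement")


@dataclass
class DeletedPage:
    """Hypothetical record for one entry of the deletion log."""
    title: str
    deletion_comment: str


def is_candidate(page, live_titles, redirect_titles):
    """Return True if a deleted page passes both selection conditions.

    live_titles: titles that currently exist in the wiki (assumed input).
    redirect_titles: the subset of live titles that are only redirects.
    """
    # Condition 1: the title has not been recreated as a standalone article.
    # A title that exists only as a redirect still qualifies.
    recreated = page.title in live_titles and page.title not in redirect_titles
    if recreated:
        return False

    # Condition 2: the deletion comment contains none of the reject keywords.
    comment = page.deletion_comment.lower()
    return not any(keyword in comment for keyword in REJECT_KEYWORDS)


def select_candidates(pages, live_titles, redirect_titles):
    """Keep only the deleted pages worth restoring, preserving input order."""
    return [p for p in pages if is_candidate(p, live_titles, redirect_titles)]
```

Matching on a lowercased comment keeps the keyword check case-insensitive. A real deletion log would likely need a longer keyword list, and for the Russian Wikipedia it would be in Russian.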


The bot ran last fall and compiled a list of approximately 100,000 articles satisfying these conditions.



At the beginning of this year, the upload bot was finally launched. It has been filling ALL with articles: the project has received more than 2,000 deleted articles from the Russian Wikipedia so far. At least several thousand more interesting articles are on the way. Their full list is available at the link above. These articles do not exhaust the unique content of ALL, though: it also has many articles about people, schools and, for example, iconic songs.



The restored articles come with their templates, categories and images.

Source: https://habr.com/ru/post/165985/


