
Restoring topics from the Yandex search engine cache

Good day. I want to share a module for the Livestreet CMS that recovers a site's topics from the Yandex cache.

I am developing the My Equipment site, and the other day tragedy struck: the database crashed.
Some members of the community have already been through this. As it turned out, my hoster does not make backups, and I had none of my own. So I had to recover two months' worth of topics from the search engine cache.

Since I had never dug into the engine's code, I decided to get acquainted with it at the same time.
The tasks before me were the following:
1) Fetch and parse the Yandex search results for pages stored in the cache
2) Parse the text of the pages saved in the Yandex cache
3) Save the page contents in the database and restore the pages' old URLs, so that users and search engines find them at the old addresses.
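
As a rough illustration of task 3, the sketch below pulls the numeric topic ID out of an old page address so that the restored topic can be saved under the same ID. The URL shape (an address ending in `/<id>.html`) and the helper name `extractTopicId` are assumptions made for this example, not something taken from the module.

```php
<?php
// Hypothetical helper: extract the numeric topic ID from an old topic URL.
// Assumes the address path ends in "/<id>.html"; adjust the pattern otherwise.
function extractTopicId($sUrl)
{
    $sPath = parse_url($sUrl, PHP_URL_PATH);
    if (!is_string($sPath)) {
        // No path component, or the URL could not be parsed
        return null;
    }
    if (preg_match('#/(\d+)\.html$#', $sPath, $aMatch)) {
        return (int)$aMatch[1];
    }
    return null;
}
```

The returned ID is what would later be passed as the requested identifier when the topic is saved.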

Well, so far it all seemed simple and clear. Having opened an example of creating a custom module, I started writing.

The first thing I did was look for a way to save a page with a specified identifier. It turned out there is no such function, so I decided to patch the source. Thank goodness the code is open. I use Livestreet version 0.3, so this guide applies to that version of the engine.

1) Open the topic module (classes\modules\topic\Topic.class.php)

Find the declaration of the AddTopic function (line 44 in my copy):
public function AddTopic(TopicEntity_Topic $oTopic)


Change to
public function AddTopic(TopicEntity_Topic $oTopic,$needId=null)

This makes it possible to insert records with a given ID.

2) The topic mapper needs a small change (classes\modules\topic\mapper\Topic.mapper.class.php)
Find the AddTopic function (line 44 in my copy):
public function AddTopic(TopicEntity_Topic $oTopic) {
    $sql = "INSERT INTO ".DB_TABLE_TOPIC."
        (blog_id,
        user_id,
        topic_type,
        topic_title,
        topic_tags,
        topic_date_add,
        topic_user_ip,
        topic_publish,
        topic_publish_draft,
        topic_publish_index,
        topic_cut_text,
        topic_forbid_comment,
        topic_text_hash
        )
        VALUES(?d, ?d, ?, ?, ?, ?, ?, ?d, ?d, ?d, ?, ?, ?)
    ";
    if ($iId=$this->oDb->query($sql,$oTopic->getBlogId(),$oTopic->getUserId(),$oTopic->getType(),$oTopic->getTitle(),
        $oTopic->getTags(),$oTopic->getDateAdd(),$oTopic->getUserIp(),$oTopic->getPublish(),$oTopic->getPublishDraft(),$oTopic->getPublishIndex(),$oTopic->getCutText(),$oTopic->getForbidComment(),$oTopic->getTextHash()))
    {
        $oTopic->setId($iId);
        $this->AddTopicContent($oTopic);
        return $iId;
    }
    return false;
}


Change it to
public function AddTopic(TopicEntity_Topic $oTopic,$needId=null) {
    if($needId==null)
    {
        // No ID requested: let the database assign one
        $sql = "INSERT INTO ".DB_TABLE_TOPIC."
            (blog_id,
            user_id,
            topic_type,
            topic_title,
            topic_tags,
            topic_date_add,
            topic_user_ip,
            topic_publish,
            topic_publish_draft,
            topic_publish_index,
            topic_cut_text,
            topic_forbid_comment,
            topic_text_hash
            )
            VALUES(?d, ?d, ?, ?, ?, ?, ?, ?d, ?d, ?d, ?, ?, ?)
        ";
        if ($iId=$this->oDb->query($sql,$oTopic->getBlogId(),$oTopic->getUserId(),$oTopic->getType(),$oTopic->getTitle(),
            $oTopic->getTags(),$oTopic->getDateAdd(),$oTopic->getUserIp(),$oTopic->getPublish(),$oTopic->getPublishDraft(),$oTopic->getPublishIndex(),$oTopic->getCutText(),$oTopic->getForbidComment(),$oTopic->getTextHash()))
        {
            $oTopic->setId($iId);
            $this->AddTopicContent($oTopic);
            return $iId;
        }
    }else
    {
        // Never overwrite an existing topic that already has the requested ID.
        // The ID is passed through a ?d placeholder instead of being concatenated into the SQL.
        $sql = "SELECT COUNT(*) AS cnt FROM ".DB_TABLE_TOPIC." WHERE topic_id = ?d";
        $aRow=$this->oDb->query($sql,$needId);
        if($aRow[0]['cnt']>0)
        {
            return false;
        }

        // Same insert as above, with topic_id supplied explicitly
        $sql = "INSERT INTO ".DB_TABLE_TOPIC."
            (blog_id,
            user_id,
            topic_type,
            topic_title,
            topic_tags,
            topic_date_add,
            topic_user_ip,
            topic_publish,
            topic_publish_draft,
            topic_publish_index,
            topic_cut_text,
            topic_forbid_comment,
            topic_text_hash,
            topic_id
            )
            VALUES(?d, ?d, ?, ?, ?, ?, ?, ?d, ?d, ?d, ?, ?, ?, ?d)
        ";
        if ($iId=$this->oDb->query($sql,$oTopic->getBlogId(),$oTopic->getUserId(),$oTopic->getType(),$oTopic->getTitle(),
            $oTopic->getTags(),$oTopic->getDateAdd(),$oTopic->getUserIp(),$oTopic->getPublish(),$oTopic->getPublishDraft(),$oTopic->getPublishIndex(),$oTopic->getCutText(),$oTopic->getForbidComment(),$oTopic->getTextHash(),$needId))
        {
            $oTopic->setId($needId);
            $this->AddTopicContent($oTopic);
            return $needId;
        }
    }
    return false;
}


3) Go back to the AddTopic function in classes\modules\topic\Topic.class.php
and change its first line from
if ($sId=$this->oMapperTopic->AddTopic($oTopic)) {

to
if ($sId=$this->oMapperTopic->AddTopic($oTopic,$needId)) {


Now a topic can be inserted with a specific identifier. If an entry with the same identifier already exists in the database, it is not overwritten.
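
The resulting behavior can be sketched without a database. The snippet below is only a minimal illustration using an in-memory array in place of the topic table; the function name `insertTopic` and the array layout are assumptions made for the example and are not part of Livestreet.

```php
<?php
// Illustrative in-memory stand-in for the topic table: topic ID => row.
// insertTopic() is a hypothetical name used only in this sketch.
function insertTopic(array &$aTable, array $aRow, $needId = null)
{
    if ($needId === null) {
        // No ID requested: imitate auto-increment
        $iId = $aTable ? max(array_keys($aTable)) + 1 : 1;
    } else {
        $iId = (int)$needId;
        // An existing topic with this ID is never overwritten
        if (isset($aTable[$iId])) {
            return false;
        }
    }
    $aTable[$iId] = $aRow;
    return $iId;
}
```

Inserting a row with ID 42 into an empty table returns 42, a second insert with the same ID returns false and leaves the first row intact, and an insert without an ID then gets the next free number.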

Now everything is ready to install the module. Note that the module is not localized: all text is hard-coded in the templates. It also runs very slowly on purpose, so as not to attract Yandex's attention, but the speed is configurable.
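
The idea of deliberately slow fetching can be sketched as follows. This is not the module's code: the function name `fetchThrottled`, the injected `$fnFetch` callable, and the default delay of 10 seconds are all assumptions chosen for illustration.

```php
<?php
// Hypothetical helper: fetch URLs one by one, pausing between requests.
// $fnFetch is any callable that takes a URL and returns the page body
// (for example a small wrapper around curl); it is injected so the
// sketch can be tried without touching the network.
function fetchThrottled(array $aUrls, $fnFetch, $iDelaySeconds = 10)
{
    $aPages = array();
    $iLast = count($aUrls) - 1;
    foreach (array_values($aUrls) as $i => $sUrl) {
        $aPages[$sUrl] = call_user_func($fnFetch, $sUrl);
        // Pause between requests, but not after the last one
        if ($i < $iLast && $iDelaySeconds > 0) {
            sleep($iDelaySeconds);
        }
    }
    return $aPages;
}
```

A longer delay makes the restore take more time but keeps the request rate low; passing 0 disables the pause entirely.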

To install it:
1) Download it from here.
2) Read the readme file from the downloaded archive.
3) Install the module and restore the lost topics.

Where possible, it retrieves all the topics that are in the search engine cache. It places the cuts and sets the correct dates, blog names, and users. The module requires curl, mbstring, and iconv. I hope it will be useful not only to me. Thanks for your attention.

Source: https://habr.com/ru/post/78699/

