📜 ⬆️ ⬇️

Own search by hand rutracker.org - implementation on Yii2

Inspired by this publication.

Here is a description of how to implement a search by rutracker.org distribution on your own hosting / localhost.


')
Preliminary agreement:



After reading the previous topic on this topic, I was, frankly, slightly disappointed with the implementation offered by the author. Actually, that's why I did everything myself.

The whole project is on github , the whole code can be viewed there, I will only give excerpts here, for understanding the point.

The project implemented automatic import of csv-files from this distribution (started from the console), and a search by name / category / subcategory of distribution.

Details

If you want to use the whole project as it is, here is a brief instruction:

  1. clone repository (git clone github.com/andrew72ru/rutracker-yii2.git )
  2. go to the project folder, install the components (composer install)
  3. initialize the environment (./init)
  4. create a database, configure access to it in common / config / main-local.php
  5. start the migration (./yii migrate)
  6. configure your web server to access the project (root directory is frontend / web)
  7. download distribution
  8. create a frontend / runtime / csv directory
  9. Put the latest distribution files in this directory. The entire distribution is divided into folders, they are named dates, I took the folder with the last date
  10. run in the console ./yii import / import


On my server, the import lasted about six hours - there are more than one and a half million entries in the distribution table, do not be surprised.

DB schema
Table for categories:

CREATE TABLE `categories` ( `id` int(11) NOT NULL AUTO_INCREMENT, `category_name` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL, `file_name` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL, PRIMARY KEY (`id`) ) ENGINE=InnoDB AUTO_INCREMENT=37 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci; 


Subcategories table:

 CREATE TABLE `subcategory` ( `id` int(11) NOT NULL AUTO_INCREMENT, `forum_name` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL, PRIMARY KEY (`id`) ) ENGINE=InnoDB AUTO_INCREMENT=1239 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci; 


Distribution table:

 CREATE TABLE `torrents` ( `id` int(11) NOT NULL AUTO_INCREMENT, `forum_id` int(11) DEFAULT NULL, `forum_name` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL, `topic_id` int(11) DEFAULT NULL, `hash` varchar(50) COLLATE utf8_unicode_ci DEFAULT NULL, `topic_name` text COLLATE utf8_unicode_ci, `size` bigint(20) DEFAULT NULL, `datetime` int(11) DEFAULT NULL, `category_id` int(11) NOT NULL, `forum_name_id` int(11) NOT NULL, PRIMARY KEY (`id`), UNIQUE KEY `topic_id` (`topic_id`), UNIQUE KEY `hash` (`hash`), KEY `category_torrent_fk` (`category_id`), KEY `torrent_subcat_id` (`forum_name_id`), CONSTRAINT `category_torrent_fk` FOREIGN KEY (`category_id`) REFERENCES `categories` (`id`) ON DELETE CASCADE ON UPDATE CASCADE, CONSTRAINT `torrent_subcat_id` FOREIGN KEY (`forum_name_id`) REFERENCES `subcategory` (`id`) ON DELETE CASCADE ON UPDATE CASCADE ) ENGINE=InnoDB AUTO_INCREMENT=1635590 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci; 


The table with distributions is somewhat redundant (the forum_name column is no longer needed, implemented as a link), but I did not delete it, so that you can refer directly to it and not use the JOIN.


Models

Models are used generated through gii with almost no changes. I don’t think it’s worth giving them all here (see github), except for the one used to search through Sphinx.

TorrentSearch.php
 namespace common\models; use Yii; use yii\helpers\ArrayHelper; use yii\sphinx\ActiveDataProvider; //   use yii\sphinx\ActiveRecord; //   yii2-sphinx /** * This is the model class for index "torrentz". * * @property integer $id * @property string $size * @property string $datetime * @property integer $id_attr * @property integer $size_attr * @property integer $datetime_attr * @property string $topic_name * @property string $topic_id * @property integer $topic_id_attr * @property integer $category_attr * @property string $category_id * @property string $name_attr * @property integer $forum_name_id_attr */ class TorrentSearch extends ActiveRecord { /** * @inheritdoc */ public static function indexName() { return '{{%torrentz}}'; } /** * @inheritdoc */ public function rules() { return [ [['id'], 'required'], [['id'], 'unique'], [['id'], 'integer'], [['id_attr'], 'integer'], [['topic_name', 'topic_id', 'category_id'], 'string'], [['name_attr'], 'string'], [['id', 'size_attr', 'datetime_attr', 'id_attr', 'topic_id_attr', 'category_attr', 'forum_name_id_attr'], 'integer'], [['size', 'datetime', 'topic_name', 'name_attr'], 'string'] ]; } /** * @inheritdoc */ public function attributeLabels() { return [ 'id_attr' => Yii::t('app', 'ID'), 'name_attr' => Yii::t('app', 'Topic Name'), 'id' => Yii::t('app', 'ID'), 'size' => Yii::t('app', 'Size'), 'datetime' => Yii::t('app', 'Datetime'), 'topic_name' => Yii::t('app', 'Topic Name'), 'size_attr' => Yii::t('app', 'Size'), 'datetime_attr' => Yii::t('app', 'Torrent Registered Date'), 'category_attr' => Yii::t('app', 'Category Name'), 'forum_name_id_attr' => Yii::t('app', 'Forum Name'), ]; } /** *    * * @param $params * @return ActiveDataProvider */ public function search($params) { $query = self::find(); $dataProvider = new ActiveDataProvider([ 'query' => $query, ]); $this->load($params); $query->match($this->name_attr); $query->filterWhere(['category_attr' => $this->category_attr]); $query->andFilterWhere(['forum_name_id_attr' => $this->forum_name_id_attr]); $dataProvider->sort = [ 'defaultOrder' => ['category_attr' => SORT_ASC, 'datetime_attr' => SORT_DESC], ]; return $dataProvider; } /** *    (forum_name)    * * @param null|integer $id * @return array */ public static function subsForCat($id = null) { $query = Subcategory::find(); if ($id != null && ($cat = Categories::findOne($id)) !== null) { $subcatsArr = array_keys(self::find() ->where(['category_attr' => $id]) ->groupBy('forum_name_id_attr') ->indexBy('forum_name_id_attr') ->limit(10000) ->asArray() ->all()); $query->andWhere(['id' => $subcatsArr]); } return ArrayHelper::map($query->asArray()->all(), 'id', 'forum_name'); } /** *     ,    * * @param null|integer $id * @return array */ public static function catForSubs($id = null) { $query = Categories::find(); if($id != null && ($subCat = Subcategory::findOne($id)) !== null) { /** @var TorrentSearch $category */ $category = self::find()->where(['forum_name_id_attr' => $id])->one(); $query->andWhere(['id' => $category->category_attr]); } return ArrayHelper::map($query->asArray()->all(), 'id', 'category_name'); } } 



Import

The main idea is that we first import categories (the category_info.csv file), then distributions (category _ *. Csv files), in the course of importing distributions we take subcategories from them and write them into a separate model.

Import controller
 namespace console\controllers; use common\models\Categories; use common\models\Subcategory; use common\models\Torrents; use Yii; use yii\console\Controller; use yii\helpers\Console; use yii\helpers\VarDumper; /** *      csv- * * Class ImportController * @package console\controllers */ class ImportController extends Controller { public $color = true; /** *  * @return int */ public function actionIndex() { $this->stdout("Default: import/import [file_path]. \nDefault file path is frontend/runtime/csv\n\n"); return Controller::EXIT_CODE_NORMAL; } /** *    * * @param string $path * @return int */ public function actionImport($path = 'frontend/runtime/csv') { $fullPath = Yii::getAlias('@' . $path); if(!is_dir($fullPath)) { $this->stderr("Path '{$fullPath}' not found\n", Console::FG_RED); return Controller::EXIT_CODE_ERROR; } if(is_file($fullPath . DIRECTORY_SEPARATOR . 'category_info.csv')) $categories = $this->importCategories($fullPath); else { $this->stderr("File 'category_info.csv' not found\n", Console::FG_RED); return Controller::EXIT_CODE_ERROR; } if($categories === false) { $this->stderr("Categories is NOT imported", Console::FG_RED); return Controller::EXIT_CODE_ERROR; } /** @var Categories $cat */ foreach ($categories as $cat) { if(!is_file($fullPath . DIRECTORY_SEPARATOR . $cat->file_name)) continue; $this->importTorrents($cat, $path); } return Controller::EXIT_CODE_NORMAL; } /** *   * * @param \common\models\Categories $cat * @param $path */ private function importTorrents(Categories $cat, $path) { $filePath = Yii::getAlias('@' . $path . DIRECTORY_SEPARATOR . $cat->file_name); $row = 0; if (($handle = fopen($filePath, "r")) !== FALSE) { while (($data = fgetcsv($handle, 0, ";")) !== FALSE) { $row++; $model = Torrents::findOne(['forum_id' => $data[0], 'topic_id' => $data[2]]); if($model !== null) continue; // Subcategory $subcat = $this->importSubcategory($data[1]); if(!($subcat instanceof Subcategory)) { $this->stderr("Error! Unable to import subcategory!"); $this->stdout("\n"); continue; } $this->stdout("Row {$row} of category \"{$cat->category_name}\" "); $this->stdout("and subcategory \"{$subcat->forum_name}\": \n"); if($model === null) { if(isset($data[4])) $data[4] = str_replace('\\', '/', $data[4]); //   ,   ,       //          , //    ,    if(!isset($data[0]) || !isset($data[1]) || !isset($data[2]) || !isset($data[3]) || !isset($data[4]) || !isset($data[5]) || !isset($data[6])) { $this->stderr("Error! Undefined Field!\n", Console::FG_RED); \yii\helpers\VarDumper::dump($data); $this->stdout("\n"); continue; } $model = new Torrents([ 'forum_id' => $data[0], 'forum_name' => $data[1], 'topic_id' => $data[2], 'hash' => $data[3], 'topic_name' => $data[4], 'size' => $data[5], 'datetime' => strtotime($data[6]), 'category_id' => $cat->id, ]); } $model->forum_name_id = $subcat->id; if($model->save()) { $this->stdout("Torrent \t"); $this->stdout($model->topic_name, Console::FG_YELLOW); $this->stdout(" added\n"); } $this->stdout("\n"); } } } /** *   (forum_name) * * @param string $subcat_name * @return bool|Subcategory */ private function importSubcategory($subcat_name) { $model = Subcategory::findOne(['forum_name' => $subcat_name]); if($model === null) $model = new Subcategory(['forum_name' => $subcat_name]); if($model->save()) return $model; else { VarDumper::dump($model->errors); } return false; } /** *   * * @param $path * @return array|\yii\db\ActiveRecord[] */ private function importCategories($path) { $file = $path . DIRECTORY_SEPARATOR . 'category_info.csv'; $row = 1; if (($handle = fopen($file, "r")) !== FALSE) { while (($data = fgetcsv($handle, 0, ";")) !== FALSE) { $row++; $this->stdout("Row " . $row . ":\n"); $model = Categories::findOne($data[0]); if($model === null) { $model = new Categories([ 'id' => $data[0], 'category_name' => $data[1], 'file_name' => $data[2] ]); } if($model->save()) $this->stdout("Category {$model->id} with name '{$model->category_name}' imported\n"); $this->stdout("\n"); } } else return false; return Categories::find()->all(); } } 



Import is better to run in screen, so that you can close the console. You can, of course, redirect the output to a file and read later, at your leisure.

Sphinx

For debian - apt-get install sphinxsearch
I have installed Sphinx version 2.2.9

/etc/sphinxsearch/sphinx.conf
 source torrentz { type = mysql sql_host = localhost sql_user = webmaster #   MySQL sql_pass = webmaster #   MySQL sql_db = rutracker #      sql_port = 3306 sql_query_pre = SET NAMES utf8 sql_query_pre = SET CHARACTER SET utf8 sql_query = SELECT id, id AS id_attr, \ size, size AS size_attr, \ datetime, datetime as datetime_attr, \ topic_name, topic_name AS name_attr, \ topic_id, topic_id AS topic_id_attr, \ category_id, category_id AS category_attr, \ forum_name_id, forum_name_id AS forum_name_id_attr \ FROM torrents sql_attr_string = name_attr sql_attr_uint = id_attr sql_attr_uint = size_attr sql_attr_uint = datetime_attr sql_attr_uint = topic_id_attr sql_attr_uint = category_attr sql_attr_uint = forum_name_id_attr } index torrentz { source = torrentz path = /var/lib/sphinxsearch/data/ docinfo = extern morphology = stem_enru min_word_len = 2 charset_table = 0..9, A..Z->a..z, _, a..z, U+410..U+42C->U+430..U+44C, U+42E..U+42F->U+44E..U+44F, U+430..U+44C, U+44E..U+44F, U+0401->U+0435, U+0451->U+0435, U+042D->U+0435, U+044D->U+0435 min_infix_len = 2 } indexer { mem_limit = 512M } searchd { listen = 0.0.0.0:9306:mysql41 log = /var/log/sphinxsearch/searchd.log query_log = /var/log/sphinxsearch/query.log read_timeout = 5 max_children = 30 pid_file = /var/run/sphinxsearch/searchd.pid } 



Indexing is started by the command

 indexer --config /etc/sphinxsearch/sphinx.conf --all #    

 indexer --config /etc/sphinxsearch/sphinx.conf --rotate --all #     


That's all.
In the web interface, the standard Yii2 GridView, search through standard filters.

What would be worth completing

You can develop it endlessly, if you want. First of all, you can make a selective import of categories / subcategories, a more correct dependent list of categories / subcategories in GridView, API for remote queries and then in general everything that comes to mind.

Maybe I'll do it at my leisure.

PS I welcome comments and additions on the code, but please do not bother to write “php sucks, write on ... <insert any other language>” - we have already discussed all this for a long time.
Also, comments / additions on sphinx config are welcome, and I want to remind once again - I saw it for the first time in my life and used it only because the author of the original topic wrote about it. Well, for the experiment, of course, but what about :)

Source: https://habr.com/ru/post/274449/


All Articles