📜 ⬆️ ⬇️

Taming SphinxSearch with an Elephant

image

Good afternoon, hrobrozhiteli!

I present to your attention an extension for PostgreSQL that allows you to send search queries to Sphinx from PostgreSQL and receive the results of these queries.

Implementation details and link to the repository under the cat.

Foreword


One large web project required the use of the full text engine Sphinx . And everything would be fine if it were not required to sort the results of the queries from Sphinx according to the cleverly specified rules inside the database.
')
And then the obvious question arose: how? Here at least two options suggest themselves:


After weighing all the pros and cons of these options, the sphinxlink extension was implemented. The idea of ​​implementation was honestly spied from the dblink extension.

Expansion functions and implementation details


As in dblink, everything is managed using SQL functions written in C.
The client interaction protocol with Sphinx corresponds to the mysql interaction protocol, which means that we can use the most common mysql-client library to connect to Sphinx.

To connect to the Sphinx server, use the sphinx_connect () function:

SELECT * FROM sphinx_connect('myconn'); sphinx_connect ---------------- OK (1 ) 

Connections are created in the static memory area of ​​the backend as an HTAB structure, so when closing a session, all connections created to Sphinx will be closed.

To view the list of open connections with Sphinx and connection parameters, use the sphinx_connections () function:

 SELECT * FROM sphinx_connections(); conname | host | port ---------+-----------+------ myconn | 127.0.0.1 | 9306 (1 ) 

And now how to send search queries and get results.

Retrieving search query results in tabular form is implemented in the function sphinx_query () . It is important to specify the name of the columns and their data types for the correct formation of the result.

For example, create an index in Sphinx with the following structure:

 DESC myindex; +---------+--------+ | Field | Type | +---------+--------+ | id | bigint | | content | field | | title | string | +---------+--------+ 

The content field will contain the texts of the documents (for example, the stories of A.P. Chekhov), the title field is the title of the story, the field id is the text identifier (the service field in Sphinx).

 SELECT * FROM sphinx_query('myconn', 'SELECT weight(), * FROM myindex WHERE MATCH('''')') AS (weight integer, docid integer, title text); weight | docid | title --------+-------+------- 1680 | 3 |  (1 ) 

To request meta-information on the last request, use the sphinx_meta () function:

 SELECT * FROM sphinx_meta('myconn'); varname | value -------------+-------- total | 1 total_found | 1 time | 0.000 keyword[0] |  docs[0] | 1 hits[0] | 1 (6 ) 

This feature is made for convenience. A similar effect can be achieved by sending a SHOW META request using the sphinx_query () function.

To close an unnecessary connection, use the sphinx_disconnect () function:

 SELECT * FROM sphinx_disconnect('myconn'); sphinx_disconnect ------------------- OK (1 ) 

Implementation plans


The current implementation of the module has several disadvantages:


Traditionally for similar posts, leave a link to github: sphinxlink .

Comments and suggestions are warmly welcome. Thanks for attention!

Source: https://habr.com/ru/post/347700/


All Articles