
Building a simple product-property filter with ElasticSearch on Symfony2

I decided to write this article because I could not find a ready-made step-by-step guide on how to implement a product filter on ElasticSearch, while the task itself was clear and non-negotiable. There was fragmentary reference information, but no cookbook for solving even the most trivial tasks.

I focus on Symfony2 because I will use FOSElasticaBundle, which lets you describe the mapping of ElasticSearch indexes in convenient YAML configs and bind Doctrine ORM entities or Doctrine ODM documents to them. The indexes are populated from the related Doctrine entities with a single console command. The bundle also includes a vendor library for building search and filter queries. Search results are returned as an array of the Doctrine ORM entities or ODM documents bound to the search index. More about FOSElasticaBundle, as usual, on GitHub: github.com/FriendsOfSymfony/FOSElasticaBundle

Using the bundle lets you avoid handling raw JSON entirely: no encoding and decoding with json_encode and json_decode, and no hand-written cURL requests. Only the OOP approach!
A bit about the SQL data schema

Since my products are stored in a relational DBMS, I needed to implement the EAV model for their properties and values (for more information: en.wikipedia.org/wiki/Entity%E2%80%93attribute%E2%80%93value_model).

As a result, I ended up with this data schema:
[image: data schema diagram]


Database dump: drive.google.com/file/d/0B30Ybwiexqx6S1hCanpISHVvcjQ/edit?usp=sharing
From it we will create the Doctrine entities and map them into ElasticSearch.

Mapping the EAV model in ElasticSearch

So, first install FOSElasticaBundle. In composer.json you need to specify:

"friendsofsymfony/elastica-bundle": "dev-master" 


Update the dependencies and register the installed bundle in AppKernel.php:

 new FOS\ElasticaBundle\FOSElasticaBundle() 


Now we set the following settings in config.yml:

    fos_elastica:
        clients:
            default: { host: localhost, port: 9200 }
        indexes:
            test:
                types:
                    product:
                        mappings:
                            name: ~
                            price: ~
                            category: ~
                            productsOptionValues:
                                type: "object"
                                properties:
                                    productOption:
                                        index: not_analyzed
                                    value:
                                        type: string
                                        index: not_analyzed
                        persistence:
                            driver: orm
                            model: Vendor\TestBundle\Entity\Product
                            provider: ~
                            listener:
                                immediate: ~
                            finder: ~


To fill the index defined above with data, run the console command php app/console fos:elastica:populate. FOSElasticaBundle will then populate the index with data from the database.

Note: the characteristics and their values are stored inside the product as embedded objects. For the filter below to work, specify type: "object" rather than type: "nested" for the productsOptionValues collection. With type: "object" the characteristics are indexed as flattened arrays, as described here: www.elasticsearch.org/guide/en/elasticsearch/guide/current/complex-core-fields.html#_arrays_of_inner_objects and plain term filters can match them directly. With type: "nested" the same filters would have to be wrapped in a nested filter, otherwise they find nothing. Also note that the filtered fields must not be analyzed, which is what the index: not_analyzed line is for. Otherwise, problems will arise when filtering strings that contain spaces.
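
To make the note above more concrete, here is a small plain-PHP model of the flattening that the linked guide page describes for arrays of inner objects. It is only an illustration, not ElasticSearch or Elastica code: flattenInnerObjects() is a hypothetical helper, and the sample characteristics are assumed values.

```php
<?php
// Conceptual model only: mimics how an array of inner objects is flattened
// into one list of values per field under type "object".
// flattenInnerObjects() is a hypothetical helper, not part of any library.

function flattenInnerObjects($field, array $objects)
{
    $flat = array();
    foreach ($objects as $object) {
        foreach ($object as $key => $value) {
            // Every inner object contributes its value to the shared per-field list.
            $flat[$field . '.' . $key][] = $value;
        }
    }
    return $flat;
}

// Sample characteristics of one product (assumed data).
$characteristics = array(
    array('productOption' => 'resolution', 'value' => '1600x900'),
    array('productOption' => 'weight', 'value' => '2.7 kg'),
);

print_r(flattenInnerObjects('productsOptionValues', $characteristics));
```

The result has one list under productsOptionValues.productOption and one under productsOptionValues.value, which is why a term filter on either field can match directly. The two lists are independent, so the link between an option name and its own value is not preserved: a filter pairing weight with 1600x900 would also match this product. If that matters for your data, the nested type with a nested filter is the usual alternative.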

Now you can see the list of products with their embedded characteristics at localhost:9200/test/product/_search?pretty. In my case, the server's response looks like this:
gist.github.com/ArFeRR/3976778079d64d5a72cd

Rendering the filter form


My form itself looks like this:


In the controller, we fetch all properties and products, read the filter array from the request, and pass all of this to the Twig template:

    // Load every option and product for the form and for the default listing.
    $options = $entityManager->getRepository("ParfumsTestBundle:ProductOption")->findAll();
    $products = $entityManager->getRepository("ParfumsTestBundle:Product")->findAll();

    // Values checked in the filter form (null if nothing was submitted).
    $request = $this->get('request');
    $filter = $request->query->get('filter');

    return $this->render('ParfumsTestBundle:Default:filter.html.twig', array(
        'options' => $options,
        'products' => $products,
        'filter' => $filter,
    ));


Here you should group by property name to avoid duplicates on the form, but to save space I do not do that. Write the DQL query in your entity / document repository yourself. The findAll query for products is needed to display the entire product list when nothing is selected in the filter.
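
As a rough idea of the grouping mentioned above, here is a minimal plain-PHP sketch. In a real project the grouping would more naturally live in a DQL query in the repository; this version works on sample rows instead. groupOptionValues() and the row layout (option, id, value) are assumptions made for the example, not part of the article's code.

```php
<?php
// groupOptionValues() is a hypothetical helper; the rows are sample data.
// Result shape: option name => (value id => value), one entry per option name.

function groupOptionValues(array $rows)
{
    $grouped = array();
    foreach ($rows as $row) {
        // Keyed by value id, so a repeated row does not produce a second checkbox.
        $grouped[$row['option']][$row['id']] = $row['value'];
    }
    return $grouped;
}

$rows = array(
    array('option' => 'resolution', 'id' => 1, 'value' => '1980x1020'),
    array('option' => 'weight', 'id' => 5, 'value' => '2.7 kg'),
    array('option' => 'resolution', 'id' => 2, 'value' => '1600x900'),
);

print_r(groupOptionValues($rows));
```

The resulting array has the same option name => (value id => value) shape as the filter array submitted by the form, so a template could iterate over it in the same way.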

And here is the Twig template itself:

    {% extends "TwigBundle::layout.html.twig" %}

    {% block body %}
        <h1></h1>
        <form>
            <ul>
            {% for option in options %}
                <li>
                    {{ option.name }}
                    <ul>
                    {% for value in option.productsOptionValues %}
                        <li>
                            <input type="checkbox" value="{{ value.value }}" name="filter[{{ option.name }}][{{ value.id }}]"
                                {% if filter[option.name][value.id] is defined %} checked="checked" {% endif %} />
                            {{ value.value }}
                        </li>
                    {% endfor %}
                    </ul>
                </li>
            {% endfor %}
            </ul>
            <input type="submit" />
        </form>
        <h1></h1>
        <table>
        {% for product in products %}
            <tr>
                <td>{{ product.name }}</td>
                <td>{{ product.price }}</td>
                <td>
                {% for option_value in product.productsOptionValues %}
                    {{ option_value.productOption }} : {{ option_value.value }} <br />
                {% endfor %}
                </td>
            </tr>
        {% endfor %}
        </table>
    {% endblock %}


Processing the filter form

Let's get to the fun part.
Now we need to construct a search query (or, more precisely, a JSON filter) that will be passed to ElasticSearch for processing. This is done with the Elastica library bundled with FOSElasticaBundle (more: elastica.io).
So, in your controller action, we process the filter array received from the form:

    if (!empty($filter)) {
        // Finder service ID: fos_elastica.finder.<index>.<type>; "test" is the index from config.yml.
        $finder = $this->container->get('fos_elastica.finder.test.product');

        // AND across the different options.
        $andOuter = new \Elastica\Filter\Bool();
        foreach ($filter as $optionKey => $arrValues) {
            // OR across the values selected for one option.
            $orOuter = new \Elastica\Filter\Bool();
            foreach ($arrValues as $value) {
                // AND of option name and value; field names must match the mapping.
                $andInner = new \Elastica\Filter\Bool();

                $optionKeyTerm = new \Elastica\Filter\Term();
                $optionKeyTerm->setTerm('productsOptionValues.productOption', $optionKey);

                $valueTerm = new \Elastica\Filter\Term();
                $valueTerm->setTerm('productsOptionValues.value', $value);

                $andInner->addMust($optionKeyTerm);
                $andInner->addMust($valueTerm);
                $orOuter->addShould($andInner);
            }
            $andOuter->addMust($orOuter);
        }

        $filtered = new \Elastica\Query\Filtered();
        $filtered->setFilter($andOuter);
        $products = $finder->find($filtered);
    }


Here I take the array passed in the query string and iterate over the filter values selected by the user to build a tree of filter objects. From this tree the Elastica library generates the JSON string that ElasticSearch uses to filter our data set:
gist.github.com/ArFeRR/97671e54515dfd7be012

This JSON roughly corresponds to the following condition in a relational database:
WHERE ((option = resolution AND value = 1980x1020) OR (option = resolution AND value = 1600x900)) AND (option = weight AND value = 2.7 kg)
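
The shape of that tree can also be sketched without Elastica, as nested PHP arrays that are then JSON-encoded. This is only an approximation for illustration: buildOptionFilter() is a hypothetical helper, the value ids in the sample input are made up, and the exact JSON that Elastica emits (see the gist above) may differ in detail.

```php
<?php
// Plain-PHP sketch (no Elastica): the same AND / OR / AND nesting as nested arrays.
// buildOptionFilter() is a hypothetical helper; the default field name is taken
// from the mapping in config.yml.

function buildOptionFilter(array $filter, $field = 'productsOptionValues')
{
    $must = array();
    foreach ($filter as $optionName => $values) {
        $should = array();
        foreach ($values as $value) {
            // Inner AND: option name and value must both match.
            $should[] = array('bool' => array('must' => array(
                array('term' => array($field . '.productOption' => $optionName)),
                array('term' => array($field . '.value' => $value)),
            )));
        }
        // OR across the values selected for one option.
        $must[] = array('bool' => array('should' => $should));
    }
    // AND across the different options.
    return array('bool' => array('must' => $must));
}

// Same shape as the form submits: option name => (value id => value).
// The ids 1, 2 and 5 are sample values.
$filter = array(
    'resolution' => array(1 => '1980x1020', 2 => '1600x900'),
    'weight' => array(5 => '2.7 kg'),
);

echo json_encode(buildOptionFilter($filter), JSON_PRETTY_PRINT), "\n";
```

Each selected option adds one entry to the outer must list, and each selected value adds one entry to that option's should list, which mirrors the WHERE condition above.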

As a result, we should get the products that have the selected weight and at least one of the two screen resolutions selected by the user. In my data set, that is only one item.



It seems to be working properly.

The filtering example above can be improved. The next steps would be sorting the results by relevance, paginating the output, and setting up aggregations (a particular implementation of facets in ES). I will write about that later if the Habr community is interested.

upd0:
At the request of readers, the filter form handler was rewritten to use the safe Symfony\Component\HttpFoundation\Request object. It should be injected into the action (passed as a parameter) or obtained from the service container via $request = $this->get('request') in the action.

Source: https://habr.com/ru/post/229905/

