How to start working with Hibernate Search

Today, many teams build enterprise Java applications with Spring Boot. Sooner or later such projects need search features of varying complexity. For example, a system that stores data about users and books will eventually need to search users by first or last name and books by title or annotation.

In this post I will briefly cover the tools that help with such tasks. Then I will present a demo search service that implements a more interesting and complex feature: keeping entities, the database, and the search index in sync. The demo project is a way to get acquainted with Hibernate Search, a convenient way to work with the full-text indexes of Solr, Lucene, and ElasticSearch.

Among the tools for deploying search engines, I would highlight three.
Lucene is a Java library that provides a low-level interface to a denormalized database with full-text search. With it, you can create indexes and fill them with records (documents). More details are available in the Lucene documentation.

Solr is a finished product built on Lucene: a full-text database that runs as a standalone web server. It exposes an HTTP interface for indexing documents and running full-text queries. Solr has a simple API and a built-in UI, which saves the user from manipulating indexes by hand. Habr has published a good comparative analysis of Solr and Lucene.

ElasticSearch is a more modern counterpart to Solr, also based on Apache Lucene. Compared to Solr, ElasticSearch can sustain higher document indexing loads and can therefore be used to index log files. Detailed tables comparing Solr and ElasticSearch can be found online.

This is, of course, not a complete list; I picked only the systems that deserve the most attention. There are many other search systems: PostgreSQL has full-text search capabilities, and Sphinx should not be forgotten either.
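To make the idea of a full-text index more concrete, here is a toy inverted index in plain Kotlin. It is only an illustration of the principle and not how Lucene is implemented; the class name TinyIndex and its methods are made up for this sketch, and it assumes Kotlin 1.5 or newer for lowercase().

```kotlin
// Toy inverted index: maps each lowercased token to the ids of the documents containing it.
// This only illustrates the idea; real engines such as Lucene are far more elaborate.
class TinyIndex {
    private val postings = mutableMapOf<String, MutableSet<Long>>()

    // Split text into lowercase word tokens (a very rough stand-in for an analyzer).
    private fun tokenize(text: String): List<String> =
        text.lowercase().split(Regex("[^\\p{L}\\p{Nd}]+")).filter { it.isNotEmpty() }

    // Add a document: every token in the text points back to the document id.
    fun add(id: Long, text: String) {
        for (token in tokenize(text)) {
            postings.getOrPut(token) { mutableSetOf() }.add(id)
        }
    }

    // Return the ids of documents that contain every token of the query.
    fun search(query: String): Set<Long> {
        val tokens = tokenize(query)
        if (tokens.isEmpty()) return emptySet()
        return tokens
            .map { postings[it] ?: emptySet<Long>() }
            .reduce { acc, ids -> acc intersect ids }
    }
}
```

Adding two documents such as "Ivan Petrov" and "Ivan Sidorov" and then searching for "ivan" returns both ids, while "Ivan Petrov" returns only the first one.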

Main problem

Now to the main point. A relational database (RDB) is typically used for reliable, consistent data storage and provides transactions in accordance with the ACID principles. A search engine, in turn, works with an index, to which you add the entities and the table fields that should be searchable. That is, when a new object enters the system, it must be saved both in the relational database and in the full-text index.

If these changes are not made transactionally within your application, various kinds of desynchronization may occur. For example, you select an object from the database, but it has no entry in the index. Or vice versa: the index contains a record for an object that has already been deleted from the RDB.

This problem can be solved in different ways. You can organize the transactional changes manually using JTA and Spring Transaction Management. Or you can take a more interesting route and use Hibernate Search, which does all of this for you. By default it uses Lucene, which stores the index data on the file system; in general, the connection to the index is configurable. At system startup you call the startAndWait() synchronization method, and from then on records are stored in both the RDB and the index during operation.

To illustrate this solution, I prepared a demo project with Hibernate Search. We will create a service with methods for reading, updating, and searching for users. It can form the basis of an internal database with full-text search by first name, last name, or other metadata. To interact with the relational database, we use the Spring Data JPA framework.
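The first kind of desynchronization can be reproduced with a small simulation. This is a sketch with in-memory stand-ins, not real database or index code: FakeDatabase, FakeIndex, naiveSave, and missingFromIndex are all hypothetical names introduced for illustration.

```kotlin
// In-memory stand-ins for the relational database and the full-text index (both hypothetical).
class FakeDatabase {
    val rows = mutableMapOf<Long, String>()
}

class FakeIndex(var available: Boolean = true) {
    val documents = mutableMapOf<Long, String>()

    fun add(id: Long, text: String) {
        // Simulate an index outage by throwing IllegalStateException.
        check(available) { "index is unavailable" }
        documents[id] = text
    }
}

// Naive dual write: the two stores are updated one after the other with no shared transaction.
fun naiveSave(db: FakeDatabase, index: FakeIndex, id: Long, name: String) {
    db.rows[id] = name      // step 1 succeeds
    index.add(id, name)     // step 2 may throw, leaving the row without an index entry
}

// Ids that exist in the database but are missing from the index.
fun missingFromIndex(db: FakeDatabase, index: FakeIndex): Set<Long> =
    db.rows.keys - index.documents.keys
```

If the index becomes unavailable between two saves, the second call to naiveSave throws after the row has already been written, and missingFromIndex reports that row's id. Nothing rolls the database write back, which is exactly the gap a transactional approach has to close.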

Let's start with the entity class to represent the user:

import org.hibernate.search.annotations.Field
import org.hibernate.search.annotations.Indexed
import javax.persistence.Entity
import javax.persistence.Id
import javax.persistence.Table

@Entity
@Table(name = "users")
@Indexed
internal data class User(
        @Id
        val id: Long,
        @Field
        val name: String,
        @Field
        val surname: String,
        @Field
        val phoneNumber: String)

Everything is standard: we mark the entity with all the annotations Spring Data needs. @Entity declares the entity, and @Table specifies the table in the database. The @Indexed annotation indicates that the entity is indexed and will be added to the full-text index, and @Field marks the properties to be indexed.

A JPA repository is required for CRUD operations on users in the database:

 internal interface UserRepository: JpaRepository<User, Long>

The service for working with users, UserService.kt:

import org.springframework.stereotype.Service
import javax.transaction.Transactional

@Service
@Transactional
internal class UserService(private val userRepository: UserRepository,
                           private val userSearch: UserSearch) {

    fun findAll(): List<User> {
        return userRepository.findAll()
    }

    fun search(text: String): List<User> {
        return userSearch.searchUsers(text)
    }

    fun saveUser(user: User): User {
        return userRepository.save(user)
    }
}

findAll gets all users directly from the database. search uses the userSearch component to retrieve users from the index. The component for working with the user search index:

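import org.hibernate.search.jpa.FullTextQuery
import org.springframework.stereotype.Repository
import javax.persistence.EntityManager
import javax.persistence.PersistenceContext
import javax.transaction.Transactional

@Repository
@Transactional
internal class UserSearch(@PersistenceContext val entityManager: EntityManager) {

    fun searchUsers(text: String): List<User> {
        // Get a fullTextEntityManager by wrapping the regular entityManager
        val fullTextEntityManager = org.hibernate.search.jpa.Search.getFullTextEntityManager(entityManager)

        // Create a query builder using the Hibernate Search query DSL
        val queryBuilder = fullTextEntityManager.searchFactory
                .buildQueryBuilder().forEntity(User::class.java).get()

        // Build a keyword query that matches the text against the "name" field
        val query = queryBuilder
                .keyword()
                .onFields("name")
                .matching(text)
                .createQuery()

        // Wrap the Lucene Query in a Hibernate Query object
        val jpaQuery: FullTextQuery = fullTextEntityManager.createFullTextQuery(query, User::class.java)

        // Execute the search and return the results as a list of users
        return jpaQuery.resultList.map { result -> result as User }.toList()
    }
}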
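The query above matches only the name field. A possible variation, sketched here under the assumption that the Hibernate Search 5.x query DSL from the build file is used, searches both name and surname and tolerates a one-character typo. The function name nameOrSurnameQuery is hypothetical and is not part of the demo project.

```kotlin
import org.apache.lucene.search.Query
import org.hibernate.search.query.dsl.QueryBuilder

// Builds a fuzzy keyword query over both name fields.
// The function name is made up for this sketch; the DSL calls are from Hibernate Search 5.x.
fun nameOrSurnameQuery(queryBuilder: QueryBuilder, text: String): Query =
    queryBuilder
        .keyword()
        .fuzzy()                        // tolerate small typos
        .withEditDistanceUpTo(1)        // at most one edit of difference
        .onFields("name", "surname")    // both properties carry @Field in User
        .matching(text)
        .createQuery()
```

The returned Lucene query can be passed to createFullTextQuery in place of the single-field query built in searchUsers.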

REST controller, UserController.kt:

import org.springframework.web.bind.annotation.GetMapping
import org.springframework.web.bind.annotation.PostMapping
import org.springframework.web.bind.annotation.RequestBody
import org.springframework.web.bind.annotation.RestController

@RestController
internal class UserController(private val userService: UserService) {

    @GetMapping("/users")
    fun getAll(): List<User> {
        return userService.findAll()
    }

    @GetMapping("/users/search")
    fun search(text: String): List<User> {
        return userService.search(text)
    }

    @PostMapping("/users")
    fun insertUser(@RequestBody user: User): User {
        return userService.saveUser(user)
    }
}

We use two read methods: one extracts all users from the database, the other searches the index by a text string.

Before the application can serve searches, the index must be initialized. We do this with an ApplicationListener:

package ru.rti

import org.hibernate.search.jpa.Search
import org.springframework.boot.context.event.ApplicationReadyEvent
import org.springframework.context.ApplicationListener
import org.springframework.stereotype.Component
import javax.persistence.EntityManager
import javax.persistence.PersistenceContext
import javax.transaction.Transactional

@Component
@Transactional
class BuildSearchService(
        @PersistenceContext val entityManager: EntityManager
) : ApplicationListener<ApplicationReadyEvent> {

    override fun onApplicationEvent(event: ApplicationReadyEvent?) {
        try {
            val fullTextEntityManager = Search.getFullTextEntityManager(entityManager)
            fullTextEntityManager.createIndexer().startAndWait()
        } catch (e: InterruptedException) {
            println("An error occurred trying to build the search index: " + e.toString())
        }
    }
}

PostgreSQL was used for testing:

spring.datasource.url=jdbc:postgresql:users
spring.datasource.username=postgres
spring.datasource.password=postgres
spring.datasource.driver-class-name=org.postgresql.Driver
spring.datasource.name=users
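The location of the Lucene index on the file system can be set in the same properties file. This is a minimal sketch, assuming the Hibernate Search 5.x property names are passed to Hibernate through Spring Boot's spring.jpa.properties prefix; the directory path is a placeholder and is not taken from the demo project.

```properties
# Keep the Lucene index on the local file system (Hibernate Search 5.x property names).
spring.jpa.properties.hibernate.search.default.directory_provider=filesystem
# Base directory for the index files; this path is only a placeholder.
spring.jpa.properties.hibernate.search.default.indexBase=./data/lucene-index
```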

Finally, build.gradle :

buildscript {
    ext.kotlin_version = '1.2.61'
    ext.spring_boot_version = '1.5.15.RELEASE'
    repositories {
        jcenter()
    }
    dependencies {
        classpath "org.jetbrains.kotlin:kotlin-gradle-plugin:$kotlin_version"
        classpath "org.jetbrains.kotlin:kotlin-allopen:$kotlin_version"
        classpath "org.springframework.boot:spring-boot-gradle-plugin:$spring_boot_version"
        classpath "org.jetbrains.kotlin:kotlin-noarg:$kotlin_version"
    }
}

apply plugin: 'kotlin'
apply plugin: "kotlin-spring"
apply plugin: "kotlin-jpa"
apply plugin: 'org.springframework.boot'

noArg {
    invokeInitializers = true
}

jar {
    baseName = 'gs-rest-service'
    version = '0.1.0'
}

repositories {
    jcenter()
}

dependencies {
    compile "org.jetbrains.kotlin:kotlin-stdlib:$kotlin_version"
    compile 'org.springframework.boot:spring-boot-starter-web'
    compile 'org.springframework.boot:spring-boot-starter-data-jpa'
    compile group: 'postgresql', name: 'postgresql', version: '9.1-901.jdbc4'
    compile group: 'org.hibernate', name: 'hibernate-core', version: '5.3.6.Final'
    compile group: 'org.hibernate', name: 'hibernate-search-orm', version: '5.10.3.Final'
    compile group: 'com.h2database', name: 'h2', version: '1.3.148'
    testCompile('org.springframework.boot:spring-boot-starter-test')
}

This demo is a simple example of using Hibernate Search that shows how to make Apache Lucene and Spring Data JPA work together. If necessary, projects based on this demo can be connected to Apache Solr or ElasticSearch. A possible direction for further development is searching large indexes (> 10 GB) and measuring their performance. You can also create configurations for ElasticSearch or more complex index configurations to explore Hibernate Search at a deeper level.

Source: https://habr.com/ru/post/428578/