Dive into BerkeleyDB JE. Introduction to Collections API

Introduction

A few words about the subject. BerkeleyDB is a high-performance embedded DBMS shipped as a library for various programming languages. It stores key-value pairs and also supports assigning several values to a single key. BerkeleyDB supports multithreading, replication, and more. This article focuses primarily on using the library, which was originally released by Sleepycat Software back in the 1990s.

In the previous article we looked at the main aspects of working with the Direct Persistence Layer (DPL) API, which lets you work with Berkeley much like a relational database. This time we turn to the Collections API, which exposes the database through adapters for the familiar Java Collections interfaces.

Note: All examples in this article are written in Kotlin.

Some general information

We all know that combining annotations, deferred initialization, and nullable types in Kotlin is a real pain. By its nature, DPL does not let you avoid these problems; the only loophole is to write your own implementation of EntityModel, the mechanism that determines how beans are handled. For me personally, the main incentive to use the Collections API is the ability to work with completely clean, familiar data-class beans. Let's see how to port the code from the previous article to this framework.

Entity description

To describe any object in the database that we treat as a separate entity, we need three classes: the entity itself, which we work with outside the database context, plus its key and its data.

There are no requirements for the entity class, while the key and the data (in the standard mode of operation covered by this article) must implement the Serializable interface. Everything is standard here: if we want in-memory-only fields, we mark them with the @Transient annotation. Anything not marked as @Transient will be serialized.
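As a quick illustration of the @Transient rule, the sketch below round-trips a small class through plain Java serialization, the same standard mechanism that the serial bindings build on. The Session class, its fields, and the roundTrip helper are hypothetical and exist only for this example; they are not part of the article's model.

```kotlin
import java.io.ByteArrayInputStream
import java.io.ByteArrayOutputStream
import java.io.ObjectInputStream
import java.io.ObjectOutputStream
import java.io.Serializable

// Hypothetical class for illustration: `token` is an in-memory-only field.
data class Session(
    val userId: String,
    @Transient val token: String? = null
) : Serializable

// Serializes the value into a byte array and reads it back.
fun <T : Serializable> roundTrip(value: T): T {
    val buffer = ByteArrayOutputStream()
    ObjectOutputStream(buffer).use { it.writeObject(value) }
    @Suppress("UNCHECKED_CAST")
    return ObjectInputStream(ByteArrayInputStream(buffer.toByteArray()))
        .use { it.readObject() as T }
}

fun main() {
    val restored = roundTrip(Session(userId = "u1", token = "secret"))
    println(restored.userId) // the regular field survives the round trip
    println(restored.token)  // the @Transient field is not stored and comes back as null
}
```

Note that the constructor default is not reapplied on deserialization, so a @Transient field should be nullable or re-initialized by hand after reading.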

As we remember, to get the records of a selection in order, the key has to implement the Comparable interface. The same principle applies here: the selection is sorted by key.

Bean Description Example

    import java.io.Serializable

    // Entity used outside the database context
    data class CustomerDBO(
        val id: String,
        val email: String,
        val country: String,
        val city: String,
        var balance: Long
    )

    // Key stored in the database
    data class CustomerKey(
        val id: String
    ): Serializable

    // Data stored in the database
    data class CustomerData(
        val email: String,
        val country: String,
        val city: String,
        val balance: Long
    ): Serializable

Entity operations

Setting up a connection with the Collections API takes a bit more effort. To begin with, let's consider how it works in the most common case, N:1.

We’ll omit the standard steps for creating EnvironmentConfig, since they are no different from the configuration for DPL. The differences begin right after that.

For each entity we need to create a separate database with a unique name. In addition, we need one more database that stores information about the entity classes in this Environment, and we wrap it in a ClassCatalog. You could say that databases in Berkeley play roughly the same role as tables in SQL. An example follows.

Creating a database for the entity and the catalog

    private val databaseConfig by lazy {
        DatabaseConfig().apply {
            transactional = true
            allowCreate = true
        }
    }

    private val catalog by lazy {
        StoredClassCatalog(environment.openDatabase(null, STORAGE_CLASS_CATALOG, databaseConfig))
    }

    val customersDatabase by lazy {
        environment.openDatabase(null, STORAGE_CUSTOMERS, databaseConfig)
    }

Next, we need a convenient point of contact with the framework, since Database itself has a very low-level API. The StoredSortedMap and StoredValueSet adapters fill this role. It is most convenient to use the first as an immutable (read-only) point of contact with the database and the second as a mutable one.

Collection adapters

    // Read-only view: the last argument (writeAllowed) is false
    private val view = StoredSortedMap<CustomerKey, CustomerData>(
        customersDatabase,
        customerKeyBinding,
        customerDataBinding,
        false
    )

    // Mutable accessor: writeAllowed is true
    private val accessor = StoredValueSet<CustomerDBO>(
        customersDatabase,
        customerBinding,
        true
    )

You may notice that at this point Berkeley does not know how the mappings (key, data) -> dbo and dbo -> (key, data) are performed. For the mapping to work, each bean needs one more mechanism: a binding. The interface is extremely simple: two methods that map an entity to its data and to its key, and one that maps a key and data back to the entity.

Binding example

    class CustomerBinding(
        catalog: ClassCatalog
    ): SerialSerialBinding<CustomerKey, CustomerData, CustomerDBO>(catalog, CustomerKey::class.java, CustomerData::class.java) {

        // (key, data) -> entity
        override fun entryToObject(key: CustomerKey, data: CustomerData): CustomerDBO = CustomerDBO(
            id = key.id,
            email = data.email,
            country = data.country,
            city = data.city,
            balance = data.balance
        )

        // entity -> data
        override fun objectToData(dbo: CustomerDBO): CustomerData = CustomerData(
            email = dbo.email,
            country = dbo.country,
            city = dbo.city,
            balance = dbo.balance
        )

        // entity -> key
        override fun objectToKey(dbo: CustomerDBO): CustomerKey = CustomerKey(
            id = dbo.id
        )
    }

Now we can safely use working collections that stay in sync automatically as the data in the database changes. The best part is the ability to read and write concurrently from different threads without trouble. This is possible primarily because an iterator over these collections is a copy of the current state and does not change when the collection changes, while the collections themselves are mutable. Thus, the only thing the programmer has to keep an eye on is how current the data is.
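To show how the pieces fit together, here is a minimal end-to-end sketch: it opens an environment, creates the catalog and a customers database, builds both adapters, writes one entity inside a transaction, and reads it back through the read-only map. It makes several assumptions that are not from the article: the beans are trimmed to fewer fields, the environment lives in a throwaway temporary directory, the database names are made up, and storeAndRead is a hypothetical helper. It also assumes BerkeleyDB JE is on the classpath.

```kotlin
import com.sleepycat.bind.serial.SerialBinding
import com.sleepycat.bind.serial.SerialSerialBinding
import com.sleepycat.bind.serial.StoredClassCatalog
import com.sleepycat.collections.StoredSortedMap
import com.sleepycat.collections.StoredValueSet
import com.sleepycat.collections.TransactionRunner
import com.sleepycat.collections.TransactionWorker
import com.sleepycat.je.DatabaseConfig
import com.sleepycat.je.Environment
import com.sleepycat.je.EnvironmentConfig
import java.io.Serializable
import java.nio.file.Files

// Simplified beans (fewer fields than in the article) to keep the sketch short.
data class CustomerDBO(val id: String, val email: String, val balance: Long)
data class CustomerKey(val id: String) : Serializable
data class CustomerData(val email: String, val balance: Long) : Serializable

// Hypothetical helper: stores one customer and reads its data back.
fun storeAndRead(): CustomerData? {
    // Assumption: a temporary directory serves as the environment home.
    val home = Files.createTempDirectory("je-collections-demo").toFile()
    val environment = Environment(home, EnvironmentConfig().apply {
        setAllowCreate(true)
        setTransactional(true)
    })
    val dbConfig = DatabaseConfig().apply {
        setAllowCreate(true)
        setTransactional(true)
    }
    // Database names are illustrative.
    val catalog = StoredClassCatalog(environment.openDatabase(null, "class_catalog", dbConfig))
    val customersDb = environment.openDatabase(null, "customers", dbConfig)
    try {
        val keyBinding = SerialBinding(catalog, CustomerKey::class.java)
        val dataBinding = SerialBinding(catalog, CustomerData::class.java)
        val entityBinding = object : SerialSerialBinding<CustomerKey, CustomerData, CustomerDBO>(
            catalog, CustomerKey::class.java, CustomerData::class.java
        ) {
            override fun entryToObject(key: CustomerKey, data: CustomerData) =
                CustomerDBO(key.id, data.email, data.balance)
            override fun objectToKey(dbo: CustomerDBO) = CustomerKey(dbo.id)
            override fun objectToData(dbo: CustomerDBO) = CustomerData(dbo.email, dbo.balance)
        }

        // Read-only map of key -> data and a writable set of whole entities.
        val view = StoredSortedMap<CustomerKey, CustomerData>(customersDb, keyBinding, dataBinding, false)
        val accessor = StoredValueSet<CustomerDBO>(customersDb, entityBinding, true)

        // The write runs inside a transaction managed by TransactionRunner.
        TransactionRunner(environment).run(TransactionWorker {
            accessor.add(CustomerDBO("42", "a@example.com", 100L))
        })

        // The entity written through the set is visible through the map.
        return view[CustomerKey("42")]
    } finally {
        customersDb.close()
        catalog.close()
        environment.close()
    }
}

fun main() {
    println(storeAndRead())
}
```

TransactionRunner is used here only to group the write explicitly; for a transactional database, single collection operations issued outside a transaction are auto-committed.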

Well, that covers the usual CRUD; on to relationships!

Relationships between entities

To work with relationships, we additionally need to create a SecondaryDatabase, which provides access to some entities by the key of others. An important note: sortedDuplicates must be set to true in the DatabaseConfig (here, its SecondaryConfig subclass) if the relationship is not 1:1 or 1:M. This is quite logical, since indexing is done by the foreign key and several entities will correspond to one key.

An example of a secondary database with a configuration

    val ordersByCustomerIdDatabase by lazy {
        environment.openSecondaryDatabase(null, STORAGE_ORDERS_BY_CUSTOMER_ID, ordersDatabase, SecondaryConfig().apply {
            transactional = true
            allowCreate = true
            sortedDuplicates = true
            keyCreator = OrderByCustomerKeyCreator(catalog = catalog)
            foreignKeyDatabase = customersDatabase
            foreignKeyDeleteAction = ForeignKeyDeleteAction.CASCADE
        })
    }

Notably, the foreign key does not have to be the field that defines the link; it can be any arbitrary data type. The key is created by an implementation of the SecondaryKeyCreator or SecondaryMultiKeyCreator interface (there are also more specialized options, but implementing one of these two is enough).

SecondaryKeyCreator Example

    class OrderByCustomerKeyCreator(
        catalog: ClassCatalog
    ): SerialSerialKeyCreator<OrderKey, OrderData, CustomerKey>(catalog, OrderKey::class.java, OrderData::class.java, CustomerKey::class.java) {

        // Derives the secondary (foreign) key from the order's data
        override fun createSecondaryKey(key: OrderKey, data: OrderData): CustomerKey = CustomerKey(
            id = data.customerId
        )
    }

Only a little remains: creating a collection for fetching selections by our foreign keys. The code does not differ fundamentally from creating a collection for a non-secondary database.

Creating a collection for an N:1 selection

    private val byCustomerKeyView = StoredSortedMap<CustomerKey, OrderData>(
        database.ordersByCustomerIdDatabase,
        database.customerKeyBinding,
        database.orderDataBinding,
        false
    )
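One way to query such a map is the duplicates method of StoredMap, which returns all values stored under a given secondary key. The sketch below is self-contained and rests on several assumptions that are not from the article: the beans are trimmed down, the database names and the ordersOfCustomer helper are hypothetical, the environment lives in a temporary directory, and the foreign key constraint on the customers database is left out so that orders can be inserted without creating customers first. It also assumes BerkeleyDB JE is on the classpath.

```kotlin
import com.sleepycat.bind.serial.SerialBinding
import com.sleepycat.bind.serial.SerialSerialKeyCreator
import com.sleepycat.bind.serial.StoredClassCatalog
import com.sleepycat.collections.StoredSortedMap
import com.sleepycat.je.DatabaseConfig
import com.sleepycat.je.Environment
import com.sleepycat.je.EnvironmentConfig
import com.sleepycat.je.SecondaryConfig
import java.io.Serializable
import java.nio.file.Files

// Simplified beans for the sketch.
data class CustomerKey(val id: String) : Serializable
data class OrderKey(val id: String) : Serializable
data class OrderData(val customerId: String, val total: Long) : Serializable

// Hypothetical helper: stores three orders and returns those of one customer.
fun ordersOfCustomer(customerId: String): List<OrderData> {
    val home = Files.createTempDirectory("je-secondary-demo").toFile()
    val environment = Environment(home, EnvironmentConfig().apply {
        setAllowCreate(true)
        setTransactional(true)
    })
    val dbConfig = DatabaseConfig().apply {
        setAllowCreate(true)
        setTransactional(true)
    }
    val catalog = StoredClassCatalog(environment.openDatabase(null, "class_catalog", dbConfig))
    val ordersDb = environment.openDatabase(null, "orders", dbConfig)

    // Derives the secondary key (the customer) from each order's data.
    val keyCreator = object : SerialSerialKeyCreator<OrderKey, OrderData, CustomerKey>(
        catalog, OrderKey::class.java, OrderData::class.java, CustomerKey::class.java
    ) {
        override fun createSecondaryKey(key: OrderKey, data: OrderData) =
            CustomerKey(data.customerId)
    }
    // sortedDuplicates lets many orders share one customer key (N:1).
    val byCustomerDb = environment.openSecondaryDatabase(
        null, "orders_by_customer", ordersDb,
        SecondaryConfig().apply {
            setAllowCreate(true)
            setTransactional(true)
            setSortedDuplicates(true)
            setKeyCreator(keyCreator)
        }
    )
    try {
        val orderKeyBinding = SerialBinding(catalog, OrderKey::class.java)
        val orderDataBinding = SerialBinding(catalog, OrderData::class.java)
        val customerKeyBinding = SerialBinding(catalog, CustomerKey::class.java)

        // Writable primary map and read-only map over the secondary index.
        val orders = StoredSortedMap<OrderKey, OrderData>(
            ordersDb, orderKeyBinding, orderDataBinding, true
        )
        val byCustomer = StoredSortedMap<CustomerKey, OrderData>(
            byCustomerDb, customerKeyBinding, orderDataBinding, false
        )

        // Writes to the primary database update the secondary index as well.
        orders[OrderKey("o1")] = OrderData("42", 10L)
        orders[OrderKey("o2")] = OrderData("42", 25L)
        orders[OrderKey("o3")] = OrderData("7", 5L)

        // Copy the result into a plain list before the databases are closed.
        return byCustomer.duplicates(CustomerKey(customerId)).toList()
    } finally {
        byCustomerDb.close()
        ordersDb.close()
        catalog.close()
        environment.close()
    }
}

fun main() {
    println(ordersOfCustomer("42"))
}
```

The collection returned by duplicates is backed by the database, which is why the sketch copies it with toList() before closing anything.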

Instead of a conclusion

This article is the last part of the series introducing the basics of working with BerkeleyDB. After reading this and the previous article, the reader can use the DBMS in their own projects as local storage, for example in a client application. Future articles will cover more interesting aspects: migration, replication, and some interesting configuration parameters.

As usual, if anyone has additions or corrections to my mistakes, you are welcome in the comments. I am always glad to receive constructive criticism!

Source: https://habr.com/ru/post/337024/
