
BaseX. Unknown NoSQL Universe



Far, far away, somewhere on the edge of the Galaxy, I found a very impressive NoSQL solution ...



Love, apathy, hatred, admiration, pride, anger, joy: these are the emotions I went through over a whole year. The more I studied this product, the stronger the feelings became.



The authors' marketing pitch goes like this:

BaseX is a very lightweight, high-performance and scalable XML database with an XPath / XQuery 3.0 processor that has full support for the W3C Update and Full Text specifications. An interactive and friendly graphical interface makes it easy to study your XML documents.


It sounds delicious, but reality, as always, hits where it hurts most.



What's inside



BaseX is an open-source product written in Java and distributed under the BSD license. Out of the box we get a GUI tool (management, analysis, code editing with syntax highlighting), a server, a client, and a web server. In fact we get even more, but first things first.


Storage



The BaseX team has developed its own storage engine, which has even been the subject of academic work. Although internally the data is stored in a "decomposed" form, this does not prevent BaseX from being a pure document-oriented database without a fixed schema. And the very notion of a database here is somewhat specific.



In BaseX, a database is a folder in which certain resources are stored. A resource can be either an ordinary file or an XML document. Files are stored in their original form in the file system, and XML documents are transformed into an internal representation. The database can store both XML documents and other resources simultaneously. This way of storing data makes it easy to transfer databases by simple copying.



The table, so familiar from the relational model, does not exist here at all. Instead there are XML documents and collections of them. Well, what else would you expect in an XML database? :)



Indices



BaseX can index XML structure, attributes, and text, and can even build a full-text index (for a limited number of languages). The fly in the ointment is that one type of index is static, i.e. updating the data invalidates the index, while the other, updatable type is slow. Insert performance can degrade by an order of magnitude.



A database can have only one type of index, static or updatable, and the choice cannot be changed afterwards. To switch, you have to export the data, create a database with the other index type, and load the data again.
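The trade-off between the two index types can be illustrated with a toy model in Python. This is only a sketch of the idea, not BaseX code: the classes `StaticIndexStore` and `UpdatableIndexStore` and their methods are hypothetical names made up for this example.

```python
from collections import defaultdict


class StaticIndexStore:
    """Toy store: the word index is built once and dropped on any update."""

    def __init__(self):
        self.docs = {}
        self.index = None  # None means "index is invalid or not built"

    def insert(self, doc_id, text):
        self.docs[doc_id] = text
        self.index = None  # cheap write, but the index is now invalid

    def optimize(self):
        # Full rebuild over all documents.
        index = defaultdict(set)
        for doc_id, text in self.docs.items():
            for word in text.split():
                index[word].add(doc_id)
        self.index = index

    def find(self, word):
        if self.index is None:
            # No usable index: fall back to scanning every document.
            return {d for d, t in self.docs.items() if word in t.split()}
        return set(self.index.get(word, set()))


class UpdatableIndexStore:
    """Toy store (insert-only for brevity): every insert maintains the index."""

    def __init__(self):
        self.docs = {}
        self.index = defaultdict(set)

    def insert(self, doc_id, text):
        self.docs[doc_id] = text
        for word in text.split():  # extra work on every write
            self.index[word].add(doc_id)

    def find(self, word):
        return set(self.index.get(word, set()))
```

In the static variant writes stay cheap but reads lose the index until it is rebuilt; in the updatable variant reads are always indexed, and every write pays for it.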



In general, indexing leaves much to be desired. Yes, it is a useful mechanism that can speed up read queries by several orders of magnitude, but its capabilities are very limited. Perhaps I'm just too used to relational database indexes.



Transactions



The kind of transaction everyone is used to from relational databases does not exist in BaseX. You cannot explicitly begin a transaction, perform several actions, and then commit. A transaction is a single server command or a single executed script. BaseX is not multi-version, and write transactions lock the database. Since version 7.6, lock management has been moved from the file system level to the server level, which significantly sped up query execution.
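As a rough illustration of this model (one script equals one transaction, and writers block everyone else), here is a minimal Python sketch. It is an assumption-laden toy, not how BaseX is implemented: the class `ToyDatabase` and its `run` method are hypothetical.

```python
import threading


class ToyDatabase:
    """Each call to run() is one transaction; there is no begin/commit API."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writing = False
        self.data = {}

    def run(self, script, updating=False):
        """Execute a whole script (a callable taking the data dict) as one transaction."""
        if updating:
            with self._cond:
                # A writer waits until nobody reads or writes, then blocks everyone.
                while self._writing or self._readers:
                    self._cond.wait()
                self._writing = True
            try:
                return script(self.data)
            finally:
                with self._cond:
                    self._writing = False
                    self._cond.notify_all()
        else:
            with self._cond:
                # Readers only wait for an active writer; they may overlap each other.
                while self._writing:
                    self._cond.wait()
                self._readers += 1
            try:
                return script(self.data)
            finally:
                with self._cond:
                    self._readers -= 1
                    self._cond.notify_all()
```

A call such as `db.run(lambda d: d.update(a=1), updating=True)` holds the whole database until the script finishes, which is why a heavy write load stalls every other query in this toy.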



From the above we can draw a very simple conclusion: BaseX does not like writes. An intensive write load increases query latency. But at reading it performs very, very well.



Backups



Backups work out of the box, but this is a brute-force solution rather than an elegant mechanism. As we already know, a database is an ordinary directory. BaseX simply archives it into a regular zip file. As simple as a brick. Everything works fine on small databases, but once there are several gigabytes of data, things get predictably sad. For offline solutions the time needed to create a backup is not a problem, but for a system that runs around the clock it can cause a short denial of service.



You can run the backup procedure either for a specific database or for all of them at once. Restoring is just as easy as creating a backup. Since there may be many zip files, you can restore from a specific backup. Stopping the server is not required for this.
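The "zip the folder" approach described above can be sketched in a few lines of Python. This is a simplified illustration under my own assumptions, not the BaseX implementation: the function names, the directory layout, and the timestamp format in the file name are all hypothetical.

```python
import shutil
from datetime import datetime
from pathlib import Path


def create_backup(data_dir, db_name, backup_dir):
    """Zip the whole database folder into <db_name>-<timestamp>.zip."""
    Path(backup_dir).mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y-%m-%d-%H-%M-%S")
    base = Path(backup_dir) / f"{db_name}-{stamp}"
    # make_archive appends ".zip" and returns the path of the archive.
    archive = shutil.make_archive(
        str(base), "zip", root_dir=str(Path(data_dir) / db_name)
    )
    return Path(archive)


def list_backups(db_name, backup_dir):
    """Archives whose name starts with <db_name>-, oldest first (timestamps sort as text)."""
    return sorted(Path(backup_dir).glob(f"{db_name}-*.zip"))


def restore_backup(archive, data_dir, db_name):
    """Replace the database folder with the contents of the chosen archive."""
    target = Path(data_dir) / db_name
    if target.exists():
        shutil.rmtree(target)
    shutil.unpack_archive(str(archive), str(target), "zip")
```

The cost is visible right in the sketch: every backup copies the entire folder, so the time grows with the size of the database rather than with the amount of changed data.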



Replication and Redo logs



Here things are sad for BaseX. I have repeatedly raised this issue with the project lead, Christian Grün, and he promised to consider adding this functionality in the near future, but so far the question remains open.



Triggers



Another sore point ...



Administration Features



They are quite simple. You can create users and grant them rights to four types of actions: read, write, create databases, and administer. It is not much, but it is hard to think of anything else. And again, since a database is a collection of files, additional "protection" can be organized at the file system level.
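A permission model like this is easy to picture as an ordered scale. The Python sketch below assumes that each level includes all the lower ones; the names `Perm` and `allowed` are hypothetical and exist only for this illustration.

```python
from enum import IntEnum


class Perm(IntEnum):
    NONE = 0    # no rights granted
    READ = 1
    WRITE = 2
    CREATE = 3  # create databases
    ADMIN = 4


def allowed(user_perm, required):
    """Assumption: a higher level includes every lower one."""
    return user_perm >= required
```

With this ordering, `allowed(Perm.WRITE, Perm.READ)` is true, while `allowed(Perm.READ, Perm.CREATE)` is false.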



Action Log



It exists too, but to be honest, I don't really like its format and structure. My main complaint is about the storage format. Some information can be extracted from it, but reconstructing the full picture of what happened is rather difficult.



Client-server architecture



BaseX is easy to start as a server. No special drivers are required, since communication with the server goes over network ports. To run several servers at the same time, you just need to assign them different ports. They can easily execute queries on one another; this takes only a few lines of code (or a copy from the documentation :)



The protocol for exchanging data with the server is well documented, so writing a client in any language is not a problem. Ready-made clients currently exist for the following languages and systems: C#, VB, Scala, Java, ActionScript, Perl, PHP, Python, Rebol, Ruby, Haskell, Lisp, node.js, Qt, and, of course, C.



XQuery



This, in my humble opinion, is the biggest plus of this product. For me personally, XQuery is much more attractive than SQL, and there are several reasons for this.



Simplicity


I won't compare XQuery and SQL directly, but to me the former is much more logical, consistent, and readable. By my second day I was able to write decent, reasonably complex queries in XQuery. True, two months later we rewrote them, but the fact remains.



Functions


XQuery supports functions, and you can either define them directly in the query code or import them as modules. In general, XQuery can be called a functional language. Functions let you keep the code as concise as possible.



Modularity


BaseX supports modularity at two levels: Java modules and xqm modules (written in XQuery). To draw an analogy, these are stored procedures in their purest form.



XPath


Another technology that helps you write clean and concise code. XPath is used both to build the initial selections and as an analogue of JOINs from the SQL world.
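Both uses can be shown on a tiny document. The sketch below uses the limited XPath subset of Python's standard `xml.etree.ElementTree` module rather than BaseX itself, and the sample data (`shop`, `customer`, `order`) is invented for the example.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<shop>
  <customer id="c1"><name>Ann</name></customer>
  <customer id="c2"><name>Bob</name></customer>
  <order customer="c1"><item>book</item></order>
  <order customer="c2"><item>pen</item></order>
  <order customer="c1"><item>lamp</item></order>
</shop>
""")

# Initial selection: a path expression picks the nodes to work with.
orders = doc.findall("./order")

# Join-like step: for each order, a predicate on @id finds the matching customer.
pairs = []
for order in orders:
    customer = doc.find(f"./customer[@id='{order.get('customer')}']")
    pairs.append((customer.findtext("name"), order.findtext("item")))

print(pairs)
```

In real XQuery the loop and the predicate would collapse into a single expression, but the principle is the same: the relationship between orders and customers is expressed as a path with a condition instead of a JOIN clause.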



Pre-installed modules



As I mentioned above, BaseX offers a lot of functionality right out of the box. These modules do not need to be imported separately; they are immediately available from the query language.



Condensed and without going into detail, the list of capabilities looks like this: a module for viewing system data (lists of users, sessions, logs), data archiving (zip), a client for connecting to other BaseX servers and executing queries remotely, a module for converting data formats, cryptography, working with CSV files, a module for managing databases (creating, optimizing, transferring, restoring), a module for requesting information by URI, working with files, a module for full-text analysis, working with hash functions, converting HTML documents to XML, working with HTTP requests, querying information about indexes and inspecting databases, a JSON parsing and serialization module, data mapping, math operations, data formatting, system function calls, profiling, managing the repository of XQM modules, a JDBC connector to databases, a module for working with streaming data, a module for unit testing, validation of documents against DTD and XSD, and XSL transformation. The full list is in the documentation.



Not bad for something as lightweight as BaseX!



Application area



After a year of studying all the pros and cons, I see several main areas of application for this product.





Conclusion



I would like to thank the BaseX team for an impressive product that has opened up a lot of new features and technologies for me!



I recommend taking a close look at this product, if not for use in real projects, then at least as a basis for learning XQuery.



Official site of the project: basex.org

Documentation: docs.basex.org

Download: from here

Source: https://habr.com/ru/post/201166/


