
Databene Benerator - Benefit from it.

“Get a dictionary and find out what ‘catharsis’ is. If that is what he plans to hit us with, I want to know what it is.” (c) Analyze This

Introduction


Late one evening, when the design of a 64-table database was almost complete and the interface for filling the tables had not even been started, the question arose: how do I fill them with data?
Filling them in by hand was an idea I discarded immediately.
“We need to code something up!” cried the soul.
“We need to download something!” insisted the mind.
In the end I searched the Internet and went through about a dozen different solutions, installable and SaaS, paid and free, until I found it: databene-benerator, a generator of related data (fixtures) for databases. I also found an article in Russian describing its features and syntax (1), and the same article in English (2). I realized this was exactly what I needed! But where do I get it? How do I use it under Windows? Conveniently? With support for Russian characters?

So, “catharsis” (3) is a concept from ancient philosophy: a term for the process and result of the relieving, purifying and ennobling effect that various factors have on a person.
How does this relate to the topic of this post? You will understand once you have read it. Read on!

Creating a project in Eclipse


What the aforementioned articles described about the generator did not quite suit me:
  1. I work on Windows.
  2. I love a GUI (it is a weakness, like kittens... well, you understand).
  3. I can work with MySQL, but not with PostgreSQL.
  4. I also need data in Russian.

If at least one of these points applies to you, then you and I are taking a different road, or rather a different entrance to the same road!

First we need to get Benerator itself. To do that, fill in the form at:
bergmann-it.de/download/download_benerator?lang=en
and click "Download".

At the time of publication version 0.9.8 is available; I used 0.9.7. You will probably not notice the difference, since the most recent manual I could find (4) is for version 0.8.1.

I stumbled upon it quite by accident, while comparing the version of the manual linked on the site (http://databene.org/download/databene-benerator-manual-0.7.6.pdf) with the version of the generator. I started trying other version numbers in the manual's address, and what a surprise it was to find 0.8.1! Further searching turned up nothing newer.

So, you did it! In your hands, or rather at your fingertips, is the archive “databene-benerator-0.9.7” (yours will be newer). Now, what do we do with it?

Unpack it to “D:\databene-benerator-0.9.7”.

What follows is pure shamanism. Maven is mentioned on the forums; I do not know what kind of beast that is, but I can say that everything works without it!
A quick look inside the archive showed me what it contains. There are batch files (and sh scripts too, by the way) that launch something: benerator.bat calls benerator_common.bat, which in turn runs java.exe. The first takes benerator.xml as a parameter; the second receives the path to the lib folder, which holds the *.jar files.
At that point I had only ever tried two IDEs for Java development, NetBeans and Eclipse. I asked Google about “databene Benerator eclipse” and got the answer “databene.org/databene-benerator/115-my-first-ide-based-benerator-project.html”, a page that nothing on the official website links to!


Now we need Eclipse; download and unpack it if you do not have it yet. Any edition will do. I know a little PHP, so you can guess my choice. By the way, the PHP window layout in Eclipse (the so-called perspective) is the more convenient one for working with the generator (you can switch perspectives in the upper right corner).
So, start Eclipse and create a project:
Choose “File -> New -> Project...”.

Then select “Java Project” and click “Next >”.
In the window that appears, enter the project name “generatedb”, set the project layout to “Use project folder”, and click “Next >”.
Switch to the “Libraries” tab and click “Add External JARs...”. In the dialog that opens, go to “D:\databene-benerator-0.9.7\lib” and select all the files there.
Click “Finish”. The project is created.
However, to start Benerator you still need to configure a launcher!
Choose “Run -> Run Configurations...”.

In the window that appears:
1. Right-click “Java Application” and select “New”.
2. In “Name”, enter a name for the run configuration.
3. Leave “Project” unchanged.
4. In “Main class”, enter “org.databene.benerator.main.Benerator”.
5. Click “Apply”.
If you click “Run” right now, the “Console” tab will show a long stream of assorted complaints, none of them in Russian. That is because we have not done the most important thing yet. So what are we waiting for?

Project structure


It is time to add the files “benerator.xml” and “log4j.xml” to the project; their absence is what Benerator was complaining about.
Right-click the project in the project browser, select “New -> XML File”, enter the file name, then click “Finish”.
benerator.xml is the main project file; it describes everything you do with your tables.
log4j.xml is the configuration file of the logging tool; its settings determine what service information Benerator prints to the console, and how much of it.

Bring the contents of “log4j.xml” to the following form:
log4j.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">
<log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/" debug="false">

    <!-- Append messages to the console -->
    <appender name="CONSOLE" class="org.apache.log4j.ConsoleAppender">
        <param name="Target" value="System.out"/>
        <param name="Threshold" value="debug"/>
        <layout class="org.apache.log4j.PatternLayout">
            <param name="ConversionPattern" value="%d{ABSOLUTE} %-5p (%t) [%c{1}] %m%n"/>
        </layout>
    </appender>

    <!-- Limit categories -->
    <category name="org.apache">
        <priority value="warn"/>
    </category>
    <category name="org.databene">
        <priority value="info"/>
    </category>
    <!--
    <category name="org.databene.commons">
        <priority value="debug"/>
    </category>
    -->
    <category name="org.databene.COMMENT">
        <priority value="debug"/>
    </category>
    <category name="org.databene.benerator.STATE">
        <priority value="info"/>
    </category>
    <category name="org.databene.domain">
        <priority value="info"/>
    </category>
    <category name="org.databene.SQL">
        <priority value="debug"/>
    </category>

    <!-- ======================= -->
    <!-- Setup the Root category -->
    <!-- ======================= -->
    <root>
        <priority value="info"/>
        <appender-ref ref="CONSOLE"/>
    </root>
</log4j:configuration>



Bring the contents of “benerator.xml” to the following form:
benerator.xml
<?xml version="1.0" encoding="UTF-8"?>
<setup xmlns="http://databene.org/benerator/0.9.7"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://databene.org/benerator/0.9.7 benerator-0.9.7.xsd"
       defaultEncoding="UTF-8"
       defaultDataset="RU"
       defaultLocale="ru"
       defaultLineSeparator="\r\n"
       defaultSeparator=";">

    <import platforms="db,csv" />
    <import defaults="true" domains="organization,address,person,net" />
    <import class="org.databene.benerator.distribution.function.*, org.databene.benerator.primitive.*,org.databene.platform.db.*"/>
    <import class="org.databene.commons.TimeUtil"/>

    <database id="db"
              url="jdbc:mysql://localhost:3306/qs?characterEncoding=UTF-8"
              driver="com.mysql.jdbc.Driver"
              user="root"
              password=""
              catalog="qs"/>
    <memstore id="memstore"/>
</setup>



The main scenario and some working solutions


Let us dwell on a few points of “benerator.xml”; it is covered in more detail in the first, Russian-language article (1) and in the manual for version 0.8.1 (4).
I will also note that the magic string “characterEncoding=UTF-8” in the url parameter solves the problem of passing Russian characters (and not only Russian ones) to the database.

The developers forgot to mention this on their site as well. Then again, it is not really their concern: the JDBC driver connection string is the same for all Java applications, and I eventually found it on some hard-to-find resource.
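
To make the structure of that connection string explicit, here is a minimal sketch in JavaScript. The helper buildMysqlJdbcUrl is hypothetical and not part of Benerator or the JDBC driver; it only shows that connection properties such as characterEncoding are appended after “?” as key=value pairs joined by “&”.

```javascript
// Hypothetical helper (not a Benerator or JDBC API): assembles a MySQL JDBC URL
// from its parts and appends connection properties as a query string.
function buildMysqlJdbcUrl(host, port, database, properties) {
    var url = 'jdbc:mysql://' + host + ':' + port + '/' + database;
    var pairs = [];
    for (var key in properties) {
        if (Object.prototype.hasOwnProperty.call(properties, key)) {
            pairs.push(key + '=' + properties[key]);
        }
    }
    // Properties follow '?' and are separated by '&'
    return pairs.length > 0 ? url + '?' + pairs.join('&') : url;
}

var url = buildMysqlJdbcUrl('localhost', 3306, 'qs', { characterEncoding: 'UTF-8' });
console.log(url); // jdbc:mysql://localhost:3306/qs?characterEncoding=UTF-8
```

If you ever pass more than one property, remember that inside an XML attribute such as url="..." the “&” separator has to be written as “&amp;”.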

Clearing database tables before re-generating
To begin, prepare the file “truncate_tables.mysql.sql” (text):
truncate_tables.mysql.sql
SET foreign_key_checks = 0;
-- truncate table s_person;
-- truncate table s_job_title;
-- truncate table s_organization;
-- truncate table s_department;
-- truncate table t_orgstructure;
-- truncate table s_type_project;
-- truncate table s_direction_project;
-- truncate table s_norm_labor;
-- truncate table s_timetable;
SET foreign_key_checks = 1;



The first and last lines disable and re-enable foreign key checks. Without this, one table may block the deletion of records from another (referential integrity).
The lines in between are the commands that clear specific database tables. It is advisable to group them as follows:
1. First, a separate group of reference tables (not related to any others).
2. Then the groups that depend, in cascade, on tables that are already filled.
Clear either a whole group or one table at a time, but follow the dependencies.
Keeping tables commented out is convenient because it lets you flexibly regenerate only the data you need.
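
The layout of such a script can be sketched in JavaScript as follows. This is only an illustration under my own assumptions: the function buildTruncateScript and the enabled flag are hypothetical, and the table names are simply borrowed from the file above.

```javascript
// Hypothetical sketch: builds the text of a truncate script from a list of
// tables. Tables with enabled === false are emitted as commented-out lines.
function buildTruncateScript(tables) {
    var lines = ['SET foreign_key_checks = 0;']; // switch off FK checks first
    for (var i = 0; i < tables.length; i++) {
        var prefix = tables[i].enabled ? '' : '-- ';
        lines.push(prefix + 'truncate table ' + tables[i].name + ';');
    }
    lines.push('SET foreign_key_checks = 1;'); // switch FK checks back on
    return lines.join('\n');
}

var script = buildTruncateScript([
    { name: 's_person', enabled: true },        // reference table, will be cleared
    { name: 's_job_title', enabled: false },    // left untouched on this run
    { name: 't_orgstructure', enabled: true }   // depends on the tables above
]);
console.log(script);
```

The comment prefix is written as “-- ” with a trailing space, because MySQL only treats a double dash as a comment when it is followed by whitespace.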

After the memstore definition, add the following lines to benerator.xml:
<comment>  </comment>
<execute uri="truncate_tables.mysql.sql" target="db" />

Be careful! Once the generator starts, every table that is not commented out will be cleared!

Dropping tables and creating the database are, in my opinion, not worth doing here. There are convenient tools for synchronizing the model and the database for that. I use MySQL Workbench (5).

Embedding scripts in... JavaScript!? Yes, yes, exactly that!
After the execute element, add the following lines to benerator.xml:
<comment>  </comment>
<execute uri="script.js" type="js"/>

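Create the file “script.js” (text):
script.js
// Transliterates Russian letters to Latin ones; other characters that are not
// a-z or 0-9 are replaced by the separator "space" (empty here).
function toLink(str) {
    var space = '';
    str = str.toLowerCase();
    var transl = {
        'а': 'a', 'б': 'b', 'в': 'v', 'г': 'g', 'д': 'd', 'е': 'e', 'ё': 'e',
        'ж': 'zh', 'з': 'z', 'и': 'i', 'й': 'j', 'к': 'k', 'л': 'l', 'м': 'm',
        'н': 'n', 'о': 'o', 'п': 'p', 'р': 'r', 'с': 's', 'т': 't', 'у': 'u',
        'ф': 'f', 'х': 'h', 'ц': 'c', 'ч': 'ch', 'ш': 'sh', 'щ': 'sh',
        'ъ': space, 'ы': 'y', 'ь': space, 'э': 'e', 'ю': 'yu', 'я': 'ya'
    };
    var link = '';
    for (var i = 0; i < str.length; i++) {
        if (/[а-яё]/.test(str.charAt(i))) {
            // Russian letter: replace it using the table
            link += transl[str.charAt(i)];
        } else if (/[a-z0-9]/.test(str.charAt(i))) {
            // Latin letter or digit: keep it as is
            link += str.charAt(i);
        } else {
            // anything else: add the separator, but never twice in a row
            if (link.slice(-1) !== space) link += space;
        }
    }
    return link;
}

// Cuts a piece out of a string. Note that substr() treats its second
// argument as a length, not as an end index.
function cut(str, cutStart, cutEnd) {
    return str.substr(cutStart, cutEnd);
}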



The first function transliterates Russian characters into Latin ones (taken from (6), with some reworking).
The second cuts a piece out of a string.
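
One detail of the second function is easy to miss, so here is a small self-contained sketch of it. The sample strings are my own; the point is only that substr() takes a start position and a length, which is why cutting with (0, 44) and then adding a period gives at most 45 characters.

```javascript
// Same body as cut() in script.js: substr(start, length), not (start, end)
function cut(str, cutStart, cutEnd) {
    return str.substr(cutStart, cutEnd);
}

console.log(cut('abcdefgh', 2, 3));      // cde  (3 characters starting at index 2)
console.log('abcdefgh'.substring(2, 3)); // c    (substring() uses an end index instead)

// A 60-character sample string, cut the same way as the "theme" attribute
var longText = new Array(61).join('x');
var theme = cut(longText, 0, 44) + '.';
console.log(theme.length);               // 45
```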

An example of using JavaScript in the generation code:
<generate type="s_organization" count="20" consumer="db,ConsoleExporter">
    <variable name="sgn" script="{js: (p.gender.name()=='MALE') ? sgnMALE : sgnFEMALE}"/>
    <attribute name="email" type='string' script="{js:toLink(p.givenName+p.familyName)+'@'+d}" converter="ToLowerCaseConverter, UniqueStringConverter"/>
    <variable name="theme_tmp" type='string' generator="new SeedSentenceGenerator('csv/notes.txt',3)" />
    <attribute name="theme" maxLength="45" script="{js:cut(theme_tmp,0,44)+'.'}"/>
</generate>

The basic idea: everything in the script parameter inside {js:} is JavaScript. Variables are passed in transparently; the rest is clear from the usage examples.
Have you already noticed the shorthand form of the conditional if, the ternary operator “? :”?
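
For readers who have not met that shorthand, here is a minimal sketch of what the sgn expression does. The objects sgnMALE and sgnFEMALE and their sample values are hypothetical stand-ins for the data Benerator passes into the script; only the secondgiven field name comes from the examples in this article.

```javascript
// Hypothetical stand-ins for the values available inside {js: ...}
var sgnMALE = { secondgiven: 'Ivanovich' };
var sgnFEMALE = { secondgiven: 'Ivanovna' };

// Shorthand form, as used in the script attribute
function pickPatronymic(genderName) {
    return (genderName === 'MALE') ? sgnMALE : sgnFEMALE;
}

// The same logic written out as a full if/else
function pickPatronymicIfElse(genderName) {
    if (genderName === 'MALE') {
        return sgnMALE;
    } else {
        return sgnFEMALE;
    }
}

console.log(pickPatronymic('MALE').secondgiven);         // Ivanovich
console.log(pickPatronymicIfElse('FEMALE').secondgiven); // Ivanovna
```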

Splitting the database tables into separate files for easy generation
I found it convenient to put each table, or each group of 2-3 interrelated tables that cannot be generated independently, into a separate “*.ben.xml” file. Each include is commented out individually, so that any file can be generated on its own.
Please note: these files must have the extension “.ben.xml”.
In the main file it looks like this:
<!-- <include uri="table.s_organization.ben.xml" /> -->
<!-- <include uri="table.s_job_title.ben.xml" /> -->
<!-- <include uri="table.s_type_doc.ben.xml" /> -->

Sample file “table.s_organization.ben.xml” (XML):
table.s_organization.ben.xml
<?xml version="1.0" encoding="UTF-8"?>
<setup xmlns="http://databene.org/benerator/0.9.7"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://databene.org/benerator/0.9.7 benerator-0.9.7.xsd"
       defaultEncoding="UTF-8"
       defaultDataset="RU"
       defaultLocale="ru"
       defaultLineSeparator="\r\n"
       defaultSeparator=";">

    <comment>[[POPULATE TABLE s_organization]] Processed...</comment>
    <generate type="s_organization" count="20" consumer="db,ConsoleExporter">
        <attribute name="bik" type='string' pattern='[0-9]{9}'/>

        <variable name="c" generator="CompanyNameGenerator" dataset="US" locale="us"/>
        <attribute name="caption" type='string' script="c.fullName" />
        <attribute name="short_caption" type='string' script="c.shortName" />
        <attribute name="form_sobs" type='string' script="c.legalForm" />

        <variable name="a" generator="AddressGenerator" dataset="US" locale="us"/>
        <attribute name="ur_strana" type='string' script="a.country" />
        <attribute name="ur_index" type='string' pattern="[0-9]{6}"/>
        <attribute name="ur_nas_punkt" type='string' script="a.city" />
        <attribute name="ur_ulica" type='string' script="a.street" />
        <attribute name='ur_dom' type='int' min='1' max='150' />
        <attribute name='ur_office' type='int' min='1' max='100' />
        <attribute name="telefon" type="string" script="a.officePhone" unique="true" />
        <attribute name="faks" type="string" script="a.fax" unique="true" />

        <variable name="d" generator="DomainGenerator" dataset="US" locale="us"/>
        <variable name="p" generator="PersonGenerator" dataset="RU" locale="ru"/>
        <variable name="tag1" source="memstore" type="sgnMALE" distribution="random"/>
        <variable name="tag2" source="memstore" type="sgnFEMALE" distribution="random"/>
        <variable name="sgn" script="{js: (p.gender.name()=='MALE') ? sgnMALE : sgnFEMALE}"/>
        <attribute name="email" type='string' script="{js:toLink(p.givenName+p.familyName)+'@'+d}" converter="ToLowerCaseConverter, UniqueStringConverter"/>
        <attribute name="webpage" type='string' script="d" converter="ToLowerCaseConverter, UniqueStringConverter"/>
        <attribute name="fio_ruk" type='string' script="p.familyName +' '+ p.givenName +' '+ sgn.secondgiven"/>

        <attribute name="rschet" type='string' pattern="[0-9]{20}"/>
        <attribute name="kschet" type='string' pattern="[0-9]{20}"/>
        <attribute name="INN" type='string' pattern="[0-9]{10}"/>
        <attribute name="KPP" type='string' pattern="[0-9]{9}"/>
        <attribute name="date_update" type="datetime" generator="dtGen"/>
        <attribute name="note" type='string' generator="new SeedSentenceGenerator('csv/notes.txt',3)" maxLength="255"/>
    </generate>
    <comment>[[POPULATE TABLE s_organization]] End. OK!</comment>
</setup>



Please note: the structure is similar to “benerator.xml”, but there is no need to describe the database connection or import the common modules again, since all of that has already been done in the main configuration file.
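
To get a feel for the values that pattern attributes such as “[0-9]{9}” (bik, KPP) or “[0-9]{20}” (rschet, kschet) describe, here is a rough JavaScript sketch. It is not how Benerator implements pattern generation; randomDigits is a hypothetical helper that only produces strings of the same shape.

```javascript
// Hypothetical helper: returns a string of `length` random decimal digits,
// i.e. a value that matches the pattern [0-9]{length}.
function randomDigits(length) {
    var result = '';
    for (var i = 0; i < length; i++) {
        result += Math.floor(Math.random() * 10); // one digit, 0..9
    }
    return result;
}

var bik = randomDigits(9);     // same shape as pattern='[0-9]{9}'
var rschet = randomDigits(20); // same shape as pattern="[0-9]{20}"
console.log(/^[0-9]{9}$/.test(bik));     // true
console.log(/^[0-9]{20}$/.test(rschet)); // true
```

Storing such values as strings rather than numbers, as the sample file does with type='string', keeps leading zeros intact.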

Conclusion


Now, why did I experience “catharsis”? Because after so much agony everything worked:
1. databene-benerator started up and filled the tables with data: just 2-3 nights and, voila, a handy tool for an urgent task!
2. It turned out that it understands Russian characters after all; the fault was mine, since I did not know the (universal) syntax for addressing the JDBC driver in Java projects. Three more nights, and that went smoothly too!
3. One by one, the algorithms for filling the tables gave in to my efforts every evening. Filling all 64 tables took another 6 nights.
Yes, many questions remain, but the main ones have been answered, the task is done, and knowledge and experience have been gained. To change the quantity or quality of the records in the tables, I no longer need to shovel them by hand. The generator does its job in a few minutes.

The article does not cover:
1. generation of interrelated tables;
2. working with dates and times;
3. generation of real numbers.
However, this information is available in the posts linked below and in the documentation. After a head start like this, the reader should have little trouble mastering these topics.

I registered on GitHub especially for this article and posted the source code, which may help you work through the examples. To use it, download it as a *.zip archive and unpack it. Create a new project and import the files into it via “File -> Import -> General -> File System”. Select the whole project and click “Finish”. Do not forget to add the launcher and the generator libraries.

Thank you for your attention!

Materials used


1. habrahabr.ru/post/169713. [Online]
2. sysmagazine.com/posts/169713. [Online]
3. ru.wikipedia.org/wiki/Katarsis. [Online]
4. databene.org/download/databene-benerator-manual-0.8.1.pdf. [Online]
5. dev.mysql.com/downloads/workbench. [Online]
6. ajaxs.ru/lesson/javascript/137-transliteracija_stroki_na_javascript.html. [Online]

Source: https://habr.com/ru/post/262387/

