
Databene Benerator - test data generation

The essence of the problem


There is a lot of material about unit and load testing these days. Everyone, it seems, writes tests, develops exclusively through TDD, and uses jmeter / ab. Yet all testing depends heavily on test data, and that data has to be generated or written by hand. For unit testing the problem is not acute: you put in a mock, run the test, and forget about it. But what about load testing, when you need not 1-2-5-10 objects but millions?

Most (PHP) developers I have met, when asked to load test their code, create a few fixtures by hand and hammer them with ab / jmeter. The result is not reliable, but they do not think about that. The more advanced ones write scripts that generate data, load it into the database, and only then run the load. That is commendable, but such developers are far fewer, and the method itself does not seem ideal to me: another programmer may not understand the fixture generator (its author wrote it quickly and for purely utilitarian purposes), and sooner or later everyone either falls back to the first approach or starts writing a new generator.

The value of well-built fixtures is underestimated today; many people simply give up on them because the work is so laborious (imagine 15-25 related tables: writing a fixture generation script for them will be very, ahem, interesting). I understand perfectly why developers do this, and when I ran into the same problem I decided not to beat my head against the wall but to look for a tool for proper generation of related data.
To my surprise, I found nothing sensible at first. I got the feeling that nobody cared about this question and that I would be writing clumsy scripts full of loops for the rest of my life. Nevertheless, a suitable tool did turn up; we tried it out successfully at work, and now I want to present it to you.

Databene Benerator - FTW!


Benerator (yes, a funny name) serves two purposes: generating data and anonymizing it. The latter is beyond the scope of this article, but it is also a very sound and useful thing to do (modifying a production database dump to strip users' personal data and credit card numbers).
The tool takes your XML script and generates CSV / XML or exports directly to a database. It runs practically everywhere and supports the following databases:


The script is a sequence of generate tags in which you describe which entities to create and how. It sounds simple, but there are nuances.

This is an overview article. I will not try to describe every feature of the tool or translate its multi-page documentation; my goal is to show how to solve the classic cases and thereby interest those who need it.

Now I will describe, step by step, the process from downloading the tool to getting data into Postgres. It is assumed that you already have Postgres installed :)

Unpacking / Installation


Unpack the product archive and add the following to the end of ~/.bash_profile:
export BENERATOR_HOME=/path/to/unpacked/benerator
export PATH=$PATH:$BENERATOR_HOME/bin

Then run:
chmod a+x $BENERATOR_HOME/bin/*.sh
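
Before moving on, it can be worth confirming that the two steps above took effect. The snippet below is only a minimal sketch: check_benerator is a hypothetical helper name that is not part of Benerator, and it assumes the launcher is located at $BENERATOR_HOME/bin/benerator.sh, as the PATH and chmod lines above suggest.

```shell
# Hypothetical helper (not shipped with Benerator): verifies that
# BENERATOR_HOME is set and that bin/benerator.sh is executable.
check_benerator() {
    if [ -z "${BENERATOR_HOME:-}" ]; then
        echo "BENERATOR_HOME is not set"
        return 1
    fi
    if [ ! -x "$BENERATOR_HOME/bin/benerator.sh" ]; then
        echo "not executable: $BENERATOR_HOME/bin/benerator.sh"
        return 1
    fi
    echo "ok: $BENERATOR_HOME/bin/benerator.sh"
}
```

Open a new shell so that ~/.bash_profile is read again, paste the function, and call check_benerator. Any message other than the "ok" line points at the step that needs to be repeated.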

The first script


First we need to get acquainted with the basic structure of a script: how it is built and what it consists of. Let's generate 5 users with a few fields and print them to the console.

Create an arbitrary folder and save a benerator.xml file in it with the following contents:
benerator.xml
<?xml version="1.0" encoding="UTF-8"?>
<setup xmlns="http://databene.org/benerator/0.7.6"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://databene.org/benerator/0.7.6 benerator-0.7.6.xsd"
       defaultEncoding="UTF-8"
       defaultDataset="US"
       defaultLocale="us"
       defaultLineSeparator="\n">

    <bean id="dtGen" class="DateTimeGenerator">
        <property name='minDate' value='2013-01-01'/>
        <property name='maxDate' value='2013-01-31'/>
        <property name='dateGranularity' value='00-00-02' />
        <property name='dateDistribution' value='random' />
        <property name='minTime' value='08:00:00' />
        <property name='maxTime' value='17:00:00' />
        <property name='timeGranularity' value='00:00:01' />
        <property name='timeDistribution' value='random' />
    </bean>

    <import domains="person"/>

    <generate type="user" count="5" consumer="ConsoleExporter">
        <variable name="person" generator="PersonGenerator"/>
        <attribute name="first_name" script="person.givenName"/>
        <attribute name="last_name" script="person.familyName"/>
        <attribute name="birthdate" script="person.birthDate"/>
        <attribute name="email" script="person.email"/>
        <attribute name="gender" script="person.gender" map="'MALE'->'true','FEMALE'->'false'"/>
        <attribute name="created_at" type="timestamp" generator="dtGen"/>
    </generate>
</setup>


After running benerator.sh ./benerator.xml in this folder, you should see the generated objects printed to the console. Now let's carefully examine benerator.xml and see how this happened.

Source: https://habr.com/ru/post/169713/

