Mimesis is a Python library that generates dummy data for a variety of purposes. It is written using only the tools in the Python standard library, so it has no third-party dependencies. At the moment, the library supports 30 language standards (locales), including Russian, and more than 20 provider classes that supply all sorts of data.
The ability to generate fictitious yet valid data is very useful when developing applications that work with a database. Filling a database by hand is a time-consuming and laborious process that is carried out in at least three stages.
The task becomes really difficult when you need to generate not 10-15 users but 100-150 thousand users (or some other kind of data). In this and the next two articles we will try to draw your attention to a tool that greatly simplifies generating test data, the initial loading of a database, and testing in general.
Supported language standards:
No | Code | Title |
---|---|---|
1 | cs | Czech |
2 | da | Danish |
3 | de | German |
4 | de-at | Austrian German |
5 | de-ch | Swiss German |
6 | en | English |
7 | en-au | Australian English |
8 | en-ca | Canadian English |
9 | en-gb | British English |
10 | es | Spanish |
11 | es-mx | Mexican Spanish |
12 | fa | Persian (Farsi) |
13 | fi | Finnish |
14 | fr | French |
15 | hu | Hungarian |
16 | is | Icelandic |
17 | it | Italian |
18 | ja | Japanese |
19 | ko | Korean |
20 | nl | Dutch |
21 | nl-be | Belgian Dutch |
22 | no | Norwegian |
23 | pl | Polish |
24 | pt | Portuguese |
25 | pt-br | Brazilian Portuguese |
26 | ru | Russian |
27 | sv | Swedish |
28 | tr | Turkish |
29 | uk | Ukrainian |
30 | zh | Chinese |
The list of supported provider classes is constantly expanding. The full list of supported data providers is available in the library's documentation (see the links at the end of the article).
In addition to those listed above, country-specific data is also supported. These providers can be imported from the builtins subpackage:
No | Provider | Methods |
---|---|---|
1 | USASpecProvider | tracking_number(), ssn(), personality() |
2 | JapanSpecProvider | full_to_half(), half_to_full() |
3 | RussiaSpecProvider | patronymic(), passport_series(), passport_number(), snils() |
4 | BrazilSpecProvider | cpf(), cnpj() |
Mimesis is installed in the usual way, via the pip package manager. To install the latest version of the library, run the following command:
➜ ~ pip install mimesis
If for some reason you are unable to install the package with pip, try installing it manually, as shown below:
(venv) ➜ git clone https://github.com/lk-geimfari/mimesis.git
(venv) ➜ cd mimesis/
(venv) ➜ python3 setup.py install
# or
(venv) ➜ make install
Please note that the library only works on Python 3.5+. The developers have no plans to add support for Python 2.7.
Initially, we planned to demonstrate data generation with a small Flask web application, but we abandoned the idea because not everyone is familiar with Flask and not everyone is eager to learn it. So we will show everything in pure Python. If you want to carry this over to your own Flask or Django project, you only need to define a static method that performs all the manipulations for the current model and call it whenever you need to perform the initial loading of the database, as shown in the example below.
A model for Flask (Flask-SQLAlchemy) will look something like this:
class Patient(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    email = db.Column(db.String(120), unique=True)
    phone_number = db.Column(db.String(25))
    full_name = db.Column(db.String(100))
    weight = db.Column(db.String(64))
    height = db.Column(db.String(64))
    blood_type = db.Column(db.String(64))

    def __init__(self, **kwargs):
        super(Patient, self).__init__(**kwargs)

    @staticmethod
    def _bootstrap(count=2000, locale='en'):
        from mimesis.providers import Personal

        person = Personal(locale)

        for _ in range(count):
            patient = Patient(
                email=person.email(),
                phone_number=person.telephone(),
                full_name=person.full_name(gender='female'),
                weight=person.weight(),
                height=person.height(),
                blood_type=person.blood_type()
            )

            db.session.add(patient)
            try:
                db.session.commit()
            except Exception:
                db.session.rollback()
Open the shell:
(venv) ➜ python3 manage.py shell
Then generate the data, first making sure that the database and the model are available:
>>> db
<SQLAlchemy engine='sqlite:///db.sqlite'>
>>> Patient
<class 'app.models.Patient'>
>>> Patient._bootstrap(count=4000, locale='ru')  # Generates 4,000 records.
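The same bootstrap pattern can be tried without Flask. The sketch below uses only the standard library (sqlite3 and random). The make_patient function is a hypothetical stand-in for a Mimesis provider, and the table layout, batch size, and seed are assumptions made for illustration. It commits once per batch, which tends to matter once the row count reaches the hundreds of thousands.

```python
import random
import sqlite3


def make_patient(rng, index):
    """Hypothetical stand-in for a Mimesis provider: one fake patient row."""
    return (
        f"patient{index}@example.com",          # unique thanks to the index
        f"+1-555-{rng.randint(0, 9999):04d}",   # fictitious phone number
        rng.choice(["O+", "O-", "A+", "A-", "B+", "B-", "AB+", "AB-"]),
    )


def bootstrap(conn, count=2000, batch_size=500, seed=0):
    """Insert `count` fake patients, committing once per batch."""
    rng = random.Random(seed)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS patient ("
        "id INTEGER PRIMARY KEY, email TEXT UNIQUE, "
        "phone_number TEXT, blood_type TEXT)"
    )
    for start in range(0, count, batch_size):
        stop = min(start + batch_size, count)
        rows = [make_patient(rng, i) for i in range(start, stop)]
        try:
            conn.executemany(
                "INSERT INTO patient (email, phone_number, blood_type) "
                "VALUES (?, ?, ?)",
                rows,
            )
            conn.commit()
        except sqlite3.Error:
            conn.rollback()  # discard the failing batch, then re-raise
            raise


conn = sqlite3.connect(":memory:")
bootstrap(conn, count=4000)
total = conn.execute("SELECT COUNT(*) FROM patient").fetchone()[0]
print(total)
```

Swapping make_patient for calls to a real provider should leave the batching logic unchanged.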
Note that the examples cover only the basic capabilities of the library, and we will mostly make do with a few of the most commonly used provider classes, since there are too many to cover in detail. If the article gets you interested in the library, you can follow the useful links at the very end and explore everything yourself.
The library is quite simple: all you need to do to start working with the data is create an instance of a provider class. The most common data in web applications is personal user data. To generate it, there is a special provider class, Personal(), which accepts a locale code as a string, as shown below:
>>> from mimesis import Personal
>>> person = Personal('is')  # 'is' is the locale code for Icelandic.
>>> for _ in range(0, 3):
...     person.full_name(gender='male')
'Karl Brynjúlfsson'
'Rögnvald Eiðsson'
'Vésteinn Ríkharðsson'
Almost every web application asks for an e-mail address during registration. The library can, of course, generate e-mail addresses; this is done with the email() method of the Personal() class, as shown below:
# Female:
>>> person.email(gender='female')
'lvana6108@gmail.com'
# Male:
>>> person.email(gender='male')
'john2454@yandex.com'
The approach above has a small drawback: it can clutter the code if the application uses several provider classes rather than just one. In such cases, use the Generic() object, which gives access to all providers from a single object, as shown below:
>>> from mimesis import Generic
>>> # 'pl' is the ISO 639-1 code for Polish.
>>> g = Generic('pl')
>>> g.personal.full_name()
'Lonisława Podsiadło'
>>> g.datetime.birthday(readable=True)
'Listopad 11, 1997'
>>> g.personal.blood_type()
'A−'
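To make the idea of a single access point more concrete, here is one way such an object could be structured. This is a sketch only and not how Mimesis implements Generic(): the Facade, PersonProvider, and AddressProvider classes are hypothetical, and the lazy creation through __getattr__ is a design choice made for illustration.

```python
class PersonProvider:
    """Hypothetical provider that would hold locale-specific person data."""

    def __init__(self, locale):
        self.locale = locale


class AddressProvider:
    """Hypothetical provider that would hold locale-specific address data."""

    def __init__(self, locale):
        self.locale = locale


class Facade:
    """Gives access to every registered provider from a single object."""

    _providers = {"personal": PersonProvider, "address": AddressProvider}

    def __init__(self, locale):
        self.locale = locale

    def __getattr__(self, name):
        # Python calls __getattr__ only when normal attribute lookup fails,
        # so this runs on the first access to each provider.
        try:
            provider_cls = self._providers[name]
        except KeyError:
            raise AttributeError(name) from None
        instance = provider_cls(self.locale)
        setattr(self, name, instance)  # cache: later lookups bypass __getattr__
        return instance


g = Facade("pl")
print(g.personal.locale)
print(g.personal is g.personal)
```

Creating providers on first access means an application that only needs personal data never pays for loading the other providers.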
Combining data opens up many possibilities. For example, you can create fictitious (female) holders of Visa cards (or MasterCard, Maestro):
>>> user = Personal('en')
>>> def get_card(sex='female'):
...     owner = {
...         'owner': user.full_name(sex),
...         'exp_date': user.credit_card_expiration_date(maximum=21),
...         'card_number': user.credit_card_number(card_type='visa')
...     }
...     return owner
>>> for _ in range(0, 3):
...     get_card()
Output:
{'exp_date': '02/20', 'owner': 'Laverna Morrison', 'card_number': '4920 3598 2121 3328'}
{'exp_date': '11/19', 'owner': 'Melany Martinez', 'card_number': '4980 9423 5464 1201'}
{'exp_date': '01/19', 'owner': 'Cleora Mcfarland', 'card_number': '4085 8037 5801 9703'}
As mentioned above, the library supports more than 20 provider classes that contain data for all occasions (if not, we are waiting for a PR that corrects this terrible oversight). For example, if you are developing an application for shipping or another transport-related activity and need to generate vehicle models, you can easily do so with the Transport() provider class, which contains transport data:
>>> from mimesis import Transport
>>> trans = Transport()
>>> for _ in range(0, 5):
...     trans.truck()
'Seddon-2537 IM'
'Karrier-7799 UN'
'Minerva-5567 YC'
'Hyundai-2808 XR'
'LIAZ-7174 RM'
Alternatively, you can specify a mask for the model:
>>> for _ in range(0, 5):
...     trans.truck(model_mask="##@")  # '#' is a digit, '@' is a letter.
'Henschel-16G'
'Bean-44D'
'Unic-82S'
'Ford-05Q'
'Kalmar-58C'
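Judging by the output above, each # in the mask becomes a digit and each @ becomes an uppercase letter. A minimal sketch of how such a mask could be expanded is shown below. The expand_mask function is hypothetical and is not the library's implementation.

```python
import random
import string


def expand_mask(mask, rng=random):
    """Replace '#' with a random digit and '@' with a random uppercase letter."""
    out = []
    for ch in mask:
        if ch == "#":
            out.append(rng.choice(string.digits))
        elif ch == "@":
            out.append(rng.choice(string.ascii_uppercase))
        else:
            out.append(ch)  # any other character is copied unchanged
    return "".join(out)


code = expand_mask("##@")
print(f"Ford-{code}")
```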
Often, when testing web applications (a blog is a vivid example), you need to generate text data. Typing text by hand during testing is slow and boring, and Mimesis lets you avoid it thanks to the Text() provider class:
>>> from mimesis import Text
>>> text = Text('ru')
>>> text.text(quantity=3)  # quantity is the number of sentences; returns Russian text.
You can get a list of random words:
>>> text = Text('pt-br')
>>> text.words(quantity=5)
['poder', 'de', 'maior', 'só', 'cima']
Generate a street address:
>>> from mimesis import Address
>>> address = Address('ru')
>>> address.address()  # Returns a street address in Russian.
Get the name of the subject, state, or province of the country that the selected locale belongs to. In this case it is a federal subject of Russia:
>>> address.state()  # Returns the name of a Russian federal subject.
Generate coordinates:
>>> address.coordinates()
{'latitude': -28.362892454682246, 'longitude': 11.512065821275826}
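For comparison, a dictionary of the same shape can be produced with the standard library alone. The random_coordinates function below is a hypothetical stand-in, assuming latitude is drawn uniformly from -90 to 90 degrees and longitude from -180 to 180 degrees. How Mimesis itself picks the values is not covered here.

```python
import random


def random_coordinates(rng=random):
    """Return a random point as a dict with 'latitude' and 'longitude' keys."""
    return {
        "latitude": rng.uniform(-90.0, 90.0),      # degrees north of the equator
        "longitude": rng.uniform(-180.0, 180.0),   # degrees east of Greenwich
    }


point = random_coordinates()
print(point)
```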
The library also has tools for romanizing Cyrillic languages (at the time of writing, only Russian and Ukrainian are supported):
>>> from mimesis.decorators import romanized
>>> @romanized('ru')
... def name_ru():
...     return 'Вероника Денисова'
...
>>> @romanized('uk')
... def name_uk():
...     return 'Емілія Акуленко'
...
>>> name_ru()
'Veronika Denisova'
>>> name_uk()
'Emіlіja Akulenko'
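The general shape of such a decorator can be sketched with a translation table. The example below is not the library's code: romanize_with is a hypothetical name, the table covers only the letters needed for one name, and the Cyrillic input is assumed to be the original spelling of the romanized name shown in the output above.

```python
import functools

# Partial, illustrative table: only the letters used in the example below.
RU_TO_LATIN = str.maketrans({
    "В": "V", "Д": "D",
    "а": "a", "в": "v", "е": "e", "и": "i", "к": "k",
    "н": "n", "о": "o", "р": "r", "с": "s",
})


def romanize_with(table):
    """Build a decorator that transliterates the wrapped function's result."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Characters missing from the table pass through unchanged.
            return func(*args, **kwargs).translate(table)
        return wrapper
    return decorator


@romanize_with(RU_TO_LATIN)
def name_ru():
    return "Вероника Денисова"


print(name_ru())  # Veronika Denisova
```

A complete table would also need multi-letter mappings (for example, a single Cyrillic letter that becomes two Latin letters), which str.translate supports because table values may be strings of any length.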
In fact, the possibilities are many, and you can come up with a huge number of far more illustrative examples in which the data looks more valuable than in the ones above. We look forward to such examples from you, the readers, and will be very happy to read about your successful experience of using the library in your projects.
GitHub: lk-geimfari/mimesis
Read the Docs: mimesis
Thank you for your attention, and happy testing!
Source: https://habr.com/ru/post/318120/