Automating imports in Python

Before:

    import math
    import os.path
    import requests
    # 100500 other imports

    print(math.pi)
    print(os.path.join('my', 'path'))
    print(requests.get)

After:

    import smart_imports
    smart_imports.all()

    print(math.pi)
    print(os_path.join('my', 'path'))
    print(requests.get)

As it happens, since 2012 I have been developing an open source browser as its only programmer. In Python, naturally. A browser is not the simplest thing to build: the main part of the project now has more than 1000 modules and more than 120,000 lines of Python code. Together with the satellite projects, the total is about one and a half times larger.

At some point I got tired of wrestling with the stacks of imports at the top of every file and decided to deal with the problem once and for all. This is how the smart_imports library (GitHub, PyPI) was born.

The idea is quite simple. Any complex project eventually develops its own naming convention for everything. If that convention is turned into more formal rules, then any entity can be imported automatically from the name of the variable associated with it.

For example, there is no need to write import math in order to refer to math.pi: we can already tell that in this case math is a standard library module.

smart_imports supports Python >= 3.5. The library is fully covered by tests, with coverage above 95%. I have been using it myself for a year.

Read on for the details.

How does it work in general?

The code from the example at the top works as follows:

1. During the call to smart_imports.all(), the library builds the AST of the module from which the call was made.
2. It finds the uninitialized variables.
3. The name of each variable is run through a sequence of rules that try to find, by name, the module (or module attribute) that needs to be imported. Once a rule finds the required entity, the remaining rules are not checked.
4. The modules found are loaded, initialized, and placed in the global namespace (or the required attributes of those modules are placed there).

Uninitialized variables are searched for everywhere in the code, including inside new syntax constructs.
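The search for uninitialized variables can be pictured with the rough sketch below. It relies only on the standard ast and builtins modules. The function name find_unbound_names is made up for illustration, and the scope handling is deliberately naive (the whole module is treated as a single scope), so this is not the library's actual algorithm.

```python
import ast
import builtins


def find_unbound_names(source):
    """Return names that are read somewhere in `source` but never bound.

    Simplification (an assumption of this sketch): all scopes are merged,
    so a name bound inside one function counts as bound everywhere.
    """
    tree = ast.parse(source)
    used, bound = set(), set()

    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            # Load context means the name is read; Store/Del means it is bound.
            if isinstance(node.ctx, ast.Load):
                used.add(node.id)
            else:
                bound.add(node.id)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            bound.add(node.name)
        elif isinstance(node, ast.arg):
            bound.add(node.arg)
        elif isinstance(node, ast.ExceptHandler) and node.name:
            bound.add(node.name)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # "import os.path" binds "os"; "import x as y" binds "y".
                bound.add((alias.asname or alias.name).split('.')[0])

    return used - bound - set(dir(builtins))


example = "print(math.pi)\nprint(os_path.join('my', 'path'))\n"
print(sorted(find_unbound_names(example)))  # ['math', 'os_path']
```

The names that come out of such a pass are the candidates that the import rules then try to resolve.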

Automatic import is enabled only for those project components that explicitly call smart_imports.all(). In addition, using smart imports does not rule out regular imports. This lets you adopt the library gradually and resolve complex cyclic dependencies.

A meticulous reader will notice that the module's AST is built twice:

- the first time by CPython, during the import of the module;
- the second time by smart_imports, during the call to smart_imports.all().

The AST could in fact be built only once (by hooking into the module import process with the import hooks introduced in PEP 302), but that solution slows imports down.

Why do you think that is?

Comparing the performance of the two implementations (with hooks and without), I came to the conclusion that when importing a module, CPython builds the AST in its internal C data structures. Converting them into Python data structures is more expensive than building the tree from source with the ast module.

Of course, the AST of each module is built and analyzed only once per launch.

Default import rules

The library can be used without additional configuration. By default, it imports modules according to the following rules:

1. It looks for a module with exactly the same name next to the current one (in the same directory).
2. It checks the standard library modules:
- by exact name for top-level packages;
- for nested packages and modules, by compound names with the dot replaced by an underscore. For example, os.path is imported for the variable os_path.
3. It looks for an installed third-party package with exactly the same name. For example, the well-known requests package.
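The idea of trying rules in order until one of them matches can be sketched as follows. The functions rule_exact, rule_underscore_as_dot and resolve are hypothetical names for this illustration, built on the standard importlib module. The real library's rules are more careful (for instance, it checks a cached list of standard library modules instead of attempting an import).

```python
import importlib


def rule_exact(name):
    """Import a module whose name matches the variable exactly."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


def rule_underscore_as_dot(name):
    """Read underscores as dots, so os_path is looked up as os.path."""
    parts = name.split('_')
    # Skip names without an inner underscore or with empty segments
    # (for example "_private"), which cannot form a dotted module path.
    if len(parts) < 2 or not all(parts):
        return None
    try:
        return importlib.import_module('.'.join(parts))
    except ImportError:
        return None


# The order of this list is the order in which the rules are applied.
RULES = [rule_exact, rule_underscore_as_dot]


def resolve(name, rules=RULES):
    """Return the module found by the first matching rule, or None."""
    for rule in rules:
        module = rule(name)
        if module is not None:
            return module  # later rules are not checked
    return None


print(resolve('math').pi)
print(resolve('os_path').join('my', 'path'))
```

Because the first match wins, a cheap and specific rule placed early in the list shields the more expensive ones behind it.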

Performance

Using smart imports does not affect the program's runtime performance, but it does increase startup time.

Because the AST is built a second time, the first launch takes roughly 1.5 to 2 times longer. For small projects this is insignificant. In large projects, startup time suffers more from the structure of dependencies between modules than from the import time of any particular module.

~~When~~ If smart imports become popular, I will rewrite the AST handling in C, which should significantly reduce the startup cost.

To speed up loading, the results of processing a module's AST can be cached on the file system. Caching is enabled in the config. Naturally, the cache is invalidated when the source changes.
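One simple way to get a cache that becomes invalid whenever the source changes is to key it by a hash of the source text. The sketch below shows that idea only. The article does not describe the library's real cache format, so the function cached_analysis, the JSON file layout and the analyse callback are all assumptions made for illustration.

```python
import hashlib
import json
import pathlib


def cached_analysis(source, cache_dir, analyse):
    """Return analyse(source) as a sorted list, reusing a cached result if any.

    The cache key is a hash of the source text, so any edit to the source
    produces a new key and the stale entry is simply never read again.
    """
    key = hashlib.sha256(source.encode('utf-8')).hexdigest()
    cache_file = pathlib.Path(cache_dir) / (key + '.json')

    if cache_file.is_file():
        return json.loads(cache_file.read_text(encoding='utf-8'))

    result = sorted(analyse(source))
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(result), encoding='utf-8')
    return result
```

Here analyse stands for any function that takes source text and returns a collection of names, such as an AST pass that collects the variables to import.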

Startup time is affected both by the list of module search rules and by their order, since some rules use standard Python machinery to search for modules. You can avoid this cost by stating the name-to-module mapping explicitly with the “Customized Names” rule (see below).

Configuration

The default configuration was described earlier. It should be enough to work with the standard library in small projects.

Default config

    {
        "cache_dir": null,
        "rules": [
            {"type": "rule_local_modules"},
            {"type": "rule_stdlib"},
            {"type": "rule_predefined_names"},
            {"type": "rule_global_modules"}
        ]
    }

If necessary, a more complex config can be put on the file system.

An example of a complex config (from the browser).

During the call to smart_imports.all(), the library determines where the calling module is located on the file system and searches for a smart_imports.json file, moving from the module's directory up toward the root directory. If such a file is found, it is used as the configuration for the current module.
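The upward search for the config file can be sketched with pathlib. The function name find_config is hypothetical. Only the file name smart_imports.json and the direction of the search come from the description above.

```python
import pathlib


def find_config(module_path, filename='smart_imports.json'):
    """Look for `filename` next to the module, then in each parent directory.

    Returns the path of the first config found, or None if the search
    reaches the file system root without finding one.
    """
    start = pathlib.Path(module_path).resolve().parent
    for directory in [start, *start.parents]:
        candidate = directory / filename
        if candidate.is_file():
            return candidate
    return None


# Example: locate the config that applies to the current script, if any.
print(find_config(__file__ if '__file__' in globals() else 'module.py'))
```

Since the nearest config wins, placing a smart_imports.json in a subdirectory overrides the one higher up for every module below it.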

You can use several different configs (placing them in different directories).

There are only a few configuration options:

    {
        // Directory for the AST cache.
        // null: caching is disabled.
        "cache_dir": null|"string",

        // Rule configs, in the order in which they are applied.
        "rules": []
    }

Import rules

The order of the rules in the config determines the order in which they are applied. The first rule that matches stops the search.

The rule_predefined_names rule appears in most of the config examples below; it is needed so that built-in functions (for example, print) are recognized correctly.

Rule 1: Predefined Names

The rule allows you to ignore predefined names such as __file__ and built-in functions such as print.

Example

    # config:
    # {
    #     "rules": [{"type": "rule_predefined_names"}]
    # }

    import smart_imports

    smart_imports.all()

    # __file__ is a predefined name, so the library
    # does not try to import anything for it
    print(__file__)

Rule 2: Local Modules

Checks whether there is a module with the specified name next to the current module (in the same directory). If there is, it imports it.

Example

    # config:
    # {
    #     "rules": [{"type": "rule_predefined_names"},
    #               {"type": "rule_local_modules"}]
    # }
    #
    # project layout:
    #
    # my_package
    # |-- __init__.py
    # |-- a.py
    # |-- b.py

    # b.py

    import smart_imports

    smart_imports.all()

    # the module a.py is found next to b.py and imported
    print(a)

Rule 3: Global Modules

Tries to import a module directly by name. For example, the requests module.

Example

    # config:
    # {
    #     "rules": [{"type": "rule_predefined_names"},
    #               {"type": "rule_global_modules"}]
    # }
    #
    # the package must be installed:
    #
    # pip install requests

    import smart_imports

    smart_imports.all()

    # the requests module is imported
    print(requests.get('http://example.com'))

Rule 4: Customized Names

Maps a name to a specific module or module attribute. The mapping is specified in the rule's config.

Example

    # config:
    # {
    #     "rules": [{"type": "rule_predefined_names"},
    #               {"type": "rule_custom",
    #                "variables": {"my_import_module": {"module": "os.path"},
    #                              "my_import_attribute": {"module": "random", "attribute": "seed"}}}]
    # }

    import smart_imports

    smart_imports.all()

    # my_import_module is mapped to the module os.path,
    # my_import_attribute is mapped to random.seed
    print(my_import_module)
    print(my_import_attribute)

Rule 5: Standard Modules

Checks whether the name is a standard library module. For example, math, or os.path, which is made available as os_path.

It works faster than the import rule of global modules, since it checks for the presence of a module using a cached list. Lists for each version of Python are taken from here: github.com/jackmaney/python-stdlib-list

Example

    # config:
    # {
    #     "rules": [{"type": "rule_predefined_names"},
    #               {"type": "rule_stdlib"}]
    # }

    import smart_imports

    smart_imports.all()

    print(math.pi)

Rule 6: Import by Prefix

Imports a module by name, from the package associated with its prefix. It is convenient to use when you have several packages used throughout the code. For example, the utils package modules can be accessed with the utils_ prefix.

Example

    # config:
    # {
    #     "rules": [{"type": "rule_predefined_names"},
    #               {"type": "rule_prefix",
    #                "prefixes": [{"prefix": "utils_", "module": "my_package.utils"}]}]
    # }
    #
    # project layout:
    #
    # my_package
    # |-- __init__.py
    # |-- utils
    # |-- |-- __init__.py
    # |-- |-- a.py
    # |-- |-- b.py
    # |-- subpackage
    # |-- |-- __init__.py
    # |-- |-- c.py

    # c.py

    import smart_imports

    smart_imports.all()

    print(utils_a)
    print(utils_b)
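The name mapping behind this rule is a simple string transformation, sketched below. The function resolve_prefix is a hypothetical helper, not part of the library. It takes the same "prefixes" list as the config above and returns the dotted path of the module that would be imported.

```python
def resolve_prefix(name, prefixes):
    """Map a variable name to a dotted module path using prefix entries.

    `prefixes` has the same shape as the "prefixes" list in the config:
    [{"prefix": "utils_", "module": "my_package.utils"}, ...]
    """
    for entry in prefixes:
        prefix = entry['prefix']
        # Require something after the prefix, so "utils_" alone does not match.
        if name.startswith(prefix) and len(name) > len(prefix):
            return entry['module'] + '.' + name[len(prefix):]
    return None


PREFIXES = [{"prefix": "utils_", "module": "my_package.utils"}]

print(resolve_prefix('utils_a', PREFIXES))  # my_package.utils.a
print(resolve_prefix('other', PREFIXES))    # None
```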

Rule 7: Module from parent package

If you have subpackages with the same name in different parts of the project (for example, tests or migrations), you can let them look up modules to import by name in their parent packages.

Example

    # config:
    # {
    #     "rules": [{"type": "rule_predefined_names"},
    #               {"type": "rule_local_modules_from_parent",
    #                "suffixes": [".tests"]}]
    # }
    #
    # project layout:
    #
    # my_package
    # |-- __init__.py
    # |-- a.py
    # |-- tests
    # |-- |-- __init__.py
    # |-- |-- b.py

    # b.py

    import smart_imports

    smart_imports.all()

    print(a)

Rule 8: Binding to another package

For modules from a specific package, this rule allows imports to be looked up by name in other packages (listed in the config). In my case, the rule proved useful where I did not want to extend the previous rule (Module from parent package) to the whole project.

Example

    # config:
    # {
    #     "rules": [{"type": "rule_predefined_names"},
    #               {"type": "rule_local_modules_from_namespace",
    #                "map": {"my_package.subpackage_1": ["my_package.subpackage_2"]}}]
    # }
    #
    # project layout:
    #
    # my_package
    # |-- __init__.py
    # |-- subpackage_1
    # |-- |-- __init__.py
    # |-- |-- a.py
    # |-- subpackage_2
    # |-- |-- __init__.py
    # |-- |-- b.py

    # a.py

    import smart_imports

    smart_imports.all()

    print(b)

Add your own rules

Adding your own rule is quite simple:

1. Inherit from the class smart_imports.rules.BaseRule.
2. Implement the necessary logic.
3. Register the rule using the smart_imports.rules.register method.
4. Add the rule to the config.
5. ???
6. Profit

An example can be found in the implementation of the current rules.

Profit

The multi-line import lists at the top of each source file are gone.

The number of lines went down. Before the browser was moved to smart imports, 6,688 lines were devoted to imports. After the transition, 2,084 remain (two smart_imports lines per file plus 130 imports made explicitly from functions and similar places).

A nice bonus was the standardization of names in the project. The code has become easier to read and easier to write. There is no need to think about the names of imported entities: there are a few clear rules that are easy to follow.

Development plans

I like the idea of defining code properties by variable names, so I will try to develop it both within smart imports and within other projects.

For smart imports, I plan to:

- Add support for new versions of Python.
- Investigate relying on the community's existing work on type annotations.
- Explore the possibility of lazy imports.
- Implement utilities that automatically generate a config from sources and refactor sources to use smart_imports.
- Rewrite part of the code in C to speed up work with the AST.
- Develop integrations with linters and IDEs, in case they have trouble analyzing code without explicit imports.

In addition, I’m interested in your opinion about the library’s default behavior and import rules.

Thank you for making it through this wall of text :-D

Source: https://habr.com/ru/post/459930/
