To begin with, everything below concerns the CPython interpreter, version 2.7.x (the examples were tested on version 2.7.3).
The official site documents the import statement and modules in Python.
It follows from that documentation that Python has packages, modules, and names defined in modules. Note also that some parts of the documentation call modules submodules when they are located inside a package.
In Python, the import statement allows you to import packages, modules, and names into the namespace in which the import statement is executed. There are two interesting features:
- The syntax of the import statement does not always explicitly indicate what exactly should be imported: package, module or name
- The syntax of the import statement cannot explicitly indicate that the path to the module is absolute (although it can explicitly indicate that the path is relative, and the semantics of the statement can be changed so that absolute paths are used by default; see www.python.org/dev/peps/pep-0328)
These two features lead to the following ambiguities in the statement import abcd:
- Should the abcd package or the abcd module be imported?
- Should the package or module abcd be imported from the current package (the package of the module in which import abcd is executed), or from one of the directories listed in sys.path?
More examples of ambiguity:
- from abcd import defg: (import the defg module from the abcd package, or the defg package from the abcd package, or the name defg from the abcd package, or the name defg from the abcd module) × (abcd taken from the current package, or abcd found through sys.path)
- import abcd.defg: (import the defg package from the abcd package, or the defg module from the abcd package) × (abcd taken from the current package, or abcd found through sys.path)
To resolve these declarative ambiguities, there must be an imperative algorithm. Such an algorithm is described in some form in the official Python documentation.
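The first kind of ambiguity can be seen in a small experiment. The sketch below assumes a hypothetical layout created in a temporary directory: a package abcd whose __init__.py defines the name hijk (a name invented for this illustration) and which also contains a submodule defg. The two import statements are syntactically identical, yet one yields a module and the other a plain value.

```python
import os
import sys
import tempfile
import types

# Hypothetical layout, built in a temporary directory:
#   <root>/abcd/__init__.py   defines the name hijk
#   <root>/abcd/defg.py       an ordinary (empty) submodule
root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, "abcd"))
with open(os.path.join(root, "abcd", "__init__.py"), "w") as f:
    f.write("hijk = 42\n")
with open(os.path.join(root, "abcd", "defg.py"), "w") as f:
    f.write("")

# Forget any previously imported abcd so that the fresh files are used.
for cached in [m for m in sys.modules if m.split(".")[0] == "abcd"]:
    del sys.modules[cached]
sys.path.insert(0, root)

from abcd import defg   # same syntax, resolves to a submodule
from abcd import hijk   # same syntax, resolves to a name defined in abcd

defg_is_module = isinstance(defg, types.ModuleType)
hijk_is_module = isinstance(hijk, types.ModuleType)
print(defg_is_module)
print(hijk_is_module)
```

Nothing in the statement itself tells the reader which of the two cases applies; only the files on disk decide.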
Name search algorithm for import abcd
The search for the name abcd for import occurs according to the following algorithm:
No | What are we looking for? | Where are we looking? * | Comment |
---|---|---|---|
1 | abcd package | in the package of the current module (the module in which import abcd is executed) | only if the current module is itself contained in a package ** |
2 | abcd module | in the package of the current module (the module in which import abcd is executed) | only if the current module is itself contained in a package ** |
3 | abcd module | in the built-in modules | the documentation reference is given in *** |
4 | abcd package | in the directories listed in sys.path | the documentation reference is given in **** |
5 | abcd module | in the directories listed in sys.path | the documentation reference is given in **** |
The search stops as soon as a package or module is found at one of the steps listed above.
* The priority of a package over a module was established empirically; the documentation does not state it explicitly.
** In this case, the __package__ variable of this module is equal to the package name; otherwise it is None.
Link to the documentation:
docs.python.org/2/tutorial/modules.html#intra-package-references: “In fact, such references are so common that the import statement first looks in the containing package before looking in the standard module search path.”
(!!!) Note that this fact is not mentioned in another part of the same document (http://docs.python.org/2/tutorial/modules.html#the-module-search-path), which is misleading (see bugs.python.org/issue16891).
(!!!) The second thing to note is that this search step takes place only if the module in which import abcd is executed was itself imported as part of a package (that is, with import <package name>.<module name>). If that module is imported without specifying a package, or is executed as a script, this step is skipped. This is reflected in
www.python.org/dev/peps/pep-0302/#id23:
“The built-in __import__ function (known as PyImport_ImportModuleEx() in import.c) will then check to see whether the module doing the import is a package or a submodule of a package. If it is indeed a (submodule of a) package, it first tries to do the import relative to the package (the parent package for a submodule). For example, if a package named "spam" does "import eggs", it will first look for a module named "spam.eggs". If that fails, the import continues as an absolute import: it will look for a module named "eggs".”
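***
docs.python.org/2/tutorial/modules.html#the-module-search-path: “When a module named spam is imported, the interpreter first searches for a built-in module with that name.”
****
docs.python.org/2/tutorial/modules.html#the-module-search-path: “If not found, it then searches for a file named spam.py in a list of directories given by the variable sys.path.”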
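The empirical claim that a package is found before a module of the same name (steps 4 and 5) can be checked with a short experiment. This is only a sketch: it assumes a temporary directory that contains both abcd.py and a package directory abcd, and the attribute KIND is a hypothetical marker introduced here to tell the two apart.

```python
import os
import sys
import tempfile

# Hypothetical layout, built in a temporary directory:
#   <root>/abcd.py            a module
#   <root>/abcd/__init__.py   a package with the same name
root = tempfile.mkdtemp()
with open(os.path.join(root, "abcd.py"), "w") as f:
    f.write("KIND = 'module'\n")
os.mkdir(os.path.join(root, "abcd"))
with open(os.path.join(root, "abcd", "__init__.py"), "w") as f:
    f.write("KIND = 'package'\n")

# Forget any previously imported abcd so that the fresh files are used.
for cached in [m for m in sys.modules if m.split(".")[0] == "abcd"]:
    del sys.modules[cached]
sys.path.insert(0, root)

import abcd

found_kind = abcd.KIND              # which of the two candidates was loaded
is_package = hasattr(abcd, "__path__")  # only packages have __path__
print(found_kind)
print(is_package)
```

Both candidates live in the same sys.path directory here, so the result reflects only the package-before-module priority, not the order of the directories.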
Name search algorithm for import abcd.defg
First, the package or module abcd is searched for according to the algorithm described for import abcd.
If that search succeeds, the package or module defg is then searched for according to the following algorithm:
No | What are we looking for? | Where are we looking? * | Comment |
---|---|---|---|
1 | defg package | in the abcd package | the documentation reference is given in ** |
2 | defg module | in the abcd package | the documentation reference is given in ** |
* If the search for abcd finds a module rather than a package, the import ends with the error ImportError: No module named defg, since a module cannot contain other modules or packages:
docs.python.org/2/reference/simple_stmts.html#import: “A package can contain other packages and modules while modules cannot contain other modules or packages.”
**
www.python.org/dev/peps/pep-0302/#id23: “Deeper down in the mechanism, a dotted name import is split up by its components. For "import spam.ham", first an "import spam" is done, and only when that succeeds is "ham" imported as a submodule of "spam".”
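The failure described in the first footnote is easy to reproduce. The sketch below assumes a temporary directory that contains only a plain module abcd.py, so abcd cannot have a submodule defg. The exact text of the error message differs between interpreter versions, so the example records only the fact that an ImportError was raised.

```python
import os
import sys
import tempfile

# Hypothetical layout: <root>/abcd.py is a plain module, not a package.
root = tempfile.mkdtemp()
with open(os.path.join(root, "abcd.py"), "w") as f:
    f.write("")

# Forget any previously imported abcd so that the fresh file is used.
for cached in [m for m in sys.modules if m.split(".")[0] == "abcd"]:
    del sys.modules[cached]
sys.path.insert(0, root)

try:
    import abcd.defg   # abcd is found, but it is a module
    outcome = "imported"
except ImportError as error:
    outcome = "ImportError"
    print(error)
```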
Name search algorithm for from abcd.defg import ghi
First, the package or module abcd.defg is searched for according to the algorithm described for import abcd.defg. If that search succeeds, ghi is then searched for in the following order:
No | What are we looking for? | Where are we looking? |
---|---|---|
1 | name ghi | in the package or module defg |
2 | ghi package | in the defg package |
3 | ghi module | in the defg package |
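The first row of this table can be illustrated with a small experiment. The sketch assumes a hypothetical layout in a temporary directory in which the package abcd.defg both defines a name ghi in its __init__.py and contains a submodule ghi.py; the string values are markers invented for this illustration.

```python
import os
import sys
import tempfile

# Hypothetical layout, built in a temporary directory:
#   <root>/abcd/__init__.py
#   <root>/abcd/defg/__init__.py   defines the name ghi
#   <root>/abcd/defg/ghi.py        a submodule with the same name
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "abcd", "defg"))
with open(os.path.join(root, "abcd", "__init__.py"), "w") as f:
    f.write("")
with open(os.path.join(root, "abcd", "defg", "__init__.py"), "w") as f:
    f.write("ghi = 'name defined in defg'\n")
with open(os.path.join(root, "abcd", "defg", "ghi.py"), "w") as f:
    f.write("WHO = 'module ghi'\n")

# Forget any previously imported abcd so that the fresh files are used.
for cached in [m for m in sys.modules if m.split(".")[0] == "abcd"]:
    del sys.modules[cached]
sys.path.insert(0, root)

from abcd.defg import ghi

# The name defined in the package wins; the submodule is not loaded.
submodule_loaded = "abcd.defg.ghi" in sys.modules
print(ghi)
print(submodule_loaded)
```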
Name overlap
One interesting consequence follows from applying the above algorithms consistently. Imagine the following situation: there is a module named abcd and a package named abcd, and the package contains a module defg. The module abcd and the package abcd are located in different directories, and the module abcd is in the same package as the module that executes the statement import abcd.defg. In this case the import fails, because the Python interpreter first finds the abcd module and then tries to look for the defg module inside it, which is impossible.
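The situation can be traced with a toy model of the search order from the tables above. This is not the interpreter's code, only a sketch under stated assumptions: the file system is represented by a hypothetical dictionary tree, built-in modules are left out, each sys.path directory is checked for a package and then for a module, and only two-component names such as abcd.defg are handled. The function names and the directory names project/pkg and libs are invented for this illustration.

```python
def find_top_level(name, current_package, sys_path, tree):
    """Return (kind, location) of the first match, or None.

    tree maps a location to the set of (kind, name) entries it contains.
    """
    locations = []
    if current_package is not None:   # steps 1-2: the current package
        locations.append(current_package)
    locations.extend(sys_path)        # steps 4-5 (built-ins are omitted)
    for location in locations:
        for kind in ("package", "module"):   # package before module
            if (kind, name) in tree.get(location, set()):
                return kind, location
    return None


def import_dotted(dotted_name, current_package, sys_path, tree):
    """Model 'import first.rest' for a two-component dotted name."""
    first, rest = dotted_name.split(".", 1)
    found = find_top_level(first, current_package, sys_path, tree)
    if found is None:
        raise ImportError("No module named " + first)
    kind, location = found
    if kind == "module":              # a module cannot contain submodules
        raise ImportError("No module named " + rest)
    package_dir = location + "/" + first
    entries = tree.get(package_dir, set())
    if ("package", rest) not in entries and ("module", rest) not in entries:
        raise ImportError("No module named " + rest)
    return package_dir + "/" + rest


# Module abcd sits next to the importing module; package abcd is elsewhere.
tree = {
    "project/pkg": {("module", "abcd")},
    "libs": {("package", "abcd")},
    "libs/abcd": {("module", "defg")},
}

try:
    import_dotted("abcd.defg", "project/pkg", ["libs"], tree)
    model_result = "imported"
except ImportError as error:
    model_result = str(error)
print(model_result)
```

In the model, the sibling module abcd is found at step 2 and the search never reaches the package in libs, which mirrors the failure described above. Calling the same function with current_package set to None skips steps 1 and 2 and finds the package.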
It would be more reasonable to determine from the syntax of the import statement that abcd can only be a package (since every element before a dot can only be a package) and to search for abcd only as a package. In that case the abcd package would be imported from the other directory, the defg module would be found in it, and the program would continue to run without errors.
Unfortunately, Python does not behave this way. See
bugs.python.org/issue16891#msg179353 .
The author of this article ran into this problem, and because the relevant description is scattered across the official Python documentation, it took some time to work out why the interpreter behaved this way. The result was the discussions at
stackoverflow.com/questions/14183541/why-python-finds-module-instead-of-package-if-they-have-the-same-name and
bugs.python.org/issue16891 , and also this article.
When such name conflicts occur, the following solutions are possible:
- Rename the module or the package so that their names no longer coincide
- Sometimes it helps to make imports absolute by default with the from __future__ import absolute_import statement (in effect, this only gives more control over the order in which packages and modules are searched, since that order can then be changed through sys.path)
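The second solution can be sketched as follows. The layout is hypothetical and is created in two temporary directories: a package demo_pkg (a name invented for this illustration) holds the importing module main.py and a sibling module abcd.py, while the package abcd with its module defg lives in a separate directory that is on sys.path. With absolute imports enabled in main.py, the sibling module is skipped and the package is found.

```python
import os
import sys
import tempfile


def write(path, text):
    """Create a file with the given text (helper for this sketch only)."""
    with open(path, "w") as f:
        f.write(text)


# Hypothetical layout:
#   <root1>/demo_pkg/__init__.py
#   <root1>/demo_pkg/abcd.py      sibling module that causes the conflict
#   <root1>/demo_pkg/main.py      executes import abcd.defg
#   <root2>/abcd/__init__.py
#   <root2>/abcd/defg.py
root1 = tempfile.mkdtemp()
root2 = tempfile.mkdtemp()
os.mkdir(os.path.join(root1, "demo_pkg"))
os.mkdir(os.path.join(root2, "abcd"))
write(os.path.join(root1, "demo_pkg", "__init__.py"), "")
write(os.path.join(root1, "demo_pkg", "abcd.py"), "")
write(os.path.join(root1, "demo_pkg", "main.py"),
      "from __future__ import absolute_import\n"
      "import abcd.defg\n"
      "VALUE = abcd.defg.VALUE\n")
write(os.path.join(root2, "abcd", "__init__.py"), "")
write(os.path.join(root2, "abcd", "defg.py"),
      "VALUE = 'package abcd found through sys.path'\n")

# Forget earlier imports of these names so that the fresh files are used.
stale = [m for m in sys.modules if m.split(".")[0] in ("abcd", "demo_pkg")]
for cached in stale:
    del sys.modules[cached]
sys.path.insert(0, root2)
sys.path.insert(0, root1)

import demo_pkg.main

result = demo_pkg.main.VALUE
print(result)
```

The sibling module demo_pkg/abcd.py is still reachable after this change, but only through an explicit form such as from . import abcd or import demo_pkg.abcd.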
Appendix: source code
This repository contains source code that demonstrates the algorithms described above:
bitbucket.org/dmugtasimov/python_import .