
This is a continuation of the translation of David Goodger's article “Code Like a Pythonista: Idiomatic Python”. See also the beginning and the end of the translation.
Thanks to all Habr users for rating the first part and for the valuable and positive comments. I have tried to take the mistakes into account, and again look forward to a constructive discussion.
Use in when possible (1)
Good:
for key in d:
    print key
- in is usually faster.
- This approach also works for the elements of arbitrary containers (such as lists, tuples, and sets).
- in is also an operator (as we shall see).
Bad:
for key in d.keys():
    print key
This also applies to any object that has a keys() method.
Use in when possible (2)
But .keys() is required when modifying the dictionary:
for key in d.keys():
    d[str(key)] = d[key]
d.keys() creates a static list of the dictionary keys. Otherwise, you would get the exception “RuntimeError: dictionary changed size during iteration”.
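A minimal sketch of the same point, assuming Python 3 rather than the Python 2 used in the article. There d.keys() returns a live view instead of a static list, so the snapshot has to be taken explicitly with list(d). The dictionaries here are made-up sample data.

```python
d = {1: 'one', 2: 'two'}

# list(d) takes a snapshot of the keys, so adding entries inside the loop is safe.
for key in list(d):
    d[str(key)] = d[key]

# Without the snapshot, the same loop fails as soon as the dictionary grows.
e = {1: 'one', 2: 'two'}
error = None
try:
    for key in e:
        e[str(key)] = e[key]
except RuntimeError as exc:
    error = str(exc)
```

After this runs, d holds both the original and the string keys, and error holds the message of the RuntimeError raised by the second loop.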
It is more correct to use key in dict rather than dict.has_key():
# do this:
if key in d:
    ...do something with d[key]
# not this:
if d.has_key(key):
    ...do something with d[key]
This code uses in as an operator.
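The two roles of in can be seen side by side in a short sketch. It assumes Python 3 syntax, and the sample dictionary and list are made up for illustration.

```python
# Hypothetical sample data, chosen only for illustration.
d = {'a': 1, 'b': 2}
items = ['zero', 'one', 'two']

# In a for statement, iterating a dictionary directly yields its keys.
keys_seen = sorted(key for key in d)

# As an operator, in tests membership in any container.
has_a = 'a' in d              # tests the keys, not the values
has_value = 1 in d            # False: 1 is a value, not a key
has_one = 'one' in items
has_z = 'z' in set('xyz')
```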
Dictionary get method
We often have to initialize dictionary entries before use.
A naive way to do this:
navs = {}
for (portfolio, equity, position) in data:
    if portfolio not in navs:
        navs[portfolio] = 0
    navs[portfolio] += position * prices[equity]
dict.get(key, default) removes the need for the test:
navs = {}
for (portfolio, equity, position) in data:
    navs[portfolio] = (navs.get(portfolio, 0)
                       + position * prices[equity])
This is much more direct.
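To make the loop concrete, here is a small runnable sketch. The rows in data and the prices table are invented sample values (the article does not define them), and the syntax assumes Python 3.

```python
# Hypothetical sample data: (portfolio, equity, position) rows and a price table.
data = [('growth', 'AAA', 10), ('income', 'BBB', 5), ('growth', 'BBB', 2)]
prices = {'AAA': 3.0, 'BBB': 4.0}

navs = {}
for (portfolio, equity, position) in data:
    # get() returns 0 the first time a portfolio is seen, the running total afterwards.
    navs[portfolio] = navs.get(portfolio, 0) + position * prices[equity]

# get() only reads: asking for a missing key does not add it.
missing = navs.get('bonds', 0)
```

Note that get() never modifies the dictionary, which is what distinguishes it from setdefault().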
Dictionary setdefault method (1)
Now suppose we have to initialize mutable dictionary values, where each value is a list. Here is the naive way:
Initializing mutable dictionary values:
equities = {}
for (portfolio, equity) in data:
    if portfolio in equities:
        equities[portfolio].append(equity)
    else:
        equities[portfolio] = [equity]
dict.setdefault(key, default) does the same job much more efficiently:
equities = {}
for (portfolio, equity) in data:
    equities.setdefault(portfolio, []).append(equity)
dict.setdefault() is equivalent to “get, or set & get”, or “set if necessary, then get”. It is especially efficient if the dictionary key is expensive to compute or long to type.
The only problem with dict.setdefault() is that the default value is always evaluated, whether it is needed or not. This matters only if the default value is expensive to compute.
If the default value is expensive to compute, it may be better to use the defaultdict class, which we will cover shortly.
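The eager evaluation is easy to observe. In this sketch, expensive_default is a hypothetical stand-in for a costly computation, and the calls list exists only to count how often it runs.

```python
calls = []

def expensive_default():
    """Hypothetical stand-in for a default value that is costly to build."""
    calls.append(1)
    return []

d = {'a': ['x']}

# The key already exists, yet the argument is evaluated before setdefault() runs.
result = d.setdefault('a', expensive_default())
```

The default was built once and then thrown away: result is the list already stored under 'a', and the dictionary is unchanged.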
Dictionary setdefault method (2)
Here we see that the setdefault method can also be used as a stand-alone statement:
navs = {}
for (portfolio, equity, position) in data:
    navs.setdefault(portfolio, 0)
    navs[portfolio] += position * prices[equity]
The setdefault dictionary method returns the value stored under the key (the default, if the key was missing), but we ignore it here. We rely on setdefault's side effect: it sets the dictionary value only if there is no value already.
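A brief sketch of the return value and the side effect, using a made-up starting dictionary:

```python
navs = {'growth': 38.0}   # hypothetical starting state

returned_existing = navs.setdefault('growth', 0)   # key present: nothing changes
returned_new = navs.setdefault('income', 0)        # key missing: 0 is stored
```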
defaultdict
New in Python 2.5.
defaultdict is new in Python 2.5, as part of the collections module. It is identical to a regular dictionary, except for two things:
- it takes a factory function as its first argument; and
- when a dictionary key is encountered for the first time, the factory function is called, and its result is used to initialize the new dictionary value.
Here are two ways to get defaultdict:
- import the collections module and refer to it via the module name:
import collections
d = collections.defaultdict(...)
- or import the defaultdict name directly:
from collections import defaultdict
d = defaultdict(...)
Here is the previous example, where each dictionary value is initialized as an empty list, rewritten with defaultdict:
from collections import defaultdict
equities = defaultdict(list)
for (portfolio, equity) in data:
    equities[portfolio].append(equity)
In this case, the factory function list returns an empty list.
And this example shows how to get a dictionary with a default value of 0, using the factory function int:
navs = defaultdict(int)
for (portfolio, equity, position) in data:
    navs[portfolio] += position * prices[equity]
Be careful with defaultdict, though. You cannot get a KeyError exception from a properly initialized defaultdict. You have to use the “key in dict” condition if you need to check for the existence of a specific key.
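The caveat can be sketched as follows: merely looking a key up stores it, while an in test leaves the dictionary alone. The key name is made up for illustration.

```python
from collections import defaultdict

navs = defaultdict(int)

present_before = 'growth' in navs   # False, and the test adds nothing
size_before = len(navs)

value = navs['growth']              # no KeyError: int() supplies 0 and the key is stored

present_after = 'growth' in navs    # True: the lookup created the entry
size_after = len(navs)
```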
Building and splitting dictionaries
Here is a useful technique for building a dictionary from two lists (or sequences): one a list of keys, the other a list of values.
given = ['John', 'Eric', 'Terry', 'Michael']
family = ['Cleese', 'Idle', 'Gilliam', 'Palin']
pythons = dict(zip(given, family))
>>> import pprint
>>> pprint.pprint(pythons)
{'John': 'Cleese',
 'Michael': 'Palin',
 'Eric': 'Idle',
 'Terry': 'Gilliam'}
The reverse, of course, is trivial:
>>> pythons.keys()
['John', 'Michael', 'Eric', 'Terry']
>>> pythons.values()
['Cleese', 'Palin', 'Idle', 'Gilliam']
Note that the order of the results of .keys() and .values() differs from the order of the items when the dictionary was created: the order going in is different from the order coming out. This is because a dictionary is inherently unordered. However, the two orders are guaranteed to correspond (the order of the keys matches the order of the values), provided the dictionary has not changed between the calls.
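The correspondence between keys() and values() can be checked with a short sketch (Python 3 syntax assumed): zipping the two results back together rebuilds the same dictionary.

```python
given = ['John', 'Eric', 'Terry', 'Michael']
family = ['Cleese', 'Idle', 'Gilliam', 'Palin']
pythons = dict(zip(given, family))

# keys() and values() line up pairwise, so zipping them rebuilds the dictionary.
rebuilt = dict(zip(pythons.keys(), pythons.values()))

# items() gives the same pairs directly; sorting makes the order predictable.
pairs = sorted(pythons.items())
```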
Testing for truth values
# do this:        # not this:
if x:             if x == True:
    pass              pass
This is an elegant and efficient way to test the truth of Python objects (or Boolean values).
Testing a list:
# do this:        # not this:
if items:         if len(items) != 0:
    pass              pass

                  # and definitely not this:
                  if items != []:
                      pass
Truth values
The names True and False are instances of the built-in boolean type. Like None, only one instance of each is created.
False | True |
---|---|
False (== 0) | True (== 1) |
"" (empty string) | any string except "" (" ", "anything") |
0, 0.0 | any number except 0 (1, 0.1, -1, 3.14) |
[], (), {}, set() | any non-empty container ([0], (None,), ['']) |
None | almost any object that is not explicitly False |
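The table can be verified mechanically. This sketch simply feeds the listed examples to bool(); object() is added as an arbitrary example of an ordinary object.

```python
falsy = [False, 0, 0.0, "", [], (), {}, set(), None]
truthy = [True, 1, 0.1, -1, 3.14, " ", "anything", [0], (None,), [''], object()]

all_falsy = not any(bool(v) for v in falsy)
all_truthy = all(bool(v) for v in truthy)
```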
Example of an object's truth value:
>>> class C:
...     pass
...
>>> o = C()
>>> bool(o)
True
>>> bool(C)
True
(Examples: execute truth.py.)
To control the truth value of instances of a user-defined class, use the special method __nonzero__ or __len__. Use __len__ if your class is a container that has a length:
class MyContainer(object):
    def __init__(self, data):
        self.data = data
    def __len__(self):
        """Return my length."""
        return len(self.data)
If your class is not a container, use __nonzero__:
class MyClass(object):
    def __init__(self, value):
        self.value = value
    def __nonzero__(self):
        """Return my truth value (True or False)."""
        # This could be arbitrarily complex:
        return bool(self.value)
In Python 3.0, __nonzero__ has been renamed to __bool__ for consistency with the built-in bool type. For compatibility, add this line to the class definition:
    __bool__ = __nonzero__
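Put together, the two classes might look like the sketch below, which should behave the same under Python 2 and Python 3. The instances at the end are made-up test values.

```python
class MyContainer(object):
    def __init__(self, data):
        self.data = data

    def __len__(self):
        """Return my length; an empty container is false."""
        return len(self.data)


class MyClass(object):
    def __init__(self, value):
        self.value = value

    def __nonzero__(self):          # looked up by Python 2
        """Return my truth value (True or False)."""
        return bool(self.value)

    __bool__ = __nonzero__          # looked up by Python 3


empty_is_true = bool(MyContainer([]))
full_is_true = bool(MyContainer([1, 2]))
zero_is_true = bool(MyClass(0))
five_is_true = bool(MyClass(5))
```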
Index & Item (1)
Here is a neat way to save some typing if you need a list of words:
>>> items = 'zero one two three'.split()
>>> print items
['zero', 'one', 'two', 'three']
Let's say we want to iterate over the items, and we need both each item and its index:
                         - or -
i = 0
for item in items:       for i in range(len(items)):
    print i, item            print i, items[i]
    i += 1
Index & Item (2): enumerate
The enumerate function takes a list (or any other iterable) and returns (index, item) pairs:
>>> print list(enumerate(items))
[(0, 'zero'), (1, 'one'), (2, 'two'), (3, 'three')]
We need the list wrapper to print the full result because enumerate is a lazy function: it generates one item, a pair, at a time, only when asked. A for loop is one place that asks for one result per pass. enumerate is an example of a generator, which we will look at in more detail later. print does not consume one result at a time; we want the whole result, so we must explicitly convert the generator to a list when we print it.
Our loop becomes much simpler:
for (index, item) in enumerate(items):
    print index, item
# compare:              # compare:
index = 0               for i in range(len(items)):
for item in items:          print i, items[i]
    print index, item
    index += 1
The enumerate version is much shorter and simpler than the version on the left, and much easier to read and understand than either.
An example showing that enumerate actually returns an iterator (a generator is a kind of iterator):
>>> enumerate(items)
<enumerate object at 0x011EA1C0>
>>> e = enumerate(items)
>>> e.next()
(0, 'zero')
>>> e.next()
(1, 'one')
>>> e.next()
(2, 'two')
>>> e.next()
(3, 'three')
>>> e.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
StopIteration
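The same session can be written in a version-neutral way. This sketch assumes the built-in next() function (available from Python 2.6 and in Python 3, where the e.next() method spelling no longer exists).

```python
items = 'zero one two three'.split()

e = enumerate(items)
first = next(e)          # one pair is produced per request
rest = list(e)           # consumes the remaining pairs; e is now exhausted

try:
    next(e)
    exhausted = False
except StopIteration:
    exhausted = True
```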
Other languages have “variables”
In many other languages, assigning to a variable puts a value into a cell.
int a = 1;
Cell “a” now contains the integer 1.
Assigning another value to the same variable replaces the contents of the cell:
a = 2;
Now cell “a” contains the integer 2.
Assigning one variable to another makes a copy of the value and puts it in a new cell:
int b = a;
“b” is a second cell, with a copy of the integer 2. Cell “a” has a separate copy.
Python has "names"
In Python, “names” or “identifiers” are like name tags (labels) attached to an object.
a = 1
Here, the integer 1 object has a tag labelled “a”.
If we reassign “a”, we simply move the tag to another object:
a = 2
Now the name “a” is attached to the integer 2 object.
The original integer 1 object no longer has the tag “a”. It may live on for a while, but we cannot reach it through the name “a”. When an object has no more references or tags, it is removed from memory.
If we assign one name to another, we simply attach another tag to the existing object:
b = a
The name “b” is just a second tag attached to the same object as “a”.
Although we commonly say “variables” in Python (because that is the accepted terminology), we really mean “names” or “identifiers”. In Python, “variables” are references to values, not named cells.
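The difference between a second tag and a copy shows up as soon as the object is mutable. This is an illustrative sketch of the label model with a made-up list, not an example taken from the original article.

```python
a = [1, 2, 3]
b = a                  # a second tag on the same list object, not a copy
b.append(4)            # the change is visible through both names

same_object = a is b   # identity test: both names tag one object

c = list(a)            # an explicit copy creates a new, separate object
c.append(5)            # does not affect a or b
```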
If you get nothing else out of this tutorial, I hope you understand how Python names work. A good understanding is certain to pay off, helping you to avoid cases like this:
?
(For some reason the example code is missing here. Translator's note.)
Default Parameter Values
This is a common mistake that beginners often make. Even more advanced programmers make it if they do not understand Python names well enough.
def bad_append(new_item, a_list=[]):
    a_list.append(new_item)
    return a_list
The problem here is that the default value for a_list, an empty list, is evaluated only once, at function definition time. Thus, every time you call the function, you get the same default value. Try it several times:
>>> print bad_append('one')
['one']
>>> print bad_append('two')
['one', 'two']
Lists are mutable objects; you can change their contents. The correct way to get a default list (or dictionary, or set) is to create it at run time, inside the function, and not in the function declaration:
def good_append(new_item, a_list=None):
    if a_list is None:
        a_list = []
    a_list.append(new_item)
    return a_list
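Where the shared default lives can be inspected directly. This sketch assumes the __defaults__ attribute of function objects (available in Python 2.6+ and Python 3) and repeats both functions so that it runs on its own.

```python
def bad_append(new_item, a_list=[]):
    a_list.append(new_item)
    return a_list


def good_append(new_item, a_list=None):
    if a_list is None:
        a_list = []          # a fresh list on every call
    a_list.append(new_item)
    return a_list


first = bad_append('one')
second = bad_append('two')
shared = first is second                    # both results are the one default list
stored_default = bad_append.__defaults__    # the default is stored on the function

good_first = good_append('one')
good_second = good_append('two')
```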
% String formatting
In Python, the % operator works like the sprintf function from C.
Although if you don't know C, that won't mean much to you. Basically, you provide a template or format and the values to substitute.
In this example, the template contains two conversion specifications: "%s" means "insert a string here", and "%i" means "convert an integer to a string and insert it here". "%s" is especially useful because it uses Python's built-in str() function to convert any object to a string.
The substituted values must match the template; here we have two values, combined in a tuple.
name = 'David'
messages = 3
text = ('Hello %s, you have %i messages'
        % (name, messages))
print text
Output:
Hello David, you have 3 messages
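A few variations on the example, sketched in Python 3 syntax. The extra templates and the point tuple are made up for illustration; they show that a single value needs no tuple unless the value is itself a tuple.

```python
name = 'David'
messages = 3
text = 'Hello %s, you have %i messages' % (name, messages)

# %s calls str() on any object, so a number works too.
single = 'Total: %s' % 42

# A tuple on the right is unpacked into the template, so a tuple that should
# be formatted as one value must be wrapped in a one-element tuple.
point = (1, 2)
wrapped = 'Point: %s' % (point,)
```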
For details, see the Python Library Reference, section 2.3.6.2, “String Formatting Operations”. Bookmark this one!
If you haven't done so already, go to python.org, download the HTML documentation (as a .zip or whatever you prefer), and install it on your machine. There is nothing more useful than having the definitive reference at your fingertips.
Advanced % string formatting
Many people don't realize that there are other, more flexible ways to format strings:
By name with a dictionary:
values = {'name': name, 'messages': messages}
print ('Hello %(name)s, you have %(messages)i '
       'messages' % values)
Here we specify the names of the substituted values, which are looked up in the dictionary.
Notice any redundancy? The names "name" and "messages" are already defined in the local namespace. We can take advantage of this.
By name using the local namespace:
print ('Hello %(name)s, you have %(messages)i '
       'messages' % locals())
The locals() function returns a dictionary of all identifiers available in the local namespace.
This is a very powerful tool. With it, you can format all the strings you want without worrying about matching the substituted values to the template.
But be careful. (“With great power comes great responsibility.”) If you use locals() with externally supplied template strings, you expose your entire local namespace to the caller. Just so you know.
To examine your local namespace:
>>> from pprint import pprint
>>> pprint(locals())
pprint is also a very useful module. If you don't know it yet, try playing with it. It makes debugging your data structures much easier!
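Both the convenience and the risk of locals() can be sketched in a few lines. The report and render functions and the secret variable are hypothetical names invented for this illustration.

```python
def report(name, messages):
    # locals() here contains exactly the two parameters.
    return 'Hello %(name)s, you have %(messages)i messages' % locals()


def render(template):
    secret = 'hunter2'          # hypothetical sensitive local variable
    # Whoever supplies the template can read any local by name.
    return template % locals()


text = report('David', 3)
leaked = render('%(secret)s')
```

The second call shows why templates from untrusted sources should be formatted with an explicit dictionary of allowed values instead.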
Advanced % string formatting
The namespace of an object's instance attributes is just a dictionary, self.__dict__.
By name using the instance namespace:
print ("We found %(error_count)d errors"
       % self.__dict__)
Equivalent to, but more flexible than:
print ("We found %d errors"
       % self.error_count)
Note: class attributes are in the class's __dict__. Namespace lookups are actually chained dictionary lookups.
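A self-contained sketch of the instance namespace, using a hypothetical Checker class. It also shows the note about class attributes: they are found through attribute lookup but are not stored in the instance's __dict__.

```python
class Checker(object):
    severity = 'high'                      # class attribute: lives in Checker.__dict__

    def __init__(self, error_count):
        self.error_count = error_count     # instance attribute: lives in self.__dict__

    def summary(self):
        return 'We found %(error_count)d errors' % self.__dict__


c = Checker(3)
text = c.summary()
in_instance = 'severity' in c.__dict__       # not stored on the instance
in_class = 'severity' in Checker.__dict__    # stored on the class
via_lookup = c.severity                      # found by the chained lookup
```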
The final part of the translation.