We came here in our office as the most beautiful conversation and quickly put together a list of lists in Python. Really how?
Even this seemingly trivial task can be solved in several ways that differ significantly in speed and expressiveness.
OPTION 1Everyone knows that the elements of the list can be iterated through in a loop, and that elements can be added to the end. This leads us to the first solution:
def listmerge1 ( lstlst ) :
all = [ ]
for lst in lstlst:
for el in lst:
all . append ( el )
return all
Not only did the function stretch as much as 6 lines, but in addition it is also not effective.
Let's try to improve it in both senses: speed and beauty (“pythonic way”).
OPTION2Here we recall that in Python there is an operator "+" for strings and we get:
def listmerge2 ( lstlst ) :
all = [ ]
for lst in lstlst:
all = all + lst
return all
This is the slowest implementation. The puncture is that in this form the operator "+" in each step creates a new list object, which is thrown out in the next step, etc.
')
OPTION 3It is easy to fix, it is necessary to replace the "+" with a form that does not create a new list, but adds to the old one. This is the "+ =" operator, but I prefer to write the "extend" method explicitly.
So we got the fastest option, but not the shortest.
def listmerge3 ( lstlst ) :
all = [ ]
for lst in lstlst:
all . extend ( lst )
return all
I will write all subsequent decisions through lambda expressions, since they consist of one expression. The name of the argument is abbreviated to ll, it does not reduce readability in the one-line code.
# via anonymous function
listmerge = lambda ll: simple-statement
# equivalent to
def listmerge ( ll ) :
return simple-statement
OPTION 4Using the built-in functions of working with lists, you can rewrite version 2 in the style of functional programming:
listmerge4a = lambda ll: reduce ( lambda a, b: a + b, ll, [ ] )
listmerge4b = lambda ll: sum ( ll, [ ] )
He is a little bit faster, but still inhibitory, for the same reason as his iterative relative. Here "lambda a, b: a + b" is an anonymous function of two arguments that simply returns their sum. Option B is just a shortcut built into Python for easy calculation of the sum of the elements. This option is the shortest.
Personally, I do not like neither the shortest (speed) nor the fastest (beauty). Let's try to find a compromise.
OPTION5Using list expressions:
listmerge5 = lambda ll: [ el for lst in ll for el in lst ]
Not much longer than the previous one, but radically faster. The variant is undoubtedly beautiful, although the nested list expressions are not always clear at first glance.
OPTION6And what if you try to rewrite the fastest version in the functional style? Easy:
listmerge6 = lambda s: reduce ( lambda d, el: d. extend ( el ) or d, s, [ ] )
Note "
d.extend (el) or d " we had to add the "
or " operator; the
extend method returns
None . In terms of speed, it is practically not inferior to the fastest method No. 3 (the difference in speed is literally a few percent and in my opinion is not significant).
In my opinion, the "
editorial choice " should be awarded option
number 6 )
For measuring the speed of small pieces of code in Python there is a
timeit library. Here is an example of code testing options 3, 5 and 6 (the fastest and most beautiful).
import timeit
variants = {
'Reduce' :
'listmerge = lambda s: reduce (lambda d, el: d.extend (el) or d, s, [])' ,
'Iterate' :
"" "
def listmerge (lstlst):
all = []
for lst in lstlst:
all.extend (lst)
return all
"" " ,
'Comprehension' :
'listmerge = lambda ll: [x for lst in ll for x in lst]' ,
}
initstr = 'lstlst = [range (i) for i in range (1000)] \ n gc.enable ()'
def test ( variants, initstr, n = 100 ) :
print "Test repeats n =" , n, "times \ n INITSTR:" , initstr, " \ n \ n "
for k, v in variants. iteritems ( ) :
print k, "-" , timeit . Timer ( "listmerge (lstlst)" , initstr + " \ n " + v ) . timeit ( n )
print
test ( variants, initstr, 100 )
An example of running a test time. It can be seen that the difference in speed between the iterative and functional variants is vanishingly small. The variant on the list expressions is noticeably slower (there is no way out of error here), but the size of our lists is huge, for applications that are not critical to speed, it also has the right to life.
Test repeats n = 100 times
INITSTR: lstlst=[range(i) for i in range(1000)]
gc.enable()
Iterate - 1.56133103371
Reduce - 1.57647109032
Comprehension - 7.5749669075
HOMEWORKI propose to solve / discuss the more complex task of expanding nested lists to a linear one.
Example:
Source list:
[ 7 , [ [ [ [ 2 ] ] ] , [ [ [ ] ] , [ 4 ] ] , [ 4 , 5 , [ 6 , 7 ] ] , 8 ]
# Result:
[ 7 , 2 , 4 , 4 , 5 , 6 , 7 , 8 ]
UPD: Answer
habrahabr.ru/blogs/python/63539/#comment_1764419UPD2:OPTION 6B (from an anonymous commentator in LJ)
import operator
listmerge6b = lambda s: reduce ( operator . iadd , s, [ ] )