📜 ⬆️ ⬇️

6 ways to merge list of lists

We came here in our office as the most beautiful conversation and quickly put together a list of lists in Python. Really how?

Even this seemingly trivial task can be solved in several ways that differ significantly in speed and expressiveness.

OPTION 1
Everyone knows that the elements of the list can be iterated through in a loop, and that elements can be added to the end. This leads us to the first solution:
def listmerge1 ( lstlst ) :
all = [ ]
for lst in lstlst:
for el in lst:
all . append ( el )
return all

Not only did the function stretch as much as 6 lines, but in addition it is also not effective.
Let's try to improve it in both senses: speed and beauty (“pythonic way”).

OPTION2
Here we recall that in Python there is an operator "+" for strings and we get:
def listmerge2 ( lstlst ) :
all = [ ]
for lst in lstlst:
all = all + lst
return all

This is the slowest implementation. The puncture is that in this form the operator "+" in each step creates a new list object, which is thrown out in the next step, etc.
')
OPTION 3
It is easy to fix, it is necessary to replace the "+" with a form that does not create a new list, but adds to the old one. This is the "+ =" operator, but I prefer to write the "extend" method explicitly.

So we got the fastest option, but not the shortest.
def listmerge3 ( lstlst ) :
all = [ ]
for lst in lstlst:
all . extend ( lst )
return all

I will write all subsequent decisions through lambda expressions, since they consist of one expression. The name of the argument is abbreviated to ll, it does not reduce readability in the one-line code.
# via anonymous function
listmerge = lambda ll: simple-statement
# equivalent to
def listmerge ( ll ) :
return simple-statement

OPTION 4
Using the built-in functions of working with lists, you can rewrite version 2 in the style of functional programming:
listmerge4a = lambda ll: reduce ( lambda a, b: a + b, ll, [ ] )
listmerge4b = lambda ll: sum ( ll, [ ] )

He is a little bit faster, but still inhibitory, for the same reason as his iterative relative. Here "lambda a, b: a + b" is an anonymous function of two arguments that simply returns their sum. Option B is just a shortcut built into Python for easy calculation of the sum of the elements. This option is the shortest.

Personally, I do not like neither the shortest (speed) nor the fastest (beauty). Let's try to find a compromise.

OPTION5
Using list expressions:
listmerge5 = lambda ll: [ el for lst in ll for el in lst ]

Not much longer than the previous one, but radically faster. The variant is undoubtedly beautiful, although the nested list expressions are not always clear at first glance.

OPTION6
And what if you try to rewrite the fastest version in the functional style? Easy:
listmerge6 = lambda s: reduce ( lambda d, el: d. extend ( el ) or d, s, [ ] )

Note " d.extend (el) or d " we had to add the " or " operator; the extend method returns None . In terms of speed, it is practically not inferior to the fastest method No. 3 (the difference in speed is literally a few percent and in my opinion is not significant).

In my opinion, the " editorial choice " should be awarded option number 6 )

For measuring the speed of small pieces of code in Python there is a timeit library. Here is an example of code testing options 3, 5 and 6 (the fastest and most beautiful).
import timeit

variants = {
'Reduce' :
'listmerge = lambda s: reduce (lambda d, el: d.extend (el) or d, s, [])' ,
'Iterate' :
"" "
def listmerge (lstlst):
all = []
for lst in lstlst:
all.extend (lst)
return all
"" " ,
'Comprehension' :
'listmerge = lambda ll: [x for lst in ll for x in lst]' ,
}

initstr = 'lstlst = [range (i) for i in range (1000)] \ n gc.enable ()'

def test ( variants, initstr, n = 100 ) :
print "Test repeats n =" , n, "times \ n INITSTR:" , initstr, " \ n \ n "
for k, v in variants. iteritems ( ) :
print k, "-" , timeit . Timer ( "listmerge (lstlst)" , initstr + " \ n " + v ) . timeit ( n )
print

test ( variants, initstr, 100 )

An example of running a test time. It can be seen that the difference in speed between the iterative and functional variants is vanishingly small. The variant on the list expressions is noticeably slower (there is no way out of error here), but the size of our lists is huge, for applications that are not critical to speed, it also has the right to life.
Test repeats n = 100 times
INITSTR: lstlst=[range(i) for i in range(1000)]
gc.enable()

Iterate - 1.56133103371
Reduce - 1.57647109032
Comprehension - 7.5749669075


HOMEWORK
I propose to solve / discuss the more complex task of expanding nested lists to a linear one.
Example:
Source list:
[ 7 , [ [ [ [ 2 ] ] ] , [ [ [ ] ] , [ 4 ] ] , [ 4 , 5 , [ 6 , 7 ] ] , 8 ]
# Result:
[ 7 , 2 , 4 , 4 , 5 , 6 , 7 , 8 ]


UPD: Answer habrahabr.ru/blogs/python/63539/#comment_1764419

UPD2:
OPTION 6B (from an anonymous commentator in LJ)
import operator
listmerge6b = lambda s: reduce ( operator . iadd , s, [ ] )

Source: https://habr.com/ru/post/63539/


All Articles