One simple little problem. Fast, beautiful or clean?

I believe 99% of Python developers have solved this problem in one way or another, since it is part of the standard set of tasks that one well-known company gives to candidates for a Python Developer position.

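    # There are two lists of different lengths. The first contains keys, and the second contains values.
    # Write a function that builds a dictionary from these keys and values.
    # If a key has no matching value, the dictionary should contain the value None.
    # Values that have no matching key should be ignored.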

Below, out of curiosity, I list the solutions I analyzed.

To evaluate the correctness of the code, I sketched a simple unit test.

Option 1

    import unittest


    def dict_from_keys(keys, values):
        res = dict()
        for num, key in enumerate(keys):
            try:
                res[key] = values[num]
            except IndexError:
                res[key] = None
        return res


    class DictFromKeysTestCase(unittest.TestCase):
        def test_dict_from_keys_more_keys(self):
            keys = range(1000)
            values = range(900)
            for _ in range(10 ** 5):
                result = dict_from_keys(keys, values)
            self.assertEqual(keys, result.keys())

        def test_dict_from_keys_more_values(self):
            keys = range(900)
            values = range(1000)
            for _ in range(10 ** 5):
                result = dict_from_keys(keys, values)
            self.assertEqual(keys, result.keys())

This is the first solution I came across. Running the unit test:

    random1st@MacBook-Pro ~/P/untitled> python -m unittest dict_from_keys
    ..
    ----------------------------------------------------------------------
    Ran 2 tests in 26.118s

    OK
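The test above compares only the keys. A stricter check that also verifies the values might look like the sketch below (Python 3 syntax; the test class name and the small inputs are illustrative, not from the original):

```python
import unittest


def dict_from_keys(keys, values):
    # Option 1 from the article: look up each value by index, fall back to None.
    res = {}
    for num, key in enumerate(keys):
        try:
            res[key] = values[num]
        except IndexError:
            res[key] = None
    return res


class DictFromKeysValuesTestCase(unittest.TestCase):
    # Hypothetical test class: checks whole dictionaries, not just their keys.
    def test_more_keys_than_values(self):
        result = dict_from_keys(["a", "b", "c"], [1, 2])
        self.assertEqual(result, {"a": 1, "b": 2, "c": None})

    def test_more_values_than_keys(self):
        result = dict_from_keys(["a", "b"], [1, 2, 3])
        self.assertEqual(result, {"a": 1, "b": 2})
```

Saved in a module, it runs with the same `python -m unittest` command as the original test.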

What did I notice right away? dict() is a function call, while {} is literal syntax. Replacing the dictionary initialization gives:

    random1st@MacBook-Pro ~/P/untitled> python -m unittest dict_from_keys
    ..
    ----------------------------------------------------------------------
    Ran 2 tests in 25.828s

    OK
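The cost of dict() versus {} can also be measured in isolation. A minimal sketch using the standard timeit module (Python 3 syntax; the iteration count is an arbitrary choice, and the absolute numbers will vary by machine and interpreter):

```python
import timeit

# Time one million constructions of an empty dictionary each way.
call_time = timeit.timeit("dict()", number=1_000_000)
literal_time = timeit.timeit("{}", number=1_000_000)

print(f"dict(): {call_time:.3f}s")
print(f"{{}}:     {literal_time:.3f}s")
```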

A trifle, but a nice one, although the difference could just as well be put down to measurement error. The next option made me weep blood, but I include it here anyway:

Option 2

    def dict_from_keys(keys, values):
        res = {}
        it = iter(values)
        nullValue = False
        for key in keys:
            try:
                res[key] = it.next() if not nullValue else None
            except StopIteration:
                nullValue = True
                res[key] = None
        return res

Test result:

    random1st@MacBook-Pro ~/P/untitled> python -m unittest dict_from_keys
    ..
    ----------------------------------------------------------------------
    Ran 2 tests in 33.312s

    OK

No comment.

Option 3

The following solution:

    def dict_from_keys(keys, values):
        return {key: None if idx >= len(values) else values[idx]
                for idx, key in enumerate(keys)}

Test result:

    random1st@MacBook-Pro ~/P/untitled [1]> python -m unittest dict_from_keys
    ..
    ----------------------------------------------------------------------
    Ran 2 tests in 26.797s

    OK

As you can see, no significant speedup was achieved. Another variation on the theme:

Option 4

    def dict_from_keys(keys, values):
        # Python 2 only: map(None, keys, values) pads the shorter sequence with None.
        return dict((len(keys) > len(values)) and map(None, keys, values)
                    or zip(keys, values))

Result:

    random1st@MacBook-Pro ~/P/untitled> python -m unittest dict_from_keys
    ..
    ----------------------------------------------------------------------
    Ran 2 tests in 20.600s

    OK

Option 5

    def dict_from_keys(keys, values):
        result = dict.fromkeys(keys, None)
        result.update(zip(keys, values))
        return result

Result:

    random1st@MacBook-Pro ~/P/untitled [1]> python -m unittest dict_from_keys
    ..
    ----------------------------------------------------------------------
    Ran 2 tests in 17.584s

    OK

As expected, using built-in functions gives a significant speedup. Is it possible to achieve even more impressive results?

Option 6

    def dict_from_keys(keys, values):
        return dict(zip(keys, values + [None] * (len(keys) - len(values))))

Result:

    random1st@MacBook-Pro ~/P/untitled> python -m unittest dict_from_keys
    ..
    ----------------------------------------------------------------------
    Ran 2 tests in 14.212s

    OK
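Option 6 relies on two details: multiplying a list by a negative number yields an empty list, and zip stops at the shorter of its inputs. A minimal sketch in Python 3 syntax; the list() conversions are an addition here, not part of the original, so that ranges and tuples can be concatenated as well:

```python
def dict_from_keys(keys, values):
    # Assumption: convert to lists so that "+" works for ranges and tuples too.
    keys, values = list(keys), list(values)
    # The padding is empty when there are at least as many values as keys.
    padding = [None] * (len(keys) - len(values))
    # zip() drops any surplus values beyond the last key.
    return dict(zip(keys, values + padding))


print([None] * -2)                         # []
print(dict_from_keys(range(3), range(2)))  # {0: 0, 1: 1, 2: None}
print(dict_from_keys(range(2), range(5)))  # {0: 0, 1: 1}
```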

Even faster:

Option 7

    import itertools


    def dict_from_keys(keys, values):
        return dict(itertools.izip_longest(keys, values[:len(keys)]))

Result:

    random1st@MacBook-Pro ~/P/untitled> python -m unittest dict_from_keys
    ..
    ----------------------------------------------------------------------
    Ran 2 tests in 10.190s

    OK
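In Python 3, itertools.izip_longest was renamed to zip_longest. A minimal sketch of the same idea in Python 3 syntax, assuming both arguments support len() and slicing (lists, tuples, ranges):

```python
from itertools import zip_longest


def dict_from_keys(keys, values):
    # Trim surplus values first, then let zip_longest pad missing ones with None.
    return dict(zip_longest(keys, values[:len(keys)]))


print(dict_from_keys(["a", "b", "c"], [1, 2]))  # {'a': 1, 'b': 2, 'c': None}
print(dict_from_keys(["a"], [1, 2]))            # {'a': 1}
```

Slicing the values is what keeps surplus values from being paired with a None key.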

A reasonable question is whether anything can be faster than this solution. Obviously, if the computation cannot be made faster, it should be made lazy. This option is clearly questionable, and everything now depends on the context. In particular, the following code passes its own tests, but it is not a full-fledged Python dictionary:

Option 8

    class DataStructure(dict):
        def __init__(self, *args, **kwargs):
            super(DataStructure, self).__init__(*args, **kwargs)
            self._values = None
            self._keys = None

        @classmethod
        def dict_from_keys_values(cls, keys, values):
            obj = cls()
            obj._values = values[:len(keys)]
            obj._keys = keys
            return obj

        def __getitem__(self, key):
            try:
                return super(DataStructure, self).__getitem__(key)
            except KeyError:
                try:
                    idx = self._keys.index(key)
                    self._keys.pop(idx)
                    super(DataStructure, self).__setitem__(
                        key, self._values.pop(idx)
                    )
                except ValueError:
                    raise KeyError
                except IndexError:
                    super(DataStructure, self).__setitem__(key, None)
                return super(DataStructure, self).__getitem__(key)

        def keys(self):
            for k in self._keys:
                yield k
            for k in super(DataStructure, self).keys():
                yield k

    random1st@MacBook-Pro ~/P/untitled [1]> python -m unittest dict_from_keys
    ..
    ----------------------------------------------------------------------
    Ran 2 tests in 1.219s

    OK

For my part, I will add that I personally like option 6 best, in terms of both readability and speed.
P.S. Once again I am amazed at how many people comment on an absolutely useless article.

Source: https://habr.com/ru/post/315170/
