List comprehensions
Good day. Let's continue our series of lessons.
List generation
List generation (I don't know how to render "list comprehensions" adequately in Russian) is a vivid example of “syntactic sugar”: a construct that you can easily do without, but that makes life much nicer :) List comprehensions, unsurprisingly, are designed for convenient work with lists, which covers both creating new lists and modifying existing ones.
Suppose we need to get a list of the odd numbers less than 25.
In principle, once you know how the xrange function works, this problem is easy to solve.
>>> res = []
>>> for x in xrange(1, 25, 2):
...     res.append(x)
...
>>> print res
[1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23]
On the whole, the result is fine in every respect except the length of the code. This is where our “sugar” comes to the rescue. In its simplest form it looks like this:
>>> res = [x for x in xrange(1, 25, 2)]
>>> print res
[1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23]
The syntax is basically simple. The whole expression is written in square brackets. First comes an expression that defines the elements of the list, then a loop that supplies the values used in that expression. Both parts can be arbitrarily complex. For example, here's how to get a list of the squares of the same odd numbers:
>>> res = [x ** 2 for x in xrange(1, 25, 2)]
>>> print res
[1, 9, 25, 49, 81, 121, 169, 225, 289, 361, 441, 529]
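The remark that both parts can be arbitrarily complex can be illustrated with a small sketch. It assumes Python 3 (range in place of xrange, print as a function), and the variable names are made up for illustration.

```python
# Python 3 sketch (range instead of xrange); the names are illustrative only.

# The expression part can be any expression, for example a tuple.
pairs = [(x, x ** 2) for x in range(1, 10, 2)]

# The loop part can contain more than one "for"; they nest left to right,
# so the second loop runs in full for every value of the first.
combos = [(letter, n) for letter in "ab" for n in range(2)]

print(pairs)
print(combos)
```

The second comprehension behaves like two nested for loops with an append in the innermost body.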
Optionally, you can add filtering conditions. For example, let's modify the previous example so that the squares of multiples of 3 are excluded.
>>> res = [x ** 2 for x in xrange(1, 25, 2) if x % 3 != 0]
>>> print res
[1, 25, 49, 121, 169, 289, 361, 529]
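To see what the filtered form stands for, here is a sketch that writes the same computation twice: once as a comprehension and once as an explicit loop. It assumes Python 3 (range in place of xrange); the names squares and squares_loop are illustrative.

```python
# Python 3 sketch: range replaces xrange.

# Comprehension with a filter: expression, loop, condition.
squares = [x ** 2 for x in range(1, 25, 2) if x % 3 != 0]

# The same logic as an explicit loop.
squares_loop = []
for x in range(1, 25, 2):
    if x % 3 != 0:
        squares_loop.append(x ** 2)

print(squares == squares_loop)  # True
```

The condition is checked for every value produced by the loop, and only the values that pass reach the expression.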
Here is a slightly more complicated example.
>>> dic = {'John': 1200, 'Paul': 1000, 'Jones': 1850, 'Dorothy': 950}
>>> print "\n".join(["%s = %d" % (name, salary) for name, salary in dic.items()])
Jones = 1850
Dorothy = 950
Paul = 1000
John = 1200
Here we used a list comprehension to turn the dictionary into a set of lines, which can be handy in tasks such as saving configs or generating HTML. The expression "%s = %d" % (name, salary) formats a string and is essentially similar to its counterparts in C. First comes a string that marks the positions where values will be inserted (%s for a string, %d for a number). After the % sign come the values to insert, given as a tuple.
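A minimal sketch of the formatting operator on its own, assuming Python 3 (where % formatting works the same way); the values are made up for illustration.

```python
# Python 3 sketch; the values are invented for illustration.
name, salary = "Paul", 1000

# %s inserts a string, %d an integer; the values follow % as a tuple.
line = "%s = %d" % (name, salary)
print(line)  # Paul = 1000

# With a single value the tuple is optional.
single = "%d bytes" % 120
print(single)  # 120 bytes
```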
Of course, a list comprehension can also process existing data.
Consider a simple example. Suppose we have a log file that stores statistics of server requests in the form “ip bytes”, separated by a space, one host per line. Like this:
127.0.0.1 120
10.1.1.1 210
127.0.0.1 80
10.1.1.1 215
10.1.1.1 200
10.1.1.2 210
We need to calculate the total amount of traffic for each host and output it as a list in order of decreasing traffic.
The program that solves this problem is very short :) It could be shortened even further, but that would clearly hurt its readability.
#!/usr/bin/env python
# coding: utf8
# 1 read the lines from the file and split them into (IP address, traffic) pairs
raw = [x.split() for x in open("log.txt")]
# 2 fill the dictionary
rmp = {}
for ip, traffic in raw:
    if ip in rmp:
        rmp[ip] += int(traffic)
    else:
        rmp[ip] = int(traffic)
# 3 convert to a list and sort by traffic, largest first
lst = rmp.items()
lst.sort(key=lambda (key, val): val, reverse=True)
# 4 print the result
print "\n".join(["%s\t%d" % (host, traff) for host, traff in lst])
Let us analyze it step by step.
1. In this line we read from the file, which we open with the open function. By default, open opens a file for reading and returns a file object which, among other things, is iterable. That is, it can be traversed line by line with a for loop, which is what we do here. In addition, using the split method, we divide each line into an (address, traffic) pair.
2. For convenience, we build a dictionary whose keys are the addresses and whose values are the traffic. If the key does not exist yet, we create it; if it does, we add the current traffic to the total accumulated so far. After this loop we have a dictionary with the traffic totals for all hosts.
3. Unfortunately, dictionaries in Python are not sorted, so we have to convert the dictionary to a list in order to sort it. The next two lines convert the dictionary into a list and sort it by the second field (the traffic).
4. All that remains is to “assemble” the result, as we have done before.
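The four steps above can be sketched in a form that runs without a file. This is a Python 3 variant under stated assumptions: LOG_LINES stands in for the lines of log.txt, total_traffic is a hypothetical helper name, and dict.get and sorted are swapped in for the if/else and the in-place sort of the original program.

```python
# Python 3 sketch. LOG_LINES stands in for the lines of log.txt;
# total_traffic is a hypothetical helper, not part of the original program.
LOG_LINES = [
    "127.0.0.1 120\n",
    "10.1.1.1 210\n",
    "127.0.0.1 80\n",
    "10.1.1.1 215\n",
    "10.1.1.1 200\n",
    "10.1.1.2 210\n",
]


def total_traffic(lines):
    """Return (host, total) pairs sorted by decreasing total traffic."""
    # Step 1: split each line into an [ip, traffic] pair.
    raw = [line.split() for line in lines]
    # Step 2: accumulate traffic per host; get() supplies 0 for a new key.
    totals = {}
    for ip, traffic in raw:
        totals[ip] = totals.get(ip, 0) + int(traffic)
    # Step 3: turn the dictionary into a list sorted by the second field.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


result = total_traffic(LOG_LINES)
# Step 4: assemble the output text with a list comprehension.
report = "\n".join(["%s\t%d" % (host, traff) for host, traff in result])
print(report)
```

For the sample data this prints 10.1.1.1 with 625 bytes first, then 10.1.1.2 with 210 and 127.0.0.1 with 200.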
Enough for today :)
Homework.
1. Implement a function that generates a string containing the multiplication table for the number X.
2. There is a log file of some chat. Calculate the "talkativeness" of its users as pairs of nickname and number of phrases. Also calculate the average number of letters per chat participant.
P.S. Which would be better to continue with: classes and OOP, or elements of functional programming?