Introduction to functional programming in Python

Speaking about functional programming, people often begin to produce a bunch of "functional" characteristics. Immutable data, first class functions, and tail recursion optimization. These are language features that help write functional programs. They mention mapping, currying, and using higher-order functions. These are programming techniques used to write functional code. They mention parallelization, lazy computation, and determinism. These are the benefits of functional programs.

Hammer in. Functional code has one property: no side effects. It does not rely on data outside the current function, and does not change data outside the function. All other "properties" can be derived from this.

Non-functional function:
')

a = 0 def increment1(): global a a += 1

Functional function:

 def increment2(a): return a + 1

Instead of walking through the list, use map and reduce.

Map

Accepts function and data set. Creates a new collection, performs a function at each data position, and adds the return value to the new collection. Returns a new collection.

A simple map that takes a list of names and returns a list of lengths:

 name_lengths = map(len, ['', '', '']) print name_lengths # => [4, 4, 3]

This map squares each element:

 squares = map(lambda x: x * x, [0, 1, 2, 3, 4]) print squares # => [0, 1, 4, 9, 16]

It does not accept the named function, but takes an anonymous one defined via lambda. The lambda parameters are defined to the left of the colon. The function body is on the right. The result is returned implicitly.

The non-functional code in the following example takes a list of names and replaces them with random nicknames.

 import random names = ['', '', ''] code_names = ['', '', ''] for i in range(len(names)): names[i] = random.choice(code_names) print names # => ['', '', '']

The algorithm can assign the same nicknames to different secret agents. Hopefully this will not be a source of problems during a secret mission.

Rewrite it via map:

 import random names = ['', '', ''] secret_names = map(lambda x: random.choice(['', '', '']), names)

Exercise 1 . Try rewriting the following code via map. He takes a list of real names and replaces them with nicknames, using a more reliable method.

 names = ['', '', ''] for i in range(len(names)): names[i] = hash(names[i]) print names # => [6306819796133686941, 8135353348168144921, -1228887169324443034]

My decision:

 names = ['', '', ''] secret_names = map(hash, names)

Reduce

Reduce accepts function and set points. Returns the value obtained by combining all items.

An example of a simple reduce. Returns the sum of all items in the set:

 sum = reduce(lambda a, x: a + x, [0, 1, 2, 3, 4]) print sum # => 10

x - the current item, and - the battery. This is the value that returns the lambda execution on the previous paragraph. reduce () iterates over all values, and for each lambda at the current values of a and x, and returns the result in a for the next iteration.

And what is the same in the first iteration? It is equal to the first element of the collection, and reduce () starts working from the second element. That is, the first x will be equal to the second set item.

The following example considers how often the word "captain" is found in the list of strings:

 sentences = ['  ', '  ', '  , '] cap_count = 0 for sentence in sentences: cap_count += sentence.count('') print cap_count # => 3

The same code using reduce:

 sentences = ['  ', '  ', '  , '] cap_count = reduce(lambda a, x: a + x.count(''), sentences, 0)

And where does the initial value of a come from here? It cannot be calculated from the number of repetitions in the first line. Therefore, it is specified as the third argument to the reduce () function.

Why is map and reduce better?

First, they usually fit in one line.

Second, the important parts of the iteration — the collection, the operation, and the return value — are always in the same place as map and reduce.

Thirdly, the code in the loop can change the value of previously defined variables, or affect the code after it. By convention, map and reduce are functional.

Fourthly, map and reduce are elementary operations. Instead of line-by-line reading of cycles, it is easier for the reader to perceive map and reduce, embedded in complex algorithms.

Fifth, they have a lot of friends allowing the useful, slightly modified behavior of these functions. For example, filter, all, any and find.

Exercise 2 : rewrite the following code using map, reduce, and filter. Filter accepts function and collection. Returns a collection of those things for which the function returns True.

 people = [{'': '', '': 160}, {'  ': '', '  ': 80}, {'name': ''}] height_total = 0 height_count = 0 for person in people: if '' in person: height_total += person['  '] height_count += 1 if height_count > 0: average_height = height_total / height_count print average_height # => 120

My decision:

 people = [{'': '', '': 160}, {'  ': '', '  ': 80}, {'name': ''}] heights = map(lambda x: x[''], filter(lambda x: '' in x, people)) if len(heights) > 0: from operator import add average_height = reduce(add, heights) / len(heights)

Write declaratively, not imperatively

The following program emulates a three-car race. At any given time, the car either moves forward or not. Each time the program displays the path traveled by cars. After five intervals, the race ends.

Examples of output:

  - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Program text:

 from random import random time = 5 car_positions = [1, 1, 1] while time: # decrease time time -= 1 print '' for i in range(len(car_positions)): # move car if random() > 0.3: car_positions[i] += 1 # draw car print '-' * car_positions[i]

Code is imperative. The functional version would be declarative - it would describe what needs to be done, not how to do it.

We use functions

Declarativity can be achieved by inserting code into functions:

 from random import random def move_cars(): for i, _ in enumerate(car_positions): if random() > 0.3: car_positions[i] += 1 def draw_car(car_position): print '-' * car_position def run_step_of_race(): global time time -= 1 move_cars() def draw(): print '' for car_position in car_positions: draw_car(car_position) time = 5 car_positions = [1, 1, 1] while time: run_step_of_race() draw()

To understand the program, the reader scans the main loop. “If there is time, go through one step of the race and display the result. Check the time again. ” If the reader needs to figure out how the race step works, he will be able to read his code separately.

Comments are not needed, the code explains itself.

Breaking the code into functions makes the code more readable. This technique uses functions, but only as subroutines. They package the code, but do not make it functional. Functions affect the code surrounding them and change global variables, rather than returning values. If the reader encounters a variable, he will need to find where it came from.

Here is the functional version of this program:

 from random import random def move_cars(car_positions): return map(lambda x: x + 1 if random() > 0.3 else x, car_positions) def output_car(car_position): return '-' * car_position def run_step_of_race(state): return {'time': state['time'] - 1, 'car_positions': move_cars(state['car_positions'])} def draw(state): print '' print '\n'.join(map(output_car, state['car_positions'])) def race(state): draw(state) if state['time']: race(run_step_of_race(state)) race({'time': 5, 'car_positions': [1, 1, 1]})

Now the code is divided into functional functions. There are three signs. The first is that there are no shared variables. time and car_positions are passed directly to race (). The second is that the functions take parameters. The third is that the variables do not change inside the functions, all values are returned. Every time run_step_of_race () takes the next step, it is passed again to the next.

Here are two functions zero () and one ():

 def zero(s): if s[0] == "0": return s[1:] def one(s): if s[0] == "1": return s[1:]

zero () takes the string s. If the first character is 0, then returns the rest of the string. If not, then None. one () does the same if the first character is 1.

Let's represent the rule_sequence () function. It takes a string and a list of rule functions, consisting of the zero and one functions. It calls the first rule, passing it the string. If None is not returned, it takes the returned value and calls the next rule. And so on. If None is returned, rule_sequence () stops and returns None. Otherwise, the value of the last rule.

Examples of input and output data:

 print rule_sequence('0101', [zero, one, zero]) # => 1 print rule_sequence('0101', [zero, zero]) # => None

Imperative version of rule_sequence ():

 def rule_sequence(s, rules): for rule in rules: s = rule(s) if s == None: break return s

Exercise 3 . This code uses a loop. Rewrite it in a declarative form using recursion.

My decision:

 def rule_sequence(s, rules): if s == None or not rules: return s else: return rule_sequence(rules[0](s), rules[1:])

Use pipelines

Now we will rewrite another kind of cycles with the help of a technique called a conveyor.

The next cycle changes the dictionaries containing the name, the wrong country of origin, and the status of some groups.

 bands = [{'name': 'sunset rubdown', 'country': 'UK', 'active': False}, {'name': 'women', 'country': 'Germany', 'active': False}, {'name': 'a silver mt. zion', 'country': 'Spain', 'active': True}] def format_bands(bands): for band in bands: band['country'] = 'Canada' band['name'] = band['name'].replace('.', '') band['name'] = band['name'].title() format_bands(bands) print bands # => [{'name': 'Sunset Rubdown', 'active': False, 'country': 'Canada'}, # {'name': 'Women', 'active': False, 'country': 'Canada' }, # {'name': 'A Silver Mt Zion', 'active': True, 'country': 'Canada'}]

The name of the function “format” is too general. And in general, the code causes some concern. Three different things happen in one cycle. The value of the key 'country' is changed to 'Canada'. Points are removed and the first letter of the name is changed to the capital. It is difficult to understand what the code should do, and it is difficult to say whether it does it. Its hard to use, test and parallelize.

Compare:

 print pipeline_each(bands, [set_canada_as_country, strip_punctuation_from_name, capitalize_names])

It's simple. Auxiliary functions look functional because they are chained together. The previous output is the next input. They are easy to check, reuse, check and parallelize.

pipeline_each () iterates over groups one at a time, and passes them to conversion functions, like set_canada_as_country (). After applying the function to all groups, pipeline_each () makes a list of them and passes it to the next one.

Let's look at the conversion function.

 def assoc(_d, key, value): from copy import deepcopy d = deepcopy(_d) d[key] = value return d def set_canada_as_country(band): return assoc(band, 'country', "Canada") def strip_punctuation_from_name(band): return assoc(band, 'name', band['name'].replace('.', '')) def capitalize_names(band): return assoc(band, 'name', band['name'].title())

Each binds a group key with a new value. Without changing the original data, it is hard to do, so we solve this with the help of assoc (). It uses deepcopy () to create a copy of the transferred dictionary. Each function converts a copy and returns that copy.

Everything seems to be fine. Data originals are protected from changes. But in the code there are two potential places for data changes. In strip_punctuation_from_name (), a name without dots is created by calling calling replace () with the original name. In capitalize_names (), a name is created with the first capital letter based on the title () and the original name. If replace and time are not functional, then strip_punctuation_from_name () with capitalize_names () is not functional.

Fortunately, they are functional. In Python, strings are immutable. These functions work with copies of strings. Uff, thank God.

This contrast between strings and dictionaries (their variability) in Python demonstrates the advantages of languages like Clojure. There the programmer does not need to think about whether he will change the data. Will not change.

Exercise 4 . Try to make the pipeline_each function. Think about the sequence of operations. Groups - in an array, are transferred one by one for the first conversion function. Then the resulting array is transferred one by one for the second function, and so on.

My decision:

 def pipeline_each(data, fns): return reduce(lambda a, x: map(x, a), fns, data)

All three conversion functions as a result change the specific field of the group. call () can be used to create an abstraction for this. It accepts the function and key to which it will be applied.

 set_canada_as_country = call(lambda x: 'Canada', 'country') strip_punctuation_from_name = call(lambda x: x.replace('.', ''), 'name') capitalize_names = call(str.title, 'name') print pipeline_each(bands, [set_canada_as_country, strip_punctuation_from_name, capitalize_names])

Or, sacrificing readability:

 print pipeline_each(bands, [call(lambda x: 'Canada', 'country'), call(lambda x: x.replace('.', ''), 'name'), call(str.title, 'name')])

The code for call ():

 def assoc(_d, key, value): from copy import deepcopy d = deepcopy(_d) d[key] = value return d def call(fn, key): def apply_fn(record): return assoc(record, key, fn(record.get(key))) return apply_fn

What is going on here?

One. call is a higher order function, since takes another function as an argument and returns a function.

Two. apply_fn () is similar to conversion functions. Gets a record (group). Searches for record [key] value. Causes fn. Assigns the result to a copy of the record and returns it.

Three. The call itself does nothing. All work is done by apply_fn (). In the use case of pipeline_each (), a single instance of apply_fn () sets the 'country' value to 'Canada'. Another - makes the first letter in capital.

Four. When executing the apply_fn () instance, the functions fn and key will not be available in scope. These are not apply_fn () arguments or local variables. But access to them will be. When defining a function, it retains references to variables that it closes — those that were defined outside the function, and are used internally. When a function is started, variables are searched among local ones, then among arguments, and then among references to closed ones. There are fn and key.

Five. There is no mention of groups in the call. This is because call can be used to create any pipelines, regardless of their contents. Functional programming, in particular, builds a library of common functions suitable for compositions and for reuse.

Well done. Closures, higher-order functions, and scope — all in several paragraphs. You can drink tea with cookies.

There remains one more processing of these groups. Remove everything except the name and country. Function extract_name_and_country ():

 def extract_name_and_country(band): plucked_band = {} plucked_band['name'] = band['name'] plucked_band['country'] = band['country'] return plucked_band print pipeline_each(bands, [call(lambda x: 'Canada', 'country'), call(lambda x: x.replace('.', ''), 'name'), call(str.title, 'name'), extract_name_and_country]) # => [{'name': 'Sunset Rubdown', 'country': 'Canada'}, # {'name': 'Women', 'country': 'Canada'}, # {'name': 'A Silver Mt Zion', 'country': 'Canada'}]

extract_name_and_country () could be written in a generalized form called pluck (). It would be used like this:

 print pipeline_each(bands, [call(lambda x: 'Canada', 'country'), call(lambda x: x.replace('.', ''), 'name'), call(str.title, 'name'), pluck(['name', 'country'])])

Exercise 5 . pluck accepts a list of keys to extract from records. Try to write it. This will be a higher order function.

My decision:

 def pluck(keys): def pluck_fn(record): return reduce(lambda a, x: assoc(a, x, record[x]), keys, {}) return pluck_fn

And now what?

Functional code is well combined with the traditional. Transformations from this article can be used in any language. Try it for your code.

Think of Masha, Petya and Vasya. Turn the iterations through the lists into maps and.

Remember the race. Break the code into functions, and make them functional. Turn the cycle into recursion.

Remember the group. Turn the workflow into a pipeline.

Source: https://habr.com/ru/post/257903/

All Articles

Introduction to functional programming in Python

Map

Reduce

Why is map and reduce better?

Write declaratively, not imperatively

We use functions

Use pipelines

More articles: