I offer the readers of "Habrakhabr" a translation of the article "16 Months of Functional Programming" . All my comments will be in italics.
In this article I want to share with you my experience in functional programming. I feel that in general over the past 16 months I have become better versed in computer science and computers than in the previous 10 years, and all this thanks to my immersion in Scala and the world of functional programming. The reason functional programming leads you to continuous development is that each task needs to be rethought again. Sometimes it is impossible to believe that most standard tasks can be solved in a different way and - boom! - the functional approach offers the best solution and it is shocking.
So what happened to me during these 16 months? Having closed my startup, I began to look for an interesting project for work. He received a temporary job as a consultant in
2lemetry , which later resulted in full-time work with immersion in functional programming, working with MQTT brokers and distributed systems. While working at 2lemetry, Scala had a strong influence on me. I had to reject my ideas about programming and learn everything from scratch.
In this article I want to talk about some concepts of functional programming that impressed me. My goal is to spark a spark in the minds of programmers who already have experience with languages ​​like Java, Ruby and Python. Do not worry about the fact that some code on Scala, which I cite in this article, may be incomprehensible to you. The purpose of this code is to demonstrate the concepts of functional programming in general terms. It is also worth mentioning that Scala is not a pure functional language, i.e. Some things may seem inconvenient when mixing them with OOP. I will try to point out such moments.
')
Content:
- Immutable state
- Functions
- Option type and pattern matching
- One-liners and for-generators
- Type system
- Lazy calculations and infinite data structures
- What's next?
Immutable state
My first experience with Scala was fairly typical: I started working on a large project with the feeling that Scala is like Java with some cool additions and some features from Ruby. How much I was wrong! My code contained a changeable state wherever possible, and I did not understand why immutable lists and variables could be useful. How to change values ​​in immutable lists? How to change values ​​in fixed maps
(map) ? How to work with immutable state in cycles?
To demonstrate the benefits of the
unmodified state, I will show two versions of the same program, one in Java, the other in Scala. The following Java code filters the list of users by activity flag, sorts it by ID, then retrieves the list of names from these filtered users:
public class User { private final int id; private final String firstName; private final String lastName; private final Boolean active;
The author of the original article in the comments indicated that this code would be simpler for Java 8. He replied that his task was to show a generally accepted imperative approach when working with data.
This is the typical Java code for version 8: each collection is modified by a set of actions. In addition, the entire code is somewhat verbose, in each part of the
activeById code you tell the computer what you want it to do with the data instead of describing how the data should be processed from beginning to end. Here is the same program written in a functional style:
case class User(id: Int, firstname: String, lastname: String, active: Boolean) def activeById(us: Seq[User]) = us.filter(_.active).sortBy(_.id).map(_.lastname) val activeUsersById = activeById(Seq( User(11, "Nick", "Smith", false), User(89, "Ken", "Pratt", true), User(23, "Jack", "Sparrow", true) ))
Unlike the Java example, the code is given in full, i.e. you can run it in this form and it will work.
The code looks cleaner and shorter compared to the imperative approach, because there is no state to monitor. The
activeById function takes one argument (a list of users) and passes it through a chain of functions that are part of the language. It is important to note that the
filter ,
sortBy and
map functions were not chosen randomly. These functions are well described and studied by adherents of functional programming.
Rubisty should have noticed that this sample code is very similar to what they write in Ruby. Yes, this code may look similar, but the underlying state mechanism is very different. The problem with Ruby is that it does not maintain a constant state. Each variable and data structure can potentially be changed and this leads to the fact that you can not trust anything. In Scala, there are vals (read-only variables) and immutable collections that are truly immutable.
In the end, what gives us immutability? From a practical point of view, your code will be cleaner, less error prone (you always know what is in your collections and read-only variables) and better abstracted. Another big benefit from the immutability of the state is that when writing
concurrent programs you will not be bothered that one stream can damage the data used in another stream.
Functions
Functional programming is about functions (surprise?). There are various kinds of functions and techniques of functional composition that are used in functional programming.
Pure functions are one of the pillars of functional programming. A pure function is a function that depends only on its input parameters. It returns the result without changing the external state. The
sin (x: Double) or
md5 (s: String) functions are excellent examples of pure functions that depend only on the input parameters and always return the expected result, since they do not rely on the state of the surrounding world. These factors make pure functions easy to test and less prone to bugs.
Obviously, not all tasks can be implemented using pure functions. For example, I / O, logging, reading and writing to the database, etc. In functional programming, there are models and abstractions that allow you to implement these unclean abstractions in their pure form, which leads to cleaner and more composable code.
Functions in Scala are
first class objects . This means that not only class methods can be declared and called. In addition, they can be used as a standalone data type. You can pass a function to another function and return another function from the function. You can save a function in a variable or in a data structure. You can work with it as a literal without somehow calling it
(lambda function) . Example:
val ints = Seq(1, 2, 3, 4, 5) ints.filter(n => n % 2 == 0)
Look at
n => n% 2 == 0 , the full form
(n) => {n% 2 == 0} , this is a function with no name
(lambda function) that checks whether the number is even. You can pass a lambda function as an argument to another function, or use it as a return value.
Functions can be nested in other functions. This is a useful feature when you need to recursively call a subroutine
(subroutines) that you do not want to place in the scope that goes beyond the limits of your function.
def toList: List[A] = { @annotation.tailrec def go(s: Stream[A], l: List[A]): List[A] = s match { case Empty => l case Cons(h, t) => go(t(), h() +: l) } go(this, List[A]()).reverse }
In this example, we would not want the
go function to be available in the same scope as the
toList function, because its implementation is necessary only for the
toList function and in the scope in which the
toList function is
located , there may be another function called
go .
Currying and partial application of a function is a purely mathematical concept that is well used in functional languages. This allows us to save partially called functions into variables and transfer them to other functions.
trait Resource case class User(id: Int) extends Resource case class Record() case class FormattedRecord() def loadRecordsFor(r: Resource, since: Long): List[Record] = ??? def formatRecords(f: Long => List[Record], since: Long): List[FormattedRecord] = ??? val userRecordsLoader = loadRecordsFor(User(36), _: Long) formatRecords(userRecordsLoader, System.currentTimeMillis - 86400000)
??? - means ad without definition
In this example, we have two template functions
loadRecordsFor and
formatRecords . We partially applied
loadRecordsFor for some user and saved the result in the
userRecordsLoader variable. Then we call
formatRecords with the
userRecordsLoader parameter, since this variable corresponds to the signature of the
formatRecords function
(i.e., call the formatRecords function with the first argument, the partially applied function) . This kind of functional composition is quite convenient in many situations and makes the code less rigid.
This example is not very indicative. I believe that the author did not disclose the subject of currying.
Option type and pattern matching
Option data type is an abstraction that represents optional values. It may seem that it is not particularly necessary, but in everyday work it is an extremely powerful mechanism for representing null, empty, broken objects and variable values.
The Option type is a container that contains a value of a specific type, represented as
Some [T] or containing nothing, represented as
None . The use of this type in our code allows us to forget about the countless cases of exceptions when accessing a
null pointer exceptions or any type incompatibility, whenever null values ​​are retrieved.
Let's look at an example:
case class Project(id: String, name: String, priority: Int, description: Option[String]) object ProjectsAccessor { find(id: String): Option[Project] = ??? } val project = ProjectsAccessor.find(123)
Here we are trying to get a record about the project from the database, but we don’t know if there is a project with such an ID. Instead of returning null or throwing an exception, we return
Some [Project] or
None , since when declaring the
find method, we specified the return type as
Option [Project] .
Container types allow us to use another powerful tool -
pattern matching (pattern matching) . Pattern matching is an approach to data processing based on their structure. If we want to process the result of calling the
find method from the example above and get the name of the project, we can do something like:
ProjectsAccessor.find(123) match { case Some(p) => p.name case None => "" }
We compare with the sample the result of the
find method to verify the existence of a project. If it exists, then its name is returned, otherwise we return an empty string. At first glance, this may look like a switch-case statement in Java, but in fact they are very different. When matching with the sample, you can add non-trivial logic to the templates:
ProjectsAccessor.find(123) match { case Some(p) if 1 until 5 contains p.priority => p.name case Some(p) if name == "Default Project" => p.name case Some(p) => None case None => "" }
You can also compare the result based on the current structure of the object:
def descriptionWrapper(p: Project) = p match { case Project(_, _, _, None) => "No description." case Project(id, _, _, Some(d)) => s"Project $id's description: $d" }
This approach to processing complex logical constructions is more compact and straightforward compared with if-operators and a cumbersome switch-case.
One-Liners and for-generators
One of the great features of a functional composition is the functional chains. Instead of constantly repeating repetitive actions on collections using loops, you can do it with one elegant expression or one-liner. For example:
case class Participant(name: String, score: Int, active: Boolean) val ps = Seq(Participant("Jack", 34, true), Participant("Tom", 51, true), Participant("Bob", 90, false)) ps.filter(_.score < 50).filter(_.active).map(_.copy(active = false))
In this one-liner, we collected all subscribers whose result is less than 50 and which are still active, then we changed the status of the selected subscribers to false. In the end, we got a
List (Participant (Jack, 34, false)) . There are quite a large number of situations in which such one-liners save programmers time and dramatically reduce the amount of code.
If the one-liner becomes too opaque, then it can always be broken using the for-generator. The example above can be rewritten into an equivalent expression:
for { loser <- ps if loser.score < 50 activeLoser <- Some(loser) if activeLoser.active deactivatedLoser <- Some(activeLoser.copy(active = false)) } yield deactivatedLoser
This expression is more verbose than a one-liner, but in cases where the logic becomes too opaque, this technique really helps to improve the readability of the code while retaining all the advantages of immutability and function chains.
In Scala, for-generator is syntactic sugar for a chain of functions.
Type system
After programming in Ruby, strict typing in Scala felt like a burden, because I used it as a javist. I added a detailed description of types everywhere and did not use
generic functions. Needless to say, this was the wrong approach to work. Some functional languages ​​have an advanced type system with properties that are not used by programmers in popular languages. These properties allow the code to be more flexible and composable. Let's go over some of them.
Type inference is the ability of a compiler to guess the type of an expression without being explicitly indicated by the programmer. In some cases, Scala cannot guess the type without a hint. Let's look at an example:
This example shows both sides of type inference in Scala: you must explicitly specify types, but in some cases, such as when you pass a lambda function as an argument
(l, r) => ... , you can omit the types. In pure functional languages ​​such as Haskell, you hardly ever have to specify the type in programs
(controversial statement) . The compiler is smart enough to output them.
It is worth noting that when declaring a function it was not necessary to specify the type of the return value, the Scala compiler will output it itself.
Type constraints are another important concept in functional programming. In general, this means that you can specify a class hierarchy when declaring a generic type. In Java, you can use generalizations to define a type at runtime and still consider your code as type-safe. For example, to declare a generic list of elements of some type, you must use an interface (Java code):
public interface MyList <T> . If you want to declare a list, say, containing mappings
(Map) , but you do not know which implementation of mappings will be used, then you need to specify an upper limit for the type in your interface
public interface MyList <T extends Map> . Now you can use a list filled with such types as
Hashtable ,
LinkedHashMap ,
HashMap and
TreeMap or in other words all descendants of the
Map interface. However, no other type can be used, since marked upper bound. Here is an example of using Scala's upper bound:
def convertToInts[T <: AnyVal](es: Seq[T], f: T => Int): Seq[Int] = { es.map(f(_)) }
AnyVal is the parent of classes such as
Double ,
Float ,
Int, and many other
(scalar types) . In this function, we simply declared that type
T would be a descendant of
AnyVal . You can also specify a lower bound for the type, like
[T>: Int] this will correspond to parents of type
Int . You can also mix type boundaries for various generalizations in the function signature:
[T>: Int <: Any] .
One of the important properties of an advanced type system is
covariance and contravariance . In Java, if you have
class List <T> ,
List <Object>, and
List <String> , then they will be unconnected or invariant. On the other hand, there are covariant arrays, i.e.
String [] is a subtype of type
Object [] . Since arrays are mutable in some cases, you can get an
ArrayStoreException exception at
runtime . In Scala, arrays are invariant by default, and immutable collections (or container types) are covariant
[+ A] . Since they are unchangeable, all potential errors will be detected at compile time, and not runtime. Another possibility is to specify the container as a countervariant
[-A] . Countervariance means that the container with the parent type is a subtype of the container with the child type. Here's how it works:
case class InvariantContainer[A]() case class CovariantContainer[+A]() case class ContravariantContainer[-A]() class Person class User extends Person class Admin extends User val inv1: InvariantContainer[User] = InvariantContainer[User]
Covariance and countervariance are widely used in the implementation of collections and hot-webs when specifying the type of function.
The last advanced type feature that I would like to touch on is the bounds of the view
(view bounds) . Suppose you need to perform operations on numbers, but some of them are represented as strings (ie, “123” or 123). How do you do it? In a simple case, like this one, you can convert strings to numbers manually or, in more complex cases, write your own converter, and then explicitly call it to convert data from one type to another. In weakly typed languages ​​such as Ruby, the interpreter will dynamically convert strings to numbers. You might be surprised that Scala has a way to implement this behavior without losing type safety.
As far as I understand, the author is mistaken with Ruby, for example, such code will not work [10, “20”]. Min . In this case, it would be more correct to cite PHP as an example.
For this to work in Scala (let's use the standard function
math.min () for an example), all you need to do is define an implicit converter for your types:
implicit def strToInt(x: String) = x.toInt math.min("10", 1)
Here Scala will look for implicit conversions from string to number. After finding the
strToInt function based on its signature, it will be applied to all strings passed to
math.min without an explicit call to
strToInt . If you have not declared an implicit conversion, the compiler will throw an exception.
And if you declare and try to pass a string that is not a number, you will get an exception in runtime.
What if we want to write a magic function that finds an implicit conversion for itself? It is very easy! All you need to do is declare a view boundary that tells the compiler to search for an implicit conversion:
implicit def strToInt(x: String) = x.toInt def halfIt[A <% Int](x: A) = x / 2 halfIt("20")
The result of calling
halfIt (“20”) will be 10 as expected. The boundary of the form
[A <% Int] expects
Int or all that can be considered
Int . In our case, this is a string that can be implicitly converted to a number.
Lazy calculations and infinite data structures
The concept of lazy computing does not directly exist in non-functional languages, but it is easy to grasp. Look at the usual condition:
def expensiveOperation() = ??? val a = "foo" val b = "foo" if ((a == b) || expensiveOperation()) true else false
In most imperative languages, the operator
|| computes predicates
(a == b) and
expensiveOperation () is lazy, i.e.
expensiveOperation () will not be called until
if (a == b) is
true (i.e., as long as a is equal to b ) . And
expensiveOperation () will be called if
if (a == b) returns
false . Lazy calculations in Scala allow you to define similar behavior in different contexts.
You can define variables as
lazy , i.e. they will not be calculated until they are addressed for the first time, in contrast to ordinary variables that are calculated upon declaration. After a single calculation, the lazy variable remembers its value.
case class Order(name: String, price: Int) case class User(id: Int, orders: Seq[Order]) { lazy val cheapOrders: Seq[Order] = orders.filter(_.price < 50) lazy val expensiveOrders: Seq[Order] = orders.filter(_.price >= 50) }
In this example, we have a
case class for representing the user, which contains lists of orders. Unlike conventional properties,
cheapOrders and
expensiveOrders will not be calculated at the time of class initialization
(that is, at the time the object is created from the class) . They are calculated when we refer to them directly. Why not use the method? The problem is that we can have expensive computations or every time there are calls to the database when we call the method. Lazy variables retain their results on first access, which can lead to effective optimizations in some cases.
In other languages, they use the technique of memoisation , which can also be considered as lazy calculations.
Another example of deferred computing is passing arguments by name
(call-by-name) . Usually, the function arguments are evaluated before they are passed to it. However, in some cases it is useful to calculate the argument passed in when it is required.
trait Person case class User() extends Person case class Admin() extends Person def loadAdminsOrUsers(needAdmins: => Boolean, loadAdmins: => Seq[Admin], loadUsers: => Seq[User]): Seq[Person] = { if (needAdmins) loadAdmins else loadUsers }
Here we have three by-name parameters with potentially expensive database access operations. We would not want all of them to be completed before they fall into our method. The arrow => means that we are passing in the function itself, and not the value it returns. Now we can call her when we need it.
Lazy calculations and by-name parameters are used to implement one of the most powerful functional programming constructs - infinite data structures. In imperative programming, all your data structures have a certain size, which is actually excellent for most cases, but sometimes it is unknown what size the data structure will be. With deferred calculations, it is possible to declare a data structure in general form without its full content until you need it.
All this sounds good in theory, but will it work? Let's use an infinite data structure called stream to generate prime numbers. In Java, to generate prime numbers, you can write a function that would generate prime numbers up to a certain limit. Then you could call this function to generate a list of N prime numbers and pass it somewhere. If you need another list of prime numbers, you will have to recalculate your list from scratch. In Scala, we can do something like:
val primes: Stream[Int] = { def generatePrimes (s: Stream[Int]): Stream[Int] = s.head #:: generatePrimes(s.tail filter (_ % s.head != 0)) generatePrimes(Stream.from(2)) }
The syntax may seem confusing to you, but in this case it does not matter. The key is what you can do with this data structure. Let's say that you need to get the first 5 primes greater than 100. This is easy using our stream:
primes.filter(_ > 100).take(5).toList
This functional chain will return a List (101, 103, 107, 109, 113) - as expected. The cool thing is that you can transfer primes to any other functions without having to reassemble it every time. You can also use chains of functions (like filter in the example above) and pass this expression already without generating an intermediate result. This allows us to compose abstractions and sculpt programs like plasticine.
What's next?
I hope I am interested in functional programming. For me, writing about it was very exciting. I admit that much remains to be learned and learned in this area. My goal is to start digging deeper into functional programming theory, such as typed lambda calculus, type theory, and category theory. I also want to learn Haskell - a pure functional programming language, and I hope that it will bear fruit.
I read thanks to all who read it!