Use the Stream API more simply (or do not use it at all)

With the arrival of Java 8, the Stream API let programmers write in a few lines what used to take many. However, it turned out that many people, even when using the Stream API, write more code than necessary. This not only makes the code longer and harder to understand, but sometimes also causes a significant performance loss. It is not always clear why people write this way. Perhaps they read only a small part of the documentation and never heard about the other features. Or they did not read the documentation at all, just saw an example somewhere and decided to do the same. Sometimes it is reminiscent of the joke about “reducing the problem to one already solved”.

In this article I have collected examples that I have encountered in practice. I hope that after this crash course programmers' code will become a little cleaner and faster. A good IDE will help you fix most of these snippets, but the IDE is not omnipotent and does not replace your head.

1. A stream from a collection without intermediate operations is usually unnecessary

If you have no intermediate operations, you often can, and should, do without a stream.

1.1. collection.stream().forEach()

Want to do something for every item in the collection? Wonderful. But why do you need a stream? Just write collection.forEach(). In most cases this does the same thing, but it is shorter and produces less garbage. Some fear that there is a difference in functionality, but cannot really explain what it is. They say that forEach does not guarantee order. In fact, it is Stream.forEach() that does not guarantee order according to the specification (although in practice a sequential stream preserves it), while collection.forEach() without a stream does guarantee it for ordered collections. If you do not need the order, you will be no worse off when you get it. The only difference in the standard library that I know of concerns the synchronized collections created through Collections.synchronizedXyz(). There, collection.forEach() synchronizes the entire operation, while collection.stream().forEach() does not synchronize anything. Most likely, if you are already using synchronized collections, you do want the synchronization, so things will only get better.
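As a small illustration of the two spellings, here is a minimal sketch. The class name ForEachDemo and its method names are made up for this example; only forEach() and stream() come from the standard library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ForEachDemo {
    // Preferred: iterate the collection directly, no stream object is created.
    static List<String> viaCollection(List<String> items) {
        List<String> seen = new ArrayList<>();
        items.forEach(seen::add);
        return seen;
    }

    // The longer variant this section advises against.
    static List<String> viaStream(List<String> items) {
        List<String> seen = new ArrayList<>();
        items.stream().forEach(seen::add);
        return seen;
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("a", "b", "c");
        System.out.println(viaCollection(items));
        System.out.println(viaStream(items));
    }
}
```

For an ordinary sequential list both methods visit the same elements; the first one simply skips the stream machinery.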

1.2. collection.stream().collect(Collectors.toList())

Want to convert an arbitrary collection to a list? Wonderful. Since Java 1.2 you have had a great tool for this: new ArrayList<>(collection) (well, before Java 5 there were no generics). This is not only shorter but also faster, and again creates less garbage in the heap. Possibly much less, since in most cases a single array of the right size is allocated, while the stream adds elements one by one, growing the storage as needed. Similarly, instead of stream().collect(toSet()) create a new HashSet<>(collection), and instead of stream().collect(toCollection(TreeSet::new)) use new TreeSet<>(collection).
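The copy constructors mentioned above can be sketched as follows. CopyDemo and its helper names are hypothetical wrappers added only to make the snippet runnable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

class CopyDemo {
    // Instead of source.stream().collect(Collectors.toList())
    static <T> List<T> toList(Collection<T> source) {
        return new ArrayList<>(source);
    }

    // Instead of source.stream().collect(Collectors.toSet())
    static <T> Set<T> toSet(Collection<T> source) {
        return new HashSet<>(source);
    }

    // Instead of source.stream().collect(Collectors.toCollection(TreeSet::new))
    static <T extends Comparable<T>> SortedSet<T> toSortedSet(Collection<T> source) {
        return new TreeSet<>(source);
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(3, 1, 3);
        System.out.println(toList(numbers));
        System.out.println(toSet(numbers));
        System.out.println(toSortedSet(numbers));
    }
}
```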

1.3. collection.stream().toArray(String[]::new)

The new way of converting to an array is no better than the good old collection.toArray(new String[0]). Again, since there are fewer abstractions along the way, the conversion may be more efficient. In any case, you do not need a stream object.

1.4. collection.stream().max(Comparator.naturalOrder()).get()

There is a wonderful method, Collections.max, which for some reason is undeservedly forgotten by many. Calling Collections.max(collection) does the same thing, again with less garbage. If you have a comparator, use Collections.max(collection, comparator). The Collections.max() method is less convenient when you need special handling for an empty collection; in that case the stream is more justified. The chain collection.stream().max(comparator).orElse(null) looks better than collection.isEmpty() ? null : Collections.max(collection, comparator).
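A short sketch of both cases, with the hypothetical helper names max and maxOrNull: the first assumes a non-empty collection (Collections.max throws NoSuchElementException on an empty one), the second covers the empty case through the stream.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Comparator;

class MaxDemo {
    // Non-empty collection, natural order: no stream needed.
    static <T extends Comparable<? super T>> T max(Collection<T> values) {
        return Collections.max(values);
    }

    // Possibly empty collection: here the stream chain reads better.
    static <T> T maxOrNull(Collection<T> values, Comparator<? super T> comparator) {
        return values.stream().max(comparator).orElse(null);
    }

    public static void main(String[] args) {
        System.out.println(max(Arrays.asList(2, 7, 5)));
        System.out.println(maxOrNull(Collections.<Integer>emptyList(), Comparator.<Integer>naturalOrder()));
    }
}
```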

1.5. collection.stream().count()

This is completely inexcusable: there is collection.size()! While in Java 9 count() can work quickly, in Java 8 this call always iterates over all the elements, even when the size is obvious. Do not do this.

2. Item Search

2.1. stream.filter(condition).findFirst().isPresent()

I see this code surprisingly often. Its point is to check whether the condition holds for some element of the stream. There is a dedicated method for exactly this: stream.anyMatch(condition). Why do you need an Optional?

2.2. !stream.anyMatch(condition)

Some will argue here, but I think that the dedicated method stream.noneMatch(condition) is more expressive. And if the condition itself contains a negation, as in !stream.anyMatch(x -> !condition(x)), then it is definitely better to write stream.allMatch(x -> condition(x)). Whoever reads the code will thank you.

2.3. stream.map(condition).anyMatch(b -> b)

Such strange code is also sometimes written, to confuse colleagues. If you see it, know that it is just stream.anyMatch(condition). The same goes for variations on the theme such as stream.map(condition).noneMatch(Boolean::booleanValue) or stream.map(condition).allMatch(Boolean.TRUE::equals).
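The three matching methods from this section side by side, as a minimal sketch. The class MatchDemo and the "negative number" condition are illustrative assumptions, not taken from any real code base.

```java
import java.util.Arrays;
import java.util.List;

class MatchDemo {
    // Instead of filter(x -> x < 0).findFirst().isPresent()
    static boolean hasNegative(List<Integer> numbers) {
        return numbers.stream().anyMatch(x -> x < 0);
    }

    // Instead of !anyMatch(x -> x < 0)
    static boolean hasNoNegative(List<Integer> numbers) {
        return numbers.stream().noneMatch(x -> x < 0);
    }

    // Instead of !anyMatch(x -> !(x >= 0))
    static boolean allNonNegative(List<Integer> numbers) {
        return numbers.stream().allMatch(x -> x >= 0);
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, -2, 3);
        System.out.println(hasNegative(numbers));
        System.out.println(hasNoNegative(numbers));
        System.out.println(allNonNegative(numbers));
    }
}
```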

3. Creating a stream

3.1. Collections.emptyList().stream()

Need an empty stream? It happens. There is a dedicated method for this: Stream.empty(). Performance is the same, but it is shorter and clearer. The emptySet method is no different from emptyList here.

3.2. Collections.singleton(x).stream()

Here too you can make life simpler: if you need a stream of one element, just write Stream.of(x). Again, it does not matter whether it is singleton or singletonList: when a stream has only one element, nobody cares whether the stream is ordered or not.

3.3. Arrays.asList(array).stream()

A continuation of the same theme. People do this for some reason, although Arrays.stream(array) or Stream.of(array) work just as well. If you list the elements explicitly (Arrays.asList(x, y, z).stream()), then Stream.of(x, y, z) will also work. The same goes for EnumSet.of(x, y, z).stream(). You need a stream, not a collection, so create a stream right away.
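The direct factory methods from sections 3.1 to 3.3 can be sketched like this. CreateDemo and its method names are hypothetical; the factories themselves (Stream.empty, Stream.of, Arrays.stream) are standard.

```java
import java.util.Arrays;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CreateDemo {
    // Instead of Collections.<String>emptyList().stream()
    static Stream<String> none() {
        return Stream.empty();
    }

    // Instead of Collections.singleton(x).stream()
    static Stream<String> one(String x) {
        return Stream.of(x);
    }

    // Instead of Arrays.asList(array).stream()
    static Stream<String> fromArray(String[] array) {
        return Arrays.stream(array);
    }

    public static void main(String[] args) {
        System.out.println(none().count());
        System.out.println(one("x").collect(Collectors.toList()));
        System.out.println(fromArray(new String[] {"a", "b"}).collect(Collectors.toList()));
    }
}
```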

3.4. Collections.nCopies(N, "ignored").stream().map(ignored -> new MyObject())

Need a stream of N identical objects? Then nCopies() is your choice. But if you need a stream of N objects each created in the same way, it is cleaner and more efficient to use Stream.generate(() -> new MyObject()).limit(N).
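To make the distinction concrete, here is a minimal sketch. MyObject is an empty stand-in class invented for the example, as are the method names fresh and copies.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class GenerateDemo {
    // Hypothetical placeholder for whatever object you actually need to create.
    static final class MyObject { }

    // N separately created objects: the supplier runs once per element.
    static List<MyObject> fresh(int n) {
        return Stream.generate(() -> new MyObject()).limit(n).collect(Collectors.toList());
    }

    // N references to one and the same object.
    static List<MyObject> copies(int n) {
        return Collections.nCopies(n, new MyObject());
    }

    public static void main(String[] args) {
        // MyObject does not override equals, so a HashSet counts distinct instances.
        System.out.println(new HashSet<>(fresh(3)).size());
        System.out.println(new HashSet<>(copies(3)).size());
    }
}
```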

3.5. IntStream.range(from, to).mapToObj(idx -> array[idx])

Need a stream of a slice of an array? There is a dedicated method, Arrays.stream(array, from, to). Again, it is shorter and produces less garbage, and since the array is no longer captured by a lambda, it does not have to be effectively final. Clearly, if from is 0 and to is array.length, then all you need is Arrays.stream(array), and the code becomes more pleasant even if mapToObj contains something more complex. For example, IntStream.range(0, strings.length).mapToObj(idx -> strings[idx].trim()) easily turns into Arrays.stream(strings).map(String::trim).

A trickier variation on the theme is IntStream.range(0, Math.min(array.length, max)).mapToObj(idx -> array[idx]). With a little thought, you realize that this is Arrays.stream(array).limit(max).
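The array variants from this section, as a small sketch with hypothetical helper names (slice, firstAtMost, trimmed):

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class SliceDemo {
    // Instead of IntStream.range(from, to).mapToObj(idx -> array[idx])
    static List<String> slice(String[] array, int from, int to) {
        return Arrays.stream(array, from, to).collect(Collectors.toList());
    }

    // Instead of IntStream.range(0, Math.min(array.length, max)).mapToObj(idx -> array[idx])
    static List<String> firstAtMost(String[] array, int max) {
        return Arrays.stream(array).limit(max).collect(Collectors.toList());
    }

    // Instead of IntStream.range(0, strings.length).mapToObj(idx -> strings[idx].trim())
    static List<String> trimmed(String[] strings) {
        return Arrays.stream(strings).map(String::trim).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String[] letters = {"a", "b", "c", "d"};
        System.out.println(slice(letters, 1, 3));
        System.out.println(firstAtMost(letters, 10));
        System.out.println(trimmed(new String[] {" x ", "y "}));
    }
}
```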

4. Unnecessary and complex collectors

Sometimes people study collectors and try to do everything through them. However, they are not always needed.

4.1. stream.collect(Collectors.counting())

Many collectors are needed only as downstream collectors in complex cascaded operations like groupingBy. The counting() collector is one of them. Write stream.count() and spare yourself the trouble. Again, while in Java 9 count() can sometimes run in constant time, the collector will always count the elements one by one. And in Java 8 the counting() collector also boxes needlessly (I fixed this in Java 9). In the same category are the collectors maxBy() and minBy() (there are the methods max() and min()), reducing() (use reduce()), and mapping() (just add a map() step and then use the downstream collector directly). Java 9 added filtering() and flatMapping(), which likewise duplicate the corresponding intermediate operations.

4.2. groupingBy(classifier, collectingAndThen(maxBy(comparator), Optional::get))

A frequent task: group the elements by a classifier and pick the maximum in each group. In SQL this is simply SELECT classifier, MAX(...) FROM ... GROUP BY classifier. Apparently trying to carry over their SQL experience, people reach for the same groupingBy in the Stream API. It would seem that groupingBy(classifier, maxBy(comparator)) should work, but no: the maxBy collector returns an Optional. Yet we know that the nested Optional is never empty, since each group contains at least one element. So you have to add ugly steps like collectingAndThen, and the whole thing starts to look quite monstrous.

However, taking a step back, you can see that groupingBy is not needed here. There is another great collector, toMap, and it is exactly what you need. We simply want to collect the elements into a Map where the key is the classifier and the value is the element itself. In the case of a duplicate key, we keep the larger element. For this, by the way, there is BinaryOperator.maxBy(comparator), which can be statically imported instead of the collector of the same name. As a result we have toMap(classifier, identity(), maxBy(comparator)).

If you are reaching for groupingBy with maxBy, minBy or reducing as the downstream collector (possibly with an intermediate mapping), take a look at the toMap collector: it may be a better fit.
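Here is one way the toMap variant might look on a concrete task. The task itself (longest word per first letter), the class name and the method name are assumptions chosen for illustration, and the sketch assumes that no word is empty.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collectors;

class LongestByInitial {
    // Keeps, for every first letter, the longest word starting with it.
    // Assumes every word has at least one character.
    static Map<Character, String> longestByFirstLetter(List<String> words) {
        Comparator<String> byLength = Comparator.comparingInt(String::length);
        return words.stream().collect(Collectors.toMap(
                w -> w.charAt(0),                 // the classifier becomes the key
                Function.identity(),              // the element itself is the value
                BinaryOperator.maxBy(byLength))); // on a key clash keep the larger one
    }

    public static void main(String[] args) {
        System.out.println(longestByFirstLetter(
                Arrays.asList("apple", "avocado", "bean", "blueberry")));
    }
}
```

The result is a plain Map of values with no Optional wrapper, which is the point of the replacement.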

5. Do not count what does not need counting

5.1. listOfLists.stream().flatMap(List::stream).count()

This echoes section 1.5. We want to count the total number of elements in nested collections. It seems logical: flatten these collections into a single stream with flatMap and count. However, in most cases the sizes of the nested lists are already known; they are stored in a field and easily accessible through the size() method. A small modification will significantly speed up the operation: listOfLists.stream().mapToInt(List::size).sum(). If you are worried about int overflow, mapToLong will also work.
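A minimal sketch of the size-based variant, wrapped in a hypothetical TotalSize class so that it can be run as is:

```java
import java.util.Arrays;
import java.util.List;

class TotalSize {
    // Sums the already known sizes instead of flattening and counting.
    static int total(List<List<String>> listOfLists) {
        return listOfLists.stream().mapToInt(List::size).sum();
    }

    public static void main(String[] args) {
        List<List<String>> nested = Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("c"));
        System.out.println(total(nested));
    }
}
```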

5.2. if (stream.filter(condition).count() > 0)

Again, a fun way to write stream.anyMatch(condition). But unlike the rather harmless 2.1, here you lose short-circuiting: all the elements will be traversed even if the condition held for the very first one. Similarly, if you check filter(condition).count() == 0, it is better to use noneMatch(condition).

5.3. if (stream.count() > 2)

This case is more subtle. You need to know whether the stream has more than two elements. If you care about performance, consider inserting a limit: stream.limit(3).count(). After all, you do not care how many elements there are once there are more than two.
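The limit trick can be sketched as below; MoreThanTwo and hasMoreThanTwo are illustrative names. The infinite stream in main is there only to show that the check stops after three elements.

```java
import java.util.stream.Stream;

class MoreThanTwo {
    // Pulls at most three elements, however long the stream is.
    static boolean hasMoreThanTwo(Stream<?> stream) {
        return stream.limit(3).count() > 2;
    }

    public static void main(String[] args) {
        System.out.println(hasMoreThanTwo(Stream.of(1, 2)));
        System.out.println(hasMoreThanTwo(Stream.of(1, 2, 3, 4)));
        // Terminates even on an infinite stream, because limit(3) short-circuits.
        System.out.println(hasMoreThanTwo(Stream.iterate(0, i -> i + 1)));
    }
}
```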

6. Miscellaneous

6.1. stream.sorted(comparator).findFirst()

What did the author mean? Sort the stream and take the first element. That is the same as taking the minimum element: stream.min(comparator). Sometimes you even see stream.sorted(comparator.reversed()).findFirst(), which amounts to stream.max(comparator). The Stream API implementation does not optimize this (although it could) and does everything exactly as you said: it collects the stream into an intermediate array, sorts it all and gives you the first element. You lose significantly in memory and speed on such an operation. And, of course, the replacement is much clearer.
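As an illustration, here is a small sketch that picks the shortest and the longest word without sorting. The class MinDemo, the method names and the word-length comparator are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

class MinDemo {
    // Instead of words.stream().sorted(byLength).findFirst()
    static Optional<String> shortest(List<String> words) {
        return words.stream().min(Comparator.comparingInt(String::length));
    }

    // Instead of words.stream().sorted(byLength.reversed()).findFirst()
    static Optional<String> longest(List<String> words) {
        return words.stream().max(Comparator.comparingInt(String::length));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pear", "fig", "banana");
        System.out.println(shortest(words));
        System.out.println(longest(words));
    }
}
```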

6.2. stream.map(x -> {counter.addAndGet(x); return x;})

Some people try to perform a side effect in a stream. In general, this is already a warning sign that perhaps you do not need a stream at all. But in any case, there is a dedicated method for this purpose, peek. We write stream.peek(counter::addAndGet).
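A minimal sketch of the peek variant follows. The source does not say what type counter is; an AtomicInteger is assumed here, and PeekDemo and copyAndSum are made-up names. Keep in mind that peek only runs for elements the terminal operation actually pulls through the pipeline, so it is not a reliable place for essential logic.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

class PeekDemo {
    // Copies the numbers into a new list and adds each one to the counter on the way.
    static List<Integer> copyAndSum(List<Integer> numbers, AtomicInteger counter) {
        return numbers.stream()
                .peek(counter::addAndGet) // side effect only; the element passes through unchanged
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        AtomicInteger counter = new AtomicInteger();
        System.out.println(copyAndSum(Arrays.asList(1, 2, 3), counter));
        System.out.println(counter.get());
    }
}
```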

That is all I have. If you come across strange and inefficient ways of using the Stream API, write about them in the comments.

Source: https://habr.com/ru/post/337350/
