I have a hobby: I collect solutions to typical Java tasks that I find on the internet and try to pick the best one in terms of size, performance and elegance, with performance first. Let's consider a task that comes up often in Java programming, converting an InputStream into a string, and the various ways to solve it.
Let's also look at the limitations of each approach (a dependency on a specific library or Java version, correct handling of Unicode, and so on). The English version of this article can be found in my answer on Stack Overflow. The tests are in my GitHub project.
1. JPA and Hibernate in questions and answers
2. Three hundred fifty most popular non-mobile Java opensource projects on github
3. Java collections (standard, guava, apache, trove, gs-collections, and others)
4. Java Stream API
5. Two hundred and fifty Russian-language teaching videos of lectures and reports on Java.
6. List of useful links for Java programmer
7 Typical tasks
7.1 Optimum way to convert an InputStream to a string
7.2 The most productive way to bypass the Map, count the number of occurrences of the substring
8. Libraries for working with Json (Gson, Fastjson, LoganSquare, Jackson, JsonPath and others)
This is a very common task. Let's look at the ways it can be done (there are 11 of them):
1. Using IOUtils.toString from the Apache Commons library. One of the shortest one-liners.
String result = IOUtils.toString(inputStream, StandardCharsets.UTF_8);
2. Using CharStreams from the Guava library. Also pretty short code.
try (InputStreamReader reader = new InputStreamReader(inputStream, Charsets.UTF_8)) {
    String result = CharStreams.toString(reader);
}
3. Using Scanner (JDK). The solution is short and clever and needs only the plain JDK, but it is more of a hack that will puzzle anyone who does not know the trick.
try (Scanner s = new Scanner(inputStream).useDelimiter("\\A")) {
    String result = s.hasNext() ? s.next() : "";
}
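To make the Scanner trick less mysterious: "\\A" is the regex anchor for the beginning of the input, so the delimiter cannot match anywhere inside the stream and the whole content comes back as a single token. Below is a minimal sketch of that idea. The class name ScannerTrick and the helper readAll are hypothetical, and the explicit "UTF-8" argument is my assumption (the snippet above relies on the platform default charset).

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

class ScannerTrick {
    // Hypothetical helper: returns the whole stream as one Scanner token.
    static String readAll(InputStream in) {
        // "\\A" matches only at the start of the input, so no delimiter is
        // found later on and next() returns everything up to end of stream.
        try (Scanner s = new Scanner(in, "UTF-8").useDelimiter("\\A")) {
            // An empty stream has no token at all, hence the hasNext() check.
            return s.hasNext() ? s.next() : "";
        }
    }

    public static void main(String[] args) {
        String text = "first line\r\nsecond line";
        InputStream in = new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
        // Line breaks are left untouched because they are not delimiters here.
        System.out.println(readAll(in).equals(text));
    }
}
```

Unlike the line-based solutions further down, this approach does not touch line breaks.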
4. Using the Stream API (Java 8). Warning: this solution converts different line breaks (such as \r\n) to \n, which can sometimes be critical.
try (BufferedReader br = new BufferedReader(new InputStreamReader(inputStream))) {
    String result = br.lines().collect(Collectors.joining("\n"));
}
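The line-break warning is easy to check. The sketch below feeds solution 4 an input that mixes \r\n, \r and \n. The class name LinesDemo and the helper readWithLines are hypothetical, and I pass UTF-8 explicitly (an assumption, the original snippet uses the platform default charset).

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.stream.Collectors;

class LinesDemo {
    // Hypothetical helper wrapping the Stream API solution.
    static String readWithLines(InputStream in) {
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            // lines() strips every line terminator; joining() puts back "\n" only.
            return br.lines().collect(Collectors.joining("\n"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String original = "a\r\nb\rc\n";
        InputStream in = new ByteArrayInputStream(original.getBytes(StandardCharsets.UTF_8));
        String result = readWithLines(in);
        System.out.println(result.equals("a\nb\nc")); // prints "true"
        System.out.println(original.length() + " -> " + result.length()); // prints "7 -> 5"
    }
}
```

Note that the trailing line break is lost as well, so the result is not byte-for-byte identical to the input even when the input already uses \n everywhere.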
5. Using the parallel Stream API (Java 8). Warning: like solution 4, it converts different line breaks (such as \r\n) to \n.
try (BufferedReader br = new BufferedReader(new InputStreamReader(inputStream))) {
    String result = br.lines().parallel().collect(Collectors.joining("\n"));
}
6. Using InputStreamReader and StringBuilder from the plain JDK
final int bufferSize = 1024;
final char[] buffer = new char[bufferSize];
final StringBuilder out = new StringBuilder();
try (Reader in = new InputStreamReader(inputStream, "UTF-8")) {
    for (;;) {
        int rsz = in.read(buffer, 0, buffer.length);
        if (rsz < 0)
            break;
        out.append(buffer, 0, rsz);
    }
    return out.toString();
}
7. Using StringWriter and IOUtils.copy from Apache Commons
try (StringWriter writer = new StringWriter()) {
    IOUtils.copy(inputStream, writer, "UTF-8");
    return writer.toString();
}
8. Using ByteArrayOutputStream and inputStream.read from the JDK
try (ByteArrayOutputStream result = new ByteArrayOutputStream()) {
    byte[] buffer = new byte[1024];
    int length;
    while ((length = inputStream.read(buffer)) != -1) {
        result.write(buffer, 0, length);
    }
    return result.toString("UTF-8");
}
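One detail worth spelling out about solution 8: the bytes are only collected in the loop and decoded once at the very end, so a multi-byte character that happens to be split between two read() calls is still decoded correctly. Here is a small sketch that forces such splits with a deliberately tiny buffer. The class name ByteBufferDemo and the bufferSize parameter are hypothetical and exist only for the demonstration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class ByteBufferDemo {
    // Same loop as solution 8, with the buffer size exposed (hypothetical parameter).
    static String read(InputStream in, int bufferSize) {
        try {
            ByteArrayOutputStream result = new ByteArrayOutputStream();
            byte[] buffer = new byte[bufferSize];
            int length;
            while ((length = in.read(buffer)) != -1) {
                // Raw bytes are copied as-is; nothing is decoded inside the loop.
                result.write(buffer, 0, length);
            }
            // Decoding happens once, over the complete byte sequence.
            return result.toString("UTF-8");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A Russian word ("privet") written with Unicode escapes: 6 chars, 12 bytes in UTF-8.
        String text = "\u043f\u0440\u0438\u0432\u0435\u0442";
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        // A 3-byte buffer cuts through the middle of 2-byte characters.
        String result = read(new ByteArrayInputStream(bytes), 3);
        System.out.println(result.equals(text)); // prints "true"
    }
}
```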
9. Using BufferedReader from the JDK. Warning: this solution converts different line breaks (such as \r\n) to the value of the line.separator system property (for example, "\r\n" on Windows).
String newLine = System.getProperty("line.separator");
try (BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream))) {
    StringBuilder result = new StringBuilder();
    String line;
    boolean flag = false;
    while ((line = reader.readLine()) != null) {
        result.append(flag ? newLine : "").append(line);
        flag = true;
    }
    return result.toString();
}
10. Using BufferedInputStream and ByteArrayOutputStream from the JDK
try (BufferedInputStream bis = new BufferedInputStream(inputStream);
     ByteArrayOutputStream buf = new ByteArrayOutputStream()) {
    int result = bis.read();
    while (result != -1) {
        buf.write((byte) result);
        result = bis.read();
    }
    return buf.toString();
}
11. Using inputStream.read() and StringBuilder (JDK). Warning: this solution does not handle Unicode text correctly, for example Russian text.
int ch;
StringBuilder sb = new StringBuilder();
while ((ch = inputStream.read()) != -1)
    sb.append((char) ch);
return sb.toString();
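Why does solution 11 break on Russian text? inputStream.read() returns a single byte, and the cast to char turns every byte into a separate character, while in UTF-8 a Cyrillic letter takes two bytes. A short sketch of the effect follows. The class name ByteCastDemo and the helper readByteByByte are hypothetical, and the input is assumed to be UTF-8 encoded.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class ByteCastDemo {
    // Hypothetical helper wrapping solution 11.
    static String readByteByByte(InputStream in) {
        try {
            int ch;
            StringBuilder sb = new StringBuilder();
            while ((ch = in.read()) != -1) {
                // Each byte (0..255) becomes its own char; no decoding takes place.
                sb.append((char) ch);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // ASCII survives because every ASCII character is exactly one byte.
        byte[] ascii = "abc".getBytes(StandardCharsets.UTF_8);
        System.out.println(readByteByByte(new ByteArrayInputStream(ascii))); // prints "abc"

        // Two Cyrillic letters (U+0434, U+0430) are four bytes in UTF-8: D0 B4 D0 B0.
        byte[] russian = "\u0434\u0430".getBytes(StandardCharsets.UTF_8);
        String broken = readByteByByte(new ByteArrayInputStream(russian));
        System.out.println(broken.length()); // prints "4" instead of 2
    }
}
```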
Solutions 4, 5 and 9 convert different line breaks into one.
Solution 11 does not work with Unicode text.
Solutions 1 and 7 require the Apache Commons library, solution 2 requires the Guava library, and solutions 4 and 5 require Java 8 or higher.
Warning: performance measurements always depend heavily on the system, the measurement conditions and so on. I measured on two different computers: one with Windows 8.1, Intel i7-4790 CPU 3.60GHz * 2, 16Gb, the other with Linux Mint 17.2, Celeron Dual-Core T3500 2.10Ghz * 2, 6Gb. I cannot guarantee that the results are absolutely accurate, and you can always repeat the tests (test1 and test2) on your own system.
Performance measurements for short strings (length = 175). The tests can be found on GitHub (mode = average execution time (AverageTime), system = Linux Mint 17.2, Celeron Dual-Core T3500 2.10Ghz * 2, 6Gb; lower is better, 1,343 is the best):
Benchmark                                       Mode  Cnt     Score      Error  Units
8. ByteArrayOutputStream and read (JDK)         avgt   10     1,343 ±    0,028  us/op
6. InputStreamReader and StringBuilder (JDK)    avgt   10     6,980 ±    0,404  us/op
10. BufferedInputStream, ByteArrayOutputStream  avgt   10     7,437 ±    0,735  us/op
11. InputStream.read() and StringBuilder (JDK)  avgt   10     8,977 ±    0,328  us/op
7. StringWriter and IOUtils.copy (Apache)       avgt   10    10,613 ±    0,599  us/op
1. IOUtils.toString (Apache Utils)              avgt   10    10,605 ±    0,527  us/op
3. Scanner (JDK)                                avgt   10    12,083 ±    0,293  us/op
2. CharStreams (guava)                          avgt   10    12,999 ±    0,514  us/op
4. Stream Api (Java 8)                          avgt   10    15,811 ±    0,605  us/op
9. BufferedReader (JDK)                         avgt   10    16,038 ±    0,711  us/op
5. parallel Stream Api (Java 8)                 avgt   10    21,544 ±    0,583  us/op
Performance measurements for long strings (length = 50100). The tests can be found on GitHub (mode = average execution time (AverageTime), system = Linux Mint 17.2, Celeron Dual-Core T3500 2.10Ghz * 2, 6Gb; lower is better, 200,715 is the best):
Benchmark                                       Mode  Cnt     Score      Error  Units
8. ByteArrayOutputStream and read (JDK)         avgt   10   200,715 ±   18,103  us/op
1. IOUtils.toString (Apache Utils)              avgt   10   300,019 ±    8,751  us/op
6. InputStreamReader and StringBuilder (JDK)    avgt   10   347,616 ±  130,348  us/op
7. StringWriter and IOUtils.copy (Apache)       avgt   10   352,791 ±  105,337  us/op
2. CharStreams (guava)                          avgt   10   420,137 ±   59,877  us/op
9. BufferedReader (JDK)                         avgt   10   632,028 ±   17,002  us/op
5. parallel Stream Api (Java 8)                 avgt   10   662,999 ±   46,199  us/op
4. Stream Api (Java 8)                          avgt   10   701,269 ±   82,296  us/op
10. BufferedInputStream, ByteArrayOutputStream  avgt   10   740,837 ±    5,613  us/op
3. Scanner (JDK)                                avgt   10   751,417 ±   62,026  us/op
11. InputStream.read() and StringBuilder (JDK)  avgt   10  2919,350 ± 1101,942  us/op
Graph of average time versus string length (Windows 8.1, Intel i7-4790 CPU 3.60GHz, 16Gb):
Table of average time versus string length (Windows 8.1, Intel i7-4790 CPU 3.60GHz, 16Gb):
Test         182       546      1092      3276      9828     29484     58968
test8       0.38     0.938     1.868     4.448    13.412    36.459    72.708
test4      2.362     3.609     5.573    12.769     40.74    81.415   159.864
test5      3.881     5.075     6.904    14.123    50.258   129.937   166.162
test9      2.237     3.493     5.422    11.977     45.98    89.336    177.39
test6      1.261      2.12      4.38    10.698    31.821    86.106   186.636
test7      1.601     2.391     3.646     8.367    38.196   110.221   211.016
test1      1.529     2.381     3.527     8.411    40.551    105.16   212.573
test3      3.035     3.934     8.606    20.858    61.571   118.744   235.428
test2      3.136     6.238    10.508     33.48    43.532   118.044   239.481
test10     1.593     4.736     7.527    20.557    59.856   162.907   323.147
test11     3.913    11.506     23.26    68.644   207.591   600.444  1211.545
The fastest solution in all cases and on all systems was test 8: using ByteArrayOutputStream and inputStream.read from the JDK
ByteArrayOutputStream result = new ByteArrayOutputStream();
byte[] buffer = new byte[1024];
int length;
while ((length = inputStream.read(buffer)) != -1) {
    result.write(buffer, 0, length);
}
return result.toString("UTF-8");
A short and very fast solution is IOUtils.toString from Apache Commons
String result = IOUtils.toString(inputStream, StandardCharsets.UTF_8);
The Java 8 Stream API shows middling results, and parallel streams only make sense when the string is quite large; otherwise they are noticeably slower (which was generally to be expected).
Source: https://habr.com/ru/post/278233/