⬆️ ⬇️

Multithreading: which way to think

Snake-gorynych from Sun

Not so long ago, by some coincidence, I took part in a kind of competition, which quickly turned into research. This study has produced results that will be of interest to Java blog readers and Algorithms equally. Due to the impossibility to place in two places at once, I decided to divide this post into two parts. As the Captain probably tells you, this part will tell you about the results regarding Java.

By the way, from the research team not only I am registered on Habré: if you want to express gratitude, then do not forget about markiz and icekeeper .



And what is it all about?

It will talk about what you need to keep in mind, parallelizing some calculations or tasks. I will consider several options for implementing such calculations and explain why this or that solution is more effective. So let's go.



Oh, well, these are your streams!

Let's start with the simplest: We have a computer with one core, on which many tasks are required. Let's create some Task class, which will have a work method that performs some actions depending on the identifier of the task filed; as well as a class that will perform many tasks:

public class Archiver {

public static long count = 0 ;

class Task {

// ...

void work ( String taskId ) {

count + = something ( taskId ) ;

count + = somethingElse ( taskId ) ;

count + = commit ( taskId ) ;

}

}

public static void main ( String [ ] args ) {

for ( String id : args )

new Task ( ) . work ( id ) ;

System . out . println ( Task. count ) ;

}

}




For the time being, everything was fine, but our server was upgraded, and now we have a true dual-core processor! Of course, now we need to fix this code and make it work in several threads.

')

Don't over-synchronize, young padawan

A little thought, we rushed to do what we wanted and wrote the following code:

public class Archiver {

public static long count = 0 ;

class Task implements Runnable {

// ...

public void run ( ) {

timeout = 0 ;

while ( timeout < 10 ) {

if ( haveMoreTasks ( ) ) {

taskId = getTask ( ) ;

count + = something ( taskId ) ;

count + = somethingElse ( taskId ) ;

count + = commit ( taskId ) ;

} else {

Thread . sleep ( 10 ) ;

timeout ++;

}

}

}

}



private static LinkedList < String > tasks ;



synchronized public static String getTask ( ) {

return tasks. pollLast ( ) ;

}



synchronized public static boolean haveMoreTasks ( ) {

return tasks. size ( ) > 0 ;

}



publicized static static void addTask ( String taskId ) {

tasks. add ( task ) ;

}



public static void main ( String [ ] args ) {

for ( String id : args )

addTask ( id ) ;

int threadNum = Runtime . getRuntime ( ) . availableProcessors ( ) ;



for ( int i = 0 ; i < threadNum ; i ++ )

( threads [ i ] = new thread ( new Task ( ) ) ) . start ( ) ;



boolean finished = false ;

while ( ! finished ) {

finished = true ;

for ( int i = 0 ; i < threadNum ; i ++ )

finished & = ! threads [ i ] . isAlive ( ) ;

try {

Thread . sleep ( 10 ) ;

} catch ( Exception e ) {

e. printStackTrace ( ) ;

}

}

System . out . println ( count ) ;

}

}




To our great regret, this code has several drawbacks. First, it may not work correctly:
  1. long is not an atomic type. You need to use AtomicLong ; <Also, access to the count can be synchronized, and this way:

    long delta = something ( taskId ) ;

    delta + = somethingElse ( taskId ) ;

    delta + = commit ( taskId ) ;

    synchronized ( count ) {

    count. + = delta ;

    }
    The calculation of the result from the synchronized block follows because it can be performed for a very long time, thereby blocking the work of other threads;
  2. Let one element in the queue, and both threads simultaneously complete their work. Then one of them will get true in response to haveMoreTasks () , then the second will get the same answer, after which the first will pick up the value, and the second will break off and throw an exception.




But these errors are not as significant as another: if there are a lot of tasks, and they are not executed for such a long time, then this will work longer than the single-threaded version. Why? The answer is simple, but we will reach it only by writing the solution as it should.



How to do it right

Fortunately, Java has many tools for parallelizing data. Including - ExecutorService . The code below works much faster than all other options invented:

public class Archiver implements Runnable {

public static AtomicLong count = AtomicLong (0 ) ;

public static String [ ] tasks ;

class Task implements Runnable {

Task ( String taskId ) {

this . taskId = taskId ;

}

// ...

public void run ( ) {

long delta = something ( taskId ) ;

delta + = somethingElse ( taskId ) ;

delta + = commit ( taskId ) ;

synchronized ( count ) {

count. addAndGet ( delta ) ;

}

}

}



public void run ( ) {

int processors = runtime . getRuntime ( ) . availableProcessors ( ) ;

ExecutorService pool = Executors. newFixedThreadPool ( processors ) ;

for ( String id : tasks )

pool. execute ( new Task ( id ) ) ;

pool. shutdown ( ) ;

try {

if ( ! pool. awaitTermination ( 20 , TimeUnit. MINUTES ) )

throw new InterruptedException ( "Time Limit Exceeded" ) ;

} catch ( InterruptedException e ) {

e. printStackTrace ( ) ;

pool. shutdownNow ( ) ;

System . exit ( 1 ) ;

}

System . out . println ( count ) ;

}



public static void main ( String [ ] args ) {

tasks = args ;

new thread ( new archiver ( ) ) . start ( ) ;

}

}




If you look at the source code, you can find only one really serious ideological difference: there is some entity here that distributes tasks to the released threads. The gain in time - only due to less synchronization.

Synchronization is slow. It must be minimized!



Afterword

In Java, there are many excellent built-in tools for performing certain tasks, but people continue to reinvent the wheel. In this article, it was shown that this is not only ineffective, but sometimes leads to additional intractable errors. For those who have not heard of ExecutorService and everything else, I strongly advise you to read the excellent book Effective Java (2nd Edition) by Joshua Blosh, in particular, its tenth chapter.

Source: https://habr.com/ru/post/73249/



All Articles