
Parallel and distributed computing. Yandex lectures for those who want to spend the holidays productively

The holiday week is coming to an end, but we continue to publish lectures from the Yandex School of Data Analysis for those who want to spend their time productively. Today it is the turn of a course whose importance is hard to overestimate these days: "Parallel and Distributed Computing."

What's inside: an introduction to parallel computing and to distributed data processing and storage systems, along with practical skills in using the relevant technologies. The course consists of four main blocks: concurrency, parallel computing, parallel processing of large data sets, and distributed computing.


The lectures are given by Oleg Viktorovich Sukhoroslov, senior researcher at the Center for Grid Technologies and Distributed Computing, ISA RAS, associate professor at the Department of Distributed Computing, FIFT MIPT, and Candidate of Technical Sciences.


Concurrency (simultaneity).


Applications and problems. Ways to implement concurrent systems, processes and threads, software tools. Basics of multithreaded programming using C++ and Java as examples. Typical errors in multithreaded programming. Mutual exclusion and condition synchronization. Memory model and low-level synchronization primitives. Alternative approaches to implementing concurrent programs.

Parallel computing.


Applications and problems. Modern parallel computing systems. Theoretical foundations of parallel computing. Quality metrics for parallel algorithms. Design principles and typical structures of parallel algorithms. PCAM methodology. Parallel programming systems. Typical programming models and patterns. Basics of parallel programming on shared-memory systems using OpenMP as an example. Basics of parallel programming on distributed-memory systems using MPI as an example.

Parallel processing of large data sets.


The Big Data phenomenon. MapReduce programming model. Principles of parallel implementation of computations. Scope and examples of tasks. Principles of distributed implementation of MapReduce on cluster systems. Apache Hadoop platform. Application programming interfaces and implementation of programs for Hadoop. Local debugging and running programs on a cluster. Techniques and strategies for implementing MapReduce programs. High-level languages and tools for working with Hadoop. Practical examples of the use of MapReduce. Limitations of the MapReduce model, extensions, and alternative approaches.

Distributed systems and computing.


Applications, features, and types of distributed systems. Problems of building distributed systems. Theoretical foundations of distributed computing, examples of distributed algorithms. Methods of interaction between distributed processes, network protocols. Distributed programming technologies. Introduction to the Erlang language. Distributed data storage systems, data replication, NoSQL systems. Distributed computing technologies, grids, volunteer computing. Cloud computing systems.


Update: all lectures of the course “Parallel and Distributed Computing” are available as an open folder on Yandex.Disk.

Source: https://habr.com/ru/post/208244/

