Purpose of this article: to share my experience of building three Matlab computing clusters and administering them remotely.
A short introduction
When studying or modeling various natural phenomena (and not only those), you occasionally need more computing power than a home PC can provide, no matter how powerful it is. Eventually I ran into this need myself.
Simulations that involve solving systems of nonlinear differential equations over a long interval of relative time take a lot of CPU time, so I decided to "split" the whole thing across several machines.
So, first things first.
Hardware available:
At home: a desktop (Phenom II X4 840, Windows 7 x64) and a laptop (Athlon II Dual-Core M320, Windows 7 x64), connected to the same network by a good old DIR-300 router.
At my girlfriend's place: a desktop (i5 4440, Windows 7 x64).
At work: 10 computers (Athlon II Dual-Core, Windows XP x86) networked together in one room and 4 more (Athlon II Dual-Core, Windows XP x86) networked together in another. There is no local network between the rooms.
All of these machines have Internet access.
Creating the two clusters at work
The article describes how to create a cluster, but it leaves out many pitfalls that nearly buried my idea (even though I followed those wonderful instructions to the letter; thanks to the author!).
First of all, Matlab has to be installed correctly. The point is not to "hold your breath during installation" or to pick the "right" components, but that there are two versions of Matlab: a server one and a local one. If you install only one of them, the cluster will not work.
This article has information that helps sort out the question of versions. Note, however, that in the newer Matlab R2013b the installer works a little differently from the screenshots in that article. First install the local version with the Parallel Computing Toolbox, and only then install the server version with the Distributed Computing Server into a different folder. Otherwise you will get an error when starting the Parallel Computing Toolbox:
Starting parallel pool (parpool) using the 'MJSProfileXXX' profile ...
Error using parpool (line 111)
Failed to start a parallel pool. (For information in addition to the causing error, validate the profile 'MJSProfile8' in the Cluster Profile Manager.)
Error in parallel.internal.ui.PoolHelper.startPool (line 11)
parpool();
Caused by:
Error using parallel.internal.pool.InteractiveClient/start (line 326)
Failed to start pool.
Error using parallel.Job/submit (line 304)
All dimension arguments must be greater than zero
This error comes up often on various forums, but nobody says how to get rid of it. It occurs when the Parallel Computing Toolbox is launched from the server version of Matlab (the Start Parallel Pool button).

Therefore, parallel / cluster calculations must be started on the Master from the local version of Matlab.
The R2013b installer did not let me install both versions of Matlab into one folder, as the article above recommends, and as it turned out, it was right not to!
After the full installation there should be two folders on the Master. The first contains the local version of Matlab, which registers all the Matlab file extensions and creates the shortcuts; the second contains the Matlab Distributed Computing Server.
(The second folder is self-contained and can be copied to all the computers on the local network to save time when deploying the cluster.) The computer on which both versions of Matlab are installed will be considered the Master, since our program will be run from it.
On the other computers of this local network you only need to install the server version of Matlab (or simply copy the second folder from the Master into any directory on them).
Now you can safely use the article above to create a local Matlab cluster. Note, however, that the first time you run the !mdce install and !mdce start commands, you should run them from the Matlab window itself, no matter which version. This avoids the error about missing vcredist64 / 86 libraries: when !mdce install is run from the Matlab window, they are installed automatically. Otherwise the mdce service, if called for example from a batch file, may simply fail to start because the libraries are missing.
These commands must be run from the bin folder of the second folder (that is where the mdce.bat file lives).
Personally, three commands on the Master were enough for me to start the cluster:
!mdce install
!mdce start
!admincenter
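To confirm that the service actually came up, something like the following can be run from the same Matlab window. This is only a sketch under two assumptions: the current folder is the bin folder that contains mdce.bat, and your release supports the status subcommand of mdce (check the mdce documentation for your release if in doubt).

```matlab
% Assumption: the current folder is the bin folder that holds mdce.bat.
% Assumption: "mdce status" is available in your release.
[exitCode, output] = system('mdce status');   % same effect as typing !mdce status
disp(output);
if exitCode ~= 0
    % Treating a non-zero exit code as "not running" is a guess;
    % read the printed text before trusting it.
    warning('mdce status returned exit code %d', exitCode);
end
```

Capturing the output with system() rather than the ! shortcut makes it possible to log the result when the same check is repeated on every machine.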
From the Admin Center you can then create a scheduler and distribute workers across the computers. But here there is another pitfall.
Damn firewall! I strongly recommend turning it off completely, with all inbound and outbound rules and all exceptions. This was the only way I could get the Admin Center to add all the computers on the local network. By the way, when adding a computer on the local network you can use its name, for example Siegurd-PC, without worrying about the dash. At least in the latest version of Matlab it works.
When adding computers to the Admin Center, the mdce service must already be running on each of them and visible in the process list. Matlab itself can be closed on those computers, since it takes no part in the work.
The Admin Center can start the mdce service remotely, but I never managed to make this work. Perhaps the cause is a lack of administrative rights to the folders of the computers on the local network, but this is not important and does not affect the task in any way.
And yes, when you start mdce, there will most likely be the following messages:
Setting permissions on LOGBASE C:\Windows\TEMP\MDCE\Log
Setting permissions on CHECKPOINTBASE C:\Windows\TEMP\MDCE\Checkpoint
Setting permissions on SECURITY_DIR C:\Windows\TEMP\MDCE\Checkpoint\security
Unable to give the "Administrators" group full control for C:\Windows\TEMP\MDCE\Checkpoint\security
You may need to manually give the "Administrators" group read and write access to this directory.
Unable to give the "CREATOR OWNER" group full control for C:\Windows\TEMP\MDCE\Checkpoint\security
You may need to manually give the "CREATOR OWNER" group read and write access to this directory.
Unable to give the "Authenticated Users" group traversal rights for C:\Windows\TEMP\MDCE\Checkpoint\security
You may need to manually give the "Authenticated Users" group traversal rights to this directory.
I simply ignore them, since they do not affect the cluster's operation. The article on correct installation describes in detail how to cure these errors, which are related to the Russian-language OS.
Important! When using the Admin Center to create workers, the developers recommend checking that port 7 is open if errors occur.
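One way to probe a port from the Master without extra tools is to use the Java classes that ship with Matlab. The sketch below is not taken from the Matlab documentation: it assumes that a plain TCP connection attempt is a meaningful test for port 7, and the host address is a made-up placeholder.

```matlab
host      = '192.168.0.10';   % hypothetical address of a cluster node
port      = 7;                % the port mentioned above
timeoutMs = 2000;             % give up after 2 seconds

isOpen = false;
try
    sock = java.net.Socket();
    % Attempt a TCP connection; a Java exception here lands in the catch block.
    sock.connect(java.net.InetSocketAddress(host, port), timeoutMs);
    sock.close();
    isOpen = true;
catch err
    fprintf('Could not connect to %s:%d (%s)\n', host, port, err.identifier);
end
fprintf('Port %d on %s reachable: %d\n', port, host, isOpen);
```

A failed attempt only says that nothing answered over TCP within the timeout; it does not tell you whether a firewall or the absence of a listening service is the cause.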
Creating the scheduler
In the Admin Center, right-click any of the added computers and select Start MJS in the context menu, or click Start in the Admin Center itself:

After creating the scheduler, we similarly add workers to each computer.
That is all there is to setting up a cluster on a local network.
Running calculations
To start cluster calculations, run the local version of Matlab on the Master and add the scheduler. In the main Matlab window, click Discover Clusters ...

Then search for a previously created scheduler on the local network:

After adding it, you need to select the number of workers in the settings of the parallel profile!

Go to Parallel Preferences and select the number of workers to which you should connect.
Important! If the number of workers set in the preferences is smaller than the number created in the Admin Center, the calculations will run only on the specified number of workers. That is, if you created 20 workers and the setting is 4, only the first 4 workers in the Admin Center list will do the work. The status of the workers you connected to should change from idle to busy.
If the set number of workers is greater than that created, Matlab will connect to all existing workers without errors.
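The same selection can also be made from the command line instead of the preferences dialog. The following is a minimal sketch, assuming the scheduler has already been discovered and saved as a profile; the profile name and the worker count are placeholders, not values from my setup.

```matlab
profileName = 'MJSProfile1';   % hypothetical profile name; use the one Matlab created for you
requested   = 4;               % hypothetical number of workers to connect to

c = parcluster(profileName);   % cluster object for the discovered scheduler
fprintf('Idle workers: %d, busy workers: %d\n', c.NumIdleWorkers, c.NumBusyWorkers);

pool = parpool(c, requested);  % open a pool of the requested size on that cluster
fprintf('Pool started with %d workers\n', pool.NumWorkers);

% ... run parallel code here ...

delete(pool);                  % release the workers when finished
```

Printing the idle and busy counts before and after opening the pool is a quick way to see the status change described above without switching to the Admin Center.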
After all this, you can safely run your code, which will be parallelized across all the workers of the current scheduler (cluster). (I personally used the parfor loop for this, but there are other commands.)
This scheme runs in both rooms, and the Masters are administered remotely from home via TeamViewer. Unfortunately, the two clusters are not connected to each other, since that requires a fully connected network (every machine with every other), and I could not manage to set up a VPN between that many computers.
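For illustration, here is what a parfor parameter sweep over an ODE system might look like. It is not my actual model: the Van der Pol equation, the parameter range, the time span, and the variable names are all assumptions chosen only to show the pattern of independent iterations.

```matlab
% Hypothetical sweep: each iteration integrates an independent ODE system,
% so the iterations can be handed out to different workers.
mu = linspace(0.5, 5, 20);            % assumed parameter values (Van der Pol damping)
finalState = zeros(numel(mu), 2);     % one row of results per parameter value

parfor k = 1:numel(mu)
    m = mu(k);
    % Van der Pol oscillator written as a first-order system
    rhs = @(t, y) [y(2); m * (1 - y(1)^2) * y(2) - y(1)];
    [~, y] = ode45(rhs, [0 50], [2; 0]);
    finalState(k, :) = y(end, :);     % keep only the state at the final time
end

disp(finalState);
```

The pattern works because no iteration depends on the result of another; a single long integration cannot be split this way, only a set of independent runs.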
Creating a home cluster over a VPN
In many ways the deployment was similar to the previous ones, but using a VPN created several obstacles, which I got rid of only after a long struggle.
To join the home computers and my girlfriend's computer into one network I used the well-known Hamachi. The current version lets you add 5 machines to a network for free.
How to do it right: install Hamachi on all the computers. Create a virtual network on any one of them and connect all the others to it. Again, disable the firewall if it is still on, then start the services and the scheduler ...
When adding computers to the Admin Center, add them by IP address, not by name. This is important! The Master computer, from which the calculations will be started in the local version of Matlab, should be added by name.
In the Hamachi settings, disable traffic encryption and traffic compression, and set traffic filtering to allow all. Only then did all the workers get the Connected status. Before that the workers were created, but their status was Failed to connect.
Once all the workers have the Connected status, you can safely begin the calculations using the instructions described above.
Note. Hamachi is good but unstable, especially when you experiment with network settings. If you suddenly cannot connect to a remote computer, I recommend restarting both machines (the Master and the computer that cannot be reached); if that does not help, reinstall Hamachi.
Oh, and one more thing. Once installed and started, the mdce service will start automatically every time the computer is turned on, until it is stopped. However, when network settings change or conflicts arise, I advise restarting the service with these commands:
!mdce stop
!mdce start
Results
So, having overcome all these subtleties, I set up 3 computing clusters that are administered remotely through TeamViewer from any computer at any time. The main thing: do not forget to turn off sleep and automatic power-off on all the computers!
Once again I want to thank the authors of the 2 articles used above. Thank you! Without your work I would never have managed to get these clusters up!
I hope my article helps people who wanted to create a Matlab computing cluster but stumbled over the pitfalls and could not do it.
PS: If anyone knows how to create a fully connected, free VPN on Windows, please let me know. It would help many scientists organize and carry out serious scientific research.
Good luck with the deployment of clusters!
Thanks for your attention!