There are currently a great many distributed computing networks; I counted about 30. The largest ones are Folding@home, BOINC, SETI@home, Einstein@Home, and Rosetta@home (several dozen dissertations have been written based on the results of their computations). They compute everything that lends itself to distributed computation, from recovering passwords from MD5 hashes to simulating protein folding.
Each of these networks has very high performance and includes millions of nodes. The performance of each is comparable to that of a supercomputer:
- Rosetta@home - more than 110 TFlops
- Einstein@Home - more than 355 TFlops
- SETI@home - more than 560 TFlops
- BOINC - more than 5.6 PFlops
- Folding@home - more than 5.9 PFlops
- Bitcoin - more than 9.4 PFlops
Compare with supercomputers:
- Blue Gene/L (2006) - 478.2 TFlops
- Jaguar (2008) - 1.059 PFlops
- IBM Roadrunner (2008) - 1.042 PFlops
- Jaguar Cray XT5-HE (2009) - 1.759 PFlops
- Tianhe-1A (2010) - 2.507 PFlops
- IBM Sequoia (2012) - 20 PFlops
Now let's estimate the unused potential of Internet users.
According to estimates, at the end of 2010 there were about 2 billion Internet users.
Each user has at least 1 processor core with a performance of at least 8 Gflops (AMD Athlon 64, 2.211 GHz).
Simple arithmetic gives the performance of such a network:
8 * 10^9 * 2 * 10^9 = 16 * 10^18 flops = 16 Exaflops
Such a network is 800 times more productive than the not-yet-built IBM Sequoia (2012), 1700 times more productive than the Bitcoin network, and more productive than all supercomputers and computing networks combined! The number of PC and Internet users keeps growing, and so does the number of cores. Of course, this figure (16 exaflops) is an ideal: no one will compute 24/7. But if each user computes for just 2 minutes a day (which is more than realistic), such a network will be comparable to IBM Sequoia.
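The arithmetic above can be checked with a few lines of JavaScript. This is only a sketch: the constants are the figures quoted in this article, and the helper name networkPotential is made up for illustration.

```javascript
// Figures quoted in the article (flops = floating-point operations per second).
var USERS = 2e9;           // Internet users at the end of 2010
var FLOPS_PER_USER = 8e9;  // one core, about 8 Gflops
var SEQUOIA = 20e15;       // IBM Sequoia, 20 Pflops
var BITCOIN = 9.4e15;      // Bitcoin network, 9.4 Pflops

// Hypothetical helper: sustained performance of the whole network
// if every user donates `minutesPerDay` minutes of one core per day.
function networkPotential(minutesPerDay) {
    var dutyCycle = minutesPerDay / (24 * 60);
    return USERS * FLOPS_PER_USER * dutyCycle;
}

var peak = networkPotential(24 * 60);      // everyone computes around the clock
console.log(peak / 1e18);                  // peak performance in exaflops
console.log(Math.round(peak / SEQUOIA));   // ratio to IBM Sequoia
console.log(Math.round(peak / BITCOIN));   // ratio to the Bitcoin network
console.log(networkPotential(2) / 1e15);   // Pflops at 2 minutes per user per day
```

At 2 minutes a day each user contributes 2/1440 of a core, which is where the "comparable to IBM Sequoia" estimate comes from.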
Distributed browser-based computing networks in JavaScript are now more than real.
This article is a logical continuation of my article from a year ago:
Distributed computing in JavaScript
What has changed over the year, and what prevented the creation of such a network a year ago?
Over the past year almost all good browsers have gained WebWorkers, localStorage, SQL DB, and IndexedDB. A year ago nothing stopped us from computing in the main JavaScript thread and using Flash Storage, but computing in the main thread is a terrific source of lag, and Flash Storage is limited in size. A year ago we would have ended up with a crippled distributed network: laggy, held together with hacks, and intrusive.
Now, with the help of WebWorkers, we can use 100% of the resources of one processor core, and with 2 workers, 2 cores (how workers are distributed across cores depends on the worker implementation in a particular browser). We are practically unlimited in the amount of stored data: 50 MB of IndexedDB (Firefox) + 5 MB of localStorage + some more storage. These 55+ MB are enough to store task data and intermediate data. From the end of 2010 and through 2011, Node.js developed unusually quickly. I believe it is the perfect solution for a distributed computing server.
We have: suitable technologies (Node.js + WebWorkers + localStorage + IndexedDB); 2,000,000,000 Internet users, and their number keeps increasing; a growing number of cores with growing performance; browsers that get faster every month. Now is the time to point this stream of untapped capacity, 16 exaflops of it, in the right direction!
Where can network clients be embedded?
While you are viewing a page, your processor is loaded at 10-20%; while you are watching a YouTube video, it is loaded at 30-50% (I don't think more). You have to watch ads and annoying Flash banners that can load your processor. Imagine that instead of viewing annoying banners and advertisements, you are asked to compute for a good cause: you watch videos on YouTube, and meanwhile your browser calculates protein folding for Folding@home. Imagine that while you are downloading a file from your favorite file hosting service, your browser computes something useful for it, and in return you are not shown ads (I know adBlock very well). Imagine that while you are reading this article, your browser is computing something useful. In addition, every user who comes to a site does something useful for it, something that can bring income or benefit society. Utopian, but realizable.
What can be calculated?
Any task that needs a number cruncher (brute-force tasks, training neural networks, etc.) and that can be computed in parallel, because according to Amdahl's law, distributed computing is most effective only when the task has no sequential computations, i.e. the computations of one node do not depend on the data of another.
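Amdahl's law can be written as S(n) = 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel and n is the number of nodes. The sketch below just evaluates this formula; the function name amdahlSpeedup is chosen here for illustration.

```javascript
// Speedup predicted by Amdahl's law.
// parallelFraction: share of the work that can run in parallel (0..1)
// nodes: number of nodes working on the task
function amdahlSpeedup(parallelFraction, nodes) {
    return 1 / ((1 - parallelFraction) + parallelFraction / nodes);
}

console.log(amdahlSpeedup(1, 1000));     // fully independent subtasks
console.log(amdahlSpeedup(0.95, 1000));  // 5% of the work is sequential
```

With only 5% of sequential work the speedup can never exceed 1 / 0.05 = 20, no matter how many nodes join. That is why tasks whose pieces are fully independent, such as brute force, suit a browser network best.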
Interesting? Let's make such a network!
Example of distributed computing: recovering a password from an MD5 hash
In this example I will show what network architecture can be chosen for this task. We will recover a password of 8 characters or fewer, over an alphabet of 96 characters or fewer, from its MD5 hash. Clearly, one way or another, the problem can only be solved by exhaustive search. We will not use password dictionaries or any clever schemes, just brute force.
Distribution of tasks
We have at most 96^8 potential passwords. Let us give each password a decimal number from 1 to 96^8. Each password can then be obtained by converting its decimal number into base 96 (from10toN), using a simple conversion and this alphabet:
var alphabet = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKL" +
        "MNOPQRSTUVWXYZ+/*-\\?=`~!@#$%^&*()_{}[];:'\"|.,<> ",
    alphabetLength = alphabet.length;

// Convert a decimal number into a string of digits taken from the alphabet.
function from10toN(number, base) {
    if (!base || base > alphabetLength) {
        base = alphabetLength;
    }
    if (base < 2) {
        base = 2;
    }
    var result = '';
    while (number > 0) {
        result = alphabet.charAt(number % base) + result;
        number = Math.floor(number / base);
    }
    return result;
}
Each task will contain a range of 400,000 passwords to check (Google Chrome computes about 200,000 MD5 hashes per second). In total we have 18034739475 tasks: a lot, but not as hopeless as with a 16-character password... A client may take a task and never complete it, so each task gets a time after which it becomes obsolete: expires.
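The task count can be reproduced with a short sketch. The helper names taskCount and makeTask and the ttlMs parameter are assumptions made for this illustration. The min, max, and expires fields follow the names used in this article, but the exact task format of the real server may differ.

```javascript
// BigInt keeps 96^8 exact.
var TOTAL_PASSWORDS = 96n ** 8n;
var TASK_SIZE = 400000n;

// Number of tasks needed to cover all passwords (ceiling division).
function taskCount(total, size) {
    return (total + size - 1n) / size;
}

// Hypothetical task record: the inclusive range of password numbers
// [min, max] plus the time after which the task counts as obsolete.
function makeTask(index, size, total, ttlMs, now) {
    var min = index * size + 1n;
    var max = min + size - 1n;
    if (max > total) {
        max = total;   // the last task is shorter than the others
    }
    return { id: index, min: min, max: max, expires: now + ttlMs };
}

console.log(taskCount(TOTAL_PASSWORDS, TASK_SIZE).toString());
console.log(makeTask(0n, TASK_SIZE, TOTAL_PASSWORDS, 60000, Date.now()));
```

All values stay below 2^53, so they can be converted with Number() before being sent to a worker that loops over ordinary numbers.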
The network client's logic is elementary: in a loop we go through the passwords from N1 to N2, compute the MD5 of each, and compare the resulting hash with the target. If the hashes match, we send the password to the server, otherwise an empty string:
EcdcWorker.prototype.calculateSync = function (id, data) {
    var maxPasswordId = data.max,
        password,
        alphabetBase = data.base,
        hash = data.hash;
    for (var i = data.min; i <= maxPasswordId; i++) {
        // ...
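The listing above breaks off inside the loop. As a rough, self-contained sketch of what such a loop could look like, here is a version that runs under Node.js rather than in a browser worker. The function bruteForceRange and the md5 helper built on Node's crypto module are assumptions for this illustration, not the framework's actual code.

```javascript
var crypto = require('crypto');

var alphabet = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKL" +
    "MNOPQRSTUVWXYZ+/*-\\?=`~!@#$%^&*()_{}[];:'\"|.,<> ";

// Same conversion as in the article, without the base checks.
function from10toN(number, base) {
    var result = '';
    while (number > 0) {
        result = alphabet.charAt(number % base) + result;
        number = Math.floor(number / base);
    }
    return result;
}

// Assumed helper: hex MD5 digest via Node's crypto module.
function md5(text) {
    return crypto.createHash('md5').update(text, 'utf8').digest('hex');
}

// Check every password number in [data.min, data.max].
// Returns the matching password, or '' if the range does not contain it.
function bruteForceRange(data) {
    for (var i = data.min; i <= data.max; i++) {
        var password = from10toN(i, data.base);
        if (md5(password) === data.hash) {
            return password;
        }
    }
    return '';
}

var found = bruteForceRange({
    min: 93000,
    max: 94000,
    base: 96,
    hash: '900150983cd24fb0d6963f7d28e17f72'  // MD5 of "abc"
});
console.log(found);
```

In a browser the md5 function would have to come from a JavaScript MD5 library loaded into the worker.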
Client Logic
1. The Client comes to the Server and is authorized.
2. The Client loads the computing scripts and other intermediate logic.
3. The Client deletes outdated tasks from the Storage.
4. The Client starts the Workers (their number depends on the settings).
5. The Client checks the Storage for tasks that were completed but not delivered to the Server, and sends them through the Workers.
6. Each Worker requests tasks (1 or more) from the Server, or takes unfinished tasks from the Storage via the Client.
7. The Client saves the tasks in the Storage (in case the page is reloaded).
8. Each Worker begins to compute its task.
9. After the task is completed, the Worker saves the solution to the Storage (in case the page is reloaded or the Server is unavailable).
10. The Worker sends the task solution to the Server (and so on from step 6).
While one client is working on a task, other clients (scripts on other pages in the same browser) are blocked.
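The Storage-related steps above (3, 5, 6, 7, 9, 10) can be sketched as a thin wrapper around a localStorage-like object. TaskStore, MemoryStorage, the key name, and the done and result fields are hypothetical names chosen for this sketch. In a browser one would pass window.localStorage instead of the in-memory stand-in.

```javascript
// In-memory stand-in for localStorage (only getItem/setItem are used),
// so the sketch also runs outside a browser.
function MemoryStorage() {
    this.items = {};
}
MemoryStorage.prototype.getItem = function (key) {
    return Object.prototype.hasOwnProperty.call(this.items, key) ? this.items[key] : null;
};
MemoryStorage.prototype.setItem = function (key, value) {
    this.items[key] = String(value);
};

// Hypothetical TaskStore: keeps all tasks as one JSON array under a single key.
function TaskStore(storage, key) {
    this.storage = storage;
    this.key = key;
}
TaskStore.prototype.load = function () {
    return JSON.parse(this.storage.getItem(this.key) || '[]');
};
TaskStore.prototype.save = function (tasks) {
    this.storage.setItem(this.key, JSON.stringify(tasks));
};
// Step 3: drop tasks whose expiry time has passed.
TaskStore.prototype.removeExpired = function (now) {
    this.save(this.load().filter(function (t) { return t.expires > now; }));
};
// Step 7: remember a task received from the server.
TaskStore.prototype.add = function (task) {
    var tasks = this.load();
    tasks.push(task);
    this.save(tasks);
};
// Step 9: store the solution next to its task.
TaskStore.prototype.setResult = function (id, result) {
    this.save(this.load().map(function (t) {
        if (t.id === id) { t.result = result; t.done = true; }
        return t;
    }));
};
// Step 5: tasks that are solved but not yet delivered.
TaskStore.prototype.undelivered = function () {
    return this.load().filter(function (t) { return t.done === true; });
};
// Step 6: tasks that were taken but not yet solved.
TaskStore.prototype.unfinished = function () {
    return this.load().filter(function (t) { return t.done !== true; });
};
// Step 10: forget a task once the server has accepted its solution.
TaskStore.prototype.remove = function (id) {
    this.save(this.load().filter(function (t) { return t.id !== id; }));
};

var store = new TaskStore(new MemoryStorage(), 'ecdc-tasks');
store.add({ id: 1, min: 1, max: 400000, expires: 1000 });
store.add({ id: 2, min: 400001, max: 800000, expires: 5000 });
store.removeExpired(2000);   // task 1 is obsolete and is dropped
store.setResult(2, '');      // task 2 solved, password not in its range
console.log(store.undelivered().length, store.unfinished().length);
```

Storing everything under one key keeps the sketch short. A real client would probably use one key per task to avoid rewriting the whole array on every change.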
Server Logic
1. The Server authorizes the Client.
2. A task request comes from a Worker: the Server checks for outdated tasks and, if there are any, sends one to the client.
2.1. If there are no outdated tasks, it creates a new task and sends it to the client.
3. A Worker sends the answer to a task: the Server checks the answer and marks the task as completed.
3.1. The Server sends a new task to the Worker (and so on from step 2).
4. As soon as the Server receives the correct answer from a Worker, it keeps running but no longer issues tasks.
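Steps 2 to 4 can be sketched as a small in-memory task dispenser. TaskServer and its method names are assumptions for this illustration. Authorization, persistence, and the actual verification of the reported password (re-hashing it on the server) are left out.

```javascript
// Hypothetical in-memory dispenser for tasks numbered 0..totalTasks-1.
function TaskServer(totalTasks, ttlMs) {
    this.totalTasks = totalTasks;
    this.ttlMs = ttlMs;
    this.nextIndex = 0;     // first task that has never been issued
    this.issued = {};       // task id -> expiry time of the current issue
    this.password = null;   // set once a worker reports the answer
}

TaskServer.prototype.requestTask = function (now) {
    // Step 4: once the password is known, no more tasks are issued.
    if (this.password !== null) {
        return null;
    }
    // Step 2: reissue a task whose previous holder let it expire.
    for (var id in this.issued) {
        if (this.issued[id] <= now) {
            this.issued[id] = now + this.ttlMs;
            return { id: Number(id), expires: this.issued[id] };
        }
    }
    // Step 2.1: otherwise create a new task, if any are left.
    if (this.nextIndex >= this.totalTasks) {
        return null;
    }
    var newId = this.nextIndex++;
    this.issued[newId] = now + this.ttlMs;
    return { id: newId, expires: this.issued[newId] };
};

TaskServer.prototype.submitResult = function (id, result, now) {
    // Step 3: mark the task as completed.
    delete this.issued[id];
    if (result !== '') {
        // A real server would verify the hash of `result` here.
        this.password = result;
        return null;
    }
    // Step 3.1: hand the worker its next task.
    return this.requestTask(now);
};

var server = new TaskServer(3, 1000);
var first = server.requestTask(0);       // new task 0
var second = server.requestTask(10);     // new task 1
var reissued = server.requestTask(2000); // task 0 has expired and is reissued
console.log(first.id, second.id, reissued.id);
```

Reissuing expired tasks before creating new ones is what keeps a range from being lost when a visitor closes the page halfway through.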
General scheme
                    [Workers: EcdcWorker]
                      /               \
          Tasks: XHR /                 \ Messages: postMessage
                    /    Page: html     \
[Server: EcdcServer] ------------ [Browser: EcdcClient] --- [User]
         |                                  |
  [Database: Any]               [Storage: localStorage]
The above is a simple scheme of how an MD5 brute-force server operates. In practice it can be implemented with the framework for building distributed computing networks, JavaScript ECDC.
Result
You can see what I ended up with here:
Password Recovery Server from md5 hash (on your first visit you will get the message "You are unauthorized. Login").
You can log in with any email or any name; they are only used to keep your statistics, i.e. your contribution to the total amount of computation (stored as an md5 hash).
Password selection statistics can be found here (authorization is required).
The web client works only in browsers that support Workers, localStorage, JSON, and XMLHttpRequest. If you see the phrase "You are calculating md5", then you are a participant in the calculation, hooray! I have enabled logging for the workers, so you can see what they are doing in any browser console.
You can embed a calculation frame on your page; its code can be found in the source code of the main page.
Links
1. A working example of a password guessing server
2. Password selection statistics (authorization on the main page is required)
3. JavaScript framework for building distributed computing networks
4. Source code of the server in the example: md5-bruteforce-server.js, md5-bruteforce-server/
Conclusion
The system has proved viable (in testing I successfully recovered a 3-character password, but still!). It remains to test it with a fairly large number of users, and at the same time to test the capabilities of the Nodester hosting.
Do you take part in any distributed computing project? Do you think browser-based distributed computing networks have a future? Would you like to compute something useful instead of watching ads while you watch videos on YouTube?
Criticism and suggestions are welcome!