
Multicellular microprocessor: "macro-loops" for working with arrays

The stumbling block for the von Neumann architecture is working with an array of data: checking values, summing elements, multiplying elements. Each of these requires a loop from the first element to the last. Every loop carries unavoidable overhead: calculating the address of the next element, then updating and checking the counter. In addition, if the processor architecture does not support loops at the microcode level, each instruction in the loop body must be read from memory or cache, decoded and executed on every iteration. The efficiency of such a process can be quite low. The only advantage is the size of the program code: just a few bytes for an array of any size.

For vector processors the overhead is lower. The number of loop iterations is reduced by a factor of N, where N is the number of array elements processed simultaneously. However, this does not eliminate the problem. It also adds a cost at the array boundary when the number of elements is not a whole multiple of the processor's vector width.
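As a rough illustration of the vector case, here is a minimal Python sketch. The function name `vector_sum` and the `width` parameter are hypothetical and do not model any particular processor; the inner loop over lanes stands in for what would be a single vector instruction. The sketch counts the full-width iterations and the leftover scalar steps at the array boundary.

```python
def vector_sum(data, width=4):
    """Sum `data` in blocks of `width` lanes, then handle the leftover tail.

    `width` is an assumed vector width, chosen only for illustration.
    Returns (total, vector_iterations, tail_steps).
    """
    lanes = [0] * width
    full = len(data) - len(data) % width  # elements covered by whole vectors

    iterations = 0
    for base in range(0, full, width):
        # On real vector hardware this inner loop would be one instruction.
        for lane in range(width):
            lanes[lane] += data[base + lane]
        iterations += 1

    total = sum(lanes)

    # Boundary cost: elements that do not fill a whole vector.
    tail_steps = 0
    for i in range(full, len(data)):
        total += data[i]
        tail_steps += 1

    return total, iterations, tail_steps
```

For ten elements and a width of four, the sketch performs two vector iterations plus two scalar tail steps instead of ten scalar iterations.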

To reduce the overhead, one could treat the array elements as independent variables, that is, write a program that processes the elements in pairs, then the pair results in pairs, and so on. But this method works only for a fixed array size, and the resulting program can be large.
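The pairwise idea can be sketched in Python as follows. Both function names are hypothetical. `sum4_unrolled` shows the fixed-size form, with no loop at all but valid only for exactly four elements; `pairwise_sum` shows the same pattern of combining pairs level by level, written with a loop only so that it can be tried on any length.

```python
def sum4_unrolled(a):
    """Fixed-size form: no loop, but valid only for exactly four elements."""
    return (a[0] + a[1]) + (a[2] + a[3])


def pairwise_sum(values):
    """Combine elements in pairs, then the pair results in pairs, and so on."""
    level = list(values)
    if not level:
        return 0
    while len(level) > 1:
        # Each pair on one level is independent of the others.
        nxt = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])  # an unpaired element moves up unchanged
        level = nxt
    return level[0]
```

An unrolled program for N elements needs N - 1 addition instructions, which is where the growth in code size comes from.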

However, the latter method, or a modification of it, is better suited to multi-core architectures, where different parts of the array could be processed in parallel on different cores or processors. But here a restriction appears at compile time. If the size of the array changes dynamically, it is difficult or impossible to parallelize its processing across cores automatically, because the program has a fixed number of instructions and pointers. The alternative is to allow the program to modify its own code, which is very risky.
Now let us turn to the multicellular processor. Did the authors of the processor consider the idea of a "multi-command"? Here is what I mean. It is an ordinary machine instruction or group of instructions, but with a dynamic repeat count. In the program code it takes a few bytes, and during execution it is duplicated into the command buffer a specified number of times, depending on the length of the array being processed. The entire sequence could be duplicated at once, but that is impractical. Instead, the processor can make as many copies as it has cells, or the number of cells taking part in this program can even be limited. Each copy of the command can be processed by its own cell, completely independently and in parallel.

Each subsequent copy of the command, running on its own cell, processes the next element or elements of the array. This mechanism is almost free of loop overhead, does not increase the size of the compiled program, and allows array processing to be parallelized across any number of processor cells. In addition, one of the cells performs the last operation of each pass over the array and tracks whether further processing is needed. Once the required condition is met, processing of the array can be stopped. Of course, the compiler chooses and places the multi-command and the completion command, and the processor performs the duplication.
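To make the proposal more concrete, here is a small Python simulation of how such a multi-command might behave. This is only a sketch under assumptions: the names `run_multi_command`, `cells` and `stop` are hypothetical, the cells run one after another here rather than in parallel, and nothing in it reflects the actual instruction set of the multicellular processor. One operation `op` is replicated once per cell in each round, each cell keeps its own partial result, and after each round a completion check decides whether to continue.

```python
from functools import reduce


def run_multi_command(op, data, cells=4, stop=None):
    """Simulate replicating one command across `cells` cells.

    op    -- binary operation applied to array elements (e.g. addition)
    cells -- assumed number of cells taking part in the program
    stop  -- optional completion check on the combined result so far
    Returns (result, elements_processed).
    """
    partial = [None] * cells  # one independent accumulator per cell
    processed = 0

    for base in range(0, len(data), cells):
        # One round: copy k of the command handles element base + k on cell k.
        for cell, x in enumerate(data[base:base + cells]):
            partial[cell] = x if partial[cell] is None else op(partial[cell], x)
            processed += 1

        # Completion command: decide whether further processing is needed.
        if stop is not None:
            active = [p for p in partial if p is not None]
            if stop(reduce(op, active)):
                break

    active = [p for p in partial if p is not None]
    result = reduce(op, active) if active else None
    return result, processed
```

For example, summing the numbers 1 to 10 on four cells takes three rounds, and with the stop condition "sum is at least 20" the simulation finishes after the second round, having processed eight elements.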

If there are no existing patents or publications describing this idea, then with this publication I claim copyright as of May 17, 2015.

At the moment, the author of this publication is open to job offers.

Source: https://habr.com/ru/post/258175/

