I had the idea of running as many simultaneous Tetris games as possible in a single shader (one framebuffer texture).
Below is a brief description of how the resulting code works.
Each Tetris game fits in three pixels, so at 1920x1080 resolution you can run 619200 copies simultaneously. I also made a simple bot for auto-play.
Links to run the shader and to the source are at the end of the post.
The Tetris table has size [10, 22] (10 wide, 22 high).
Each cell is either empty or filled.
Storing the entire table therefore takes 22 * 10 = 220 bits.
One "pixel" is four floats with 24 usable bits each, 96 bits per pixel.
Visually (part of a debug frame), the three pixels highlighted in red are one saved field.
2 * 96 + 24 + 4 = 220
That is two full pixels, one float of the third pixel, and 4 bits of the second float of the third pixel.
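The layout above can be sketched on the CPU. This is a minimal Python illustration, not the shader code: it assumes cells are numbered row by row and that bit 0 of each 24-bit value is filled first. The names pack_board and unpack_board are hypothetical.

```python
WIDTH, HEIGHT = 10, 22      # table size from the text
BITS_PER_FLOAT = 24         # usable bits per float component
FLOATS_PER_PIXEL = 4        # x, y, z, w

def pack_board(cells):
    """Pack 220 booleans into 3 pixels of four 24-bit integers each."""
    assert len(cells) == WIDTH * HEIGHT
    pixels = [[0] * FLOATS_PER_PIXEL for _ in range(3)]
    for i, filled in enumerate(cells):
        if filled:
            comp, bit = divmod(i, BITS_PER_FLOAT)        # which 24-bit value, which bit
            pix, chan = divmod(comp, FLOATS_PER_PIXEL)   # which pixel, which component
            pixels[pix][chan] |= 1 << bit
    return pixels

def unpack_board(pixels):
    """Inverse of pack_board: recover the 220 booleans."""
    cells = []
    for i in range(WIDTH * HEIGHT):
        comp, bit = divmod(i, BITS_PER_FLOAT)
        pix, chan = divmod(comp, FLOATS_PER_PIXEL)
        cells.append(bool((pixels[pix][chan] >> bit) & 1))
    return cells
```

With this numbering a full table fills the first two pixels, the first component of the third pixel, and the low 4 bits of its second component, which matches 2 * 96 + 24 + 4.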
The third pixel also has two otherwise unused floats, pixel3.zw, which store the state of the logic. Each float is unpacked into three 8-bit values [a,b,c]:
one of the values is a fall timer: 1 is subtracted from it every frame, and when it reaches 0 the block drops one step down;
the sign (positive or negative) of the whole float is the end-of-game flag for the current table (so that no resources are wasted once the field has filled up);
[b,c], 0xffff (16 bits), hold the score of the current table, that is, the number of lines burned.
Only 20 bits are left unused, in the second float of the third pixel.
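As a rough illustration of this kind of state packing, here is a minimal Python sketch. It is not the shader's code: the byte order of [a,b,c], the byte order of the score, and the helper names pack_state, unpack_state and score are all assumptions.

```python
import math

def pack_state(a, b, c, game_over):
    """Store three 8-bit values in one float; the sign carries the end-of-game flag."""
    assert all(0 <= v <= 255 for v in (a, b, c))
    value = float(a | (b << 8) | (c << 16))   # below 2**24, so exact in a 32-bit float
    return -value if game_over else value

def unpack_state(f):
    """Return (a, b, c, game_over) from a packed float."""
    game_over = math.copysign(1.0, f) < 0     # copysign also detects -0.0
    n = int(abs(f))
    return n & 255, (n >> 8) & 255, (n >> 16) & 255, game_over

def score(b, c):
    """Combine two 8-bit values into a 16-bit score (0..0xffff)."""
    return b | (c << 8)
```

Reading the flag through the sign means a finished table can be skipped after a single comparison, before any of the three values is decoded.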
Debug frame showing that the save logic works correctly:
the white field on the left, three pixels in size, is set specifically to show that gaps are handled correctly (at a resolution that is not a multiple of three, the bar runs at an angle);
see the condition on line 75 of Buffer A.
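Why the bar runs at an angle can be seen from a small Python sketch of the addressing. It assumes that field number n simply occupies linear pixel indices 3n, 3n+1 and 3n+2, counted row by row. The shader's actual indexing is not shown in this post, and field_pixels is a hypothetical name.

```python
def field_pixels(field_id, width):
    """Return the (x, y) coordinates of the three pixels that hold one field."""
    first = field_id * 3
    return [((first + k) % width, (first + k) // width) for k in range(3)]

# Width divisible by three: every field starts at the same column offsets.
print(field_pixels(640, 1920))
# Width not divisible by three: a field can wrap onto the next row,
# so the same field position shifts from row to row.
print(field_pixels(333, 1000))
```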
To test this, set #define debug in Common and set AI to 0 there.
I got this result: 10 FPS when rendering and processing all 619200 fields, and 25 FPS with 120 thousand fields.
The bot logic is very weak: the bot loses within a minute and scores up to 60 points.
I could not run good logic with many loops that check for holes, protrusions and burned lines and pick the best position out of all possible drops.
Good logic worked for me with up to 100 copies and lagged badly while going through all the loops.
My bot logic works like this.
All of it is in the function AI_pos_gen in Buffer A, and it is ten lines long.
Pseudocode:
The height at which placement is checked equals the maximum height of the field in the current column (only one row is checked per height).
for each rotation (4) {
    for each column (10) {
        inner loop ( ) {
            check ( , ) and update best ID() and best POS
        }
    }
}
This gives three trivial loops that place the block so that the height is minimal.
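The idea of "place the block where the resulting height is minimal" can be sketched in Python as follows. This is not AI_pos_gen itself: the board representation, the shape format and the names column_heights and best_placement are assumptions made for illustration.

```python
WIDTH = 10

def column_heights(board):
    """board: list of rows (top row first), each a list of 0/1. Height of every column."""
    rows = len(board)
    heights = [0] * WIDTH
    for x in range(WIDTH):
        for y in range(rows):
            if board[y][x]:
                heights[x] = rows - y
                break
    return heights

def best_placement(board, rotations):
    """rotations: list of shapes, each a list of (dx, dy) cells with dy counted
    upward from the bottom of the shape. Returns (rotation_id, x) with the lowest top."""
    heights = column_heights(board)
    best = None
    for rot_id, shape in enumerate(rotations):
        shape_width = max(dx for dx, _ in shape) + 1
        for x in range(WIDTH - shape_width + 1):
            # Lowest level at which every cell rests on or above its column.
            base = max(heights[x + dx] - dy for dx, dy in shape)
            top = base + max(dy for _, dy in shape) + 1
            if best is None or top < best[0]:
                best = (top, rot_id, x)
    return best[1], best[2]
```

A heuristic like this looks only at the resulting height and ignores holes, which is consistent with the weak play described above.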
AI_pos_gen is called when a new block appears. It takes the block ID and returns the position to drop the block from, along with its rotation. The function runs in the third pixel (the logic pixel), so it has the fully loaded map (the map array).
You can easily try writing your own bot if you wish.
Slowest place
After I added just one loop to check for holes, my video card driver crashed when the number of bots exceeded 10 thousand. The bot I wrote is the most minimalistic version I could make, and unfortunately it plays very badly.
All rendering is in Image, and the UI logic is in Buffer B.
Rendering:
The screen is split into tiles and a table is drawn in each tile, which keeps the load minimal.
Map loading logic: the whole map is not unpacked for every pixel; only the one needed bit (literally) is unpacked. The function code:
    int maptmp(int id, int midg) {
        int nBits = 8;
        ivec4 pixeldata = loadat(id, midg);
        int itt = (id / 24) / 4; //data pixel id 0-2
        int jtt = (id - itt * 24 * 4) / 24; //component in data pixel id 0-3
        int ott = (id - itt * 24 * 4 - jtt * 24) / 8; //component in unpacked value 0-2
        int ttt = (id - itt * 24 * 4 - jtt * 24 - ott * 8); //bit after int2bit 0-7
        ivec3 val = decodeval16(pixeldata[jtt]);
        int n = val[ott];
        for (int i = 0; i < nBits; ++i, n /= 2) {
            if (i == ttt) {
                if ((n % 2) == 0) return 0;
                else return 1;
                //switch + return does not work on windows(Angle)
                /*switch (n % 2) {
                    case 0:return 0;break;
                    case 1:return 1;break;
                }*/
            }
        }
        return 0;
    }
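The index arithmetic in maptmp can be checked on the CPU. The Python sketch below mirrors the same divisions. The names locate_bit and read_cell are hypothetical, and the nested list of 8-bit triples is only a stand-in for what loadat and decodeval16 return in the shader.

```python
def locate_bit(bit_id):
    """Split a cell index (0..219) into (pixel, component, byte, bit), as maptmp does."""
    pixel = (bit_id // 24) // 4                        # itt: data pixel 0-2
    comp = (bit_id - pixel * 96) // 24                 # jtt: component 0-3
    byte = (bit_id - pixel * 96 - comp * 24) // 8      # ott: unpacked value 0-2
    bit = bit_id - pixel * 96 - comp * 24 - byte * 8   # ttt: bit 0-7
    return pixel, comp, byte, bit

def read_cell(pixels, bit_id):
    """pixels[p][c] is a tuple of three 8-bit ints (stand-in for decodeval16 output)."""
    pixel, comp, byte, bit = locate_bit(bit_id)
    return (pixels[pixel][comp][byte] >> bit) & 1
```

The shader reaches the same bit by dividing by 2 in a loop; the shift used here is just the shorter CPU-side equivalent.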
To avoid pixelation when scrolling: starting at about 43000 the fractional part of a float loses its precision, so you cannot simply add 619 thousand to UV for scrolling (you would get pixels instead of tables).
Instead, all scrolling is folded into one large tile that wraps around in a circle, so at most 32 is ever added to UV (line 207 in Image).
The same is done to determine the field ID (line 215 in Image).
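The precision problem and the wrap-around fix can be illustrated in Python. This is a sketch under assumptions: the period of 32 comes from the text, but split_scroll and f32 are hypothetical helpers, and the shader's actual math on lines 207 and 215 is not reproduced here.

```python
import struct

def f32(x):
    """Round a Python float to 32-bit precision, as a GPU float would store it."""
    return struct.unpack("f", struct.pack("f", x))[0]

def split_scroll(scroll, period=32.0):
    """Split a scroll offset into a whole-tile index and a small offset in [0, period)."""
    tile = int(scroll // period)
    return tile, scroll - tile * period

tile, local = split_scroll(619200.3)
print(abs(f32(619200.3) - 619200.3))   # large error: the fraction is badly rounded
print(abs(f32(local) - local))         # tiny error: the small offset keeps its fraction
```

Keeping the large part as an integer tile index and only the small remainder as a float is what lets the fractional UV survive at 32-bit precision.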
Numbers:
Yellow: the number of Tetris fields.
Large, on the left: the number of the current field.
Smaller, on the right: the score of the current field.
Buffer A is the logic, Buffer B is the UI control, and Image is the rendering.
Source: https://www.shadertoy.com/view/3dlSzs (compile time via Angle is 16 seconds)
The bot is disabled there (you can enable it), and all the fields are playable from the keyboard.
Controls: arrow keys left / right / up / down.
UI: the red box is reset; drag the mouse with the LMB held to scroll through the fields, and click a field to select it for display.
Run from a web browser:
The second option is to run the shader in any "shader launcher"; here is a link to an archive ( download ) that contains an .exe file with this shader.
OpenGL compile time is about 10 seconds.
Update: added a shader with hole checking, https://www.shadertoy.com/view/wsXXzH, which replaces the condition that picked the best position at the same height. The check_block_at_wh function (line 380 in BufA) now counts holes while checking that the position is valid. No new loops were added, and the condition is on lines 442 through 459 of BufA.
It also loses quickly, within a minute and with 30-60 points (obviously a larger area needs to be checked for holes, but that slows things down a lot).
And two pictures that explain the process a little:
position selection: https://i.imgur.com/e0uENgV.png
block position for the condition: https://i.imgur.com/ORECXUW.png
Source: https://habr.com/ru/post/443042/