
Crouching in the shadows or searching for that light


Assembler is my favorite language ... but life is so short.

I continue my series of research into suitable shadows for a roguelike. After publishing part one and part two I cooled down on the topic somewhat, but the nagging sense of unfinished business prompts me to return to daubing pixels and close the gestalt.

Knowing myself, I am sure the game will hardly ever materialize, but perhaps some readers will be interested in my work on this thorny path. And so, let's begin.

By the end of the last round I had already come to understand that computing graphics on the CPU is so last century, but natural stubbornness kept insisting: not all the possibilities have been used yet, there is still room for interesting solutions.

Backward ray tracing had remained unimplemented. More precisely, the variant where a ray is cast for each pixel (or block of pixels) of the image and the illumination level of the current point is determined. The algorithm itself is described in the previous article and there is no point in returning to it. For backward ray tracing the code was simplified further and all trigonometry was removed completely, which I hoped would give an acceptable result.
Pascal
const
  tile_size = 32;                 // tile size in pixels
  tile_size1 : single = 0.03125;  // 1/32, reciprocal of the tile size
  block_size = 4;                 // macroblock size in pixels
  Size_X:Byte = 32;               // map width in tiles (X)
  Size_Y:Byte = 24;               // map height in tiles (Y)
//---------------------------------
// Returns 1 or 2 for an occupied tile, 0 for an empty one, -1 outside the map
function is_no_empty(x,y:Integer):Integer;
begin
  if (x>=0) AND (x<Size_X) AND (y>=0) AND (y<Size_Y) then
  begin
    if map[x,y]=1 then
    begin
      is_no_empty:=1;
    end
    else if map[x,y]=2 then
    begin
      is_no_empty:=2;
    end
    else
      is_no_empty:=0;
  end
  else
    is_no_empty:=-1;
end;
//---------------------------------
// Casts a ray from (x,y) in tile (xi,yj) to macroblock (i,j);
// returns the shadow alpha for that macroblock (255 = fully shadowed)
function crossing(r_view, x,y:Single; xi,yj, i,j:Integer):Byte;
var
  di,dj,ddi,ddj :Shortint;  // step direction and grid-line offset per axis
  k,i2,j2 :integer;         // current tile value and last occupied tile
  key:Boolean;
  last_k, transp_key :Byte;
  sum_lenX,sum_lenY, Dx,Dy,Dx1,DY1, l :Single;  // lengths along the ray
  sec1,cosec1, temp_x,temp_y, dx0,dy0 :Single;  // reciprocals, temporaries, ray direction
  i0,j0 :Integer;           // tile that contains the target macroblock
begin
  temp_x := i*block_size;
  temp_y := j*block_size;
  i0 := trunc(temp_x * tile_size1);
  j0 := trunc(temp_y * tile_size1);
  l := sqrt(sqr(temp_y-y) + sqr(temp_x-x)) + 0.0000001;
  transp_key := 0;
  // the starting tile itself is occupied
  if is_no_empty(xi,yj)>0 then inc(transp_key);
  if (xi=i0) and (yj=j0) then
  begin
    crossing := trunc(min(255,transp_key*64+ l * r_view));
    exit;
  end;
  dx0 := (temp_x-x)/l+0.0000001;
  dy0 := (temp_y-y)/l+0.0000001;
  key := False;
  last_k :=0;
  // step direction along each axis
  if dx0<0 then
  begin
    di :=-1; ddi:= 0;
  end
  else
  begin
    di := 1; ddi:= 1;
  end;
  if dy0<0 then
  begin
    dj :=-1; ddj:= 0;
  end
  else
  begin
    dj := 1; ddj:= 1;
  end;
  sum_lenX := 0;
  sum_lenY := 0;
  sec1 := 1/dx0;
  cosec1 := 1/dy0;
  // ray length to the first grid line crossing along X and along Y
  temp_x := x-(xi+ddi) * tile_size ;
  temp_y := y-(yj+ddj) * tile_size ;
  Dx := sqrt(sqr(temp_x) + sqr(temp_x * sec1 * dy0));
  DY := sqrt(sqr(temp_y) + sqr(temp_y * cosec1 * dx0));
  // ray length between successive grid lines along X and along Y
  Dx1 := abs(tile_size * sec1);
  Dy1 := abs(tile_size * cosec1);
  repeat
    if sum_lenX+DX < sum_lenY+DY then
    begin
      xi += di;
      k := is_no_empty(xi,yj);
      sum_lenX += DX;
      if DX<>Dx1 then DX := Dx1;
    end
    else
    begin
      yj += dj;
      k := is_no_empty(xi,yj);
      sum_lenY += DY;
      if DY<>Dy1 then DY := Dy1;
    end;
    if key Then
    begin
      if (xi<>i2) Or (yj<>j2) then
      begin
        // the ray has left a tile with value 1: fully shadowed
        if last_k=1 then
        begin
          crossing := 255;
          exit;
        end;
        // too many occupied tiles already passed: fully shadowed
        if transp_key>2 then
        begin
          crossing := 255;
          exit;
        end;
        inc(transp_key);
        key:= false;
      end;
    end;
    if k>0 then
    begin
      i2:=xi; j2:=yj; key:=true; last_k:=k;
    end;
    // reached the tile of the target macroblock
    if (xi=i0) and (yj=j0) then
    begin
      crossing := trunc(min(255, transp_key*64+ l * r_view));
      exit;
    end;
  until k=-1; // the ray has left the map
end;
//---------------------------------
..................
x0:= mouse_x;
y0:= mouse_y;
// tile under the cursor
x1 := x0 div tile_size;
y1 := y0 div tile_size;
koef := tile_size div block_size;
// fill the shadow mask, one pixel per macroblock
// (r_view is assumed to be defined elsewhere; the original call omitted it)
for j:=0 to Size_Y * koef do
  for i:=0 to Size_X * koef do
    picture_mask.SetPixel(i, j, BGRA(0,0,0,crossing(r_view, x0, y0, x1, y1, i, j)));
..................
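The step lengths used in crossing can be written out as a short derivation. This is only a sketch reconstructed from the listing, not the author's own notes; the symbols $T$ (for tile_size), $(d_x, d_y)$ (for the unit ray direction dx0, dy0) and $a$ (the horizontal offset to the first vertical grid line, temp_x in the code) are notation introduced here.

$$
\sqrt{a^2 + \left(a\,\frac{d_y}{d_x}\right)^2}
  = \frac{|a|}{|d_x|}\sqrt{d_x^2 + d_y^2}
  = \frac{|a|}{|d_x|},
\qquad
\Delta_x = \frac{T}{|d_x|},
\qquad
\Delta_y = \frac{T}{|d_y|}
$$

The left-hand expression is what Dx holds before the first step, and $\Delta_x$, $\Delta_y$ correspond to Dx1 and Dy1. At each iteration the loop advances along the axis whose accumulated length is smaller, which looks like the usual grid-walking (DDA-style) idea with no trigonometry involved. For example, with $T = 32$ and direction $(0.6,\ 0.8)$ the steps are $\Delta_x = 32 / 0.6 \approx 53.3$ and $\Delta_y = 32 / 0.8 = 40$, so the ray crosses horizontal grid lines more often than vertical ones.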


Alas, the result turned out to be much worse than expected: as soon as the picture was expanded to full screen, the FPS dropped to single digits.



Grouping pixels into macroblocks to reduce computation and then smoothing the result improved performance slightly. Frankly, I did not like the effect at all.



The algorithm parallelized well, but even many threads did not save the day: the effect looked much worse than in the previous article, even with better picture quality.
It turned out to be a dead end. I had to admit that, in my eyes, the CPU had exhausted itself as a tool for computing graphics. Curtain.

Digression 1
Over the past decade there has been almost no progress in the development of general-purpose processors. From the user's point of view, the maximum observed performance gain is no more than 30% per core. Progress, to put it mildly, is insignificant. Leaving aside longer vector instructions and some speed-up of the pipeline stages, what remains is an increase in the number of cores. Working safely with threads is a dubious pleasure, and not every task can be parallelized successfully. I would rather have a single core, but one that is 5-10 times faster; but, as they say, alas.
Here on Habr there is an excellent series of articles, "Life in the era of 'dark' silicon," which explains some of the reasons for the current state of affairs, but also brings you back from heaven to earth. In the next decade we cannot expect any significant increase in per-core computing power. But we can expect the number of GPU cores to keep growing and GPUs to get faster overall. Even on my old laptop, the estimated total GPU performance is 20 times higher than that of a single CPU thread. Even if all 4 processor cores are loaded efficiently, that is still much less than we would like.
I pay tribute to the graphics developers of the past, who made their masterpieces without hardware accelerators: real masters.

So, let's deal with the GPU. It came as something of a surprise to me that in practice very few people simply scatter polygons across the form. Everything more or less interesting is created with shaders. Setting aside ready-made 3D engines, I tried to study the guts of the technology as they are, at a deep level. The same processors, the same assembler, only with a somewhat trimmed instruction set and their own specifics. For a trial I settled on GLSL: C-like syntax, simplicity, plenty of tutorials and examples, including on Habr.
Since I was mostly used to writing in Pascal, the challenge was how to connect OpenGL to the project. I managed to find two ways: the GLFW library and the dglOpenGL header file. The only catch is that with the first I could not get shaders to work, but apparently that is down to my own clumsy hands.

Digression 2
Many friends ask me why I write in Pascal. Obviously, it is an endangered language: its community is steadily shrinking and there is almost no development. Low-level system programmers prefer C, and everyone else prefers Java, Python, Ruby or whatever is at its peak right now.
For me, Pascal is like a first love. Two decades ago, back in the days of Turbo Pascal 5.5, it sank into my soul and has walked with me through life ever since, whether as Delphi or, in recent years, Lazarus. I like the predictability of the language, its relatively low level (assembler inserts and viewing processor instructions), and its compatibility with C. The main thing is that the code builds and runs without problems; the fact that it is unfashionable, outdated and lacks some features is nonsense. They say there are people who still write in LISP, and that one is half a century old.

So, let's plunge into development. As a trial step we will not take accurate, realistic shading models, but will try to implement what we already tried before, this time with the performance of the GPU, for visual comparison, so to speak.

Initially, I thought of getting a shadow of approximately the same shape for an object using triangles.



To create a smooth circle effect, you need a mass of polygons. But what if you keep the triangles to a minimum and use a pixel shader to cut a hole in the shape? The idea came to me after reading an article by a respected master, which showed how to create spheres with a shader.



If you extend the triangle beyond the screen, you end up with this:



The borders of the shadow turned out to be very hard and, moreover, stepped. But there is a way to get an acceptable result without supersampling: smoothed borders. To do this, we change the scheme a little. The corners of the polygons at the points where the tangents touch the circle are made transparent.
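The tangent points in question (p1 and p2 in the geometry shader listed later in this article) can be derived in a few lines. This is a sketch under stated assumptions: $C$, $H$ and $r$ are notation introduced here for vCenter, uHeroPoint and uRadius, and the viewpoint is assumed to lie outside the circle ($\ell > r$), otherwise the square root is undefined.

$$
\begin{aligned}
d &= H - C, \qquad \ell = |d|, \qquad d_\perp = (-d_y,\ d_x),\\
P - C &= \alpha\, d + \beta\, d_\perp, \qquad |P - C| = r, \qquad (P - C)\cdot(P - H) = 0,\\
(P - C)\cdot d &= r^2 \;\Rightarrow\; \alpha = \frac{r^2}{\ell^2},\\
(\alpha^2 + \beta^2)\,\ell^2 &= r^2 \;\Rightarrow\; \beta = \pm\,\frac{r}{\ell}\sqrt{1 - \frac{r^2}{\ell^2}} .
\end{aligned}
$$

With $i = r/\ell$ these are the shader's $ii = i^2$ and $ij = i\sqrt{1 - i^2}$, and the two signs of $\beta$ give the two tangent points. For example, with $r = 1$ and $\ell = 2$ we get $\alpha = 0.25$ and $\beta = \pm 0.5\sqrt{0.75} \approx \pm 0.433$.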



The result is better, but still looks unnatural.



Let's add a slight smoothing of the circle outline for softness, and also change the gradient from linear to a power law.
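Both tweaks end up in the alpha value computed by the fragment shader in this article, which can be summarized as one formula. This is a sketch in notation introduced here: $t$ is the interpolated fTransparency, $s$ is uShadowSmoothness, $\ell$ is the distance from the fragment to the circle center, $r$ is uRadius and $A$ is uShadowTransparency.

$$
\alpha = A \cdot \min\!\left(t^{\,s},\ 10\,\frac{\ell - r}{r}\right), \qquad \ell \ge r
$$

The first term is the power-law gradient across the shadow border ($s = 1$ gives back the linear one). The second term ramps from 0 at the circle outline to 1 at $\ell = 1.1\,r$, which is what softens the circle. For example, with $t = 0.5$, $s = 2$, $\ell = 1.05\,r$ and $A = 0.8$: $t^{s} = 0.25$, the ramp is $0.5$, so $\alpha = 0.8 \cdot 0.25 = 0.2$.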



This is an acceptable result.
And finally, let's add objects imitating obstacles to the form.



Shader code
// Vertex shader

#version 330 core
layout (location = 0) in vec2 aVertexPosition;
void main(void) {
gl_Position = vec4(aVertexPosition.xy, 0, 1.0);
}

// Geometry shader

#version 330 core
layout (points) in;
layout (triangle_strip, max_vertices = 5) out;
uniform mat4 uModelViewMatrix;
uniform float uRadius;
uniform vec2 uHeroPoint;
out float fTransparency;
out vec2 vCenter;
void main(){
vCenter = gl_in[0].gl_Position.xy;
vec2 d = uHeroPoint - vCenter;
float l = length(d);
float i = uRadius / l;
float ii = i*i;
float ij = i * sqrt(1 - ii);
vec2 p1 = vec2(vCenter.x + d.x*ii - d.y*ij , vCenter.y + d.x*ij + d.y*ii);
vec2 p2 = vec2(vCenter.x + d.x*ii + d.y*ij , vCenter.y - d.x*ij + d.y*ii);
d = uHeroPoint - p1;
vec2 p3 = vec2(p1 - d/length(d)*1000000);
d = uHeroPoint - p2;
vec2 p4 = vec2(p2 - d/length(d)*1000000);
fTransparency = 0;
gl_Position = uModelViewMatrix * vec4(p1, 0, 1);
EmitVertex();
fTransparency = 1;
gl_Position = uModelViewMatrix * vec4(p3, 0, 1);
EmitVertex();
gl_Position = uModelViewMatrix * vec4(vCenter, 0, 1);
EmitVertex();
gl_Position = uModelViewMatrix * vec4(p4, 0, 1);
EmitVertex();
fTransparency = 0;
gl_Position = uModelViewMatrix * vec4(p2, 0, 1);
EmitVertex();
EndPrimitive();
}

// Fragment shader

#version 330 core
precision mediump float;
in float fTransparency;
in vec2 vCenter;
uniform float uRadius;
uniform vec2 uScreenHalfSize;
uniform float uShadowTransparency;
uniform float uShadowSmoothness;
out vec4 FragColor;
void main(){
float l = distance(vec2((gl_FragCoord.xy - uScreenHalfSize.xy)/uScreenHalfSize.y), vCenter.xy);
if (l<uRadius) {discard;}
else {FragColor = vec4(0, 0, 0, min(pow(fTransparency, uShadowSmoothness), (l-uRadius)/uRadius*10)*uShadowTransparency);}
}


I hope it was informative.

Your humble servant, pixel raiser, Rebuilder.

I attach a small demo (Windows EXE).

P.S. The title of the article contains an Easter egg, a reference to the Chronicles of Siala trilogy: an excellent fantasy work about the misadventures of a rogue, by Alexei Pekhov.

Source: https://habr.com/ru/post/446986/

