It's no secret that Flash Player 11 supports GPU graphics acceleration. The new version introduces the Molehill API, which lets you work with the video card at a fairly low level; on the one hand this gives your imagination free rein, on the other it requires a deeper understanding of how modern 3D graphics works. This article focuses on the shader-writing language, AGAL (Adobe Graphics Assembly Language). It is assumed that the reader is familiar with the fundamentals of modern real-time 3D graphics, and ideally has experience with OpenGL or Direct3D. For everyone else, a brief digression:
in each frame everything is rendered anew; approaches based on partial screen redrawing are highly undesirable
2D is a special case of 3D
the graphics card can rasterize triangles and nothing else
triangles are built from vertices
each vertex carries attributes (coordinates, normal, weight, etc.)
the order of the vertices in a triangle is determined by indices
vertex and index data are stored in the vertex and index buffers, respectively
a shader is a program executed by the video card
each vertex passes through the vertex shader, and each pixel produced during rasterization through the fragment (pixel) shader
the video card cannot work with integers, but it works fine with 4D vectors
Syntax
The current implementation of AGAL uses a cut-down Shader Model 2.0, i.e. the hardware feature set is limited to roughly 2005-era capabilities. It is worth remembering, though, that this restricts only the capabilities of the shader program, not the performance of the hardware. Perhaps in future versions of Flash Player the bar will be raised to SM 3.0, and we will be able to render to several textures at once and sample textures directly from the vertex shader, but given Adobe's policy this will happen no earlier than the next generation of mobile devices.
Any AGAL program is essentially low-level assembly. The language itself is very simple, but it demands a fair amount of attentiveness. Shader code is a sequence of instructions of the form:
opcode [dst], [src1], [src2]
which, loosely interpreted, means "execute the opcode instruction with the parameters src1 and src2, writing the result to dst". A shader can contain up to 256 instructions. dst, src1 and src2 are register names: va, vc, fc, vt, ft, op, oc, v, fs. Each of these registers, with the exception of fs, is a four-dimensional (xyzw or rgba) vector. It is possible to work with individual components of a vector, including swizzling (reordering the components):
dp4 ft0.x, v0.xyzw, v0.yxww
Consider each of the types of registers in more detail.
Output registers
As the result of its calculations, the vertex shader must write the window-space position of the vertex to the op (output position) register, and the fragment shader must write the final pixel color to oc (output color). In the case of the fragment shader, the write can be cancelled with the kil instruction, which is described below.
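A minimal sketch of a valid shader pair, assuming va0 holds the vertex position and a transform matrix and a color have been uploaded to vc0..vc3 and fc0:
// vertex shader: the one mandatory write is the position to op
m44 op, va0, vc0
// fragment shader: the one mandatory write is the color to oc
mov oc, fc0
// (kil ft0.x here would discard the pixel whenever ft0.x < 0, cancelling the write to oc)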
Attribute registers
A vertex can contain up to 8 vector attributes, which are accessed from the shader through the va registers; their layout in the vertex buffer is specified with the Context3D.setVertexBufferAt function. Attribute data can be in the FLOAT_1, FLOAT_2, FLOAT_3, FLOAT_4 and BYTES_4 formats, where the number in the name is the number of vector components. Note that in the case of BYTES_4 the component values are normalized, i.e. divided by 255.
Interpolator registers
In addition to writing to the op register, the vertex shader can pass up to 8 vectors to the fragment shader through the v registers. The values of these vectors are linearly interpolated across the entire area of the polygon during rasterization. Let's illustrate how interpolators work with the example of a triangle whose vertices store a color attribute that the fragment shader then displays:
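A sketch of such a pair, assuming va0 already holds a clip-space position (so no matrix is needed) and va1 holds the color:
// vertex shader
mov op, va0       // position straight to the output
mov v0, va1       // the color will be interpolated across the triangle
// fragment shader
mov oc, v0        // each pixel receives the interpolated color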
Temporary registers
In the vertex and fragment shaders, up to 8 registers (vt and ft respectively) are available for storing intermediate calculation results. Suppose, for example, that in the fragment shader you need to calculate the sum of four vectors received from the vertex program (registers v0..v3):
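A straightforward version looks like this:
add ft0, v0, v1
add ft0, ft0, v2
add ft0, ft0, v3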
As a result, ft0 holds the sum we need, and everything seems fine, but there is a non-obvious optimization opportunity here that stems directly from the architecture of the video card's pipeline and is partly the reason for its high performance.
Shaders are built on the concept of ILP (instruction-level parallelism), which, as the name suggests, allows several instructions to be executed simultaneously. The main condition for this mechanism to kick in is that the instructions be independent of one another. Applied to the example above:
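The same sum, rearranged so that the first two additions do not share registers:
add ft0, v0, v1
add ft1, v2, v3
add ft0, ft0, ft1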
The first two instructions can be executed at the same time, since they work with independent registers. It follows that the key role in the performance of your shader is played not so much by the number of instructions as by their independence from one another.
Constant registers
Storing numeric constants directly in the shader code is not allowed, i.e. all constants the shader needs must be passed to it before the Context3D.drawTriangles call; they become available in the vc (128 vectors) and fc (28 vectors) registers. A register can also be referenced by index using square brackets, which is very handy for skeletal animation or material indexing. Keep in mind that setting shader constants is a relatively expensive operation and should be avoided where possible; for example, there is no point in sending the projection matrix to the shader before rendering each object if it does not change within the current frame.
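A sketch of indexed constant access of the kind used in skeletal animation; the vertex layout is hypothetical, with va1.x assumed to hold the absolute vc index of a bone matrix, written there on the CPU side:
// vertex shader
m44 vt0, va0, vc[va1.x]   // transform by the bone matrix at the index stored in va1.x
m44 op, vt0, vc0          // then by the view-projection matrix in vc0..vc3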
Sampler registers
Up to 8 textures can be passed to the fragment shader using the Context3D.setTextureAt function; they are accessed through the corresponding fs registers, which are used exclusively in the tex instruction. Let's slightly change the triangle example: as the second vertex attribute we pass texture coordinates, and in the fragment shader we sample the texture using these already interpolated coordinates:
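A sketch, assuming va1 now holds UV coordinates and the texture was bound to sampler 0 with Context3D.setTextureAt(0, ...):
// vertex shader
mov op, va0
mov v0, va1                            // pass the UVs on to be interpolated
// fragment shader
tex ft0, v0, fs0 <2d, linear, nomip>   // sample the texture at the interpolated UVs
mov oc, ft0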
The remaining operators, including conditional jumps and loops, are planned for future versions of Flash Player. But this does not mean that an ordinary if cannot be expressed today: the slt and sge instructions are quite suitable for the task, as in the sketch below.
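A branch-free select, result = (a >= b) ? c : d, with hypothetical register assignments (a = ft0.x, b = ft1.x, c = ft2, d = ft3):
sge ft4.x, ft0.x, ft1.x   // mask: 1.0 where a >= b, else 0.0
slt ft5.x, ft0.x, ft1.x   // inverse mask: 1.0 where a < b
mul ft6, ft2, ft4.x       // c * mask
mul ft7, ft3, ft5.x       // d * inverse mask
add ft6, ft6, ft7         // ft6 now holds c or d, with no branching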
Effects
We have covered the basics; now for the most interesting part of the article, the practical application of this new knowledge. As mentioned at the very beginning, the ability to write shaders completely unties a graphics programmer's hands, i.e. the only real limits are the developer's imagination and mathematical ingenuity. You could see above that the assembly language itself is simple, but behind that simplicity hides the difficulty of digging through your own long-forgotten code. I therefore highly recommend commenting the key sections of your shader code so that you can quickly find your way around it when needed.
The template
The starting point for all the following examples will be a small "blank" in the form of a teapot. Unlike the triangle example, we need a combined camera transform and projection matrix to create the effect of perspective and of rotating around the object; we will pass it in through the constant registers. It is important to remember here that a 4x4 matrix occupies exactly 4 registers, so writing it to vc0 occupies vc0..vc3. We will also need a constant vector of numbers frequently used in the shader: (0.0, 0.5, 1.0, 2.0). In sum, the base shader code will look like this:
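A minimal sketch of this template, assuming va0 holds the vertex position, the combined matrix sits in vc0..vc3, and the constant vector (0.0, 0.5, 1.0, 2.0) is uploaded to fc0:
// vertex shader
m44 op, va0, vc0     // transform by the camera-projection matrix in vc0..vc3
// fragment shader
mov oc, fc0.zzzz     // solid white for now: fc0.z = 1.0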
Texturing
Up to 8 textures can be used in a shader, with an almost unlimited number of samples from them, so this limit matters little when using atlases or cube textures. Let's improve our example: instead of setting the color in the fragment shader, we will fetch it from a texture using the interpolated texture coordinates received from the vertex shader:
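A sketch, assuming va1 holds the UVs and the diffuse texture is bound to fs0:
// vertex shader
m44 op, va0, vc0
mov v0, va1          // UVs to the interpolator
// fragment shader
tex ft0, v0, fs0 <2d, linear, miplinear, repeat>
mov oc, ft0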
Lambert
The most primitive lighting model that imitates reality. It is based on the premise that the intensity of light falling on a surface depends linearly on the cosine of the angle between the incident light vector and the surface normal. Recall from school mathematics that the dot product of unit vectors gives the cosine of the angle between them, so our Lambert lighting formula looks like this:
Lambert = Diffuse * (Ambient + max(0, dot(LightVec, Normal)))
Color = Lambert
where
Diffuse - the color of the object at the point (taken from a texture, for example)
Ambient - the background light color
LightVec - the unit vector from the point to the light source
Normal - the perpendicular to the surface
Color - the final pixel color
The shader takes two new constant parameters: the light source position and the background light value:
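A sketch of the whole pair, assuming va2 holds the vertex normal, vc4 the light position, fc0 = (0.0, 0.5, 1.0, 2.0) and fc1.x the ambient intensity:
// vertex shader
m44 op, va0, vc0
mov v0, va1                 // UVs
mov v1, va2                 // Normal
sub v2, vc4, va0            // LightVec (normalized per pixel below)
// fragment shader
tex ft0, v0, fs0 <2d, linear, miplinear, repeat>   // Diffuse
nrm ft1.xyz, v1             // re-normalize the interpolated Normal
nrm ft2.xyz, v2             // LightVec
dp3 ft3.x, ft1.xyz, ft2.xyz // cosine of the angle
max ft3.x, ft3.x, fc0.x     // max(0, ...)
add ft3.x, ft3.x, fc1.x     // + Ambient
mul oc, ft0, ft3.x          // Diffuse * Lambert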
Phong
Introduces the notion of a specular highlight into the Lambert lighting model. The intensity of the highlight is determined by a power function of the cosine of the angle between the vector to the light source and the direction obtained by reflecting the observer's view vector about the surface normal:
Phong = pow(max(0, dot(LightVec, reflect(-ViewVec, Normal))), SpecularPower) * SpecularLevel
Color = Lambert + Phong
where
ViewVec - the observer's view vector
SpecularPower - the exponent that determines the size of the highlight
SpecularLevel - the intensity level of the highlight, or its color
reflect - the reflection function f(v, n) = 2 * n * dot(n, v) - v
For complex models it is customary to use Specular and Gloss maps, which define the color/intensity (SpecularLevel) and the size of the highlight (SpecularPower) over different parts of the model's texture space. In our case we will make do with constant values for the exponent and intensity. To the vertex shader we pass a new parameter, the observer's position, for the subsequent calculation of ViewVec:
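A sketch under the same assumptions as the Lambert example, plus vc5 for the camera position and fc2 holding the specular parameters (fc2.x = SpecularPower, fc2.y = SpecularLevel):
// vertex shader
m44 op, va0, vc0
mov v0, va1                    // UVs
mov v1, va2                    // Normal
sub v2, vc4, va0               // LightVec
sub v3, vc5, va0               // ViewVec
// fragment shader
tex ft0, v0, fs0 <2d, linear, miplinear, repeat>   // Diffuse
nrm ft1.xyz, v1                // Normal
nrm ft2.xyz, v2                // LightVec
nrm ft3.xyz, v3                // ViewVec
dp3 ft4.x, ft1.xyz, ft3.xyz    // dot(Normal, ViewVec)
mul ft4.x, ft4.x, fc0.w        // * 2.0
mul ft5.xyz, ft1.xyz, ft4.x
sub ft5.xyz, ft5.xyz, ft3.xyz  // reflect(-ViewVec, Normal)
dp3 ft6.x, ft2.xyz, ft5.xyz    // dot(LightVec, reflect(...))
max ft6.x, ft6.x, fc0.x
pow ft6.x, ft6.x, fc2.x        // ^ SpecularPower
mul ft6.x, ft6.x, fc2.y        // * SpecularLevel
dp3 ft7.x, ft1.xyz, ft2.xyz    // the Lambert term
max ft7.x, ft7.x, fc0.x
add ft7.x, ft7.x, fc1.x        // + Ambient
mul ft0, ft0, ft7.x            // Diffuse * Lambert
add oc, ft0, ft6.x             // + Phong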
Normal mapping
A relatively simple method of simulating surface relief using normal textures. The direction of the normal in such a texture is usually stored as an RGB value obtained by packing its coordinates into the 0..1 range (xyz * 0.5 + 0.5). Normals can be represented either in object space or in tangent space, which is built from the texture coordinates and the vertex normal. The first option has a number of sometimes significant drawbacks, namely high texture memory consumption due to the impossibility of tiling and mirrored texturing, although it does save a few instructions. In this example we will use the more flexible and generic tangent-space variant, which, in addition to the normal, requires two extra basis vectors, Tangent and Binormal. The implementation comes down to transforming the viewVec and lightVec vectors into the TBN (Tangent, Binormal, Normal) basis in the vertex shader, and then sampling the tangent-space normal from the texture in the fragment shader:
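A sketch under the same assumptions, with two extra attributes (va3 = Tangent, va4 = Binormal), vc6 assumed to hold the same (0.0, 0.5, 1.0, 2.0) constant on the vertex side, and fs1 bound to the normal map:
// vertex shader
m44 op, va0, vc0
mov v0, va1                    // UVs
sub vt0, vc4, va0              // object-space LightVec
dp3 v1.x, vt0.xyz, va3.xyz     // project onto Tangent
dp3 v1.y, vt0.xyz, va4.xyz     // project onto Binormal
dp3 v1.z, vt0.xyz, va2.xyz     // project onto Normal
mov v1.w, vc6.z                // varyings must be written in full
sub vt1, vc5, va0              // object-space ViewVec, for a specular term if needed
dp3 v2.x, vt1.xyz, va3.xyz
dp3 v2.y, vt1.xyz, va4.xyz
dp3 v2.z, vt1.xyz, va2.xyz
mov v2.w, vc6.z
// fragment shader
tex ft0, v0, fs0 <2d, linear, miplinear, repeat>   // Diffuse
tex ft1, v0, fs1 <2d, linear, miplinear, repeat>   // packed normal
mul ft1.xyz, ft1.xyz, fc0.w    // unpack 0..1 -> 0..2
sub ft1.xyz, ft1.xyz, fc0.z    // -> -1..1
nrm ft1.xyz, ft1               // tangent-space Normal
nrm ft2.xyz, v1                // tangent-space LightVec
dp3 ft3.x, ft1.xyz, ft2.xyz
max ft3.x, ft3.x, fc0.x
add ft3.x, ft3.x, fc1.x        // + Ambient
mul oc, ft0, ft3.x             // Lambert with the relief normal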
Toon
A kind of non-photorealistic lighting model that imitates cartoon-style shading. It can be implemented in many ways, the simplest being to pick a color from a 1D texture using the cosine term from the Lambert model as the coordinate. In our case we will use a 16x1 texture, as in the example below:
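A sketch; the vertex shader is the same as in the Lambert example (v1 = Normal, v2 = LightVec), and fs0 is assumed to hold the 16x1 ramp:
// fragment shader
nrm ft1.xyz, v1
nrm ft2.xyz, v2
dp3 ft0.x, ft1.xyz, ft2.xyz    // the Lambert cosine becomes the u coordinate
max ft0.x, ft0.x, fc0.x
mov ft0.y, fc0.x               // v = 0 for a 1D lookup
tex ft3, ft0, fs0 <2d, nearest, nomip, clamp>   // nearest filtering keeps the bands sharp
mov oc, ft3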
Sphere map
The easiest way to simulate reflections, often used for a chrome-metal effect. It represents the environment as a texture with a spherical fisheye distortion. The main task is to convert the coordinates of the reflection vector into the corresponding texture coordinates:
uv = (xy / sqrt(x^2 + y^2 + (z + 1)^2)) * 0.5 + 0.5
The multiplication and shift by 0.5 bring the normalized result into the 0..1 texture coordinate space. In the simplest case of a perfectly reflective surface, the map is applied additively; in more complex cases, when a diffuse component is required, it is customary to use an approximation of the Fresnel formula. For complex models, Reflection maps are also often used to specify the reflection intensity over different parts of the model's texture:
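A sketch of the lookup, assuming the interpolators carry the normal in v1 and the view vector in v2 (ideally in view space, so that the map stays fixed relative to the camera) and fs0 holds the sphere map:
// fragment shader
nrm ft0.xyz, v1                 // Normal
nrm ft1.xyz, v2                 // ViewVec
dp3 ft2.x, ft0.xyz, ft1.xyz     // dot(Normal, ViewVec)
mul ft2.x, ft2.x, fc0.w         // * 2.0
mul ft3.xyz, ft0.xyz, ft2.x
sub ft3.xyz, ft3.xyz, ft1.xyz   // reflection vector
mov ft4, ft3
add ft4.z, ft4.z, fc0.z         // z + 1
dp3 ft5.x, ft4.xyz, ft4.xyz     // x^2 + y^2 + (z + 1)^2
rsq ft5.x, ft5.x                // 1 / sqrt(...)
mul ft4.xy, ft3.xy, ft5.x       // xy / sqrt(...)
mul ft4.xy, ft4.xy, fc0.y       // * 0.5
add ft4.xy, ft4.xy, fc0.y       // + 0.5
tex ft6, ft4, fs0 <2d, linear, nomip, clamp>
mov oc, ft6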
That is probably all for now. The examples presented here mostly describe the properties of an object's material, but shaders also find application in other tasks, such as skeletal animation, shadows, water, and other relatively complex things (including non-visual ones). And with the proper leveling-up of skills, they let you implement fairly complex effects in a short time.