
Why 2D vector graphics are so much harder than 3D.

A lot of fantastic research into 2D rendering has appeared recently. Petr Kobalicek and Fabian Yzerman have been working on Blend2D, one of the fastest and most accurate CPU rasterizers on the market, built around a novel JIT approach. Patrick Walton of Mozilla has explored not one but three different approaches in Pathfinder, culminating in Pathfinder v3. Raph Levien has built a compute-based pipeline around the techniques described in Ganacim et al.'s 2014 paper on vector textures. Signed distance fields seem to be getting further development as well, with Adam Simmons and Sarah Frisken working on them independently.



Some might ask: why all the fuss about 2D? It can't possibly be harder than 3D, right? 3D is a whole extra dimension! Real-time ray tracing with accurate lighting is just around the corner, and yet we can't master humble 2D graphics with flat colors?



For those not well versed in the details of a modern GPU, this really is surprising! But 2D graphics comes with many unique constraints that make it a hard problem, and one that does not lend itself well to parallelization. Let's walk through the history that brought us here.



The rise of PostScript



In the beginning, there was the plotter. The first graphics devices that could interact with a computer were called "plotters": one or more pens that could move across a sheet of paper. Everything works from a "pen down" command; the drawing head then moves in some fashion, possibly along a curve, until a "pen up" command arrives. HP, the manufacturer of some of the earliest plotters, used a BASIC variant called AGL on the host computer, which in turn sent commands to the plotter in a different language such as HP-GL. During the 1970s, graphics terminals became cheaper and more popular, starting with the Tektronix 4010. It displayed an image on a CRT, but don't be fooled: this was not a pixel display. Tektronix came out of the analog oscilloscope industry, and these machines worked by steering the electron beam along a specific path. So the Tektronix 4010 had no pixel output. Instead, you sent it commands in a simple graphics mode that could draw lines — again, in a pen-down, pen-up fashion.
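The pen-down/pen-up model above can be sketched as a toy interpreter. The command names here are simplified stand-ins of my own, not real HP-GL syntax: "down" and "up" toggle the pen, "move x y" repositions the head, and a line segment is recorded whenever the head moves while the pen is down.

```javascript
// A toy interpreter for the plotter's pen-up/pen-down model.
// Commands are hypothetical, simplified stand-ins for a real
// plotter language such as HP-GL.
function runPlotter(commands) {
  let pen = false;        // pen starts lifted
  let pos = [0, 0];       // current head position
  const segments = [];    // line segments actually drawn
  for (const cmd of commands) {
    if (cmd === "down") pen = true;
    else if (cmd === "up") pen = false;
    else {
      const parts = cmd.split(" "); // "move x y"
      const x = Number(parts[1]), y = Number(parts[2]);
      if (pen) segments.push([pos, [x, y]]); // drawing happens only pen-down
      pos = [x, y];
    }
  }
  return segments;
}

const segs = runPlotter(["move 0 0", "down", "move 10 0", "move 10 10", "up", "move 0 10"]);
console.log(segs.length); // 2 — only pen-down moves draw
```

Note there is no notion of pixels anywhere: the "image" is just the list of strokes, which is exactly why these devices were resolution-independent.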


As in so many other fields, everything changed with an invention from Xerox PARC. Researchers there began developing a new kind of printer, one that was more computationally expressive than a plotter. This new printer ran a small, stack-based, Turing-complete programming language similar to Forth, which they named... Interpress! Xerox, naturally, could not find a worthy use for it, so the inventors jumped ship and founded a small startup called Adobe. They took Interpress with them, and as it was revised and improved it changed beyond recognition, so they gave it a new name: PostScript. Besides the cute Turing-complete stack language, chapter 4 of the original PostScript Language Reference describes the Imaging Model, which is almost identical to modern graphics APIs. Example 4.1 from the manual contains sample code that can be translated almost line by line into HTML5 <canvas>.



```postscript
/box {
    newpath
    0 0 moveto
    0 1 lineto
    1 1 lineto
    1 0 lineto
    closepath
} def

gsave
72 72 scale
box fill
2 2 translate
box fill
grestore
```

And the HTML5 <canvas> equivalent:

```javascript
function box() {
    ctx.beginPath();
    ctx.moveTo(0, 0);
    ctx.lineTo(0, 1);
    ctx.lineTo(1, 1);
    ctx.lineTo(1, 0);
    ctx.closePath();
}

ctx.save();
ctx.scale(72, 72);
box(); ctx.fill();
ctx.translate(2, 2);
box(); ctx.fill();
ctx.restore();
```


This is no coincidence.



Apple's Steve Jobs had met the Interpress engineers during his visit to PARC. Jobs figured the printing business would be lucrative, and he tried to buy Adobe outright in its infancy. Adobe made a counteroffer instead, and ultimately sold Apple a five-year PostScript license. The third pillar of Jobs's plan was funding a small startup, Aldus, which was building a WYSIWYG application for creating PostScript documents, called PageMaker. In early 1985, Apple released the first PostScript-compatible printer, the Apple LaserWriter. The combination of Macintosh, PageMaker, and LaserWriter instantly turned the printing industry on its head, and the hot new "desktop publishing" niche secured PostScript's place in history. Chief rival Hewlett-Packard eventually licensed PostScript as well for its competing LaserJet series of printers, in 1991, after consumer pressure.



PostScript slowly transitioned from a printer control language into a file format in its own right. Enterprising programmers studied the form in which PostScript commands were sent to the printer and began crafting PostScript documents by hand, adding charts, graphs, and drawings to their documents, and using PostScript to put graphics on screen as well. There was demand for graphics outside the printer! Adobe noticed and quickly released Encapsulated PostScript, which was little more than a few specially formatted PostScript comments supplying image-size metadata, plus restrictions on printer-specific commands such as page feeds. That same year, 1985, Adobe began development of Illustrator, an application that let artists author Encapsulated PostScript files in a friendly UI. Those files could then be placed into word processors, which produced... PostScript documents to send to PostScript printers. The whole world was moving to PostScript, and Adobe could not have been happier. When Microsoft was working on Windows 1.0 and wanted its own graphics API for developers, a major goal was compatibility with existing printers, so that graphics could be sent to a printer as easily as to the screen. That API was eventually released as GDI, a core component used by every engineer during Windows's explosive growth in the 90s. Generations of programmers on the Windows platform came, without realizing it, to equate 2D vector graphics with the PostScript imaging model, cementing its de facto status.



The only serious problem with PostScript was its Turing completeness: viewing page 86 of a document means first running the script for pages 1-85. And that can be slow. Adobe heard this complaint from users and decided to create a new document format without that limitation, called the "Portable Document Format", or "PDF" for short. The programming language was thrown out, but the graphics technology stayed the same. A quote from the PDF specification, chapter 2.1, "Imaging Model":



At the heart of PDF is its ability to describe the appearance of sophisticated graphics and typography. This is achieved through the use of the Adobe imaging model, the same high-level, device-independent representation used in the PostScript page description language.


When the W3C was considering candidates for 2D markup on the web, Adobe championed the XML-based PGML, which was built on the PostScript graphics model:



PGML should have a PDF/PostScript imaging model to guarantee scalable 2D graphics that satisfy the needs of both casual users and graphics professionals.


Microsoft's rival format, VML, was based on GDI, which as we know is based on PostScript. The two competing proposals, each still essentially PostScript, were merged, and the W3C adopted the "Scalable Vector Graphics" (SVG) standard we know and love today.



Old as it is, let's not pretend that the innovations PostScript brought into the world are anything less than a technological miracle. Apple's PostScript printer, the LaserWriter, had twice the processing power of the Macintosh that drove it, just to interpret PostScript and rasterize vector paths into dots on paper. That might sound excessive, but if you were already buying a fancy printer with a laser in it, an expensive CPU was hardly a surprise. In its first incarnation, PostScript invented a fairly sophisticated imaging model, with all the features we now take for granted. But the most powerful, most awe-inspiring feature? Fonts. Fonts at the time were drawn by hand with ruler and protractor and cast onto photochemical film for printing. In 1977, Donald Knuth showed the world what his METAFONT system, introduced alongside his typesetting system TeX, was capable of — but it never caught on. It required the user to describe fonts mathematically, with brushes and curves, which most font designers had no desire to learn. And the fine curves turned to mush at small sizes: printers of the day lacked the resolution, so letters smudged and ran into one another. PostScript proposed a new solution: an algorithm to "snap" outlines to the coarser grids that printers operated on. This is known as grid fitting. To keep the geometry from distorting too much, fonts were allowed to specify "hints" about which parts of the geometry were most important and should be preserved.
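The core idea of grid fitting can be shown with a toy sketch. This is my own illustration, not Adobe's actual hinting algorithm: a vertical stem is described by its left edge and width in fractional pixels, and we snap it to the pixel grid while preserving a minimum width — exactly the kind of constraint a font "hint" would mark as important.

```javascript
// A toy illustration of grid fitting (not Adobe's hinting algorithm):
// snap a vertical stem to whole pixels without letting it vanish.
function fitStem(left, width) {
  const snappedWidth = Math.max(1, Math.round(width)); // hint: keep the stem at least 1px wide
  const snappedLeft = Math.round(left);                // align the edge to the pixel grid
  return { left: snappedLeft, width: snappedWidth };
}

console.log(fitStem(3.6, 1.3)); // { left: 4, width: 1 }
console.log(fitStem(2.2, 0.4)); // { left: 2, width: 1 } — a sub-pixel stem survives
```

Without the minimum-width constraint, a 0.4-pixel stem would round to zero and the stroke would disappear at small sizes — the blurring-and-merging problem the paragraph above describes.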



Adobe's original business model was to license this font technology to printer manufacturers and to sell publishers specially redrawn fonts with hints added — which is why Adobe still sells its own versions of Times and Futura. Incidentally, this is possible because fonts — or, more formally, "typefaces" — are one of the five things explicitly excluded from US copyright law, having originally been deemed too plain or utilitarian to be creative works. What is copyrightable, instead, is the digital program that reproduces the font on screen. So that people could not copy Adobe's fonts and add their own, the Type 1 Font format was originally proprietary to Adobe and contained "font encryption" code. Only Adobe's PostScript could interpret Type 1 fonts, and only they implemented the proprietary hinting technology that kept text crisp at small sizes.



Grid fitting, by the way, became so popular that when Microsoft and Apple got tired of paying Adobe's licensing fees, they invented an alternative approach for their alternative font format, TrueType. Instead of declarative "hints", TrueType gives the font author a full Turing-complete stack language, so the author can control every aspect of grid fitting (neatly sidestepping Adobe's patents on declarative hints). For years, a format war raged between Adobe's Type 1 and TrueType, with font designers stuck in the middle, shipping both formats to users. In the end the industry reached a compromise: OpenType. But rather than actually declaring a winner, they simply shoved both specifications into one file format. Adobe, by then making its money on Photoshop and Illustrator rather than on selling Type 1 fonts, removed the cryptography, polished the format, and introduced CFF / Type 2 fonts, which were incorporated wholesale into OpenType as the cff table. TrueType, for its part, was pasted in as glyf and other tables. Ugly as it is, OpenType seemed to get the job done for users, mostly by taking them out of the equation: simply demand that all software support both kinds of fonts, because OpenType requires you to support both kinds of fonts.



Of course, we're left asking: if not PostScript, then what? It's worth looking at the alternatives. The aforementioned METAFONT did not use strictly outlined letterforms (filled paths) at all. Instead Knuth, in typical Knuth fashion, in his paper "Mathematical Typography" proposed a mathematical notion of the "most pleasing curve" for typography. You specify a few points, and an algorithm finds the correct "most pleasing" curve through them. You can layer these strokes on top of one another: define some of them as "pens", and then "drag the pen" along some other curve. Knuth, a computer scientist at heart, even threw in recursion. His student John Hobby developed and implemented the algorithms for computing the "most pleasing curve", for layering nested strokes, and for rasterizing such curves. For more on METAFONT, curves, and the history of typography in general, I heartily recommend the book Fonts & Encodings, along with John Hobby's papers.
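The "pleasing curve through given points" idea can be illustrated with a far simpler scheme than Hobby's actual algorithm: a Catmull-Rom segment, which interpolates smoothly between two points while using their neighbors to shape the tangents. To be clear, this is only a stand-in for the concept — Hobby's splines choose tangents much more carefully — but it shows the key property: the curve passes through the points you specify.

```javascript
// A Catmull-Rom segment as a stand-in for the "curve through given points"
// idea (NOT Hobby's algorithm). Evaluates one coordinate at parameter t in
// [0, 1]; the segment runs from p1 to p2, with p0 and p3 shaping the tangents.
function catmullRom(p0, p1, p2, p3, t) {
  const t2 = t * t, t3 = t2 * t;
  return 0.5 * (2 * p1
    + (-p0 + p2) * t
    + (2 * p0 - 5 * p1 + 4 * p2 - p3) * t2
    + (-p0 + 3 * p1 - 3 * p2 + p3) * t3);
}

console.log(catmullRom(0, 1, 3, 4, 0)); // 1 — starts exactly at p1
console.log(catmullRom(0, 1, 3, 4, 1)); // 3 — ends exactly at p2
```

Sampling t densely between the interior points of a longer sequence traces a smooth curve through all of them, which is the user experience METAFONT was after.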



Fortunately, the renewed interest in 2D research means that Knuth's and Hobby's splines have not been entirely forgotten. They may be obscure and unconventional, but they recently snuck into the Apple iWork suite, where they are now the default spline type.



The rise of triangles



Without going too deep into the mathematics, at a high level we call approaches like Bezier curves and Hobby splines implicit curves, because they are specified as a mathematical function that generates the curve. They look good at any resolution, which is exactly what you want for 2D images meant to be scaled.
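A minimal sketch of what "a function that generates the curve" means in practice: evaluating a cubic Bezier curve with de Casteljau's algorithm, which is just repeated linear interpolation between control points. Sampling t over [0, 1] produces points on the curve at whatever density you like, which is the source of the resolution independence described above.

```javascript
// De Casteljau evaluation of a cubic Bezier curve: three rounds of
// linear interpolation between the control points.
function lerp(a, b, t) {
  return [a[0] + (b[0] - a[0]) * t, a[1] + (b[1] - a[1]) * t];
}
function cubicBezier(p0, p1, p2, p3, t) {
  const a = lerp(p0, p1, t), b = lerp(p1, p2, t), c = lerp(p2, p3, t);
  const d = lerp(a, b, t), e = lerp(b, c, t);
  return lerp(d, e, t); // the point on the curve at parameter t
}

console.log(cubicBezier([0, 0], [0, 1], [1, 1], [1, 0], 0.5)); // [ 0.5, 0.75 ]
```

There are no pixels anywhere in this definition; rasterization is a separate, later step, and you can re-run it at any scale.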



2D graphics kept the momentum going around these implicit curves, which are all but indispensable for modeling glyphs. The hardware and software needed to compute these paths in real time were expensive, but the printing industry provided a strong push toward vector graphics, and most of the other industrial equipment already in use cost far more than a laser printer with a fancy CPU.



3D graphics, however, went a completely different route. From very early on, the almost universal approach was to use polygons, often numbered by hand and entered into the computer manually. Not quite universal, though. The 3D equivalent of an implicit curve is the implicit surface, built from basic geometric primitives such as spheres, cylinders, and cubes. A perfect sphere at infinite resolution can be represented by a simple equation, so at the dawn of 3D it was a clear contender against polygons for representing geometry. One of the few companies to build their graphics on implicit surfaces was MAGI. Combined with clever artistic use of procedural textures, that won them the contract with Disney to design the "light cycles" for the 1982 film Tron. Unfortunately, the approach quickly died out. Thanks to faster CPUs and research into problems like hidden surface removal, the number of triangles you could display in a scene grew rapidly, and for complex shapes it was far easier for artists to think in terms of polygons and vertices you can click and drag than in combinations of cubes and cylinders.
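The "simple equation" for a sphere is worth making concrete. An implicit surface is the zero set of a function: points where f(p) = 0 lie on the surface, f < 0 means inside, f > 0 outside. This one-line function represents the sphere exactly, at any resolution — no triangles required.

```javascript
// An implicit sphere of radius r centered at the origin:
// f(p) = x^2 + y^2 + z^2 - r^2, zero exactly on the surface.
function sphere(x, y, z, r) {
  return x * x + y * y + z * z - r * r;
}

console.log(sphere(1, 0, 0, 1));       // 0 — on the surface
console.log(sphere(0.5, 0, 0, 1) < 0); // true — inside
console.log(sphere(2, 0, 0, 1) > 0);   // true — outside
```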



That is not to say implicit surfaces played no part in the modeling process. Techniques like Catmull-Clark subdivision became an accepted industry standard by the early 80s, letting artists create smooth, organic shapes from simple base geometry. Yet it wasn't until the early 2000s that Catmull-Clark was even characterized as an "implicit surface" that could be computed with an equation. Before that, it was viewed purely as an iterative algorithm: a way of subdividing polygons into ever more polygons.



Triangles took over the world, and the tools for creating 3D content followed. New generations of video game developers and film effects artists were trained exclusively on polygon-mesh modeling programs such as Maya, 3DS Max, and Softimage. When "3D graphics accelerators" (GPUs) appeared on the scene in the late 1980s, they were designed specifically to accelerate the content that already existed: triangles. Early GPU designs such as the NVIDIA NV1 did include limited hardware support for curves, but it was buggy and was quickly dropped from the product line.



That culture largely carries through to what we see today. The dominant 2D imaging model, PostScript, started with a product that had to render curves in real time. The 3D industry, meanwhile, wrote off curves as too hard to work with and instead relied on offline tools to pre-convert curves into triangles.



The return of implicit surfaces



But why could implicit 2D curves be computed in real time on a printer in the 80s, while the same implicit 3D surfaces were still a buggy mess in the early 2000s? Well, Catmull-Clark is a much more complicated beast than a Bezier curve. Bezier curves do have 3D counterparts, known as B-splines, and those are computable, but at the cost of restricting how the mesh can be connected. Surfaces like Catmull-Clark and NURBS allow arbitrarily connected meshes, which expands what artists can do, but this can produce polynomials of degree greater than four, which in general have no analytic solution. Instead you get approximations based on subdividing polygons, as in Pixar's OpenSubdiv. If anyone ever finds an analytic way to compute the roots of Catmull-Clark or NURBS, Autodesk will pay them handsomely. Compared to that, triangles look a lot friendlier: just evaluate three linear equations in the plane and you have your answer.
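Those "three linear equations" are the classic edge-function test: a point lies inside a triangle if it is on the same side of all three edges, and each edge contributes one linear function (a 2D cross product). A minimal sketch:

```javascript
// Signed area test for edge (a -> b) against point p: positive on one
// side of the edge, negative on the other (a 2D cross product).
function edge(ax, ay, bx, by, px, py) {
  return (bx - ax) * (py - ay) - (by - ay) * (px - ax);
}

// Inside if all three edge functions agree in sign (either winding order).
function pointInTriangle(p, a, b, c) {
  const e0 = edge(a.x, a.y, b.x, b.y, p.x, p.y);
  const e1 = edge(b.x, b.y, c.x, c.y, p.x, p.y);
  const e2 = edge(c.x, c.y, a.x, a.y, p.x, p.y);
  return (e0 >= 0 && e1 >= 0 && e2 >= 0) || (e0 <= 0 && e1 <= 0 && e2 <= 0);
}

const tri = [{ x: 0, y: 0 }, { x: 4, y: 0 }, { x: 0, y: 4 }];
console.log(pointInTriangle({ x: 1, y: 1 }, ...tri)); // true
console.log(pointInTriangle({ x: 3, y: 3 }, ...tri)); // false
```

Three multiply-adds and a sign check per edge: cheap, branch-light, and trivially parallel across pixels, which is why GPUs embraced triangles so readily.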



...But what if we don't need an exact solution? That is precisely the question graphics researcher Inigo Quilez asked while exploring implicit surfaces. The answer? Signed distance fields (SDFs). Instead of giving the exact intersection point with the surface, they tell you how far away from it you are. Much like the difference between an analytically computed integral and Euler integration, if you have the distance to the nearest object you can "march" through the scene, asking at each point how far away you are and stepping forward by that distance. These surfaces breathed new life into the field through the demoscene and communities like Shadertoy. This hack on MAGI's old modeling technique yields incredible results, like Quilez's Surfer Boy, computed at infinite precision as an implicit surface. You don't need to find the algebraic roots of Surfer Boy; you just feel your way through the scene.



Of course, the problem is that only a genius like Quilez can build a Surfer Boy. There are no tools for SDF geometry; all the code is written by hand. Still, given the exciting revival of implicit surfaces and the natural shapes curves provide, there is now plenty of interest in the technique. MediaMolecule's PS4 game Dreams is a content creation kit built around combining implicit surfaces, tearing down and rebuilding most of the traditional graphics pipeline in the process. It is a promising approach, and the tools are intuitive and fun. Oculus Medium and unbound.io have also done good research here. It is definitely a promising glimpse of what the future of 3D graphics and next-generation tooling might look like.



But some of these approaches are less suited to 2D than you might think. General 3D game scenes tend to have fancy materials and textures but relatively little geometric detail, as plenty of critics and sellers of dubious products are quick to point out. That means less anti-aliasing is needed, because silhouettes matter less. Approaches like 4x MSAA may be fine for many games, but for small text in solid colors, rather than a handful of fixed sample locations you would much rather compute the exact area under the curve for each pixel, which gives you as much resolution as you want.
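The difference can be made concrete with a toy comparison (my own sketch, not any particular rasterizer's code): the coverage of the unit pixel [0,1]x[0,1] below an edge y = m*x + b. The exact area is the integral of clamp(m*x + b, 0, 1) over x in [0,1]; for a shallow edge that stays inside the pixel, this is just b + m/2. A fixed sample grid can only ever answer in quantized steps.

```javascript
// Exact pixel coverage below the edge y = m*x + b.
// Assumes 0 <= m*x + b <= 1 across the pixel, as for a shallow edge,
// so the integral of (m*x + b) over [0, 1] needs no clamping.
function exactCoverage(m, b) {
  return b + m / 2;
}

// An n x n grid of fixed sample points, MSAA-style: count samples below
// the edge. The answer is quantized to multiples of 1/(n*n).
function sampledCoverage(m, b, n = 4) {
  let covered = 0;
  for (let i = 0; i < n; i++) {
    for (let j = 0; j < n; j++) {
      const x = (i + 0.5) / n, y = (j + 0.5) / n;
      if (y < m * x + b) covered++;
    }
  }
  return covered / (n * n);
}

console.log(exactCoverage(0.25, 0.3));   // exact area, at any "resolution"
console.log(sampledCoverage(0.25, 0.3)); // 0.4375 — quantized to sixteenths
```

For a glyph stem a fraction of a pixel wide, sixteen gray levels per pixel are visibly coarse; the exact integral is what high-quality font rasterizers compute instead.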



Rotating the camera in a 3D game causes an effect similar to saccadic suppression, as the brain reorients to the new view. In many games this helps hide artifacts from post-processing effects such as temporal anti-aliasing, which Dreams and unbound.io rely on for good scene performance. In a typical 2D scene, by contrast, we have no such perspective to hide behind, so using those techniques would make glyphs and shapes boil and shimmer with artifacts in plain view. In 2D, artifacts look different, and expectations are higher. Stability under zooming, panning, and scrolling matters.



None of these effects is impossible to achieve on a GPU, but they show a radical divergence from "3D" content, with different priorities. Ultimately, 2D rendering is hard because it is about shapes — exact letters and symbols, not materials and lighting — filled, for the most part, with solid colors. Through their evolution, graphics accelerators chose not to bother with real-time implicit geometry like curves, and instead focused on everything that happens inside those curves. Perhaps, if PostScript hadn't won, we would have a 2D imaging model without Bezier curves as a core real-time requirement. Perhaps in that world, triangles would have been displaced by better geometry representations, content creation tools would center on 3D splines, and GPUs would support curves in hardware, in real time. In the end, it is always fun to dream.

Source: https://habr.com/ru/post/451394/


