Richard Hamming: Chapter 9. N-Dimensional Space

Hi, Habr. Remember the awesome article "You and your work" (+219, 2222 bookmarks, 350k reads)?

So Hamming (yes, yes, self-checking and self-correcting Hamming codes ) has a whole book based on his lectures. We translate it here, because the man says it.

This book is not just about IT, it is a book about the thinking style of incredibly cool people. “This is not just a charge of positive thinking; it describes the conditions that increase the chances of doing a great job. ”
')
We have already translated 6 (out of 30) chapters.

Chapter 9. N-dimensional space

(Thanks to Alexey Fokin for the translation, who responded to my call in the “previous chapter.”) Who wants to help with the translation - write in a personal or mail magisterludi2016@yandex.ru

When I became a professor after 30 years of active research at Bell Telephone Laboratories mainly in the department of mathematical research, I remembered that professors should reflect on and summarize past experience. I put my feet on the table and began to think about my past. In the early years I was mainly involved in computations, that is, I was involved in many large projects that require computation. Thinking about how several large engineering systems were developed, in which I was partially involved, I began, being now at some distance from them, to see that they had many common elements. Over time, I began to realize that the design tasks are in n-dimensional space, where n is the number of independent parameters. Yes, we create 3-dimensional objects, but their design is in multidimensional space, 1 dimension for each design parameter.

Multidimensional spaces will be needed for further proofs to become intuitive without strict detail. Therefore, we will now consider the n-dimensional space.

You think you live in three-dimensional space, but in many cases you live in two-dimensional space. For example, in the random course of life, if you meet someone, you have a reasonable chance to meet that person again. But in the world of 3 dimensions this chance is not! Consider fish in the ocean that potentially live in three dimensions. They move along the surface or along the bottom, limiting things to two dimensions, or they form shoals, or gather in one place at the same time, for example, such as the mouth of the river, the beach, the Sargasso Sea, etc. They cannot expect to meet a friend if they travel in the open ocean in three dimensions. Or, for example, if you want the planes to collide you must collect them near the airport, place them on two-dimensional flight levels, or send them in a group; really random flights would have less accidents than it does now!

N-dimensional space is a mathematical construction that we must explore in order to understand what happens to us when we wander there when solving design problems. In two dimensions, we have the Pythagorean theorem; for a right triangle, the square of the hypotenuse is equal to the sum of the squares of the other sides. In three dimensions, we are interested in the length of the diagonal of the parallelepiped, Fig. 9.1. To find it, we first draw a diagonal of one face, apply the Pythagorean theorem, then take it as one side with the other side of the third dimension that is perpendicular to it, and again from the Pythagorean theorem we find that the square of the diagonal eats the sum of the squares of three perpendicular sides. Obviously, this proof and the necessary symmetry of the formula follow, that if you rise to higher dimensions, all of you also have a square of a diagonal equal to the sum of squares of mutually perpendicular sides,

where x _{i are the} lengths of the sides of a rectangular block in n-dimensional space.

Fig. 9.I

Continuing the geometric approach, the planes in space will be simply linear combinations of x _i , and the sphere around the point will be all points located at the same fixed distance from the given one.

We need the volume of the n-dimensional sphere to understand the idea of the size of a piece of limited space. But first, we need a Stirling approximation for n !, which I will derive, so that you understand most of the details and are confident of what is right, rather than taking a word.

With a product of type n! difficult to handle, so we take log n !, which becomes

where, of course, ln is the logarithm of base e. The sums remind us that they are connected with integrals, so we will start with such an integral

We use integration by parts (since we know that ln x comes from the integration of an algebraic function and, therefore, can be excluded in the next step). Let U = ln x, dV = dx, then

On the other hand, if we apply the trapezoid formula to the integral ln x we get, see Fig. 9.II,

Since ln 1 = 0, adding (1/2) * ln n to both sides of the equality we end up with

We will get rid of logarithms, raising e to the power of both parts.

where C is a certain constant (close to e) that does not depend on n, since we approximated the integral by flrmula trapezium, the error grows slower and slower with increasing n

Fig. 9.II

more and more, C has a limit. This is the first form of Stirling formula. We will not spend time on calculating the limit of the constant C as we approach infinity, which turns out to be √ (2 * π) = 2.5066 ... (e = 2.71828 ...). Thus, we finally get the Stirling formula for factorial

The following table shows the accuracy of the Stirling approximation for n!

Please note that as the numbers increase, the coefficient approaches one, but the difference gets bigger and bigger!

If you consider 2 functions

then the limit of the ratio f (n) / g (n) with n tending to infinity is 1, but as in the table the difference

getting bigger and bigger as n grows.

We need to extend the concept of factorial to the set of all positive real numbers, for this we introduce a gamma function in the form of an integral

which exists for all n> 0. For n> 1, we integrate again in parts, this time we use dV = e ^ (- x) dx and U = x ^ (n-1). For two limits, the integrable part is 0 and we have the following formula

with r (1) = 1.

Thus, the gamma function takes the values (n-1)! for all positive integers n and naturally expands the concept of factorial to all positive numbers, since the integral exists for all n> 0.

We will need

Denote x = t ^ 2, then dx = 2t * dt, and get (using symmetry in the last step)

Now we use the standard approach to calculate this integral. We get the product of two integrals, one in the variable x and one in the variable y.

x ^ 2 + y ^ 2 imply polar coordinates, so we convert to

Curvilinear integration (angle integration) is simple. Exponent integration is now also easy, and we end up with it.

In this way,

Now let's return to the volume n of the dimensional sphere (or hypersphere, if you wish). It is clear that the volume of an n-dimensional cube with side x is equal to x ^ n. A little thought, you will understand that the formula for the volume of the n-dimensional sphere should have the form

where C _{n is the} corresponding constant. For the case n = 2, the constant is π, for the case n = 1 it is 2 (if you think about it). In the three-dimensional case, we have C ₃ = 4 * π / 3.

We will start with the same trick used for the gamma functions from 1/2, except that this time we take the product of n integrals, each with its own variable x _i . The volume of a sphere can be represented as the sum of the volumes of surfaces, each term of this sum corresponds to the surface area multiplied by the thickness dr. For a sphere, the value of the surface area can be obtained by differentiating the volume of the sphere with respect to the radius r,

and therefore the terms of the volume are equal

Equating r ^ 2 = t, we have

where we get.

Easy to see that

And we can calculate the following table.

Thus, we see that the coefficient C _n increases to n = 5, and then decreases to 0. For spheres of unit radius, this means that the volume of the sphere tends to zero with increasing dimension. If the radius is r, then for volume, denoting n = 2k for convenience (since real numbers change smoothly with increasing n and the sphere of odd dimensions is more difficult to calculate)

Fig. 9.III

No matter how large the radius r is, an increase in the number of dimensions n gives rise to a sphere of arbitrarily small volume.

Now consider the relative amount of volume located close to the surface of the n-dimensional sphere. Let r be the radius of the sphere, and the inner radius of the surface is r (1-ε), then the relative surface volume is

For large n, no matter how thin (relative to the radius) the surface inside it is almost nothing. As we say the volume is almost all on the surface. Even in 3 dimensional space, a unit sphere has a 7/8 volume inside the surface with a thickness of 1/2 radius. In the n-dimensional space, 1- (1/2) ^ n within half the radius from the surface.

This is important in design; It turns out after the calculations and data transformations above, that almost surely the optimal design will be on the surface, and not deep inside as you might think. Computational methods are usually not suitable for finding the optimum in multidimensional spaces. This is not at all strange; generally speaking the best design is to bring one or more parameters to your extremum - it is obvious that you will find yourself on the surface of the visible design area!

Next we consider the diagonal of the n-dimensional cube, in other words, the vector from the origin to the point with coordinates (1,1, ..., 1). The cosine of the angle between this line and any axis is given by definition as the ratio of the value of the coordinate of the length of the projection onto this axis, which is obviously equal to 1 to the length of the vector, which is √n. Consequently

It follows that for large n the diagonal is almost perpendicular to each coordinate axis!

If we consider points with coordinates (± 1, ± 1, ..., ± 1) then there will be 2n such diagonals, which are all almost perpendicular to each coordinate axis. For n = 10, for example, their number is 1024 such almost perpendicular lines.

I need an angle between two vectors, and although you may remember that this is the scalar product of vectors, I propose to derive it again in order to better understand what is happening. [Remark; I found it very useful in important situations to review all the basic participating inference to get a feel for what is happening.] Take two points x and y, with the corresponding coordinates x _i and y _i . Fig. 9.III. Applying the cosine theorem in the plane of 3 points x, y and the origin, we have

where X and Y are the lengths of segments to points x and y. But C can be obtained using the coordinate differences on each axis.

Equating the two expressions we see

We now apply this formula for two segments drawn from the origin to a random point from the set of coordinates

(± 1, ± 1, ..., ± 1)

The scalar product of two such factors, taken randomly, is again equal to ± 1 and this should be summed n times, with the length of each segment equal to √n, therefore (note n in the denominator)

and according to the weak law of large numbers, this tends to 0 with increasing n almost surely. But there are 2 ^ n random vectors and for a given vector all the rest of the 2 ^ n random vectors are almost sure almost perpendicular to this one! n-dimension is really extensive!

In linear algebra and other disciplines, you learned how to find a set of perpendicular axes and then represent everything else in this coordinate system, but you see that in an n-dimensional space after you find n mutually perpendicular coordinate axes, there are 2 ^ n other directions almost perpendicular to those you found! The theory and practice of linear algebra is completely different!

Finally, to further prove that your intuition about n-dimensional space is not very good, I will produce one more paradox that I will need in the following chapters. Let's start with a 4x4 square divided into 4 unit squares, in each of which we draw the unit circle, Fig. 9.IV. Next, we draw a circle with the center in the center of the square, touching the rest from the inside. Its radius should be from Fig. 9.IV,

In the 3-dimensional space we have a 4x4x4 cube and 8 spheres of unit radius. The inner sphere touching the others at a point lying on the segments connecting the centers has a radius

Consider why its radius is larger than for 2 measurements.

Moving to n dimensions, we have a 4x4x ... x4 cube and 2 ^ n spheres, one in each corner, each touching the other n neighboring ones. The inner sphere, relating to the inside of all the others, will have a radius

Check it out carefully! You are sure? If not, why not? Where is the error in the reasoning?
Making sure this is true, apply to the case of n = 10 measurements. For the inner sphere, we have a radius

Fig. 9.IV

and in 10 dimensional space, the inner sphere extended beyond the cube. Yes, the sphere is convex, yes, it touches the remaining 1024 from the inside, and at the same time it goes beyond the limits of the cube!

This is too much for your sensitive intuition about n-dimensional space, but remember that n-dimensional space is the place where the design of complex objects usually takes place. You should try to better feel the n-dimensional space, reflecting on the things just described, until you begin to see how they can be true, or rather, why they should be true. Otherwise, you will have problems when you solve a complex design problem. Perhaps you should re-calculate the radii of different dimensions, and also return to the angles between the diagonals and the axes of coordinates and see how it turns out.

Now it is necessary to strictly note that I did all this in the classical Euclidean space using the Pythagorean distance, where the sum of the squared differences of the corresponding coordinates is equal to the square of the distance between the points. Mathematicians call this distance L ₂ .

The space L ₁ does not use the sum of the squares of the differences of coordinates, but rather the sum of the distances, as if you were traveling around the city with a rectangular grid of streets. This is the sum of the differences between the two points, which tells you how far you have to go. In the realm of computation, this is often called the Hamming distance, for reasons that will become clear in later chapters. In this space, a circle in two dimensions looks like a square, standing on top, Fig. 9.V. In three-dimensional space it is like a cube, standing on top, etc. Now you can better see how the paradoxical inner sphere of the above example can go beyond the cube.

There is a third, frequently used, metric (they all are metrics = distance functions), called L _∞ , or Chebyshev distance. Here, the distance is taken as the maximum difference of coordinates, regardless of the other differences, Fig. 9.VI. In this space, a circle is a square, a 3-dimensional sphere is a cube, and you see that in this case the inner sphere of paradox has a zero radius in all directions.

These were examples of metrics, distance measures. The conditions for determining the metric D (x, y) between two points x and y are as follows:

1. D (x, y) ≥ 0 (non-negative)
2. D (x, y) = 0 if and only if x = y (identity)
3. D (x, y) = D (y, x) (symmetry)
4. D (x, y) + D (y, z) ≥ D (x, z) (the triangle inequality).

Fig. 9.V

Fig. 9.VI

I leave to you to check that the three metrics of L _∞ , L ₂ and L ₁ (Chebyshev, Pythagoras and Hamming) all satisfy these conditions.

The truth is that in complex design for different coordinates we can use any of these metrics mixed together, so design space is not a complete picture, but a mixture of pieces and parts. The L ₂ metric is obviously associated with the smallest squares, and the other two L _∞ and L _{1 are} more similar to the comparisons. When comparing in real life, you usually use either the maximum difference L _∞ in any one characteristic as a sufficient condition for distinguishing two objects, or sometimes, as in strings of bits, this number of mismatches, which is significant, and the sum of squares does not fit, which means what is used is L ₁ metric. This is more true for pattern identification in AI.

Unfortunately, although everything described above is true, it is rarely revealed to you. No one ever told me about it! I will need a lot of the results in subsequent chapters, but generally speaking after this demonstration, you should be better prepared than you were before for complex design and for a thorough analysis of the space in which the design is carried out, as I tried to do. The mess is basically the place where the design appears, and you have to find an acceptable solution.

Since L ₁ and L _{∞ are} not well known, allow a few remarks about the three metrics. L _{2 is a} natural distance function for use in physical and geometric cases, including the extraction of data from physical measurements. Therefore in physics you will find L ₂ everywhere. But when the subject concerns intellectual judgments, the other 2 metrics are more suitable, although this is slowly perceived, so we often see frequent use of the chi-square estimate, which is obviously a measure of L ₂ , where other more appropriate assessments should be used.

To be continued...

Who wants to help with the translation - write in a personal or mail magisterludi2016@yandex.ru

Book content and translated chapters

Intro to Doing Science and Engineering: Learning to Learn (March 28, 1995) (in work)
Foundations of the Digital (Discrete) Revolution (March 30, 1995) Chapter 2. Basics of the digital (discrete) revolution
"History of Computers - Hardware" (March 31, 1995) (in work)
"History of Computers - Software" (April 4, 1995) is ready
"History of Computers - Applications" (April 6, 1995) (in work)
"Artificial Intelligence - Part I" (April 7, 1995) (in work)
"Artificial Intelligence - Part II" (April 11, 1995) (in work)
"Artificial Intelligence III" (April 13, 1995) (in work)
N-Dimensional Space (April 14, 1995) Chapter 9. N-Dimensional Space
"Coding Theory - The Representation of Information, Part I" (April 18, 1995) (in work)
"Coding Theory - The Representation of Information, Part II" (April 20, 1995)
"Error-Correcting Codes" (April 21, 1995) (in work)
Information Theory (April 25, 1995) (in work, Alexey Gorgurov)
Digital Filters, Part I (April 27, 1995) ready
Digital Filters, Part II (April 28, 1995)
Digital Filters, Part III (May 2, 1995)
Digital Filters, Part IV (May 4, 1995)
Simulation, Part I (May 5, 1995) (in work)
Simulation, Part II (May 9, 1995) ready
Simulation, Part III (May 11, 1995)
"Fiber Optics" (May 12, 1995) in work
“Computer Aided Instruction” (May 16, 1995) (in work)
"Mathematics" (May 18, 1995) Chapter 23. Mathematics
Quantum Mechanics (May 19, 1995) Chapter 24. Quantum Mechanics
Creativity (May 23, 1995). Translation: Chapter 25. Creativity
Experts (May 25, 1995) ready
“Unreliable Data” (May 26, 1995)
Systems Engineering (May 30, 1995) Chapter 28. System Engineering
"You Get What You Measure" (June 1, 1995) (in work)
How Do We Know What We Know (June 2, 1995)
Hamming, “You and Your Research” (June 6, 1995). Translation: You and Your Work

Who wants to help with the translation - write in a personal or mail magisterludi2016@yandex.ru

Source: https://habr.com/ru/post/348264/

All Articles

Richard Hamming: Chapter 9. N-Dimensional Space

Chapter 9. N-dimensional space

More articles: