Richard Hamming: Chapter 19. Modeling - II

“The goal of this course is to prepare you for your technical future.”

Hi, Habr. Remember the awesome article "You and your work" (+219, 2442 bookmarks, 389k reads)?

Well, Hamming (yes, yes, the Hamming of the error-detecting and error-correcting Hamming codes) has a whole book based on his lectures. We are translating it, because the man talks sense.

This book is not just about IT; it is a book about the thinking style of incredibly cool people. “This is not just a dose of positive thinking; it describes the conditions that increase your chances of doing great work.”
We have already translated 22 (out of 30) chapters, and we are working on a print edition.

Chapter 19. Modeling - II

(Thanks for the translation go to V. Pinchuk, who responded to my call in the “previous chapter.”) If you want to help with the translation, send a private message or email magisterludi2016@yandex.ru

We now turn to the question of the reliability of the simulation. I think it is appropriate to start with a quote from the 1975 summer conference on computer modeling:

“Computer simulation is now in widespread use for analyzing system models and evaluating theoretical solutions to observed problems. Since important decisions must rely on simulation, it is essential that its validity be tested, and that its advocates be able to describe the level of authentic representation they have achieved.”

Unfortunately, when you ask about the reliability of a simulation's results, people will often tell you how much effort went into it, how big and fast a computer was used, how important the problem is, and other things that have absolutely nothing to do with the question asked.

I would put the problem a little differently:

Why should anyone believe the simulation is relevant?

Do not start any simulation until you have thought carefully about this question and found proper answers to it. People often try, under all sorts of plausible pretexts, to postpone answering it, but until you have a satisfactory answer everything you do will be a waste of effort or, even worse, misleading or simply wrong.

This question covers both the accuracy of the model and the accuracy of the calculations.

Let me tell another story. It happened one evening after a technical meeting in Pasadena, California, when we all went to dinner together and I happened to sit next to a man who had talked about the reliability simulation of the early space flights, for which he was responsible. This was at a time when there had been eight space launches. He said that no launch went ahead until a reliability of more than 99 percent, say 99.44%, had been established. I pointed out that there had been eight launches, that the astronauts had died in one full-scale ground test, and that there had been one more clear failure, so how could the reliability be that high? He stubbornly held his ground, but fortunately for me the man sitting on his other side took my side, and we forced him to admit, reluctantly, that what he had computed was not the reliability of the launch itself but only the reliability of its simulation. He then claimed that everyone understood this. His refusal to answer my insistent question, “Including the director who gives final approval for the flight?”, was an obvious admission that I was right: he knew perfectly well that the director did not understand the difference and thought the figure was the reliability of the launch itself.
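The arithmetic behind that objection can be sketched in a few lines. This is only an illustration under an assumption that is not in the text: the launches are treated as independent trials that each succeed with the claimed probability. The function name `prob_at_least` is made up for the example; the figures (eight launches, one clear failure, 99.44%) come from the story.

```python
from math import comb

def prob_at_least(failures, trials, reliability):
    """Probability of seeing at least `failures` failures in `trials`
    independent attempts that each succeed with probability `reliability`.
    Independence is an assumption made for this sketch."""
    q = 1.0 - reliability
    return sum(
        comb(trials, k) * q**k * reliability ** (trials - k)
        for k in range(failures, trials + 1)
    )

# Figures from the story: eight launches, one clear failure, claimed 99.44%.
p_one_or_more = prob_at_least(1, 8, 0.9944)
print(f"P(at least one failure in 8 launches) = {p_one_or_more:.3f}")
```

Under these assumptions the observed failure would be a fairly unlikely event, which is one way to see why the claimed figure could not have been the reliability of the launches themselves.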

Later he tried to justify himself by claiming there was nothing else he could have done, to which I immediately pointed out a number of things he could have done to tie the simulation much more closely to reality. That was Saturday evening, and I am sure that by Monday morning he was back to his usual ways of checking the simulation against reality, with little or no independent testing even though such testing was well within his reach. This is what you can expect from simulation experts: they are absorbed in the simulation itself and pay little or no attention to reality, or even to “observable reality.”

Consider the business games and war games that are so widespread today. Is everything essential correctly included in the model, or are we training people to do the wrong things? How relevant to reality are these gaming models? And many other models?

We have long had flight simulators that give pilots much more comprehensive and useful training than real life can provide. In a simulator we can put the pilot into emergency situations that we would never dare to create in reality, and in a variety that we could not hope to reproduce in real flight. Clearly these simulators are very valuable assets. They are relatively cheap, efficient to use and very flexible. In modern terms, they are examples of “virtual reality”.

But time goes on, and other types of aircraft are developed. Will people be as careful as they should be to reflect all the new interactions in the model, or will some small but vital feature of the new aircraft's behavior be overlooked, leaving gaps in the pilot's training for such situations?

Here the problem is clearly visible. It is not that simulations should be abandoned, now or in the near future, because of possible errors; rather, the current generation, which has had little experience with reality, must clearly understand what it takes to make sure a simulation includes all the essential details. How can you convince yourself that you have not made a mistake somewhere in the huge number of details? Remember how many computer programs still contain serious errors even after many years of use! In many situations such mistakes mean the difference between life and death for one or many people, not to mention the loss of valuable equipment, money and time.

The accuracy and reliability of simulations is a serious problem. Unfortunately, there is no panacea and no magic spell for it. All you have is yourself.

Let me now tell you about the most careless simulation I ever did. In the summer of 1955, Bell Telephone Laboratories decided to hold an open house so that people living nearby, as well as relatives and friends of employees, could learn a little about what the people working there did. I was then in charge of a large analog differential analyzer, and I had to provide a demonstration of its operation all Saturday afternoon. At the time we were mainly computing guided missile trajectories, and I did not want to get into security trouble by showing even simplified versions of them. So I decided that tennis, a game that involves aerodynamics, trajectories and so on, would be a fair demonstration of what we did, and in any case I thought it would be more attractive and interesting to the visitors.

Using classical mechanics, I set up the equations, including the elastic rebound, and set up the machine to play one side against a human player on the other. The angle of the racket and the force with which you hit the ball were set on two convenient dials. Remember that in those days (1955) game-playing machines were not yet common in public places, so the display was a novelty for the visitors. Then I invited a friend, an experienced physicist who was also an avid tennis player, to check and adjust the constants for the rebound (on an asphalt court) and for air resistance. When he was satisfied, I asked another physicist, without telling the first, to give a similar opinion, but without letting him change the constants. Thus I got a reasonable model of tennis without “spin” on the ball.
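A digital analogue of such a model might look like the sketch below. It is not Hamming's analog setup: the quadratic drag law, the simple Euler integration, the function name `simulate_ball` and every constant (`drag`, `restitution`, launch height) are placeholder assumptions, exactly the kind of numbers the two physicists were asked to check. The two "dials" become the `angle_deg` and `speed` arguments.

```python
import math

def simulate_ball(angle_deg, speed, drag=0.02, restitution=0.75,
                  height=1.0, dt=0.001, t_end=3.0, g=9.81):
    """Integrate a spin-free ball in 2D with quadratic air drag and an
    elastic rebound from the court at y = 0.
    All constants are illustrative placeholders, not measured values."""
    vx = speed * math.cos(math.radians(angle_deg))
    vy = speed * math.sin(math.radians(angle_deg))
    x, y, t = 0.0, height, 0.0
    path = [(t, x, y)]
    for _ in range(int(round(t_end / dt))):
        v = math.hypot(vx, vy)
        # Drag opposes the velocity and grows with the square of the speed.
        vx += -drag * v * vx * dt
        vy += (-g - drag * v * vy) * dt
        x += vx * dt
        y += vy * dt
        if y < 0.0:
            # Rebound: reflect the position and lose part of the vertical speed.
            y = -y
            vy = -restitution * vy
        t += dt
        path.append((t, x, y))
    return path

with_drag = simulate_ball(angle_deg=10.0, speed=20.0)
no_drag = simulate_ball(angle_deg=10.0, speed=20.0, drag=0.0)
print("distance travelled with drag:", round(with_drag[-1][1], 2))
print("distance travelled without drag:", round(no_drag[-1][1], 2))
```

Comparing runs with and without drag is a cheap sanity check, but it says nothing about whether the constants match a real ball on a real court.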

Had it been for anything other than public entertainment, I would have had to do much more. I would have had to hang a tennis ball on a string in front of a variable-speed fan and carefully note the deflection angles at different air speeds, thus obtaining the air resistance, including its dependence on how worn the ball was. I would have had to drop balls from different heights and measure the rebound in order to check the linearity of the elastic constants. Had it been an important problem, I could have filmed some games and checked whether I could reproduce the shots that had no spin. I did none of these things! It was not worth the cost. That is why it was my most careless simulation.

But the main point of the story is what happened next! The visiting groups were told what was going on by some helpers while they were shown the display of the game on the graphical output devices. Then we let them play against the machine, and I had set up the model so that the machine could lose. Watching the interaction between human and machine from the background, I noticed after a while that not one adult understood what was going on well enough to play successfully, while almost every child did! Think about it! It shows the flexibility of young minds and the rigidity of older ones! It is now common knowledge that most older people cannot operate a VCR, while children can!

Remember this fact: older minds have more trouble adapting to new ideas than younger ones, and for most of your career you will be presenting new ideas in formal presentations to older people. The fact that your children can understand what you are showing says little about whether the audience you are presenting to will. It was a bitter lesson that I had to learn, and I have tried not to repeat the mistake. Older people are not very receptive to new ideas. That does not mean they are stupid or anything of the sort; it is just that older people are usually slow to adapt to radically new ideas.

I have emphasized the need to have the basic laws of the field you are simulating firmly under control. But there are no such laws in economics! The only law of economics that I trust is Hamming's law: “You cannot consume what is not produced.” There is no other reliable law in all of economics; the rest are either mathematical tautologies or simply false. So when you do simulations in economics, you do not have the reliability you have in the exact sciences.

Let me give another example. Some years ago the following happened at the University of California, Berkeley. Approximately equal numbers of men and women applied to graduate school, but many more men than women were accepted. There was no reason to believe that the men were, on average, better prepared than the women. So by the ideal model of fairness there was obvious discrimination. The university president demanded to know which departments were guilty. Careful investigation showed that no department was guilty! How can that be? Easy! Different departments have different numbers of openings for graduate students and different ratios of men to women applying for them. The departments with many openings and many male applicants are the exact sciences, including mathematics; those with few openings and many female applicants are the humanities and the like: literature, history, drama, social sciences, etc. So the discrimination, if you say it occurred, arose because men, when younger, were steered toward mathematics, the foundation of the exact sciences, while women could take it or not as they chose. Those who avoided mathematics shut themselves out of physics, chemistry and engineering, so they simply could not apply where openings were plentiful and had to apply where the competition was stiffer. People have a hard time adapting to such situations these days!

Here you see a phenomenon that is not widely known, although statisticians have studied it thoroughly in its many forms: combining groups of data can create effects that were not present in any of the groups. You are well aware that combining data can hide details, but it is much less well known that it can also create new effects. Be careful in the future that this does not happen to you, so that you are not blamed, on the basis of combined data, for something you had nothing to do with. Simpson's paradox is the well-known example: two groups of data each show a relationship in the same direction, yet when the groups are merged the direction is reversed. For example, A exceeds B and C exceeds D within the separate groups, but in the combined data B + D exceeds A + C. (Translator's note: the paradox arises from incorrectly averaging two groups of data that contain different proportions of control observations, that is, a non-representative sample. It is intuitively assumed that the proportion of controls is the same in both groups; since this does not hold in the original data, simple arithmetic averaging cannot be applied to them.)
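A small numerical illustration of the reversal may help. The admission figures below are invented for the example and are not the Berkeley data; the department labels and the helper names `rate` and `pooled_rate` are likewise hypothetical.

```python
# Invented figures: (applicants, admitted) per department and group.
applications = {
    "many openings": {"men": (80, 48), "women": (20, 13)},
    "few openings": {"men": (20, 2), "women": (80, 12)},
}

def rate(applied, admitted):
    """Admission rate for one group in one department."""
    return admitted / applied

def pooled_rate(group):
    """Admission rate for one group after merging all departments."""
    applied = sum(dept[group][0] for dept in applications.values())
    admitted = sum(dept[group][1] for dept in applications.values())
    return admitted / applied

for name, dept in applications.items():
    print(name, {g: round(rate(*dept[g]), 2) for g in dept})
print("pooled", {g: round(pooled_rate(g), 2) for g in ("men", "women")})
```

With these made-up numbers women are admitted at the higher rate in each department (0.65 against 0.60, and 0.15 against 0.10), yet the pooled rate is 0.50 for men and 0.25 for women, because most women applied where openings were scarce.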

Now you may say that in simulating space flight we also lumped data together, and sometimes treated the whole vehicle as a point mass. Yes, we did, but we knew the laws of mechanics and knew when this was legitimate and when it was not. For example, during a midcourse correction we orient the vehicle exactly in the required direction and then fire the correcting rocket engines, and during that time the crew is forbidden to move about in the vehicle, since moving could make it rotate and spoil the accuracy of the correction. We believed we knew enough theory, and had enough years of experience, to be sure that lumping all the details of the mass into a single point gave reliable simulation results.

However, in many of the areas where simulation is proposed there are neither known experimental data nor a theory. So if I were asked to do an ecological simulation, I would quietly ask for the mathematically formulated rules for every possible interaction, for example tree growth as a function of rainfall, for the constants to be used, and for where I could get real data to compare with the results of test simulations. The customers would soon decide to look for someone more accommodating, willing to run a highly dubious simulation that gives the results they want and that is convenient for promoting their own ideas. I advise you to keep your integrity and not let yourself be used for other people's propaganda; be careful when you agree to do a simulation!

If the humanities are hard to simulate with much reliability, there is a further difficulty: people, knowing about the simulation, can change their behavior and thereby invalidate it. In the insurance business, the company bets that you will live a long time and you bet that you will die early. With pensions and annuities, by the way, the sides are reversed, in case you had not thought about it. In principle you can try to beat the insurance company by committing suicide, but this is rare, and insurance companies guard carefully against it.

In the stock market, if there were a widely known strategy for making money, the very knowledge of it would ruin the strategy! People would change their behavior and so defeat the predictions you had made. It is not that no legal winning strategy can exist (although I am sure it would have to be a decidedly nonlinear theory to yield more than the ordinary growth of the market), but it would have to be kept very secret. The main problem is dishonesty in the stock market. Insiders have knowledge that, under well-defined laws, they are not entitled to use, yet they use it all the time! If you do not use insider information you have little chance against those who do, and if you act on insider information you are acting illegally! It is a bad business either way, and insiders resist every attempt to automate trading by machine, since that would eliminate the insider deals from which they now profit. Everyone knows they do it, but it cannot be proved in court! Moreover, false “insider information” is constantly being spread among outsiders, so that they think of themselves as insiders and act to the advantage of those who started the rumors.

So beware of any simulation of a situation in which a person can use the output to change his behavior in his own interest, for he will do so whenever he can.

But all is not lost. We have developed the scenario method for coping with many difficult situations. In this method we do not try to predict what will actually happen; we merely give a range of possible outcomes. This is exactly what Spock did in his book on child care. From observations of many children in the past, he assumed that the behavior of children in the (near) future would not differ radically from those observations, and he predicted not what your particular child would do but only typical patterns with ranges of behavior: when babies start to crawl, to talk, to say no to everything, etc. Spock predicted mainly biological behavior and avoided, as far as he could, predicting the child's cultural behavior.

In some kinds of simulation the scenario method is the best we can do. It is what I am doing in these chapters: the future I predict cannot be known in detail, only as scenarios of what, in my opinion, is likely to happen. More on this in the next chapter.


I have not yet mentioned something that at first seems trivial: how accurately are our ideas and our notes on paper describing the problem transferred to the machine? As you know, programming errors are all too common.

Let me tell you a story that shows how much can be done here. The chemistry department was once considering a contract with the federal government on the chemistry of the upper atmosphere immediately after the explosion of an atomic bomb. I was asked only for advice and recommendations. Looking into the problem, I found that each particular case to be computed involved solving something like 100 ordinary differential equations, depending on the specific chemical reactions they expected.


In hindsight it seems obvious, but at the time it came as a surprise to them, and in the end it was worth the effort. All they had to do was select from the file the punched cards they needed for the particular case they were going to run, and the machine did everything else automatically, including choosing the integration step size. My main idea, besides simplicity and accuracy, was to keep their minds focused on what they did best, chemistry, and not distract them with the machine, where they were not experts. Moreover, it was they who were in control of the actual computations. I made it easier for them to program and use the computer, but I refused to relieve them of making the decisions.
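The idea of letting the machine assemble the equations from a selected set of reaction records can be sketched as follows. The text does not say how the cards were encoded or which rate law was used, so everything here is an assumption: each "card" is a hypothetical tuple `(rate_constant, reactants, products)`, the rates follow simple mass-action kinetics, and a fixed-step Euler update stands in for the automatic choice of integration step.

```python
def make_rhs(reactions):
    """Build the right-hand side d(conc)/dt from reaction 'cards'.
    Each card is (rate_constant, reactants, products); mass-action
    kinetics is an assumption made for this sketch."""
    def rhs(conc):
        change = {name: 0.0 for name in conc}
        for k, reactants, products in reactions:
            rate = k
            for name in reactants:
                rate *= conc[name]
            for name in reactants:
                change[name] -= rate
            for name in products:
                change[name] += rate
        return change
    return rhs

def euler_step(rhs, conc, dt):
    """One fixed-size Euler step (no automatic step-size control)."""
    change = rhs(conc)
    return {name: conc[name] + dt * change[name] for name in conc}

# A single hypothetical card: A + B -> C with rate constant 2.0.
cards = [(2.0, ("A", "B"), ("C",))]
rhs = make_rhs(cards)
start = {"A": 1.0, "B": 0.5, "C": 0.0}
after = euler_step(rhs, start, dt=0.01)
print(after)
```

Because every term of every equation is generated from the same records, a missing term or a wrong sign can only come from a wrong card, which is much easier to check than a hundred hand-written equations.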

To summarize: the reliability of simulation, which is becoming ever more common, is of vital importance. Reliability is not something you can take for granted, accepting the result without thought just because a powerful computer prints neat tables or displays colorful charts. You are responsible for your decisions, and you cannot shift the responsibility onto those who did the simulation, much as you might like to. Reliability is a central question that has no simple answers.

Let us return to the relationship between analog and digital computers. The question still comes up from time to time in connection with neural networks. It is claimed that analog machines can compute things that digital ones cannot. We need to look at this closely; it is really the same question that was asked years ago, when analog computers were being replaced by digital ones. In these chapters we have acquired the knowledge needed to analyze it thoroughly.

The basic fact is that the Nyquist sampling theorem (known in the post-Soviet world as the Kotelnikov theorem) says that to reproduce the original continuous signal (up to roundoff) over its whole frequency range, it is sufficient to have equally spaced samples taken at a rate of at least twice the highest frequency present in the signal. In practice most signals have a fairly sharp cutoff at the upper frequency; without a cutoff, the signal energy would be infinite!

In practice we usually have only a limited number of samples in a digital solution, and so we need something like twice the Nyquist rate. In addition, we usually have samples on only one side, which costs another factor of two. Thus something like seven to ten samples per cycle of the highest frequency are needed. There is still a little aliasing of the higher frequencies into the band being processed (but these rarely carry information in the signal; usually they are noise). All this can be verified theoretically and experimentally.
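What goes wrong below the sampling limit can be seen in a short experiment. This sketch uses invented frequencies (a 9 Hz sine sampled at 10 samples per second) and hypothetical helper names; it shows that the undersampled signal produces exactly the same samples as a sign-flipped 1 Hz sine, so the two cannot be told apart after sampling.

```python
import math

def sample_sine(freq_hz, rate_hz, count):
    """Return `count` samples of sin(2*pi*freq*t) taken `rate_hz` times per second."""
    return [math.sin(2.0 * math.pi * freq_hz * n / rate_hz) for n in range(count)]

def max_difference(a, b):
    """Largest absolute difference between two equally long sample lists."""
    return max(abs(x - y) for x, y in zip(a, b))

rate = 10.0                        # samples per second, so the limit is 5 Hz
fast = sample_sine(9.0, rate, 50)  # 9 Hz lies above the limit: undersampled
slow = sample_sine(1.0, rate, 50)  # 1 Hz lies well below the limit

# At this rate the 9 Hz samples coincide with a sign-flipped 1 Hz sine.
alias_error = max_difference(fast, [-s for s in slow])
print("largest mismatch between 9 Hz samples and the 1 Hz alias:", alias_error)
```

Raising the rate to 40 samples per second, comfortably more than twice 9 Hz, makes the two sample sequences clearly different again.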


Over time it became possible to widen the bandwidth of analog computers, but this served mainly to speed up the computation, not to improve accuracy. In any case, the fundamental limits on the accuracy of analog components limit the accuracy analog computers can achieve. Old mechanical machines such as the RDA #2 took about half an hour per solution; the electrical ones, descended from gun directors, still had some mechanical parts but took only minutes; then came fully electronic machines that took seconds; and now some can display the solution on a screen as fast as you can enter the data.

Despite their relatively low accuracy, analog computers still have some value, especially when you can build part of the device under study into the circuit, so that you do not have to find a proper mathematical description of it for a numerical simulation. Some of the fastest analog computers respond to changes in parameters, both in the initial conditions and in the equations, so that you can see the effect of a change directly on the screen. You can thus get a “feel” for the problem more easily than with digital machines, which usually take longer to produce a solution and require a complete mathematical description. Analog machines are largely ignored these days, so I consider it my duty to remind you of their place in the arsenal of tools of the scientist and engineer.

To be continued...


By the way, we have also started translating another great book: “The Dream Machine: The History of Computer Revolution”.

Book content and translated chapters

Foreword

Intro to The Art of Doing Science and Engineering: Learning to Learn (March 28, 1995) Chapter 1
Foundations of the Digital (Discrete) Revolution (March 30, 1995) Chapter 2. Basics of the digital (discrete) revolution
History of Computers - Hardware (March 31, 1995) Chapter 3. History of Computers - Hardware
History of Computers - Software (April 4, 1995) Chapter 4. Computer History - Software
History of Computers - Applications (April 6, 1995) Chapter 5. History of Computers - Applications
Artificial Intelligence - Part I (April 7, 1995) Chapter 6. Artificial Intelligence - 1
Artificial Intelligence - Part II (April 11, 1995)
Artificial Intelligence III (April 13, 1995) Chapter 8. Artificial Intelligence - III
N-Dimensional Space (April 14, 1995) Chapter 9. N-Dimensional Space
Coding Theory - The Representation of Information, Part I (April 18, 1995)
Coding Theory - The Representation of Information, Part II (April 20, 1995)
Error-Correcting Codes (April 21, 1995)
Information Theory (April 25, 1995)
Digital Filters, Part I (April 27, 1995) Chapter 14. Digital Filters - 1
Digital Filters, Part II (April 28, 1995) Chapter 15. Digital Filters - 2
Digital Filters, Part III (May 2, 1995) Chapter 16. Digital Filters - 3
Digital Filters, Part IV (May 4, 1995)
Simulation, Part I (May 5, 1995) (in work)
Simulation, Part II (May 9, 1995) Chapter 19. Modeling - II
Simulation, Part III (May 11, 1995)
Fiber Optics (May 12, 1995) Chapter 21. Fiber Optics
Computer Aided Instruction (May 16, 1995)
Mathematics (May 18, 1995) Chapter 23. Mathematics
Quantum Mechanics (May 19, 1995) Chapter 24. Quantum Mechanics
Creativity (May 23, 1995) Chapter 25. Creativity
Experts (May 25, 1995) Chapter 26. Experts
Unreliable Data (May 26, 1995) Chapter 27. Unreliable Data
Systems Engineering (May 30, 1995) Chapter 28. System Engineering
You Get What You Measure (June 1, 1995) Chapter 29. You Get What You Measure
How Do We Know What We Know (June 2, 1995)
Hamming, “You and Your Research” (June 6, 1995). Translation: You and Your Work


Source: https://habr.com/ru/post/346564/
