
Report from a breakfast with Charles Wetherell, author of the cult book "Etudes for Programmers"

Breakfast with Charles Wetherell, author of the cult book "Etudes for Programmers", lasted four hours. In the end the waitress asked us to leave the restaurant in Palo Alto, saying that a long queue had formed and that we had been sitting there since eight in the morning. During this time we discussed a lot of interesting things: Charles's work at the Livermore laboratory and at Oracle, object-oriented and functional programming, compilers and hardware description languages, backdoors in processors, the inefficiency of neural networks and the undeservedly forgotten Prolog, Charles's visit to Russia, text processing with a state machine in a hardware coprocessor, and schoolchildren creating video games on FPGAs.



The content of four hours with Charles Wetherell would be enough for fifty articles on Habré, so I will mainly list the topics, and then give some details about three of them:

  1. Object-oriented and functional programming. Single assignment, functions as values, getting rid of mutation and of the notion of time.
  2. Data structures and compiler algorithms. Muchnick's book on SSA and optimizations. Bob Morgan's (Compass) "Building an Optimizing Compiler". Vectorizing compilers and Randy Allen (my colleague at Wave and Charles's colleague from other companies).
  3. The evolution of the Yacc parser generator, the internal representation of the Ada language (DIANA), and the VHDL frontend in Synopsys.
  4. Attribute grammars and their unfortunate (in my opinion) use in the MIPT training manual on the theory of programming language implementation (TRNP).
  5. The JOVIAL programming language and Ada standardization. The IDL language.
  6. Programming for physicists and chemists at the Livermore laboratory on the CDC 7600 and Cray-1. Livermore Fortran, an extension of Fortran-77 with structures and dynamic memory allocation. The use of microfiche, including for automated search and for producing animations. Harry Nelson. And how the Rubik's Cube reached the laboratory before it became widely known.
  7. The Soviet Cray-1 clone, Elektronika SS BIS. Its Fortran compiler at IPM and the C compiler we worked on at MIPT.
  8. Reverse engineering of the random number generator in Synopsys VCS. A congruential generator combined with a shift register. LFSR.
  9. The inefficiency of neural networks and the undeservedly forgotten Prolog language.
  10. Applying Prolog techniques to static analysis of program text.
  11. This includes analyzing the code of a processor written in Verilog or VHDL in order to find backdoors in it, a backdoor being scattered across parts of the register-transfer-level description of the processor. Finding "extra" code that does something outside the specification: for example, a finite state machine that waits for a key phrase in the programmer-visible registers and then puts the processor into privileged mode (see the sketch after this list).
  12. Hybrid methods of code analysis: dynamic execution followed by static exploration of the state space from a certain execution point.
  13. The HAKMEM list from MIT.
  14. Most programmers use only five algorithms in their whole life: quicksort, binary search, hashing, insertion into a list, and something else (insertion into an AVL binary tree?).
  15. The history of Unix troff at Bell Labs.
  16. Charles Wetherell's work on SQL at Oracle.
  17. A good example of using a hardware coprocessor with MIPS CorExtend / UDI (User Defined Instructions): adding instructions to the processor for fast lexical analysis, with a state machine inside the coprocessor and state saved between individual instructions. The background from the days of the IBM System/360 translate-and-test instruction and the CDC STAR.
  18. Using a hardware coprocessor to pre-clean the data stream before applying machine learning algorithms to it.
  19. The game Rogue; Scientific American in the States and in the USSR.
  20. The Summer School for Young Programmers in Novosibirsk and its mosquitoes (from my recollections and from the stories of Charles Wetherell's colleagues).
  21. How Charles spent 36 hours in Moscow and two weeks in St. Petersburg. The Hermitage. In St. Petersburg he did not give any lectures.
  22. I suggested that Charles come to a summer school at MIET / Zelenograd in July, or somewhere else in the fall (MSU? MIPT? ITMO?).
  23. Educating schoolchildren and junior students. The need to break out of a habitual template (for example, sequential programming), and studying Verilog on FPGAs as one way to break out of such a template.
  24. Using small-scale-integration chips before FPGA exercises, so that the schoolchild or student intuitively understands that Verilog code is a description of an electronic circuit, not a program (a chain of instructions).
  25. An example of RTL on an FPGA for the summer school at MIET / Zelenograd in July: a self-learning state machine that detects the opponent's tendencies in the game "rock, paper, scissors".
  26. Another example: a competition between finite automata ("animals") that move a player toward a target on a map (a globe). Objects on the map have a "smell", positive (food) or negative (electricity that can shock). Building the map on the FPGA, and displaying the map and the player's sprite over VGA using a scan generation module.

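To make item 11 concrete, here is a minimal sketch of the kind of "extra" RTL code that such analysis hunts for. This is my own illustration, not code from any real processor, and all module and signal names are hypothetical.

```verilog
// Illustrative sketch only: a hardware backdoor of the kind discussed above.
// The FSM watches bytes written to a programmer-visible register and, after
// seeing the hard-coded key sequence "OPEN", asserts a signal that forces
// the core into privileged mode.
module trojan_fsm (
    input            clk,
    input            rst_n,
    input            wr_en,       // write strobe to the watched register
    input      [7:0] wr_byte,     // byte being written
    output reg       force_priv   // backdoor output: force privileged mode
);
    localparam [7:0] KEY0 = "O", KEY1 = "P", KEY2 = "E", KEY3 = "N";

    reg [1:0] state;

    always @(posedge clk or negedge rst_n)
        if (!rst_n) begin
            state      <= 2'd0;
            force_priv <= 1'b0;
        end else if (wr_en) begin
            case (state)
                2'd0: state <= (wr_byte == KEY0) ? 2'd1 : 2'd0;
                2'd1: state <= (wr_byte == KEY1) ? 2'd2 : 2'd0;
                2'd2: state <= (wr_byte == KEY2) ? 2'd3 : 2'd0;
                2'd3: if (wr_byte == KEY3) force_priv <= 1'b1;
                      else state <= 2'd0;
            endcase
        end
endmodule
```

The point of the Prolog-style static analysis from items 10 and 11 is to find exactly this kind of reachable behavior that is absent from the specification, even when its logic is scattered across different parts of the processor description.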
Here we sorted through the recent arguments on Habré about OOP. Charles advocates both OOP and functional programming, each where it is applicable. I showed Charles an example of unsuccessful class design that I had seen in two projects for representing the nodes of a parse tree and the optimizations over this tree, after which Charles said that tree transformation algorithms should of course not be scattered across small classes in this way; instead, one should quickly build a control flow graph and run table-driven transformations over it based on static single assignment form, with a small number of exceptions. Charles enlightened me about the work of Muchnick, Bob Morgan, and Randy Allen on vectorization.



Then I told Charles that the day after tomorrow we would be holding a seminar in Las Vegas at the electronic design automation conference, and that I needed his advice on a good example for the CorExtend / UDI (User Defined Instructions) coprocessor interface. This protocol is used in MIPS cores. CorExtend / UDI allows a block to be embedded into the processor that decodes and executes instructions additional to the main instruction set, instructions that the designer of a system on a chip can define. The block can be synthesized and become part of the chip, or be configured in an FPGA.

The additional instructions move along the processor pipeline together with the main ones. They receive data from the programmer-visible general-purpose registers and can return a result to a register. These instructions can also keep state inside the coprocessor. They can be killed if an exception occurs, for example, in an instruction next to them in the pipeline.
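As a sketch of the idea, here is what such a block might look like at the register transfer level. This is not the real CorExtend signal-level protocol, just a simplified illustration under my own assumptions about port names and semantics.

```verilog
// Simplified sketch of a user-defined-instruction (UDI) coprocessor block.
// NOT the real CorExtend protocol; all port names are hypothetical. A custom
// instruction arrives with its GPR operands, may return a result to a GPR,
// may update internal coprocessor state, and is squashed if killed.
module udi_block (
    input         clk,
    input         rst_n,
    input         udi_valid,   // a custom instruction is in the execute stage
    input  [3:0]  udi_op,      // minor opcode selecting the operation
    input  [31:0] rs_data,     // operand from GPR rs
    input  [31:0] rt_data,     // operand from GPR rt
    input         udi_kill,    // pipeline kill, e.g. an exception nearby
    output [31:0] rd_data,     // result written back to GPR rd
    output        rd_we        // result valid this cycle
);
    // Internal state of the coprocessor, preserved between instructions.
    reg [31:0] acc;

    assign rd_data = acc + rs_data;          // example operation: accumulate
    assign rd_we   = udi_valid && !udi_kill;

    always @(posedge clk or negedge rst_n)
        if (!rst_n)
            acc <= 32'd0;
        else if (udi_valid && !udi_kill && udi_op == 4'd1)
            acc <= acc + rs_data + rt_data;  // state commits only when
                                             // the instruction is not killed
endmodule
```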



In my presentation at the seminar the day after tomorrow I was going to use an example with a simple convolution instruction for a neural network. But the speedup achieved with it is not impressive, only a factor of two. Could Charles come up with a better example?

Charles immediately came up with a much better example: hardware lexical analysis. You can put a finite state machine in the coprocessor that recognizes numbers, identifiers, and comments in a stream of text. It keeps its state between the individual instructions that feed text from registers into the machine. The result of the current analysis (the tagged text) is likewise returned in a register.
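Here is a minimal sketch of such a machine as I understood the idea; this is my reconstruction, not Charles's design, with the character classes cut down to the bare minimum.

```verilog
// Sketch of a lexer state machine of the kind Charles described: it consumes
// one character at a time (fed from GPRs by custom instructions), keeps the
// current lexer state between instructions, and reports the token class.
// A reconstruction of the idea; names and classes are simplified.
module lexer_fsm (
    input            clk,
    input            rst_n,
    input            ch_valid,     // a new character is presented
    input      [7:0] ch,           // the character itself
    output reg [1:0] token_class   // 0: none, 1: number, 2: identifier, 3: comment
);
    localparam IDLE = 2'd0, IN_NUM = 2'd1, IN_ID = 2'd2, IN_COMMENT = 2'd3;

    reg [1:0] state;

    wire is_digit  = (ch >= "0") && (ch <= "9");
    wire is_letter = ((ch >= "a") && (ch <= "z")) || ((ch >= "A") && (ch <= "Z"));

    always @(posedge clk or negedge rst_n)
        if (!rst_n)
            state <= IDLE;
        else if (ch_valid)
            case (state)
                IDLE:       state <= is_digit    ? IN_NUM :
                                     is_letter   ? IN_ID :
                                     (ch == "#") ? IN_COMMENT : IDLE;
                IN_NUM:     state <= is_digit ? IN_NUM : IDLE;
                IN_ID:      state <= (is_letter || is_digit) ? IN_ID : IDLE;
                IN_COMMENT: state <= (ch == "\n") ? IDLE : IN_COMMENT;
            endcase

    always @(*) token_class = state;  // current class, returned in a GPR
endmodule
```

Each custom instruction would push characters from a general-purpose register into `ch` and read `token_class` back into a destination register; the `state` register is exactly the state that survives between individual instructions.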

Charles also told me the history of text-parsing instructions from the days of the IBM System/360 translate-and-test instruction and the CDC STAR. He added that such a coprocessor can also be used for machine learning: for pre-cleaning a data stream before applying machine learning algorithms to it.



Then I told Charles the saga of how a group of engineers and teachers translated the textbook "Digital Design and Computer Architecture" by David Harris and Sarah Harris and introduced it in various Russian universities (see posts on Habré 1, 2, 3). Now, through the joint efforts of teachers from MIET, RUSNANO, MEPhI, and other universities, we are planning a summer school at MIET where advanced schoolchildren will design video games on FPGAs with graphical output (the section "Between physics and programming"). For this purpose, ideas from the book "Designing Video Game Hardware in Verilog" by Steven Hugg (December 15, 2018) are being used.



Games can be developed either as purely hardware finite automata, or as a combination of hardware graphics on the FPGA with programs running on the simple schoolMIPS processor core, which is described in Stanislav Zhelnio's posts on Habré and in the schoolMIPS wiki on GitHub. On an FPGA you can quite easily generate the scan timing for VGA, display a map from memory, and move sprites with figures.
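For example, the scan generation for the standard 640x480 @ 60 Hz VGA mode with a 25 MHz pixel clock fits in a dozen lines. Here is a minimal sketch with my own module and signal names; the timing constants are the standard ones for this mode.

```verilog
// Minimal VGA scan generator sketch for 640x480 @ 60 Hz with a 25 MHz pixel
// clock, using the standard timings: 800 clocks per line (640 visible +
// 16 front porch + 96 sync + 48 back porch) and 525 lines per frame
// (480 + 10 + 2 + 33). Both sync signals have negative polarity.
module vga_sync (
    input            clk,       // 25 MHz pixel clock
    input            rst_n,
    output           hsync,
    output           vsync,
    output           visible,   // inside the 640x480 visible area
    output reg [9:0] x,         // current pixel column
    output reg [9:0] y          // current pixel row
);
    always @(posedge clk or negedge rst_n)
        if (!rst_n) begin
            x <= 10'd0;
            y <= 10'd0;
        end else if (x == 10'd799) begin
            x <= 10'd0;
            y <= (y == 10'd524) ? 10'd0 : y + 10'd1;
        end else
            x <= x + 10'd1;

    assign hsync   = ~((x >= 10'd656) && (x < 10'd752));  // 640+16 .. +96
    assign vsync   = ~((y >= 10'd490) && (y < 10'd492));  // 480+10 .. +2
    assign visible = (x < 10'd640) && (y < 10'd480);
endmodule
```

The `x` and `y` coordinates are then compared against sprite positions and used as an index into the map memory to pick the color of the current pixel.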



In addition to games with tanks and races, Charles proposed making a competition between finite automata ("animals") that move a player toward a target on a map (a globe). Objects on the map have a "smell", positive (food) or negative (electricity that can shock). Schoolchildren can write finite automata that sense their surroundings, embed them in code that renders the graphics and maintains the map, and then compete over whose automaton does it better.
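A toy sketch of one such "animal", under assumed encodings (signed 8-bit smell values; directions 0 = north, 1 = east, 2 = south, 3 = west): every game turn it steps toward the neighboring cell with the most attractive smell.

```verilog
// Toy sketch of one "animal" automaton for the competition described above:
// each game turn it looks at the signed "smell" of the four neighboring map
// cells (food > 0, electricity < 0) and steps toward the best one.
// All encodings here are my own assumptions.
module animal_fsm (
    input              clk,
    input              rst_n,
    input              turn,       // one game turn
    input signed [7:0] smell_n,    // smell of the cell to the north
    input signed [7:0] smell_e,
    input signed [7:0] smell_s,
    input signed [7:0] smell_w,
    output reg   [1:0] dir         // chosen direction of the next step
);
    // Pick the neighbor with the maximum smell.
    wire signed [7:0] best_ns  = (smell_n >= smell_s) ? smell_n : smell_s;
    wire        [1:0] dir_ns   = (smell_n >= smell_s) ? 2'd0 : 2'd2;
    wire signed [7:0] best_ew  = (smell_e >= smell_w) ? smell_e : smell_w;
    wire        [1:0] dir_ew   = (smell_e >= smell_w) ? 2'd1 : 2'd3;
    wire        [1:0] best_dir = (best_ns >= best_ew) ? dir_ns : dir_ew;

    always @(posedge clk or negedge rst_n)
        if (!rst_n)    dir <= 2'd0;
        else if (turn) dir <= best_dir;
endmodule
```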



You can use LFSR (linear feedback shift register) hardware blocks to generate elements of pseudo-random behavior.
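For example, a 16-bit maximal-length LFSR is a classic textbook construction that fits in a few lines of Verilog; with taps at bits 16, 14, 13, and 11 it cycles through 65535 states before repeating.

```verilog
// 16-bit maximal-length Fibonacci LFSR (taps 16, 14, 13, 11).
// The seed must be nonzero, otherwise the register stays stuck at zero.
module lfsr16 (
    input             clk,
    input             rst_n,
    output reg [15:0] value
);
    wire feedback = value[15] ^ value[13] ^ value[12] ^ value[10];

    always @(posedge clk or negedge rst_n)
        if (!rst_n) value <= 16'hACE1;          // arbitrary nonzero seed
        else        value <= {value[14:0], feedback};
endmodule
```

The low bits of `value` can serve, say, as pseudo-random moves for the rock-paper-scissors machine or for the "animals" on the map.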



At the end, Charles left two autographs: one for Russian readers (I borrowed the Russian edition of the book from Sergei Vakulenko) and one for readers at our company Wave Computing, from whose internal library I took the English original.



Source: https://habr.com/ru/post/454512/

