Interview with C# legend Eric Lippert

The material is taken from the DotNetCurry magazine dedicated to technologies based on the .NET platform.

Dear readers, we are very happy to have Eric Lippert in this issue of DNC Magazine. Eric needs no introduction to people familiar with C#, but for everyone else: Eric is known for his work on the C# compiler team. He spent a significant part of his career at Microsoft, working in various positions. Before joining Microsoft, Eric worked for Watcom. Our "old-timers" remember Watcom as a company that created very good compilers for C++ and Fortran. Eric currently works for Coverity, helping to build static code analysis products.

DNC: Hello Eric, we are very glad to see you here with us.
EL: Thank you. I'm glad to be here.

DNC: You worked at Microsoft for 16 long years. Describe your journey (if you can call it that) from an internship to working on VBScript, JScript, and VSTO (Visual Studio Tools for Office), and then becoming a principal developer on the C# compiler team.

EL: I grew up in Waterloo and was interested in science and mathematics from an early age, so it was natural for me to go to the University of Waterloo (UW). In addition, I had relatives on staff, I already knew a number of professors, and, as you said, as a student I worked for Watcom, which was a UW spin-off. UW has an excellent co-operative education program, one of the largest in the world, through which I managed to get three internships at Microsoft on the Visual Basic development team. They were kind enough to extend a job offer when I finished, and I stayed in the developer tools division throughout my career at Microsoft.

DNC: Before you started your internship at Microsoft, it's fair to assume that in the early years you received a lot of good advice from senior engineers. Which of these tips was the best programming tip you have ever received?

EL: I received a lot of good advice from senior engineers throughout my career, not just at the beginning; Microsoft encourages both formal and informal mentoring. I recently talked about the best career advice I received at Microsoft: in essence, become a subject matter expert by answering as many user questions as you can. But saying which piece of programming advice was the best is not so simple. I learned so much at Microsoft from world-class experts in programming language design, performance analysis, and many other fields that it is hard for me to name one thing.

Here is one thing I remember from before I joined Microsoft. One day, many years ago, Brian Kernighan gave a talk on programming at UW. One slide showed some code in which something was wrong. It was wrong because the comment and the code itself did not match each other. Kernighan asked the question: which one is actually right, the code or the comment? I still ask myself this rhetorical question when I try to understand code that contains a bug. It often happens that comments are misleading, because they are outdated or were never correct in the first place; at the same time, it often happens that the comments are correct, and you don't even have to dig into the buggy code. Kernighan's talk completely changed my attitude to commenting code. Since then, I have tried to write comments that explain the purpose of a piece of code rather than how it works.

DNC: When your team started developing C#, what were your main goals? Are you happy with what C# has become?

EL: To be honest, I started working on C# when the basic concepts of C# 3.0 were already well under development. I have been following C# since it appeared more than 10 years ago, but I was not part of the C# 1.0 or C# 2.0 teams.

When we talk about goals, I try to distinguish “business” goals from “technical” ones; they are closely related, but different. From a business point of view, the main goal of C# was and is to create a rich language that lets you get all the benefits of the .NET platform and, more broadly, advances the state of the art of the Windows ecosystem as a whole. Better tools make developers more productive, more productive developers create better applications for their users, better applications make the platform more attractive, and everyone wins.

In terms of language design, there are a number of basic principles to which the designers return again and again. The language should be a modern, practical, general-purpose language used by professional programmers who develop software.

C# 1.0 started out as a fairly simple, modern programming language. It was obviously influenced by C and C++; the language design team sought to smooth out some of the shortcomings of C/C++ while still allowing access to unsafe code. But over the following 10 years the language grew, adding features such as generic types, sequence generators (iterator blocks), functional closures, query expressions, interoperability with dynamic languages, and, most recently, significantly improved support for writing asynchronous code. I am excited about how the language has evolved over the past 12 years, and it is an honor for me to have been part of some of its most exciting changes. I look forward to continuing to work in the C# ecosystem.

DNC: As a member of the C# development team, how much did you interact with the Windows OS team? To what extent is the operating system team involved when certain language features are developed? Do you largely work in isolation, or is it more of a joint effort?

EL: It is different every time. In the distant past we mostly worked on our own; during the development of C# 5.0, the Windows team was heavily involved. I personally did not communicate much with the Windows team during the work on C# 5.0, but the C# program management team was in almost constant contact with their counterparts on the Windows team throughout the work on the Windows Runtime (WinRT). Several technical issues required special care to ensure that there was as little mismatch as possible between what C# developers expect and the WinRT programming model. In particular, it was important that “async/await” met the needs of WinRT developers using C#.

However, this is a relatively recent development. Historically, C# did not interact directly with the Windows team, since the language runs on the managed CLR and uses the BCL class library to access operating system functionality. Because the C# team knew that the CLR and BCL teams would act as an "intermediary" to the operating system services, it was able to concentrate on designing a language that uses the full power of the CLR and BCL, and to let those teams deal with the operating system.

DNC: We heard your podcast with the StackExchange team, in which you mentioned the things at the top of your list of “if I had a genie to fix C#...”. It was about unsafe array covariance. Could you tell our readers about this?

EL: Of course. First, let's define what the term “covariance” means. A proper definition would require touching on category theory, but we don't have to go that far to understand the meaning of the term. The idea of covariance, as the name implies, is that a relationship "varies together" with a transformation: a statement that is true of the original things remains true after a certain transformation is applied to them.

In C#, there is the following rule: if T and U are reference types and T is convertible to U by a reference conversion, then T[] is convertible to U[] by a reference conversion as well. This rule is said to be covariant, because from the statement “T is convertible to U” you can derive the statement “T[] is convertible to U[]”, and the truth of the statement is preserved. This does not hold for everything; for example, you cannot conclude that List<T> is convertible to List<U> merely because T is convertible to U.

Unfortunately, array covariance weakens the type safety of the language. A language is type-safe when the compiler catches errors such as assigning an integer to a variable of type string; that is, the program will not compile until all type mismatch errors are corrected. Array covariance is an example of a situation in which a type mismatch cannot be caught at compile time and can only be checked at run time.

    static void M(Animal[] animals)
    {
        animals[0] = new Turtle();
    }

    static void N(Giraffe[] giraffes)
    {
        M(giraffes);
    }

Since array conversion is covariant, an array of giraffes can be converted to an array of animals. And since a turtle is an animal, we can put it into an array of animals. But this array is actually an array of giraffes. This type mismatch results in an exception being thrown at run time.
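The failure described above can be reproduced with a small, self-contained sketch. The class definitions and the `StoreThrows` helper are assumptions added for illustration (the interview only shows `M` and `N`); the exception type is the one the runtime raises for a mismatched array store.

```csharp
using System;

// Hypothetical class hierarchy; the interview does not show these definitions.
class Animal { }
class Giraffe : Animal { }
class Turtle : Animal { }

static class ArrayCovarianceDemo
{
    // Compiles fine: a Turtle is an Animal, so the store looks legal statically.
    static void M(Animal[] animals)
    {
        animals[0] = new Turtle();
    }

    // Returns true if storing a Turtle into a Giraffe[] fails at run time.
    public static bool StoreThrows()
    {
        Giraffe[] giraffes = { new Giraffe() };
        try
        {
            M(giraffes); // allowed at compile time by array covariance
            return false;
        }
        catch (ArrayTypeMismatchException)
        {
            return true; // the runtime check on the element store rejected the Turtle
        }
    }

    public static void Main()
    {
        Console.WriteLine(StoreThrows()); // prints "True"
    }
}
```

Note that the compiler accepts every line here; the error surfaces only when the store actually executes.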

Array covariance has two negative consequences. First, an assignment to a variable should always be checkable at compile time, but in this case that is not possible. Second, it means that every time you store a non-null reference into an array whose element type is an unsealed reference type, the runtime must check that the actual element type of the array is compatible with the type of the reference being stored. This check takes time! To make array covariance work, correct programs have to run more slowly every time an array element is assigned.

The C# team added type-safe covariance in C# 4.0. If you are interested in how it was done, I wrote a long series of articles about the design of this feature; you can read them here: blogs.msdn.com/b/ericlippert/archive/tags/covariance+and+contravariance

If I had a genie who could fix any code, I would remove unsafe array covariance completely, and fix all existing code to use the type-safe covariance added in C# 4.0.
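For comparison, here is a minimal sketch of the type-safe covariance available since C# 4.0, using the covariant `IEnumerable<out T>` interface from the BCL. The `Animal` and `Giraffe` classes and the `CountAnimals` helper are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical class hierarchy for the example.
class Animal { }
class Giraffe : Animal { }

static class SafeCovarianceDemo
{
    // Accepts any sequence of animals; it can only read from the sequence.
    public static int CountAnimals(IEnumerable<Animal> animals) => animals.Count();

    public static void Main()
    {
        List<Giraffe> giraffes = new List<Giraffe> { new Giraffe(), new Giraffe() };

        // Allowed since C# 4.0: IEnumerable<out T> is covariant in T. The interface
        // only produces T values, so nothing can be stored through this reference.
        IEnumerable<Animal> animals = giraffes;

        // By contrast, this would not compile, because List<T> is not covariant:
        // List<Animal> list = giraffes;

        Console.WriteLine(CountAnimals(animals)); // prints 2
    }
}
```

Because the covariant interface exposes no way to write a `Turtle` into the underlying list, no run-time type check is needed.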

DNC: Before we get to your current job, tell us a little about static code analysis.

EL: Static analysis is the analysis of a program based on its source code alone. It differs from dynamic analysis, which analyzes the program while it runs. Compilers perform static analysis, while profilers perform dynamic analysis. Compilers use static analysis for three things: first, to determine whether a program is a valid program and, if it is not, to produce appropriate error messages. Second, to translate a valid program into some other language, usually bytecode or machine language, although it can also be another high-level language. And third, to identify constructs that are valid but whose use is questionable, and to issue appropriate warnings.

At Coverity, we mostly deal with the third kind of static analysis. We assume that the code is already syntactically valid, since the compiler verifies that, and we perform a much deeper analysis to identify questionable constructs and bring them to your attention. The sooner you find a bug, the cheaper it is to fix.

There are other things you can do with static analysis; for example, Coverity also makes a product that uses static analysis to find code changes that are not covered by corresponding unit tests.

DNC: How does your deep C# knowledge benefit Coverity?

EL: Mostly in two ways. First, C# is a huge language; its specification runs to about 800 pages. Ordinary developers certainly do not need to know the whole language in detail to use it effectively, but compiler writers certainly do. I spent about 15 thousand hours carefully studying the design and implementation of the C# compiler, and learning from language designers such as Anders Hejlsberg, Neal Gafter, and Erik Meijer, so I have a fairly solid understanding of what a good C# static analyzer should do. Second, I have seen thousands of lines of C# code containing bugs. I know what kinds of mistakes C# developers make, which helps us identify the places where we need to focus our static analysis efforts.

DNC: Since you started working for Coverity, have you ever run into a situation where you thought, “hmm, this would help make C# more statically provable”?

EL: Some language features complicate static analysis but at the same time make the language more powerful. For example, virtual methods complicate static analysis, because the whole point of virtual methods is that the method actually called is chosen based on the actual type of the object at run time.
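A small sketch can make the point about virtual methods concrete. The `Logger` and `ShoutingLogger` classes below are hypothetical names invented for this illustration.

```csharp
using System;

// Hypothetical types used only to illustrate virtual dispatch.
class Logger
{
    public virtual string Format(string message) => message;
}

class ShoutingLogger : Logger
{
    public override string Format(string message) => message.ToUpperInvariant();
}

static class VirtualDispatchDemo
{
    // Looking only at this method, an analyzer cannot tell which Format body runs:
    // that depends on the run-time type of "logger", which could even be a subclass
    // written long after this code was analyzed.
    public static string Render(Logger logger) => logger.Format("done");

    public static void Main()
    {
        Console.WriteLine(Render(new Logger()));         // prints "done"
        Console.WriteLine(Render(new ShoutingLogger())); // prints "DONE"
    }
}
```

The call site is identical in both cases; only the run-time type of the argument decides which method body executes.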

As I said earlier, C# was designed with the shortcomings of C/C++ in mind. The C# 1.0 designers did a good job: in C# it is rather difficult to cause a buffer overflow, create a memory leak, use a variable before it is initialized, or accidentally use the same name for completely different things. But what was really instructive for me after moving to Coverity was that most of the erroneous patterns that Coverity checks for in C/C++ apply equally well to modern languages such as Java and C#.

DNC: We heard that you created one of the first fan pages for The Lord of the Rings. Tell us more about how that happened, and about your interests in books and movies.

EL: My father read The Hobbit to me when I was very young, and I have been a Tolkien fan ever since; collecting his biographies was my hobby when I was a teenager. When I was studying mathematics at UW in the early 1990s, the World Wide Web was something new. One day I was surfing the internet and found a fan page for the original Star Trek series. I thought it would be a great idea to create something similar for Tolkien, so I searched the entire internet (which did not take long in 1993), found all the FTP sites, newsgroups, and everything else about Tolkien, created a simple web page that was just a collection of links, and put it on the Computer Science Club's Gopher server. As the web grew, more and more people sent me the addresses of their pages. I kept adding links until there were too many of them. I stopped maintaining the page and, eventually, my membership lapsed. I think you can still find it in the Internet Archive.

The thing is that, at that time, companies such as Open Text and Google were starting to index the internet, and their search algorithms took into account things such as how long a page had existed, how often it had changed over time, and how many external links pointed to it. Even after I stopped actively maintaining the page, it continued to rank highly. As a result, for many years my name was the first result when you searched for Tolkien. When the films came out, many people did just that. I ended up being interviewed by several newspapers, received an e-mail from one of Tolkien's grandsons, and once got a call from a fact-checker for the American quiz show Jeopardy! asking “Who are the Ents?”. I had a lot of fun.

These days, however, I read very little fiction and fantasy. Most of my leisure reading is popular science.

I love watching movies and inviting friends over to watch them; our movie-night choices vary widely. One month we watch Oscar-nominated films, another month horror films.

// One way to tell that a language has become really big is when users send in a request for a new feature, and you already have it.

Here are three little features that not many people know about.

You can add the prefix "global::" to a namespace name to force the name resolution algorithm to start its search from the global namespace. This is useful when there is a conflict between a global namespace and a local namespace or type. For example, if your code is in the poorly named namespace "Foo.Bar.System", then "System.String" will look for "String" in "Foo.Bar.System", which results in an error. If you write "global::System.String", then String is looked up in the global namespace System.
C# prohibits "falling through" from one section of a switch statement to the next. What people often don't know is that you don't have to end each section with break. You can force control to pass from one section to another by ending the section with a "goto case" and a case label. You can also end a switch section with goto, return, throw, yield break, continue, or even an infinite loop.
And one last clever feature that few people know about: you can chain null-coalescing operators to get the first non-null value in a sequence of expressions. If you have variables x, y, and z of type int?, then the result of the expression x ?? y ?? z ?? -1 is the first of x, y, or z that is not null, or -1 if they are all null.
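The three features above can be sketched together in one short program. The namespace "Foo.Bar.System" comes from the example in the text; the class and method names (`LittleKnownFeatures`, `Describe`, `FirstNonNull`) are assumptions made for illustration.

```csharp
namespace Foo.Bar.System
{
    public static class LittleKnownFeatures
    {
        // 1. Inside Foo.Bar.System, a plain "System.String" would be looked up as
        //    Foo.Bar.System.String and fail to compile; "global::" starts the
        //    lookup at the global namespace instead.
        public static global::System.String Describe(int n)
        {
            // 2. Switch sections do not have to end with "break".
            switch (n)
            {
                case 0:
                    return "zero";       // ends the section with return
                case 1:
                    goto case 2;         // explicit "fall through" to another section
                case 2:
                    return "one or two";
                default:
                    throw new global::System.ArgumentOutOfRangeException(nameof(n));
            }
        }

        // 3. Chained null-coalescing: the first non-null operand wins, else -1.
        public static int FirstNonNull(int? x, int? y, int? z)
        {
            return x ?? y ?? z ?? -1;
        }

        public static void Main()
        {
            global::System.Console.WriteLine(Describe(1));                      // prints "one or two"
            global::System.Console.WriteLine(FirstNonNull(null, 7, null));      // prints 7
            global::System.Console.WriteLine(FirstNonNull(null, null, null));   // prints -1
        }
    }
}
```

Every reference to the BCL in this sketch goes through "global::" precisely because the enclosing namespace shadows the name "System".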

I often talk about the unusual features of the language in my blog, so visit it if you want more examples.

Source: https://habr.com/ru/post/199212/