C# for Systems Programming

From the translator. An article titled "The Future of C#" was recently published on Habr, describing new features likely to make it into C# 6.0. As a .NET programmer I really liked it, and I went looking for more information about where C#/.NET is heading. Then, as if in answer to my New Year wishes, on December 27 Joe Duffy published on his blog the article "C# for Systems Programming", about a research project under his leadership that aims to create a new language and platform based on C#/.NET. Impressed by the article, I decided to publish a somewhat free translation of it on Habr.

For the past 4 years, my team has been designing and implementing a set of extensions to C# for systems programming. I have finally decided to write about this work in a series of posts, and this is the first one.

To the first question, "Why do we need a new language?", I readily admit that the world already has plenty of programming languages. But here is how I explain it. If you plot the popular languages on a plane where the horizontal axis is "Performance" and the vertical axis is "Safety and Productivity", this is what you get.

Please take this picture with a grain of salt. I understand that safety is not the same thing as productivity, that safety can be understood in many different ways, and so on. Still, safety and productivity often go hand in hand: think of how much time and effort developers typically spend on safety bugs, on tools that help improve the code one way or another, and so on.

As the diagram shows, I claim that today's popular programming languages fall into two large dominant groups, which occupy two of its quadrants.
In the upper left quadrant we have garbage-collected languages, which put developer productivity first. Over the past few years JavaScript performance has improved significantly, thanks to Google. More recently the same has happened with PHP. It is clear that there is now a whole family of dynamically typed languages that compete actively with C# and Java. The choice has therefore shifted from a question of performance to a preference for a dynamic or a static type system.

This means that languages like C# are increasingly suffering from the Law of the Excluded Middle. Being stuck in the middle does not end well.

In the lower right quadrant are the languages that squeeze out every last drop of performance. Let's be honest: most programmers would not put C# and Java in this quadrant, and I agree with them. I have seen many people run from garbage-collected languages back to C++ with a sour taste in their mouths. (To be fair, garbage collection itself is only partly to blame; much of this flight is due to poor design patterns, frameworks, and missed opportunities to make the language better.) Java is closer to the performance quadrant than C#, primarily thanks to the excellent work in HotSpot-like virtual machines, which use code pitching and stack allocation. Even so, most hardcore systems programmers still choose C++ over C# and Java purely for its performance advantage. And although C++11 moves a little closer to languages like C# and Java in productivity and safety, it still shows a clear reluctance to add guaranteed type safety to C++. Yes, you now run into unsafety far less often, but I firmly believe that, as with pregnancy, you cannot be "half safe". The presence of unsafety in a language means that you must always plan for the worst case and use tools to regain safety after the fact, instead of relying on the type system to provide it.

Our top-level goal was to find out whether this dichotomy is real, that is, whether the two quadrants are truly incompatible. In other words, we wanted to know whether anything can exist in the upper right quadrant at all. After several years of work, including applying these ideas to a huge code base, I believe the answer is "Yes!".

The result should be seen as a set of extensions to C# that change the existing language as little as possible, not as a completely new language.

The next question is: "Why base it on C#?" In the language we want, type safety is non-negotiable, and C# provides a damn good canvas in the "modern type-safe C++" style on which to start painting our picture. C# is closer to what we want than, say, Java, because it includes modern features such as delegates and lambda expressions. There are other candidates today, such as D, Rust and Go. But when we started our work, these languages either did not exist yet or were not complete enough to use. In addition, my team works at Microsoft, where plenty of talented C# developers are within arm's reach. I am eager to collaborate with experts in the other languages listed above, and I have already shared ideas with several key people from their communities. The good news is that all the languages we are considering share common roots in C, C++, Haskell, and so on.

Finally, you might ask, "Why not build on C++?" I have to admit that along the way I have often wondered whether we should have started with C++ and carved a "safe subset" out of it. We often found ourselves "putting C# and C++ in a blender to see what comes out", and I admit that at times C# held us back, particularly once you start thinking about RAII, deterministic destructors, references, and so on. The subtleties of generics versus templates deserve a separate post. I do expect to consider moving to C++ sooner or later, for two reasons: (1) such a move would increase the number of users of the language (there are many more programmers in the world who know C++ than who know C#), and (2) I dream of standardizing the language, so that the open source community also does not have to choose between "safety and productivity" and "performance". But for the initial goals of the project I am happy to use C#, not least because of the rich functionality of the .NET Framework.

Over the past few years I have hinted at our project several times (for example, here and here). In the coming months I will start sharing more detailed information. My ultimate goal is to release the results of the project as open source, but before that we need to tidy up some aspects of the language and, more importantly, move to the Roslyn code base. I hope this goal will be reached in 2014.

At a high level, I group the language's features into six main categories.

Lifetime understanding. C++ has RAII, deterministic destructors, and efficient allocation of objects. C# and Java encourage developers to rely on a heap managed by the garbage collector, and offer only loose support for deterministic destruction through IDisposable. Part of my team regularly converts C# programs to our new language, and it is not uncommon for us to find that the garbage collector eats 30-50% of a program's execution time. For servers this kills throughput; for clients it degrades the user experience, because it makes the interface less responsive. You could say that we "stole" several features from C++: rvalue references, move semantics, destructors, and references and borrowing. At the same time we kept the language safe and combined the ideas taken from C++ with ideas from functional programming. All of this lets us aggressively allocate objects on the stack, destroy them deterministically, and much more.
Side-effects understanding. This is an evolution of the ideas we published at OOPSLA 2012. It brings into the language some of the capabilities of the C++ const qualifier (again, in a safe way), along with first-class immutability and isolation.
Async programming at scale. The community keeps going around in circles on this question, trying to decide between continuation passing and lightweight blocking coroutines. The question affects not only C# but pretty much every existing language. The key innovation here is a composable type system that is agnostic to the execution model and can work efficiently with any of them. It would be presumptuous to say that ours is the only approach to this problem, but having tried many others, I can say that I like the one we have chosen.
Type-safe systems programming. It is often claimed that type safety and loss of performance go hand in hand. There is no doubt that bounds checking takes time, and that we prefer overflow checking to be on by default. But it is amazing how much a good optimizing compiler can achieve compared with JIT compilation. We also let you do more without allocating, such as lambda-based APIs that can be called with no allocations at all (today such a call usually requires two: one for the delegate and one for the closure's display object). Another feature is the ability to slice out sub-arrays and substrings without allocating.
Modern error model. This is another item on which the community disagrees. We settled on what I consider the best option available: contracts everywhere (preconditions, postconditions, invariants, assertions, and so on), fail-fast as the default policy, exceptions only for rare dynamic failures (parsing, I/O, and so on), and typed exceptions only when rich exception information is absolutely necessary. All of this is integrated into the type system as first-class constructs, so you get everything you need with integrity and safety.
Modern frameworks. This covers a whole grab bag of things, including asynchronous LINQ and improved enumerator support that competes with C++ iterators in performance and does not require double-interface dispatch to extract elements. To be honest, this is the area where we have the longest list of things that are "designed but not yet implemented". It also includes void as a first-class type, non-null types, traits, first-class effect typing, and more. I expect we will manage to implement only a small part of this list by mid-2014.
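
Several of the mechanisms named in the list above (deterministic destruction, move semantics, and slicing a substring without allocating) already exist in today's C++, so they can be sketched there. What follows is only an illustration in standard C++17, not code from the project and not its syntax; Buffer, live_count and first_word are hypothetical names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical resource type (an assumption for illustration only).
// It counts live instances so that deterministic destruction is observable.
class Buffer {
public:
    explicit Buffer(std::size_t size) : data_(size) { ++live_count; }

    // Move constructor: takes over the other buffer's storage instead of
    // copying it. The moved-from object still exists until its scope ends.
    Buffer(Buffer&& other) noexcept : data_(std::move(other.data_)) {
        ++live_count;
    }

    // Copying is disabled so that ownership is always explicit.
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;
    Buffer& operator=(Buffer&&) = delete;

    // Runs at a known point: the end of the enclosing scope.
    ~Buffer() { --live_count; }

    std::size_t size() const { return data_.size(); }

    static inline int live_count = 0;  // number of Buffer objects alive

private:
    std::vector<unsigned char> data_;
};

// Returns the first space-delimited word as a view into `text`.
// No characters are copied and no memory is allocated; the result is
// valid only while the storage behind `text` is alive.
std::string_view first_word(std::string_view text) {
    const std::size_t end = text.find(' ');
    return text.substr(0, end);  // if no space is found, the whole text
}
```

In this sketch a Buffer created inside a block is released at the closing brace with no collector involved, and first_word("hello world") yields a view of "hello" that points into the caller's storage rather than a fresh copy. Plain C++ leaves it to the programmer to keep that storage alive for as long as the view is used.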

I hope our project interests you, and I would like to hear what you think of it, both as a whole and of its individual parts, and which aspects interest you most. I am very happy to share information with you, but the reality is that I do not have much time for blogging and we are swamped with work (by the way, we're hiring). Still, I will definitely take your opinions into account when deciding what to write about and in what order. In the end, I look forward to the day when I can share real code with you rather than text. Until then, Happy Hacking!

Post update

What was intended as an innocent blog entry to facilitate an open dialogue with the community turned into something definitely bigger.

I hope it is clear from my biography that the language described in this post is a research project, no more and no less. Think of me as a guy from Microsoft Research who published his paper on his blog rather than at PLDI. I am simply not talented enough to present at PLDI.

I very much hope to write more about the project in the coming months, but only in the spirit of open collaboration with the community, not so that people can dig for "deeper meaning" or "higher matters". There is no need for so much speculation!

I love your enthusiasm, so please keep the dialogue technical. If all the other speculation and ranting went away, I would be a happy man!

Afterword from the translator

Despite its relatively small size, this article was rather hard to translate. Many terms simply have no adequate translation, and I do not fully understand many of the technical details. I tried to hyperlink every term for which I could find an explanation (or at least believe I found the right one). Even so, I am sure the translation contains many inaccuracies, errors and distortions, both ordinary and technical. I will therefore be very grateful for any comments or suggestions, wherever and however is convenient for you.

If you are interested in Duffy's mysterious and monumental project, I advise you to read the comments on the original article. Duffy often answers questions from readers of the blog, and from his answers you can learn a lot of interesting things that are not covered in the article itself.

I also recommend reading the translation of the interview with Joe Duffy, "10 questions about parallel programming and threads in .NET", published on RSDN. Although the interview dates from 2007, I think Duffy's answers have not gone out of date.

There is another, no less interesting interview with Duffy, "Joe Duffy on Uniqueness and Reference Immutability for Safe Parallelism", dated April 2013 and published by InfoQ. Besides concurrency, Duffy also briefly mentions the project described in this post. I could translate it, but given the size and complexity of the interview, I do not want to start without knowing whether the Habr community is interested. I would rather not repeat what happened with my recent translations of Jon Skeet's articles, which interested only a few people. So I am waiting for your opinion.

Source: https://habr.com/ru/post/208608/
