
How Microsoft rewrote the C# compiler in C# and made it open source

The author of the article is Mads Torgersen, Lead C# Architect at Microsoft

Roslyn project

Roslyn is the codename for the open source compiler for C# and Visual Basic .NET. The project began in the deepest darkness of Microsoft's corporate life in the last decade, and ended up as an open source, cross-platform, public language engine for all things C# (and VB, which I will take as implied for the rest of the article).

The first talk about the project that would later become known as Roslyn was already under way when I joined Microsoft in 2005, shortly before the release of .NET 2.0. The idea was to rewrite the C# compiler in C#. This is normal practice for programming languages: a proof of the language's maturity. But there was a more practical and important motivation: we, the creators of C#, did not program in C#, we programmed in C++! Programming in C# every day changes your perspective: there is great power in working with the very tool you are developing (dogfooding).

The difficulty of rewriting a compiler that has been in active use for several years is that users depend on the new compiler behaving exactly like the old one. Writing a new compiler for C# is an exercise in bug-for-bug compatibility. And I am talking not only about known bugs, but also about unknown errors and unintended behaviors that developers have discovered and come to rely on, often without realizing it.

For many years, the scale of this problem did not allow us to even begin to implement the project.

And although the language team at Microsoft stood to gain a great deal from a new C# compiler written in C#, the value for end users was not so obvious: how would the new compiler be useful to them? Perhaps the only people who care that the C# compiler is written in C# are the compiler developers themselves.

Meanwhile, another problem was becoming more and more apparent: duplication of effort between the various tools that operate on C# code. In addition to the compiler, a separate team worked on IDE support for C# in Visual Studio, and they also had to write a lot of code (at the time, also in C++) to understand the syntax and semantics of C#.

The number of tools from Microsoft and other companies, such as StyleCop and CodeRush, was also growing, and every one of them had to implement its own meaningful processing of C# code. Each of these programs had its own slightly different bugs, a different level of understanding, and different compromises and concessions. And each of them had to spend a lot of effort just to arrive at an understanding of the code.

So we settled on an important proposition: there should be only one codebase in the world that understands C#, a single foundation for all tools that work with C# code!

The value of this proposition comes from an increase in the number of available tools and, above all, from an improvement in the quality of existing ones. All requirements for the correctness and performance of the language implementation are placed on a single codebase. A one-time effort is enough to build a foundation of stellar quality and tremendous versatility. We would create a real language engine: a unified, open API for C# code. We would redefine what "compiler" means.

Of course, once you are building an API for the wider C# community, it goes without saying that it should be a .NET API implemented in C#. So the old dream of writing the C# compiler in C# came true almost as an accidental side effect.

Thus, Roslyn was born out of a mentality of openness: sharing the inner workings of C# for programmatic use by the whole world. This in itself was a rather bold suggestion for the still fairly closed corporate culture of Microsoft.

Will we share intellectual property for free? Will we empower tools that compete with us?

In the internal debate, what won the day were the arguments for strengthening the ecosystem and creating a language with the best tools on the planet. It was a matter of the long-term growth of C# and .NET versus short-term monetization and the protection of Microsoft assets. So even before open source was mentioned, the bet on Roslyn was a big and bold step for Microsoft.

Of course, developing something like this could not be easy. The vision for Roslyn was very ambitious and fraught with technical problems, and it took us half a decade to work through them all. But that's another story.

For most of the initial development, Roslyn remained a closed source project.

From the very beginning of serious work on the project in 2009, we entertained the idea of making the compilers open source, but Microsoft was simply not ready yet.

Since the 1970s, Microsoft had had a culture of closed source and of protecting source code with patents. And although change was in the air, it came more slowly than our team had hoped.

In fact, for some time it seemed that the company was going in a completely opposite direction.

The Windows 8 project had a profound effect on the entire company. Through its new programming model, its tentacles reached deep into the teams building tools and languages, and everything was shrouded in the utmost secrecy, not only outside the company but even inside it. For example, the async feature that we were developing at the time was coordinated and intertwined with the Windows 8 programming model, and I did not dare publish notes about its design even within the company, for fear of accidentally leaking information about Windows 8 and bringing trouble down on my head! This created a terrible climate for innovation and, of course, left us no hope of open-sourcing the C# compiler.

But eventually, once Windows 8 had run its course, the company began to transform itself and found a new direction, new leadership and a completely different philosophy: the Microsoft we know today. Open source is now spreading rapidly inside Microsoft.

F# was released in 2010 under an open license and with its own organization, the F# Software Foundation. An outstanding community formed around it, which soon became the envy of us all. Our team pushed for an open license for Roslyn, and finally the corporate infrastructure allowed it.

By 2012, Microsoft had created Microsoft Open Tech, an organization focused specifically on open source projects. Roslyn came under its wing and officially became an open source project. Roslyn was ripe for this: all development was done in-house and was well understood, and the project itself did not have many dependencies that could have caused licensing conflicts.

In April 2014, at the Build developer conference in San Francisco, Anders Hejlsberg presented Roslyn as an open source project, and the source files were published on April 3 on CodePlex (Microsoft's former platform for repositories) under the Apache 2.0 license.



At the same time, the .NET Foundation was announced as the home for .NET projects, including Roslyn.

This release was a breath of fresh air! We began to reap the benefits of openness on CodePlex, and then the remaining procedural obstacles to open source at Microsoft were removed, so that today open source is a natural and integral part of how many of our teams work.

On other fronts, too, the company realized that it did not need to control everything. It became clear that there was no compelling reason for CodePlex to exist, and Roslyn, along with other projects, migrated to GitHub, which by then had become the de facto main platform for open source projects. Not only the code itself but also the process of creating it now lives on GitHub: we no longer think of GitHub as a place to publish source code, it is simply where we work.



C# language design and compiler implementation are now fully open processes with significant outside participation. Outside contributors even build entire language features. The value for C# is off the charts, not only because the effort of writing features and fixing bugs scales, but also because of the insight and course correction we gain from an instant, daily feedback loop with the community.

It has been a long and crazy journey, and for me it symbolizes the tremendous changes that Microsoft has undergone in the last decade. Roslyn was born in the dark but grew up on ideas of openness, and today it has exploded into a million different uses thanks to the power of open source.

Learn more about Roslyn and C# language design:

Roslyn on GitHub
C# on GitHub

Source: https://habr.com/ru/post/426961/

