
Why I choose D

Instead of an introduction


Good afternoon, Habr readers.
I would like to share my modest experience of choosing a programming language for my projects. I want to stress right away that I chose a language based on my own needs, and your choice under similar conditions may well be different. Nevertheless, I sincerely hope this article will be useful: it compares D with C++ and C# in some detail and mentions more than ten other languages that belong to different classes and implement different paradigms. D itself is being developed as a high-level language for both system and application programming.

It should also be noted that there are two versions of D: D1 and D2. D1 is stable and receives only bug fixes, while D2 is still gaining new features. Since D2 is recommended for new projects, it is the version discussed in this article unless otherwise indicated.

On the scope of programming languages in general


So let's get started. First of all, the choice of a programming language depends primarily on the range of tasks: C, C++, Java or C# will most likely not replace JavaScript for web pages or SQL for relational databases. A caveat is needed here, though: in principle, you can write a compiler or a class library that converts one language into another according to fixed rules. Microsoft did exactly this in its ASP.NET development platform: scripts for the browser can be written in any .NET language (for example, C#) and are then automatically converted to the corresponding JavaScript code. However, I have personally tested this functionality only on relatively simple examples of generating form validators for browsers, so JavaScript may still be needed for some non-standard cases. On the other hand, given the number of freely distributed libraries and frameworks (jQuery, MooTools, Prototype and others) tested with all modern browsers, you may not need to reinvent the wheel. And since we are talking about the applicability of languages to various technologies: you can, for example, choose a functional language such as Haskell for writing an operating system, but choosing JavaScript as a system programming language would probably be unwise. By the way, one OS has already been written in Haskell; it is called House. An application program, however, cannot be written in SQL, since SQL is not Turing complete. Transact-SQL is Turing complete, but it is supported only by Microsoft SQL Server and Sybase ASE, so it is not suitable for writing application programs either.

The second factor is, of course, the choice of the project manager. There is an opinion that C++ is so popular only because it was popular five years ago. In other words, when choosing a programming language, the project manager is likely to prefer a better-known language (Java or C++) over a less well-known one (for example, Haskell or Common Lisp), even if the latter is better suited to the particular project. Why? The reasoning is very simple: if a manager fails a project in C++, he can try to justify himself by pointing out that hundreds of C++ projects die every year. With Haskell this will not work, because the manager himself insisted on a relatively rarely used technology. From my point of view, this is no excuse at all: the manager should propose the language best suited to the problem domain rather than be guided by possible excuses based on a language's popularity. Of course, this does not mean that you should program in abandoned and useless languages. I just want to emphasize that the choice of language deserves a creative approach. For example, functional languages (Common Lisp, Scheme, Haskell and others) may be well suited to computer algebra problems, since mathematical formulas in such languages appear in a more familiar, “mathematical” form than in imperative languages. Here is a sample of Haskell code that computes the factorial from its recurrence formula:
  factorial :: Integer -> Integer
  factorial 0 = 1
  factorial n | n > 0 = n * factorial (n - 1)

Doesn't it look very much like a textbook formula? In fact, the free computer algebra system Maxima is written in Common Lisp and is comparable in its capabilities to Maple and Mathematica.
In saying all this, I just want to emphasize that the choice of language really matters. Not all languages are equal; some are more equal than others. Don't believe me? Here is a program that prints "Hello World!" in Brainfuck:
  ++++++++++[>+++++++>++++++++++>+++>+<<<<-]>++
  .>+.+++++++..+++.>++.<<+++++++++++++++.>.+++.
  ------.--------.>+.>.

To be clear, this esoteric language with only 8 commands was created by Urban Müller for fun; nevertheless, it is Turing complete and therefore lets you write things that SQL cannot. The obvious question is: who needs it when there are dozens of more worthy candidates? Perhaps this was precisely the idea Müller wanted to emphasize by giving the language a name that echoes a well-known English swear word. Another possible explanation is that a memorable name was meant to increase the language's popularity.

Finally, your own preferences may influence your choice of language, if the first two constraints do not settle it.

From here on, we will discuss the choice of language primarily for application programming. So why is D so attractive? In fact, everyone has to answer this question for themselves; I will just try to describe the factors that guided my choice. To understand what kind of language you need, you first have to figure out what disadvantages the existing languages have and settle on a list of possible candidates.

A few drawbacks of C#


Let's take things in order. Why not choose C# as the language for application programming, since that is what it was designed for? The main problem, which is extremely important to me, is compilation into Common Intermediate Language (CIL), which makes decompilation a trivial task. Moreover, there are special tools for it. For example, .NET Reflector produces C#, VB.NET or F# source code without errors, preserving the original names of variables, functions and classes. The resulting source code differs from the author's only in the absence of comments. Those who want to experiment can download a trial version of the program from the official site.

So, while analyzing a compiled C++ program requires fairly deep knowledge, including the ability to work with a disassembler, obtaining the source code of a C# program is a matter of a couple of minutes. Of course, none of this matters to Open Source supporters, but one can argue here too: if I want to make a program free, I will publish its source code; otherwise I have reasons to keep it closed. In the end, not all software is free. It should be noted that there are third-party obfuscators that can change the code of a program beyond recognition by inserting additional instructions, changing internal interfaces and variable names, generating additional exceptions, encrypting code and so on. All this makes a C# program very difficult to analyze. The significant shortcomings of obfuscators are these: the generated code is slower than the original, an error may be introduced into the program, and the obfuscator itself may be cracked (after which all its protection becomes useless). Think about it for a moment: for years people have been improving compilers, trying to create better and more optimizing development tools, and here we see a kind of regression.

However, the importance of compiling to native binary code depends on the application: if the program is not going to be used outside the developer's company, the danger of decompilation will most likely not be a decisive factor in choosing a language. To summarize, I would add that the standard distribution of Microsoft's C# compiler includes a tool for compiling programs to native binary code, but the resulting program will not run without the original CIL code. So it cannot be used as protection against cracking.

The second thing to think about before using C# is the portability of the resulting code. Let me quote Oktal: "I think Microsoft named the technology .Net so that it would not show up in Unix directory listings." I don't know whether the author is right, but from C# it is very easy to call WinAPI functions, COM objects and other components that make a program non-portable. Even without Windows-specific components, the program itself will not run on Unix systems without Mono. Yes, Mono is a well-known product, but problems with it may still arise. First, Mono developers will always be one step behind Microsoft, since they start implementing libraries and standards only after these have been released. Second, what if they simply lose interest? I understand perfectly well that this is a community-developed project under a free license, but the risk that support will be discontinued still exists. This would not affect existing applications, but new programs would likely become non-portable for lack of support for new libraries. Third, there are no standards for Microsoft's WinForms, ADO.NET and ASP.NET components. Using them may entail legal claims from Microsoft, so using them with Mono is not recommended. This means that ASP.NET, an excellent component used to build sites of various scales, has no advantage over, say, PHP. And if you still want to use it in your projects, be mentally prepared to purchase licenses for the server edition of Windows.

And finally, the third question to ask yourself: will Microsoft continue to develop this language? Currently, all C#/CLI patents belong to Microsoft, and if the company drops support, the language will die. Personally, I believe that C# will continue to evolve, because Microsoft has invested too much in it. But no one can give guarantees: at one time Microsoft dropped support for Visual Basic and created instead a completely new language, Visual Basic .NET, without backward compatibility with its predecessor. This was a tragedy for thousands of programmers working in Basic. In addition, according to press reports, Microsoft plans to abandon the Windows brand in 2015-2016 and create a new OS for tablets, smartphones, computers, consoles, TVs and other devices, which means that the languages used for Windows development may also sink into oblivion.

For me personally, these arguments are enough to stop using C#. Don't get me wrong: C# is a beautiful modern language, but it is suitable primarily for Windows, and not in all cases even there. With a little thought, you can conclude that relying on the .NET platform is just as risky as relying on C#. Therefore, my personal opinion is that you should not write in languages focused primarily on this platform. For example, Nemerle is a good language, in some respects superior to C#; it combines functional and object-oriented programming, and its main feature is an advanced metaprogramming system. But the design of the language is focused primarily on the .NET platform, which calls into question its usability in a number of projects. F# is a great example of a functional programming language for the .NET platform, somewhat similar to Haskell, and may be suitable for developing mathematically oriented systems. But, again, the .NET platform limits its applicability.

Dear adherents of C# and the .NET platform! I know perfectly well that these technologies have many advantages, first of all a huge class library with which you can do (almost) anything. Unfortunately, a detailed review of the advantages of the .NET platform would require an article of about the same size, so please excuse the obviously incomplete coverage of this technology and its capabilities.

A little about the disadvantages of C++


Why not choose C++? Of course, it is an industry standard that everyone has heard of. But in fact this language has a number of significant flaws, some of which I will try to consider. The first thing that comes to mind is that the language is complex. More precisely, it is very complex. Compared to it, working in C# starts to feel like child's play. Judge for yourself: the C standard takes only about 500 pages, C++ about 800, and C++11 about 1300. Judging by the volume of technical documentation, this language is clearly more complex than a mixer, a sewing machine or a car and comes closer to an airplane. For comparison, the C# 4.0 standard takes only 505 pages. At this point I want to recall a quote from Alan Curtis Kay: "I invented the term 'object-oriented', and I can tell you I did not have C++ in mind." In response one can, of course, recall the creator of C++, Bjarne Stroustrup: “There are only two kinds of languages: the ones people complain about and the ones nobody uses”, but this quote sounds more like an excuse. D is a fairly convenient language in this sense: its design drew primarily on C# and Java. The D1 specification is 223 pages. The D2 specification comes as HTML pages bundled with the compiler and is also available on the official website www.d-programming-language.org. In addition, there is Andrei Alexandrescu's book "The D Programming Language", which is effectively a description of the D2 standard (492 pages, currently being translated into Russian). So the complexity of the language itself does not make programming in it any easier. In D everything is made simpler, smarter and more understandable.

Safety first


The next serious drawback of C++ is weak compile-time checking of code for errors. In other words, in C++ it is very easy to write code that compiles but does not work correctly. Let us leave the full list of such questionable features to specialists and limit ourselves to the array element access operator, operator[] (the indexing operation). It is no secret, I think, that the language does not check array bounds. Yet despite this being well known, buffer overflow errors have appeared, still appear and will keep appearing in C and C++ programs. And although it is programmers who make these mistakes, I believe the root cause lies in the language, which encourages them. C#, for example, always checks array bounds, which makes such errors far less likely at some cost in performance (a buffer overflow is still potentially possible because of errors in the compiler or the standard libraries). The STL solves many problems, but only when used correctly. For example, a container template such as vector does not check bounds in the indexing operation (operator[]); to get bounds checking you must use the at function (H. M. Deitel, P. J. Deitel, “C++ How to Program”). In other words, the STL is not a panacea.
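The difference between the two access paths can be sketched in a few lines of C++. This is only an illustration: the helper `checked_read` and the function `demo_checked_read` are hypothetical names introduced here, not something from the article or from Deitel's book. The only library behavior relied on is that `std::vector::at` throws `std::out_of_range` for an invalid index.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not from the article): copies element i into `out`
// and reports whether the index was inside the bounds of the vector.
bool checked_read(const std::vector<int>& v, std::size_t i, int& out) {
    try {
        out = v.at(i);  // at() validates the index and throws std::out_of_range
        return true;
    } catch (const std::out_of_range&) {
        return false;   // out-of-bounds access detected, `out` is left untouched
    }
}

// Small self-check exercising both outcomes.
void demo_checked_read() {
    std::vector<int> v{10, 20, 30};
    int value = 0;
    assert(checked_read(v, 1, value) && value == 20);  // valid index
    assert(!checked_read(v, 3, value));                // one past the end
    // v[3] would also compile, but operator[] performs no check:
    // the access would be undefined behavior.
}
```

The design point is that the check is opt-in: nothing in the language stops a programmer from writing `v[3]` instead.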

In fact, buffer overflow errors are a more serious problem than they seem at first glance. The famous Morris worm, which paralyzed the Internet in 1988, exploited this type of error (the damage is estimated at $96 million). Another example: with return-oriented programming, just 2-3 assembly instructions are enough to compromise a system. In other words, a buffer overflow of even one byte can be a vulnerability. Let me quote security experts Michael Howard and David LeBlanc from “Writing Secure Code”: “In the course of many security campaigns at Microsoft, we strongly advocated identifying dangerous components written in C or C++ and, if necessary, porting them to C# or another managed language. This does not mean that the code will automatically become safe, but some classes of attacks, first of all buffer overflow attacks, will become much more difficult to exploit. The same is true of DoS attacks on servers, which are possible because of leaks of memory and other resources.” Consider this for a moment: we are advised to use a slower, less capable and easily decompiled language solely because it has built-in array bounds checking and garbage collection.

Here it is necessary to distinguish a DoS attack (Denial of Service) from a DDoS attack (Distributed Denial of Service). A DoS attack, as a rule, uses "smart" tricks and vulnerabilities in the service and is carried out from a small number of computers (perhaps even from one). For example, as stated above, a DoS attack can be based on a memory leak. Another example is an attack on a file server that scans all uploaded files for viruses. First, a file of several gigabytes consisting only of zeros is created (its archived size is several kilobytes) and uploaded to the target server; the server then unpacks it and tries to scan it with an antivirus... A DDoS attack is an attempt to cause a denial of service simply by overloading the service with more requests than it was designed to handle. In other words, it is a brute-force attack carried out from a large number of machines at once. Although there are a number of ways to defend against DDoS attacks, it is impossible to protect yourself 100% against a properly organized one, so the performance of the service itself becomes extremely important: the more requests we can process, the smaller the consequences of the attack. The result is a paradox: thanks to better control over resources, C# makes a DoS attack much harder to carry out, yet it makes the service more vulnerable to a DDoS attack because of its lower speed.

But back to array bounds checking in C++. I understand perfectly well that this is a system programming language and that any additional overhead can leave the program short of the required performance. However, it should be emphasized that performance matters only for the final release version of the product. The debug version is meant to be run by a narrow circle of developers on test cases; it therefore can (and should) contain various checks and debugging messages that improve quality and reduce the likelihood of errors (for more details, see Steve McConnell's “Code Complete”). That is why I am saddened that the C++ standard offers no such tools even for the debug version of a program. It should be noted, however, that there are third-party mechanisms that protect the stack against modification in order to detect overflows in it, for example the gcc extensions StackGuard and Stack-Smashing Protector (formerly ProPolice) and similar features in the Microsoft Visual Studio and IBM compilers, which can significantly complicate the exploitation of vulnerabilities. But the potential for compromising the system still remains. Besides stack overflow, heap overflow is also possible and is no less dangerous. Conclusion: the only reliable way to protect against attacks is to write correct code.
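For illustration, here is roughly what a hand-rolled debug-only bounds check might look like in C++. It is a sketch under stated assumptions, not a standard facility: the class `DebugCheckedArray` is a hypothetical wrapper invented for this example, and it relies only on the standard `assert` macro, which is compiled out when `NDEBUG` is defined (as release builds commonly do).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical wrapper (illustration only): a fixed-size array whose index
// check is active in debug builds and compiled out when NDEBUG is defined.
template <typename T, std::size_t N>
class DebugCheckedArray {
public:
    T& operator[](std::size_t i) {
        assert(i < N && "index out of bounds");  // disappears with -DNDEBUG
        return data_[i];
    }
    const T& operator[](std::size_t i) const {
        assert(i < N && "index out of bounds");
        return data_[i];
    }
    constexpr std::size_t size() const { return N; }

private:
    T data_[N] = {};  // value-initialized, so elements start at zero
};
```

With such a wrapper, `a[4]` on a four-element array aborts with a diagnostic in a debug build and costs nothing in a release build, but every project has to write or adopt the wrapper itself.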

It's nice to know that D has learned the lesson: it has built-in array bounds checking in debug builds, which is turned off when an optimized version is compiled. In other words, D combines the qualities of both worlds, which makes it an ideal choice as a system and application programming language. But this is only the beginning of D's safety features. It would be simply wrong not to mention SafeD, a subset of the D language that forbids potentially dangerous memory operations such as pointer arithmetic. This guarantees memory integrity. Combined with the built-in garbage collector, the language acquires traits typical of C#: buffer overflow errors and DoS attacks will trouble you less. Allow me to rephrase Howard and LeBlanc's advice a little: write potentially dangerous fragments of a program in D, and enjoy a safe, efficient and compiled language.

The ability to call C/C++ code, the industry standard


The time has come to mention another important feature of D: it is fully compatible with C/C++ code at the object file level, which lets you directly use functions and classes written in C/C++ from D and vice versa. In fact, the standard C library is part of the D standard, although it is better to use the corresponding functions of the D library instead. This compatibility exists simply because no one is going to rewrite tons of C++ code in D. Just arm yourself with your favorite C++ compiler, and that's it. For comparison, calling managed code (written, for example, in C#) from unmanaged code (written, for example, in C++) is possible through COM objects, which from my point of view is more complicated than simple linking. Admittedly, the C++/CLI standard adds advanced ways for C++ to interact with managed code, but this means using only Microsoft's Visual C++ compiler.

Garbage collection vs destructor


The topic of garbage collection deserves a closer look. I have often heard the phrase: “if garbage collection was added to the language, its performance has been ruined”. I just want to ask these people: “Have you run the tests?” Being a system programming language, D allows manual memory management (although this feature falls outside the SafeD subset). Your toolkit includes overloading of the operators new and delete and the C-style memory management functions malloc and free. In addition, as a rule, speed can be increased by 90% by changing only 10% of the code (according to Martin Fowler's book “Refactoring: Improving the Design of Existing Code”). In other words, most of the program can be written with constructs that handle resources safely, while a small speed-critical fragment can be carefully verified by hand. It is important that the same programming language is used throughout, which simplifies writing, supporting and maintaining the code. Moreover, in large programs garbage collection speeds up development, since a C++ programmer spends a lot of time (up to 50%, according to some sources) on memory management. So here is a recommendation: write the code, and leave performance questions to the profiler, which will help you identify the fragments that are actually slow.

There are also a number of algorithms that simply cannot work correctly without garbage collection. Let me give the simplest example: you have an abstract class Number representing some number, and its subclasses Integer, Long, BigInt, Float, Double, Real, Complex and so on. Now imagine that a line like this appears somewhere in your program:

Number * c = a + b;

where a and b are pointers to Number, i.e. the actual type of the variables is not known. The expectation is that if a and b are both Long, the result is Long or BigInt (to avoid overflow); if they are Integer and Complex, the result is Complex; if they are Double and Double, the result is Integer, Long, Float or Double (depending on the resulting number of decimal places, for example 0.5 + 0.5 = 1); and so on. Tell me, how do you implement the function operator+(Number* a, Number* b) correctly? A detailed analysis of implementations is beyond the scope of this article, but several approaches are described in Jeff Alger's book “C++ for Real Programmers”, in the chapters on multiple dispatch (and double dispatch in particular). I just want to note that operator+ has to create the object on the heap with operator new, since the calling code knows nothing about the actual type of the object c and therefore cannot allocate memory for it on the stack. As a result, we need a garbage collection mechanism to free the memory. Smart pointers with reference counting can be used here, but they have fundamental limitations. In other words, nothing replaces garbage collection built into the language, and I am not pleased that the C++11 standard provides only basic support for it (an implementation of full garbage collection is not part of the standard). To conclude, I would like to recall a quote from Francis Bacon: “He that will not apply new remedies must expect new evils.” And if the example with numbers caught your interest, I also recommend looking at CLOS, the Common Lisp Object System, which has built-in support for multiple dispatch.
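To make the problem concrete, here is a much reduced C++ sketch of the double-dispatch approach combined with reference-counted smart pointers. It is not taken from Alger's book: the hierarchy is cut down to two subclasses, and the names `NumberPtr`, `addTo`, `makeResult`, `isInteger`, `asDouble` and `demo_sum` are assumptions made for this example. Since C++ does not allow overloading an operator whose operands are both raw pointers, the sketch takes references and returns a `std::shared_ptr`.

```cpp
#include <cassert>
#include <memory>

class Integer;
class Real;
class Number;
using NumberPtr = std::shared_ptr<Number>;  // reference-counted owner of a heap object

// Simplified stand-in for the article's Number hierarchy (only two subclasses).
class Number {
public:
    virtual ~Number() = default;
    virtual NumberPtr add(const Number& rhs) const = 0;     // dispatch on the left operand
    virtual NumberPtr addTo(const Integer& lhs) const = 0;  // dispatch on the right operand
    virtual NumberPtr addTo(const Real& lhs) const = 0;
    virtual bool isInteger() const = 0;
    virtual double asDouble() const = 0;
};

class Integer : public Number {
public:
    explicit Integer(long v) : value(v) {}
    NumberPtr add(const Number& rhs) const override { return rhs.addTo(*this); }
    NumberPtr addTo(const Integer& lhs) const override;
    NumberPtr addTo(const Real& lhs) const override;
    bool isInteger() const override { return true; }
    double asDouble() const override { return static_cast<double>(value); }
    long value;
};

class Real : public Number {
public:
    explicit Real(double v) : value(v) {}
    NumberPtr add(const Number& rhs) const override { return rhs.addTo(*this); }
    NumberPtr addTo(const Integer& lhs) const override;
    NumberPtr addTo(const Real& lhs) const override;
    bool isInteger() const override { return false; }
    double asDouble() const override { return value; }
    double value;
};

// Hypothetical narrowing rule: a sum with no fractional part becomes an Integer.
// Assumes the value fits in a long; a real implementation would need range checks.
NumberPtr makeResult(double r) {
    const long whole = static_cast<long>(r);
    if (static_cast<double>(whole) == r) return std::make_shared<Integer>(whole);
    return std::make_shared<Real>(r);
}

NumberPtr Integer::addTo(const Integer& lhs) const {
    return std::make_shared<Integer>(lhs.value + value);
}
NumberPtr Integer::addTo(const Real& lhs) const { return makeResult(lhs.value + value); }
NumberPtr Real::addTo(const Integer& lhs) const { return makeResult(lhs.value + value); }
NumberPtr Real::addTo(const Real& lhs) const { return makeResult(lhs.value + value); }

// The result type is chosen at run time and the object lives on the heap.
NumberPtr operator+(const Number& a, const Number& b) { return a.add(b); }

void demo_sum() {
    NumberPtr a = std::make_shared<Real>(0.5);
    NumberPtr b = std::make_shared<Real>(0.5);
    NumberPtr c = *a + *b;         // the caller neither names the result type nor deletes it
    assert(c->isInteger());        // 0.5 + 0.5 narrows to Integer(1)
    assert(c->asDouble() == 1.0);
}
```

Even in this reduced form, each new subclass requires another `addTo` overload in every existing class, and reference counting cannot reclaim cyclic structures, which is one of the limitations of smart pointers the paragraph alludes to.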

Destructor vs garbage collection


The next thing you need to know is that D has RAII (resource acquisition is initialization), a resource management model characteristic of C++ but absent from C# and Java. In other words, D has destructors, and they are called independently of the garbage collector. This paradigm makes resource management easier (first of all, releasing resources).
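Since RAII originated in C++, the idea can be shown with a short C++ sketch. Everything named here (`Handle`, `open_handles`, `work_that_fails`, `handles_after_failure`) is hypothetical and stands in for a real system resource such as a file handle or a lock; the point is only that the destructor runs deterministically when the scope is left, including when an exception is thrown.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for a scarce system resource (file handle, lock, socket).
static int open_handles = 0;

class Handle {
public:
    Handle() { ++open_handles; }     // acquisition happens in the constructor
    ~Handle() {                      // release happens in the destructor
        assert(open_handles > 0);
        --open_handles;
    }
    Handle(const Handle&) = delete;  // a handle has exactly one owner
    Handle& operator=(const Handle&) = delete;
};

// Throws while a Handle is alive; stack unwinding still runs ~Handle().
void work_that_fails() {
    Handle h;
    throw std::runtime_error("simulated failure");
}

// Returns the number of handles still open after the failure was handled.
int handles_after_failure() {
    try {
        work_that_fails();
    } catch (const std::runtime_error&) {
        // nothing to clean up here: the destructor already released the handle
    }
    return open_handles;
}
```

The calling code contains no explicit release call at all, which is what makes the model hard to misuse.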

As you know, C# and Java abandoned this model, replacing destructors with finalizers that are invoked by the garbage collector. This makes finalizers unusable for releasing critical system resources, since a finalizer may be called only after a long delay. Resources therefore have to be released explicitly. “But what about information hiding? Now I have to remember which classes require explicit release of resources and which do not! This is exactly where mistakes creep in,” you will say, and you will be right. One can only shrug and answer that the destructor model existed in C++ and exists in D. However, in an incorrectly written program, releasing resources “at some unknown time” is still better than “never”, so an unequivocal choice between the resource management models of C++ and C#/Java is difficult to make. Perhaps that is why so much controversy arises between C++ and Java fans.

[Passage not recoverable from the source text. The surviving fragments concern resource release in C# and the .NET classes Stream, FileStream, MemoryStream, NetworkStream and StreamWriter, and end with a remark on RAII in D.]

Instead of a conclusion


Unfortunately, this article did not have enough space to review a number of D's features. I would still like to list them, if only briefly:


Source: https://habr.com/ru/post/138635/

