man! (C => D)

Every experienced C programmer accumulates a familiar baggage of techniques and idioms, and it is often hard to see how to do the same thing in a new language. This article is a collection of common C patterns and their D equivalents. If you are going to port your program from C to D, or are still in doubt about whether you should, this article is for you.

Get type size in bytes

In C, we use a special operator:

    sizeof( int )
    sizeof( char * )
    sizeof( double )
    sizeof( struct Foo )

In D, each type has a special property:

    int.sizeof
    (char*).sizeof
    double.sizeof
    Foo.sizeof

Get the maximum and minimum value of a type

In C:

    #include <limits.h>
    #include <float.h>

    CHAR_MAX
    CHAR_MIN
    ULONG_MAX
    DBL_MIN

In D:

    char.max
    char.min
    ulong.max
    double.min_normal

Type correspondence table C => D

    bool                    => bool
    char                    => char
    signed char             => byte
    unsigned char           => ubyte
    short                   => short
    unsigned short          => ushort
    wchar_t                 => wchar  (where wchar_t is 16 bits, as on Windows)
    int                     => int
    unsigned                => uint
    long                    => int    (where C long is 32 bits)
    unsigned long           => uint   (where C unsigned long is 32 bits)
    long long               => long
    unsigned long long      => ulong
    float                   => float
    double                  => double
    long double             => real
    _Imaginary long double  => ireal
    _Complex long double    => creal

Special values of floating point numbers

In C:

    #include <math.h>

    NAN
    INFINITY

    #include <float.h>

    DBL_DIG
    DBL_EPSILON
    DBL_MANT_DIG
    DBL_MAX_10_EXP
    DBL_MAX_EXP
    DBL_MIN_10_EXP
    DBL_MIN_EXP

In D:

    double.nan
    double.infinity
    double.dig
    double.epsilon
    double.mant_dig
    double.max_10_exp
    double.max_exp
    double.min_10_exp
    double.min_exp

Remainder of floating-point division

In C, we use special functions:

    #include <math.h>

    float f = fmodf( x , y );
    double d = fmod( x , y );
    long double r = fmodl( x , y );

D has a built-in operator for this:

    float f = x % y;
    double d = x % y;
    real r = x % y;

Processing NaN Values

In C, the result of comparing with NaN has historically depended on the compiler: the standard guarantees IEEE 754 behavior only for implementations that support its optional Annex F, and compilers have reacted differently (from ignoring the NaN to raising a floating-point exception). So you have to use special functions:

    #include <math.h>

    if( isnan( x ) || isnan( y ) ) {
        result = FALSE;
    } else {
        result = ( x < y );
    }

In D, an ordering comparison with NaN always yields false:

    result = ( x < y ); // false if x or y is nan
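As a rough illustration of the C side, the isnan guard can be wrapped in a small helper. This is only a sketch; the name less_than is hypothetical and not part of any library.

```c
#include <math.h>
#include <stdbool.h>

/* Returns true only when neither operand is NaN and x < y. */
bool less_than(double x, double y)
{
    if (isnan(x) || isnan(y)) {
        return false;   /* unordered operands never compare as "less" */
    }
    return x < y;
}
```

A caller would then write `result = less_than(x, y);` instead of repeating the NaN check at every comparison.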

Asserts: a useful error-detection mechanism

C has no assert built into the language itself, but it provides the __FILE__ and __LINE__ pseudo-constants and macros, with which the standard library implements assert (this is the main practical use of those constants):

    #include <assert.h>

    assert( e == 0 );

D supports asserts at the language level:

    assert( e == 0 );

Array iteration

In C, you specify the length of the array as a constant and then walk the array with a cumbersome for loop:

    #define ARRAY_LENGTH 17
    int array[ ARRAY_LENGTH ];
    int i;

    for( i = 0 ; i < ARRAY_LENGTH ; i++ ) {
        func( array[i] );
    }

You can also use a clumsy expression with sizeof, but it does not change much:

    int array[17];
    size_t i;

    for( i = 0 ; i < sizeof( array ) / sizeof( array[0] ) ; i++ ) {
        func( array[i] );
    }

In D, arrays have a length property:

    int[17] array;

    foreach( i ; 0 .. array.length ) {
        func( array[i] );
    }

But, where possible, it is better to iterate over the collection itself:

    int[17] array;

    foreach( value ; array ) {
        func( value );
    }

Array elements initialization

In C, you have to walk the array in a loop (or use a macro again):

    #define ARRAY_LENGTH 17
    int array[ ARRAY_LENGTH ];
    int i;

    for( i = 0 ; i < ARRAY_LENGTH ; i++ ) {
        array[i] = value;
    }

D has a simple dedicated notation for this case:

    int[17] array;

    array[] = value;

Creating variable length arrays

C has no built-in resizable arrays, so you have to keep a separate variable for the length and manage memory allocation by hand:

    #include <stdlib.h>

    int array_length;
    int *array;
    int *newarray;

    newarray = (int *) realloc( array , ( array_length + 1 ) * sizeof( int ) );
    if( !newarray ) error( "out of memory" );
    array = newarray;
    array[ array_length++ ] = x;

D has built-in support for variable-length arrays and takes care of the memory handling itself:

    int[] array;
    int x;

    array.length = array.length + 1;
    array[ array.length - 1 ] = x;
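To make the C bookkeeping concrete, here is a minimal sketch that bundles the pointer and the length into one structure. The names IntArray and int_array_append are hypothetical, chosen only for this illustration.

```c
#include <stdlib.h>

/* Hypothetical growable array of int: pointer and length travel together. */
struct IntArray {
    int   *data;
    size_t length;
};

/* Appends x. Returns 0 on success, -1 if memory could not be allocated. */
int int_array_append(struct IntArray *a, int x)
{
    int *grown = realloc(a->data, (a->length + 1) * sizeof *grown);
    if (!grown) {
        return -1;              /* a->data is still valid and unchanged */
    }
    a->data = grown;
    a->data[a->length++] = x;
    return 0;
}
```

The array starts out as `{ NULL, 0 }`, and the caller is still responsible for calling free() on the data pointer when done, which is exactly the burden the D version removes.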

String concatenation

In C, you have to solve many problems yourself: when memory can be freed, how to handle null pointers, how to find the length of a string, how much memory to allocate, and so on:

    #include <stdlib.h>
    #include <string.h>

    char *s1;
    char *s2;
    char *s;

    // Concatenate s1 and s2, and put result in s
    free(s);
    s = malloc( ( s1 ? strlen( s1 ) : 0 ) + ( s2 ? strlen( s2 ) : 0 ) + 1 );
    if( !s ) error( "out of memory" );
    if( s1 ) {
        strcpy( s, s1 );
    } else {
        *s = 0;
    }
    if( s2 ) {
        strcpy( s + strlen( s ) , s2 );
    }

    // Append "hello" to s
    char hello[] = "hello";
    char *news;
    size_t lens = s ? strlen( s ) : 0;
    news = realloc( s , ( lens + strlen( hello ) + 1 ) * sizeof( char ) );
    if( !news ) error( "out of memory" );
    s = news;
    memcpy( s + lens , hello , sizeof( hello ) );

In D, the overloaded operators ~ and ~= are designed for concatenating arrays:

    char[] s1;
    char[] s2;
    char[] s;

    s = s1 ~ s2;
    s ~= "hello";
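For comparison, the C concatenation logic can be packed into one reusable function. This is a sketch under the assumption that a NULL argument should be treated as an empty string; the name concat is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated concatenation of s1 and s2 (NULL counts as "").
   The caller owns the result and must free() it. Returns NULL on failure. */
char *concat(const char *s1, const char *s2)
{
    size_t len1 = s1 ? strlen(s1) : 0;
    size_t len2 = s2 ? strlen(s2) : 0;
    char *s = malloc(len1 + len2 + 1);

    if (!s) {
        return NULL;
    }
    if (len1) {
        memcpy(s, s1, len1);
    }
    if (len2) {
        memcpy(s + len1, s2, len2);
    }
    s[len1 + len2] = '\0';
    return s;
}
```

Even in this tidy form, the questions of ownership and allocation failure remain the caller's problem.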

Formatted output

In C, the main way to produce formatted output is the printf() function:

    #include <stdio.h>

    printf( "Calling all cars %d times!\n" , ntimes );

What do we write in D? Almost the same:

    import std.stdio;

    writefln( "Calling all cars %s times!" , ntimes );

But unlike printf, writefln is type-safe: every argument carries its type, so %s formats any value correctly, and a mismatch between a format specifier and its argument is reported as an error instead of causing undefined behavior. If the format string is passed as a template argument, as in writefln!"..."( ntimes ), the check happens at compile time.

Calling functions before their declaration

In C, the compiler does not let you call a function before it has seen its declaration, so you either have to move the function itself or, if that is not possible, insert a forward declaration telling the compiler that the function will be defined later:

    void forwardfunc();

    void myfunc() {
        forwardfunc();
    }

    void forwardfunc() {
        ...
    }

The D compiler analyzes the whole file and ignores the order of declarations in the source code:

    void myfunc() {
        forwardfunc();
    }

    void forwardfunc() {
        ...
    }

Functions without Arguments

In C:

    void foo( void );

In D:

    void foo() {
        ...
    }

Exit from multiple code blocks

In C, the break and continue statements only affect the innermost loop. To leave several blocks of code at once, you have to use goto:

    for( i = 0 ; i < 10 ; i++ ) {
        for( j = 0 ; j < 10 ; j++ ) {
            if( j == 3 ) goto Louter;
            if( j == 4 ) goto L2;
        }
        L2:;
    }
    Louter:;

In D, you can label a loop and then break out of it, or continue it, from any nesting depth:

    Louter:
    for( i = 0 ; i < 10 ; i++ ) {
        for( j = 0 ; j < 10 ; j++ ) {
            if (j == 3) break Louter;
            if (j == 4) continue Louter;
        }
    }
    // break Louter goes here

Structure namespace

In C, it is somewhat annoying that structures have a separate namespace, so you have to write the struct keyword before the structure name every time. That is why the typical way to declare a structure is:

    typedef struct ABC { ... } ABC;

In D, struct names live in the same namespace as all other declarations, so it is enough to write:

    struct ABC { ... }

Branching by string values (for example, processing command line arguments)

In C, you have to set up an array of strings, a list of constants kept in sync with it, a sequential search through the array for the desired string, and finally a switch over those constants:

    #include <string.h>

    void dostring( char *s ) {
        enum Strings { Hello, Goodbye, Maybe, Max };
        static char *table[] = { "hello", "goodbye", "maybe" };
        int i;

        for( i = 0 ; i < Max ; i++ ) {
            if( strcmp( s , table[i] ) == 0 ) break;
        }

        switch( i ) {
            case Hello: ...
            case Goodbye: ...
            case Maybe: ...
            default: ...
        }
    }

With a large number of options it becomes hard to keep these three data structures in sync, which leads to errors. Besides, a linear scan is inefficient when there are many options, so even more complex code is needed for a non-linear lookup, such as a binary search or a hash table.

D, on the other hand, extends switch to strings, which simplifies the source code and lets the compiler generate efficient machine code:

    void dostring( string s ) {
        switch( s ) {
            case "hello": ...
            case "goodbye": ...
            case "maybe": ...
            default: ...
        }
    }
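On the C side, one common way to reduce the synchronization problem is to keep each string next to its constant in a single table. The sketch below assumes that approach; the names Command and lookup_command are hypothetical.

```c
#include <stddef.h>
#include <string.h>

enum Command { CMD_HELLO, CMD_GOODBYE, CMD_MAYBE, CMD_UNKNOWN };

/* Each row pairs a string with its constant, so the two cannot drift apart. */
static const struct {
    const char  *name;
    enum Command id;
} commands[] = {
    { "hello",   CMD_HELLO   },
    { "goodbye", CMD_GOODBYE },
    { "maybe",   CMD_MAYBE   },
};

/* Linear search; returns CMD_UNKNOWN when s matches no entry. */
enum Command lookup_command(const char *s)
{
    for (size_t i = 0; i < sizeof commands / sizeof commands[0]; i++) {
        if (strcmp(s, commands[i].name) == 0) {
            return commands[i].id;
        }
    }
    return CMD_UNKNOWN;
}
```

The result can be fed straight into a switch statement. The lookup is still linear, so the efficiency concern for large tables remains.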

Alignment of structure fields

In C, alignment is controlled through compiler options and affects the whole program at once, and woe betide you if you forget to recompile some module or library. To work around this, #pragma pack preprocessor directives are used, but they are not portable and depend heavily on the compiler:

    #pragma pack(1)
    struct ABC {
        ...
    };
    #pragma pack()

In D, a special syntax lets you fine-tune how individual fields are aligned (by default, fields are aligned in a C-compatible manner):

    struct ABC {
        int z;              // z is aligned to the default

        align(1) int x;     // x is byte aligned
        align(4) {
            ...             // declarations in {} are dword aligned
        }
        align(2):           // switch to word alignment from here on

        int y;              // y is word aligned
    }

That said, C11 did introduce the alignas specifier:

    #include <stdalign.h>

    struct data {
        char x;
        alignas(128) char cacheline[128]; // over-aligned array of char, not array of over-aligned chars
    };

Anonymous structures and unions

Before C11, C required every nested structure and union member to have a name, even when the name was unnecessary:

    struct Foo {
        int i;
        union Bar {
            struct Abc {
                int x;
                long y;
            } _abc;
            char *p;
        } _bar;
    };

    #define x _bar._abc.x
    #define y _bar._abc.y
    #define p _bar.p

    struct Foo f;

    f.i;
    f.x;
    f.y;
    f.p;

This code is not just cumbersome: it relies on macros to hide the internal structure, so a symbolic debugger cannot tell what is going on, and the macros have global scope rather than being limited to the structure.

D (like C11) supports anonymous structures and unions, which lets nested entities be expressed more naturally while keeping a flat interface:

    struct Foo {
        int i;
        union {
            struct {
                int x;
                long y;
            }
            char* p;
        }
    }

    Foo f;

    f.i;
    f.x;
    f.y;
    f.p;
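Since C11 offers the same feature, the structure can also be written in C without macros. The following is a sketch assuming a C11 compiler; the function foo_demo is hypothetical and exists only to show the flat member access.

```c
struct Foo {
    int i;
    union {                 /* anonymous union (C11) */
        struct {            /* anonymous struct (C11) */
            int  x;
            long y;
        };
        char *p;
    };
};

/* Members of the anonymous struct and union are reached directly through f. */
int foo_demo(void)
{
    struct Foo f = { .i = 1 };

    f.x = 10;
    f.y = 20;
    return f.i + f.x + (int)f.y;
}
```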

Defining structures and variables

In C, you can declare both a structure and a variable in a single declaration:

    struct Foo { int x; int y; } foo;

Or separately:

    struct Foo { int x; int y; }; // note terminating ;
    struct Foo foo;

D always uses separate declarations:

    struct Foo { int x; int y; } // note there is no terminating ;
    Foo foo;

Getting the offset of a structure field

In C, macros are used once again:

    #include <stddef.h>

    struct Foo { int x; int y; };

    off = offsetof( struct Foo , y );

In D, every field has a special property:

    struct Foo { int x; int y; }

    off = Foo.y.offsetof;
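As a small illustration of what the offset means, the C sketch below returns the offset of y and can be used to reach the field through a raw byte pointer. The function name foo_y_offset is hypothetical.

```c
#include <stddef.h>

struct Foo {
    int x;
    int y;
};

/* Byte offset of y from the start of struct Foo, known at compile time. */
size_t foo_y_offset(void)
{
    return offsetof(struct Foo, y);
}
```

Adding this offset to the address of a struct Foo object, viewed as a char pointer, yields the address of its y field.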

Initializing unions

In C, a plain initializer always initializes the first member, which can lead to hidden bugs when the members or their order change:

    union U { int a; long b; };
    union U x = { 5 }; // initialize member 'a' to 5

In D, you state explicitly which field receives the value:

    union U { int a; long b; }
    U x = { a : 5 };

Initialization of structures

In C (before C99), fields are initialized in the order in which they are declared. That is not a problem for small structures, but it becomes a real headache for large ones, and whenever the order or the set of fields has to change:

    struct S { int a; int b; int c; int d; };
    struct S x = { 5 , 3 , 2 , 10 };

In D (and in C99) you can also initialize fields in order, but it is better to name the fields being initialized explicitly:

    struct S { int a; int b; int c; int d; }
    S x = { b : 3 , a : 5 , c : 2 , d : 10 };
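For reference, the C99 form uses a leading dot before each field name. A minimal sketch, with a hypothetical helper make_s:

```c
struct S {
    int a;
    int b;
    int c;
    int d;
};

/* Fields are named explicitly, so reordering the declaration stays safe.
   Any field left out of the initializer (here: d) is set to zero. */
struct S make_s(void)
{
    struct S x = { .b = 3, .a = 5, .c = 2 };
    return x;
}
```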

Array initialization

In C, arrays are initialized in the order of the elements:

 int a[3] = { 3 , 2 , 2 };

In C, the braces around nested arrays are optional:

    int b[3][2] = { 2,3 , { 6 , 5 } , 3,4 };

In D, elements are of course also initialized in order, but you can specify indexes explicitly. The following declarations all produce the same result:

    int[3] a = [ 3, 2, 0 ];
    int[3] a = [ 3, 2 ];               // unsupplied initializers are 0, just like in C
    int[3] a = [ 2 : 0, 0 : 3, 1 : 2 ];
    int[3] a = [ 2 : 0, 0 : 3, 2 ];    // if not supplied, the index is the previous one plus one.

Explicit indexes are very useful when the indexes are values of an enumeration:

    enum color { black, red, green }

    int[3] c = [ color.black : 3, color.green : 2, color.red : 5 ];

Brackets around nested arrays are required:

    int[2][3] b = [ [ 2 , 3 ] , [ 6 , 5 ] , [ 3 , 4 ] ];
    int[2][3] b = [ [ 2 , 6 , 3 ] , [ 3 , 5 , 4 ] ]; // error

Escaping special characters in strings

In C, the backslash is awkward to use, because it starts an escape sequence and therefore has to be doubled:

    char file[] = "c:\\root\\file.c";                     // c:\root\file.c
    char quoteString[] = "\"[^\\\\]*(\\\\.[^\\\\]*)*\"";  // /"[^\\]*(\\.[^\\]*)*"/

In D, in addition to ordinary strings with C-style escaping, there are so-called "raw strings", in which escaping is disabled and you get exactly what you typed:

    string file = r"c:\root\file.c";                  // c:\root\file.c
    string quotedString = `"[^\\]*(\\.[^\\]*)*"`;     // "[^\\]*(\\.[^\\]*)*"

ASCII versus multibyte encodings

C uses a separate character type, wchar_t, and a special L prefix for string literals made of "wide characters":

    #include <wchar.h>

    char foo_ascii[] = "hello";
    wchar_t foo_wchar[] = L"hello";

This makes it hard to write generic code that works with different character types. The problem is solved with special (Windows-specific) macros that add the necessary conversions:

    #include <tchar.h>

    TCHAR string[] = _T( "hello" );

The D compiler infers the type of a string literal from the context in which it is used, relieving the programmer of the burden of specifying character types by hand:

    string  utf8  = "hello"; // UTF-8 string
    wstring utf16 = "hello"; // UTF-16 string
    dstring utf32 = "hello"; // UTF-32 string

There are, however, special suffixes that state the character type of a string literal:

    auto str    = "hello";  // UTF-8 string
    auto _utf8  = "hello"c; // UTF-8 string
    auto _utf16 = "hello"w; // UTF-16 string
    auto _utf32 = "hello"d; // UTF-32 string

Mapping an enumeration to an array

In C, you declare the enumeration and the array separately, which becomes rather hard to maintain as the number of elements grows:

    enum COLORS { red , blue , green , max };
    char *cstring[ max ] = { "red" , "blue" , "green" };

In D, such a mapping is written as key-value pairs, which is much easier to maintain:

    enum COLORS { red, blue, green }

    string[ COLORS.max + 1 ] cstring = [
        COLORS.red : "red",
        COLORS.blue : "blue",
        COLORS.green : "green",
    ];
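C99 designated initializers allow a similar key-value layout in C. The sketch below assumes a C99 compiler; the names Color, color_names, and color_name are hypothetical.

```c
enum Color { COLOR_RED, COLOR_BLUE, COLOR_GREEN, COLOR_COUNT };

/* Each string is tied to its enumerator, so the order of rows is irrelevant. */
static const char *const color_names[COLOR_COUNT] = {
    [COLOR_GREEN] = "green",
    [COLOR_RED]   = "red",
    [COLOR_BLUE]  = "blue",
};

/* Returns the name for c, or "unknown" for values outside the enumeration. */
const char *color_name(enum Color c)
{
    return ((unsigned)c < COLOR_COUNT) ? color_names[c] : "unknown";
}
```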

Creating new types

In C, typedef does not actually create a new type, only an alias:

    typedef void *Handle;

    void foo( void * );
    void bar( Handle );

    Handle h;
    foo( h ); // coding bug not caught
    bar( h ); // ok

On top of that, you have to use macros to set a default value:

    #define HANDLE_INIT ( (Handle) -1 )

    Handle h = HANDLE_INIT;
    h = func();
    if( h != HANDLE_INIT ) {
        ...
    }

To really create a new type in C, one that the type checker tells apart from the original, you need to wrap it in a structure:

    struct Handle__ { void *value; };
    typedef struct Handle__ Handle;

    void foo( void * );
    void bar( Handle );

    Handle h;
    foo( h ); // compile error: a struct does not convert to void *
    bar( h ); // ok

The default value then has to be kept in a global variable:

    struct Handle__ HANDLE_INIT;

    // call this function upon startup
    void init_handle() {
        HANDLE_INIT.value = (void *)-1;
    }

    Handle h = HANDLE_INIT;
    h = func();
    if( memcmp( &h , &HANDLE_INIT , sizeof( Handle ) ) != 0 ) {
        ...
    }

D's metaprogramming is powerful enough that typedef is implemented in the standard library, from which you simply import it:

    import std.typecons;

    alias Handle = Typedef!( void* );

    void foo( void* );
    void bar( Handle );

    Handle h;
    foo( h ); // compile error
    bar( h ); // ok

The second parameter of the Typedef template specifies the default value, which ends up in init, the standard property of every type:

    alias Handle = Typedef!( void* , cast( void* ) -1 );

    Handle h;
    h = func();
    if( h != Handle.init ) {
        ...
    }

Comparison of structures

C has no simple way to compare two structures, so you have to compare memory ranges:

    #include <string.h>

    struct A x , y;
    ...
    if( memcmp( &x , &y , sizeof( struct A ) ) == 0 ) {
        ...
    }

The lack of type checking is not the most serious problem with this code. For performance reasons, structure fields are stored aligned to machine-word boundaries, and the C compiler does not guarantee that the gaps between fields are free of garbage left over from whatever was previously stored in that memory. As a result, identical structures may be reported as different.

In D, you simply compare the values and the compiler takes care of the rest (D default-initializes memory, so under the hood it can often use a fast comparison of memory ranges):

    A x , y;
    ...
    if( x == y ) {
        ...
    }
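In C, the usual way around the padding problem is to compare the fields one by one. A minimal sketch, assuming a two-field structure; the layout of struct A and the name a_equal are hypothetical:

```c
#include <stdbool.h>

struct A {
    char tag;     /* on most platforms, padding bytes follow this field */
    int  value;
};

/* Compares field by field, so padding bytes never influence the result. */
bool a_equal(const struct A *x, const struct A *y)
{
    return x->tag == y->tag && x->value == y->value;
}
```

The price is that the function has to be updated by hand whenever a field is added, which the D comparison avoids.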

String comparison

In C, a special function is used that compares bytes one after another up to the zero byte that terminates every string:

    #include <string.h>

    char str[] = "hello";

    if( strcmp( str , "betty" ) == 0 ) { // do strings match?
        ...
    }

In D, you simply use the standard comparison operator:

    string str = "hello";

    if( str == "betty" ) {
        ...
    }

A string in D is an array of characters that carries its own length, which allows strings to be compared far more efficiently, by comparing memory ranges. Moreover, D supports relational operators on strings:

    string str = "hello";

    if( str < "betty" ) {
        ...
    }

Sorting arrays

Although many C programmers reinvent sorting from time to time, the right way is to use the library function qsort():

    #include <stdlib.h>

    int compare( const void *p1 , const void *p2 ) {
        const type *t1 = (const type *) p1;
        const type *t2 = (const type *) p2;

        return ( *t1 > *t2 ) - ( *t1 < *t2 ); // avoids the overflow of *t1 - *t2
    }

    type array[10];
    ...
    qsort( array , sizeof( array ) / sizeof( array[0] ), sizeof( array[0] ), compare );

Unfortunately, the compare() function has to be declared explicitly and must match the type being sorted.

D has a powerful library of algorithms that works with both built-in and user-defined types:

    import std.algorithm;

    type[] array;
    ...
    sort( array );                          // sort array in-place
    array.sort!"a>b";                       // using custom compare function
    array.sort!( ( a , b ) => ( a > b ) );  // same as above
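A concrete C version for int shows one detail worth getting right: a comparator that returns the difference of its operands can overflow for values far apart. The sketch below uses a three-way comparison instead; the names compare_int and sort_ints are hypothetical.

```c
#include <stdlib.h>

/* Three-way comparison that cannot overflow, unlike returning a - b. */
static int compare_int(const void *p1, const void *p2)
{
    int a = *(const int *)p1;
    int b = *(const int *)p2;

    return (a > b) - (a < b);
}

/* Sorts count ints in ascending order, in place. */
void sort_ints(int *array, size_t count)
{
    qsort(array, count, sizeof array[0], compare_int);
}
```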

String literals

C does not support multi-line string constants, but something similar can be achieved with escape sequences and line continuations:

    "This text \"spans\"\n\
    multiple\n\
    lines\n"

In D, only the quotation marks need to be escaped, so text can be pasted into the source code almost as is:

    "This text \"spans\"
    multiple
    lines
    "

Traversing data structures

Consider a simple function that searches for a string in a binary tree. In C, we are forced to create a helper function, membersearchx, that does the actual traversal. So that it does something useful along the way, we hand it a pointer to its context in the form of a special Paramblock structure:

    #include <string.h>

    struct Symbol {
        char *id;
        struct Symbol *left;
        struct Symbol *right;
    };

    struct Paramblock {
        char *id;
        struct Symbol *sm;
    };

    static void membersearchx( struct Paramblock *p , struct Symbol *s ) {
        while( s ) {
            if( strcmp( p->id , s->id ) == 0 ) {
                if( p->sm ) error( "ambiguous member %s\n" , p->id );
                p->sm = s;
            }
            if( s->left ) {
                membersearchx( p , s->left );
            }
            s = s->right;
        }
    }

    struct Symbol *symbol_membersearch( struct Symbol *table[] , int tablemax , char *id ) {
        struct Paramblock pb;
        int i;

        pb.id = id;
        pb.sm = NULL;
        for( i = 0 ; i < tablemax ; i++ ) {
            membersearchx( &pb , table[i] );
        }
        return pb.sm;
    }

In D, everything is much simpler: it is enough to declare the helper function inside the main one, and the helper gets access to the enclosing function's variables, so we do not have to pass extra context to it through parameters:

    class Symbol {
        char[] id;
        Symbol left;
        Symbol right;
    }

    Symbol symbol_membersearch( Symbol[] table , char[] id ) {
        Symbol sm;

        void membersearchx( Symbol s ) {
            while( s ) {
                if( id == s.id ) {
                    if( sm ) error( "ambiguous member %s\n" , id );
                    sm = s;
                }
                if( s.left ) {
                    membersearchx( s.left );
                }
                s = s.right;
            }
        }

        for( int i = 0 ; i < table.length ; i++ ) {
            membersearchx( table[i] );
        }
        return sm;
    }

Dynamic closures

Consider a simple container type. To be reusable, it needs to be able to apply some third-party code to each element. In C, this is done by passing a function pointer, which is called with each element as a parameter. In most cases the function also needs some context with state. For example, let's pass a function that computes the maximum of the numbers in the collection:

    #include <limits.h>

    void apply( void *p , int *array , int dim , void (*fp) ( void* , int ) ) {
        for( int i = 0 ; i < dim ; i++ ) {
            fp( p , array[i] );
        }
    }

    struct Collection {
        int array[10];
    };

    void comp_max( void *p , int i ) {
        int *pmax = (int *) p;
        if( i > *pmax ) {
            *pmax = i;
        }
    }

    void func( struct Collection *c ) {
        int max = INT_MIN;
        apply( &max , c->array , sizeof( c->array ) / sizeof( c->array[0] ) , comp_max );
    }

In D, you can pass a so-called delegate, a function bound to some context. When you pass a reference to a function that depends on the context in which it is declared, it is actually a delegate that is passed:

    class Collection {
        int[10] array;

        void apply( void delegate( int ) fp ) {
            for( int i = 0 ; i < array.length ; i++ ) {
                fp( array[i] );
            }
        }
    }

    void func( Collection c ) {
        int max = int.min;

        void comp_max( int i ) {
            if( i > max ) max = i;
        }

        c.apply( &comp_max );
    }

Or, more simply, with an anonymous delegate:

    void func( Collection c ) {
        int max = int.min;

        c.apply( ( int i ) {
            if( i > max ) max = i;
        } );
    }

Variable number of arguments

A simple example of a C function that sums all the arguments passed to it, however many there are:

    #include <stdio.h>
    #include <stdarg.h>

    int sum( int dim , ... ) {
        int i;
        int s = 0;
        va_list ap;

        va_start( ap , dim );
        for( i = 0 ; i < dim ; i++) {
            s += va_arg( ap , int );
        }
        va_end( ap );
        return s;
    }

    int main() {
        int i;

        i = sum(3, 8 , 7 , 6 );
        printf( "sum = %d\n" , i );
        return 0;
    }

As we can see, the call has to state explicitly how many arguments it passes, which is not only redundant from the programmer's point of view but also a potential source of subtle bugs. And then there is the traditional problem: checking the types of the arguments rests entirely on the programmer's conscience.

In D, the special "..." construct lets a function take several arguments as a single typed array:

    import std.stdio;

    int sum( int[] values ... ) {
        int s = 0;

        foreach( int x ; values ) {
            s += x;
        }
        return s;
    }

    int main() {
        int i = sum( 8 , 7 , 6 );

        writefln( "sum = %d", i );
        return 0;
    }

Conversely, you can pass an array to a function that accepts a variable number of arguments:

    int main() {
        int[] ints = [ 8 , 7 , 6 ];
        int i = sum( ints );

        writefln( "sum = %d", i );
        return 0;
    }
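C99 can approximate the typed-array approach with a compound literal, letting sizeof count the arguments instead of the programmer. This is only a sketch of that idea, not a standard facility; the names sum_array and SUM are hypothetical.

```c
#include <stddef.h>

/* Sums count values starting at values[0]. */
int sum_array(const int *values, size_t count)
{
    int s = 0;

    for (size_t i = 0; i < count; i++) {
        s += values[i];
    }
    return s;
}

/* Hypothetical convenience macro: builds a temporary int array from the
   arguments (a C99 compound literal) and lets sizeof count the elements.
   The operand of sizeof is not evaluated, so arguments run only once. */
#define SUM(...) \
    sum_array((const int[]){ __VA_ARGS__ }, \
              sizeof((const int[]){ __VA_ARGS__ }) / sizeof(int))
```

With this in place, `SUM(8, 7, 6)` needs no explicit count, and the compiler checks that each argument converts to int. An existing array can still be passed to sum_array directly.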

Conclusion

In this article we looked mostly at the low-level capabilities of the D language, which in many respects are a small evolutionary step over C. Stay tuned.

Source: https://habr.com/ru/post/276227/
