The most correct safe printf

Below the cut is a story about how badly I was disappointed when I first met user-defined literals (from the new standard), how I nevertheless implemented the function named in the title, how I figured out constexpr along the way, and how I later rehabilitated those literals.

Backstory

Back in 2009, rumors appeared that the upcoming user-defined literals would let you do absolutely anything, in particular parse a string at compile time. (By the way, thanks to ikalnitsky for whetting my appetite; I recommend taking a look at his work before reading on.) I mean the variant of the literal operator that is a template. But no such luck: the template form is allowed only for numeric literals, so the only thing you can parse at compile time this way is a number.
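To make the restriction concrete, here is a minimal sketch of the template form of a literal operator. The suffix _digits is a hypothetical name chosen for this illustration only. The characters of a numeric literal arrive as a template parameter pack, while for a string literal this form is simply not considered.

```cpp
#include <cstddef>

// Hypothetical suffix, for illustration only: yields the number of
// characters the numeric literal was spelled with.
template< char... Cs >
constexpr std::size_t operator""_digits()
{
    return sizeof...(Cs);
}

// Each character of the literal becomes one template argument.
static_assert( 1234_digits == 4, "four characters: '1' '2' '3' '4'" );
static_assert( 0x1F_digits == 4, "the 0x prefix is part of the pack" );
static_assert( 1.5_digits == 3, "floating literals work as well" );

// By contrast, "abc"_digits does not compile: for string literals only
// the (const char*, std::size_t) form of the operator is looked up.
```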

The setup. First steps towards a solution

That upset me. But after some googling, I learned that a string can still be parsed at compile time, without the template literal operator.
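As a warm-up, the basic trick can be sketched with a tiny constexpr function that walks a string literal received as a reference to an array. The name countChar is made up for this example; it is not part of the solution that follows.

```cpp
#include <cstddef>

// Counts occurrences of c in a string literal. C++11 constexpr functions
// are limited to a single return statement, hence the recursion.
template< std::size_t N >
constexpr std::size_t countChar(const char (&str)[N], char c, std::size_t current = 0)
{
    return current >= N
        ? 0
        : (str[current] == c ? 1 : 0) + countChar(str, c, current + 1);
}

// Both checks are evaluated entirely by the compiler.
static_assert( countChar("%c %d\n", '%') == 2, "two percent signs" );
static_assert( countChar("no conversions", '%') == 0, "none" );
```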
So, here is the first part of the safe printf solution.

    #include <cstddef>
    #include <cstdio>

    // Maps an argument type to the conversion character that accepts it.
    template< class > struct FormatSupportedType;

    #define SUPPORTED_TYPE(C, T) \
        template<> struct FormatSupportedType< T > { \
            constexpr static bool supports(char c) { return c == C; } }

    SUPPORTED_TYPE('c', char);
    SUPPORTED_TYPE('d', int);

    // No arguments left: the rest of the format may contain only plain
    // characters and "%%".
    template< std::size_t N >
    constexpr bool checkFormatHelper(const char (&format)[N], std::size_t current)
    {
        return
            current >= N ?
                true
            : format[current] != '%' ?
                checkFormatHelper( format, current + 1 )
            : format[current + 1] == '%' ?
                checkFormatHelper( format, current + 2 )
            :
                false;
    }

    // At least one argument left: the next conversion must accept its type.
    template< std::size_t N, class T, class... Ts >
    constexpr bool checkFormatHelper(const char (&format)[N], std::size_t current,
                                     const T& arg, const Ts & ... args)
    {
        return
            current >= N ?
                false
            : format[current] != '%' ?
                checkFormatHelper( format, current + 1, arg, args... )
            : (format[current] == '%' && format[current + 1] == '%') ?
                checkFormatHelper( format, current + 2, arg, args... )
            :
                FormatSupportedType< T >::supports(format[current + 1]) &&
                checkFormatHelper( format, current + 2, args... );
    }

    template< std::size_t N, class... Ts >
    constexpr bool checkFormat(const char (&format)[N], const Ts & ... args)
    {
        return checkFormatHelper( format, 0, args... );
    }

    int main()
    {
        static_assert( checkFormat("%c %d\n", 'v', 1), "Format is incorrect" );
    }

The logic is, I think, quite clear: we walk through the format and, depending on how the conversion characters match the argument types, return the result. A helper class template (FormatSupportedType) says whether a type is supported and which character it corresponds to. As a result, the correctness of the format can be verified at compile time (provided, of course, that the format is known at compile time). After that, the classic printf prints the result.

    template< std::size_t N, class... ARGS >
    int safe_printf(const char (&format)[N], ARGS... args)
    {
        static_assert( checkFormat(format, args... ), "Format is incorrect" );
        return printf( format, args... );
    }

But my gcc-4.7 refuses to swallow it! I was about to get upset again, but then inspiration struck. To go further, we need to understand constexpr. What follows is, I think, the most interesting part of the article.

The climax. Understanding constexpr

What did we have before? There was a compilation stage and an execution stage, and (although everyone knows this) type checking happens at the compilation stage.
What do we have now? Now there is constexpr, which lets functions be executed at the compilation stage, which sounds like a pun. So we need more precise terms: instead of compilation and execution in general, we will talk about the compilation and execution of specific parts of the program (in our case functions, although objects can also be used during compilation). For example: "compilation of function f", "execution of function f", "compilation of the whole project", "execution of the whole project".
In other words, the compilation stage of the whole project now breaks down into the compilation and execution of individual program units. Consider an example:

    template< int N >
    constexpr int f(int n)
    {
        return N + n;
    }

    int main()
    {
        constexpr int i0 = 1;
        constexpr int i1 = f<i0>(i0);
        constexpr int i2 = f<i1>(i1);
        static_assert(i2 == 4, "");
    }

I will say right away that it compiles ~~but does nothing useful~~. Let's take a closer look at how the function main() is compiled. First the variable i0 gets its value; then that value is used to compute i1. To compute i1 we have to execute f<i0>(i0), but to execute it we first have to compile it, and compiling it requires the value of i0. The same goes for f<i1>(i1). So the compilation of main() contains, in order: the compilation of f<1>(int), then its execution, then the compilation of f<2>(int), and then its execution.
What does this give us? A function marked constexpr behaves like the most ordinary function. Look at f: N is known at the stage of its compilation, while n is known only at the stage of its execution.
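The distinction can be checked directly. In the sketch below (twice, plusN and twiceAtRuntime are names made up for the illustration), a template parameter may be used in a static_assert inside the function, while an ordinary parameter may not, even though the function itself can be executed during the compilation of its caller.

```cpp
// An ordinary parameter: inside the body n is never a constant expression,
// so a static_assert(n > 0, "") placed here would not compile.
constexpr int twice(int n)
{
    return 2 * n;
}

// A template parameter: N is known when plusN<N> itself is compiled,
// so it may appear in a static_assert inside the body.
template< int N >
constexpr int plusN(int n)
{
    static_assert( N > 0, "N is known at the compilation of plusN<N>" );
    return N + n;
}

// Executed while the surrounding code is being compiled:
static_assert( twice(21) == 42, "evaluated by the compiler" );
static_assert( plusN<1>(1) == 2, "evaluated by the compiler" );

// Executed at run time: the argument is not a constant here.
int twiceAtRuntime(int fromUser)
{
    return twice(fromUser);
}
```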

The outcome. Implementing a safe printf

That's why it didn't want to compile!

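    template< std::size_t N, class... ARGS >
    int safe_printf(const char (&format)[N], ARGS... args)
    {
        // format is an ordinary parameter: its contents are not known
        // while safe_printf itself is being compiled.
        static_assert( checkFormat(format, args... ), "Format is incorrect");
        return printf( format, args... );
    }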

static_assert is resolved when the safe_printf function is compiled, while format becomes known only when it is executed (even if, from the point of view of some other code, that execution happens at the compilation stage).
How do we get around this? Either not at all, or by putting the format characters into template parameters so that they are visible at the compilation stage (and, as we remember, user-defined literals do not allow that for strings), or by recalling that when all the super cool, powerful and invincible tools of C++ (even C++11) turn out to be helpless, macros come on stage!

    #define safe_printf(FORMAT, ...) \
        static_assert(checkFormat( FORMAT, __VA_ARGS__ ), "Format is incorrect"); \
        printf(FORMAT, __VA_ARGS__)

    int main()
    {
        safe_printf("%c %d\n", 'v', 1);
    }

Victory!
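For comparison, the other way out mentioned above, putting the format characters into template parameters, looks like this when the characters are written out by hand. This is only a sketch under simplifying assumptions: Chars, CountPercent and checkedArgCount are hypothetical names, and the check merely counts percent signs instead of validating types.

```cpp
#include <cstddef>

// The format lives in the type, one character per template argument.
template< char... Cs > struct Chars { };

// Counts '%' characters in a pack by recursion over the pack.
template< char... Cs > struct CountPercent;

template<>
struct CountPercent<>
{
    static constexpr std::size_t value = 0;
};

template< char C, char... Cs >
struct CountPercent< C, Cs... >
{
    static constexpr std::size_t value =
        (C == '%' ? 1 : 0) + CountPercent< Cs... >::value;
};

// Only the TYPE of the first argument is used, so the static_assert
// inside the function depends on nothing that is a run-time value.
template< char... Cs, class... ARGS >
constexpr std::size_t checkedArgCount(Chars<Cs...>, const ARGS&...)
{
    static_assert( CountPercent<Cs...>::value == sizeof...(ARGS),
                   "Number of conversions and arguments differ" );
    return sizeof...(ARGS);
}

static_assert( checkedArgCount(Chars<'%','c',' ','%','d'>(), 'v', 1) == 2,
               "two conversions, two arguments" );
```

Spelling a format as Chars<'%','c',' ','%','d'> is of course unusable in practice, which is why the rest of the article looks for a way to generate that pack from a literal.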

The denouement: what really happened, or cramming in the uncrammable

As usual, the happy ending was shown first, and now comes how it all really went. Below is the proper implementation of a safe printf.

    template< char... > struct TemplateLiteral { };

    template< char... FORMAT, class... ARGS >
    int safe_printf_2(TemplateLiteral<FORMAT...>, ARGS... args)
    {
        constexpr char format[] = {FORMAT... , '\0'};
        static_assert( checkFormat(format, args... ), "Format is incorrect");
        return printf( format, args... );
    }

    int main()
    {
        safe_printf_2(_("%c %d\n"), 'v', 2);
    }

That is, the function receives a variable of which only the TYPE interests us (NOT the value), followed by the arguments to print. What remains is a mechanism for turning a literal into a template. Ideally, the context that contains the literal would also contain a pack of indices for it (something like enumerate), so that the literal could be repacked, like this:

    template< std::size_t... INDXs >
    //...
    TemplateLiteral<"some literal"[INDXs]...>
    //...

But the length of the literal and the length of the pack must match. A pack can only come from the outside, so the literal has to be passed to the outside as well. There is NO mechanism for inserting it into a template as a parameter, so it gets passed as an ordinary function argument, and therefore it is not known at the compilation stage of the function that is supposed to wrap it into a template (templates are types, and types belong to compilation). In short, it is impossible.
But let's remember macros once more. We can ask boost::preprocessor to generate the list of indices. Their number will of course be fixed, and it can be changed only at the preprocessing stage. Taking an element of the literal by index at compile time must be guarded against going out of bounds, so some protective mechanism is needed, and the tail of the string has to be trimmed afterwards. We also need to check that the whole string was captured, i.e. that the programmer has not entered a literal that is too long. Below is the code.

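    #include <cstddef>
    #include <boost/preprocessor/repetition/repeat.hpp>

    template< char... > struct TemplateLiteral { };

    // Cuts the zero padding off the character pack and checks that the
    // whole literal (LEN characters) was captured.
    template< std::size_t LEN, char CHAR, char... CHARS >
    struct TemplateLiteralTrim
    {
    private:
        // Helper< more, TemplateLiteral<collected...>, remaining... >
        // moves characters from "remaining" to "collected" until the
        // first '\0' is reached.
        template< bool, class, char... > struct Helper;

        template< char... C1, char... C2 >
        struct Helper< false, TemplateLiteral<C1...>, C2... >
        {
            // End of the string: the collected length must equal LEN.
            static_assert(sizeof...(C1) == LEN, "Literal is too large");
            typedef TemplateLiteral<C1...> Result;
        };

        template< char... C1, char c1, char c2, char... C2 >
        struct Helper< true, TemplateLiteral<C1...>, c1, c2, C2... >
        {
            typedef typename Helper< (bool)c2, TemplateLiteral<C1..., c1>, c2, C2...>::Result Result;
        };

    public:
        // The extra '\0' guarantees that the walk terminates.
        typedef typename Helper<(bool)CHAR, TemplateLiteral<>, CHAR, CHARS..., '\0' >::Result Result;
    };

    template< class T, std::size_t N >
    constexpr inline std::size_t sizeof_literal( const T (&)[N] )
    {
        return N;
    }

    // Returns the N-th character of the literal, or '\0' past its end.
    template< std::size_t M >
    constexpr inline char getNthCharSpec( std::size_t N, const char (&literal)[M] )
    {
        return N < M ? literal[N] : '\0';
    }

    #define GET_Nth_CHAR_FOR_PP(I, N, LIT) ,getNthCharSpec(N, LIT)

    // Takes the first MAX characters of the literal one by one (padding
    // with zeros past its end) and lets TemplateLiteralTrim cut the
    // padding off and verify the length.
    #define TEMPLATE_LITERAL_BASE(MAX, LIT) \
        (typename TemplateLiteralTrim< sizeof_literal(LIT) - 1 \
            BOOST_PP_REPEAT(MAX, GET_Nth_CHAR_FOR_PP, LIT) >::Result())

    // MAX_SYM (the maximum literal length) must be defined as a number
    // before this macro is used.
    #define TEMPLATE_LITERAL(LITERAL) TEMPLATE_LITERAL_BASE(MAX_SYM, LITERAL)

    int main()
    {
        // Usage
        safe_printf_2(TEMPLATE_LITERAL("%c %d\n"), 'v', 2);
    }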
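For a small fixed limit, what the preprocessor repetition produces can be written out by hand, which makes the padding and truncation problems visible. The sketch below assumes a limit of four characters and uses hypothetical names (charAt, LITERAL4); it deliberately skips the trimming and the length check performed by the code above.

```cpp
#include <cstddef>
#include <type_traits>

template< char... > struct TemplateLiteral { };

// Guarded indexing: positions past the end of the literal yield '\0'.
template< std::size_t M >
constexpr char charAt(std::size_t n, const char (&literal)[M])
{
    return n < M ? literal[n] : '\0';
}

// Hand-written equivalent of a four-step preprocessor repetition.
#define LITERAL4(LIT) \
    TemplateLiteral< charAt(0, LIT), charAt(1, LIT), \
                     charAt(2, LIT), charAt(3, LIT) >

// A short literal is padded with zeros up to the fixed limit,
// so the padding has to be trimmed afterwards...
static_assert( std::is_same< LITERAL4("%d"),
                             TemplateLiteral<'%', 'd', '\0', '\0'> >::value,
               "padded with zeros" );

// ...and a long one is silently cut off, which is why its real length
// has to be compared with the number of captured characters.
static_assert( std::is_same< LITERAL4("%c %d"),
                             TemplateLiteral<'%', 'c', ' ', '%'> >::value,
               "truncated to four characters" );
```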

By the way, I found it very interesting to look into boost::preprocessor: I had no idea what it could do (arithmetic, for example). So macros really are a terrible power.

Deleted scenes. Rehabilitating user-defined literals

The time has come to show why I came to respect them (the literals). Once upon a time, about two years ago, I learned about tuples. They seemed very convenient to me, BUT those were the tuples of Python, Nemerle and Haskell. When I found out about C++ tuples, I was very upset by std::get<N>(tuple): ugh, how clumsy, I thought. Ever since, I had wanted a way to get an element through the square-bracket operator instead. And this is where user-defined literals came to the rescue.

    #include <cstddef>
    #include <tuple>

    template< std::size_t > struct Number2Type { };

    template< class... Ts >
    class tupless: public std::tuple<Ts...>
    {
    public:
        template< class... ARGS >
        tupless(ARGS... args): std::tuple<Ts...>(args...) { }

        // The return types are spelled with std::tuple_element so that
        // the const overload really returns a reference to const.
        template< std::size_t N >
        auto operator[](Number2Type<N>) const
            -> typename std::tuple_element< N, std::tuple<Ts...> >::type const&
        { return std::get<N>(*this); }

        template< std::size_t N >
        auto operator[](Number2Type<N>)
            -> typename std::tuple_element< N, std::tuple<Ts...> >::type &
        { return std::get<N>(*this); }
    };

    template< std::size_t N >
    constexpr std::size_t chars_to_int(const char (&array)[N],
                                       std::size_t current = 0,
                                       std::size_t acc = 0)
    {
        return (current >= N || array[current] == 0)
            ? acc
            : chars_to_int(array, current + 1, 10 * acc + array[current] - '0');
    }

    // Note: the compound literal (const char[...]){...} is a GNU extension,
    // not standard C++.
    template<char... Cs>
    constexpr auto operator "" _t()
        -> Number2Type<chars_to_int((const char[1 + sizeof...(Cs)]){Cs..., '\0'})>
    {
        return {};  // converted to the return type declared above
    }

    // checkFormat knows only %c and %d so far; %f has to be registered too.
    SUPPORTED_TYPE('f', float);

    int main()
    {
        tupless<char, int, float> t('x', 10, 12.45);
        safe_printf_2(TEMPLATE_LITERAL("%c %d %f"), t[0_t], t[1_t], t[2_t]);
    }

What is interesting here? First of all, to avoid writing the literal's return type twice (and the type is exactly what interests us), an empty initializer list is returned: the compiler converts it to an object of the declared return type and supplies the constructor call itself.
This user-defined literal is very interesting because its type depends directly on its value: for example, the type of the literal 2_t is Number2Type<2>. So now, I hope, everyone will be comfortable.
It would, of course, be nice to add this to the standard library...
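The dependence of the type on the value can be verified with std::is_same. The sketch below is a variant, not the code above: it computes the number by recursion over the character pack (DigitsToNumber is a hypothetical helper), so no compound literal is needed, and it accepts plain decimal digits only. The suffix is named _idx here to keep it apart from _t.

```cpp
#include <cstddef>
#include <type_traits>

template< std::size_t > struct Number2Type { };

// Folds the decimal digits of the pack into a number, left to right.
template< std::size_t ACC, char... Cs >
struct DigitsToNumber
{
    static constexpr std::size_t value = ACC;   // empty pack: done
};

template< std::size_t ACC, char C, char... Cs >
struct DigitsToNumber< ACC, C, Cs... >
{
    static_assert( C >= '0' && C <= '9', "decimal digits only" );
    static constexpr std::size_t value =
        DigitsToNumber< 10 * ACC + (C - '0'), Cs... >::value;
};

template< char... Cs >
constexpr auto operator""_idx() -> Number2Type< DigitsToNumber<0, Cs...>::value >
{
    return {};  // the type is already spelled out in the return type
}

// The type of the literal is determined by its value.
static_assert( std::is_same< decltype(2_idx), Number2Type<2> >::value,
               "2_idx has type Number2Type<2>" );
static_assert( std::is_same< decltype(10_idx), Number2Type<10> >::value,
               "10_idx has type Number2Type<10>" );
static_assert( !std::is_same< decltype(0_idx), decltype(1_idx) >::value,
               "different values give different types" );
```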

UPDATE: it is better to use the function instead of the macro.

UPDATE: moved the post to "Abnormal programming"; I think it will be more at home here.

Source: https://habr.com/ru/post/142352/
