
Working with binary files in STL style

I would like to describe the solution to a problem that came up while teaching programming to senior high school students and first-year university students. I am writing about it because I believe the experience may be of interest to a wider audience.

Problem statement


Working with binary files is a traditional topic in teaching programming, at least in Russia. Much of this is due to the Pascal programming language, which is widespread in Russian schools and has built-in support for so-called typed files (file of integer, file of real, etc.). In some cases (in advanced programming courses at school or in the first years of university), once the study of C++ begins, there is a desire to solve the same "file of T" processing problems in the new language. The question then arises of which tools to use for this.

Unfortunately, C++ provides only low-level tools for working with binary files: the read / write methods of the standard stream types istream / ostream. Apart from other obvious drawbacks, this prevents full use of STL-style programming (that is, primarily the part of the C++ standard library that deals with algorithms and iterators).
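For comparison, here is a minimal sketch of that low-level style. The helper names write_raw, read_raw and round_trip are made up for this illustration, and an in-memory std::stringstream stands in for a file.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>

// Hypothetical helper: dump the object representation of an int to a stream.
void write_raw(std::ostream& os, int value) {
    os.write(reinterpret_cast<char const*>(&value), sizeof value);
}

// Hypothetical helper: read sizeof(int) bytes back; returns false on failure.
bool read_raw(std::istream& is, int& value) {
    return static_cast<bool>(is.read(reinterpret_cast<char*>(&value), sizeof value));
}

// Write a value and read it back through an in-memory stream.
int round_trip(int value) {
    std::stringstream buf;
    write_raw(buf, value);
    int result = 0;
    bool ok = read_raw(buf, result);
    assert(ok);   // the bytes just written must be readable
    (void)ok;     // silence "unused" warnings when NDEBUG is defined
    return result;
}
```

Every value needs its own pair of casts and an explicit size, and nothing here can be handed to std::copy or another standard algorithm directly.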

So, the task is to make it possible to work with a binary file that stores a sequence of values of type T in the same way as with an STL sequence (vector<T>, etc.). T is assumed to be a fundamental type or a so-called POD type (if you are not familiar with the concept of POD, you can think of fundamental types only throughout).

Possible solutions


ios_base::binary: fail #0

If you have never encountered a similar task (which would be strange!), you may recall something about the ios_base::binary flag, but those who have encountered it know well that the flag does not solve the problem. The language standard says very little about the effect of specifying this flag when opening a stream, but explanations found online say that it simply disables the platform-specific translation of newline characters and a few other characters. That has no direct bearing on our task: formatted output with << still produces text. (The flag is still worth setting when raw bytes are written with read / write, so that those translations do not alter the data.)
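A small sketch of why the flag alone is not enough. The function name formatted_bytes is hypothetical, and a std::ostringstream is used instead of a file, on the assumption that << formats values the same way for both kinds of stream.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Returns the bytes that `os << value` produces on a stream
// constructed with the binary flag.
std::string formatted_bytes(int value) {
    std::ostringstream os(std::ios_base::out | std::ios_base::binary);
    os << value;   // still formatted output: decimal digits, not raw bytes
    return os.str();
}

void demo() {
    // Two characters of text rather than sizeof(int) bytes of object representation.
    assert(formatted_bytes(42) == "42");
}
```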

Static polymorphism: fail #1

People who know the standard library a little beyond the basic level probably remember that the standard file stream types ofstream / ifstream are typedefs for explicit instantiations of the class templates basic_ofstream / basic_ifstream. These templates take the same two type parameters: the character type of the stream (let us call it Ch) and the character traits type, which defaults to std::char_traits<Ch>. For ofstream / ifstream, Ch is char.

The thought immediately arises of instantiating these templates with the type T that we want to read from the binary file as the Ch parameter. The simplest code that tries to read values of type int from such a stream fails at run time with a bad_cast exception. Perhaps something could be changed by writing a specialization of char_traits<int> and passing it as the second parameter of the file stream class template, but this path looked unpromising (writing a specialization of the rather extensive char_traits template for every type T...), so I did not pursue it further.

OOP and dynamic polymorphism: fail #2

After the first failures, you may conclude that the desired behavior cannot be obtained from the standard tools completely for free and that some code will have to be written. Let us try to solve the problem in the OOP paradigm, that is, by writing a couple of new classes. Stream classes, naturally. We inherit from the standard ofstream / ifstream, reusing as many of the inherited definitions as possible, and see what happens. (I note in passing that this task is not pointless in itself, if only because it is given a rather high difficulty rating in the list of exercises in B. Stroustrup's book: exercise 15, section 21.10 in the third and special editions of "The C++ Programming Language".)

From the very beginning it was obvious that the << and >> operations would have to be overloaded for my stream classes. That seemed to be enough. The problem turned out to be the following. To work with a stream through standard library algorithms, you use input / output iterators. According to the STL authors, its facilities are thoroughly generic, so I expected that as soon as my stream class met some implicit library requirements, the library would happily work with it: static polymorphism... In particular, I expected the standard stream iterators to be parameterized by the type of the stream they work with. No such luck! The standard types basic_ostream / basic_istream are hard-coded in the definitions of the I/O iterator templates.
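This can be checked at compile time. The sketch below relies only on the istream_type and ostream_type member typedefs of the standard stream iterators. The template parameters of these iterators are the value type, the character type and the traits (plus a distance type for the input iterator), and none of them names a stream class.

```cpp
#include <istream>
#include <iterator>
#include <ostream>
#include <string>
#include <type_traits>

// The stream type is derived from the character type and traits only.
static_assert(std::is_same<std::istream_iterator<int>::istream_type,
                           std::basic_istream<char, std::char_traits<char> > >::value,
              "istream_iterator is tied to basic_istream");

static_assert(std::is_same<std::ostream_iterator<int>::ostream_type,
                           std::basic_ostream<char, std::char_traits<char> > >::value,
              "ostream_iterator is tied to basic_ostream");
```

A class derived from ifstream or ofstream can still be passed to these iterators, but only as a reference to the standard base class.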

Some hope remains on this path if we notice one feature of the implementation of << and >>: for the fundamental types they are implemented as member functions of the stream class templates. If they were also declared virtual, we could rely on dynamic polymorphism (the standard iterators would refer to my stream objects through the base class), and we would get a partial solution of the original problem that works only for fundamental types (int, double, etc.). However, these member functions are not virtual. One could speculate here about the logic of the standard library's design or the lack thereof (for example, it is known that the STL was not originally meant to use the full power of OOP, inheritance and polymorphism, whereas the stream library is built on OOP...), but let us move on to the final solution.
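The effect of the missing virtual can be sketched as follows. The class raw_ostream and the two helper functions are hypothetical and exist only for this illustration. The class derives from std::ostringstream rather than ofstream so that the bytes written can be inspected in memory.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>

// Hypothetical derived stream whose operator<< writes raw bytes.
class raw_ostream : public std::ostringstream {
public:
    raw_ostream& operator<<(int value) {
        write(reinterpret_cast<char const*>(&value), sizeof value);
        return *this;
    }
};

// Bytes produced when 42 is written through the derived type.
std::size_t bytes_via_derived() {
    raw_ostream s;
    s << 42;                  // raw_ostream::operator<<: sizeof(int) raw bytes
    return s.str().size();
}

// Bytes produced when 42 is written through a reference to the base class,
// which is how the standard iterators see the stream.
std::size_t bytes_via_base() {
    raw_ostream s;
    std::ostream& base = s;
    base << 42;               // std::ostream::operator<<(int): the text "42"
    return s.str().size();
}

void demo() {
    assert(bytes_via_derived() == sizeof(int));
    assert(bytes_via_base() == 2);
}
```

Since the call is resolved by the static type of the stream, the overload in the derived class is never reached from inside ostream_iterator.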

Ad-hoc polymorphism (overload): win


In the end, all that is required is to get special versions of the << and >> operations called, versions that hide the low-level work with files through read / write. It is enough to provide your own overload of these operations and make sure that it is the one that gets called. This can be achieved by using special argument types. We have already failed to manipulate the stream types, so it remains to come up with special types for the values being read and written. This suggests what are usually called "wrappers".

Fortunately, we do not need to write a new wrapper class for every type T: a single class template is enough. It stores a field of the parameter type T and can be converted to a reference to that field, both const and non-const. A constructor with one parameter of type T, not declared explicit, allows values of type T to be converted implicitly to the wrapper type. The default constructor is an STL requirement (istream_iterator needs a default-constructible value type). The resulting code is shown below.
    #include <iostream>

    using std::istream;
    using std::ostream;

    template<typename T>
    class wrap
    {
        T t;
    public:
        wrap() : t() {}
        wrap(T const & t) : t(t) {}
        operator T&() { return t; }
        operator T const &() const { return t; }
    };

    template<typename T>
    istream & operator>>(istream & is, wrap<T> & wt)
    {
        is.read(reinterpret_cast<char *>(&static_cast<T &>(wt)), sizeof(T));
        return is;
    }

    template<typename T>
    ostream & operator<<(ostream & os, wrap<T> const & wt)
    {
        os.write(
            reinterpret_cast<char const *>(&static_cast<T const &>(wt)),
            sizeof(T));
        return os;
    }

The static_cast makes the compiler call the conversion operator defined in the class template body, which yields a reference to the data field, and the reinterpret_cast converts the address of that field to a pointer to char, which is what the low-level read / write functions expect.

Here is an example that demonstrates the use of the wrappers. It bears the imprint of the original idea, namely programming in the STL style.
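    #include <algorithm>
    #include <cassert>
    #include <fstream>
    #include <functional>
    #include <iostream>
    #include <iterator>
    #include <numeric>

    // wrap<T> and its operator>> / operator<< from the previous listing are assumed here

    int main()
    {
        int arr[] = {1, 1, 2, 3, 5, 8};

        // write: binary mode keeps the raw bytes free of newline translation
        std::ofstream out("f.dat", std::ios_base::binary);
        std::copy(arr, arr + 6, std::ostream_iterator< wrap<int> >(out));
        out.close();

        // read: check that the values in the file match arr
        std::ifstream in("f.dat", std::ios_base::binary);
        assert(
            std::inner_product(
                std::istream_iterator< wrap<int> >(in),
                std::istream_iterator< wrap<int> >(),
                arr, true,
                std::logical_and<bool>(),  // accumulates the per-element results
                std::equal_to<int>())      // compares a value read with one from arr
        );
    }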
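The same wrapper should work for a POD type as well. The sketch below is an illustration under several assumptions: the struct point and the functions unwrap and round_trip are made up, a std::stringstream replaces the file, and a static_assert (C++11) is added to wrap to reject types for which a raw byte copy is not meaningful. The conversion to point is spelled out in unwrap to keep overload resolution simple for class types.

```cpp
#include <algorithm>
#include <cassert>
#include <istream>
#include <iterator>
#include <ostream>
#include <sstream>
#include <type_traits>
#include <vector>

template<typename T>
class wrap {
    // Assumption added for this sketch: only trivially copyable types are allowed.
    static_assert(std::is_trivially_copyable<T>::value,
                  "wrap<T> needs a trivially copyable T");
    T t;
public:
    wrap() : t() {}
    wrap(T const& t) : t(t) {}
    operator T&() { return t; }
    operator T const&() const { return t; }
};

template<typename T>
std::istream& operator>>(std::istream& is, wrap<T>& wt) {
    is.read(reinterpret_cast<char*>(&static_cast<T&>(wt)), sizeof(T));
    return is;
}

template<typename T>
std::ostream& operator<<(std::ostream& os, wrap<T> const& wt) {
    os.write(reinterpret_cast<char const*>(&static_cast<T const&>(wt)), sizeof(T));
    return os;
}

// Hypothetical POD type used only for this illustration.
struct point { int x; int y; };

// Explicitly extract the stored value from the wrapper.
point unwrap(wrap<point> const& w) { return static_cast<point const&>(w); }

// Write all points to an in-memory stream, then read them back.
std::vector<point> round_trip(std::vector<point> const& src) {
    std::stringstream buf;
    std::copy(src.begin(), src.end(), std::ostream_iterator< wrap<point> >(buf));

    std::vector<point> dst;
    std::transform(std::istream_iterator< wrap<point> >(buf),
                   std::istream_iterator< wrap<point> >(),
                   std::back_inserter(dst), unwrap);
    assert(buf.eof());   // the whole buffer has been consumed
    return dst;
}
```

The bytes written this way depend on the size, layout and byte order of the type on the machine that wrote them, so such data is only suitable for reading back on a compatible platform.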

Conclusion


Clearly, the result is a complete reinvention of the wheel that many C++ programmers have probably written, yet I have not seen anything like it on the web or in well-known libraries (for example, in Boost, and in Boost.Iostreams in particular).

I would also like to note that I am deliberately leaving the relevance of the task out of the discussion. There are probably people who cannot imagine working with files at any level higher than read / write. Perhaps there are those who will say that binary files are a thing of the past, that they are terribly non-portable, or something similar. Maybe that is partly true, but the exercise of solving this task seemed interesting and instructive to me.

I sincerely thank Vitaly Nikolaevich Bragilevsky for formulating the problem and discussing the solution.

UPD1: in the comments I was asked why I did not write ordinary get member functions instead of the type conversions. The conversions are essential in examples like the following.

    std::ifstream in("f.dat", std::ios_base::binary);
    int arr2[6];
    // each wrap<int> read from the file is implicitly converted to int on assignment
    std::copy(std::istream_iterator< wrap<int> >(in),
              std::istream_iterator< wrap<int> >(),
              arr2);

Source: https://habr.com/ru/post/134788/

