📜 ⬆️ ⬇️

More to the question of sets

Alice: Why is this place a VERY strange place?
Dodo: But because all the other places are not so strange. There must be at least one VERY strange place.



So, we will consider the text of the BitSet template class in order to adapt it to the requirements of the MC, the main directions of optimization were defined earlier. You can, of course, write your own class from scratch, but let's not neglect the opportunity to familiarize yourself with good solutions, because the library STL (not to be confused with spl) has long been known and used everywhere. First we need to find the source code, after a short journey through Internet, I simply opened the directory with my MinGW and found the required file, which I intend to discuss further.

At the beginning we see a little more than a page with copyright and license text, as well as various warnings - well, you can skip this, this is for lawyers. Next come the inclusions and I express my gratitude to the authors - each one is accompanied by a comment, that we want to get from each file - everyone would do that, the work of Indiana Jones the researcher would be simpler. Next come the defains, I’m not a fan of the purity of the code and will not argue - let them be and immediately see the definition of the number of bits in the storage unit, which is unsigned long. Continuing on the slippery path of the Define, I’m looking for my own definition of the storage unit, which is equal to uint8_fast, although I understand that this will not be enough. By the way, I immediately bring one more gratitude to the authors for the style - at the end of the module they clean up after themselves, destroying internal defines - no longer hoped that I would see such a thing in modern production.

But then light neponyatki begin further - first define the template support structure struct _Base_bitset (why exactly the structure, not the class, since they are interchangeable), which define the type for data storage and it is again specified directly - change it to your own. Next, we define two instantiations of this auxiliary structure - for one storage unit (well, this is understandable, to increase the code efficiency) and for a degenerate example without storage at all (probably, for error handling), until we touch this code.

Next, we find the main class class bitset (now it is a class), derived from the template structure (yes, in C, you can, it’s not clear why you should do this, but you can), in which the storage type is again determined and again changed to your own. But here I wondered - why didn’t the type inherit from the parent structure and discovered with amazement (yes, with amazement, the template classes are not what I do in everyday life) that the inheritance of the template classes is somewhat different from the ordinary ones. That is, inheritance is just the same and all the entities of the parent are available in the child (I hope no one will take me for a gender chauvinist) class, but they must be specified with their full name. Well, it can still be understood, otherwise the compiler will not guess which of the instantiated options to apply, but if you use the types defined in the class, you must also add the keyword typename. Absolutely not an obvious solution, especially I was pleased with the compiler diagnostics, which means “maybe you meant the type from the parent class” - yes, thank you, captain, it was I who meant it, glad that we understand each other, but it doesn't get easier for me. However, such a diagnosis takes place in the old version of GCC, in the current message it becomes more understandable “you may have forgotten the keyword typename”. Understanding compiler diagnostic messages for template classes is generally a topic for a single song, and this song is anything but joyful.
')
So, if we don’t want to constantly use the construction in the child class

std::_Base_bitset<Nb>::_WordT 

(and we don't want, nig?) then we have three ways

1) trivial

 #define _WordT typename _Base_bitset<_Nb>::_WordT 

2) obvious - type definition at the global level and I liked it more

3) for lovers of strange

 typedef typename _Base_bitset<_Nb>::_WordT _WordT; 

(a persistent feeling of a magnificent chrome-plated crutch). Personally, in all three cases, I am not satisfied with the direct indication of the parent class, because the implementation should not take into account the chain of inheritance, but maybe I don’t understand something.

There is also a method 4)

  using _WordT = std::_Base_bitset<Nb>::_WordT; 

but in my version of the compiler it does not work, so there is none.

A note in the margin (PNP) - in a wonderful book “C ++ for real programmers” (I uploaded it by mistake in due time, having decided that it was devoted to C ++ in real-time systems) about the language said “Its elegance is not in simplicity (the words C ++ and simplicity cut the ears with their apparent contradiction), and in its potential possibilities. Behind every ugly problem hides some kind of clever idiom, an elegant language trick, thanks to which the problem melts right before our eyes. ”Since the book is really wonderful, it means that the quotation is correct (I understand that there is a certain logical stretch), but personally I Unfortunately, I see the problem, but with a smart idiom tension.

Despite the remaining feeling of slight bewilderment, the problem is solved and you can enjoy the result - but it was not there. When you change the size of the test set from 15 to 5, a compilation error suddenly occurred. No, everything is clear, the unmodified instantiation with the parameter of template 1 worked, but from the outside it looks very strange - we change the constant and the program stops compiling.

You can, of course, modify this implementation and the other, but this approach looks like a gross violation of the DRY principle. There are possible solutions and there are more than one of them 1) the obvious - again define and 2) trivial - again the definition of the type at the global level, but in this case it gets out of the module, which, however, it does in the implementation in question, protruding from the basic structure it’s just not necessary to write a qualifier.

Therefore, I tend to option 3) the base class to determine the type and inherit all instantiations from it. And the entities of the parent class are protected, again, so that they do not stick out. Next, I find a funny property, probably, C ++ must be praised for flexibility - for the variant with no storage, the type is not needed, and the language allows using the implementation without inheritance, although this is not particularly necessary in this particular case. Immediately I discover another drawback of the library - in all three variants, the operations of calculating the number of the storage unit are set each time

 static size_t _S_whichword 

and the bit masks in it are _S_maskbit and they are completely identical - we also put them into the base class. At the same time, the “dead” code is found - the _S_whichbyte method, I don’t even know what to do - on the one hand, the rules of good tone require its removal, on the other hand, in the case of a template, this does not affect the resulting code. I will use the rule - “if you don’t understand something, don’t touch it” and leave this method.

In principle, the modifications in terms of storage type are complete and testing can begin. And immediately I discover a lack of implementation - for the MSP430 architecture, for some reason, two-byte words are allocated instead of bytes. Of course, not double words, as before, but still we are fighting for minimal (in all senses) code. It turns out that the compiler is confident that in this architecture the type uint_fast8_t has a size of 2, although the system of commands contains operations on bytes and the compiler itself uses them completely. The idea of ​​using this type is compromised and you have to set the data type uint8_t directly. Well, if there is an architecture in which the work with the bytes is implemented unsuccessfully and the size of the uint_fast8_t type is different from 1 (none of those found in the compiler), it will have to suffer for the speed of all the others.

Testing of the corrected version shows its correct operation on various architectures and with different parameters, but there is still the question of calculating bit masks for MCs without developed shifts, in our case it is MSP430 and AVR. In principle, you can simply make a bitmask reading out of an array the method for all cases, regardless of the MK architecture. The solution is quite working, developed architects with indexing are fine, but there will still be a loss in time compared to fast shifts, and I wouldn’t like to point fingers in my direction with the words “the class will be faster, he said, we optimize he said.

Therefore, we need a different implementation for our two weak architectures, which differ from all other sizes in the size of the uint_fast16_t type - it is 2 versus 4, or 8 for 64 bit versions. A conditional compilation does not help us, and setting a condition in the hope of optimization is not the way of a samurai, there are patterns. I am trying to create a template class method, but partial specialization is not available for it - it turns out that this is how it should be, and even find an article that says why this is a good solution. Honestly, I did not understand why this decision is good, most likely, it is due to religious considerations. However, we were not asked, and what remains to be done - you can make a friendly template function (it turned out that it is impossible, and for the same reason - partial specialization is prohibited), there is another solution that looks funny - partial specialization of the method outside the class body. You can even create a separate template function and access it from the class methods, I don’t know which way is more correct, I chose this one. Yes, it does not have access to the internal entities of the class, but in this particular case it is not necessary, what to do if necessary, I do not know yet, “we will solve problems as they arise.”

Now we have everything we need, we put the tested fragments together and we get the code that solves the initially set optimization problem. At the same time, we made the code more compact (and therefore more understandable, although the latter is controversial), removed numerous repetitions and eliminated redundant entities sticking out. The only thing that is not fixed is the confusion of structures and classes, but I don’t understand the reasons for this implementation, therefore, just in case, I will not touch this part.

A separate post will be devoted to the implementation of the second variant of the set, and without that something has happened a lot.

Source: https://habr.com/ru/post/450176/


All Articles