Hello. I want to talk about my coursework, or about where curiosity can lead.
For a long time now, for lack of anything better to do, I have been writing programs for Symbian. And from time to time I ran into oddities during the build. Everything pointed to the elf2e32 utility. Its task is to convert an input binary in ELF format into another format specific to Symbian - the E32 image. I had long been curious how this utility works at all and why it is sometimes buggy. A little later another question started pestering me - the topic of my coursework =) I decided to combine business with pleasure and downloaded its source code. And so it began ...
The first commit does not build.
In the second one we enable the non-standard gcc extensions and add the missing classes, functions and constants from the sources. The patient happily builds and crashes. Progress, though. We run it under the debugger - the debugger enters a class that only initializes another one, which initializes the next ... Hurray! Here is the function! We step in. Oops. Where are we?!!! Stop the debugger! Surgeon! Scalpel! Alcohol! Pickle! The classes, the appendix - into the furnace! Give me nullptr instead of NULL! We have C++14! Wow, what an awesome constructor, it initializes everything with zeros! And another one, and another, and one more - but we have C++14, with default member initialization for classes! How concise everything is now ...
Oookay. We fix as much as we can in one go. I figured out why the debugger jumps around the sources like a madman - the author overdosed on abstraction and grew a level-80 inheritance tree from the UseCaseBase class :) Then the classes that construct the static instances of the Message and ParameterManager classes jumped out at me. A Meyers singleton? No, never heard of it. Into the furnace with abstraction! Viva la revolución!!! Viva POD!!!
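For anyone who has not met it either, here is a minimal sketch of a Meyers singleton. The Settings class and its members are invented for illustration and are not the interface of Message or ParameterManager in elf2e32.

```cpp
#include <string>

// Hypothetical stand-in for a class such as ParameterManager.
class Settings {
public:
    // Meyers singleton: the function-local static is constructed on
    // first use, and since C++11 that construction is thread-safe.
    static Settings& Instance() {
        static Settings instance;
        return instance;
    }

    // One instance only: copying is forbidden.
    Settings(const Settings&) = delete;
    Settings& operator=(const Settings&) = delete;

    void SetOutput(const std::string& name) { output_ = name; }
    const std::string& Output() const { return output_; }

private:
    Settings() = default;
    std::string output_;
};
```

No separate helper class is needed to build the static instance: Settings::Instance() always returns the same object.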
Wow! What an interesting way this tree grew. The main work is done by the BuildAll() function. If all parameters are given, the function builds the import library, the file listing the names of the functions and variables and the order in which they are available in the import library, and the binary itself. All descendants of UseCaseBase changed this algorithm by overriding it. Sometimes a descendant prepared auxiliary data, but more often it simply switched off the creation of some files. For example: the file name for some output is not given - a new class gets created. Idiots. It is enough to cut such a build function short when necessary. Now my actions are easy to understand B-)
We continue to delete empty classes, replace NULL with nullptr, and replace iterator loops with range-based for (auto x: *).
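Roughly what such a replacement looks like, as a sketch on made-up helper functions (none of these names come from elf2e32):

```cpp
#include <string>
#include <vector>

// Before: explicit iterator loop.
inline int CountNamesOld(const std::vector<std::string>& names) {
    int count = 0;
    for (std::vector<std::string>::const_iterator it = names.begin();
         it != names.end(); ++it) {
        if (!it->empty()) ++count;
    }
    return count;
}

// After: range-based for, same behaviour.
inline int CountNamesNew(const std::vector<std::string>& names) {
    int count = 0;
    for (const auto& name : names) {
        if (!name.empty()) ++count;
    }
    return count;
}

// nullptr has its own type (std::nullptr_t), so unlike NULL it cannot
// be mistaken for an integer in overload resolution.
inline const char* FirstOrNull(const std::vector<const char*>& v) {
    return v.empty() ? nullptr : v.front();
}
```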
We correct errors in the processing of command line parameters.
The code needs to be checked with a static analyzer. Where to begin? Hmm, on good old XP the choice is small - cppcheck, and Code::Blocks supports it out of the box. Wow, what a catch! There is even a plain delete for a char[]! Damn, now I know where half a gig of free RAM went =)
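Assuming the finding was memory from new[] released with plain delete, the mismatch and two ways out look roughly like this. CopyName and CopyNameOwned are hypothetical helpers, not functions from the utility.

```cpp
#include <cstring>
#include <memory>

// Hypothetical helper that copies a C string into a new[] buffer.
inline char* CopyName(const char* src) {
    char* buf = new char[std::strlen(src) + 1];
    std::strcpy(buf, src);
    return buf;
}

// The kind of mismatch a static analyzer reports:
//     char* p = CopyName("x");
//     delete p;      // wrong: the memory came from new[], undefined behaviour
//     delete[] p;    // right: matches new[]

// Alternative: std::unique_ptr<char[]> calls delete[] on its own.
inline std::unique_ptr<char[]> CopyNameOwned(const char* src) {
    std::unique_ptr<char[]> buf(new char[std::strlen(src) + 1]);
    std::strcpy(buf.get(), src);
    return buf;
}
```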
So we add the files generated from the ELF file libcrypto.dll, plus the file describing the command-line parameters used to create them.
Oops. cppcheck was wrong ... It has to be (a || b) ...
Let me try building in Visual Studio 15, and it would be nice to poke Win10 with a stick too. We install it in a virtual machine. Installed; now we download and run the Studio online installer. What? It does not want to save the download to the folder shared with the host?! Oh, choke on it! Download to wherever you were taught ... And now we move the download to the shared folder and run the installation. What? It ignores the shared folder again??! Oh, choke on it! Install wherever you were taught ...
On the whole, Win10 runs quite well on one core and 3 gigs of RAM. Studio, come on down! I was amazed, but not for long. I open my project in Studio. Again it swears at the folder ... How long can this go on? Oh, choke on it ... We build, and it swears at hash_set, a non-standard STL extension. Removed. Removed??? Time to switch the brain on =)
Wow, what feisty code:
int ElfFileSupplied::UnWantedSymbolp(const char * aSymbol)
{
    static hash_set<const char*, hash<const char*>, eqstr> aSymbolSet;
    int symbollistsize = sizeof(Unwantedruntimesymbols) / sizeof(Unwantedruntimesymbols[0]);
    static bool FLAG = false;
    while (!FLAG)
    {
        for (int i = 0; i < symbollistsize; i++)
        {
            aSymbolSet.insert(Unwantedruntimesymbols[i]);
        }
        FLAG = true;
    }
    hash_set<const char*, hash<const char*>, eqstr>::const_iterator it = aSymbolSet.find(aSymbol);
    if (it != aSymbolSet.end())
        return 1;
    else
        return 0;
}
Let's think a little ... And voila:
int ElfFileSupplied::UnWantedSymbolp(const char * aSymbol)
{
    int symbollistsize = sizeof(Unwantedruntimesymbols) / sizeof(Unwantedruntimesymbols[0]);
    for (int i = 0; i < symbollistsize; i++)
    {
        if (strstr(Unwantedruntimesymbols[i], aSymbol))
            return 1;
    }
    return 0;
}
My precious ...
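One caveat about the short version: strstr reports a hit whenever aSymbol occurs anywhere inside a table entry, while the hash_set version looked up whole keys (assuming eqstr compares with strcmp, which this excerpt does not show). If exact matching is what is wanted, a sketch with strcmp could look like this. The table contents and the name IsUnwantedSymbol are placeholders, not the real Unwantedruntimesymbols list.

```cpp
#include <cstring>

// Placeholder table; the real list lives in the elf2e32 sources
// and is not reproduced here.
static const char* const kUnwantedSymbols[] = {
    "__hypothetical_symbol_a",
    "__hypothetical_symbol_b",
};

// Returns 1 only if aSymbol is exactly equal to one of the entries.
inline int IsUnwantedSymbol(const char* aSymbol) {
    for (const char* unwanted : kUnwantedSymbols) {
        if (std::strcmp(unwanted, aSymbol) == 0) return 1;
    }
    return 0;
}
```

With strstr, a prefix such as "__hypothetical" or even an empty string would count as unwanted; with strcmp it does not.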
So why does the program throw an exception if this flag is set incorrectly or not set at all? Why are you so cruel, beautiful far-away ... Let's just reset this flag to a safe value. And this flag could use the same treatment ... And this one, and this one, and these. Or maybe it is better to make a separate function? Good idea! Let's call it ParameterManager::CheckOptions()!
A step to the left - a crash, a step to the right - an unreported exception, a jump on the spot - well, thanks that at least it is not a BSOD =)
Gloom ... Glitches and crooked code ...
Ooh-la-la!!! An emulation of Symbian's CleanupStack on top of STL?
In principle, nothing special:
std::vector<char*> cleanupStack;
Cleaning:
std::vector<char*>::iterator aPos;
char *aPtr;
aPos = cleanupStack.begin();
while( aPos != cleanupStack.end() )
{
    aPtr = *aPos;
    delete[] aPtr;
    ++aPos;
}
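The same idea can be had without the manual cleanup loop. This is only a sketch of an alternative, not what the utility does: BufferStack and its methods are invented names, and each buffer is owned by a std::unique_ptr<char[]>.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical replacement for the raw-pointer cleanup stack.
class BufferStack {
public:
    // Allocates a zero-filled buffer, keeps ownership of it and
    // returns a raw pointer for the caller to fill in.
    char* Allocate(std::size_t size) {
        std::unique_ptr<char[]> buf(new char[size]());
        char* raw = buf.get();
        buffers_.push_back(std::move(buf));
        return raw;
    }

    std::size_t Size() const { return buffers_.size(); }

    // Frees every buffer with delete[]; the destructor does the same.
    void Clear() { buffers_.clear(); }

private:
    std::vector<std::unique_ptr<char[]>> buffers_;
};
```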
Some bright mind used l / r instead of left / right. Thank you, cppcheck.
Eh, I am too lazy to sit at the monitor sorting through cppcheck logs ... What can GitHub offer us? ... Codacy ... We connect the project ... It thinks for a bit and it is ready! Now I can read the reports on the progress of the bug hunt while lying on the couch ^^
So, it does not seem to be buggy ... Let's build something, for example libcrypto.dll. It works, although the uncompressed file is a hundred-odd bytes larger than the one created by the utility from the SDK. From here on, the binaries created by this version of the utility and by the one from the SDK will be compared constantly. The command-line parameters are, naturally, identical.
Right then, where can I get an analog of diff for binary files? Hmm, I write a script in Python. Too much information - I need something much simpler. The DLL for recognizing pdf / djvu - AlternateReaderRecog.dll - is a good option, the output is under 4 kilobytes. Right then, the offsets differ in the import section. I open both files in a hex editor. The beginning is the same; in my version there is extra garbage right after the point where the section ends in the original version. And in my version the next section starts 100 bytes later. The files differ in size by exactly that many bytes! The offsets further on point to the correct addresses ... The binary is correct!!! Aaaah!!!
A month later. So where did these hundred bytes come from?
Well, if it is not clear how it works, we start breaking the algorithm that creates the E32Image. We keep tormenting AlternateReaderRecog.dll. Increase the size of the output binary - no effect; overwrite the section with memset - no effect; reduce the size of the binary - no effect. Grrrr. What the?!!! I am breaking the output in the release build, but running the debug one?!!! Oh well, start over ... Sooo, the section gets wiped - good! The binary got bigger! Good!!! Now reduce the size of the import section! Got it!!! It is byte-identical to the same section in the output of the utility from the SDK!
We look into the code that creates this section. "sizeof(char*)" - that reminds me of the articles by Andrey Karpov, one of the PVS-Studio developers, saying that types can occupy different amounts of memory - so how much does this one take? MinGW - 8 bytes, Visual Studio - 4!!! We halve those 8 bytes, and that is that. Done! And what about the code section? This little DLL has no global variables. No global variables - no section either ... Let's take something heavier - libcrypto.dll.
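Why sizeof(char*) bites here, as a sketch. It assumes, for illustration only, that each entry in the table is one 32-bit word; TargetWord and both helper functions are made-up names.

```cpp
#include <cstddef>
#include <cstdint>

// sizeof(char*) follows the compiler that builds the tool: 8 bytes in
// a 64-bit build, 4 bytes in a 32-bit one. A fixed-width type does not
// change with the host.
using TargetWord = std::uint32_t;  // assumption: one entry = one 32-bit word

static_assert(sizeof(TargetWord) == 4, "entry must be 4 bytes on every host");

// Size in bytes of a table with `count` entries, host-dependent variant.
constexpr std::size_t TableSizeHostDependent(std::size_t count) {
    return count * sizeof(char*);
}

// Same table sized with the fixed-width type: always count * 4.
constexpr std::size_t TableSizeFixed(std::size_t count) {
    return count * sizeof(TargetWord);
}
```

As a purely illustrative figure, 25 entries take 100 bytes with the fixed type and twice that wherever pointers are 8 bytes wide.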
The file produced by my utility is now 100-odd bytes smaller ... What the??? The import section is byte-identical - good. The code section - is not?!!
I can't be bothered to compare such a wall of text by eye ... I am off to look for a diff that compares byte by byte ...
After a couple of days of wrestling with Google, I found it after all. vbindiff is a console utility with a Norton Commander-style interface that shows the difference between two files in two horizontal panels. To jump to the next difference, press Enter. Good! You can drag two files onto its icon and the program will open them for comparison! Excellent!!!
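The core of a byte-by-byte comparison fits in a few lines. This is a minimal sketch of the idea, not how vbindiff is implemented; ReadFileBytes and DiffOffsets are invented names.

```cpp
#include <algorithm>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Reads a whole file into memory as raw bytes.
inline std::vector<unsigned char> ReadFileBytes(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    return std::vector<unsigned char>(std::istreambuf_iterator<char>(in),
                                      std::istreambuf_iterator<char>());
}

// Returns the offsets at which the two buffers differ. Only the common
// length is compared; a size difference has to be checked separately.
inline std::vector<std::size_t> DiffOffsets(const std::vector<unsigned char>& a,
                                            const std::vector<unsigned char>& b) {
    std::vector<std::size_t> offsets;
    const std::size_t common = std::min(a.size(), b.size());
    for (std::size_t i = 0; i < common; ++i) {
        if (a[i] != b[i]) offsets.push_back(i);
    }
    return offsets;
}
```

Printing the first few offsets in hex is usually enough to see which section of the file they land in.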
We compare - sooo, in the header the CRC and the creation time differ. No big deal. Here one little byte differs, there another hundred ... Wow!!! Tens, hundreds, thousands of differing bytes?!!! Sooo, let's see which section they belong to ... We look at the offsets ... Aha, the data section ...
We pull the same trick as with the import section ... Wipe it with memset - works. Increase the section size ... It crashes ... Increase it again. It offers me the hand and heart of the debugger ... Damn. We open the function that creates the section - a mush of function calls ... Grr.
... Ah, tomorrow ... For now I will fix something else ...
For example, add tests - but the code is such a mess that the program cannot be split into small modules. You can't put tests straight into the code either - nobody would make sense of it afterwards. Idea! By constantly launching the program with different arguments I have been testing it all along ... But let's do it properly and put this into a separate Python script. Yes, a great idea, simply great. When a test fails, the script should keep going, reporting the error but not crashing. That's it!
Back to our sheep ... This function calls that one, then that one, we go here ... So, where did it go? Ugh, I am lost ... Ah, tomorrow ... For now I will fix something else ...
And so two months went by ...
Damn, where does this section get built? I had to take academic leave, so at least I will get to the bottom of you!!! Sooo. This is where the symbols for the section come in ... What will printf show? Not everything fits in the console buffer ... Let's save the output to a file ... So, nothing special so far ... Stop! Identical lines!!! Lots of identical lines!!! Where from?! We add a printf at each data source (I had patience for 3 out of five, ha). Nothing. We look at one of the remaining function calls ... Sooo. The iterator is incremented after the loop??? With a todo next to a Codacy warning??? Moved it into the loop. Run!!! The size matches! Byte for byte it matches!!! Fixed!!! git blame refuses to name the hero ... We look at the original - this was not my doing. Or was it a "bomb" for non-Nokia developers? Grrr.
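The shape of such a bug, reconstructed as a guess: the real function in elf2e32 is not shown here, and EmitBuggy and EmitFixed are invented for illustration. When the iterator only moves after the loop, every pass writes the same element, which is exactly how a pile of identical lines appears.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Buggy shape: the iterator is advanced only after the loop has
// finished, so the first element is emitted on every pass.
inline std::vector<std::string> EmitBuggy(const std::vector<std::string>& symbols) {
    std::vector<std::string> out;
    auto it = symbols.begin();
    for (std::size_t i = 0; i < symbols.size(); ++i) {
        out.push_back(*it);
    }
    if (it != symbols.end()) ++it;  // too late to matter
    return out;
}

// Fixed shape: the increment lives inside the loop body.
inline std::vector<std::string> EmitFixed(const std::vector<std::string>& symbols) {
    std::vector<std::string> out;
    auto it = symbols.begin();
    for (std::size_t i = 0; i < symbols.size(); ++i) {
        out.push_back(*it);
        ++it;
    }
    return out;
}
```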
We carefully check the test output and compare the files byte by byte. Everything works as it should! Into the release!
Ooh-la-la! It is time for the Great Purge!!! Time to tear out the UseCaseBase tree by the roots!!!
Most of the descendants are already gone; we move their useful functions into the generator class. Only UseCaseBase and its descendant ElfFileSupplied are left. UseCaseBase is a wrapper around the class that processes command-line parameters, and it declares several pure virtual functions for the ElfFileSupplied class. In short, the violinist is not needed ... How blue the sky is, well ... One more hour ... I will deal with this class and then I can go for a walk ... Get some air, stretch my legs, well ... Let's go! So, comment out this function. Build! Sooo, now to think about how to redo it nicely ... Done!!! Next function! Done! Next! Done! Done! Yes! Yes! Yes! The last function ... Phew. We run it after the build ... A sevenfold speedup?!!! The output is correct ... Funny. The debug build also shrank by 2 megabytes?!!! Wow!!! Time for a walk. It is night?!!! Hooow??? Where did my day go?!!! Oookay, run the tests and relax ... The tests ran quietly - I can relax ...
Let me write something now ... Oh, the class that works with externally visible functions and variables looks scary. How it works: read from a file, parse the lines and save to a file. For the line parsing there is a whole class of choice spaghetti in C ... Sooo ... Let's think ... And what a beauty came out:
read a line with std::getline(), strip the spaces from both ends of the line, and parse it.
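A minimal sketch of the read-and-trim part of that loop. The parsing itself is left out because the file format is not described here; Trim and ReadTrimmedLines are names made up for the example.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Removes spaces, tabs and a trailing carriage return from both ends.
inline std::string Trim(const std::string& s) {
    const char* ws = " \t\r";
    const std::size_t first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const std::size_t last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Reads the stream line by line and returns the trimmed non-empty
// lines, ready to be handed to a parser.
inline std::vector<std::string> ReadTrimmedLines(std::istream& in) {
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) {
        const std::string trimmed = Trim(line);
        if (!trimmed.empty()) lines.push_back(trimmed);
    }
    return lines;
}
```

The same function works on a std::ifstream for real files and on a std::istringstream in tests.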
To be continued ... The source code is https://github.com/fedor4ever/elf2e32
Source: https://habr.com/ru/post/430794/