... or what to do if "Hello world!" crashed.
Everything below is written mainly for Linux and console debugging, although some of it applies in other environments too.
It may sound strange, but writing a program is best started by installing a version control system (if one is not already installed) and creating a repository. That way you do not lose a lot of time while writing, trying to remember what was changed or added, where, how, when and why. Today the most popular are svn (Subversion), git and Mercurial. Personally I like the last one best because, subjectively, it is simpler and more convenient, especially for personal use.
Next, make sure that gdb (the debugger) and strace (a system call tracer) are present. If not, install them. And when compiling your program, do not forget to include debug information (the -g option for gcc).
So, it happened - the program crashed. One of the main things I was taught is to read carefully what the system and the program write. Much can be learned from these commands:
- dmesg (kernel information), lspci (devices on a PCI bus), lsmod (list of loaded drivers) - when working with drivers;
- tail /var/log/messages - shows the last ten lines of the system log;
- ps ax - a list of running processes (other options can be used).
Suppose nothing useful turns up there. Then run the command "ulimit -c 50000". ulimit is a shell built-in that sets or shows the shell's resource limits (see the shell's man page for a detailed description), and 50000 is the maximum size of the core file, counted in blocks rather than bytes (1024-byte blocks in bash). In most cases the core file is created when a program started from the same console crashes, and it is a memory dump of the crashed program. Next, run the program again in the console where ulimit was executed. It crashes again, but this time (usually) a core file is created. Alternatively, do all this in advance, because if the bug is intermittent the second crash may not come soon; people have sometimes waited months for it.
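The same limit can also be raised from inside the program, which may help when it is started from a place where nobody ran ulimit first. This is only a minimal sketch, assuming a POSIX system with getrlimit/setrlimit; the helper name enable_core_dumps is hypothetical and not part of the original text.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: raise the soft core-file limit to the hard limit.
 * Stores the limit now in effect in *out so the caller can log it.
 * Returns 0 on success, -1 on failure. */
int enable_core_dumps(struct rlimit *out)
{
    assert(out != NULL);

    if (getrlimit(RLIMIT_CORE, out) != 0) {
        perror("getrlimit");
        return -1;
    }

    /* An unprivileged process may raise the soft limit only up to the hard one. */
    out->rlim_cur = out->rlim_max;

    if (setrlimit(RLIMIT_CORE, out) != 0) {
        perror("setrlimit");
        return -1;
    }
    return 0;
}
```

Calling this early in main() has roughly the effect of running ulimit -c with the largest value the hard limit allows. If the hard limit itself is 0, no core file will be written either way.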
With the help of the debugger, try to examine the still-warm core file, typically by running "gdb <program> <core file>".
gdb provides an interactive console interface in which a lot can be done, but I am not going to describe all of it here - the man pages and the Internet cover it in plenty of languages. For now it is enough to run the "bt" command (short for backtrace), which shows the call stack; if memory was not corrupted (see below), you can see where the program broke. The "frame N" command, where N is the frame number shown on the left of each backtrace line, lets you look closer. The "print <variable>" command shows (not always) the value of a variable. If your "Hello world!" is multithreaded, everything is more complicated, but you can try the debugger's "thread N" command to switch to the stack of the Nth thread, though as a rule this does not help much.
If the debugger shows only a bunch of question marks in the call stack, this is a clear sign of memory corruption, and the debugger will not help you. In most such cases you need to open the source of the program and pay special attention to memcpy, memset, sprintf and other functions of that kind, which work with blocks of data and can run past the end of an array, overwriting whatever comes next in memory. Most likely the error is somewhere there. At a minimum, it is worth replacing them with safer counterparts (if any exist), for example snprintf. If this does not help, the old, time-tested methods go into battle:
- commenting everything out and uncommenting small parts, recompiling and checking for the error each time;
- inserting debug prints (often after almost every line of a suspicious section of code (memset! memcpy!)) and then checking after which print the program broke.
This helps for both single-threaded and multithreaded programs (with the latter, expect the usual ritual dances).
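Both pieces of advice above, bounded counterparts and debug printing, can be sketched together. This is an illustration under assumptions, not a recipe: the TRACE macro, the trace_out variable and the helpers format_greeting and copy_block are all hypothetical names invented for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Where trace output goes; NULL means "use stderr". */
static FILE *trace_out = NULL;

/* Prints file, line and function, then flushes, so the message is not
 * lost in a buffer if the very next statement crashes the program. */
#define TRACE(...)                                                     \
    do {                                                               \
        FILE *out_ = trace_out ? trace_out : stderr;                   \
        fprintf(out_, "%s:%d %s(): ", __FILE__, __LINE__, __func__);   \
        fprintf(out_, __VA_ARGS__);                                    \
        fputc('\n', out_);                                             \
        fflush(out_);                                                  \
    } while (0)

/* Formats a greeting without ever writing past dst[size - 1].
 * Returns 0 if the text fit, 1 if it was truncated, -1 on error. */
int format_greeting(char *dst, size_t size, const char *name)
{
    assert(dst != NULL && size > 0 && name != NULL);

    TRACE("before snprintf, size=%zu", size);
    int n = snprintf(dst, size, "Hello, %s!", name);
    TRACE("after snprintf, n=%d", n);

    if (n < 0)
        return -1;              /* output error */
    if ((size_t)n >= size)
        return 1;               /* truncated, but still NUL-terminated */
    return 0;
}

/* Bounded copy: refuses the request instead of overflowing dst. */
int copy_block(void *dst, size_t dst_size, const void *src, size_t len)
{
    assert(dst != NULL && src != NULL);

    if (len > dst_size)
        return -1;
    memcpy(dst, src, len);
    return 0;
}
```

The last trace line that appears before a crash narrows the search to the statements right after it. Checking the return value of snprintf matters too: silent truncation does not corrupt memory, but it can still be a bug.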
What if the program enters an infinite loop and the system freezes up (it happens sometimes)? Shut down the window manager and, in text mode, log in as the superuser (root) on one of the consoles. To be on the safe side, you can give that root shell a slightly higher priority with nice / renice. Run the program on another console / terminal as a regular user. Then, once it starts looping, you can kill it from the superuser console (the "kill" command) and see what it has written.
What if nothing helps? The only answer is to sit down and carefully, very carefully, study the code and think.
If there are no errors in the program (none visible even on the 1001st reading and analysis), then maybe a partition has run out of space? (true story)
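A full partition is easier to spot when the program checks the results of its own writes. Here is a minimal sketch, assuming ordinary buffered stdio; the helper name save_text is hypothetical and only illustrates the idea.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical helper: write text to path and report the first I/O error.
 * Returns 0 on success, otherwise an errno value (ENOSPC on a full disk). */
int save_text(const char *path, const char *text)
{
    assert(path != NULL && text != NULL);

    FILE *f = fopen(path, "w");
    if (f == NULL)
        return errno ? errno : EIO;

    int err = 0;
    if (fputs(text, f) == EOF)
        err = errno ? errno : EIO;

    /* Buffered data reaches the kernel only on flush or close, so a full
     * partition often shows up here rather than in fputs. */
    if (fclose(f) == EOF && err == 0)
        err = errno ? errno : EIO;

    return err;
}
```

Passing a nonzero result to strerror() gives a readable message such as "No space left on device", which is much faster to act on than rereading correct code.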
Finally, if you decide to move the core file somewhere else to examine it (home, for example), take the sources and the compiled program along with it, exactly as they are; otherwise the gdb output will be wrong or missing altogether.
P.S. In the Linux kernel sources, the Documentation directory contains the file CodingStyle. It has things worth taking note of for a novice C programmer.
P.P.S. If I had known all this at the start, things would have been much easier for me.