How to use a CEE dump to find the line where a program crashes on the mainframe

Once I had to learn to work in C++ on the mainframe, and I ran into the problem of figuring out where a program crashes and why. I should say right away that everything here relates to programming on mainframes under the z/OS operating system in USS. The material is elementary, but it is not easy to find all of it in the IBM documentation. In addition, you must at least be able to read HLASM.

Below I will try to describe how to use the CEE dump to find the source line on which the program crashes.

Below is a short program written so that it is guaranteed to crash as soon as it is launched. I kept the program simple, and to C++ programmers it will look trivial, but the task here is not to spot the error in the code at a glance; it is to arrive at it through the listings. A real program is much larger.

    int f1(int a, char *b)
    {
        char *value = 0L;
        value[0] = '\0';
        return(0);
    }

    int main()
    {
        f1(1, "");
        return(1);
    }

To find the line, you need to compile the program with a listing. It is done like this.

1. without XPLINK

    c++ -Wc,list'(t1.list)' -c t1.C
    c++ -o t1 t1.o

2. with XPLINK

    c++ -Wc,xplink -Wc,list'(t2.list)' -c t1.C
    c++ -Wl,xplink -o t2 t1.o


The key option is -Wc,list'(t1.list)'; it makes the compiler generate a listing for this file.
In a makefile it is added like this.

    # Here is where we get convlit(iso8859-1) and __LIBASCII
    #
    XCCFLAGS = $(CCFLAGS_ASCII) -Wc,list'($*.list)'
    XCPPFLAGS = $(CPPFLAGS_ASCII) -Wc,list'($*.list)'


The first line is for C programs, the second for C++.
Running the programs produces a dump and the messages below.

    /u/mddegt/sb/omni/cpp/test:>t1
    CEE3204S The system detected a protection exception (System Completion Code=0C4).
             From entry point f1(int,char*) at compile unit offset +0000002A at entry offset +0000002A
             at address 1000A95A.
    Segmentation fault
    /u/mddegt/sb/omni/cpp/test:>t2
    CEE3204S The system detected a protection exception (System Completion Code=0C4).
             From entry point f1(int,char*) at compile unit offset +00000016 at entry offset +00000016
             at address 100063D6.
    Segmentation fault


Without going into details, we already have the information we need. We see that the program crashed in the f1 function, and we have an offset. Now we need the listing, which is generated in the same directory as the source (t1.list for the program without XPLINK, t2.list for the program with XPLINK).
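If there are many such messages to go through, the entry name and offsets can be pulled out mechanically. The following is only a minimal sketch: the names CrashLocation, collapse_whitespace and parse_cee3204s are made up for this illustration, it assumes the message wording is exactly as shown above, and it is meant to be built with any C++17 compiler on a workstation rather than on z/OS.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical holder for the fields of interest in a CEE3204S message.
struct CrashLocation {
    std::string entry;           // e.g. "f1(int,char*)"
    std::uint32_t entry_offset;  // hex value after "at entry offset +"
    std::uint32_t address;       // hex value after "at address"
};

// Collapse every run of whitespace (including line breaks) into one space,
// so that a message wrapped over several lines can be searched as one string.
std::string collapse_whitespace(const std::string& text) {
    std::string out;
    bool pending_space = false;
    for (unsigned char c : text) {
        if (std::isspace(c)) { pending_space = true; continue; }
        if (pending_space && !out.empty()) out += ' ';
        pending_space = false;
        out += static_cast<char>(c);
    }
    return out;
}

// Returns the crash location, or nothing if the text does not look like the
// message shown in the article.
std::optional<CrashLocation> parse_cee3204s(const std::string& raw) {
    const std::string msg = collapse_whitespace(raw);
    const std::string k_entry = "From entry point ";
    const std::string k_cu = " at compile unit offset +";
    const std::string k_off = " at entry offset +";
    const std::string k_addr = " at address ";
    const auto npos = std::string::npos;
    const auto p_entry = msg.find(k_entry);
    const auto p_cu = msg.find(k_cu);
    const auto p_off = msg.find(k_off);
    const auto p_addr = msg.find(k_addr);
    if (p_entry == npos || p_cu == npos || p_off == npos || p_addr == npos) return std::nullopt;
    if (!(p_entry < p_cu && p_cu < p_off && p_off < p_addr)) return std::nullopt;
    try {
        CrashLocation loc;
        const auto name_begin = p_entry + k_entry.size();
        loc.entry = msg.substr(name_begin, p_cu - name_begin);
        // std::stoul stops at the first character that is not a hex digit.
        loc.entry_offset = static_cast<std::uint32_t>(
            std::stoul(msg.substr(p_off + k_off.size()), nullptr, 16));
        loc.address = static_cast<std::uint32_t>(
            std::stoul(msg.substr(p_addr + k_addr.size()), nullptr, 16));
        return loc;
    } catch (const std::exception&) {
        return std::nullopt;  // no hex digits where a number was expected
    }
}
```

Calling parse_cee3204s on the text printed for t1 should give the entry f1(int,char*) and the entry offset 2A, which is the number used in the calculations that follow.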

In the listing, look for the beginning of the function in which the program crashed.

1. without XPLINK

    15694A01 V1 R6 z/OS C++    t1.C: f1(int,char*)    02/14/06 15:42:32    3

    OFFSET  OBJECT CODE   LINE#  FILE#   PSEUDO ASSEMBLY LISTING

                          000001 |       *  void f1();
                          000002 |       *
                          000003 |       *  int f1(int a, char *b)
                                         f1(int,char*)
    000018                000003 |              DS    0D
    000018  47F0 F001     000003 |              B     1(,r15)
    00001C  01C3C5C5                            CEE eyecatcher
    000020  000000C8                            DSA size
    000024  000000D8                            =A(PPA1-f1(int,char*))
    000028  5050 D028     000003 |              ST    r5,40(,r13)
    00002C  5850 D04C     000003 |              L     r5,76(,r13)
    000030                                      End of Prolog


2. with XPLINK

    15694A01 V1 R6 z/OS C++    t1.C: f1(int,char*)    02/14/06 16:02:24    3

    OFFSET  OBJECT CODE   LINE#  FILE#   PSEUDO ASSEMBLY LISTING

                          000001 |       *  void f1();
                          000002 |       *
                          000003 |       *  int f1(int a, char *b)
    000018                         @1L0         DS    0D
    000018  00C300C5                            =F'12779717'    XPLink entrypoint marker
    00001C  00C500F1                            =F'12910833'
    000020  00000090                            =F'144'
    000024  00000088                            =F'136'
                                         f1(int,char*)
    000028                000003 |              DS    0D
    000028  9067 4788     000003 |              STM   r6,r7,1928(r4)
    00002C                                      End of Prolog


The first column is the offset, the second is the object code that was generated, and the third is the line number in the source file.
As you can see, without XPLINK the function starts immediately, at offset 000018. With XPLINK a header comes first (the XPLink entry point marker), and the function itself starts at offset 000028. This gives us the starting offset of the function. Now we add to it the entry offset at which the program crashed. All calculations are in hexadecimal.

1. 18 + 2A = 42
2. 28 + 16 = 3E
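The two sums above are easy to get wrong by hand, so here is a small sketch that reproduces them. The function names listing_offset, hex6 and check_article_offsets are made up for this illustration; the numbers are the ones taken from the listings and the CEE3204S messages.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Offset to look for in the listing: where the function starts in the listing
// plus the "entry offset" reported in the CEE3204S message.
constexpr std::uint32_t listing_offset(std::uint32_t function_start,
                                       std::uint32_t entry_offset) {
    return function_start + entry_offset;
}

// Format a value the way the OFFSET column prints it: six upper-case hex digits.
std::string hex6(std::uint32_t value) {
    char buffer[16];
    std::snprintf(buffer, sizeof buffer, "%06X", static_cast<unsigned>(value));
    return buffer;
}

// The two cases from the article.
void check_article_offsets() {
    assert(hex6(listing_offset(0x18, 0x2A)) == "000042");  // without XPLINK
    assert(hex6(listing_offset(0x28, 0x16)) == "00003E");  // with XPLINK
}
```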

Now it remains to find in the listing the line on which the program crashed. To do this, go down through the function until you see the resulting offset. Take care not to go beyond the function itself. If you do go beyond it, there may be several reasons:
1. The source differs from the one the program was built from.
2. We made a mistake somewhere in the offset calculation and got the wrong value.
3. Something we did not think of at all.

1. without XPLINK

    000034  5020 50C0     000003 |              ST    r2,b(,r5,192)
                          000004 |       *  {
                          000005 |       *  .char *value = 0L;
    000038  4110 0000     000005 |              LA    r1,0
    00003C  1821          000005 |              LR    r2,r1
    00003E  5020 50C4     000005 |              ST    r2,value(,r5,196)
                          000006 |       *  .value[0] = '\0';
    000042  9200 2000     000006 |              MVI   (char)(r2,0),0
                          000007 |       *  .return(0);
                          000008 |       *  }
    000046                000008 | @1L6         DS    0H

2. with XPLINK

    000030  5020 4844     000003 |              ST    r2,b(,r4,2116)
                          000004 |       *  {
                          000005 |       *  .char *value = 0L;
    000034  4130 0000     000005 |              LA    r3,0
    000038  1813          000005 |              LR    r1,r3
    00003A  5010 47E0     000005 |              ST    r1,value(,r4,2016)
                          000006 |       *  .value[0] = '\0';
    00003E  9200 1000     000006 |              MVI   (char)(r1,0),0
                          000007 |       *  .return(0);
                          000008 |       *  }
    000042                000008 | @1L6         DS    0H

In both cases we arrive at the same line number, 6. Ignoring the assembler, we look at the line number in the third column: 000006. If you wish, you can keep hunting for the error in this listing, but I would go to the sources.
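The lookup itself (walk down the function until the offset is reached, without leaving the function) can be sketched as a search in a sorted table. This is an illustration only: ListingIndex, excerpt_without_xplink and source_line_at are made-up names, the table is typed in by hand from the listing excerpt without XPLINK, and the end offset 0x46 is simply where that excerpt stops, not necessarily the real end of f1.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>
#include <optional>

// Maps the listing offset at which an instruction starts to its LINE# value.
using ListingIndex = std::map<std::uint32_t, int>;

// Instruction rows copied by hand from the excerpt without XPLINK.
ListingIndex excerpt_without_xplink() {
    return {{0x34, 3}, {0x38, 5}, {0x3C, 5}, {0x3E, 5}, {0x42, 6}};
}

// Returns the source line of the instruction that contains `offset`, or
// nothing if the offset lies outside [first instruction, end_offset).
std::optional<int> source_line_at(const ListingIndex& index,
                                  std::uint32_t end_offset,
                                  std::uint32_t offset) {
    if (index.empty() || offset < index.begin()->first || offset >= end_offset) {
        return std::nullopt;  // we went beyond the function: recheck the inputs
    }
    // First instruction that starts after `offset`; the one before it is ours.
    const auto next = index.upper_bound(offset);
    return std::prev(next)->second;
}
```

With the table above, source_line_at(excerpt_without_xplink(), 0x46, 0x42) yields 6, the same line number found by reading the listing. An empty result corresponds to the "went beyond the function" situation, where the source or the arithmetic should be rechecked.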

From here on I will not show both versions (without XPLINK and with XPLINK); I will limit myself to the first.

If we did not have the stderr output shown above (I mean the error message), the same information can be found in the CEE dump file. To do this, open the CEE dump and look for the very first table.

    Traceback:
      DSA Addr  Program Unit  PU Addr   PU Offset  Entry          E Addr    E Offset   Statement  Load Mod  Service  Status
      10020CF0  CEEHDSP       046C0B00  +000048DA  CEEHDSP        046C0B00  +000048DA             CEEPLPKA  UK10749  Call
      100202B0                1000A930  +0000002A  f1(int,char*)  1000A930  +0000002A             *PATHNAM           Exception
      10020210                1000A968  +0000006E  main           1000A968  +0000006E             *PATHNAM           Call
      100200F8                044EFCB6  +000000B4  EDCZMINV       044EFCB6  +000000B4             CEEEV003           Call
      10020030  CEEBBEXT      046C69E8  +000001A6  CEEBBEXT       046C69E8  +000001A6             CEEPLPKA  HLE7709  Call

In this table we look for Exception in the last column (Status). That row is the function where the program crashed: the Entry column holds the name of the function, and the E Offset column holds the offset at which the program crashed.

The same table shows the full call stack (Entry column). In the same way, using the E Offset of a calling function, we can determine which of its lines made the call to the called function.
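As a cross-check, E Addr plus E Offset of the Exception row should give the absolute address reported in the CEE3204S message (1000A95A for t1). The sketch below types in the traceback rows by hand; TracebackRow, article_traceback, find_exception and absolute_address are names invented for this illustration, not part of any IBM tool.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Only the traceback columns needed here.
struct TracebackRow {
    std::string entry;       // Entry column
    std::uint32_t e_addr;    // E Addr column
    std::uint32_t e_offset;  // E Offset column
    std::string status;      // Status column
};

// Rows copied by hand from the traceback in the article.
std::vector<TracebackRow> article_traceback() {
    return {
        {"CEEHDSP", 0x046C0B00, 0x000048DA, "Call"},
        {"f1(int,char*)", 0x1000A930, 0x0000002A, "Exception"},
        {"main", 0x1000A968, 0x0000006E, "Call"},
        {"EDCZMINV", 0x044EFCB6, 0x000000B4, "Call"},
        {"CEEBBEXT", 0x046C69E8, 0x000001A6, "Call"},
    };
}

// First row whose Status is "Exception", or nullptr if there is none.
const TracebackRow* find_exception(const std::vector<TracebackRow>& rows) {
    for (const auto& row : rows) {
        if (row.status == "Exception") return &row;
    }
    return nullptr;
}

// Absolute address of the instruction the row refers to.
std::uint32_t absolute_address(const TracebackRow& row) {
    return row.e_addr + row.e_offset;
}
```

For the Exception row this gives 1000A930 + 2A = 1000A95A, which matches the address in the message printed for t1.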

What to do if the source code does not show us where the error is.

In that case we turn to the CEE dump and to the assembler code (the z/OS machine instructions are described in the Principles of Operation, chapter 7).

                          000006 |       *  .value[0] = '\0';
    000042  9200 2000     000006 |              MVI   (char)(r2,0),0

As you can see, this instruction tries to write 0 to the address (r2 + 0). Instead of zero there could be any other displacement, but in this case it is 0. The question is: what does register r2 contain?
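The object code 9200 2000 can be taken apart by hand. MVI is an SI-format instruction: an 8-bit opcode, an 8-bit immediate value, a 4-bit base register number and a 12-bit displacement. The sketch below decodes that layout; the names SiInstruction, decode_si and effective_address are invented for this illustration, and the 31-bit address mask is an assumption that fits this program's addressing mode rather than a general rule.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Fields of an SI-format instruction such as MVI D1(B1),I2.
struct SiInstruction {
    std::uint8_t opcode;         // bits 0-7   (0x92 is MVI)
    std::uint8_t immediate;      // bits 8-15  (I2, the byte to store)
    std::uint8_t base;           // bits 16-19 (B1, base register number)
    std::uint16_t displacement;  // bits 20-31 (D1)
};

// Split the four bytes of object code, given as one 32-bit value.
SiInstruction decode_si(std::uint32_t word) {
    SiInstruction si;
    si.opcode = static_cast<std::uint8_t>(word >> 24);
    si.immediate = static_cast<std::uint8_t>((word >> 16) & 0xFF);
    si.base = static_cast<std::uint8_t>((word >> 12) & 0xF);
    si.displacement = static_cast<std::uint16_t>(word & 0xFFF);
    return si;
}

// Address the instruction stores to, given the sixteen general registers.
std::uint32_t effective_address(const SiInstruction& si,
                                const std::array<std::uint32_t, 16>& gpr) {
    // A base field of 0 means "no base register", not the contents of r0.
    const std::uint32_t base = (si.base == 0) ? 0 : gpr[si.base];
    return (base + si.displacement) & 0x7FFFFFFF;  // assumption: 31-bit addressing
}
```

Decoding 0x92002000 gives opcode 92, immediate 00, base register 2 and displacement 000, which agrees with the MVI (char)(r2,0),0 shown in the listing.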

To answer it, the CEE dump contains printouts of the registers and of the storage areas to which these registers point.

    Condition Information for Active Routines
      Condition Information for (DSA address 100202B0)
        CIB Address: 10021630
        Current Condition:
          CEE3204S The system detected a protection exception (System Completion Code=0C4).
        Location:
          Program Unit:  Entry: f1(int,char*)  Statement:  Offset: +0000002A
        Machine State:
          ILC..... 0004    Interruption Code..... 0004
          PSW..... 078D1400 9000A95E
          GPR0..... 100087E8  GPR1..... 00000000  GPR2..... 00000000  GPR3..... 00000001
          GPR4..... 1000A9A8  GPR5..... 100202B0  GPR6..... 1000AAAC  GPR7..... 1000A0F0
          GPR8..... 00000030  GPR9..... 80000000  GPR10.... 844EFCAA  GPR11.... 846C69E8
          GPR12.... 1001A7D0  GPR13.... 100202B0  GPR14.... 9000A9D8  GPR15.... 1000A930
          FPC...... 00000000
          FPR0..... 4DBE5D7C  A198209F      FPR1..... 00000000  00000000
          FPR2..... 00000000  00000000      FPR3..... 00000000  00000000
          FPR4..... 00000000  00000000      FPR5..... 00000000  00000000
          FPR6..... 00000000  00000000      FPR7..... 00000000  00000000
          FPR8..... 00000000  00000000      FPR9..... 00000000  00000000
          FPR10.... 00000000  00000000      FPR11.... 00000000  00000000
          FPR12.... 00000000  00000000      FPR13.... 00000000  00000000
          FPR14.... 00000000  00000000      FPR15.... 00000000  00000000
      Storage dump near condition, beginning at location: 1000A94A
        +000000 1000A94A  50BC5020 50C04110 00001821 502050C4  92002000 5850D028 47F0E004 070747F0  |&.&.&.......&.&Dk....&...0.....0|
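The machine state in this dump can also be tied back to the failing instruction. In this particular dump, the address in the second PSW word with the addressing-mode bit masked off, minus the ILC value, gives 1000A95A, the address from the error message; and that address lies 0x10 bytes into the storage dump, where the word 92002000 (the MVI) is found. The sketch below just repeats that arithmetic. The function names are invented, and treating the PSW address as pointing past the failing instruction is an assumption that holds for this dump; it is not claimed here for every kind of interruption.

```cpp
#include <cassert>
#include <cstdint>

// Address of the failing instruction, assuming the PSW points just past it
// and the ILC field holds the instruction length in bytes.
constexpr std::uint32_t failing_instruction_address(std::uint32_t psw_second_word,
                                                    std::uint32_t ilc) {
    // The high-order bit of this PSW word is the addressing-mode bit, not part
    // of the address (assumption: 31-bit addressing, as in this dump).
    return (psw_second_word & 0x7FFFFFFF) - ilc;
}

// Byte offset of an address inside the "Storage dump near condition" line.
constexpr std::uint32_t offset_in_storage_dump(std::uint32_t address,
                                               std::uint32_t dump_start) {
    return address - dump_start;
}

// Values taken from the dump in the article.
void check_article_machine_state() {
    assert(failing_instruction_address(0x9000A95E, 4) == 0x1000A95A);
    assert(offset_in_storage_dump(0x1000A95A, 0x1000A94A) == 0x10);
}
```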

Alternatively, the same registers can be found further down in the dump:

    f1(int,char*) (DSA address 100202B0):
      UPSTACK DSA
      Saved Registers:
        GPR0..... 100087E8  GPR1..... 00000000  GPR2..... 00000000  GPR3..... 00000001
        GPR4..... 1000A9A8  GPR5..... 100202B0  GPR6..... 1000AAAC  GPR7..... 1000A0F0
        GPR8..... 00000030  GPR9..... 80000000  GPR10.... 844EFCAA  GPR11.... 846C69E8
        GPR12.... 1001A7D0  GPR13.... 100202B0  GPR14.... 9000A9D8  GPR15.... 1000A930
      GPREG STORAGE:
        Storage around GPR0 (100087E8)
          -0020 100087C8  00000000 00000000 00000000 00000000  00000000 00000000 00000000 00000000  |................................|
          +0000 100087E8  1001C028 00000000 00000000 00000000  10017728 10017732 00000000 00000000  |................................|
          +0020 10008808  00000000 00000000 00000000 00000000  00000001 00000000 00000001 00000000  |................................|
        Storage around GPR1 (00000000)
          +0000 00000000  Inaccessible storage.
          +0020 00000020  Inaccessible storage.
          +0040 00000040  Inaccessible storage.
        Storage around GPR2 (00000000)
          +0000 00000000  Inaccessible storage.
          +0020 00000020  Inaccessible storage.
          +0040 00000040  Inaccessible storage.
        Storage around GPR3 (00000001)
          -0001 00000000  Inaccessible storage.
          +001F 00000020  Inaccessible storage.
          +003F 00000040  Inaccessible storage.
        Storage around GPR4 (1000A9A8)
          -0020 1000A988  05404140 401E07F4 90E5D00C 58E0D04C  4100E0A0 5500C314 4140F040 4720F014  |. . ..4.V.....<......C.. 0 ..0.|
          +0000 1000A9A8  5000E04C 9210E000 50D0E004 18DE5800  C1F45000 D098C050 00000021 5800D098  |&..<k...&.......A4&..q.&.......q|
          +0020 1000A9C8  58F04050 41300001 18131825 4DE0F010  47000008 18F35800 D0985000 C1F4180D  |.0 &........(.0......3...q&.A4..|

Make sure that the register printout you are reading belongs to this function.

Looking at the register values, we see r2 = 0, and therefore the address is (0 + 0). Now everything is clear: the program is trying to write data to address 0, which is accessible neither for reading nor for writing.

That is all. What remains is to find out from the source code why this happens, that is, why value points to 0, but that is another topic.

Literature: z/Architecture Principles of Operation

Source: https://habr.com/ru/post/134045/

