Breaking haXe completely: reading machine code like an open book

When haXe is translated to C++ and then compiled to machine code, reversing it may seem hopeless, especially since at first glance the code is full of virtual method calls that are hard to match to the addresses of the method bodies without running a debugger.

But it is not that bad. Even with scripting support (HXCPP_SCRIPTABLE) disabled, strings with method and field names can be found in the file. Let's look at how to untangle this knot and match method names to their addresses and to their offsets in the virtual method table (VMT).

Some theory

After translation to C++, all haXe classes inherit from hx::Object (defined in hxcpp/include/hx/Object.h). The following methods are of particular interest:

    virtual Dynamic __Field(const String &inString, hx::PropertyAccess inCallProp);
    virtual Dynamic __SetField(const String &inField,const Dynamic &inValue, hx::PropertyAccess inCallProp);

These methods are overridden in the classes translated to C++, and their implementation looks roughly like this everywhere:

src/openfl/geom/Matrix.cpp

    Dynamic Matrix_obj::__Field(const ::String &inName,hx::PropertyAccess inCallProp)
    {
        switch(inName.length) {
        case 1:
            if (HX_FIELD_EQ(inName,"a") ) { return a; }
            if (HX_FIELD_EQ(inName,"b") ) { return b; }
            if (HX_FIELD_EQ(inName,"c") ) { return c; }
            if (HX_FIELD_EQ(inName,"d") ) { return d; }
            break;
        case 2:
            if (HX_FIELD_EQ(inName,"tx") ) { return tx; }
            if (HX_FIELD_EQ(inName,"ty") ) { return ty; }
            break;
        case 5:
            if (HX_FIELD_EQ(inName,"clone") ) { return clone_dyn(); }
            if (HX_FIELD_EQ(inName,"scale") ) { return scale_dyn(); }
            if (HX_FIELD_EQ(inName,"setTo") ) { return setTo_dyn(); }
            break;
        case 6:
            if (HX_FIELD_EQ(inName,"concat") ) { return concat_dyn(); }
            if (HX_FIELD_EQ(inName,"equals") ) { return equals_dyn(); }
            if (HX_FIELD_EQ(inName,"invert") ) { return invert_dyn(); }
            if (HX_FIELD_EQ(inName,"rotate") ) { return rotate_dyn(); }
            break;
        case 7:
            if (HX_FIELD_EQ(inName,"__array") ) { return __array; }
            if (HX_FIELD_EQ(inName,"toArray") ) { return toArray_dyn(); }
            break;
        case 8:
            if (HX_FIELD_EQ(inName,"copyFrom") ) { return copyFrom_dyn(); }
            if (HX_FIELD_EQ(inName,"identity") ) { return identity_dyn(); }
            if (HX_FIELD_EQ(inName,"toString") ) { return toString_dyn(); }
            break;
        case 9:
            if (HX_FIELD_EQ(inName,"copyRowTo") ) { return copyRowTo_dyn(); }
            if (HX_FIELD_EQ(inName,"createBox") ) { return createBox_dyn(); }
            if (HX_FIELD_EQ(inName,"translate") ) { return translate_dyn(); }
            break;
        case 10:
            if (HX_FIELD_EQ(inName,"to3DString") ) { return to3DString_dyn(); }
            break;
        case 11:
            if (HX_FIELD_EQ(inName,"copyRowFrom") ) { return copyRowFrom_dyn(); }
            if (HX_FIELD_EQ(inName,"setRotation") ) { return setRotation_dyn(); }
            if (HX_FIELD_EQ(inName,"toMozString") ) { return toMozString_dyn(); }
            if (HX_FIELD_EQ(inName,"__toMatrix3") ) { return __toMatrix3_dyn(); }
            break;
        case 12:
            if (HX_FIELD_EQ(inName,"copyColumnTo") ) { return copyColumnTo_dyn(); }
            if (HX_FIELD_EQ(inName,"__transformX") ) { return __transformX_dyn(); }
            if (HX_FIELD_EQ(inName,"__transformY") ) { return __transformY_dyn(); }
            break;
        case 13:
            if (HX_FIELD_EQ(inName,"__cleanValues") ) { return __cleanValues_dyn(); }
            break;
        case 14:
            if (HX_FIELD_EQ(inName,"copyColumnFrom") ) { return copyColumnFrom_dyn(); }
            if (HX_FIELD_EQ(inName,"transformPoint") ) { return transformPoint_dyn(); }
            break;
        case 16:
            if (HX_FIELD_EQ(inName,"__transformPoint") ) { return __transformPoint_dyn(); }
            break;
        case 17:
            if (HX_FIELD_EQ(inName,"createGradientBox") ) { return createGradientBox_dyn(); }
            break;
        case 19:
            if (HX_FIELD_EQ(inName,"deltaTransformPoint") ) { return deltaTransformPoint_dyn(); }
            if (HX_FIELD_EQ(inName,"__transformInverseX") ) { return __transformInverseX_dyn(); }
            if (HX_FIELD_EQ(inName,"__transformInverseY") ) { return __transformInverseY_dyn(); }
            break;
        case 22:
            if (HX_FIELD_EQ(inName,"__translateTransformed") ) { return __translateTransformed_dyn(); }
            break;
        case 23:
            if (HX_FIELD_EQ(inName,"__transformInversePoint") ) { return __transformInversePoint_dyn(); }
        }
        return super::__Field(inName,inCallProp);
    }

As you can see, what the haXe translator treats as fields includes not only ordinary fields but also methods, although the methods are hidden inside dynamic wrappers from which they still have to be extracted.
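The lookup pattern above can be reduced to a small self-contained sketch. The class MatrixSketch, its members and the returned tags are hypothetical and only illustrate the shape of the generated code (a switch on the name length, then a byte comparison against a string literal); this is not hxcpp code.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical stand-in for a translated haXe class; not hxcpp code.
struct MatrixSketch {
    // Mimics the generated __Field: switch on the name length first,
    // then compare bytes. Returns a tag describing what was matched.
    std::string field(const std::string &name) const {
        switch (name.length()) {
        case 1:
            if (std::memcmp(name.data(), "a", 1) == 0) return "field:a";
            break;
        case 2:
            if (std::memcmp(name.data(), "tx", 2) == 0) return "field:tx";
            break;
        case 5:
            if (std::memcmp(name.data(), "clone", 5) == 0) return "wrapper:clone_dyn";
            break;
        }
        return "super";  // the generated code falls through to super::__Field
    }
};
```

Because every name is compared against a literal, the names survive in the binary as strings, and each successful comparison branches either to a field load or to a wrapper reference.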

Preparation

It therefore makes sense to start by finding the __Field method. One way to reach it is through a cross-reference to a string containing a method name. If you browse the strings in the file, you may also land on cross-references from, for example, __ToString or the RTTI; from those, a further cross-reference should lead to the VMT. If the string is a field name, you may end up in the similar __SetField method instead of __Field, which is worse, because it contains no references to the dynamic wrappers for methods. Once in the VMT, open the overridden methods (they stand out by their addresses) and look for the ones that resemble __Field (a large switch is visible at the start):

Start of __Field

    .text:010B3DB8 var_30          = -0x30
    .text:010B3DB8 var_2C          = -0x2C
    .text:010B3DB8 var_28          = -0x28
    .text:010B3DB8 var_20          = -0x20
    .text:010B3DB8
    .text:010B3DB8                 PUSH.W  {R4-R9,LR}
    .text:010B3DBC                 SUB     SP, SP, #0x14
    .text:010B3DBE                 MOV     R7, R2
    .text:010B3DC0                 MOV     R4, R0
    .text:010B3DC2                 LDR     R0, [R7]
    .text:010B3DC4                 MOV     R9, R3
    .text:010B3DC6                 MOV     R5, R1
    .text:010B3DC8                 SUBS    R0, #4          ; switch 28 cases
    .text:010B3DCA                 CMP     R0, #0x1B
    .text:010B3DCC                 BHI.W   def_10B3DD0     ; jumptable 010B3DD0 default case
    .text:010B3DD0                 TBH.W   [PC,R0,LSL#1]   ; switch jump
    .text:010B3DD0 ; ---------------------------------------------------------------------------
    .text:010B3DD4 jpt_10B3DD0     DCW 0x1C                ; jump table for switch statement
    .text:010B3DD6                 DCW 0x35

Start of __SetField

    .text:010B48DC var_38          = -0x38
    .text:010B48DC var_30          = -0x30
    .text:010B48DC var_28          = -0x28
    .text:010B48DC var_24          = -0x24
    .text:010B48DC var_20          = -0x20
    .text:010B48DC arg_0           =  0
    .text:010B48DC
    .text:010B48DC                 PUSH.W  {R4-R9,LR}
    .text:010B48E0                 SUB     SP, SP, #0x1C
    .text:010B48E2                 MOV     R7, R2
    .text:010B48E4                 MOV     R8, R0
    .text:010B48E6                 LDR     R0, [R7]
    .text:010B48E8                 MOV     R6, R3
    .text:010B48EA                 LDR     R5, [SP,#0x38+arg_0]
    .text:010B48EC                 MOV     R9, R1
    .text:010B48EE                 SUBS    R0, #6          ; switch 13 cases
    .text:010B48F0                 CMP     R0, #0xC
    .text:010B48F2                 BHI.W   def_10B48F6     ; jumptable 010B48F6 default case
    .text:010B48F6                 TBH.W   [PC,R0,LSL#1]   ; switch jump
    .text:010B48F6 ; ---------------------------------------------------------------------------
    .text:010B48FA jpt_10B48F6     DCW 0xD                 ; DATA XREF: .text:01329970↓o
    .text:010B48FA                                         ; jump table for switch statement
    .text:010B48FC                 DCW 0x25

__Field comes before __SetField in the virtual method table, and __SetField usually has fewer switch cases, since it only deals with fields. In this example, it is 13 versus 28.
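The numbers in the two prologues can be checked by hand. The sketch below (the helper names switch_range and tbh_target are made up for illustration) derives the range of string lengths from the SUBS/CMP pair and computes where a TBH table entry leads, relying on the Thumb rule that PC reads as the instruction address plus 4.

```cpp
#include <cassert>
#include <cstdint>

// Range of string lengths covered by a switch prologue of the form
//   SUBS R0, #base ; CMP R0, #limit ; BHI default
struct LengthRange {
    std::uint32_t shortest;
    std::uint32_t longest;
    std::uint32_t cases;  // number of jump-table slots
};

LengthRange switch_range(std::uint32_t base, std::uint32_t limit) {
    return {base, base + limit, limit + 1};
}

// Target of one TBH entry. In Thumb state PC reads as the address of the
// TBH instruction plus 4, which is where the halfword table starts here;
// each entry is a halfword count, hence the factor of 2.
std::uint32_t tbh_target(std::uint32_t tbh_address, std::uint16_t entry) {
    return tbh_address + 4 + 2u * entry;
}
```

For the __Field listing this gives name lengths 4 through 31 in 28 slots, and the first table entry (0x1C) leads to 0x010B3E0C; for __SetField it gives lengths 6 through 18 in 13 slots.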

First stage: looking for dynamic wrappers

Once both methods are found, go to __Field, see where each branch leads after a successful comparison (0 == memcmp), and name the wrappers accordingly. You will come across both ordinary fields and wrappers. They are easy to tell apart; here is an ordinary field followed by a dynamic wrapper for a method:

    .text:010B44B0 loc_10B44B0                             ; CODE XREF: __Field+16A↑j
    .text:010B44B0                 LDR     R0, [R5,#0x20]
    .text:010B44B2                 B       loc_10B4582
    .text:010B44B4 ; ---------------------------------------------------------------------------
    .text:010B44B4
    .text:010B44B4 loc_10B44B4                             ; CODE XREF: __Field+1B0↑j
    .text:010B44B4                 LDR     R2, =(get_error_dyn+1 - 0x10B44BA)
    .text:010B44B6                 ADD     R2, PC          ; get_error_dyn
    .text:010B44B8                 B       loc_10B44D2

In other files (not this one) I ran into the problem that pointers to wrappers were not recognized. Such a pointer looks like an abnormally large integer operand shown in orange. In IDA, press Ctrl+R to turn it into an offset.
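The wrapper reference in the listing is position-independent: the literal holds the difference between the target and the PC value, which is why an unrecognized one looks like a huge integer. A minimal sketch of the arithmetic follows; the helper names are hypothetical, and the addresses used to exercise it would have to come from your own listing.

```cpp
#include <cassert>
#include <cstdint>

// Address produced by the Thumb pair
//   LDR Rn, =literal ; ADD Rn, PC
// where add_address is the address of the ADD instruction.
// In Thumb state PC reads as the instruction address plus 4.
std::uint32_t resolve_pc_relative(std::uint32_t literal, std::uint32_t add_address) {
    return literal + add_address + 4;  // unsigned arithmetic wraps like the CPU
}

// Bit 0 of a code pointer only marks Thumb mode; clear it to get the body address.
std::uint32_t strip_thumb_bit(std::uint32_t pointer) {
    return pointer & ~std::uint32_t{1};
}
```

In the listing above the ADD sits at 0x10B44B6, so the PC value is 0x10B44BA, exactly the constant IDA subtracts in the literal expression; the "+1" is the Thumb bit.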

Second stage: the simplest cases

First, let's look at how methods and their wrappers are laid out after translation to C++:

src/openfl/geom/Matrix.cpp

    // …
    ::lime::math::Matrix3 Matrix_obj::__toMatrix3( ){
        HX_STACK_FRAME("openfl.geom.Matrix","__toMatrix3",0xaf6ed17e,"openfl.geom.Matrix.__toMatrix3","openfl/geom/Matrix.hx",480,0xa0d54189)
        HX_STACK_THIS(this)
        HX_STACK_LINE(482)
        Float tmp = this->a;        HX_STACK_VAR(tmp,"tmp");
        HX_STACK_LINE(482)
        Float tmp1 = this->b;       HX_STACK_VAR(tmp1,"tmp1");
        HX_STACK_LINE(482)
        Float tmp2 = this->c;       HX_STACK_VAR(tmp2,"tmp2");
        HX_STACK_LINE(482)
        Float tmp3 = this->d;       HX_STACK_VAR(tmp3,"tmp3");
        HX_STACK_LINE(482)
        Float tmp4 = this->tx;      HX_STACK_VAR(tmp4,"tmp4");
        HX_STACK_LINE(482)
        Float tmp5 = this->ty;      HX_STACK_VAR(tmp5,"tmp5");
        HX_STACK_LINE(482)
        ::lime::math::Matrix3 tmp6 = ::lime::math::Matrix3_obj::__new(tmp,tmp1,tmp2,tmp3,tmp4,tmp5);        HX_STACK_VAR(tmp6,"tmp6");
        HX_STACK_LINE(482)
        return tmp6;
    }

    HX_DEFINE_DYNAMIC_FUNC0(Matrix_obj,__toMatrix3,return )

    Void Matrix_obj::__transformInversePoint( ::openfl::geom::Point point){
    {
        HX_STACK_FRAME("openfl.geom.Matrix","__transformInversePoint",0xde42fb73,"openfl.geom.Matrix.__transformInversePoint","openfl/geom/Matrix.hx",487,0xa0d54189)
    // …

The method body comes first, then the macro generates its dynamic wrapper, then comes the next method, then its dynamic wrapper, and so on. Since the first stage gave names to the wrappers but not to the methods themselves, the IDA subroutine list should look "striped": named subroutines alternate with unnamed ones.

This is not entirely accurate, but at this stage only the most obvious cases need handling: exactly one subroutine sits between two dynamic wrappers, and it is most likely the method. It gets the name of the wrapper directly below it.

Caution: in other files (not this one) there were cases where IDA did not recognize the method body as a subroutine but did recognize some auxiliary code that follows the method. Such a method can still be reached through its cross-reference from the VMT.
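The rule for this stage can be written down as a short sketch. It assumes the subroutine list has been exported in address order into a vector, with an empty string for an unnamed subroutine and wrapper names ending in "_dyn"; the Sub type and both function names are hypothetical and not part of IDA or hxcpp.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One entry of the subroutine list in address order; an empty name means unnamed.
struct Sub {
    std::string name;
};

bool is_wrapper(const std::string &name) {
    const std::string suffix = "_dyn";
    return name.size() > suffix.size() &&
           name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Names every unnamed subroutine that sits alone between two wrappers,
// using the wrapper directly below it. Returns how many names were assigned.
std::size_t name_single_gaps(std::vector<Sub> &subs) {
    std::size_t named = 0;
    for (std::size_t i = 1; i + 1 < subs.size(); ++i) {
        if (subs[i].name.empty() && is_wrapper(subs[i - 1].name) &&
            is_wrapper(subs[i + 1].name)) {
            const std::string &below = subs[i + 1].name;
            subs[i].name = below.substr(0, below.size() - 4);  // drop "_dyn"
            ++named;
        }
    }
    return named;
}
```

Runs of two or more unnamed subroutines are deliberately left alone here, since they need the closer look described in the later stages.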

Third stage: two subroutines between the wrappers

Dynamic wrappers are generated by a macro that looks like this:

    #define HX_DEFINE_DYNAMIC_FUNC0(class,func,ret) \
    ::Dynamic __##class##func(hx::Object *inObj) \
    { \
        ret reinterpret_cast<class *>(inObj)->func(); return ::Dynamic(); \
    }; \
    ::Dynamic class::func##_dyn() \
    {\
        return hx::CreateMemberFunction0(this,__##class##func); \
    }

As you can see, two wrappers are created at once, a typed one and an untyped one, but the typed one is usually discarded by the C++ compiler as unnecessary. If there are two unnamed subroutines between dynamic wrappers, then most likely the first is the method you are looking for and the second is the typed wrapper.

By the start of the third stage most methods should already be named, so when viewed from the VMT the remaining cases show up as single-slot gaps, and this stage eliminates them.
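To see why one method can leave two extra subroutines in the binary, here is a hand-expanded, self-contained imitation of what the macro generates. Object, Dynamic, Counter and Counter_next_helper are hypothetical stand-ins (the real macro names the helper __##class##func and uses hx::CreateMemberFunction0), and std::function only approximates the bound member function object.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-ins for hx::Object and Dynamic; not the hxcpp types.
struct Object {
    virtual ~Object() = default;
};
using Dynamic = int;

struct Counter : Object {
    int value = 41;
    int next() { return value + 1; }      // the method proper
    std::function<Dynamic()> next_dyn();  // what __Field hands out by name
};

// First generated function: receives a plain Object*, casts it to the
// concrete class and calls the method. (The macro uses reinterpret_cast.)
Dynamic Counter_next_helper(Object *inObj) {
    return static_cast<Counter *>(inObj)->next();
}

// Second generated function: binds the object to the helper above.
std::function<Dynamic()> Counter::next_dyn() {
    Object *self = this;
    return [self] { return Counter_next_helper(self); };
}
```

If the compiler keeps the helper as a separate function, the address-ordered list shows the method, the helper and the _dyn member one after another.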

Fourth stage: closing large gaps in the VMT

Sometimes the VMT has large gaps of two or more methods. Here again, looking from the VMT is convenient. If you missed a method while walking through __Field, it shows up in the IDA subroutine list as three unnamed subroutines between dynamic wrappers; but haXe may generate additional subroutines for other purposes, and those can also produce three unnamed subroutines between dynamic wrappers.

The VMT makes the difference visible: a gap of two entries means a dynamic wrapper was missed in __Field. Find where this gap is in the subroutine list and go to the middle subroutine, which should be the wrapper. Open its list of cross-references with X; __Field should be among them. Go there and find out the wrapper's name. The gap in the subroutine list is now split by a named stripe, and the method names follow from the algorithm already described.
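Spotting the gaps is mechanical once the VMT has been written out slot by slot. A minimal sketch, assuming the table is available as a vector of names with an empty string for each slot that has no name yet (the Gap type, find_gaps and the sample names are hypothetical):

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct Gap {
    std::size_t first_slot;  // index of the first unnamed slot in the run
    std::size_t length;      // number of consecutive unnamed slots
};

// Finds runs of unnamed slots (empty strings) in a VMT listed in slot order.
std::vector<Gap> find_gaps(const std::vector<std::string> &vmt) {
    std::vector<Gap> gaps;
    std::size_t i = 0;
    while (i < vmt.size()) {
        if (!vmt[i].empty()) {
            ++i;
            continue;
        }
        const std::size_t start = i;
        while (i < vmt.size() && vmt[i].empty()) ++i;
        gaps.push_back({start, i - start});
    }
    return gaps;
}
```

Runs of length one belong to the third stage; runs of length two or more are the candidates for a wrapper that was missed in __Field.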

hx::Object methods

For completeness, you can open hxcpp/include/hx/Object.h, write out all the virtual methods in order, and identify the methods at the beginning of the VMT.

Determining the data types of fields and arguments

When methods are called on fields and arguments (and all of them are virtual), you need to know which VMT to look them up in, and for that you need to know their types. Without running a debugger, the dynamic wrappers help here. They receive arguments of formal types (Dynamic, Dynamic, Dynamic, ...) and, in order to make the call, first cast each Dynamic to the actual type the method expects. That cast is exactly where the types can be found out.

For example, if the body of the wrapper contains:

    .text:010B3884                 LDR     R1, =(off_23DE1D4 - 0x10B388E)
    .text:010B3886                 MOVS    R3, #0
    .text:010B3888                 LDR     R2, =(off_23E04A0 - 0x10B3890)
    .text:010B388A                 ADD     R1, PC          ; off_23DE1D4
    .text:010B388C                 ADD     R2, PC          ; off_23E04A0
    .text:010B388E                 LDR     R1, [R1]        ; hx_Object_ci
    .text:010B3890                 LDR     R2, [R2]        ; off_22D9DE0
    .text:010B3892                 BLX.W   __dynamic_cast

... then a cast from hx::Object to something else is being performed. If you have not yet identified hx_Object_ci, both classes will be unknown, but that is solvable. Look at whose RTTI these pointers refer to (in this example, off_22D9DE0), assign names, and draw conclusions.

Similarly, for fields, __SetField comes in handy: it has to cast the Dynamic value to the actual field type, which gives a hint.
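The source-level counterpart of such a listing can be sketched as follows. The classes and the wrapper function are hypothetical stand-ins, not hxcpp code; the point is only that an untyped wrapper has to downcast its argument, and on Itanium-ABI toolchains such a downcast is typically compiled into a call to __dynamic_cast that receives the source and destination type descriptors.

```cpp
#include <cassert>

// Hypothetical class hierarchy standing in for hx::Object and translated classes.
struct Object {
    virtual ~Object() = default;
};
struct Point : Object {
    double x = 3.0;
};
struct Matrix : Object {
    // Typed method: expects a Point.
    double transform_x(Point *p) { return p->x * 2.0; }
};

// Untyped wrapper: receives plain Object pointers and must cast the argument
// to the type the method expects before it can make the call.
double transform_x_wrapper(Object *self, Object *arg) {
    Matrix *m = static_cast<Matrix *>(self);
    Point *p = dynamic_cast<Point *>(arg);  // the destination type is visible here
    return p ? m->transform_x(p) : 0.0;
}
```

The destination type passed to the cast is the argument type of the method, so naming that type descriptor also tells you which VMT the argument's virtual calls go through.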

Static fields and methods

If a class has static members, it defines the static methods __GetStatic and/or __SetStatic. For obvious reasons they are not visible in the VMT, but if the class has both static and ordinary members, the translated code is laid out in the order __Field, __GetStatic, __SetField, __SetStatic, so once you know where __Field and __SetField are, you can find __GetStatic and __SetStatic next to them. They too begin with a switch on the string length, followed by the comparisons.

Screencast

00:00 Finding __Field and __SetField
03:00 First stage: looking for dynamic wrappers
21:30 Second stage: the simplest cases
30:48 Third stage: two subroutines between the wrappers
33:15 Fourth stage: closing large gaps in the VMT
49:00 hx::Object methods

Source: https://habr.com/ru/post/335970/
