
How to disassemble the network protocol of a mobile MMORPG

Over the years of playing one mobile MMORPG, I have gained some experience in reverse engineering it, which I would like to share in a series of articles. Planned topics:

  1. Parsing message format between server and client.
  2. Writing a listening application to view game traffic in a convenient way.
  3. Interception of traffic and its modification using a non-HTTP proxy server.
  4. The first steps to your own ("pirated") server.

In this article I will cover the analysis of the message format between the server and the client. If you are interested, read on.

Required Tools


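To be able to repeat the steps described below, you will need:

  1. A PC that can act as a Wi-Fi access point.
  2. Wireshark, to capture and filter the traffic.
  3. 010 Editor, to parse packets with templates.
  4. A mobile device with the game installed.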

Additionally, it is highly desirable to have at hand data from the game in a readable form, such as a list of objects, creatures, etc. with their identifiers. This greatly simplifies the search for key points in the packets, and sometimes allows you to filter the desired message in a constant data stream.

Parsing message format between server and client


First, we need to see the mobile device's traffic. This is quite simple to do (although it took me a very long time to arrive at this obvious solution): create a Wi-Fi access point on the PC, connect to it from the mobile device, select the corresponding interface in Wireshark, and all the mobile traffic is in front of us.

After entering the game and waiting a while for the requests unrelated to the game server itself to stop, the following picture can be observed:


At this stage, we can already use Wireshark filters to see only the packets between the game and the server, and also only with the payload:

tcp && tcp.payload && tcp.port == 44325 

If you stand in a quiet place, away from other players and NPCs, and do nothing, you can see constantly repeated messages from the client and the server (76 and 84 bytes, respectively). In my case, the smallest variety of packets was sent on the character selection screen.


The frequency of the client's requests looks very much like pinging. Let's take a few messages to check (3 groups; in each, the client's request is on top and the server's response is below it):


The first thing that strikes you is how alike the packets are. The 8 additional bytes of the response, converted to decimal, look very much like a timestamp in seconds: 0x5CD008F8 = 1557137656 (from the first pair). We check the clock: yes, that is what it is. The 4 bytes before them are the same as the last 4 bytes of the request. Converting them gives 0xA4BB = 42171, which also looks like a time, but in milliseconds. It roughly matches the time elapsed since the game was launched, and most likely that is exactly what it is.
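The conversion above can be reproduced with a short Python sketch. The exact byte layout of `tail` is an assumption rebuilt from the values quoted in the text (a 4-byte client time followed by an 8-byte server time, both little-endian); it is not a captured packet.

```python
import struct
from datetime import datetime, timezone

# Hypothetical tail of a server ping response, rebuilt from the quoted values:
# 4-byte client time (0xA4BB) followed by 8-byte server time (0x5CD008F8).
tail = bytes.fromhex("bba40000" "f808d05c00000000")

# "<" = little-endian without padding, "I" = uint32, "Q" = uint64
client_ms, server_s = struct.unpack("<IQ", tail)
server_time = datetime.fromtimestamp(server_s, tz=timezone.utc)

print(client_ms)                # milliseconds since the game client started
print(server_s)                 # Unix timestamp in seconds
print(server_time.isoformat())  # the same timestamp as a UTC date
```

Comparing the printed date with the capture time in Wireshark is a quick way to confirm that a field really is a Unix timestamp.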

It remains to consider the first 6 bytes of the request and the response. It is easy to see that the value of the first four bytes of the message (let's call this parameter L) depends on the message size: the server's response is 8 bytes longer and its L is also larger by 8, while in both cases the message is 6 bytes longer than L. You can also see that the two bytes after L keep their value from message to message, both in the client's requests and in the server's responses, and since the two values differ by one, it is safe to say that this is the message code C (related message codes are most likely assigned consecutively). The general structure is now clear enough to write a minimal template for 010 Editor:


    struct Event {
        uint payload_length <bgcolor=0xFFFF00, name="Payload Length">;
        ushort event_code <bgcolor=0xFF9988, name="Event Code">;
        byte payload[payload_length] <name="Event Payload">;
    };

Hence the format of the ping messages: the client sends its local time, and the server responds with the same time plus its own time in seconds. Not that hard, right?
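The same framing (a 4-byte length L, a 2-byte code C, then L bytes of payload) can be sketched in Python for use outside 010 Editor. The function name `split_events` and the event codes 1 and 2 are placeholders chosen for illustration; the game's real codes are not given in the text.

```python
import struct

HEADER = struct.Struct("<IH")  # uint32 payload length (L), uint16 event code (C)

def split_events(stream: bytes):
    """Yield (event_code, payload) for every complete message in a byte stream."""
    offset = 0
    while offset + HEADER.size <= len(stream):
        length, code = HEADER.unpack_from(stream, offset)
        start = offset + HEADER.size
        end = start + length
        if end > len(stream):
            break  # incomplete message: more TCP data is needed
        yield code, stream[start:end]
        offset = end

# Hypothetical ping pair; codes 1 and 2 are placeholders, not the real codes.
request = HEADER.pack(4, 1) + struct.pack("<I", 42171)
response = HEADER.pack(12, 2) + struct.pack("<IQ", 42171, 1557137656)

events = list(split_events(request + response))
for code, payload in events:
    print(code, payload.hex())
```

Stopping at an incomplete message matters because TCP delivers a byte stream, so one captured segment may hold several messages or only part of one.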

Let's try to take apart a more difficult example. Standing in a quiet place and filtering out the ping packets, you can find the teleport and item crafting messages. Let's start with the first. Having the game data, I knew which teleport point values to look for. For the tests I used points with the values 0x2B, 0x67, 0x6B and 0x1AF. Compare them with the values in the messages: 0x2B, 0x67, 0x6B and 0x3AF:


A mismatch. Two problems are visible:

  1. the values are not 4 bytes long, and their sizes differ;
  2. not all values match the data from the files, and in this case the difference is 0x200 (512).

Additionally, comparing with the format of the ping, you can notice some difference:


Some of you, I think, could already guess the reason for the discrepancy between the expected values, but I will continue. Let's see what's going on in crafting:


The expected values 14183 and 14285 do not match the actual 28391 and 28621 either, and here the difference is far larger than in the teleport case. After many tests (including with other message types), it became clear that the larger the expected number, the larger the gap between it and the value in the packet. Strangely, values below 128 were transmitted unchanged. See what is going on? It is obvious to anyone who has run into this before, but I had not, and it took me two days to crack this "cipher" (looking at the values in binary form is what finally helped).

The behavior described above is called a Variable Length Quantity (VLQ): a representation of a number that uses a variable number of bytes, where the most significant (eighth) bit of each byte, the continuation bit, signals that another byte follows, and the remaining seven bits carry the value. In these packets the least significant 7-bit group comes first, i.e. the VLQ is read in little-endian order. Coincidentally, all other values in the packets use this order too.
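To make the effect concrete, here is a minimal Python sketch of this encoding, assuming the least-significant-group-first layout described above. The function names are illustrative. It shows why reading the encoded bytes as a plain little-endian integer gives the "wrong" numbers seen in the packets.

```python
def encode_vlq(value: int) -> bytes:
    """Encode a non-negative integer, least significant 7-bit group first."""
    if value < 0:
        raise ValueError("only non-negative integers are supported")
    out = bytearray()
    while True:
        group = value & 0x7F
        value >>= 7
        if value:
            out.append(group | 0x80)  # continuation bit set: more bytes follow
        else:
            out.append(group)         # last byte: continuation bit clear
            return bytes(out)

def decode_vlq(data: bytes, offset: int = 0):
    """Return (value, bytes_read) for the VLQ that starts at offset."""
    value = 0
    for i, byte in enumerate(data[offset:]):
        value |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:
            return value, i + 1
    raise ValueError("truncated VLQ")

for expected in (0x2B, 0x1AF, 14183, 14285):
    raw = encode_vlq(expected)
    naive = int.from_bytes(raw, "little")  # what a plain integer read shows
    print(expected, raw.hex(), naive, decode_vlq(raw))
```

Values below 128 fit into a single byte with the continuation bit clear, which is why they looked untouched in the traffic.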

Now that we know how to get the original value, we can write a template for the type:

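    struct VLQ {
        local char size = 1;
        // Count the bytes: a set high bit means another byte follows
        while (true) {
            byte obf_byte;
            if ((obf_byte & 0x80) == 0x80) {
                size++;
            } else {
                break;
            }
        }
        // Step back and re-read the same bytes as one array
        FSeek(FTell() - size);
        byte bytes[size];
        local uint64 _ = FromVLQ(bytes, size);
    };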

And the function of converting an array of bytes to an integer value:

    uint64 FromVLQ(byte bytes[], char size) {
        local uint64 source = 0;
        local int i = 0;
        local byte x;
        for (i = 0; i < size; i++) {
            x = bytes[i];
            // Multiplication is used instead of <<, because the shift may be done in uint32 rather than uint64
            source |= (x & 0x7F) * Pow(2, i * 7);
            if ((x & 0x80) != 0x80) {
                break;
            }
        }
        return source;
    }

But back to crafting. D appears again (data_length in the template below), and so does 0x08 in front of the changing value. The last two bytes of the message, 0x10 0x01, look suspiciously like the number of items to craft, where 0x10 plays a role similar to 0x08 that is still unclear. Even so, we can now write a template for this event:

    struct CraftEvent {
        uint data_length <bgcolor=0x00FF00, name="Data Length">;
        byte marker1;
        VLQ craft_id <bgcolor=0x00FF00, name="Craft ID">;
        byte marker2;
        VLQ quantity <bgcolor=0x00FF00, name="Craft Quantity">;
    };

Which will look like this:


Still, these were simple examples. The character movement event is harder to take apart. What information do we expect to see? At a minimum, the character's coordinates, the direction it is looking in, its movement speed and its state (standing, running, jumping, etc.). Since no strings are visible in the message, the state is most likely described by an enum. By trying different options, comparing them with the data from the game files, and running a lot of tests, you can find three XYZ vectors with the help of this bulky template:

    struct MoveEvent {
        uint data_length <bgcolor=0x00FF00, name="Data Length">;
        byte marker;
        VLQ move_time <bgcolor=0x00FFFF>;
        FSkip(2);
        byte marker;
        float position_x <bgcolor=0x00FF00>;
        byte marker;
        float position_y <bgcolor=0x00FF00>;
        byte marker;
        float position_z <bgcolor=0x00FF00>;
        FSkip(2);
        byte marker;
        float direction_x <bgcolor=0x00FFFF>;
        byte marker;
        float direction_y <bgcolor=0x00FFFF>;
        byte marker;
        float direction_z <bgcolor=0x00FFFF>;
        FSkip(2);
        byte marker;
        float speed_x <bgcolor=0x00FFFF>;
        byte marker;
        float speed_y <bgcolor=0x00FFFF>;
        byte marker;
        float speed_z <bgcolor=0x00FFFF>;
        byte marker;
        VLQ character_state <bgcolor=0x00FF00>;
    };

Visual result:


The green triplet turned out to be the location coordinates, the yellow triplets most likely show where the character is looking and its velocity vector, and the last lone value is the character's state. You can see constant bytes (markers) between the coordinate values (0x0D before X, 0x15 before Y and 0x1D before Z) and before the state (0x30), which look suspiciously similar in purpose to 0x08 and 0x10. After analyzing many markers from other events, it turned out that a marker determines both the type of the value that follows it (the lowest three bits) and its semantic meaning. That is, in the example above, if you swap the vectors while keeping their markers (0x120F before the coordinates, etc.), the game should (in theory) still parse the message correctly. Given this information, you can add a couple of new types:

    struct Packed {
        VLQ marker <bgcolor=0xFFBB00>; // the marker is itself a VLQ
        local uint size = marker.size; // total size in bytes: marker plus value
        switch (marker._ & 0x7) { // the lowest three bits select the value type
            case 1: double v; size += 8; break;
            case 5: float v; size += 4; break;
            default: VLQ v; size += v.size; break;
        }
    };

    struct PackedVector3 {
        Packed marker <name="Marker">;
        Packed x <name="X">;
        Packed y <name="Y">;
        Packed z <name="Z">;
    };
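As a quick check of the marker layout, the following Python sketch splits the markers mentioned above into their two parts. The type comes from the lowest three bits, as in the template. Treating the remaining bits as a field number, and the names in `WIRE_TYPES`, are assumptions borrowed from the Protocol Buffers wire format rather than something established by the captures alone.

```python
# Type names are assumptions based on the Protocol Buffers wire format.
WIRE_TYPES = {0: "VLQ", 1: "64-bit", 2: "length-prefixed", 5: "32-bit"}

def split_marker(marker: int):
    """Split a decoded marker into (field_number, wire_type)."""
    return marker >> 3, marker & 0x7

for m in (0x08, 0x10, 0x0D, 0x15, 0x1D, 0x30):
    field, wire = split_marker(m)
    print(f"0x{m:02X}: field {field}, type {wire} ({WIRE_TYPES.get(wire, 'unknown')})")
```

Under this reading, 0x0D, 0x15 and 0x1D are fields 1, 2 and 3 holding 32-bit floats, which matches the X, Y and Z components, while 0x08 and 0x10 are fields 1 and 2 holding VLQ integers.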

Now our movement message template has shrunk considerably:

    struct MoveEvent {
        uint data_length <bgcolor=0x00FF00, name="Data Length">;
        Packed move_time <bgcolor=0x00FFFF>;
        PackedVector3 position <bgcolor=0x00FF00>;
        PackedVector3 direction <bgcolor=0x00FF00>;
        PackedVector3 speed <bgcolor=0x00FF00>;
        Packed state <bgcolor=0x00FF00>;
    };

Another type that we may need in the next article is strings, which are preceded by a Packed value holding their length:

    struct PackedString {
        Packed length;
        char str[length.v._];
    };
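The Packed and PackedString types can be combined into one generic reader. The Python sketch below is one possible way to do it, not the game's actual parser: `read_fields` is a hypothetical helper, the handling of type 2 as length-prefixed bytes follows the Protocol Buffers wire format, and the sample byte strings are rebuilt from values quoted in the text or invented for illustration.

```python
import struct

def read_vlq(buf: bytes, pos: int):
    """Decode one VLQ at pos; return (value, new_pos)."""
    value, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7

def read_fields(buf: bytes):
    """Return a list of (field_number, wire_type, value) tuples."""
    fields, pos = [], 0
    while pos < len(buf):
        marker, pos = read_vlq(buf, pos)
        field, wire = marker >> 3, marker & 0x7
        if wire == 0:    # VLQ integer
            value, pos = read_vlq(buf, pos)
        elif wire == 1:  # 8-byte little-endian double
            (value,) = struct.unpack_from("<d", buf, pos)
            pos += 8
        elif wire == 5:  # 4-byte little-endian float
            (value,) = struct.unpack_from("<f", buf, pos)
            pos += 4
        elif wire == 2:  # length-prefixed bytes: a string or a nested message
            size, pos = read_vlq(buf, pos)
            value = buf[pos:pos + size]
            pos += size
        else:
            raise ValueError(f"unsupported wire type {wire}")
        fields.append((field, wire, value))
    return fields

craft = bytes.fromhex("08e76e1001")    # rebuilt from the text: craft ID 14183, quantity 1
name = bytes.fromhex("1a03") + b"abc"  # hypothetical field 3 holding the string "abc"
print(read_fields(craft))
print(read_fields(name))
```

A nested structure such as a position vector arrives as a type 2 value, so calling `read_fields` again on its bytes yields the X, Y and Z floats.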

Now, knowing the approximate message format, you can write your own listening application for ease of filtering and analyzing messages, but this is a topic for the next article.

Upd: thanks to aml for the hint that the message structure described above is Protocol Buffers, and to Tatikoma for the link to a useful related article.

Source: https://habr.com/ru/post/451512/

