
Prefixes in the IA-32 Instruction Set

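Today I want to tell you about the prefixes in the Intel IA-32 instruction set in its 32-bit and 64-bit variants (also referred to as x86 and x86_64). But first, a brief reminder of the general structure of IA-32 instructions: optional prefixes, then the opcode, then the optional ModR/M and SIB bytes, a displacement, and an immediate value.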





A more detailed description of the instruction structure can be found in the article "Disassembler do it yourself" and, of course, in the Intel 64 and IA-32 Architectures Software Development Manuals. This article covers the IA-32 prefixes, the peculiarities of their use, and the trends in their development.

Single-byte prefixes


Single-byte prefixes have been part of the IA-32 instruction set practically since the very first Intel processors. They have already been covered on Habr, so I will not discuss them here.

Mandatory Prefixes


With the advent of the SSE extension, some of the single-byte prefixes, namely 0xf2, 0xf3, and 0x66, became in some cases a meaningful part of the opcode. These are the so-called mandatory prefixes. Examples of such instructions are given below.

Encoding         Instruction  Mandatory prefix
0x0f 0x10        MOVUPS       -
0xf2 0x0f 0x10   MOVSD        0xf2
0xf3 0x0f 0x10   MOVSS        0xf3
0x66 0x0f 0x10   MOVUPD       0x66


It is easy to see that the encodings of these instructions differ only in the prefix; they all share the opcode 0x0f 0x10. Their semantics, however, are different. For example, MOVSD copies 64 bits from one operand to another, while MOVUPD copies 128 bits.
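To illustrate how a decoder might treat the prefix as part of the opcode, here is a minimal Python sketch. The helper name `decode_0f10` and the lookup table are hypothetical and cover only the four encodings from the table above; a real decoder handles far more cases.

```python
# Opcode bytes shared by all four instructions in the table.
OPCODE = bytes([0x0F, 0x10])

# Mandatory prefix (None = no prefix) -> mnemonic, copied from the table.
MNEMONIC_BY_PREFIX = {
    None: "MOVUPS",
    0xF2: "MOVSD",
    0xF3: "MOVSS",
    0x66: "MOVUPD",
}


def decode_0f10(encoding: bytes) -> str:
    """Return the mnemonic for one of the four 0x0f 0x10 encodings.

    Hypothetical helper for illustration only.
    """
    if encoding == OPCODE:
        return MNEMONIC_BY_PREFIX[None]
    if len(encoding) == 3 and encoding[1:] == OPCODE:
        prefix = encoding[0]
        if prefix in MNEMONIC_BY_PREFIX:
            return MNEMONIC_BY_PREFIX[prefix]
    raise ValueError("unsupported encoding: " + encoding.hex())


print(decode_0f10(bytes([0xF2, 0x0F, 0x10])))
print(decode_0f10(bytes([0x0F, 0x10])))
```

The same two opcode bytes select four different instructions, so the prefix has to be looked at before the mnemonic can be chosen.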

REX prefix


At some point it became necessary to support a 64-bit address space and to increase the number of addressable registers. The AMD developers solved this problem by adding a prefix called REX. This prefix is also a single byte and has the form 0x4*. Its bits extend the existing fields encoded in the ModR/M byte and select the operand width. The figure shows an example of using the REX prefix to address registers.



Several peculiarities of this prefix are worth noting. The encoding 0x4* is treated as a prefix only in 64-bit mode; in all other modes it corresponds to variants of the INC/DEC instructions. Another interesting property is that the prefix must be located immediately before the opcode byte, otherwise it is ignored. If the REX prefix is used with an instruction that requires a mandatory prefix, REX must be placed between that mandatory prefix and the opcode byte.
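As a rough illustration of how the REX bits extend the ModR/M fields, here is a small Python sketch. The function names `decode_rex` and `extended_modrm` are hypothetical. The bit layout (0100WRXB) follows the Intel manuals, and the sketch assumes an instruction without a SIB byte.

```python
def decode_rex(byte: int) -> dict:
    """Split a REX prefix byte (0x40..0x4f) into its W, R, X, B bits."""
    if byte & 0xF0 != 0x40:
        raise ValueError(f"not a REX prefix: {byte:#04x}")
    return {
        "W": (byte >> 3) & 1,  # 1 selects a 64-bit operand size
        "R": (byte >> 2) & 1,  # extends the ModR/M reg field
        "X": (byte >> 1) & 1,  # extends the SIB index field (unused here)
        "B": byte & 1,         # extends the ModR/M rm field
    }


def extended_modrm(rex: int, modrm: int) -> tuple:
    """Return (mod, reg, rm) with reg and rm widened to 4 bits by REX.R and REX.B.

    Assumption: no SIB byte follows, so REX.B applies to the rm field.
    """
    bits = decode_rex(rex)
    mod = (modrm >> 6) & 3
    reg = ((modrm >> 3) & 7) | (bits["R"] << 3)
    rm = (modrm & 7) | (bits["B"] << 3)
    return mod, reg, rm


# 0x48 0x89 0xc8: reg stays 1 (RCX); 0x4c 0x89 0xc8: REX.R turns it into 9 (R9).
print(extended_modrm(0x48, 0xC8))
print(extended_modrm(0x4C, 0xC8))
```

With the extra bit from REX, each 3-bit register field can address 16 registers instead of 8.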

VEX prefix


With the introduction of the AVX extension, a new prefix called VEX appeared in the IA-32 instruction set. It is no longer a single byte. It consists of either two or three bytes, depending on the first byte of the prefix: 0xc5 or 0xc4, respectively.



The R, X, B, and W fields carry the same meaning as the corresponding REX prefix fields. The pp field provides functionality equivalent to the mandatory SIMD prefixes (for example, 0b01 = 0x66). The m-mmmm field can stand in for as many as two whole opcode bytes (for example, 0b00011 = 0x0f 0x3a). The L field defines the length of the vector: 0 means 128 bits, 1 means 256 bits.
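The field layout can be sketched in Python as follows. This is a simplified illustration, not a complete decoder: the name `decode_vex` and the result keys are hypothetical, the bit positions follow the Intel manuals, and the vvvv register field is deliberately left undecoded. One detail worth knowing is that, per the Intel manuals, VEX stores the R, X, and B bits inverted, so the sketch flips them back to the REX convention.

```python
# pp field -> equivalent mandatory prefix (None = no prefix).
PP_TO_PREFIX = {0b00: None, 0b01: 0x66, 0b10: 0xF3, 0b11: 0xF2}

# m-mmmm field -> opcode bytes it stands in for.
MMMMM_TO_OPCODE_BYTES = {
    0b00001: bytes([0x0F]),
    0b00010: bytes([0x0F, 0x38]),
    0b00011: bytes([0x0F, 0x3A]),
}


def decode_vex(data: bytes) -> dict:
    """Decode a 2-byte (0xc5) or 3-byte (0xc4) VEX prefix at the start of data."""
    if data[0] == 0xC5:
        # Two-byte form: only R is present; the 0x0f opcode map is implied.
        last = data[1]
        r = ((last >> 7) & 1) ^ 1
        x = b = w = 0
        mmmmm = 0b00001
        size = 2
    elif data[0] == 0xC4:
        # Three-byte form: R, X, B and m-mmmm in byte 1; W in byte 2.
        mid, last = data[1], data[2]
        r = ((mid >> 7) & 1) ^ 1
        x = ((mid >> 6) & 1) ^ 1
        b = ((mid >> 5) & 1) ^ 1
        mmmmm = mid & 0x1F
        w = (last >> 7) & 1
        size = 3
    else:
        raise ValueError(f"not a VEX prefix: {data[0]:#04x}")
    return {
        "size": size,
        "R": r, "X": x, "B": b, "W": w,
        "opcode_bytes": MMMMM_TO_OPCODE_BYTES.get(mmmmm),
        # Bits 6..3 of the last byte (vvvv) are not decoded in this sketch.
        "vector_bits": 256 if (last >> 2) & 1 else 128,
        "mandatory_prefix": PP_TO_PREFIX[last & 3],
    }


print(decode_vex(bytes([0xC5, 0xF8])))
print(decode_vex(bytes([0xC4, 0xE3, 0x7D])))
```

The second call decodes a prefix that packs the 0x66 mandatory prefix, the 0x0f 0x3a opcode bytes, and a 256-bit vector length into three bytes.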

Using the VEX prefix provides the following benefits:


It should be noted that the use of the VEX prefix together with some single-byte prefixes ( 0xf0 , 0x66 , 0xf2 , 0xf3 , REX ) is prohibited and results in a #UD exception.

EVEX Prefix


Not so long ago, Intel announced a new instruction set extension called AVX3 or AVX512. Along with this extension came a new prefix called EVEX. Its description can be found in the Intel Architecture Instruction Set Extensions Programming Reference.



It is an improved version of the VEX prefix, is 4 bytes long, and starts with the byte 0x62, which in all modes except 64-bit corresponds to the BOUND instruction, rarely used in modern programs.
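The leading bytes discussed in this article can be told apart with a small classifier. The Python sketch below is a simplification under stated assumptions: the function name `classify_leading_byte` and its return strings are hypothetical, and outside 64-bit mode it models only the two overlaps named in the text (0x4* with INC/DEC and 0x62 with BOUND). A real decoder also has to inspect the following byte in those modes.

```python
def classify_leading_byte(byte: int, mode64: bool = True) -> str:
    """Name the prefix family that a leading byte belongs to.

    Simplified sketch: overlaps of 0xc4/0xc5 with older instructions
    outside 64-bit mode are not modeled here.
    """
    if 0x40 <= byte <= 0x4F:
        # REX exists only in 64-bit mode; elsewhere these bytes are INC/DEC.
        return "REX" if mode64 else "INC/DEC"
    if byte == 0xC5:
        return "VEX (2-byte)"
    if byte == 0xC4:
        return "VEX (3-byte)"
    if byte == 0x62:
        # Following the text: 0x62 is BOUND outside 64-bit mode.
        return "EVEX" if mode64 else "BOUND"
    if byte in (0x66, 0xF2, 0xF3):
        return "single-byte prefix (possibly mandatory)"
    return "other"


for value in (0x48, 0xC5, 0xC4, 0x62, 0xF3):
    print(hex(value), classify_leading_byte(value))
```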

Here are some, in my opinion, interesting features of the EVEX prefix:


Conclusion


In conclusion, I would like to mention some of the reasons why the instruction set ended up so complex and, in places, illogical. The history of the Intel IA-32 instruction set begins in the 1970s, when nobody was even thinking about 64-bit modes. Besides Intel, AMD has made a significant contribution to the evolution of IA-32. Much effort has been spent on maintaining backward compatibility between different processor models. Many interesting facts about the development of the IA-32 architecture can be found in the article by A. Fog.

Thanks to Atakua for commenting on the drafts of this article.

P.S. All illustrations are from the Intel 64 and IA-32 Architectures Software Development Manuals.

Source: https://habr.com/ru/post/200598/

