What sets the kernel's user-space copy routines apart from a plain memcpy is that they correctly handle exceptions that occur during data transfer between different address spaces.
When the kernel writes to user memory (for example, with the copy_to_user function), the page of the user process that is being written to may be swapped out, or may not be accessible to the process at all. In the first case the correct response is to load the page and continue copying. In the second case the operation must be aborted and an error code returned to the user (for example, -EFAULT).
In either case the processor raises a page fault exception (#PF). At that moment the kernel saves the context of the current task and runs the corresponding handler, do_page_fault. Once the handler has dealt with the problem, the kernel restores the context of the interrupted task. Depending on the outcome of the handling, however, the return address may differ from the address of the instruction that caused the exception. In other words, the kernel provides a mechanism that lets a potentially "dangerous" instruction be paired with an address from which execution continues if an exception is raised while that instruction runs.
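The effect on the saved context can be modeled in a few lines of user-space C. This is only a toy sketch: toy_regs, toy_fixup and toy_handle_fault are hypothetical names that do not exist in the kernel, and the real do_page_fault path is considerably more involved.

```c
#include <stdbool.h>

/* Toy stand-in for the saved task context; only the instruction pointer matters here. */
struct toy_regs {
    unsigned long ip;
};

/* One "dangerous" instruction and the address where execution resumes if it faults. */
struct toy_fixup {
    unsigned long insn;
    unsigned long fixup;
};

/*
 * Hypothetical model of the handler's decision.
 * page_recoverable models the "page is in swap" case: the page is loaded and
 * the saved ip is left untouched, so the faulting instruction is retried.
 * Otherwise the saved ip is redirected to the fixup address, if one is registered.
 * Returns 0 if the task can be resumed, -1 if the fault cannot be handled.
 */
int toy_handle_fault(struct toy_regs *regs, const struct toy_fixup *fx,
                     bool page_recoverable)
{
    if (page_recoverable)
        return 0;               /* resume at the same instruction */

    if (regs->ip == fx->insn) {
        regs->ip = fx->fixup;   /* resume at the recovery code instead */
        return 0;
    }

    return -1;                  /* no recovery point registered */
}
```

In this model the caller never sees the fault itself; it only observes where execution continues.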
62 ENTRY(__put_user_4)
63         ENTER
64         mov TI_addr_limit(%_ASM_BX),%_ASM_BX
65         sub $3,%_ASM_BX
66         cmp %_ASM_BX,%_ASM_CX
67         jae bad_put_user
68         ASM_STAC
69 3:      movl %eax,(%_ASM_CX)            <-
70         xor %eax,%eax
71         EXIT
72 ENDPROC(__put_user_4)
...
89 bad_put_user:
90         CFI_STARTPROC
91         movl $-EFAULT,%eax
92         EXIT
...
98         _ASM_EXTABLE(3b,bad_put_user)
After the range check, the only instruction that actually writes the data is the movl instruction on line 69. This is where an exception can be expected: apart from the fact that the target address lies within the range of user-space addresses, nothing else is known about it. Note also the _ASM_EXTABLE macro, which is defined as follows:
# define _ASM_EXTABLE(from,to)                  \
        .pushsection "__ex_table","a" ;         \
        .balign 8 ;                             \
        .long (from) - . ;                      \
        .long (to) - . ;                        \
        .popsection
The macro places two values into the __ex_table section, from and to. As is easy to see, they correspond to the address of the "suspicious" instruction on line 69 and to the address of the code that runs after the exception has been handled, namely bad_put_user. Adding an entry to the __ex_table table makes the point of failure manageable, because the kernel consults this table when handling exceptions.
struct exception_table_entry {
        int insn, fixup;
};
Each entry of the table has this form, and its two fields are exactly the values emitted by the _ASM_EXTABLE macro. The first field describes the instruction, the second the code to which control is transferred if an exception occurs. Each time a page fault occurs, the Linux kernel checks, among other things, whether the address of the instruction that caused the exception is listed in the kernel's __ex_table table or in the table of one of the loaded modules. If such an entry is found, the corresponding action is taken. Otherwise the kernel falls back to its standard logic for completing exception handling.
For a module, the exception table is reachable through the pointer THIS_MODULE->extable, and the number of its entries is stored in THIS_MODULE->num_exentries. The THIS_MODULE macro itself yields a pointer to the module's descriptor structure:
struct module
{
        ...
        /* Exception table */
        unsigned int num_exentries;
        struct exception_table_entry *extable;
        ...
};
/* Given an address, look for it in the exception tables. */
const struct exception_table_entry *search_exception_tables(unsigned long addr)
{
        const struct exception_table_entry *e;

        e = search_extable(__start___ex_table, __stop___ex_table-1, addr);
        if (!e)
                e = search_module_extables(addr);
        return e;
}
As the code shows, the search starts in the kernel's own __ex_table and only then, if nothing is found, continues through the exception tables of the modules. If no entry matches the instruction address, the function returns NULL. Otherwise it returns a pointer to the matching entry of the exception table.
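To make the lookup order concrete, here is a minimal user-space sketch. It assumes entries that hold absolute addresses (the older layout) and tables sorted by instruction address, so that a binary search can be used. The names toy_entry, toy_table, toy_search_table and toy_search_all are illustrative only and are not kernel APIs.

```c
#include <stddef.h>

/* Simplified entry holding absolute addresses. */
struct toy_entry {
    unsigned long insn;
    unsigned long fixup;
};

/* A table is an array of entries sorted by insn in ascending order. */
struct toy_table {
    const struct toy_entry *entries;
    size_t count;
};

/* Binary search for an entry whose insn equals addr; NULL if there is none. */
const struct toy_entry *toy_search_table(const struct toy_table *t,
                                         unsigned long addr)
{
    size_t lo = 0, hi = t->count;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (t->entries[mid].insn == addr)
            return &t->entries[mid];
        if (t->entries[mid].insn < addr)
            lo = mid + 1;
        else
            hi = mid;
    }
    return NULL;
}

/* Look in the "kernel" table first, then in each "module" table in turn. */
const struct toy_entry *toy_search_all(const struct toy_table *kernel,
                                       const struct toy_table *modules,
                                       size_t nmodules, unsigned long addr)
{
    const struct toy_entry *e = toy_search_table(kernel, addr);

    for (size_t i = 0; e == NULL && i < nmodules; i++)
        e = toy_search_table(&modules[i], addr);
    return e;
}
```

A NULL result here plays the same role as in search_exception_tables: the caller has no recovery point for the faulting address.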
static void raise_page_fault(void)
{
        debug(" %s enter\n", __func__);

        ((int *)0)[0] = 0xdeadbeef;

        debug(" %s leave\n", __func__);
}
The following function fills in an exception_table_entry element of the module's extable:

static int fixup_page_fault(struct exception_table_entry * entry)
{
        ud_t ud;

        ud_initialize(&ud, BITS_PER_LONG, \
                      UD_VENDOR_ANY, (void *)raise_page_fault, 128);

        /* scan the function body until the first ret */
        while (ud_disassemble(&ud) && ud.mnemonic != UD_Iret) {
                if (ud.mnemonic == UD_Imov && \
                    ud.operand[0].type == UD_OP_MEM && ud.operand[1].type == UD_OP_IMM)
                {
                        unsigned long address = \
                                (unsigned long)raise_page_fault + ud_insn_off(&ud);

                        /* on a fault, resume right after the faulting mov */
                        extable_make_insn(entry, address);
                        extable_make_fixup(entry, address + ud_insn_len(&ud));

                        return 0;
                }
        }

        return -EINVAL;
}
The disassembler is first initialized at the start of the function under study (raise_page_fault). Instructions are then scanned up to a given depth. The instruction being sought, the one that the statement ((int *)0)[0] = 0xdeadbeef; compiles to, is an ordinary movl $0xdeadbeef, 0. In the disassembler's operand numbering the destination comes first, so its first operand is of type UD_OP_MEM and its second is of type UD_OP_IMM. As soon as the address of the instruction is found, a table entry is built with the following helper functions:
static void extable_make_insn(struct exception_table_entry * entry, unsigned long addr)
{
#if LINUX_VERSION_CODE >= KERNEL_VERSION(3,5,0)
        entry->insn = (unsigned int)((addr - (unsigned long)&entry->insn));
#else
        entry->insn = addr;
#endif
}

static void extable_make_fixup(struct exception_table_entry * entry, unsigned long addr)
{
#if LINUX_VERSION_CODE >= KERNEL_VERSION(3,5,0)
        entry->fixup = (unsigned int)((addr - (unsigned long)&entry->fixup));
#else
        entry->fixup = addr;
#endif
}
The version check is needed because kernel 3.5 changed the layout of struct exception_table_entry: on 64-bit architectures the size of its insn and fixup fields was reduced. This cut the amount of memory needed to store the addresses, but it also changed slightly how they are computed. Starting with kernel 3.5, insn and fixup hold 32-bit values that are the offsets of the target addresses relative to the fields themselves. For the curious, the commit that spoiled everything is 706276543b699d80f546e45f8b12574e7b18d952.
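The relative encoding can be checked with a small round trip in user-space C. The sketch assumes that both targets lie within 2 GiB of the entry, so that the offsets fit in 32 bits. The names rel_entry, rel_encode, rel_insn_addr and rel_fixup_addr are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

/* Entry in the newer layout: two 32-bit offsets instead of two full addresses. */
struct rel_entry {
    int32_t insn;
    int32_t fixup;
};

/* Store each target as its distance from the field that holds it. */
void rel_encode(struct rel_entry *e, uintptr_t insn_addr, uintptr_t fixup_addr)
{
    e->insn  = (int32_t)(intptr_t)(insn_addr  - (uintptr_t)&e->insn);
    e->fixup = (int32_t)(intptr_t)(fixup_addr - (uintptr_t)&e->fixup);
}

/* Recover the absolute address: field address plus the sign-extended offset. */
uintptr_t rel_insn_addr(const struct rel_entry *e)
{
    return (uintptr_t)&e->insn + (uintptr_t)(intptr_t)e->insn;
}

uintptr_t rel_fixup_addr(const struct rel_entry *e)
{
    return (uintptr_t)&e->fixup + (uintptr_t)(intptr_t)e->fixup;
}
```

Because each offset is relative to the field that stores it, an entry copied to another location without adjustment decodes to different addresses.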
exables
module. As a result, abnormal termination was avoided and execution continued with the instruction that follows the faulting one.
struct {
        const char * name;
        int (* fixup)(struct exception_table_entry *);
        void (* raise)(void);
} exceptions[] = {
        {
                .name   = "0x00 - div0 error (#DE)",
                .fixup  = fixup_div0_error,
                .raise  = raise_div0_error,
        },
        {
                .name   = "0x06 - undefined opcode (#UD)",
                .fixup  = fixup_undefined_opcode,
                .raise  = raise_undefined_opcode,
        },
        {
                /* #PF is vector 14 decimal, i.e. 0x0E */
                .name   = "0x0E - page fault (#PF)",
                .fixup  = fixup_page_fault,
                .raise  = raise_page_fault,
        },
};
Source: https://habr.com/ru/post/196952/