The process crashed with SIGILL in completely random places. Nothing could reasonably explain it, and the faulting addresses held perfectly valid processor instructions. This immediately made us suspect that the instruction cache was not being flushed properly.
Yet we were calling __clear_cache correctly. This made us look at how other virtual machines and compilers flush the cache on ARM64, and we found a list of errata for the Cortex-A53. ARM's descriptions of these problems are vague and hard to understand, so we tried to work around them anyway, but that did not help.
A bug in the libc implementation? Nice try. Faulty hardware? We reproduced it on several devices. Bad luck or karma? Yes!
SIGILL always occurred at an address whose low byte was in the range 0x40-0x7f or 0xc0-0xff. We therefore formatted our memory dumps so that it was easier to check what the allocator was doing:
    $ grep SIGILL *.log
    custom_01.log:E/mono (13964): SIGILL at ip=0x0000007f4f15e8d0
    custom_02.log:E/mono (13088): SIGILL at ip=0x0000007f8ff76cc0
    custom_03.log:E/mono (12824): SIGILL at ip=0x0000007f68e93c70
    custom_04.log:E/mono (12876): SIGILL at ip=0x0000007f4b3d55f0
    custom_05.log:E/mono (13008): SIGILL at ip=0x0000007f8df1e8d0
    custom_06.log:E/mono (14093): SIGILL at ip=0x0000007f6c21edf0
    [...]
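The range observation can be checked mechanically. A low byte in 0x40-0x7f or 0xc0-0xff is the same as saying that the address lies in the upper 64 bytes of its 128-byte-aligned block. The sketch below tests that against the six addresses from the log; the helper names `in_upper_half` and `count_upper_half` are made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Faulting addresses copied from the grep output above. */
static const uint64_t crash_ips[] = {
    0x0000007f4f15e8d0ULL, 0x0000007f8ff76cc0ULL, 0x0000007f68e93c70ULL,
    0x0000007f4b3d55f0ULL, 0x0000007f8df1e8d0ULL, 0x0000007f6c21edf0ULL,
};

/* True if ip lies in the upper half of its block_size-aligned block.
 * block_size must be a power of two (hypothetical helper). */
bool in_upper_half(uint64_t ip, uint64_t block_size)
{
    assert(block_size >= 2 && (block_size & (block_size - 1)) == 0);
    return (ip & (block_size - 1)) >= block_size / 2;
}

/* Count how many logged addresses fall in the upper half of their block. */
size_t count_upper_half(uint64_t block_size)
{
    size_t n = 0;
    for (size_t i = 0; i < sizeof crash_ips / sizeof crash_ips[0]; i++)
        if (in_upper_half(crash_ips[i], block_size))
            n++;
    return n;
}
```

With a block size of 128, `count_upper_half(128)` returns 6: every logged crash sits in the second half of a 128-byte block.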
This is roughly how libgcc clears the cache on arm64:
    void __clear_cache (char *address, size_t size)
    {
        static int cache_line_size = 0;
        if (!cache_line_size)
            cache_line_size = get_current_cpu_cache_line_size ();
        for (int i = 0; i < size; i += cache_line_size)
            flush_cache_line (address + i);
    }
get_current_cpu_cache_line_size stands for a processor instruction that returns the cache line size (on arm64, a read of the CTR_EL0 register), and flush_cache_line flushes the cache line at the given address.
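For illustration, here is a minimal sketch of what a helper like get_current_cpu_cache_line_size might do on arm64. It assumes the line size comes from the CTR_EL0 register, whose IminLine (bits 3:0) and DminLine (bits 19:16) fields hold log2 of the line size in 4-byte words, and that the OS lets user space read that register. The function names are invented for this example and are not libgcc's.

```c
#include <assert.h>
#include <stdint.h>

/* Decode a 4-bit CTR_EL0 line field: log2 of the line size in 4-byte words. */
unsigned line_bytes_from_field(unsigned field)
{
    assert(field <= 0xf);
    return 4u << field;
}

/* Smallest instruction cache line, from IminLine (bits 3:0). */
unsigned icache_line_bytes(uint64_t ctr)
{
    return line_bytes_from_field((unsigned)(ctr & 0xf));
}

/* Smallest data cache line, from DminLine (bits 19:16). */
unsigned dcache_line_bytes(uint64_t ctr)
{
    return line_bytes_from_field((unsigned)((ctr >> 16) & 0xf));
}

#if defined(__aarch64__)
/* Read CTR_EL0 on the core this thread happens to be running on.
 * Assumption: the kernel permits this read from user space. */
uint64_t read_ctr_el0(void)
{
    uint64_t ctr;
    __asm__ volatile("mrs %0, ctr_el0" : "=r"(ctr));
    return ctr;
}
#endif
```

The decoding is kept separate from the register read so that it can be exercised on any machine. A field value of 4 decodes to 64 bytes and a value of 5 to 128 bytes. The value read is that of the current core only, which is the detail that matters for the rest of the story.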
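So __clear_cache could be called first on a big core with 128-byte instruction cache lines, cache that value, and later run on one of the LITTLE cores, where the lines are shorter, skipping every other line while flushing. A 64-byte line on the LITTLE cores would match the pattern exactly: the skipped halves are 0x40-0x7f and 0xc0-0xff. It really is that simple. We removed the caching of the line size and everything worked.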
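The effect is easy to reproduce on paper. The sketch below simulates the flush loop over a small buffer and records which real cache lines get touched; the 64-byte line size for the LITTLE cores is an assumption taken from the crash pattern, and `simulate_flush` is a hypothetical helper, not part of any library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { MAX_LINES = 64 };

/* Walk [0, size) in steps of `stride`, as the flush loop does, and mark
 * every real cache line (of `real_line` bytes) that gets touched.
 * Returns the number of lines that were never flushed. */
size_t simulate_flush(size_t size, size_t stride, size_t real_line,
                      bool flushed[MAX_LINES])
{
    assert(stride > 0 && real_line > 0);
    size_t lines = (size + real_line - 1) / real_line;
    assert(lines <= MAX_LINES);

    for (size_t i = 0; i < lines; i++)
        flushed[i] = false;
    for (size_t off = 0; off < size; off += stride)
        flushed[off / real_line] = true;

    size_t missed = 0;
    for (size_t i = 0; i < lines; i++)
        if (!flushed[i])
            missed++;
    return missed;
}
```

For a 256-byte buffer, a stride of 128 on 64-byte lines leaves lines 1 and 3 untouched, that is offsets 0x40-0x7f and 0xc0-0xff, while a stride of 64 leaves nothing behind.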
That is not the whole story, though. A thread can start __clear_cache on one core and be migrated while it runs, continuing on another core with a cache line size that may no longer be correct. We therefore need to determine the global minimum cache line size across all cores. Here is our fix for Mono: Pull Request. Other projects have already borrowed our fix: Dolphin and PPSSPP.
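One way to approximate a global minimum is to keep the smallest line size seen so far and use it as the flush stride. The following is only a sketch of that idea, not the code from the Mono pull request; `observe_line_size`, `clear_cache_sketch` and the two callback types are hypothetical names, with the callbacks standing in for the per-core query and the line flush.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical callbacks: line size of the current core, and a line flush. */
typedef size_t (*line_size_fn)(void);
typedef void (*flush_line_fn)(char *addr);

/* Smallest line size observed on any core so far; 0 means "none yet". */
static _Atomic size_t min_line_size = 0;

/* Record `current` and return the smallest line size seen so far. */
size_t observe_line_size(size_t current)
{
    assert(current > 0);
    size_t seen = atomic_load(&min_line_size);
    /* Only shrink the stored value; on CAS failure `seen` is reloaded. */
    while ((seen == 0 || current < seen) &&
           !atomic_compare_exchange_weak(&min_line_size, &seen, current)) {
        /* retry with the refreshed value of `seen` */
    }
    return (seen == 0 || current < seen) ? current : seen;
}

/* Flush [address, address + size) using the smallest stride known so far. */
void clear_cache_sketch(char *address, size_t size,
                        line_size_fn query, flush_line_fn flush)
{
    size_t stride = observe_line_size(query());
    for (size_t i = 0; i < size; i += stride)
        flush(address + i);
}
```

Too small a stride only costs redundant flushes, whereas too large a stride skips lines, so erring toward the minimum is the safe direction. Note that this sketch only converges once the smaller line size has actually been observed; a stricter variant would probe every core up front.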
Source: https://habr.com/ru/post/320342/