intel – Page 2 – Tarik Billa

Intel HAXM on macOS High Sierra (10.13)

August 26, 2023 by Tarik

The command-line installation doesn’t work and fails with an unsupported macOS version error. Installing through IntelHAXM_6.2.1.mpkg works, but the kext is not loaded because of the “Approved Kernel Extension Loading” changes. You will need to allow the extensions from Intel and restart your Mac, then launch the emulator from inside Android Studio. To … Read more

Does my AMD-based machine use little endian or big endian?

August 26, 2023 by Tarik

All x86 and x86-64 machines (x86-64 is just an extension of x86) are little-endian. You can confirm it with something like this:

    #include <stdio.h>

    int main() {
        int a = 0x12345678;
        /* Look at the byte stored at the lowest address of a. */
        unsigned char *c = (unsigned char*)(&a);
        if (*c == 0x78) {
            /* Least significant byte comes first. */
            printf("little-endian\n");
        } else {
            printf("big-endian\n");
        }
        return 0;
    }

Do Intel and AMD processors have the same assembler?

August 22, 2023 by Tarik

AMD and Intel processors(*) have a large set of instructions in common, so it is possible for a compiler or assembler to emit binary code which runs “the same” on both. However, different processor families, even from one manufacturer, have their own sets of instructions, usually referred to as “extensions”. Ignoring the x87 … Read more

What does Intel mean by “retired”?

August 2, 2023 by Tarik

In this context, “retired” means that the instruction (micro-operation, μop) leaves the “Retirement Unit”. In an out-of-order CPU pipeline, this is the point at which the instruction has finally executed and its results are correct and visible in the architectural state, as if it had executed in order. In a performance context, this is the number you should check to compute how many … Read more

Intel x86 Opcode Reference?

July 28, 2023 by Tarik

Check this very complete table of x86 opcodes on x86asm.net. Just CTRL+F and you’re done! Be sure to read the correct line, though, as C8, for example, may appear in several locations.

Intel SSE and AVX Examples and Tutorials [closed]

July 27, 2023 by Tarik

For the visually inclined SIMD programmer, Stefano Tommesani’s site is the best introduction to x86 SIMD programming. http://www.tommesani.com/index.php/simd/46-sse-arithmetic.html The diagrams are only provided for MMX and SSE2, but once a learner gets proficient with SSE2, it is relatively easy to move on and read the formal specifications. Intel IA-32 Instructions beginning with A to M … Read more

C code loop performance

June 9, 2023 by Tarik

I noticed in the comments that: The loop takes 5 cycles to execute. It’s “supposed” to take 4 cycles (since there are 4 adds and 4 multiplies). However, your assembly shows 5 SSE movssl instructions. According to Agner Fog’s tables all floating-point SSE move instructions are at least 1 inst/cycle reciprocal throughput for Nehalem. Since you … Read more

Why use _mm_malloc? (as opposed to _aligned_malloc, aligned_alloc, or posix_memalign)

June 3, 2023 by Tarik

Intel compilers support POSIX (Linux) and non-POSIX (Windows) operating systems, hence cannot rely upon either the POSIX or the Windows function. Thus, a compiler-specific but OS-agnostic solution was chosen. C11 is a great solution but Microsoft doesn’t even support C99 yet, so who knows if they will ever support C11. Update: Unlike the C11/POSIX/Windows allocation … Read more

What is a store buffer?

May 13, 2023 by Tarik

An invalidate queue is more like a store buffer, but it’s part of the memory system, not the CPU. Basically it is a queue that keeps track of invalidations and ensures that they complete properly so that a cache can take ownership of a cache line so it can then write that line. A load … Read more

How exactly do partial registers on Haswell/Skylake perform? Writing AL seems to have a false dependency on RAX, and AH is inconsistent

May 8, 2023 by Tarik

Other answers welcome to address Sandybridge and IvyBridge in more detail. I don’t have access to that hardware. I haven’t found any partial-reg behaviour differences between HSW and SKL. On Haswell and Skylake, everything I’ve tested so far supports this model: AL is never renamed separately from RAX (or r15b from r15). So if you … Read more