MOVing between two memory addresses
Your suspicion is correct: you can’t MOV directly from memory to memory. Load the value into a register first, then store it to the destination; any general-purpose register will do. Remember to PUSH the register first if you are not sure what’s inside it, and to POP it once you are done.
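As a minimal sketch of the load-then-store idea (not taken from the original answer), the C function below copies one 32-bit value between two memory locations. On x86 with GCC or Clang it uses inline assembly so the two steps are explicit; elsewhere it falls back to plain C. The name `copy_word` is made up for illustration. The compiler picks the scratch register here, so no PUSH/POP is needed; in hand-written assembly you would save and restore it yourself if its contents matter.

```c
#include <stdint.h>

/* Copy one 32-bit word from *src to *dst.
 * MOV has no memory-to-memory form, so the value goes through a register. */
void copy_word(uint32_t *dst, const uint32_t *src)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
    uint32_t tmp; /* the compiler assigns a free general-purpose register */
    __asm__ volatile(
        "movl %[s], %[t]\n\t"   /* load:  register <- memory (AT&T syntax) */
        "movl %[t], %[d]"       /* store: memory   <- register */
        : [t] "=&r"(tmp), [d] "=m"(*dst)
        : [s] "m"(*src));
#else
    *dst = *src; /* portable fallback; the compiler emits a load and a store */
#endif
}
```

The inline assembly assumes the default AT&T operand order (source first); it would need adjusting if the file were built with `-masm=intel`.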
The primary example is the Intel x86 architecture. The Intel 8086 was, internally, a 16-bit processor: all of its registers were 16 bits wide. However, the address bus was 20 bits wide (1 MiB of address space). This meant that you couldn’t hold an entire address in a register, limiting you to the first 64 KiB. Intel’s solution … Read more
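The excerpt breaks off before describing the solution. Assuming it refers to the 8086's segment:offset addressing, the arithmetic can be sketched in C as follows; `phys_addr` is an illustrative name, not something from the original answer.

```c
#include <stdint.h>

/* 8086 real-mode address formation (assumed to be the "solution" the
 * excerpt alludes to): a 16-bit segment is shifted left by 4 bits and
 * added to a 16-bit offset, giving a 20-bit physical address. */
uint32_t phys_addr(uint16_t segment, uint16_t offset)
{
    uint32_t linear = ((uint32_t)segment << 4) + offset;
    return linear & 0xFFFFFu; /* only 20 address lines, so the sum wraps */
}
```

For example, `phys_addr(0x1234, 0x5678)` yields `0x179B8`, and different segment:offset pairs can name the same byte.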
The short answer: no, it is not. GCC does a metric ton of non-trivial optimizations, and one of them is guessing branch probabilities from the control flow graph. According to the GCC manual: -fno-guess-branch-probability Do not guess branch probabilities using heuristics. GCC uses heuristics to guess branch probabilities if they are not provided by profiling feedback … Read more
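For contrast with the heuristics, GCC also lets you state a branch probability by hand through `__builtin_expect`. The sketch below wraps it in two macros; the names `LIKELY`, `UNLIKELY`, and `count_negative` are illustrative only, and whether such hints help at all depends on the code and the compiler version.

```c
#include <stddef.h>

#if defined(__GNUC__) || defined(__clang__)
#define LIKELY(x)   __builtin_expect(!!(x), 1)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define LIKELY(x)   (x)   /* no hint available: plain condition */
#define UNLIKELY(x) (x)
#endif

/* Count negative values, hinting that negatives are expected to be rare. */
size_t count_negative(const int *values, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; ++i) {
        if (UNLIKELY(values[i] < 0))
            ++count;
    }
    return count;
}
```

The hint only affects code layout and optimization decisions; the result of the function is the same with or without it.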
From the book Linkers and Loaders: On 386 systems, the text base address is 0x08048000, which permits a reasonably large stack below the text while still staying above address 0x08000000, permitting most programs to use a single second-level page table. (Recall that on the 386, each second-level table maps 0x00400000 addresses.)
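The arithmetic behind the quoted passage can be checked with a short C sketch. It assumes the classic 386 two-level paging layout (4 KiB pages, 1024 entries per table), which is consistent with the 0x00400000 figure in the quote; the function name is illustrative.

```c
#include <stdint.h>

/* Each second-level page table covers 1024 entries * 4 KiB = 0x00400000 bytes. */
#define PAGE_TABLE_SPAN 0x00400000u

/* Index of the second-level page table that maps a 32-bit address
 * (equivalently, the top 10 bits of the address). */
uint32_t page_table_index(uint32_t addr)
{
    return addr / PAGE_TABLE_SPAN;
}
```

`page_table_index(0x08048000)` and `page_table_index(0x08000000)` both give 32, so the text and the 0x48000 bytes (288 KiB) of stack room below it fall inside the same 4 MiB region, which ends at 0x08400000.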
Newer microarchitectures have shifted the odds towards gather instructions. On an Intel Xeon Gold 6138 CPU @ 2.00 GHz with Skylake microarchitecture, we get for your benchmark: 9.383e+09 8.86e+08 2.777e+09 6.915e+09 7.793e+09 8.335e+09 5.386e+09 4.92e+08 6.649e+09 1.421e+09 2.362e+09 2.7e+07 8.69e+09 5.9e+07 7.763e+09 3.926e+09 5.4e+08 3.426e+09 9.172e+09 5.736e+09 9.383e+09 8.86e+08 2.777e+09 6.915e+09 7.793e+09 8.335e+09 5.386e+09 4.92e+08 … Read more
Excerpt from the C99 standard, normative Annex F (the C++ standard does not explicitly mention this annex, though it includes all affected functions unchanged by reference; also, the types have to match for compatibility): IEC 60559 floating-point arithmetic F.1 Introduction 1 This annex specifies C language support for the IEC 60559 floating-point standard. The IEC … Read more
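In practice, a C program can ask whether its implementation claims Annex F conformance by testing the `__STDC_IEC_559__` macro, which the annex ties to its requirements. A minimal sketch (the function name is illustrative):

```c
/* Returns 1 if the implementation defines __STDC_IEC_559__,
 * i.e. claims conformance to Annex F, and 0 otherwise. */
int claims_iec_60559(void)
{
#ifdef __STDC_IEC_559__
    return 1;
#else
    return 0;
#endif
}
```

The result depends on the compiler and its flags, so treat it as a claim made by the implementation rather than a guarantee about every operation.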
“If I set this value to x86 does that mean I cannot run that project on an x64 machine?” No, 32-bit applications (x86) run just fine on 64-bit Windows (x64). All 64-bit versions of Windows include a 32-bit compatibility layer called Windows on Windows 64 (WOW64). This is usually what you want, in fact, as … Read more
I have been measuring memory bandwidth on Intel processors with various operations, and one of them is memcpy. I have done this on Core2, Ivy Bridge, and Haswell. I did most of my tests using C/C++ with intrinsics (see the code below – but I’m currently rewriting my tests in assembly). To write your … Read more
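The author's benchmark code is not part of this excerpt. As a deliberately simple stand-in, the portable C sketch below times repeated `memcpy` calls with `clock()`; the function name and the choice of timer are assumptions made for illustration, and a serious measurement would need a finer timer, pinned threads, and buffers much larger than the caches.

```c
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Rough memcpy throughput in bytes copied per second of CPU time.
 * Returns 0.0 if allocation fails or the run is too short to time. */
double memcpy_bandwidth(size_t bytes, int repetitions)
{
    unsigned char *src = malloc(bytes);
    unsigned char *dst = malloc(bytes);
    double result = 0.0;

    if (src != NULL && dst != NULL && bytes > 0) {
        /* Reading dst into a volatile discourages the compiler
         * from dropping the copies as dead code. */
        volatile unsigned char sink = 0;

        memset(src, 0xA5, bytes); /* touch the pages before timing */
        memset(dst, 0, bytes);

        clock_t start = clock();
        for (int i = 0; i < repetitions; ++i) {
            memcpy(dst, src, bytes);
            sink ^= dst[(size_t)i % bytes];
        }
        clock_t end = clock();

        double seconds = (double)(end - start) / CLOCKS_PER_SEC;
        if (seconds > 0.0)
            result = (double)bytes * repetitions / seconds;
        (void)sink;
    }
    free(src);
    free(dst);
    return result;
}
```

Note that this counts bytes copied; since each byte is both read and written, the memory traffic is roughly twice that figure.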
Yes, GCC generally avoids writing to partial registers, unless optimizing for size (-Os) instead of purely for speed (-O3). Some cases require writing at least the 32-bit register for correctness, so a better example would be something like: char foo(char *p) { return *p; } compiles to movzx eax, byte ptr [rdi] instead of mov al, … Read more
I did some investigation with Linux perf to help answer this on my Skylake i7-6700HQ box, and Haswell results have been kindly provided by another user. The analysis below applies to Skylake, but it is followed by a comparison versus Haswell. Other architectures may vary, and to help sort it all out I welcome additional … Read more