Performance difference between Windows and Linux using Intel compiler: looking at the assembly

In both cases the arguments and results are passed only in registers, as per the respective calling conventions on Windows and GNU/Linux. In the GNU/Linux variant, xmm1 is used for accumulating the sum. Since it’s a call-clobbered (a.k.a. caller-saved) register, it’s stored (and restored) in the stack frame of the caller on each call. … Read more

What are the names of the new X86_64 processors registers?

The MSDN documentation includes information about the x64 registers. x64 extends x86’s 8 general-purpose registers to be 64-bit, and adds 8 new 64-bit registers. The 64-bit registers have names beginning with “r”, so for example the 64-bit extension of eax is called rax. The new registers are named r8 through r15. The lower 32 bits, … Read more

Why is memcmp(a, b, 4) only sometimes optimized to a uint32 comparison?

If you generate code for a little-endian platform, optimizing a four-byte memcmp used for inequality (an ordering test such as < 0 or > 0) into a single DWORD comparison is invalid. When memcmp compares individual bytes it goes from low-addressed bytes to high-addressed bytes, regardless of the platform. In order for memcmp to return zero all four bytes must be identical. Hence, the order of comparison … Read more
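A minimal sketch of the distinction, assuming C++ and three hypothetical helper names (compare_as_u32, equal4, sign_of_memcmp4) that are not part of the original answer. It compares the same four bytes once as a native 32-bit integer and once with memcmp:

```cpp
#include <cstdint>
#include <cstring>

// Three-way comparison of 4 bytes loaded as one native-endian 32-bit integer.
// Returns -1, 0, or +1.
int compare_as_u32(const void* a, const void* b) {
    std::uint32_t x, y;
    std::memcpy(&x, a, sizeof x);  // memcpy sidesteps alignment and aliasing issues
    std::memcpy(&y, b, sizeof y);
    return (x > y) - (x < y);
}

// Equality only: every byte must match, so byte order does not matter here.
bool equal4(const void* a, const void* b) {
    return compare_as_u32(a, b) == 0;
}

// Sign (-1, 0, +1) of memcmp over 4 bytes; memcmp always decides on the
// lowest-addressed byte that differs.
int sign_of_memcmp4(const void* a, const void* b) {
    const int r = std::memcmp(a, b, 4);
    return (r > 0) - (r < 0);
}
```

For the byte sequences {1, 0, 0, 2} and {2, 0, 0, 1}, sign_of_memcmp4 reports -1 on any platform because the first bytes differ. On a little-endian machine the same bytes load as 0x02000001 and 0x01000002, so compare_as_u32 reports +1. The two only agree for the equality test, which is why that is the case a compiler can safely collapse into one comparison.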

Why is the construction of std::optional more expensive than a std::pair?

libstdc++ apparently does not implement P0602 “variant and optional should propagate copy/move triviality”. You can verify this with: static_assert(std::is_trivially_copyable_v<std::optional<int>>); which fails for libstdc++, and passes for libc++ and the MSVC standard library (which really needs a proper name so we don’t have to call it either “The MSVC implementation of the C++ standard library” or … Read more
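A small sketch that reports the trait instead of asserting it, so it compiles against any standard library whatever the answer is. It assumes C++17; PlainMaybeInt and the function name are hypothetical, introduced here only for comparison:

```cpp
#include <optional>
#include <type_traits>
#include <utility>

// Hypothetical hand-rolled stand-in for optional<int>: an aggregate with no
// user-provided special member functions, so it is trivially copyable.
struct PlainMaybeInt {
    int value;
    bool engaged;
};

// True when the standard library in use propagates copy/move triviality
// to std::optional<int> (the behaviour P0602 asks for).
constexpr bool optional_int_is_trivially_copyable() {
    return std::is_trivially_copyable_v<std::optional<int>>;
}

// std::pair declares its copy constructor as defaulted, so copy-constructing
// a pair of ints is trivial.
constexpr bool pair_copy_construction_is_trivial =
    std::is_trivially_copy_constructible_v<std::pair<int, int>>;
```

Printing optional_int_is_trivially_copyable() from main under different compilers and library versions shows which implementations pass the check; the result depends on the library version, so no particular output is assumed here.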

What is the purpose of the “PAUSE” instruction in x86?

Just imagine how the processor would execute a typical spin-wait loop:

    1 Spin_Lock:
    2     CMP lockvar, 0   ; Check if lock is free
    3     JE Get_Lock
    4     JMP Spin_Lock
    5 Get_Lock:

After a few iterations the branch predictor will predict that the conditional branch (3) will never be taken and the pipeline will fill with … Read more
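The same spin-wait shape can be sketched in C++ with the PAUSE hint placed inside the waiting loop. This is an illustration under stated assumptions, not the code from the original answer: SpinLock and cpu_relax are hypothetical names, _mm_pause is the compiler intrinsic that emits PAUSE on x86, and on other architectures the sketch falls back to a no-op.

```cpp
#include <atomic>

#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
#include <immintrin.h>
// Emits the x86 PAUSE instruction as a spin-wait hint.
inline void cpu_relax() { _mm_pause(); }
#else
// Assumption: no equivalent hint is used on other targets in this sketch.
inline void cpu_relax() {}
#endif

class SpinLock {
public:
    // Attempts to take the lock once; returns true on success.
    bool try_lock() {
        return !locked_.exchange(true, std::memory_order_acquire);
    }

    void lock() {
        while (!try_lock()) {
            // Read-only wait, the counterpart of the CMP/JMP loop,
            // with PAUSE issued on every iteration.
            while (locked_.load(std::memory_order_relaxed)) {
                cpu_relax();
            }
        }
    }

    void unlock() {
        locked_.store(false, std::memory_order_release);
    }

private:
    std::atomic<bool> locked_{false};
};
```

Waiting on a plain load and only retrying the exchange once the lock looks free keeps the cache line shared between waiters; the PAUSE hint sits in that inner loop because that is where the processor would otherwise spin at full speed.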
