Performance optimisations of x86-64 assembly – Alignment and branch prediction

Alignment optimisations

1. Use .p2align <abs-expr>, <abs-expr>, <abs-expr> instead of .align. Its three parameters give fine-grained control:
   param1: the boundary to align to, given as a power of two (4 means 16 bytes).
   param2: the value to fill the padding with (zeroes or NOPs).
   param3: the maximum padding; do NOT align if the padding would exceed this number of bytes.
2. Align the start of a frequently used code …
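For code written in C rather than assembly, a similar boundary alignment can be requested through the GCC/Clang `aligned` function attribute, and the compiler then emits the alignment directive itself. The following is a minimal sketch under that assumption. The function name `hot_add` and the 32-byte boundary are illustrative choices, not taken from the post, and the attribute offers no equivalent of the maximum-padding parameter.

```c
#include <stdint.h>

/* Hypothetical hot function. aligned(32) asks the compiler to place the
   entry point on a 32-byte boundary (the same boundary as ".p2align 5").
   noinline keeps it a real, separately addressable function. */
__attribute__((aligned(32), noinline))
int hot_add(int a, int b)
{
    return a + b;
}

/* Returns 1 when the entry point of hot_add sits on a 32-byte boundary. */
int hot_add_is_aligned(void)
{
    return ((uintptr_t)&hot_add % 32u) == 0;
}
```

Compiling with `-S` and looking for the alignment directive in front of the `hot_add` label is one way to check what the compiler actually emitted.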

Does GCC generate suboptimal code for static branch prediction?

The short answer: no, it does not. GCC performs a metric ton of non-trivial optimizations, and one of them is guessing branch probabilities from the control flow graph. According to the GCC manual: -fno-guess-branch-probability: Do not guess branch probabilities using heuristics. GCC uses heuristics to guess branch probabilities if they are not provided by profiling feedback …
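When the heuristics guess wrong and no profile data is available, GCC also accepts an explicit hint through its documented `__builtin_expect` builtin. The sketch below is an illustration only: the `likely`/`unlikely` macro names follow a common convention, and `sum_non_negative` is a hypothetical function invented for this example.

```c
#include <stddef.h>

/* Conventional wrapper macros around the GCC/Clang builtin. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Sums an array, treating a negative entry as a rare error case.
   Returns -1 as soon as a negative entry is seen. */
long sum_non_negative(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (unlikely(v[i] < 0))   /* hint: this branch is rarely taken */
            return -1;
        total += v[i];
    }
    return total;
}
```

The hint only affects code layout and the compiler's probability estimates; the function returns the same values with or without it.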

Branchless internal merge slower than internal merge with branch

Such a large difference is the product of two conditions. The first condition is related to the original code. The in-place merge is so efficient that it would be difficult to devise anything significantly faster, even when coding manually at the assembly language level. The application of generics is straightforward, so the compiler produced the same …
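To make the two variants concrete, here is a minimal sketch in C of a merge step written with a branch and without one. It is not the code from the original question (which is not shown here); the function names are hypothetical, and it merges into a separate output buffer for brevity.

```c
#include <stddef.h>

/* Conventional merge: one data-dependent branch per output element. */
void merge_branchy(const int *a, size_t na,
                   const int *b, size_t nb, int *out)
{
    size_t i = 0, j = 0, k = 0;
    while (i < na && j < nb) {
        if (b[j] < a[i])
            out[k++] = b[j++];
        else
            out[k++] = a[i++];
    }
    while (i < na) out[k++] = a[i++];
    while (j < nb) out[k++] = b[j++];
}

/* Branchless variant: the comparison result (0 or 1) drives a bit mask
   and the index updates, so the element choice needs no jump.
   The loop-condition branches remain. */
void merge_branchless(const int *a, size_t na,
                      const int *b, size_t nb, int *out)
{
    size_t i = 0, j = 0, k = 0;
    while (i < na && j < nb) {
        unsigned x = (unsigned)a[i], y = (unsigned)b[j];
        unsigned take_b = b[j] < a[i];   /* 1 if b's head is smaller */
        unsigned mask = 0u - take_b;     /* all ones or all zeroes */
        out[k++] = (int)(x ^ ((x ^ y) & mask));  /* y if take_b, else x */
        j += take_b;
        i += 1u - take_b;
    }
    while (i < na) out[k++] = a[i++];
    while (j < nb) out[k++] = b[j++];
}
```

The branchless version does more work per element, so it tends to pay off only when the comparison outcome is hard to predict; with predictable input the branchy version can easily be faster.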

Intel x86 0x2E/0x3E Prefix Branch Prediction actually used?

These instruction prefixes have no effect on modern processors (anything newer than Pentium 4). They just cost one byte of code space, so not generating them is the right thing to do. For details, see Agner Fog’s optimization manuals, in particular 3. Microarchitecture: http://www.agner.org/optimize/ The “Intel® 64 and IA-32 Architectures Optimization Reference Manual” no longer mentions …

How has CPU architecture evolution affected virtual function call performance?

“AMD processor in the early-gigahertz era had a 40 cycle penalty every time you called a function” Huh, that is large. There is an “indirect branch prediction” method, which helps to predict a virtual function jump IF the same indirect jump was executed some time ago. There is still a penalty for the first and for mispredicted virtual function …
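What the predictor has to deal with can be sketched in C by emulating virtual dispatch with a table of function pointers. All names below (`shape`, `shape_vtable`, `total_area`) are hypothetical and chosen only for illustration; a C++ compiler generates comparable indirect calls for virtual functions.

```c
/* Hypothetical "class" with a one-entry vtable. */
struct shape;

struct shape_vtable {
    int (*area)(const struct shape *self);
};

struct shape {
    const struct shape_vtable *vt;  /* per-object pointer to the table */
    int w, h;
};

static int rect_area(const struct shape *s) { return s->w * s->h; }
static int tri_area(const struct shape *s)  { return s->w * s->h / 2; }

const struct shape_vtable RECT_VT = { rect_area };
const struct shape_vtable TRI_VT  = { tri_area };

/* Each iteration performs an indirect call whose target depends on the
   object's type. If the same call site keeps seeing the same target,
   an indirect branch predictor can guess it; mixed types make that harder. */
int total_area(const struct shape *const *shapes, int n)
{
    int total = 0;
    for (int i = 0; i < n; i++)
        total += shapes[i]->vt->area(shapes[i]);
    return total;
}
```

Sorting or grouping the objects by type before such a loop is a common way to make the call target repeat, which is the situation the predictor handles well.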
