Modern x86 cost model

The best reference is the Intel Optimization Manual, which provides fairly detailed information on architectural hazards and instruction latencies for all recent Intel cores, as well as a good number of optimization examples. Another excellent reference is Agner Fog’s optimization resources, which have the virtue of also covering AMD cores. Note that specific cost models … Read more

‘Correct’ unsigned integer comparison

Well, you’ve correctly typified the situation: C/C++ have no way of doing a full signed int/unsigned int comparison with a single compare. I would be surprised if promotion to int64 was faster than doing two comparisons. In my experience, compilers are quite good at realizing that a subexpression like that is pure (has no side … Read more
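
As a rough illustration of the two approaches the excerpt compares, here is a minimal sketch in C. The function names `less_two_compares` and `less_widened` are hypothetical and do not come from the original post, and the widened version assumes 32-bit operands so that both fit in `int64_t`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Two-comparison form (hypothetical helper name).
 * A negative signed value is smaller than every unsigned value;
 * otherwise the signed value is non-negative and converts to
 * unsigned int without changing its value. */
static bool less_two_compares(int a, unsigned int b) {
    return a < 0 || (unsigned int)a < b;
}

/* Widening form (hypothetical helper name).
 * Assumes 32-bit operands: both values are representable in int64_t,
 * so one signed 64-bit comparison gives the mathematically correct answer. */
static bool less_widened(int32_t a, uint32_t b) {
    return (int64_t)a < (int64_t)b;
}
```

By contrast, the plain expression `-1 < 1u` converts `-1` to a large unsigned value and evaluates to false, which is the pitfall both helpers avoid. Which form is faster depends on the compiler and target, so it is worth measuring rather than assuming.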

Is performance reduced when executing loops whose uop count is not a multiple of processor width?

I did some investigation with Linux perf to help answer this on my Skylake i7-6700HQ box, and Haswell results have been kindly provided by another user. The analysis below applies to Skylake, but it is followed by a comparison versus Haswell. Other architectures may vary, and to help sort it all out I welcome additional … Read more
