Significant FMA performance anomaly experienced in the Intel Broadwell processor

Updated I’ve got no explanation for you, since I’m on Haswell, but I do have code to share that might help you or someone else with Broadwell or Skylake hardware isolate your problem. If you could please run it on your machine and share the results, we could gain an insight into what’s happening to … Read more

Obtaining peak bandwidth on Haswell in the L1 cache: only getting 62%

IACA Analysis Using IACA (the Intel Architecture Code Analyzer) reveals that macro-op fusion is indeed occurring, and that it is not the problem. It is Mysticial who is correct: The problem is that the store isn’t using Port 7 at all. IACA reports the following: Intel(R) Architecture Code Analyzer Version – 2.1 Analyzed File – … Read more

tech