Measuring NUMA (Non-Uniform Memory Access). No observable asymmetry. Why?

The first thing I want to point out is that you might want to double-check which cores are on each node. I don’t recall cores and nodes being interleaved like that. Also, you should have 16 threads due to HT (unless you disabled it). Another thing: the socket 1366 Xeon machines are only slightly NUMA. …
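One way to double-check which cores belong to which node on Linux is to read the per-node CPU lists that the kernel exposes under /sys/devices/system/node/ (the same information that `numactl --hardware` prints). The following is only a minimal sketch under that assumption: `parse_cpulist` and `read_node_cpulist` are hypothetical helper names, not part of any library, and the original answer does not say how the asker obtained the mapping.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: parse a sysfs-style CPU list such as "0-3,8-11"
 * into out[]. Returns the number of CPUs stored (at most max),
 * or -1 if the text is malformed. */
int parse_cpulist(const char *s, int *out, int max)
{
    assert(s != NULL && out != NULL && max >= 0);
    int n = 0;
    while (*s != '\0' && *s != '\n') {
        char *end;
        long lo = strtol(s, &end, 10);
        if (end == s)
            return -1;                 /* no digits where a CPU number was expected */
        long hi = lo;
        s = end;
        if (*s == '-') {               /* a range "lo-hi" */
            s++;
            hi = strtol(s, &end, 10);
            if (end == s)
                return -1;
            s = end;
        }
        if (lo < 0 || hi < lo)
            return -1;
        for (long c = lo; c <= hi && n < max; c++)
            out[n++] = (int)c;
        if (*s == ',')
            s++;                       /* move on to the next entry */
        else if (*s != '\0' && *s != '\n')
            return -1;                 /* unexpected character */
    }
    return n;
}

/* Hypothetical helper: read the CPU list of one NUMA node from sysfs
 * (Linux only). Returns 0 on success, -1 if the node does not exist
 * or the file cannot be read. */
int read_node_cpulist(int node, char *buf, size_t len)
{
    char path[64];
    snprintf(path, sizeof path,
             "/sys/devices/system/node/node%d/cpulist", node);
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    char *ok = fgets(buf, (int)len, f);
    fclose(f);
    return ok != NULL ? 0 : -1;
}
```

To use it, call `read_node_cpulist` for node 0, 1, and so on until it fails, and pass each buffer to `parse_cpulist`. If the lists come back interleaved (for example, even CPUs on one node and odd CPUs on the other), the numbering the asker reported is real; otherwise the thread pinning in the benchmark is probably not doing what was intended.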

Poor memcpy Performance on Linux

[I would make this a comment, but do not have enough reputation to do so.] I have a similar system and see similar results, but can add a few data points: If you reverse the direction of your naive memcpy (i.e. convert to *p_dest-- = *p_src--), then you may get much worse performance than for …
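For reference, the two copy directions being compared might look like the sketch below. The asker's actual code is not shown in this excerpt, so the function names `copy_forward` and `copy_backward` and the byte-at-a-time loops are assumptions made for illustration. The sketch makes no claim about which direction is faster; that has to be measured on the machine in question.

```c
#include <assert.h>
#include <stddef.h>

/* Naive forward copy: walks both buffers in ascending address order. */
void copy_forward(char *dest, const char *src, size_t n)
{
    assert(dest != NULL && src != NULL);
    char *p_dest = dest;
    const char *p_src = src;
    while (n--)
        *p_dest++ = *p_src++;
}

/* Naive backward copy: starts at the last byte and walks in descending
 * address order, using the *p_dest-- = *p_src-- form from the answer. */
void copy_backward(char *dest, const char *src, size_t n)
{
    assert(dest != NULL && src != NULL);
    if (n == 0)
        return;
    char *p_dest = dest + n - 1;
    const char *p_src = src + n - 1;
    /* Copy all but the first byte with post-decrement, then copy the
     * first byte separately so the pointers never move before the
     * start of either buffer. */
    while (n-- > 1)
        *p_dest-- = *p_src--;
    *p_dest = *p_src;
}
```

When timing these, keep in mind that an optimizing compiler may recognize the forward loop and replace it with a call to the library memcpy, in which case the comparison is no longer between two naive loops. Inspecting the generated assembly is the reliable way to check.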
