best cross-platform method to get aligned memory

As long as you’re OK with having to call a special function to free the memory, your approach is fine. I would order your #ifdefs the other way around, though: start with the standards-specified options and fall back to platform-specific ones. For example: if __STDC_VERSION__ >= 201112L, use aligned_alloc; if _POSIX_VERSION >= 200112L, use posix_memalign. …
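The standards-first ordering could be sketched roughly as follows. This is a C++ adaptation and not the answer's own code: it substitutes `__cplusplus >= 201703L` with `std::aligned_alloc` for the C11 `__STDC_VERSION__` check, and the helper names `portable_aligned_alloc` and `portable_aligned_free` and the `DEMO_ALIGNED_*` macros are made up for illustration. It also assumes the C++17 standard library on the target actually ships `std::aligned_alloc`, which is not true everywhere.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

#if defined(__unix__) || (defined(__APPLE__) && defined(__MACH__))
#include <unistd.h>  // defines _POSIX_VERSION on POSIX systems
#endif
#if defined(_MSC_VER)
#include <malloc.h>  // _aligned_malloc, _aligned_free
#endif

// Pick a backend, standards-specified options first (macro names are hypothetical).
#if __cplusplus >= 201703L && !defined(_MSC_VER)
#define DEMO_ALIGNED_STD 1    // std::aligned_alloc (C++17 counterpart of C11 aligned_alloc)
#elif defined(_POSIX_VERSION) && _POSIX_VERSION >= 200112L
#define DEMO_ALIGNED_POSIX 1  // posix_memalign
#elif defined(_MSC_VER)
#define DEMO_ALIGNED_MSVC 1   // _aligned_malloc
#else
#error "No aligned allocation backend known for this platform"
#endif

// Hypothetical helper. `alignment` should be a power of two and,
// for posix_memalign, a multiple of sizeof(void*).
inline void* portable_aligned_alloc(std::size_t alignment, std::size_t size) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
#if defined(DEMO_ALIGNED_STD)
    // aligned_alloc expects size to be a multiple of alignment, so round up.
    const std::size_t rounded = (size + alignment - 1) / alignment * alignment;
    return std::aligned_alloc(alignment, rounded);
#elif defined(DEMO_ALIGNED_POSIX)
    void* p = nullptr;
    return posix_memalign(&p, alignment, size) == 0 ? p : nullptr;
#else
    return _aligned_malloc(size, alignment);
#endif
}

// The "special function to free the memory": it must match the backend above.
inline void portable_aligned_free(void* p) {
#if defined(DEMO_ALIGNED_MSVC)
    _aligned_free(p);
#else
    std::free(p);
#endif
}
```

A caller would pair the two helpers, for example `void* p = portable_aligned_alloc(64, 100);` followed later by `portable_aligned_free(p);`. Only the MSVC branch needs a dedicated free function; the other two return memory that plain `free` accepts.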

Atomicity in C++ : Myth or Reality [duplicate]

This recommendation is architecture-specific. It holds for x86 and x86_64 (in low-level programming). You should also check that the compiler doesn’t reorder your code; you can use a “compiler memory barrier” for that. Low-level atomic reads and writes for x86 are described in the Intel reference manual, “The Intel® 64 and IA-32 Architectures Software Developer’s Manual” …
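As a rough illustration of the two ideas above, the sketch below shows a compiler-only barrier and, next to it, the portable alternative to relying on x86 behaviour. It is not taken from the answer: `compiler_barrier`, `publish`, `try_consume`, `payload`, and `ready` are hypothetical names, and using `std::atomic_signal_fence` as the barrier is one possible choice among several (compiler-specific intrinsics or inline assembly are others).

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: keeps the compiler from reordering memory accesses
// across this point. It issues no CPU instructions for memory ordering.
inline void compiler_barrier() {
    std::atomic_signal_fence(std::memory_order_seq_cst);
}

// Portable alternative: let std::atomic express both atomicity and ordering,
// so the code does not depend on x86-specific guarantees.
int payload = 0;                 // plain data, published via `ready`
std::atomic<bool> ready{false};  // flag that guards `payload`

void publish(int value) {
    assert(!ready.load(std::memory_order_relaxed));  // single-shot publish
    payload = value;
    // Release store: the write to `payload` cannot move after this line.
    ready.store(true, std::memory_order_release);
}

bool try_consume(int& out) {
    // Acquire load: the read of `payload` cannot move before this line.
    if (!ready.load(std::memory_order_acquire)) return false;
    out = payload;
    return true;
}
```

On x86 the release store and acquire load typically compile to ordinary moves, so the portable version usually costs nothing extra there while remaining correct on weaker architectures.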

Why does the Mac ABI require 16-byte stack alignment for x86-32?

From “Intel® 64 and IA-32 Architectures Optimization Reference Manual”, section 4.4.2: “For best performance, the Streaming SIMD Extensions and Streaming SIMD Extensions 2 require their memory operands to be aligned to 16-byte boundaries. Unaligned data can cause significant performance penalties compared to aligned data.” From Appendix D: “It is important to ensure that the stack frame …

why is data structure alignment important for performance?

Alignment helps the CPU fetch data from memory efficiently: fewer cache misses and flushes, fewer bus transactions, etc. Some memory types (e.g. RDRAM, DRAM) need to be accessed in a structured manner (aligned “words” and “burst transactions”, i.e. many words at one time) in order to yield efficient results. This is due …
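To make the idea concrete, here is a small sketch of how a compiler lays out a structure so that each member stays on its natural boundary. The `Sample` struct and `check_array_layout` function are hypothetical, and the exact padding is implementation-defined, so the checks only assert properties that hold on any conforming implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical struct: a 1-byte member followed by a 4-byte member.
// The compiler inserts padding after `tag` so that `value` is aligned.
struct Sample {
    char tag;
    std::uint32_t value;
};

// The member offset is a multiple of the member's alignment...
static_assert(offsetof(Sample, value) % alignof(std::uint32_t) == 0,
              "value must sit on its natural boundary");
// ...and the struct size is a multiple of the struct's alignment,
// which keeps every element of an array aligned as well.
static_assert(sizeof(Sample) % alignof(Sample) == 0,
              "array elements must stay aligned");

// Runtime confirmation for each element of an array.
inline void check_array_layout() {
    Sample items[4] = {};
    for (const Sample& s : items) {
        assert(reinterpret_cast<std::uintptr_t>(&s.value) %
                   alignof(std::uint32_t) == 0);
    }
}
```

On common platforms `sizeof(Sample)` comes out larger than the sum of its members because of the padding, which is the space cost paid for aligned access.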

Questions about Hinnant’s stack allocator

“I’ve been using Howard Hinnant’s stack allocator and it works like a charm, but some details of the implementation are a little unclear to me.” Glad it’s been working for you. “1. Why are global operators new and delete used?” The allocate() and deallocate() member functions use ::operator new and ::operator delete respectively. Similarly, the …

Query the alignment of a specific variable

You can try with something like: bool is_aligned(const volatile void *p, std::size_t n) { return reinterpret_cast<std::uintptr_t>(p) % n == 0; } assert(is_aligned(array, 16)); The above assumes a flat address space and that arithmetic on uintptr_t is equivalent to arithmetic on char *. While these conditions prevail for the majority of modern platforms, neither of which …
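Laid out as a compilable unit, the check above might look like the sketch below. The `is_aligned` function follows the excerpt; the `buffer` variable and `check_buffer` function are hypothetical additions for the demonstration, and the same flat-address-space assumption applies.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True if the address held in `p` is a multiple of `n`.
// Assumes a flat address space, as noted in the text.
inline bool is_aligned(const volatile void* p, std::size_t n) {
    return reinterpret_cast<std::uintptr_t>(p) % n == 0;
}

// Hypothetical buffer, explicitly requested on a 16-byte boundary.
alignas(16) unsigned char buffer[32];

inline void check_buffer() {
    assert(is_aligned(buffer, 16));       // start of the buffer
    assert(!is_aligned(buffer + 1, 16));  // one byte in: no longer aligned
    assert(is_aligned(buffer + 16, 16));  // next 16-byte boundary
}
```

The `const volatile void*` parameter lets the function accept a pointer to any object type regardless of its cv-qualification.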

Alignment requirements for atomic x86 instructions vs. MS’s InterlockedCompareExchange documentation?

x86 does not require alignment for a lock cmpxchg instruction to be atomic. However, alignment is necessary for good performance. This should be no surprise: backward compatibility means that software written with a manual from 14 years ago will still run on today’s processors. Modern CPUs even have a performance counter specifically for split-lock detection …