Per-element atomicity of vector load/store and gather/scatter?
The spinlock mutex implementation looks okay to me. I think they got the definitions of acquire and release completely wrong. Here is the clearest explanation of acquire/release consistency models that I am aware of: Gharachorloo; Lenoski; Laudon; Gibbons; Gupta; Hennessy: Memory consistency and event ordering in scalable shared-memory multiprocessors, Int’l Symp Comp Arch, ISCA(17):15-26, 1990, … Read more
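As a rough illustration of where acquire and release sit in a spinlock, here is a minimal sketch. It is not the implementation discussed in the original question; `SpinLock` and `count_with_spinlock` are hypothetical names chosen for this example.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical minimal spinlock, for illustration only.
class SpinLock {
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
public:
    void lock() {
        // Acquire: accesses inside the critical section cannot be
        // reordered before the lock is taken.
        while (flag_.test_and_set(std::memory_order_acquire)) {
            // spin until the previous holder clears the flag
        }
    }
    void unlock() {
        // Release: writes made inside the critical section become
        // visible to the next thread that acquires the lock.
        flag_.clear(std::memory_order_release);
    }
};

// Increments a plain (non-atomic) counter from several threads,
// relying on the lock alone for mutual exclusion and visibility.
long count_with_spinlock(int threads, int iterations) {
    SpinLock lock;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

If the acquire/release pairing is right, calling `count_with_spinlock(4, 10000)` should return exactly 40000.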
I’m two months late, but I’m having the exact same problem right now and I think I’ve found some sort of an answer. The short version is that it should work, but I’m not sure if I’d depend on it. Here’s what I found: The C++11 standard defines a new memory model, but it has … Read more
In the Intel® 64 and IA-32 Architectures Developer’s Manual: Vol. 3A, which nowadays contains the specifications of the memory ordering white paper you mention, it is said in section 8.1.1 that: The Intel486 processor (and newer processors since) guarantees that the following basic memory operations will always be carried out atomically: Reading or writing a … Read more
test-and-set modifies the contents of a memory location and returns its old value as a single atomic operation. compare-and-swap atomically compares the contents of a memory location to a given value and, only if they are the same, modifies the contents of that memory location to a given new value. The difference is the condition: test-and-set always writes, while compare-and-swap writes only if the comparison succeeds.
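The two primitives can be contrasted with the standard C++ atomics. This is only a sketch: the wrapper names `test_and_set_demo` and `compare_and_swap_demo` are made up for this example, and `compare_exchange_strong` is used here as the C++ spelling of compare-and-swap.

```cpp
#include <atomic>

// Test-and-set: unconditionally sets the flag and returns its previous value.
bool test_and_set_demo(std::atomic_flag& flag) {
    return flag.test_and_set();
}

// Compare-and-swap: stores `desired` only if the current value equals
// `expected`. Returns true if the store happened, false otherwise.
bool compare_and_swap_demo(std::atomic<int>& value, int expected, int desired) {
    return value.compare_exchange_strong(expected, desired);
}
```

On a clear flag, the first `test_and_set_demo` call returns `false` and every later call returns `true`. `compare_and_swap_demo` leaves the value untouched whenever `expected` does not match.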
I know that the atomic types do not have copy constructors, and I assume that explains why this code does not work. Yes, the error says that quite clearly. Does anybody know a way to actually get this code to work? Instead of copy-initialising from a temporary, which requires an accessible copy constructor: std::atomic<int> order::c … Read more
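A minimal sketch of direct-initialising a static atomic member, which avoids the copy constructor entirely. The class `Counter`, its member `value`, and the helper `bump` are hypothetical and are not the names from the original question.

```cpp
#include <atomic>

// Hypothetical class used only for this illustration.
struct Counter {
    static std::atomic<int> value;
};

// Direct-initialisation: calls the atomic(int) constructor and never
// needs the (deleted) copy constructor.
std::atomic<int> Counter::value(0);

// Brace-initialisation would work the same way:
//   std::atomic<int> Counter::value{0};

// Atomically increments the counter and returns the new value.
int bump() {
    return Counter::value.fetch_add(1) + 1;
}
```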
To answer all 5 questions: 1) A compiler fence (by itself, without a CPU fence) is only useful in two situations: (a) to enforce memory order constraints between a single thread and an asynchronous interrupt handler bound to that same thread (such as a signal handler); (b) to enforce memory order constraints between multiple threads when it is … Read more
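The signal-handler case can be sketched with `std::atomic_signal_fence`, which is the standard C++ compiler-only fence. The names `payload`, `ready`, `seen`, `on_signal` and `publish_and_raise` are invented for this example, and the choice of `SIGINT` is arbitrary.

```cpp
#include <atomic>
#include <csignal>

int payload = 0;                 // plain, non-atomic data
std::atomic<bool> ready{false};  // flag checked by the handler
std::atomic<int> seen{-1};       // what the handler observed

extern "C" void on_signal(int) {
    if (ready.load(std::memory_order_relaxed)) {
        // Compiler-only fence: keeps the read of `payload` after the flag check.
        std::atomic_signal_fence(std::memory_order_acquire);
        seen.store(payload, std::memory_order_relaxed);
    }
}

// Publishes `value`, raises the signal on the calling thread, and
// returns what the handler saw.
int publish_and_raise(int value) {
    std::signal(SIGINT, on_signal);
    payload = value;
    // Compiler-only fence: keeps the write to `payload` before the flag store.
    std::atomic_signal_fence(std::memory_order_release);
    ready.store(true, std::memory_order_relaxed);
    std::raise(SIGINT);  // handler runs on this same thread before raise returns
    return seen.load(std::memory_order_relaxed);
}
```

No CPU fence is emitted here. That is enough only because the handler runs on the same thread that wrote the data.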
Code from “Olaf Dietsche”. USE ATOMIC: real 0m1.958s, user 0m1.957s, sys 0m0.000s. USE VOLATILE: real 0m1.966s, user 0m1.953s, sys 0m0.010s. If you are using a GCC version older than 4.7, see http://gcc.gnu.org/gcc-4.7/changes.html: Support for atomic operations specifying the C++11/C11 memory model has been added. These new __atomic routines replace the existing __sync built-in routines. Atomic support is also available … Read more
It sounds like the atomic operations on memory will be executed directly on memory (RAM). Nope, as long as every possible observer in the system sees the operation as atomic, the operation can involve cache only. Satisfying this requirement is much more difficult for atomic read-modify-write operations (like lock add [mem], eax, especially with an … Read more
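For a concrete atomic read-modify-write, here is a small sketch using `std::atomic<long>::fetch_add`. On x86 compilers typically lower this to a `lock`-prefixed instruction, though that is an assumption about code generation and not something the standard guarantees. `atomic_increments` is a hypothetical helper name.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Several threads increment one shared counter with an atomic
// read-modify-write. No increment is lost, whether the hardware
// satisfies the operation in cache or elsewhere.
long atomic_increments(int threads, int iterations) {
    std::atomic<long> counter{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter.load();
}
```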