Lock-free Reference counting and C++ smart pointers

Question

addr aIandF(addr r1) {
  addr tmp;
  int c;
  do {
    do {
      // r1 holds the address of the address
      // of the refcount
      tmp = *r1;       // grab the address of the refcount
      if (!tmp) break; // if it's null, bail

      // read current refcount
      // and acquire reservation
      c = lwarx(tmp);

      // now we hold the reservation,
      // check to see if another thread
      // has changed the shared block address
    } while (tmp != *r1); // if so, start over

    // if the store succeeds we know we held
    // the reservation throughout
  } while (tmp && !stwcx(tmp, c+1));
  return tmp;
};

Note that aIandF is used specifically when constructing a copy of an existing shared pointer, claiming a reference for the copy.

The Dr Dobbs article describes the operation for releasing a reference as first atomically swapping the address of the shared counter in the source shared pointer object with a null pointer local to the function; then atomically decrementing the counter; then testing to see if the result of the decrement was zero. This order of operations is important: you say, “The address of the shared object may never change and the LL/SC could succeed – but how does this help us if another thread has deallocated the shared object in the mean time?” – but this can never happen, since the object will never be deallocated without the swap happening first, giving us a means to observe the change of address.

aIandF tests for the counter address being null on entry.

It can spot the address becoming null if that occurs before the lwarx, because it explicitly tests for this once it has the reservation.

If the swap in the decrementing thread occurs after the lwarx we do not actually care: if the stwcx in aIandF succeeds, we know the decrementing thread will see the new reference count and not destruct the object, and we can proceed knowing we have claimed a reference to it; whereas if the other thread succeeds in decrementing the counter first, we will lose our reservation, the store will fail and we’ll detect the destruction of the object on the next loop iteration.

This algorithm assumes a strongly consistent memory model (all threads always see effects of each other’s reads and writes in program order) – this is not necessarily the case even on those modern architectures that do support ll/sc.

EDIT: thinking about it, the algorithm also apparently makes the assumption that it is always safe to read from a memory address that was once valid (e.g., no MMU/protection; or, the algorithm is broken):

if (!tmp) break;

// another thread could, at this point, do its swap, 
// decrement *and* destroy the object tmp points to
// before we get to do anything else

c = lwarx(tmp); 

// if that happened, we'll detect this fact and do nothing with c
// but ONLY if the lwarx doesn't trap 
// (due to the memory tmp points to 
// getting unmapped when the other thread frees the object)

Leave a Comment Cancel reply