in Personal

A Non-Blocking HashTable

In a previous blog I talked about my Non-Blocking Hash Table, and how to implement ‘get()’.  This blog will focus on ‘put()’ and all variations.  The hard part will be figuring out how to include state machine diagrams in a blog!  Quick recap: the HashTable is a closed power-of-2 sized table.  The table holds Key/Value pairs in even/odd entries in a large Java Object[] array.  Key slots are limited to States  {null,K} and make a 1-time transition from null to K (when a new Key is inserted).  Value slots are limited to States {null,V/T} where V is any Value, and T is a Tombstone.  Value slots can waffle about according to the last put(); keys are deleted by replacing the value with a Tombstone; keys can be re-inserted by replacing the Tombstone with a real Value again.

Two Key states times three Value states gives me 6 total states:

  1. {null,null} – Empty
  2. {K,null} – Partially inserted key/value
  3. {K,V} – Fully functional {key,value} pair
  4. {K,T} – Previously inserted, now deleted Key
  5. {null,V} – partially inserted K/V pair being read out-of-order
  6. {null,T} – partially inserted K/T pair being read out-of-order

States 5 & 6 are functionally identical, and are only visible to threads doing a get() call where they prefetch the Value slot despite seeing a null Key.  The null key counts as a miss for the get(); the stuff in the Value slot belongs to some not-quite-fully-inserted Key/Value pair – but the get()-thread does not know for which Key!

Note that Key slots, once set, can never be UNset – this is required for correctness, lest Thread A tests for K1 in a slot, then Thread B deletes it & inserts K2/V2, and finally Thread A gets around to reading the value slot and reads V2 – and not the now replaced V1.

put() can be broken down into 2 main steps: key-slot claim and value update.

Key-slot Claim:  First up: standard hash, mask-to-table-size, then look at the slot.  If the slot has a Key already and it’s the desired Key – we’re done.  If it’s the wrong key – then just like get() we reprobe.  If we find a null Key slot, then we attempt to CAS the slot from null to Key.  If we succeed, then we’re done.  If not, we reprobe.  If we reprobe too often, we’ll trigger a resize and that’s a subject for a later blog.

  Object put( Object K, Object V ) {
    idx = hash = K.hash();
    while( true ) {
      idx &= (size-1);      // mask to table size
      key = table[2*idx+0]; // key/val kept in even/odd pairs
      if( K == key || key.equals(K) ) // key hit?
        break;              // Found desired Key slot
      if( key == null &&    // Found end of key chain 
          CAS_key(table,idx,null,K) ) // try to claim slot
        break;              // CAS worked!  We own slot
        // If the CAS_key fails, then the slot got taken 
        // by somebody else...
      idx++;                // reprobe
      if( idx > limit ) 
        return resize_size_next_blog();
    }
  }

What, exactly, is CAS_key() doing?  It’s implemented via sun.misc.Unsafe.WeakCompareAndSwap().  I’m using the Weak version because I do not need a fence, and at least on Azul the fence costs (roughly the cost of a trip to memory).  On most other platforms the CAS includes an unavoidable fence, hence the CAS includes some unavoidable cost.  The weak version of CAS allows for spurious failure; so my CAS_key code will loop if it thinks the failure is spurious.  i.e., it loops until it succeeds or the value in memory no longer matches the expected value.

Value-slot Update: The output of the prior step is that the Key-slot is correct, but the Value-slot is really unknown.  You might think that after a fresh key-claim you could be assured that the value slot is null.  But another thread can leap in and change the null at any time.  Hence value-update has to inspect & go with what it finds there.  As before, I’m using an unfenced CAS that does not spuriously fail (internal looping on spurious failure).

  Object old = table[2*idx+1];  // key/val kept in even/odd pairs
  if( CAS_val(table,idx,old,V) )// Attempt CAS on val slot
    return old;                 // Success!  Return old value

If the CAS fails, then somebody else must have published another value out from under us.  We can either give up OR retry: if we give up we can claim it is “as if” our CAS succeeded but another thread immediately stomped our update.  The problem with this approach is that we can’t make the OTHER update return OUR value as it’s “old value”.    i.e., if we had truly succeeded and been immediately stomped, the stomper would be returning our value as it’s “old value”.  So instead we’ll retry:

  while( true ) {
    Object old = table[2*idx+1];  // key/val kept in even/odd pairs
    if( CAS_val(table,idx,old,V) )// Attempt CAS on val slot
      return old;                 // Success!  Return old value
  }

What about all those put() variations?  putIfAbsent?  remove?  replace?  Turns out, it’s all the same implementation in the end – we just need to gate the CAS attempt a bit more.  We’ll pass in an extra value which has to match the old value, or the extra value is some sentinel meaning “dont care”.

  while( true ) {
    Object old = table[2*idx+1];  // key/val kept in even/odd pairs
    if( extra != null && extra != old ) // Extra condition?  not met?
      return null;                // Then the put attempt failed!     if( CAS_val(table,idx,old,V) )// Attempt CAS on val slot
      return old;                 // Success!  Return old value
  }
  • remove: New value is really a Tombstone.
  • putIfAbsent: Extra value is a Tombstone.
  • replace: Extra value is the expected oldValue

and so forth.  There are only a few more tricks to go in here: things like ‘remove()’ should not insert a missing key, only to turn around and insert a tombstone.  ‘replace()’ gives up on seeing a tombstone unlike ‘putifAbsent()’ which only succeeds on seeing a tombstone (or null).  The returned old value has to have tombstones mapped back to null (the normal “no old value” return result).

Stayed for next posting, where I try to explain resize() in a blog… and probably punt to a real white-paper style presentation.

Cliff