Pausing Transactional Memory Hardware

One of the more exciting trends in multi-core memory is the idea of Transactional Memory: memory which commits all of its state changes atomically or not at all!  This idea is very powerful for coding concurrent algorithms, as it means a whole set of updates happens atomically.  Azul Systems felt this idea had so much merit that we included some Hardware Transactional Memory (HTM) support (as opposed to Software Transactional Memory) in our first generation product.

One of the issues with any TM system is livelock: transactions that continuously abort and retry (contrast this with programming with locks, which suffer from deadlock).  At Azul, we use our HTM for accelerating Java locks, so we can avoid livelock by falling back to the original Java locks.  However, we’d still like to profile what happens during HTM attempts!
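
To make the fallback idea concrete, here is a minimal sketch in Python.  It is a toy simulation, not Azul's implementation: `TransactionAborted`, `run_with_fallback`, `make_flaky` and `always_abort` are hypothetical names invented for illustration, and the "transaction" is just a function that may raise.  The point is only the shape of the logic: retry the speculative path a bounded number of times, then take the real lock so that progress is guaranteed.

```python
import threading


class TransactionAborted(Exception):
    """Hypothetical signal that a simulated speculative attempt was aborted."""


def run_with_fallback(lock, try_speculative, critical_section, max_attempts=3):
    """Run critical_section speculatively, falling back to the lock.

    Returns (result, path), where path is "htm" if a speculative attempt
    committed and "lock" if every attempt aborted and the lock was taken.
    """
    for _ in range(max_attempts):
        try:
            return try_speculative(critical_section), "htm"
        except TransactionAborted:
            pass  # speculative work is discarded; try again
    # Bounded retries are exhausted: the lock path cannot livelock.
    with lock:
        return critical_section(), "lock"


def always_abort(body):
    """Simulated transaction that never commits (the livelock case)."""
    raise TransactionAborted()


def make_flaky(aborts_before_success):
    """Simulated transaction that aborts a fixed number of times, then commits."""
    remaining = [aborts_before_success]

    def try_speculative(body):
        if remaining[0] > 0:
            remaining[0] -= 1
            raise TransactionAborted()
        return body()

    return try_speculative


lock = threading.Lock()
flaky_outcome = run_with_fallback(lock, make_flaky(2), lambda: 42)
stuck_outcome = run_with_fallback(lock, always_abort, lambda: 42)
print(flaky_outcome)  # (42, 'htm')
print(stuck_outcome)  # (42, 'lock')
```

Note that in the `always_abort` case the three failed attempts leave no trace at all, which is exactly the profiling problem discussed next.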

Hardware event profiling is a common enough profiling technique; the hardware triggers an exception after every so many “events” and the exception handler records the event in memory.  “Events” can be things like every-1000-L2-misses or simply time-based or tick-based events.  The key issue is that the profiling data is recorded in memory.
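
As a rough illustration of event-based sampling, the sketch below simulates a counter that "traps" every N events and records where the program was at that moment.  The class `EventSampler` and the location labels are assumptions made up for this example; real hardware uses performance counters and an exception handler, but the bookkeeping has the same shape, and the samples end up in ordinary memory.

```python
from collections import Counter


class EventSampler:
    """Toy model of a hardware event counter with a sampling period."""

    def __init__(self, period):
        self.period = period
        self.count = 0
        self.samples = Counter()  # the profile lives in ordinary memory

    def event(self, location):
        """Count one event; every `period` events, record the location."""
        self.count += 1
        if self.count % self.period == 0:
            self.samples[location] += 1  # the "exception handler"


sampler = EventSampler(period=1000)
for _ in range(9000):
    sampler.event("hot_loop")   # e.g. an L2 miss inside a hot loop
for _ in range(1000):
    sampler.event("cold_path")  # e.g. an L2 miss somewhere rarely executed

print(dict(sampler.samples))  # {'hot_loop': 9, 'cold_path': 1}
```

The sample counts approximate where the events happened, at the cost of one memory write per sample.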

Now mix profiling with HTM: the profile data is also stored in the transactional memory!  If the transaction commits then all is well… but if it aborts then the profiling data is lost.  The CPU spent time doing work but there’s no record of what work.  Where does the time go in a failed HTM attempt?  I don’t know!  The hardware has helpfully unwound all that “speculative” work!  I’m back to the start of the transaction with all state reset… except for the endless march of time.

SO… what I want is a way to ‘pause’ transactional memory hardware – to allow memory updates to actually happen and not be queued awaiting the success or failure of the transaction.  Visible side effects inside a transaction, if you will.  Hardware which allows writes to participate in the transaction or not on a per-instruction basis would also work.
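
Here is a small sketch of what per-write control might look like, again as a toy Python simulation rather than a description of any real hardware.  `ToyTransactionalMemory` and its `transactional` flag are hypothetical; the model simply buffers transactional writes until commit and lets "paused" writes go straight to memory.

```python
class ToyTransactionalMemory:
    """Toy model: transactional writes are buffered until commit or abort."""

    def __init__(self):
        self.memory = {}
        self._buffer = None  # None means no transaction is active

    def begin(self):
        self._buffer = {}

    def write(self, addr, value, transactional=True):
        """Buffer the write if in a transaction, unless it is 'paused'."""
        if self._buffer is not None and transactional:
            self._buffer[addr] = value
        else:
            self.memory[addr] = value  # visible immediately, survives abort

    def commit(self):
        if self._buffer is None:
            raise RuntimeError("no active transaction")
        self.memory.update(self._buffer)
        self._buffer = None

    def abort(self):
        self._buffer = None  # all buffered (speculative) writes vanish


tm = ToyTransactionalMemory()
tm.begin()
tm.write("balance", 100)                                     # real work
tm.write("sample_buffered", "pc=0x40")                       # profile data, lost
tm.write("sample_paused", "pc=0x40", transactional=False)    # profile data, kept
tm.abort()

print(tm.memory)  # {'sample_paused': 'pc=0x40'}
```

After the abort, the speculative work and the ordinary profile sample are both gone, while the sample written with the transaction "paused" is still there to tell me where the time went.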

This change (allowing some writes to succeed despite a failed transaction) does not have to destroy the notion of a ‘transaction’ at the higher language level – it just demonstrates the need for flexibility in any hardware TM support.

This article and the idea of ‘pausing’ a transaction are specifically placed in the public domain.