Misc Odds & Ends

I’ve been busy at work, but it’s been a while, so I’ve got a collection of odds-n-ends I’ve been meaning to write up:

Bandwidth: 70Gb/sec

A 768-way Azul (7280 box) gets 70Gb/sec of throughput as measured with a multi-threaded pure Java version of the infamous STREAM benchmark.  It’s not official (since it’s neither C nor FORTRAN!), so don’t quote me on this, but it does put our gear in the top-20 highest-bandwidth supercomputers.
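
For readers who haven’t seen STREAM: its kernels are simple array sweeps, timed to estimate memory bandwidth.  Below is a minimal sketch of a multi-threaded “triad” kernel in pure Java.  It is not the code behind the number above; the class name `StreamTriad`, the method `triad`, the array size and the thread count are all assumptions for illustration, and a real run would repeat the kernel several times and report the best time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch of a STREAM-style triad kernel; not the benchmark Azul ran.
final class StreamTriad {

    // Computes a[i] = b[i] + scalar * c[i], split into nThreads slices,
    // and returns the observed rate in bytes per second.
    static double triad(double[] a, double[] b, double[] c, double scalar, int nThreads) {
        int n = a.length;
        int chunk = (n + nThreads - 1) / nThreads;    // slice size, rounded up
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            long start = System.nanoTime();
            for (int t = 0; t < nThreads; t++) {
                final int lo = Math.min(n, t * chunk);
                final int hi = Math.min(n, lo + chunk);
                pending.add(pool.submit(() -> {
                    for (int i = lo; i < hi; i++) {
                        a[i] = b[i] + scalar * c[i];
                    }
                }));
            }
            for (Future<?> f : pending) {
                f.get();                               // wait for every slice
            }
            long nanos = Math.max(1L, System.nanoTime() - start);
            // Triad moves three arrays of 8-byte doubles: two read, one written.
            double bytes = 3.0 * Double.BYTES * n;
            return bytes / (nanos / 1e9);
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int n = 1 << 20;                               // assumed size: 1M doubles per array
        double[] a = new double[n], b = new double[n], c = new double[n];
        java.util.Arrays.fill(b, 1.0);
        java.util.Arrays.fill(c, 2.0);
        int threads = Runtime.getRuntime().availableProcessors();
        double rate = triad(a, b, c, 3.0, threads);
        System.out.printf("triad: %.2f GB/sec on %d threads%n", rate / 1e9, threads);
    }
}
```

Note that the timing here includes task-submission overhead and a cold JIT, so a single pass like this will understate what the hardware can do.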

NonBlockingHashMap News

NonBlockingHashMap has been up on SourceForge for a while now.  I finally got around to fixing a problem where, with racing writers and racing copy-threads-on-resize, a racing witness reader thread could see a value flip-flop more times than there are writers (the smallest test case takes 5 racing threads, 4 of which are writing!).  The problem was found last year via inspection and confirmed with a model checker.  I’ve long had a fix figured out – use a simpler State Machine during the table copy – but just now finally got it implemented.

Gene Novark has agreed to model check the old version with SPIN (he’s checking the implementation, not the algorithm, and has found a few implementation bugs, which should already be fixed by the rewrite).  I’m hoping he’ll also do the new version!  In any case, he’s a smart guy and getting closer to graduation.

NonBlockingHashMapLong (a primitive-long-key’d NBHM for space-conscious users) is also getting more use and bug fixes.  I need to rewrite it to use the new State Machine; as part of that rewrite I’ll be able to fold in a classic space/time tradeoff flag.  The space-optimized version will use less (amortized) space than the normal HashMap version by a fair amount.  There’s also a NonBlockingSetInt which is essentially a classic auto-resizing bit-vector that’s non-blocking during the resize.
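
To give a feel for the non-blocking bit-vector idea, here is a minimal sketch of setting and clearing bits with a CAS retry loop.  It is not the NonBlockingSetInt code: the class name `CasBitSet` is made up for this example, the capacity is fixed, and the hard part (staying non-blocking while the vector resizes) is deliberately left out.

```java
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical fixed-capacity bit set updated with CAS; no resize support.
final class CasBitSet {
    private final AtomicLongArray words;

    CasBitSet(int capacity) {
        words = new AtomicLongArray((capacity + 63) >>> 6);   // 64 bits per word
    }

    // Sets bit i; returns true only if this call flipped it from 0 to 1.
    boolean add(int i) {
        int w = i >>> 6;
        long mask = 1L << (i & 63);
        while (true) {
            long old = words.get(w);
            if ((old & mask) != 0) return false;               // already set
            if (words.compareAndSet(w, old, old | mask)) return true;
            // CAS failed: another thread changed this word, so retry.
        }
    }

    // Clears bit i; returns true only if this call flipped it from 1 to 0.
    boolean remove(int i) {
        int w = i >>> 6;
        long mask = 1L << (i & 63);
        while (true) {
            long old = words.get(w);
            if ((old & mask) == 0) return false;               // already clear
            if (words.compareAndSet(w, old, old & ~mask)) return true;
        }
    }

    boolean contains(int i) {
        return (words.get(i >>> 6) & (1L << (i & 63))) != 0;
    }
}
```

No thread ever holds a lock here; a failed CAS only means some other thread made progress on the same word.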

JavaOne, Talks, Abstracts

The JavaOne submission deadline has come and gone.  So have MSPC, CGO tutorials, DaCapo and goodness knows what else.  I gave a talk at an internal Intel Dynamic Execution Environment forum, a Big Picture talk (“Challenges and Directions in the Multi-Core Era”) on why concurrent programming is so hard.  It was very well received (it helps that the audience got going early asking questions; I got connected with the audience and it made for a much better presentation).  I’ll try to get the slides up on the blog a little later.  I submitted these abstracts to JavaOne:

  • “Challenges and Directions in the Multi-Core Era” – why concurrent programming is hard.
  • “Towards a Coding Style for Non-Blocking Algorithms” – guess what that one’s about!
  • “Debugging Data Races” – same as last year, but the material is obviously still timely since I’m still debugging customer apps w/race bugs
  • “An Always-On Real Time Profiling and Monitoring Tool” – Azul’s version of an in-production profiling & monitoring tool

And this gem to MSPC:

  • “I Wanna Bit!” – a (really!) short paper on a magic bit I’d like from my hardware which will dramatically improve my ability to write non-blocking algorithms.  The bit isn’t all that magical and requires really minimal hardware (and yes I DO talk to the Azul hardware engineers!) – it’s basically an “atomic-read-set bit” that allows your L1 cache to be treated as a large atomic-read-set, along with the typical single-word atomic-write-set (e.g., CAS).
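
For anyone unfamiliar with the single-word CAS mentioned above, here is a minimal sketch of the usual retry loop, using an atomic running maximum as the example.  The class name `CasMax` is made up for illustration.  The proposed read-set bit is a hardware feature and can’t be shown in Java; this only shows the existing building block, where the one word being written is also the only word whose value gets validated.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical example: lock-free running maximum built on single-word CAS.
final class CasMax {
    private final AtomicLong max = new AtomicLong(Long.MIN_VALUE);

    // Raises the stored maximum to v if v is larger; never lowers it.
    void offer(long v) {
        while (true) {
            long cur = max.get();
            if (v <= cur) return;                     // nothing to do
            if (max.compareAndSet(cur, v)) return;    // succeeds only if max is still cur
            // Another thread changed max between the read and the CAS: retry.
        }
    }

    long get() {
        return max.get();
    }
}
```

The CAS checks exactly one memory word.  If the decision had depended on other words read along the way, nothing here would notice those words changing underneath it.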

Quick audience poll – do any of those pique your curiosity?

Other Stuff

  • Playing with Google index bench
  • Playing with several customer apps, some scaling to ridiculous levels
  • Writing a pure Java program to “Root” a JVM (already filed the bug report, no you can’t have the code yet)
  • Compiler optimizations in Azul’s JVM to reduce GC pause-time

Th-th-th-that’s all folks!
Cliff