JavaOne

JavaOne is fast upon us.  It’s a big year for Azul; we have 4 talks this year.  I’m giving 3 and Gil Tene, our CTO, is giving the other one.  I’m mentioning his talk specifically because it “sold out” and JavaOne kindly gave him a 2nd session in a new larger room at an earlier time – which will now be cavernously empty unless you sign up!  The talk is very interesting (to me) – because it’s essentially the GC version of “How NOT To Write A Microbenchmark” – it’s full of specific advice on how to benchmark GC for large applications, and what traps to avoid when trying to write your own GC benchmark or test suite.  Some observations: SpecJBB2000 was originally intended as a Java middleware benchmark, yet was specifically designed so that GC never happened during the timed portion of the run.  Do you never see a GC cycle during the “timed portion” of your Java workloads?  (SpecJBB2005 came about partially to correct this.)

Debugging Data Races is a re-run of a BOF I gave last year, but the material is more relevant than ever.  Slides here.  I had a number of comments from people who basically said “you shouldn’t write data-race bugs in the first place” implying that (1) writing such bugs is Bad because they are so hard to find & fix, and (2) you wanted to write such bugs in the first place!  My advice from last year still holds, and indeed I’ve got more evidence than ever that the bugs I’m targeting are the common bugs that bite you, and what I’m proposing is very accessible (no PhD required!).

Towards a Coding Style for Scalable Nonblocking Data Structures is … well, hopefully the title is fairly self-explanatory.  I’ve talked about this on this blog before; slides here.  The basic idea is a coding style that involves a large array to hold the data (allowing scalable parallel access), atomic update on those array words, and a Finite State Machine built from the atomic update and logically replicated per array word.  The end result is a coding style allowing me to build several non-blocking (lock-free) data structures that also scale insanely well.  All the code is available on SourceForge.  At this point I’m sore tempted to add transactions to the non-blocking hash table and essentially create an uber-data-grid.
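To give a flavor of the array-plus-atomic-update-plus-state-machine idea, here is a minimal sketch.  It is not the SourceForge code; the class name SlotArray, its methods and the three-state machine (EMPTY, value, TOMBSTONE) are all made up for illustration.  Each array word runs its own copy of the state machine, and every transition is a single compare-and-set on that word, so threads touching different words never interfere.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical illustration only. Per-word state machine:
//   EMPTY (null) -> value -> TOMBSTONE
// Every transition is one compare-and-set on a single array word.
final class SlotArray {
    private static final Object TOMBSTONE = new Object();
    private final AtomicReferenceArray<Object> slots;

    SlotArray(int size) {
        slots = new AtomicReferenceArray<>(size);
    }

    // EMPTY -> value. Returns false if the word has already left EMPTY.
    boolean claim(int i, Object value) {
        if (value == null) throw new NullPointerException("null means EMPTY");
        return slots.compareAndSet(i, null, value);
    }

    // value -> TOMBSTONE. Returns the removed value, or null if none was live.
    Object remove(int i) {
        while (true) {
            Object cur = slots.get(i);
            if (cur == null || cur == TOMBSTONE) return null;
            if (slots.compareAndSet(i, cur, TOMBSTONE)) return cur;
            // Lost a race with another thread; re-read the word and retry.
        }
    }

    // Reads the live value, hiding the TOMBSTONE sentinel from callers.
    Object get(int i) {
        Object cur = slots.get(i);
        return cur == TOMBSTONE ? null : cur;
    }

    public static void main(String[] args) {
        SlotArray a = new SlotArray(8);
        System.out.println(a.claim(0, "first"));   // wins the EMPTY -> value step
        System.out.println(a.claim(0, "second"));  // loses: word is no longer EMPTY
        System.out.println(a.remove(0));
        System.out.println(a.get(0));
    }
}
```

No thread ever blocks here: a failed compare-and-set just means some other thread moved that word's state machine forward, and the loser re-reads and decides what to do next.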

JVM Challenges and Directions in the Multi-Core Era is where Brian Goetz & I rant about the lack of any decent concurrent programming language.  Slides here.  Java’s the obvious best State-of-the-Practice concurrent programming language, and there are plenty of better “academic toy” concurrent programming languages Out There.

But none seem to tackle head-on the main problem programmers have with large concurrent programs: lack of name-space control over inter-thread communication, i.e. shared variables.  By default in Java, all global variables are shared with all threads.  Instance fields and code can be made private or public but you can't limit access e.g. to specific threads.  For example: all your DB threads in the DB thread pool can run all the code in your crypto-lib, should you accidentally let them get a crypto instance.  There is no language or compiler support to throw some kind of early access error.  Erlang and CSP deny all global variables; Fortress, X10 & Scala provide convenient ways to express some kinds of parallelism without shared variables, but among the recent, better-known crop of academic-toy languages I don't see anybody tackling the access-control issue head-on.
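To make the crypto-lib example concrete, here is a small sketch of the hand-rolled runtime check you are stuck writing today.  The names CryptoLib and sign are hypothetical, and the check is ordinary library code, not a language feature: nothing stops the wrong thread at compile time, and the error only shows up when the call actually happens.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only: 'public' and 'private' control which code
// may call a method, not which thread. The per-thread check below has to be
// written by hand and only fires at run time.
final class CryptoLib {
    private final Set<Thread> allowed = ConcurrentHashMap.newKeySet();

    CryptoLib(Thread... owners) {
        for (Thread t : owners) allowed.add(t);
    }

    public String sign(String msg) {
        Thread me = Thread.currentThread();
        if (!allowed.contains(me)) {
            throw new IllegalStateException(
                "thread " + me.getName() + " may not use CryptoLib");
        }
        return "signed:" + msg;  // stand-in for real crypto work
    }

    public static void main(String[] args) throws InterruptedException {
        CryptoLib lib = new CryptoLib(Thread.currentThread());
        System.out.println(lib.sign("hello"));  // owner thread: allowed

        // A "DB" thread that accidentally got hold of the crypto instance.
        Thread db = new Thread(() -> {
            try {
                lib.sign("hello");
            } catch (IllegalStateException e) {
                System.out.println("rejected: " + e.getMessage());
            }
        }, "db-worker");
        db.start();
        db.join();
    }
}
```

Delete the check and the DB thread signs things happily, which is exactly the complaint: the compiler has no notion of who is allowed to touch what.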

All right, I’ll stop before I start really ranting, but it’s a topic I’m passionate about.
JavaOne promises to be an exciting time; come and see the show!

Cliff