in Personal

Death by Decompilation

I recently got involved in trying to improve performance of a large customer application.  Azul’s JVMs come with an always-on profiling tool – RTPM – so it was easy to spot where the time was going.  The name has been changed to protect the guilty but the name might as well have been “addToSet”.  A quick check of the JIT’d assembly showed that all time was spent in spinning down a long array (an ArrayList actually), checking for duplicates.  In the end, no dup gets found and the new element gets appended.  Adding N elements gives us a classic O(N^2) algorithm.  The array was already >32000 elements in length at the time we profiled it, so time is proportional to (32K^2) ~ 1billion.  Ugly stuff.

Naturally (grumble) this code is in a large 3rd party not-open-source application, where the owning corporation has been recently purchased by an even larger Faceless Corporation.  The “proper approach” would be to file a performance bug and wait and wait and …

We voted to try and work around the problem by doing a little decompilation, hacking the ArrayList into a HashSet, and inserting a new jar file in the bootclasspath.  So we pulled this class file out of the pile-o-jar’s and hit it with the standard decompiler jad (www.kpdus.com/jad.html).  Presto!  About 2000 lines of fairly legible Java code.  Plus a pile-o-warnings about un-decompilable stuff (Ut-oh!) mostly around nested try/catch/finally clauses and inner classes.  I surf the Java code looking for other uses of the offending ArrayList – there are definitely some – and start to see places where jad bailed out.  When I turn javac loose on the new source file, I get lots of error messages.

No problem thinks I, I’ll just get another decompiler.  I immediately google-up cavaj (cavaj-java-decompiler.en.softonic.com).  A few moments later I get another Java file – and whoops!  It’s header mentions jad by name down to the exact same revision.  Sure enough this decompiler is a GUI front-end for jad complete with identically broken Java code.  Turns out there are a number of such GUI wrappers for jad.  So I search some more, discarding mocha (www.brouhaha.com/~eric/software/mocha) for being abandoned, ClassCracker (mayon.actewagl.net.au/index.html) because the demo version didn’t produce any Java code at all, and IceBreaker (www.breakertech.com) because their web site was down.  I also tried Java Decompiler (java.decompiler.free.fr) but it also produced loads of syntax errors although in different places than jad.  DJ (members.fortunecity.com/neshkov/dj.html) also choked on my class file but just barely.  Very long forward jumps spanning nested try/finally blocks would confuse it – but the inner classes came out looking nice.  A colleague also tried JDec (sourceforge.net/projects/jdec) with limited success.  I also stumbled across this nice list of decompilers from google.

In the end, I hand-integrated the code from JDec & DJ to get something that would compile.  I then hand-refactored the code to insert (probably re-insert) generics and Java6 style iterators – this step was entirely mechanical on my part but no decompiler I tried did it.  The resulting code was much smaller and more readable… and made a bug in the code more obvious (there are “before” & “after” Sets that periodically get unioned and during the union the dup-elimination code is broken and allows dups).  The actual replacement of ArrayList with a HashSet took only a few minutes.

In summary: decompiling a single class file to reasonable-looking java code took about 6 hrs (and 7 decompilers looked at and 5 actually turned loose on the class file), while the refactoring to correct the egregious performance bug took maybe 10-minutes (yanked all the search loops and just used ‘contains’ and ‘addAll’) and included fixing the dups-allowed bug as a pleasant side-effect.

Is it time for another Java decompiler bake-off?  Has somebody got a recent comparison of decompilers floating about?

Thanks,
Cliff