Too Much Theory, Part 3

This is the 3rd and last part of this blog!  (thank heavens!), where I wax poetic about Lattices and Type Theory, and their applications to Compilers and in particular Java JITs. Part 1: Too Much Theory Part 2:  Too Much Theory, Part 2 ## A Quick Recap We’re building a lattice, to allow us to do exciting optimizations on Java programs.  Our lattice has distinguished top and bottom types, and is Commutative, Associative, and Symmetric.  We understand pointers can be null, not_null, or unknown (e.g. bottom).  Because of symmetry across the lattice centerline, we also have the dual of not_null which we’ll call any_null.  For this recap, we’ll also have Java Classes (e.g. String), but we’re going to ignore subclassing for the moment.  Our lattice looks something like this: Wow, that’s getting fancy…  lets revisit some of the elements in this lattice real quick: * bottom: All possible values, including null, as computed by running the program.  No constants. * String:bot: All possible Strings, including null, as computed by running the program.  No constants, no XMLNodes.  For brevity I will sometimes refer to this as String (no trailing :bot notation). * XMLNode:bot or plain XMLNode: All possible XMLNodes, including null, as computed by running the program.  No constants, no Strings. * bottom:not_null: All possible values, as computed by running the program.  No constants, no null. * String:not_null: All possible Strings, as computed by running the program.  No constants, no XMLNodes, no null. * String:hello_world: The constant String “hello world”, and nothing else. * null: The constant null. * String:any_null: All possible String constants, all at once.  No XMLNodes, nor null.  An impossible situation in a real program, but a useful mathematical notion when running Constant Propagation. * String:top: All possible String constants and the null constant, all at once.  No XMLNodes. * top: All possible constants, including all Strings and all XMLNodes and null, and all them all at once. Notice the symmetry: ever Node below the centerline of constants also appears in dual form above the centerline.  The edges are also symmetric: e.g. top has 4 edges out and bottom has 4 edges in, String:top has 1 in, 2 out, and String:bot has 2 in, 1 out, etc. This lattice will let us find the constants in code like this example from last time:

final class LinkedListThing {
  LinkedListThing _next; // instance field
  int hashCode() {
    int hash = 0;
    Object x = this; // x is LinkListThing:not_null
    while( x != null ) {
      hash += x.hashCode(); // x is LinkListThing:not_null
      if( x instanceof LinkedListThing )
         x = ((LinkedListThing)x)._next; // x is LinkedListThing
      else // with Conditional Constant Propagation...
         x = "fail"; // ...this path is dead
      // x is LinkedListThing:bottom and not a String
    }
    return hash;
  }
}

And Now Gentle Readers, Let Us Ponder Subtypes Java, of course, has subtypes. So does C++ and many other languages. Lets take a look at what it would take to add subtypes to our lattice. This tiny program snippet makes exactly a Hashtable:

  Hashtable x = new Hashtable(); y = x.get(); // Calls Hashtable.get

And this snippet makes a known subclass of Hashtable, but treats it like a Hashtable:

  Hashtable x = new Provider();  y = x.get(); // Calls Provider.get

And this snippet mentions an UNknown subclass of Hashtable:

  void foo( Hashtable x ) {      y = x.get(); // Calls ???

When does ‘Hashtable x’ refer to exactly a Hashtable, and when does it refer to some subclass of Hashtable? Knowing if some value is exactly a Hashtable versus a subclass is very useful: we can optimize calls made to exactly known classes. Example: What function is called when we call “x.get()”? Well, if x is exactly a Hashtable, we get Hashtable.get() … and we can inline “get()”. If x is a Hashtable-or-a-subclass, then “x.get()” might be Provider.get() or Hashtable.get() or some other user-specified derived version of “get()”, and we cannot inline. The job of figuring out if ‘x’ is exactly of class Hashtable, or some subclass of Hashtable falls to Constant Propagation – and that requires we represent this notion of ‘exact class’ in the lattice. Lets add the notion ‘Hashtable:exact’ to mean exactly a Hashtable, and NO subclasses, while plain ‘Hashtable’ allows subclasses. This is an independent axis from allowing null, so we can still have e.g. Hashtable:exact:not_null.  Note that final classes like String cannot subclass and so are always String:exact; constants are always the exact class that they are, e.g. Hashtable:exact:0x1234. I have another “Lattice” for you to ponder, with Hashtables instead of XMLNodes.  I’m still using bottom in places but I might as well use Object:bot or just plain Object.  Also, I ‘twisted’ the picture slightly: lattice elements in the middle all exclude null, and those on the outer edges all include null.  I had to duplicate the null element to make the graph lay flat visually, but really they are the same element.  I could have laid the graph ‘flat’ on a cylinder, but the web’s not up to 3-D visualization yet. . ## And Now The Punch Line And now for the Trick Question: what happens when I end up mixing together (with Phi functions on loop backedges) Hashtable:top with String:exact:not_null?  (A different question is how I come to such a situation that I attempting to mix these types… but trust me I’ve seen this happen in QA from suitably complex programs!)  So back to my Question: I am look for the most precise answer possible, anything less and I’ll be losing type information for no good reason.  So for example bottom is a correct answer, but very conservative – we can do better than that! How about bottom:not_null?  Since for Hashtable:top the compiler can pick any Hashtable it wants – we want it to pick a Hashtable, ANY Hashtable.  The result of mixing that with String:exact:not_null is … some random not-null Object: bottom:not_null.  This might let me remove a null-pointer check at compile time. But we could also take just the null from Hashtable:top, since the compiler is allowed to pick that instead!  Remember: the top label means the compiler “gets to choose” during the course of Constant Propagation – and picking a more precise answer makes it more likely we’ll find a constant and have our choice remain correct.  So mixing null and String:exact:not_null yields String:exact.  This might, e.g., let me convert an unknown x.hashCode() into a known String:hashCode() call. So which one do I pick?  bottom:not_null or String:exact?  Typically I have no idea during the course of CP which one will pan out better – or actually if either will lead to a better final answer.  The Real Trick here is: I should not be required to pick.  The Constant Propagation algorithm relies on having a lattice – and lattices are defined as having a unique lower-bound (and unique upper-bound) for any two elements.  This structure does NOT have a unique lower-bound, and so is no longer a lattice!  For some particular Java program and some particularly bad luck with our CP algorithm – the CP algorithm can hang indefinitely, or miss some constants on some runs – but find those constants on other runs for the same Java program.  Garbage-In Garbage-Out.  CP is technically wedged. In practice HotSpot’s CP does not hang, although I suspect I can carefully arrange a Java program for which it would.  Instead, I end up triggering asserts in QA about how my lattice is lacking the proper symmetry (and hence losing constants for no other good reason, but only in weird programs).  I did construct 2 simple programs for which choosing bottom:not_null versus String:exact would find a constant (and enable an optimization) in one program and not the other… and vice-verse for reversing  choices. Cliff ## Postlude As you might have guessed by finding this blog, I’m off to a new adventure – this time using a JVM instead of building one (well maybe: I’m very pragmatic; may the best language win!).  And while I’ll probably still be tinkering in HotSpot from time to time, I think my future blogs will mostly be about my new adventure! – till next time, Cliff