This is the 3rd and last part of this blog! (thank heavens!), where I wax poetic about Lattices and Type Theory, and their applications to Compilers and in particular Java JITs. Part 1: Too Much Theory Part 2: Too Much Theory, Part 2 ## A Quick Recap We’re building a lattice, to allow us to do exciting optimizations on Java programs. Our lattice has distinguished * top* and

*types, and is Commutative, Associative, and Symmetric. We understand pointers can be*

**bottom****null**,

**not_null**, or unknown (e.g.

**). Because of symmetry across the lattice centerline, we also have the dual of**

*bottom***not_null**which we’ll call

**any_null**. For this recap, we’ll also have Java Classes (e.g. String), but we’re going to ignore subclassing for the moment. Our lattice looks something like this: Wow, that’s getting fancy… lets revisit some of the elements in this lattice real quick: *

*: All possible values, including*

**bottom****null**, as computed by running the program. No constants. *

**String:bot**: All possible Strings, including

**null**, as computed by running the program. No constants, no XMLNodes. For brevity I will sometimes refer to this as

**String**(no trailing

**:bot**notation). *

**XMLNode:bot**or plain

**XMLNode**: All possible XMLNodes, including

**null**, as computed by running the program. No constants, no Strings. *

**bottom****:****not_null**: All possible values, as computed by running the program. No constants, no

**null**. *

**String:not_null**: All possible Strings, as computed by running the program. No constants, no XMLNodes, no

**null**. *

**String:hello_world**: The constant String “hello world”, and nothing else. *

**null**: The constant

**null**. *

**String:any_null**: All possible String constants, all at once. No XMLNodes, nor

**null**. An impossible situation in a real program, but a useful mathematical notion when running Constant Propagation. *

**String:top**: All possible String constants and the

**null**constant, all at once. No XMLNodes. *

*: All possible constants, including all Strings and all XMLNodes and*

**top****null**, and all them all at once. Notice the symmetry: ever Node below the centerline of constants also appears in dual form above the centerline. The edges are also symmetric: e.g.

*has 4 edges out and*

**top***has 4 edges in,*

**bottom****String:top**has 1 in, 2 out, and

**String:bot**has 2 in, 1 out, etc. This lattice will let us find the constants in code like this example from last time:

final class LinkedListThing { LinkedListThing _next; // instance field int hashCode() { int hash = 0; Object x = this; // x is LinkListThing:not_null while( x != null ) { hash += x.hashCode(); // x is LinkListThing:not_null if( x instanceof LinkedListThing ) x = ((LinkedListThing)x)._next; // x is LinkedListThing else // with Conditional Constant Propagation... x = "fail"; // ...this path is dead // x is LinkedListThing:bottom and not a String } return hash; } }

## And Now Gentle Readers, Let Us Ponder Subtypes Java, of course, has subtypes. So does C++ and many other languages. Lets take a look at what it would take to add subtypes to our lattice. This tiny program snippet makes exactly a Hashtable:

Hashtable x = new Hashtable(); y = x.get(); // Calls Hashtable.get

And this snippet makes a known subclass of Hashtable, but treats it like a Hashtable:

Hashtable x = new Provider(); y = x.get(); // Calls Provider.get

And this snippet mentions an UNknown subclass of Hashtable:

void foo( Hashtable x ) { y = x.get(); // Calls ???

When does ‘Hashtable x’ refer to exactly a Hashtable, and when does it refer to some subclass of Hashtable? Knowing if some value is **exactly** a Hashtable versus a subclass is very useful: we can optimize calls made to exactly known classes. Example: What function is called when we call “x.get()”? Well, if x is exactly a Hashtable, we get Hashtable.get() … and we can inline “get()”. If x is a Hashtable-or-a-subclass, then “x.get()” might be Provider.get() or Hashtable.get() or some other user-specified derived version of “get()”, and we cannot inline. The job of figuring out if ‘x’ is exactly of class Hashtable, or some subclass of Hashtable falls to Constant Propagation – and that requires we represent this notion of ‘exact class’ in the lattice. Lets add the notion ‘**Hashtable:exact**’ to mean **exactly** a Hashtable, and NO subclasses, while plain ‘**Hashtable**’ allows subclasses. This is an independent axis from allowing **null**, so we can still have e.g. **Hashtable:exact:not_null**. Note that final classes like String cannot subclass and so are always **String:exact**; constants are always the exact class that they are, e.g. **Hashtable:exact:0x1234**. I have another “Lattice” for you to ponder, with Hashtables instead of XMLNodes. I’m still using * bottom* in places but I might as well use

**Object:bot**or just plain

**Object**. Also, I ‘twisted’ the picture slightly: lattice elements in the middle all exclude

**null**, and those on the outer edges all include

**null**. I had to duplicate the

**null**element to make the graph lay flat visually, but really they are the same element. I could have laid the graph ‘flat’ on a cylinder, but the web’s not up to 3-D visualization yet. . ## And Now The Punch Line And now for the Trick Question: what happens when I end up mixing together (with Phi functions on loop backedges)

**Hashtable:top**with

**String:exact:not_null**? (A different question is

**how**I come to such a situation that I attempting to mix these types… but trust me I’ve seen this happen in QA from suitably complex programs!) So back to my Question: I am look for the

**most precise**answer possible, anything less and I’ll be losing type information for no good reason. So for example

*is a correct answer, but very conservative – we can do better than that! How about*

**bottom****bottom:not_null**? Since for

**Hashtable:top**the compiler can pick any Hashtable it wants – we want it to pick a Hashtable, ANY Hashtable. The result of mixing that with String:exact:not_null is … some random not-null Object:

**bottom:not_null**. This might let me remove a null-pointer check at compile time. But we could also take just the

**null**from

**Hashtable:top**, since the compiler is allowed to pick that instead! Remember: the

**top**label means the compiler “gets to choose” during the course of Constant Propagation – and picking a more precise answer makes it more likely we’ll find a constant and have our choice remain correct. So mixing

**null**and

**String:exact:not_null**yields

**String:exact**. This might, e.g., let me convert an unknown x.hashCode() into a known String:hashCode() call. So which one do I pick?

**bottom:not_null**or

**String:exact**? Typically I have no idea during the course of CP which one will pan out better – or actually if either will lead to a better final answer. The Real Trick here is:

*I should not be required to pick*. The Constant Propagation algorithm relies on having a lattice – and lattices are defined as having a unique lower-bound (and unique upper-bound) for any two elements. This structure does NOT have a unique lower-bound, and so

*is no longer a lattice*! For some particular Java program and some particularly bad luck with our CP algorithm – the CP algorithm can

*hang indefinitely*, or miss some constants on some runs – but find those constants on other runs

*for the same Java program*. Garbage-In Garbage-Out. CP is technically wedged. In practice HotSpot’s CP does not hang, although I suspect I can carefully arrange a Java program for which it would. Instead, I end up triggering asserts in QA about how my lattice is lacking the proper symmetry (and hence losing constants for no other good reason, but only in weird programs). I

*did*construct 2 simple programs for which choosing

**bottom:not_null**versus

**String:exact**would find a constant (and enable an optimization) in one program and not the other… and vice-verse for reversing choices. Cliff ## Postlude As you might have guessed by finding this blog, I’m off to a new adventure – this time

**using**a JVM instead of

**building**one (well maybe: I’m very pragmatic; may the best language win!). And while I’ll probably still be tinkering in HotSpot from time to time, I think my future blogs will mostly be about my new adventure! – till next time, Cliff