in Personal

Too Much Theory, Part 2

As I write this I am flying back on SwissAir having just spent a week in beautifully freezing downtown Stockholm (highs are below freezing all this week) speaking at JFokus.  My talks went well; the audience was engaged and I had lots of  questions and comments afterwards.  The speakers were treated to a marvolous dinner (thanks Mattias!) which included singing by some of the JFokus crew.  I attended a few talks and wish I had time for more; JFokus definitely was a fun conference to attend.

When not at JFokus, Shelley & I managed to see The Vasa, a 1600’s era warship that sank on her maiden voyage, barely out of dock.  She lay forgotten for 300 years and was finally raised in the 1960’s and preserved & restored in the 1970’s and 1980’s.  All the iron bolts had rusted out (the iron cannons were salvaged a mere 50 yrs after her sinking) but all else remained.  Apparently the salinity in the harbor prevented the usual shipworms from eating everything wooden; the ship is 95% original.  She remains by far the largest salvaged wooden structure in the world.  She’s an amazing sight to behold and well worth the museum trip.  We also managed to visit the Swedish historical museum and ride the local trams some.  Only the horrible hacking cough Shelley picked up kept us from doing more exploration (nothing like being sick in a foreign land).

Assuming the rest of our trip back remains easy (first flight out was 15min delayed, giving us a whopping 30min to change from Terminal A to Terminal E in Zurich, including 2 passport checks – but we made it with only a little running), I’m going to dive into Part Two of my Deep Dive into Too Much Theory.

Last week, if you recall (I’ve wanted to use that phrase in a sentence for a long time now!  🙂 we were looking at the link between lattices and constant propragationConstant propagation is a means to discover constants in programs which in turn can enable more optimizations (e.g. replacing the obvious math producing the constant with just the constant directly).  Lattices are how we describe the kinds of “constants” we can discover.  As already discussed, we’re not just finding simple integer constants like “3”, but possibly things like the null pointer, or that a pointer is not_null (some unknown value… just never null), or ranges of integers (useful to remove some kinds of array range checks).  So when I say “Constant” what I really mean is “Some interesting values” – or properties about values – not necessarily limited to true constant values.

Before we dive into extending our lattice into even more exciting forms, I want to talk about properties of the lattice directly.  We need some properties or we’re doomed – but there are some properties that we want that are less obvious.

A Commutative, Associative, and Symmetric Lattice

  • A lattice is a collection of elements (e.g. top, bottom, 0, 1, 2, 3, …)
  • With an operator for mixing the elements called meet, which defines a partial order over the elements

We used meet when mixing different values for the same variable because of control-flow; (for you SSA wonks it is the operation given to Phi functions) meet produces another element in the lattice.  Thus lattices are closed under the meet operation.  Lattices also need:

  • A distinguished top-most and bottom-most values.  We call these values top and bottom respectively.  They represent the start and endpoints of Constant Propagation.
  • Our lattices need to be bounded height: meaning we cannot apply the meet operator an infinite number of times before we hit bottom.  The height of the lattice dictates our running time.  Especially for a JIT short running time is key; in practice lattices used in compilers tend to have a very small height.  For HotSpot it is something like 5 or 6.
  • We want our meet operator to be commutative, associative and symmetric.  The HotSpot C2 lattice has all these properties, and is defined using the above meet operator and a dual operator: reflection across a centerline.  The Wikipedia article uses meet and join to define a lattice, but HotSpot uses meet and dual because its easier to implement in C++.

The reason for these properties is a little subtle: if the meet operator lacks these then the lattice isn’t such a simple structure – and the order of application of meet will matter.  i.e., the order we visit our program and apply our constant propagation will matter.  Some visitation orders will give better answers than others.  Example: Suppose meet is not commutative and ‘b=3’ on entry to a loop and ‘b=TOP’ on the backedge.  If (3 meet TOP) is (3) but (TOP meet 3) is (BOTTOM) then we lose a constant – and it matters how (& when) we flow information around.  By requiring a commutative, associative and symmetric lattice we get a “Church-Rosser” style property, and this in turns means we can use an un-ordered worklist style technique and get the same best answer no matter how things come off the worklist – which is exactly how HotSpot implements Constant Propagation.

  • We want our lattice to be reflective around some centerline (which we choose as the line of simple constants); this is implied from the symmetry on the meet operator above.

The prior integer lattice was commutative, associative & symmetric:


    ...   -2      -1     0     1     2     3  ...
But the lattice with ranges from the prior blog (in additional to having a ridiculous bound) was not symmetric.  Lets try the lattice of ranges again – but limited to 2 values and then we'll add symmetry.  As before if we try to merge (**meet**) a lattice element of X with X+1 we'll make a short range, but if we meet a range of 2 values with a third – instead of expanding the range we will fall to _**bottom**_.  Here's our lattice:
    ...   -2      -1     0     1     2     3  ...
    ...  (-2,-1) (-1,0) (0,1) (1,2) (2,3) (3,4) ....

Is this lattice commutative?  i.e., is (a meet b) == (b meet a)?  Lets look at some examples.  What happens when we meet top and -1?  (top meet -1) == -1 == (-1 meet top), so this example works.  So does: (1 meet 2,3) == bottom == (2,3 meet 1).  So does: (0 meet 0,1) == 0,1 == (0,1 meet 0). I claim the “obvious interpretation” of this lattice is commutative.

Is this lattice associative?  i.e., is ((a meet b) meet c) == (a meet (b meet c))?  Again, running a few examples can convince us the lattice is indeed associative.

Is this lattice symmetric?  Well, in a word: No.  The opposite of top is bottom.  Things on the centerline are symmetric with themselves (e.g. in the lattice world, 3 is the opposite of 3).  How about a range like (1,2)?  It represents “1 or 2 but we don’t know which, and no other number”.  What is opposite the centerline from the range (1,2)?  Well, it’s a little weird: it is both the constant 1 AND the constant 2 at the same time.  NOT the more obvious “1 OR 2 and we don’t know which one it is”.  Whoa… lets digest that for a second.  Like top, it is an unnatural state of affairs (except in mathematics) whose concept is a little slippery.  Like top, we want to embrace a concept of allowing as much flexibility in number choices as we can… while still being a little bit more constrained than top.  Remember: top is ALL constants ALL the time; we’ll settle here for just 2 constants, but BOTH AT ONCE.

Having discussed the concept to death, lets settle on some nomenclature and then run some examples.  Lets write the opposite of the range (1,2) as the inverted range (2,1) (the high and low values are inverted).  Another syntax might be pre-pending a ‘~’ as in: ~(1,2).

     ..(-1,-2)  (0,-1) (1,0) (2,1) (3,2) (4,3) ....
     ...   -2      -1     0     1     2     3  ...
     ...  (-2,-1) (-1,0) (0,1) (1,2) (2,3) (3,4) ....
Notice how our lattice looks symmetric across the centerline now?  Here's the same example from the last blog using _**top**_:
  b = 1;       // b is the constant '1' here.
  while( P ) { // b is _**top**_ around the backedge -
               //   which matches the constant '1'
    S1;        // b is a '1' here
    b = 2-b;   // b is STILL a '1' here!
  }            // we bring a '1' value around for 'b' here
Again using the inverted range (2,1):
  b = 1;       // b is the constant '1' here.
  while( P ) { // b is (2,1) around the backedge -
               //   which matches the constant '1'
    S1;        // b is a '1' here
    b = 2-b;   // b is STILL a '1' here!
  }            // we bring a '1' value around for 'b' here
Ok, why bother with inverted ranges when _**top**_ covers everything?  Because we use them whenever we can **assert** a property is true, such as just after the property is tested with an IF:
  b = ...; // b is BOTTOM
  if( b >= 1 && b <= 2 ) {
    // here we can **assert** that b is the **join**
    //      of whatever it was before, and (1,2)
    b;     // b is range (1,2)
New jargon word: **join**, an operation which produces the intersection of properties (both the state of 'b' before the IF must be true, but also inside the if() the test must be true – so both properties are true at once).  Contrast this to the **meet** operator which yields the union of unknown properties.  The **join** of A and B can also be defined as: **~(~A meet ~B)**, and indeed this is exactly how HotSpot's C2 compiler computes it from src/share/vm/opto/type.cpp:


// JOIN operation; higher in lattice.
> // Done by finding the dual of the
> // meet of the dual of the 2 inputs.
> const Type *join( const Type *t ) const {
>  return dual()->meet(t->dual())->dual(); }
Back to our example of using **join** in a sentence:
  b = ...; // b is _**bottom**_
  if( b >= 1 && b <= 2 ) {     // b is the join of _**bottom**_ and (1,2)
    // b is ~(~_**bottom**_ **meet** ~(1,2))
    // b is ~(_**top**_ **meet** (2,1))
    // b is ~(2,1)
    // b is (1,2)
The symmetry property implies that:
if  **M == (A meet B)**, then:**~A == ~A meet ~M**     AND**~B == ~B meet ~M**
Our range lattice is now symmetric, so lets test that on some A,B pairs.
Example 1:  _**top**_ & (1,2)
M == (_**top**_ **meet**(1,2)) == (1,2)~_**top**_ =? ~_**top**_ **meet** ~(1,2) **_bottom_** =? **_bottom_** **meet** (2,1)

**_bottom_** == _**bottom**_

~(1,2) =? ~(1,2) **meet** ~(1,2)

(2,1) =? (2,1) **meet** (2,1)

(2,1) == (2,1)

Example 2: 1 & (0,1)
M == (1 **meet**(0,1)) == (0,1)~1 =? ~1 **meet** ~(0,1) 1 =? 1 **meet** (1,0)

1 == 1

~(0,1) =? ~(0,1) **meet** ~(0,1)

(1,0) =? (1,0) **meet** (1,0)

(1,0) == (1,0)


We have a **lattice**, with a distinguished _**top**_ and _**bottom**_ elements.  It has a **meet** operator which defines the lattice structure; **meet** is _commutative_, and _associative_.  The **lattice** is also _symmetric_ about the centerline and we use the _dual_ operator “~” to refer to elements across the centerline.  Constants exist on the centerline.  The elements above the centerline are used after IF statements to assert properties, or for unknown loop backedge values – and represent 'impossible' situations like some expression producing multiple constants being true _**at the same time**_.

We (almost) have enough knowledge now to be dangerous!  Lets build a more realistic Java lattice now.

## A Class Hierarchy

Our pointers have more structure than just **null** or **not_null**; they have types!  We have a Class Hierarchy.  Can we do something with it?  Somewhat similar to adding ranges to plain integer constants, the base case is simple but we need to be careful as we get clever.  The payoff is also very high: if we can deduce a **constant class** for a pointer, and the pointer is used to make a virtual call then we can simplify the virtual call to a static call … and maybe even inline.  That's a big payoff, so it's worth some effort here.

Lets go with a simple starter lattice; similar to the integers we'll look JUST for things are known to be a specific constant class… and NOT any subclass.  Weird, I know, but bear with me:


            _**top**_ - **any** single class the compiler desires
> String   Object(no subclass!)   XMLNode   Fred
>            _**bottom**_ - multiple choices,
>            e.g. String or XMLNode or a subclass

A common source of known-class-thingies is the '**this**' pointer in calls (and usually any other arguments), so my examples start that way:


boolean String.weird_equal( Object that ) {
>   // 'this' has the constant Class String, no subclasses!
>   // 'that' has the Class _**bottom**_
>   //        since it might be a subclass of Object
>   int hash1 = this.hashCode();
>   int hash2 = that.hashCode();
>   if( hash1 != hash2 ) return false;
>   if( !(that instanceOf String) ) return false;
>   return equals((String)that);
> }

What does CP tell us here?  Lets roll it forward one line at a time:

        int hash1 = this.hashCode();  // '**this**' has Class String
                                      //  and is **not_null**

**this** has the class String, so this is really a call to String.hashCode() and we can (now) happily inline it.

        if( that == null ) throw NPE; // hidden built-in NPE check...
        int hash2 = that.hashCode();  // 'that' has Class **_bottom_**
                                      // and is **not_null** 

No constant Class here, so this becomes a not-inlinable virtual-call, i.e. expensive.

        if( hash1 != hash2 ) return false;
        if( !(that instanceOf String) ) return false;

Hey!  After the 'if' test we know that '**that**' has Class String!  We already know that '**that**' is **not_null** – this gives us 2 orthogonal lattices we can track: pointers might be **null** or not, and they might be a constant Class or not.

        return equals((String)that); // the cast can be optimized away!

And since '**that**' is known to be the class String, the JIT can remove the cast.

A quick summary of what we have so far:
  pointers can be **null** or **not_null** (or _**top**_ or _**bottom**_)
    and this info is used to remove _null checks_
  pointers can be constant Classes (or _**top**_ or _**bottom**_)
    and this info is used to _de-virtualize_ calls
    and remove extra type checks
  We know both kinds of lattice info about pointers, at the same time.

Similar to integer constants, our pointer Constant Propagation can be optimistic or pessimistic.  Here's some code where being optimistic might help:


  **final class** LinkedListThing {
>     LinkedListThing _next; // instance field
>     int hashCode() {
>       int hash = 0;
>       Object x = **this**;
>       while( PRED ) {
>         hash += x.hashCode();
>         if( x **instanceof** LinkedListThing )
>            x = ((LinkedListThing)x)._next;
>         else
>            x = "fail";
>       }
>       return hash;
>     }

Deep breathe – okay, this example is getting large for a blog.  Let's break it down.  What happens if we are pessimistic?


  int hashCode() {    // this is Class LinkedListThing
>     int hash = 0;
>     Object x = **this**;  // x also is Class LinkedListThing
>     while( PRED ) {  
>       // x is _**bottom**_ around the backedge
>       hash += x.hashCode(); // unknown v-call
>       if( x **instanceof** LinkedListThing )
>       // _**bottom**_ so needs the full type-check
>            x = ((LinkedListThing)x)._next;
>            // x is Class LinkedListThing
>       else x = "fail";
>       // Else x is Class String if the full type-check fails
>     } // Here x is either String or LLT... so _**bottom**_
>     return hash;
>   }

In one pass of CP we “discover” that 'x' is Class _**bottom**_ – not much help here!  Let's go again with the optimistic approach:


  int hashCode() {    // this is Class LinkedListThing
>     int hash = 0;
>     Object x = **this**;  // x also is Class LinkedListThing
>     while( PRED ) {   // x is _**top**_ around the backedge;
>                       //   assume it is LLT
>       hash += x.hashCode(); // known call to LLT.hashCode()
>       if( x **instanceof** LinkedListThing )
>            // type-check not needed!
>            x = ((LinkedListThing)x)._next;
>            // x is Class LinkedListThing
>       else ... // Not reachable; x is the correct Class already
>     } // Here x is LinkedListThing!
>     return hash;
>   }

Hey presto!  We figure out what the programmer shoulda wrote already: that 'x' is of type LinkedListThing, and the **instanceof** test is useless.  Of course, we might have gotten here from some prior rounds of inlining, where 'x' could only be typed as an Object so lets not blame our strawman programmer (who's really just me trying to make some not-too-contrived examples).

All right, time to wrap up this blog.  **Next** time we'll add subclasses (and maybe arrays and/or interfaces) and see what happens.  But for now this blog is big enough.  Besides, me & my daughter hiked up to Upper Yosemite Falls and then hiked down at night, strictly by starlight… but that's a exciting story for the next blog.