Writing to Final Fields After Construction

[ed. Updated Oct 19, see below, heck it’s almost like THREE posts in the same month…]

Surprise!  TWO blog posts in the same month!!!

Surprise #2!  A popular open-source framework writes to final fields after object construction via generated bytecodes!  Final fields are supposed to be… well, final. I was quite shocked to realize the default Verification settings let such things go by (to support some common use-case in Reflection; de-serialization I think is the use-case).  Final fields are allowed to be written to exactly once in the constructor.  If you write to them either outside the constructor, or more than once in the constructor, or publish the “this” pointer of a partially constructed object with a final field, or write to the final via Reflection then…. All Bets Are Off.  See, for example, section 9.1.1. of this paper by Bill Pugh.  To quote:

“Another problem is that the semantics is designed to allow aggressive optimization of final fields.  Within a thread, it is permissible to reorder reads of a final field with calls to methods that may change final fields via reflection.”

In particular, the following code “behaves differently” (but legally) after the JIT kicks in:

  final Object _crunk;
  void someFunction() {
    if( _crunk == null ) // a First Read of _crunk
      init_crunk();      // Does this write to a final field?
    ..._crunk...         // Perhaps, a Second Read of _crunk or perhaps not.

In all cases there is a first-read of _crunk.  Before the JIT kicks in, there is a 2nd read of _crunk after the initialization call… because the interpreter executes each ‘getfield’ bytecode in isolation.  The JIT, however, notices that there are 2 reads of _crunk outside the constructor, and therefore it is profitable to do common-subexpression-elimination of the loads of _crunk – and legal to do so across the not-inlined-call “init_crunk()” because _crunk is a final field.  The setting of _crunk inside the “init_crunk()” call is ignored until the CPU leaves the JIT’d version of someFunction().

This optimization of the final field is the main optimization performed on final fields.  It is a crucially important optimization for Java asserts – with Java asserts, all the asserts are guarded by a test of a static final field.  Since that field is set once and known statically, the JIT’s can optimize on it’s current value.  In particular, the guarding field is typically set to false (asserts are turned off), and the JITs read the value early (at JIT time) and constant-fold the guard test away.  I.e., there’s no performance penalty for having turned-off asserts…. because the final field is read very very early, and then optimized based on it’s value.

In the case of our open-source framework, the framework is lazily de-serializing an object with final fields.  They test for a null value in a final field, which is an indication of a lazily-not-yet-de-serialized object.  If the final field is null, they call out to a complex de-serialization routine (which is sometimes not inlined into the local compilation unit by the JIT) which sets the final field.  After the test & de-serialization call, they then begin using the final field assuming it is not null.  Those later uses are optimized by the JIT to use the original value loaded from the final field… i.e. null.  The observation of the initializing write is delayed until after the CPU exits compilation unit.  The NEXT call to the same routine then loads the final field again and now observes the not-null value.

So What Do We Do?

So what do we do?  Do we ship a JVM which does a legal (and typically very useful) optimization on final fields… that happens to kill this one open source framework occasionally (after sufficient JITing under heavy load, and the right random pot-luck of inlining to trigger the behavior)?  (btw, I have been in communication with the people in the project, and they are now aware of the issue, and are working on it)    Do I make a magic flag “-XX:-OptimizeFinalFields” that you need to use with this framework  (it will be needed at least to run our JVM on existing jar files)?  How long do we need this flag?  I hate user-visible flags… they stick around forever, in ancient scripts of antiquity never to be changed again…

Right now I am leaning towards a solution which simply observes if a field is ever set outside a constructor, and disables final-field optimizations on a field-by-field case if it ever sees such an update.  No flag needed and it will “Do The Right Thing” in this case, although the JIT’d code will be somewhat pessimistic (in that no final-field optimizations will happen for all instances of that field when a single out-of-bounds update is observed).  How clever do I need to get, to keep the performance there for Everybody Else, while allowing old jar files for this framework to keep on doing what it is doing?

Cliff

Update Oct 19-

Doug Lea writes:

I just saw your blog post …

Every time I see things like this I feel sad/guilty that because we don’t give people a good way to do what they want (mainly, lazily-initialize or deserialize finals), they do [unfortunate] things instead.  If you have any good thoughts on offering better ways to do this, please let me know.  In the absence of any, maybe I ought to try another push for Fences API.

I was struck while reading this that the main problem in your intro example is that Java makes it too easy for programmers not to notice that fields and local reads of those fields are different  things.  In my concurrency course, I’ve tried to force students to notice by telling them that for one assignment, they are required to always explicitly declare/use locals. I wish I could say that this experience generalizes, but I still see too many cases of, for some volatile field:

if (field != null) field.f();

and similar brokennesses. When a language makes it hard to think clearly about something, it’s not surprising that people don’t.

Cliff writes:

So disallow mentioning a volatile variable “bare”.   It needs to be decorated, to make it more clear you are performing an action on access.  Require that for all volatile fields, you are limited to:

    "...field.**vol_read**() ..." and "field.**vol_set**(x)"?

Too bulky syntax… how about:

    "...field.**get_acq**() ..." and "field.**set_rel**(x)"?

No better.  So we switch to styles, how about Itanic load-acquire and store-release style:

    "... x.**ldacq** ..."  and "x.**strel** = ..."

Or using language more suitable to Java programmers:

    "...x.**get** ..." and "x.**put** = ..."

So your example below becomes:

    "if( field.**get** != null ) field.**get**.f();"

Obviously 2 “get” actions in a row.

But proposing new syntax for Java is water-under-the-bridge.  I’m with you on the “sigh”.  This can only be fixed in a new language.

Published by

wpengine

This is the "wpengine" admin user that our staff uses to gain access to your admin area to provide support and troubleshooting. It can only be accessed by a button in our secure log that auto generates a password and dumps that password after the staff member has logged in. We have taken extreme measures to ensure that our own user is not going to be misused to harm any of our clients sites.