Worlds Apart – JVMs and CLIs

Glyn Normington: It’s interesting to speculate what would have happened if a static module system had been put into Java much earlier. My guess is that it wouldn’t have bothered to address versioning or dynamicity requirements which only become crucial in the context of relatively large systems with continuous operation.

Indeed it is interesting to speculate. Mono and .NET are both implementations of the Common Language Infrastructure (CLI). They have had a static module system from the beginning, and they power many applications, from best-in-show MP3 players to relatively large systems with continuous operation such as MySpace and, shortly, Second Life.

CLI implementations and JVM implementations couldn’t be more different. At first glance they appear very similar (both have an intermediate language and both use a JIT), but the similarities seem to end there.

CLI implementations have been designed to apply classic compiler optimizations using code analysis techniques much like those found in a typical C++ compiler. For example, their approach to method inlining is very simple. They also JIT all code once, on first use. They can do this because most of the CLI languages have been designed for compilation using classic analysis techniques.

JVMs, on the other hand, use runtime knowledge to decide what to optimize. Rather than using the classic techniques they guess a lot (e.g. about whether a method can be inlined), and they have to keep those guesses around in case they load a class that invalidates them. They also have different levels of optimization, saving the most aggressive for the code that is used the most (i.e. the hotspots). They take these approaches because the class loading rules in Java make the application of classic code analysis techniques virtually impossible.
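
To make that concrete, here is a minimal sketch (the class and method names are purely illustrative, not taken from any real codebase) of the kind of guess involved: while only one implementation of an interface has been loaded, the JIT can treat the call site as monomorphic and inline it, but a class loaded later by name invalidates that assumption and forces deoptimization.

```java
// Illustrative sketch: why a JVM's inlining decision is a guess.
interface Greeter {
    String greet();
}

class English implements Greeter {
    public String greet() { return "hello"; }
}

public class SpeculationDemo {

    static String run(Greeter greeter) {
        // Hot call site: while English is the only Greeter the JVM has
        // loaded, the JIT may devirtualize and inline this call.
        return greeter.greet();
    }

    public static void main(String[] args) throws Exception {
        Greeter g = new English();
        for (int i = 0; i < 1000000; i++) {
            run(g); // warm up so the JIT compiles (and inlines into) run()
        }

        // Nothing above proves that another Greeter can never appear.
        // A class named only at runtime may implement Greeter, so the JVM
        // must record its inlining assumption and undo it if it is broken.
        if (args.length > 0) {
            Greeter other = (Greeter) Class.forName(args[0])
                    .getDeclaredConstructor().newInstance();
            System.out.println(run(other));
        }
    }
}
```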

So which is better? I think the jury is still out, and both have strong supporters. However, CLI implementations have one big advantage over JVMs: because they rely on classic static analysis, CLI modules can be ahead-of-time compiled to native binaries. This brings many advantages, such as improved startup time and better memory sharing between processes, the latter being a key requirement if multiple applications are being run on a single workstation.

As such, the forthcoming language-level changes for Java modularity, along with the corresponding changes to the runtime, provide a unique opportunity. They could be designed as a static module system like the one in the CLI. To do this they would need to change the classloading rules for classes in modules, but it would allow JVM implementations to ahead-of-time compile Java modules to native code, bringing all the advantages I mentioned above.

I really hope the expert groups design with this flexibility in mind, and I see that I am not the only one who has seen the potential.

Update: I posted a follow-up pointing out how subtle language differences have led to the two different approaches to the runtime.

11 thoughts on “Worlds Apart – JVMs and CLIs”

  1. Interesting perspective, and thanks for writing it up publicly, Rob.

    Could you say a bit more about how the rules for loading classes in modules would need to change?

  2. Rob,

    Interesting analysis, thanks.

    One justification I heard for Java’s very late compilation and optimization of bytecode was described by James Gosling in a talk he gave (which I blogged about here: http://neilbartlett.name/blog/2007/03/15/an-evening-with-james-gosling). He said that there is substantial variation between individual processors even within the same line, i.e. the Pentium IV in my PC and the Pentium IV in your PC, although they share functional compatibility, have very different optimization characteristics. And of course both of them are very different from an AMD chip. Because the JVM compiles as late as possible, it can optimize for the “exact piece of silicon it’s running on” (his words), whereas if you distribute native binaries then all you can do is optimize for the generic x86 architecture.

    Now, I don’t know enough of the low-level details to know whether that’s a valid justification or a load of hot air. I just wanted to point out that there is at least a theoretical advantage in very late compilation, even if it doesn’t translate to a practical advantage.

    Neil

  3. Neil: Gosling is right about the different characteristics of different processors. .NET ahead-of-time compilation actually happens when the bytecode is put onto a particular machine (typically when software is installed), so the compilation can take advantage of the characteristics of the PC’s particular processor. Significant changes to the system cause the cache of native images to be invalidated, and then .NET falls back to runtime JITting.

  4. Hmmm… if Gosling were correct about the importance of hardware-specific optimization, then wouldn’t we see source-code-only distros like Gentoo getting significant traction? Wouldn’t you see Matlab recompiling itself on install?

    It’s a very convenient explanation, that there’s lots of optimization you can do, last-minute. But it requires proof. And microbenchmarks aren’t proof.

  5. Glyn,

    So, for a module to be able to be AOT’d, classic analysis techniques need to be applicable to Java code, as mentioned above. There are a few “features” in Java that make this basically impossible to do: the features that allow code to alter the class hierarchy dynamically at runtime. Custom classloaders and Java agents are two such examples (there may be others).

    The classloading rules would therefore need to be changed so that there is only ever one classloader per module. IKVM (Java for the CLI) has an approach close to what I think is needed.
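
     For instance, something as small as the following rough sketch (the names are illustrative, not from any real library) is enough to defeat ahead-of-time analysis, because the bytes it turns into a class can come from anywhere at runtime:

     ```java
     // Rough sketch: a custom classloader can define a brand-new class from
     // bytes it obtains at runtime, so the class hierarchy is not fixed
     // until the program actually runs.
     class RuntimeClassLoader extends ClassLoader {

         // The bytecode could be generated, downloaded or decrypted at
         // runtime; nothing available to an AOT compiler describes it.
         Class<?> defineFromBytes(String name, byte[] bytecode) {
             return defineClass(name, bytecode, 0, bytecode.length);
         }
     }
     ```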

  6. I believe the JSR 294 proposal (superpackages) specifies that a superpackage (presumably to be linked to a JSR 277 JAM) must be loaded by a single classloader. Or at least that’s my understanding of the strawman proposal. So maybe that will become an option in Java 7.

  7. Rob, I believe that when you say “JVMs, on the other hand, use runtime knowledge to decide what to optimize”, that is only true for some JVMs, and is not required by the JVM spec. In fact, early on the approach was JIT-only, but the overhead of JIT’ing all code (even code which ran infrequently), plus other issues (all that compiled code needs to be cached, and the cache managed), meant that other approaches were tried. Mixed-mode, runtime-optimizing compilers, like HotSpot, are one alternative. But there are others: not only are optimized JITs still being developed, but also AOT compilers (including gcj and Excelsior JET); unfortunately in many cases these are buried under IP protection, so we don’t know exactly what they are doing. It is true that Sun’s HotSpot dynamic optimization gets the most press because it has turned out to deliver great performance, although it took a long time. But as with a market economy, where prices tend to stabilize, in the Java world I think the performance of these various approaches is within spitting distance of each other. I certainly don’t think that the Java/CLR comparison can be made so easily. Best regards, Patrick
