
Worlds Apart – JVMs and CLIs

Glyn Normington: It’s interesting to speculate what would have happened if a static module system had been put into Java much earlier. My guess is that it wouldn’t have bothered to address versioning or dynamicity requirements which only become crucial in the context of relatively large systems with continuous operation.

Indeed it is interesting to speculate. Mono and .NET are both implementations of the Common Language Infrastructure. They have had a static module system from the beginning, and they power many applications, from best-in-show MP3 players to relatively large systems with continuous operation such as MySpace and, shortly, Second Life.

CLI implementations and JVM implementations couldn’t be more different. At first glance they appear very similar: both have an intermediate language and both use a JIT. The similarities, however, seem to end there.

CLI implementations have been designed to apply classic compiler optimizations using code analysis techniques, just like those found in a typical C++ compiler. For example, their approach to method inlining is very simple. They also JIT all code once, on first use. They can do this because most of the CLI languages have been designed for compilation using classic analysis techniques.

JVMs, on the other hand, use runtime knowledge to decide what to optimize. Rather than using the classic techniques, they guess a lot (e.g. whether a method can be inlined), and they have to keep these guesses around in case they load a class that invalidates them. They also perform different levels of optimization, saving the most aggressive for the code that is used the most (i.e. the hotspots). They take these approaches because the classloading rules in Java make the application of classic code analysis techniques virtually impossible.
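
To make that guessing concrete, here is a minimal Java sketch (the class names are hypothetical, and the exact behaviour depends on the JVM). A HotSpot-style JVM, seeing only one implementation of area() loaded, may inline the call in the hot loop; loading the subclass later invalidates that guess and forces the compiled code to be thrown away and recompiled:

    // Sketch of a speculative optimization being invalidated by class loading.
    // Class names are illustrative; the deoptimization itself happens inside the JVM.
    public class DeoptDemo {

        static class Base {
            double area() { return 1.0; }
        }

        // Only ever referenced by name below, so the JVM does not load it up front.
        static class Wide extends Base {
            @Override double area() { return 2.0; }
        }

        static double sum(Base b) {
            double total = 0;
            for (int i = 0; i < 1000000; i++) {
                total += b.area();   // looks monomorphic at first: the JIT may inline Base.area()
            }
            return total;
        }

        public static void main(String[] args) throws Exception {
            System.out.println(sum(new Base()));   // warm up; any speculative inlining happens here

            // Loading and using a subclass invalidates the "only one implementation" guess,
            // so the JVM has to keep its assumptions around in order to back them out.
            Base other = (Base) Class.forName("DeoptDemo$Wide")
                                     .getDeclaredConstructor().newInstance();
            System.out.println(sum(other));
        }
    }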

So which is better? I think the jury is still out, and both camps have strong supporters. However, CLI implementations have one big advantage over JVMs: because they rely on classic static analysis, CLI modules can be ahead-of-time compiled to native binaries. This brings many advantages, such as improved startup and better memory sharing between processes, the latter being a key requirement if multiple applications are being run on a single workstation.

As such, the forthcoming language-level changes for Java modularity, along with the corresponding changes to the runtime, provide a unique opportunity. They could be designed as a static module system like the one in the CLI. To do this they would need to change the classloading rules for classes in modules, but that would allow JVM implementations to ahead-of-time compile Java modules to native code, bringing all the advantages mentioned above.
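
To see why the current rules are the sticking point, here is a minimal sketch (the class and file names are hypothetical) of what today’s classloading model permits: a custom classloader that defines a class from bytes which only exist at runtime, and which no ahead-of-time compiler could therefore have analysed:

    import java.io.DataInputStream;
    import java.io.File;
    import java.io.FileInputStream;

    // Sketch: a classloader that defines a class from bytes chosen at runtime.
    // The bytes could equally come from a network socket or a bytecode generator,
    // so an ahead-of-time compiler has nothing fixed to analyse.
    public class RuntimeLoader extends ClassLoader {

        public Class<?> defineFrom(File classFile) throws Exception {
            byte[] bytecode = new byte[(int) classFile.length()];
            DataInputStream in = new DataInputStream(new FileInputStream(classFile));
            in.readFully(bytecode);
            in.close();
            // The class springs into existence here, under this loader only.
            return defineClass(null, bytecode, 0, bytecode.length);
        }

        public static void main(String[] args) throws Exception {
            // Hypothetical usage: define an arbitrary .class file named on the command line.
            Class<?> dynamic = new RuntimeLoader().defineFrom(new File(args[0]));
            System.out.println("Defined at runtime: " + dynamic.getName());
        }
    }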

I really hope the expert groups design with this flexibility in mind, and I see that I am not the only one who has seen the potential.

Update: I posted a follow-up pointing out how subtle language differences have led to the two different approaches to the runtime.

{ 7 } Comments

  1. Glyn Normington | March 30, 2007 at 7:22 am | Permalink

    Interesting perspective and thanks for writing it up publicly, Rob.

    Could you say a bit more about how the rules for loading classes in modules would need to change?

  2. Neil Bartlett | March 30, 2007 at 11:01 am | Permalink

    Rob,

    Interesting analysis, thanks.

    One justification I heard for Java’s very late compilation and optimization of bytecode was described by James Gosling in a talk he gave (which I blogged about here: http://neilbartlett.name/blog/2007/03/15/an-evening-with-james-gosling). He said that there is substantial variation between individual processors even within the same line, i.e. the Pentium IV I have in my PC and the Pentium IV you have in your PC, although they share functional compatibility, have very different optimization characteristics. And of course both of them are very different from an AMD chip. Because the JVM compiles as late as possible, it can optimize for the “exact piece of silicon it’s running on” (his words). Whereas if you distribute native binaries then all you can do is optimize for the generic x86 architecture.

    Now, I don’t know enough of the low-level details to know whether that’s a valid justification or a load of hot air. I just wanted to point out that there is at least a theoretical advantage in very late compilation, even if it doesn’t translate to a practical advantage.

    Neil

  3. Dominic Cooney | March 30, 2007 at 1:58 pm | Permalink

    Neil: Gosling is right about the different characteristics of different processors. .NET ahead-of-time compilation actually happens when the bytecode is put onto a particular machine (typically when software is installed), so the compilation can take advantage of the characteristics of the PC’s particular processor. Significant changes to the system cause the cache of native images to be invalidated, and then .NET falls back to runtime JITting.

  4. Gastromancer | March 30, 2007 at 8:32 pm | Permalink

    Hmmm…. if Gosling were correct about the importance of hardware optimization, then wouldn’t we see source-code-only distros like Gentoo getting significant traction? Wouldn’t you see Matlab recompiling itself on install?

    It’s a very convenient explanation, that there’s lots of optimization you can do, last-minute. But it requires proof. And microbenchmarks aren’t proof.

  5. Rob Yates | March 31, 2007 at 1:59 am | Permalink

    Glyn,

    So, for a module to be able to be AOT-compiled, classic analysis techniques need, as mentioned above, to be applicable to Java code. There are a few “features” in Java that make this basically impossible to do, namely those features that allow code to alter the class hierarchy dynamically at runtime. Custom classloaders and javaagents are two such examples (there may be others).

    The classloading rules would therefore need to be changed so that there is only ever one classloader per module. IKVM (Java for the CLI) has an approach close to what I think is needed.
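
    As a rough sketch of the second of those features (the class name is hypothetical), a javaagent registered with -javaagent: is handed every class’s bytes as it is loaded and may rewrite them, so not even the bytecode sitting on disk can be trusted by an ahead-of-time compiler:

        import java.lang.instrument.ClassFileTransformer;
        import java.lang.instrument.Instrumentation;
        import java.security.ProtectionDomain;

        // Sketch of a javaagent; the agent jar's manifest would name this class as Premain-Class.
        public class RewritingAgent {

            public static void premain(String agentArgs, Instrumentation inst) {
                inst.addTransformer(new ClassFileTransformer() {
                    public byte[] transform(ClassLoader loader, String className,
                                            Class<?> classBeingRedefined,
                                            ProtectionDomain protectionDomain,
                                            byte[] classfileBuffer) {
                        // A real agent could return modified bytecode here;
                        // returning null leaves the class unchanged.
                        return null;
                    }
                });
            }
        }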

  6. Alex Miller | April 19, 2007 at 3:05 am | Permalink

    I believe the JSR 294 proposal (superpackages) specifies that a superpackage (which presumably will be linked to a JSR 277 JAM) must be loaded entirely by the same classloader. Or at least that’s my understanding from the strawman proposal. So, maybe that will become an option in Java 7.

  7. Patrick Wright | April 20, 2007 at 9:31 pm | Permalink

    Rob, I believe that what you say, “JVMs, on the other hand, use runtime knowledge to decide what to optimize”, is only true for some JVMs and is not required by the JVM spec. In fact, early on the approach was JIT-only, but the overhead of JIT’ing all code (even code which ran infrequently), plus other issues (all that compiled code needs to be cached, and the cache managed), meant that other approaches were tried.

    Mixed-mode, runtime-optimizing compilers, like HotSpot, are one alternative. But there are others: there are not only optimized JITs still being developed, but also AOT compilers (including gcj and Excelsior JET). Unfortunately, in many cases these are buried under IP protection, so we don’t know exactly what they are doing.

    It is true that Sun’s HotSpot dynamic optimization gets the most press, because it has turned out to lead to great performance, although it took a long time. But as with a market economy, where prices tend to stabilize, in the Java world I think the performance of these various approaches is within spitting distance of each other. I certainly don’t think that the Java/CLR comparison can be made so easily.

    Best regards, Patrick

{ 4 } Trackbacks

  1. [...] Rob Yates had an interesting post this week on the difference between the JVM and CLI approach to compilation. Specifically, how does the CLI differ from having a static module system from the start as opposed to the JVM’s more open dynamic classloader system. Seems like the changes in JSR 294/277 might create a world where you could get the benefits of both. [...]

  2. [...] Alex Buckley C# itself is no more or less designed for these analyses than Java. (in response to compilation approaches outlined in Worlds Apart – JVMs and CLIs). [...]

  3. [...] So why can’t Java programs share memory given that shared libraries are a technology that’s been in widespread use for at least 25 years? It’s really for two reasons.  The first is that up until this point there has really been no way to define a shared library in Java i.e. here’s a bunch of code and here is its external interface. JSR294 (modularity for Java) addresses this point.  The second is that Java has no efficient AOT compilation and so unlike most other statically typed languages it can’t utilize the traditional OS functions to load and share its libraries. I have belabored this point in the other "Worlds Apart" posts here and here, pointing out that this is a key differentiation for CLI implementations such as .net or mono.  Miguel (mono lead) also has an excellent post that provides a good overview of mono’s AOT support and the rationale behind it. JSR294 could be designed in such a way to provide for efficient AOT, and I belabor this point here. [...]

  4. robubu : Unladen Swallow, a Hummer Hybrid? | October 20, 2009 at 7:49 pm | Permalink

    [...] the problems inherent in making any dynamic language go faster, namely type inference. I’ve blogged about this before for Java and the unladen swallow presentation does a good job of explaining all the pitfalls around [...]
