Worlds Apart – C# and Java

Alex Buckley: "C# itself is no more or less designed for these analyses than Java." (in response to the compilation approaches outlined in Worlds Apart – JVMs and CLIs).

I beg to differ. JVMs had to introduce hotspot compilers whereas CLIs have not, because of a subtle design difference between the languages. Moreover, other design differences can lead to much more efficient AOT-compiled C# code. JSR294 (the modularity system for Java), however, provides an opportunity to make AOT compilation just as efficient for Java, and I hope we take it.

The subtle but vital design difference that led to JVMs requiring hotspot compilers and CLIs not is that non-virtual is the default in most CLI languages, which is the complete opposite of Java. Most methods will never be overridden (by, for example, a method in a subclass); in Java, however, the programmer must explicitly state this via the final keyword. In C#, on the other hand, the programmer has to explicitly declare the opposite, i.e. that a method can be overridden, by using the virtual keyword. If no declaration is placed on a method in C# then it is assumed to be non-virtual, i.e. roughly the equivalent of final in Java.
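To make the difference concrete, here is a minimal Java sketch. The class and method names (Account, PremiumAccount, fee, minimumBalance) are made up for illustration and do not come from any real codebase.

```java
// Hypothetical example: all names here are invented for illustration.
class VirtualByDefaultDemo {
    // The call below is dispatched at runtime on the dynamic type of 'a',
    // because fee() is virtual by default in Java.
    static int feeOf(Account a) {
        return a.fee();
    }

    public static void main(String[] args) {
        System.out.println(feeOf(new Account()));
        System.out.println(feeOf(new PremiumAccount()));
        System.out.println(new PremiumAccount().minimumBalance());
    }
}

class Account {
    // No modifier: this instance method may be overridden by any subclass.
    int fee() {
        return 10;
    }

    // 'final' forbids overriding, so a call through an Account reference
    // can only ever reach this body.
    final int minimumBalance() {
        return 100;
    }
}

class PremiumAccount extends Account {
    @Override
    int fee() {
        return 0; // allowed, because fee() is virtual by default
    }

    // int minimumBalance() { return 0; } // would not compile: the method is final
}
```

In C# the defaults are reversed: fee() would need the virtual keyword in the base class (and override in the subclass) before it could be overridden at all.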

So why is this important? Well, it turns out that one of the key properties that must be established before a method can be an inline candidate is whether it is overridden. As most programmers don’t say anything about whether a method is virtual or non-virtual (we’re a lazy bunch), the default applies, and as a result many more methods in C# are implicitly non-virtual, i.e. they can never be overridden. The compiler can use this information when determining whether to inline. In Java much more complicated calculations must be performed, and it is the costly nature of these calculations that created the need for hotspot virtual machines.

Early JVMs (i.e. pre-HotSpot) used to JIT compile every method on first use, just like CLIs do today. However, the early JIT compilers were conservative in making inlining assumptions because they didn’t know what classes might be loaded later, and as such their performance was pretty bad.

So why were they conservative? Think about what they have to do. They can’t rely on non-virtual methods being declared as such, because most of them aren’t (programmers are lazy and virtual is the default in Java). Instead they have to guess. To make that guess they have to understand the class hierarchy at the time of JITing and check that none of the subclasses override the method. That’s expensive enough for the inline operation, but it gets worse. They have to remember this guess in case a class loaded later overrides the method, forcing them to de-optimize the code by undoing the inlining. All of this cost is significant, and so hotspot compilers were introduced that only do this on code that gets used a lot.
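The bookkeeping described above can be sketched as a toy model. This is not how any real JVM is implemented; ToyHierarchy and its methods are hypothetical names, classes are represented by plain strings, and "inlining" is just a recorded assumption that a later class load may invalidate.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: every name here is invented for illustration.
class ToyHierarchyDemo {
    public static void main(String[] args) {
        ToyHierarchy h = new ToyHierarchy();
        h.load("Shape", null, "area");
        System.out.println(h.tryInline("Renderer.draw", "Shape", "area"));
        // A class loaded later overrides area(), so the earlier guess is undone.
        System.out.println(h.load("Square", "Shape", "area"));
        System.out.println(h.tryInline("Renderer.draw", "Shape", "area"));
    }
}

class ToyHierarchy {
    // class name -> names of the methods it declares
    private final Map<String, Set<String>> declared = new HashMap<>();
    // class name -> direct superclass name (null for a root class)
    private final Map<String, String> parent = new HashMap<>();
    // "Owner.method" -> callers compiled assuming the method is not overridden
    private final Map<String, Set<String>> dependents = new HashMap<>();

    /** Records a newly loaded class and returns the callers whose inlining must be undone. */
    List<String> load(String name, String superName, String... methods) {
        declared.put(name, new HashSet<>(List.of(methods)));
        parent.put(name, superName);
        List<String> invalidated = new ArrayList<>();
        // Any assumption about an ancestor's method that this class overrides is now wrong.
        for (String anc = superName; anc != null; anc = parent.get(anc)) {
            for (String m : methods) {
                Set<String> callers = dependents.remove(anc + "." + m);
                if (callers != null) {
                    invalidated.addAll(callers);
                }
            }
        }
        return invalidated;
    }

    /** Inlines only if no currently loaded subclass overrides the method, and remembers the guess. */
    boolean tryInline(String caller, String owner, String method) {
        for (String c : declared.keySet()) {
            if (isSubclassOf(c, owner) && declared.get(c).contains(method)) {
                return false; // already overridden: a virtual call is required
            }
        }
        dependents.computeIfAbsent(owner + "." + method, k -> new HashSet<>()).add(caller);
        return true;
    }

    private boolean isSubclassOf(String c, String owner) {
        for (String s = parent.get(c); s != null; s = parent.get(s)) {
            if (s.equals(owner)) {
                return true;
            }
        }
        return false;
    }
}
```

Even in this toy form, every inlining decision costs a scan of the loaded classes plus a dependency record that must be kept for as long as the compiled code lives, which hints at why it pays to do this only for hot methods.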

Contrast that with the CLI implementations (i.e. Mono and .NET). There’s a description of the rules that Microsoft uses and they are very simple. One rule is even "Virtual functions are not inlined". Whereas JVMs go to great lengths to determine whether a virtual function can be inlined, .NET doesn’t even bother looking, and this is all due to differences in the language specifications of Java and C#. As an aside, Visual Basic also has non-virtual as the default, and the Overridable keyword is used to make a method virtual.

Now the JSR294 expert group (modularity system for Java) can’t make non-virtual the default in Java, but they can probably do enough that it won’t matter. If the goal is to get to efficient AOT compiled code, which is the key differentiation between CLIs and JVMs at the moment, then that goal is attainable via JSR294.

As Patrick pointed out, there are various attempts to add AOT compilation to Java. GCJ and Excelsior are the two front runners. However, they have to answer this same question, i.e. when can a method be inlined? Given that they do compilation ahead of time, the performance of the actual compilation step is not as critical as it is for JIT compilers, and so they could use classic techniques such as rapid type analysis (RTA) or class hierarchy analysis (CHA) to determine whether a method is overridden. Unfortunately neither of these classic techniques can be applied ahead of time to Java, although they can be applied to C#. Why is this?

Java’s custom classloaders mean that the class hierarchy can only be determined at runtime. This is due to a custom classloader’s ability to load classes from anywhere on the classpath at any time. It’s simply impossible to understand ahead of time what the class hierarchy looks like, and so RTA or CHA can’t be used for AOT compilation of Java programs. Instead, to maintain Java compatibility, GCJ, for example, resorts to an ABI that doesn’t even allow inlining. Its performance suffers accordingly.
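A small sketch of why the hierarchy is only known at runtime, using made-up classes (Greeter, LoudGreeter) and plain Class.forName rather than a custom classloader, to keep the example self-contained. The point is only that the class name is a runtime string, so a compiler looking at greetWith cannot tell which greet() bodies may be reached.

```java
// Hypothetical example: all names here are invented for illustration.
class LateLoadingDemo {
    // The concrete class is named by a string that is only known at runtime.
    static String greetWith(String className) {
        try {
            Greeter g = (Greeter) Class.forName(className)
                    .getDeclaredConstructor()
                    .newInstance();
            // Which greet() runs depends on whatever class was just loaded.
            return g.greet();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not load " + className, e);
        }
    }

    public static void main(String[] args) {
        // Assumes the classes live in the default package when no name is given.
        String name = args.length > 0 ? args[0] : "LoudGreeter";
        System.out.println(greetWith(name));
    }
}

class Greeter {
    String greet() {
        return "hello";
    }
}

class LoudGreeter extends Greeter {
    @Override
    String greet() {
        return "HELLO";
    }
}
```

Here LoudGreeter happens to sit in the same file, but it could just as well arrive from a jar, a network location, or bytes generated on the fly, none of which an ahead-of-time analysis of the call site would see.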

This is the problem that JSR294 can choose to solve: it can define a modularity system that allows for efficient AOT of Java code. This problem has already been well researched (and solved). MJ, a paper by, amongst others, the inventor of RTA, has already defined such a modularity system and successfully applied it to Tomcat.

Again I urge the JSR294 expert group to consider taking on the requirement of producing a modularity system that allows for efficient AOT compilation of Java code.  JVMs are at a significant disadvantage compared to their CLI counterparts at the moment as efficient AOT of CLI code is already possible. Why not bring this possibility to the Java platform in JSR294?

11 thoughts on “Worlds Apart – C# and Java”

  1. Hi Rob,

    thanks for an interesting post.

    I know a little bit of AOT compilation for Java because I’m the CTO for Excelsior, the maker of Excelsior JET. ;)

    Let me add a few notes to what you said.

    INLINE

    Both RTA & CHA are possible with AOT compilation provided guarded inlining is used. A simple check is followed by the inlined body, and the virtual call is put on another branch and considered “cold” code. Note that this check is required anyway (even if a final method is inlined) because the receiver (“this”) must be checked for null before the body is executed.

    Though a guarded inline is not as effective as a direct one (the merge point in the control flow kills the info that would come from the inlined body), it is still very effective because, for example, accessor methods (set/get) work virtually as fast as they would if inlined directly.
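    As a rough illustration of the guarded inline described above, the transformation can be written out by hand in Java. This is only a source-level sketch with made-up classes (Point, ShiftedPoint); a real compiler performs the equivalent rewrite on its intermediate code, not on source.

```java
// Hypothetical example: all names here are invented for illustration.
class GuardedInlineDemo {
    static int sumX(Point[] points) {
        int sum = 0;
        for (Point p : points) {
            // Guard: a cheap exact-type check. getClass() throws
            // NullPointerException for a null receiver, so it doubles
            // as the null check that is needed anyway.
            if (p.getClass() == Point.class) {
                // Hot path: the body of Point.getX() inlined by hand.
                sum += p.x;
            } else {
                // Cold path: fall back to the ordinary virtual call.
                sum += p.getX();
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        Point[] points = { new Point(1), new ShiftedPoint(1) };
        System.out.println(sumX(points));
    }
}

class Point {
    final int x;

    Point(int x) {
        this.x = x;
    }

    int getX() {
        return x;
    }
}

class ShiftedPoint extends Point {
    ShiftedPoint(int x) {
        super(x);
    }

    @Override
    int getX() {
        return x + 1; // an override that a later-loaded class might introduce
    }
}
```

    The result stays correct even when a subclass overriding getX() shows up later, because such receivers simply fail the guard and take the virtual call.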

    Note also that many JVMs with dynamic compilers also employ guarded inline to avoid expensive recompilation immediately after speculative assumptions on the virtual call target fail. Later, they can re-optimize the method according to the “new reality”. ;)

    Custom classloaders are irrelevant to inlining because even a standard classloader (system or application one) can be used to load a derived class with the method of interest overridden.

    >>Java’s custom classloaders mean that the class hierarchy
    >>can only be determined at runtime.

    This can be done with a standard classloader too.

    CUSTOM CLASSLOADERS

    >>This is due to a custom classloaders ability to load
    >>classes from anywhere on the classpath at any time.

    And not only on the classpath – in general, from anywhere up to creating bytecodes (literally, array of bytes) at run-time.

    In theory, custom classloaders hinder AOT compilation, but it’s still possible. For example, Excelsior JET Runtime comes with a Caching JIT that caches the results of dynamic compilation. We also provide the JIT Cache Optimizer that, in essence, applies the main AOT compiler to the JIT cache to use the power and resource freedom of a static compiler. After that, on the next app launch the optimized cache works much better. ;)

    This way, Java applications using custom classloaders can be AOT-compiled.

    The downside is that checking consistency of the JIT cache (a must for Java compatibility) imposes some start-up overheads and requires the presence of original class files to decide if the cache is up to date or should be (partially) rejected.

    There is yet another approach to resolving this issue: research work done at IBM back in 2000-2001. Just search for “QuickSilver and Midkiff”.

    ———-

    JSR-294 could make things simpler, provided it is not substantially revised and gets accepted within the JCP process, which is not a simple procedure. ;)

    Thanks,
    –Vitaly

  2. First off, thanks, Vitaly, for such an in-depth commentary on the issue :-P

    Actually, I think this is a great argument for why C# is bass ackwards.

    In theory (and Vitaly’s comment and the tail end of your blog post explore how to turn this into “in practice”) you can reduce the overhead of ‘assumed virtual’ to insignificance.

    This isn’t the only time I caught C# making design decisions that are loosely equivalent to ‘optimizing early’. Right now it makes a lot of sense because the techniques to automate a certain step are currently too hazy or just too difficult to get right, but eventually, it might happen. In that case, don’t even allow programmers to specify this and lay down the rules, locking the entire platform into that rule for perpetuity, backwards compatibility being the holy grail that it is in both C# and especially in Java land.

    What I’d like to see more of, instead, is language keywords which specify INTENT, regardless of the compilation benefits thereof. One could for example make a case that every method declaration should specify explicitly whether it is simply supplying an alternative implementation (e.g. overriding a definition of an implemented interface or extended class) or creating a new definition, and whether or not this implementation is designed to be extended. Not a very good case, but there’s a case there. Right now neither Java nor C# really does this; they simply sit on opposite sides of the same fence: there’s a keyword to specify one behaviour, omitting it automatically means the other behaviour, and there’s no keyword that can be supplied to spell out the default. Java and C# both follow this; they’ve just got the default switched around. Neither is more expressive in communicating intent than the other on the surface.

    To add another perspective (or maybe this is what you were driving at): let’s say you have a module (JSR294 based or OSGi) where, for the entire codebase, no custom classloaders are even defined, and no classes are loaded dynamically through reflection (this actually covers the majority of Java software out there, for good reason). It should be simple to annotate the code with this analysis factoid. The hotspot compiler will have to do a lot less work to determine the safety of inlining anything based on that knowledge.

  3. Interesting discussion. In the Java world, the strong tendency these days is not to use “final” as a compiler optimization hint, but rather as a programmatic control–to control negative or unintended side-effects when classes aren’t designed for subclassing, or variables shouldn’t be re-initialized. In most cases, people design for extensibility and take advantage of being able to modify, extend, swap, classes and methods at runtime; it’s a big part of the platform. Maybe I haven’t seen the performance numbers you’re looking at, but nothing that I have seen makes any of the JVM implementations out to be a slouch. Patrick

  4. Interesting stuff. I’d only put a question, just to learn better something about .NET (which I basically don’t know): it sounds like .NET can’t have custom classloaders. Right?

  5. Fabrizio,

    .NET does not have custom classloaders; it has dynamic assemblies instead.

    Patrick,

    I am making no claims as to the performance advantage of one over the other. I think the performance is actually pretty similar, I just think it interesting that CLI (with the benefit of hindsight) chose a totally different approach to JVMs.

    Reinier,

    you hit the nail on the head. What I would like the JSR294 expert group to consider is ensuring that the classloading rules for modules allow a module to be declared static (i.e. custom classloaders are not used and compilers can determine the class hierarchy within a given module so that, for example, RTA can be used within a module). Of course I would also prefer static modules to be the default. As you point out, most Java code does not require the dynamism.

    Vitaly makes a good point, but I wonder how efficient guarded inlines are when trying to compile something like Tomcat or even WebSphere with all its dependencies into DLLs / Shared Objects. One of the criticisms I have heard leveled at AOT compilation for Java is that the size of the executable gets significantly larger than the corresponding jar. I imagine that having the redundant code paths with guarded inline compounds this problem.

    Again my main point here is that JSR294 can bring efficient simple AOT to the Java platform. I just hope we take the opportunity.

  6. Not a performance consideration, just a style one – why would you have a non-virtual method at all, rather than making it static? Is that just because it’s more convenient to write b.doThis(a) than doThis(b,a)?

  7. @Reinier
    @What I’d like to see more of, instead, is language keywords which specify INTENT, regardless of the compilation benefits thereof.

    Syntax for rule-based systems on top of programming languages? Rule number one: Avoid accidental complexity at all costs. This is why the Macker project is not as practical as I wish. In addition, this concept of “intent” conflates how-to and what-to.

    @Rob Yates

    Your view-point is interesting. I always thought Java had the right idea with the “final” keyword. I like when people challenge what I consider right. To me, “final” is used very effectively in Java to communicate that once something is defined, it cannot be changed.

    @Ricky Clarkson

    A static method does not depend on instance variables, whereas methods with other storage classes do. A static method depends only on the arguments passed into it. Since it does not have these dependencies, a static method remains available until the end of program execution.

    From my personal communication with the Sun HotSpot team, I know that the currently implemented inlining mechanisms are considered well done and do not require any “simplification”.

    As for AOT compilers, I can judge only about Excelsior JET and I would say that’s not a big problem. The existing devirtualization techniques work well and the optimizer feels comfortable. Not sure about GCJ ‘cause I do not know the details.

    JSR294 will not simplify inlining virtual calls. Despite the stricter access rules, an application can dynamically load a new class that __belongs__ to a superpackage (the notion introduced by JSR294). It effectively means that a superpackage cannot be considered a “closed sub-world” in which CHA or RTA will get more freedom to work.

    ——–

    Custom classloaders really make AOT compilation harder (though still possible). The necessity of shipping the original class files along with their optimized counterparts as well as multiple consistency checks often makes AOT compilation impractical for apps that heavily use custom classloading such as Eclipse.

    BUT, there is a hope and it is called… JSR277 ;)

    This JSR introduces “deployment modules” as an addition to “development modules” proposed by JSR294. Actually, in most cases custom classloaders are used for two simple purposes:

    1. Unique name spaces for loaded classes
    2. Version control

    For that, using _general_ custom classloaders is overkill and complicates things to a great extent. That’s why JSR277 proposes deployment modules that use

    a static import policy declared in the module’s manifest
    module classloader (one instance per module)

    For details, see “Section 8. Execution” of the available public draft of JSR277. In my opinion, this is definitely a right direction and I would have to join EG ;)

    ————
    @Rob (size of AOT compiled code)

    Java bytecode was originally designed for compactness. It is at a much higher level than a typical CPU instruction set and takes less space than the equivalent machine code, say, for Intel x86. But Java class files contain not only code. The amount of symbolic information in class files has grown dramatically over the years due to the development of numerous APIs and their package structure. So now there is not a dramatic difference in disk footprint between the original and AOT-compiled forms. As for the total download size of AOT-compiled apps, it can be even smaller than the size of the JRE.

    http://www.excelsior-usa.com/blog/?p=52

    Thanks for your attention – I talk a lot. ;)

    –Vitaly

  9. Re: “C# itself is no more or less designed for these analyses than Java.”

    I was mainly thinking of the non-OO control flow analyses when I wrote that. It’s true that C#’s non-virtual default makes CHA redundant for some programs, but 1) CHA is still useful in a CLI JIT due to the possibility of virtual methods, and 2) the default is in no small part for familiarity with C++ rather than out of a desire to avoid CHA per se.
