My Newest Insight into the Generics Controversy

Okay, I’ve had some quality time with Java 5 at my newest gig, and I’m starting to understand the great controversy behind Java generics. I just had a particularly painful insight that has officially flipped me from stalwart supporter to reluctant supporter, and I’d like to share it with the world so nobody else has to hunt this bug down themselves.

The particular insight runs as follows. Assume that you have a class FooBar that takes parameterized type T. And assume you have a method that looks something like this:

    public T create() {
        final Object created = makeIt();
        try {
            final T out = (T) created;
            return out;
        } catch (ClassCastException cce) {
            return null;
        }
    }

Now, if I have a FooBar<Integer> object kicking around, and call create() in such a state that makeIt() returns the String “42”, what would you expect to come out of create()?

Well, my thought was as follows.
Reduction 1:

    public T create() {
        final Object created = "42";
        try {
            final T out = (T) created;
            return out;
        } catch (ClassCastException cce) {
            return null;
        }
    }

Reduction 2:

    public Integer create() {
        final Object created = "42";
        try {
            final Integer out = (Integer) created;
            return out;
        } catch (ClassCastException cce) {
            return null;
        }
    }

Reduction 3:

    public Integer create() {
        try {
            final Integer out = (Integer)"42";
            return out;
        } catch (ClassCastException cce) {
            return null;
        }
    }

Reduction 4:

    public Integer create() {
        return null;
    }

So, I’m thinking the answer should be null.

What I forgot is that Java implemented generics through type erasure (cite). So the reductions actually look like this:

Reduction 1:

    public T create() {
        final Object created = "42";
        try {
            final T out = (T) created;
            return out;
        } catch (ClassCastException cce) {
            return null;
        }
    }

Reduction 2:

    public Object create() {
        final Object created = "42";
        try {
            final Object out = created;
            return out;
        } catch (ClassCastException cce) {
            return null;
        }
    }

Reduction 3:

    public Object create() {
        return "42";
    }

And that method, the one the compiler diligently guaranteed would only be used in places where it returns Integer objects, just returned a String.

Seriously.

I’ve got the unit tests to prove it.
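For anyone who wants to reproduce it, here is a minimal sketch of the behavior. It is not the original code: makeIt() is hard-wired to return "42" so the example stands alone, and the ErasureDemo class is a name made up for the demonstration. The cast to T is erased, so the catch block never fires inside create(); the ClassCastException only shows up later, at a call site where the compiler inserts a cast to Integer.

```java
// Hypothetical reconstruction of the class described above.
// makeIt() is hard-wired to return a String so the demo is self-contained.
class FooBar<T> {
    Object makeIt() {
        return "42";
    }

    @SuppressWarnings("unchecked")
    public T create() {
        final Object created = makeIt();
        try {
            // After erasure T is Object, so no runtime check happens on this line.
            final T out = (T) created;
            return out;
        } catch (ClassCastException cce) {
            // Not reached for the String case: nothing above can throw it.
            return null;
        }
    }
}

class ErasureDemo {
    public static void main(String[] args) {
        FooBar<Integer> fooBar = new FooBar<Integer>();

        // Assigning to Object needs no cast at the call site, so the String escapes.
        Object escaped = fooBar.create();
        System.out.println(escaped.getClass().getName());

        // Assigning to Integer makes the compiler insert a cast here, at the
        // call site, which is where the failure finally surfaces.
        try {
            Integer value = fooBar.create();
            System.out.println("got " + value);
        } catch (ClassCastException cce) {
            System.out.println("ClassCastException at the call site");
        }
    }
}
```

The unchecked warning on the cast line is the only hint the compiler gives that this check will not happen.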

EDIT: Oh, yeah, and there’s no way to go from <T> to something like T.class.
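The usual workaround, as far as I can tell, is to make the caller hand over the Class object explicitly and do the check through it. The sketch below assumes a made-up CheckedFooBar class whose constructor takes the token plus the object that makeIt() would have produced; Class.cast() is the standard Java 5 method and does perform a real runtime check.

```java
// Hypothetical variant of FooBar that carries a Class<T> token supplied by the caller.
class CheckedFooBar<T> {
    private final Class<T> type;
    private final Object candidate; // stands in for whatever makeIt() would return

    CheckedFooBar(Class<T> type, Object candidate) {
        this.type = type;
        this.candidate = candidate;
    }

    public T create() {
        try {
            // Class.cast checks against the real class at runtime,
            // unlike the erased (T) cast.
            return type.cast(candidate);
        } catch (ClassCastException cce) {
            return null;
        }
    }
}

class CheckedFooBarDemo {
    public static void main(String[] args) {
        // The String is rejected inside create(), so null comes back.
        System.out.println(new CheckedFooBar<Integer>(Integer.class, "42").create());
        // A genuine Integer passes through untouched.
        System.out.println(new CheckedFooBar<Integer>(Integer.class, Integer.valueOf(42)).create());
    }
}
```

The cost is that every caller has to repeat the type, once as the type argument and once as Integer.class.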

  • http://neilbartlett.name/blog/ Neil Bartlett

    This is “Erasure 101”. Here are a few other gems:

    1) You can’t create an array of type T[].

    2) If you have an interface called, say, EventHandler, and you want to handle multiple types of events in one class, you can’t say “implements EventHandler, EventHandler”

    3) Suppose you have an interface EventHandler with the method handle(T event). You implement EventHandler so you write a method with the signature handle(String s). Now the problem is, your class doesn’t implement the interface! T is erased, so the interface method is actually handle(Object), and obviously handle(String) doesn’t override handle(Object) – it just overloads the handle method. The way this is fixed is the compiler generates a “bridge” method for you with the signature handle(Object) which just calls your handle(String) method. So you have an extra level of indirection, and also you’re not allowed to write your own method for handle(Object) because it would clash with the compiler-generated one.

    Phew! Still a supporter? ;-)

  • http://neilbartlett.name/blog/ Neil Bartlett

    Oh damn, all my generics syntax got wiped by the blog software. Here’s what I wanted to post, with less-than and greater-than replaced with [ and ]:

    This is “Erasure 101”. Here are a few other gems:

    1) You can’t create an array of type T[].

    2) If you have an interface called, say, EventHandler[T], and you want to handle multiple types of events in one class, you can’t say “implements EventHandler[String], EventHandler[Exception]”

    3) Suppose you have an interface EventHandler[T] with the method handle(T event). You implement EventHandler[String] so you write a method with the signature handle(String s). Now the problem is, your class doesn’t implement the interface! T is erased, so the interface method is actually handle(Object), and obviously handle(String) doesn’t override handle(Object). The way this is fixed is the compiler generates a “bridge” method for you with the signature handle(Object) which just calls your handle(String) method. So you have an extra level of indirection, and also you’re not allowed to write your own method for handle(Object) because it would clash with the compiler-generated one.

    Phew! Still a supporter? ;-)
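    The bridge method from point 3 can be seen through reflection. This is only a sketch: EventHandler is the interface described in the comment, while StringHandler and BridgeDemo are names invented for the example. Method.isBridge() is the standard reflection call that flags compiler-generated bridges.

```java
import java.lang.reflect.Method;

// The interface from the comment above.
interface EventHandler<T> {
    void handle(T event);
}

// Hypothetical implementation that only declares handle(String).
class StringHandler implements EventHandler<String> {
    public void handle(String s) {
        System.out.println("handled " + s);
    }
}

class BridgeDemo {
    // Counts the compiler-generated bridge methods declared directly on a class.
    static int countBridges(Class<?> c) {
        int bridges = 0;
        for (Method m : c.getDeclaredMethods()) {
            if (m.isBridge()) {
                bridges++;
            }
        }
        return bridges;
    }

    public static void main(String[] args) {
        // Lists handle(String) plus the synthetic handle(Object) bridge.
        for (Method m : StringHandler.class.getDeclaredMethods()) {
            System.out.println(m + " bridge=" + m.isBridge());
        }
        System.out.println("bridges: " + countBridges(StringHandler.class));
    }
}
```

    The source declares one method, but the class file carries two.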

  • http://enfranchisedmind.com/blog Robert Fischer

    Well, I’m a supporter in that I support people using generics if at all possible. The metadata value at development time is priceless, even if it causes the occasional annoying warning. It does mean that there are occasions for stuff like TypedCollection to add more type guarantees, and there are a lot of annoyances (like the ones you talked about), but it’s better than the alternative in Java.

    It does make me miss Ocaml’s type inference all the more, though.
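    As a rough illustration of what a runtime-checked wrapper buys you, the sketch below uses the JDK’s own Collections.checkedList rather than TypedCollection, so treat it as an analogue and not as that library’s API. CheckedListDemo and smuggle are names invented for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class CheckedListDemo {
    // Tries to sneak a String into a List<Integer> through a raw reference.
    // Returns true if the list accepted it, false if the list refused.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static boolean smuggle(List<Integer> target) {
        List raw = target;
        try {
            raw.add("42");
            return true;
        } catch (ClassCastException cce) {
            return false;
        }
    }

    public static void main(String[] args) {
        List<Integer> plain = new ArrayList<Integer>();
        // The wrapper keeps Integer.class around and checks every insertion.
        List<Integer> checked =
                Collections.checkedList(new ArrayList<Integer>(), Integer.class);

        System.out.println("plain list accepted the String: " + smuggle(plain));
        System.out.println("checked list accepted the String: " + smuggle(checked));
    }
}
```

    The plain list takes the String without complaint; the checked one fails at the point of insertion instead of at some later read.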

  • Brian

    Ocaml simply wouldn’t let you do that. I mean, the function makeIt has, in Ocaml terms, the type unit -> 'a. This is Ocaml typespeak for “this function doesn’t return”- i.e. it is either an infinite loop or it always throws an exception. Unless the makeIt function was passed in.

    This is one case where Ocaml outlaws (more or less) a whole “design pattern”. And probably for the best. I mean, if you know what type T is, at least to the point where generally it’ll be a subclass of the type returned from makeIt, then it’s not really general. In Ocaml, the type 'a -> 'b means “for all types a and b, the function mapping from type a to type b”. If type T really is generic, then there is no way for makeIt to make something of that type sensibly.

    Brian

  • http://enfranchisedmind.com/blog Robert

    The original version of this code came from trying to genericify the Factory method: the user would provide a type and an implementation, and that class cast exception was to detect when the two got out of whack. The makeIt() method was actually impl.create().

    In the case of Ocaml, this design pattern isn’t so much outlawed as just pointless: the possible return values from the function are known at compile time, so a wrapper that provides/enforces that information has zero value, since the compiler provides/enforces that information for you!

  • Brian

    There’s no way to enforce that impl.create() returns a type T?

  • http://enfranchisedmind.com/blog Robert

    Nope: it’s an interface which uses the standard Java type system undercut of returning Object. See the Factory API here.

  • bhurt-aw

    This, I think, is the reason after-market bolt-on type systems are a bad idea. The problem is that the code that is not correctly typed is, by definition, dangerous. Now, all programming languages have some amount of untyped (and thus “dangerous”) code- even Ocaml and Haskell call untyped and unsafe C routines. The question is how much of the code is unsafe? The more unsafe the code is, the more danger there is. Which is why both Haskell and Ocaml programmers dislike dropping to C to do anything that can be done in the main language.

    Java had been a popular language for over a decade before Generics were introduced. Most Java code was written before Generics, and as such is dangerous.

    Well, they’re calling my flight. More on this later.

  • http://enfranchisedmind.com/blog Robert

    I know Brian has heard this before, but I need to say it again, because it still makes me cranky.

    What drives me the most nuts is that Java should have known about these problems. C++’s templates were a known mess at the time when Java was on the drawing board — I suspect that’s why Generics didn’t make it into Java 1. But they also knew that there was a problem with non-templated C++ (namely, the inevitable undercutting of the type system with casts), and Java decided to just go ahead and ignore that problem.

    At first, I was hoping that generics were going to be a pretty slick solution, but no dice.

  • Brian

    It makes you cranky?

    The problem is that there are two solutions to the problem implemented in “popular” languages: you either ditch static typing altogether, or you go with C++ templates. Both of which suck.

    It looks like C# is going to introduce type inference for local types (at least): see this link. Hopefully this will introduce the idea to mainstream programmers. Unfortunately, this means Microsoft “invented” it (you know, in the same way they invented the GUI).

  • bhurt

    I want to expand on my comment from Friday on why bolt-on type systems are inherently inferior to built-in type systems.

    The first problem bolt-on type systems have is what do you do with code that isn’t type checked? This is exactly the problem that Chia was having. In any bolt-on system you’re going to have a large body of code that will never have been type checked. Some non-checked code may simply be expensive to retroactively check: that code written by the guy who quit the company five years ago to go herd goats in Uganda, for example. Other code may be outside the local sphere of control and impossible to check or fix: standard libraries, for example. In either case, retrofitting the code is a significant challenge.

    The other problem is more subtle- it’s the assumption that type checking is optional. Which rapidly leads to it being considered an annoying option. This is something most people miss (especially people who haven’t worked with good type systems ala Ocaml or Haskell)- the type system is not your enemy, to be tricked, subverted, or fought with. The type system is your friend, to be worked with.

    Whenever you are working with just about any static type system, you will hit a point where the type system will go “this code is incorrect”, when you know that it’s correct and it will work. You might even be correct and it will work. Especially in the latter case, there is an incredible temptation to simply say “well, OK, for this code, that I know is correct, I simply won’t type check it”. Which means you won’t ask the questions you should be asking- which are “why is this code construct a bad idea?” and “how else can I implement this functionality so that the type checker is happy?”

    When you start asking these questions, then you are on the way to working with the type system, not against it.

    Brian
