What’s the difference between undefined in Haskell and null in Java?

Question

What’s the difference between undefined in Haskell and null in Java?

Ok, let’s back up a little.

“undefined” in Haskell is an example of a “bottom” value (denoted ⊥). Such a value represents any undefined, stuck or partial state in the program.

Many different forms of bottom exist: non-terminating loops, exceptions, pattern match failures — basically any state in the program that is undefined in some sense. The value undefined :: a is a canonical example of a value that puts the program in an undefined state.
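To make a few of those forms concrete, here is a minimal sketch, assuming GHC and its Control.Exception module. The names raisesException, explicitError, firstElem and divByZero are made up for illustration. It forces several exception-style bottoms and reports which ones blew up. A non-terminating loop can’t be detected this way, so none is included.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- An explicit exception raised with `error` (hypothetical example value).
explicitError :: Int
explicitError = error "boom"

-- A partial function: there is no equation for the empty list,
-- so applying it to [] is a pattern match failure.
firstElem :: [a] -> a
firstElem (x:_) = x

-- Integer division by zero raises an arithmetic exception.
divByZero :: Int
divByZero = 1 `div` 0

-- Force a value to weak head normal form and report whether doing so
-- raised an exception. This cannot detect infinite loops.
raisesException :: a -> IO Bool
raisesException x = do
  result <- try (evaluate x)
  pure (either caught (const False) result)
  where
    caught :: SomeException -> Bool
    caught _ = True

main :: IO ()
main = do
  flags <- mapM raisesException [undefined, explicitError, firstElem [], divByZero, 42]
  print flags  -- [True,True,True,True,False]
```

Only the last element, an ordinary Int, evaluates without trouble. The other four are all bottom, just reached by different routes.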

undefined itself isn’t particularly special (it’s not wired in), and you can implement Haskell’s undefined using any bottom-yielding expression. E.g. this is a valid implementation of undefined:

 > undefined = undefined

Or one that exits immediately, because its only guard can never succeed (the old Gofer implementation used this definition):

 > undefined | False = undefined

The primary property of bottom is that if an expression evaluates to bottom, your entire program will evaluate to bottom: the program is in an undefined state.

Why would you want such a value? Well, in a lazy language, you can often manipulate structures or functions that store bottom values, without the program being itself bottom.

E.g. a list of infinite loops is perfectly cromulent:

 > let xs = [ let f = f in f 
            , let g n = g (n+1) in g 0
            ]
 > :t xs
 xs :: [t]
 > length xs
 2

I just can’t do much with the elements of the list:

 > head xs
 ^CInterrupted.

This manipulation of infinite stuff is part of why Haskell’s so fun and expressive. One consequence of laziness is that Haskell pays particularly close attention to bottom values.

However, clearly, the concept of bottom applies equally well to Java, or any (non-total) language. In Java, there are many expressions that yield “bottom” values:

dereferencing a null reference (though note, not null itself, which is well-defined);
division by zero;
out-of-bounds exceptions;
an infinite loop, etc.

You just don’t have the ability to substitute one bottom for another very easily, and the Java compiler doesn’t do a lot to reason about bottom values. However, such values are there.

In summary,

dereferencing a null reference in Java is one specific expression that yields a bottom value;
the undefined value in Haskell is a generic bottom-yielding expression that can be used anywhere a bottom value is required in Haskell.

That’s how they’re similar.
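As a small illustration of the “anywhere” part, here is a sketch in which undefined stands in at three different types. All of the names (placeholderCount, placeholderCheck, config) are hypothetical. Because undefined has type a, each definition typechecks, and thanks to laziness the program only goes wrong if one of the placeholders is actually forced.

```haskell
-- `undefined :: a` fits any type, so it can act as a placeholder
-- for a value we have not written yet.
placeholderCount :: Int
placeholderCount = undefined

-- It works equally well at a function type.
placeholderCheck :: String -> Bool
placeholderCheck = undefined

-- A pair whose second component is bottom. The first component can
-- still be read, because the pair constructor is lazy in its fields.
config :: (String, Int)
config = ("localhost", undefined)

main :: IO ()
main = putStrLn (fst config)  -- prints: localhost
```

Asking for snd config, placeholderCount, or placeholderCheck applied to any string would raise the undefined exception instead.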

Postscript

As to the question of null itself: why is it considered bad form?

Firstly, Java’s null is essentially equivalent to adding an implicit Maybe a to every type a in Haskell.
Dereferencing null is equivalent to pattern matching for only the Just case: f (Just a) = ... a ...

So when the value passed in is Nothing (in Haskell), or null (in Java), your program reaches an undefined state. This is bad: your program crashes.

So, by adding null to every type, you’ve just made it far easier to create bottom values by accident — the types no longer help you. Your language is no longer helping you prevent that particular kind of error, and that’s bad.
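The Maybe analogy can be sketched as follows. This is only an illustration with made-up names (nameLengthPartial, nameLengthTotal), not a claim about how any Java code is actually compiled. The partial version handles only the Just case, much like dereferencing a reference without a null check. The total version is what the explicit Maybe type nudges you towards.

```haskell
-- Partial: only the Just case is handled, so Nothing is a
-- pattern match failure at run time (the analogue of a null dereference).
nameLengthPartial :: Maybe String -> Int
nameLengthPartial (Just name) = length name

-- Total: both constructors are handled, so no input yields bottom.
nameLengthTotal :: Maybe String -> Int
nameLengthTotal (Just name) = length name
nameLengthTotal Nothing     = 0

main :: IO ()
main = do
  print (nameLengthTotal (Just "haskell"))  -- 7
  print (nameLengthTotal Nothing)           -- 0
  print (nameLengthPartial (Just "java"))   -- 4
  -- nameLengthPartial Nothing would crash with a pattern match failure.
```

Because the Maybe is visible in the type, GHC can point at the gap: compiling with -Wincomplete-patterns (part of -Wall) warns about the missing Nothing case in nameLengthPartial. With an implicit null on every type there is nothing for the compiler to check against.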

Of course, other bottom values are still there: exceptions (like undefined), or infinite loops. Adding a new possible failure mode to every function (dereferencing null) just makes it easier to write programs that crash.
