This post was provoked by a real event, unlike many of my posts, which are born of pure imagination.
JavaScript distinguishes between null and undefined. A cursory Google suggests that null is meant to be used by the programmer to represent an intentional null value and undefined is supposed to be a value that “doesn’t exist”. But this is a pretty vaguely worded distinction, and it lives entirely in the programmer’s mind and culture. The compiler has no idea what the difference between the two use cases is.
I was using a webpage a while ago and noticed that one of their buttons didn’t work. (It was a pretty important button too, rendering the page completely unusable.) On a whim, I decided to open up the Firefox dev tools console and see whether there was some kind of error. Sure enough, TypeError: paymentMethod is undefined showed up in the log, with a handy line number attached. I clicked on the line number and scanned through the code, wondering if the fix would be obvious, and… wait a minute, they’re checking whether paymentMethod is undefined just a few lines above! Control flow should never hit here unless paymentMethod has a value…
Oh wait. The check is paymentMethod !== null. And sure enough, undefined == null evaluates to true, but undefined === null evaluates to false. They checked to make sure paymentMethod wasn’t null, and then acted as if they knew paymentMethod was neither null nor undefined, relying on assumptions that were not provided by the check they made. [1]
if (x !== null) ... is an insidious mistake, because it looks like within its scope you should be safe. Even an experienced programmer could be forgiven for allowing their eyes to gloss over it, and proceeding to assume that in the following code block x has a value.
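To make the difference concrete, here is a minimal sketch of the two checks side by side. The function names and the name field are made up for illustration and are not taken from the page in question; only the shape of the check comes from the story above.

```javascript
// Hypothetical reconstruction: methodNameStrict, methodNameLoose and the
// `name` field are illustrative assumptions, not the site's actual code.

function methodNameStrict(paymentMethod) {
  // Strict inequality only rules out null, so undefined slips through.
  if (paymentMethod !== null) {
    // Throws a TypeError when paymentMethod is undefined.
    return paymentMethod.name;
  }
  return "none";
}

function methodNameLoose(paymentMethod) {
  // Loose inequality against null is false for both null and undefined,
  // so inside this block paymentMethod really is neither of them.
  if (paymentMethod != null) {
    return paymentMethod.name;
  }
  return "none";
}
```

Calling methodNameStrict(undefined) throws, while methodNameLoose(undefined) falls through to the "none" branch. The two functions differ by a single character, which is exactly why the mistake is so easy to read past.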
Some language features can demand an unreasonable amount of your attention.
A vast maze of choice
Clojure has two falsy values, false and nil. false is a pretty straightforward boolean; nil is a bizarre mashup of falsity, failure, every kind of empty sequence, and Java’s null.
It just so happens that idiomatic Clojure uses a lot of functions that return “either nil or some non-nil value” as pseudo predicates, since you can conditionally branch on any kind of value.
But nil corresponds to null on the Java side, so it generally shouldn’t be used with Clojure’s Java interop. I once had to write (if x x false) [2] instead of x, because we had a function that called out to Java and therefore wasn’t allowed to return nil. (I’m lucky the function’s docstring said as much, or it could have taken me weeks to figure out what was going wrong.)
And then there’s the fact that a lot of Clojure data is represented by key-value maps. What’s the difference between {:some-property false}, {:some-property nil}, and {}? [3] Are they all allowed? Do you prefer one form over the others? You decide!
Or well, you try to decide. But it’s not really your choice: it’s your codebase’s choice. Every time someone checks to see whether a given map has :some-property or not, they’re making an implicit choice as to what those three things mean. And if sometimes people are sloppy, because in the heat of the moment one interpretation was obvious and the others never even came to mind… well, you just have to hope that all of these accidental, unconscious choices happened to match up.
(Even though you know in your heart that they don’t. They never do, for the universe is cruel.)
Whatever shall we do?
If you’ve been reading my blog for very long, you might expect me to now embark on a rant about how Haskell does it better. This is, in fact, what I’m going to do. [4]
For very simple sources of failure, there’s the Maybe type, which I have probably written a lot about (and if it doesn’t have its own dedicated blog post it will probably get one eventually). Maybe is useful when there’s only one kind of failure possible (or there might as well be only one for all you care), and when you want to respond to that failure just by abandoning the control flow that led to the error. When you’re in this situation, you can use Maybe to introduce a single failure value, guaranteed to be distinct from any of the “success” values, which basically has the behavior of causing control flow to be abandoned whenever something tries to operate on it.
What about when you want to have multiple failure values, with subtle distinctions between them?
Well, first of all, think twice. Dealing with subtly different shades of mu can be a huge headache.
But if you really need to, Haskell has an idiomatic way of easing the headache. First, define an explicit data type enumerating the ways something might have failed: data Error = KeyboardNotFound | SegmentationFault | PEBKAC. Then you use the Either type, which is like Maybe except that instead of one failure value there are many. (In particular, Either errorType introduces as many failure values as there are values of errorType.) But while there are many different failure values, there’s a bastion of order and sanity for you to cling to: all of the different ways of failure are guaranteed to have the same effect on control flow (namely, the abortion of control flow). And that effect is guaranteed to be distinct from the effect of successful values on control flow, which are in turn all the same as each other.
And most importantly, there is a single way to check for this composite notion of failure, and this check is easy to remember and doesn’t hide in a room full of distracting wrong alternatives.
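Haskell’s guarantees here come from its type checker, which plain JavaScript does not have, so the following is only a loose runtime analogue of the shape of Either, written in the language from the opening anecdote. Every name in it (ok, fail, isFailure, andThen, parsePort, and the two error labels) is hypothetical, and nothing stops a careless caller from bypassing the single check the way a Haskell compiler would.

```javascript
// A rough sketch of an Either-like result, under the assumptions above.
// Successes and failures are tagged, so a failure can never be confused
// with a success value, even when the success value is null or false.
const ok = (value) => ({ ok: true, value });
const fail = (error) => ({ ok: false, error });

// The single check: every failure, whatever its reason, answers true here.
const isFailure = (result) => !result.ok;

// Run `next` only on success. Any failure is passed along untouched,
// so all failures have the same effect on control flow.
const andThen = (result, next) =>
  isFailure(result) ? result : next(result.value);

// Hypothetical example operation with two distinct ways to fail.
function parsePort(text) {
  const n = Number(text);
  if (!Number.isInteger(n)) return fail("NotANumber");
  if (n < 1 || n > 65535) return fail("OutOfRange");
  return ok(n);
}

const doubled = andThen(parsePort("8080"), (n) => ok(n * 2));
const broken = andThen(parsePort("http"), (n) => ok(n * 2));
```

Here doubled is a success carrying 16160, and broken is the untouched NotANumber failure: the doubling step never ran. The reason for the failure is still available in broken.error for anyone who cares, but nobody has to remember which of several subtly different checks is the right one.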
1. This is one of the more important, nonobvious things a static type system can give you. A really good type checker can, to some degree, infer what assumptions a code block relies on and what properties a dynamic check verifies. If these things don’t match up, as in this JavaScript case, you can get an error at compile time.
2. I now realize that this could be replaced by the equally horrible but more compact (or x false).
3. It gets even worse if you think about lists instead of booleans. false vs nil vs “not present” works out the way you’d expect barring occasional subtle interactions, but [] vs () vs {} vs #{} vs “not present” requires a lot more care, especially since almost all functions that act on collections return seqs [5] (like ()) regardless of what the input was, and because nil can be treated to some extent like an empty collection of any built-in collection type.
4. Note, however, that this section glosses over a few things. It is in praise of Haskell’s ideal philosophy, which I do believe is closer to perfection than any language before it, but which Haskell does not unfailingly embody. I’m happy to talk about the various unfortunate departures Haskell makes, but in this section of this blog post a lengthy discussion of such would merely distract from the point.
5. Except, ironically, for seq, which coerces the input collection to a seq unless the input was empty, in which case it returns nil.