An unsettling amount of my time working at Prismatic has been spent looking for errors whose ultimate cause turns out to be a function statically applied to the wrong number of arguments. By this I mean that you can tell that the function is applied to too many or too few arguments just by looking at the function definition and call site, without doing any sophisticated control flow analysis or reasoning about the values of any variable. For example:
(defn foo [x y z] ...)
...
(foo x)
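For illustration, here is a small sketch of how that mistake behaves in Clojure today. The names call-foo-badly and result are hypothetical, and the body of foo is a placeholder. The point is only that the bad call compiles without complaint and fails when it is actually executed.

```clojure
;; Placeholder body; only the argument count matters here.
(defn foo [x y z] [x y z])

;; This definition is accepted even though foo needs three arguments.
(defn call-foo-badly [x]
  (foo x))

;; The mistake only surfaces when the call actually runs.
(def result
  (try
    (call-foo-badly 1)
    (catch clojure.lang.ArityException _
      :arity-exception)))
```

If call-foo-badly sits on a rarely exercised code path, the exception can show up long after the mistake was written.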
Things aren’t always this easy, of course. If we want to be able to check something like this trivially, there are a few wrinkles:
1. The function needs to have a static number of arguments at the declaration site; variadic functions are harder to check. If the declaration looks like (defn foo [x y z] ...), the compiler should be able to know exactly how many arguments the function takes. But if it’s like (defn foo [x y z & rest]) instead, knowing whether a given number of arguments is an appropriate input is much harder and probably impossible in general.
2. The function also needs to be called with a static number of arguments, which is to say, we need to provide an argument list in the source code rather than using a list variable. (foo 5 6 7) is statically applied to three arguments, so if its declaration statically requires a different number of arguments we know this code is wrong. But something like (apply foo list-of-arguments) applies foo to a dynamic number of arguments, and we have to find out the length of list-of-arguments to know whether the call is correct. This is of course impossible in general.
3. The entire codebase
This one is probably less of an issue in practice than the other two, but it’s much harder to check for, and I expect that it is the main reason function argument counting is not usually implemented.
Basically, suppose we have something like this:
(defn foo [x y z] ...)
(defn foo [x y] ...)
(foo 5 6)
This is perfectly legal in Clojure and in most other languages, and it means that we can’t just nominate a single declaration form as the definitive place to find foo’s argument count. And while in this particular case it’s pretty obvious that the function being called has an arity of exactly 2, there are obfuscations we can do to make it less clear whether a function has been redefined between its first definition and any particular call site. So whereas with the other two restrictions you can check whether they apply and turn off the static checking, it is certainly very difficult and maybe impossible to determine whether a definition has been superseded, and that means that static argument counting can disallow valid code.
(defn foo [x y z] ...)
;; something here that redefines foo
;; so that it can take two arguments
(foo 5 6) ;; this is valid!
If the compiler can’t tell that this unspecified code in the middle is redefining foo, it will disallow this code even though it’s technically valid and won’t crash.
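One possible obfuscation of this kind, as a minimal sketch: the redefinition below never uses defn, so a checker that only looks for definition forms would miss it. The function bodies are placeholders, and the sketch assumes default compiler settings, where calls by name go through the var.

```clojure
;; Placeholder body; only the argument count matters here.
(defn foo [x y z] [x y z])

;; No second defn appears anywhere, yet the root binding of #'foo
;; is swapped at run time for a two-argument function.
(alter-var-root #'foo (constantly (fn [x y] [x y])))

;; Valid: the name foo now refers to the two-argument function.
(foo 5 6)

;; The var's :arglists metadata still describes the original
;; three-argument definition, so it cannot be trusted here either.
(:arglists (meta #'foo))
```

A checker that trusted either the original defn or the var's metadata would reject the final call, even though it runs without error.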
So does this mean that in a dynamic language we can’t have the extremely helpful sanity checking of counting a function invocation’s arguments and comparing that to the definition? Actually, no!
The thing that really prevents us from performing this check is the third caveat, simply because it’s difficult to detect whether the caveat is applicable or not. But this caveat doesn’t inescapably apply to all dynamic languages: it’s a side effect of tying the top level namespace to the execution model. If top-level functions were immutable[1], there wouldn’t even be the possibility of caveat 3, and we could safely protect programmers from passing the wrong number of arguments.
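To make the idea concrete, here is a rough sketch of such a check. It assumes top-level names are never redefined, and it relies on the :arglists metadata that defn attaches to a var. The names fixed-arities and arity-problem are hypothetical, and this is an illustration rather than a real linter. It gives up on variadic functions (the first caveat), which also means it stays silent on apply calls (the second caveat), since apply is itself variadic.

```clojure
(defn fixed-arities
  "Returns the set of argument counts accepted by the var that sym names,
  or nil when the check does not apply: sym does not resolve to a var,
  the var has no :arglists metadata, or any of its arglists is variadic."
  [sym]
  (let [v        (resolve sym)
        arglists (when (var? v) (:arglists (meta v)))]
    (when (and (seq arglists)
               (not-any? #(some #{'&} %) arglists))
      (set (map count arglists)))))

(defn arity-problem
  "Given a quoted call form such as '(foo 1), returns a map describing the
  mismatch when the call is statically applied to the wrong number of
  arguments, and nil otherwise."
  [form]
  (when (and (seq? form) (symbol? (first form)))
    (let [[f & args] form]
      (when-let [arities (fixed-arities f)]
        (when-not (contains? arities (count args))
          {:fn f :given (count args) :expected arities})))))

;; Placeholder function to check calls against.
(defn foo [x y z] [x y z])

(arity-problem '(foo 1))         ; a map reporting 1 argument given, 3 expected
(arity-problem '(foo 1 2 3))     ; nil, the counts match
(arity-problem '(apply foo [1])) ; nil, apply is variadic so the check is skipped
```

The sketch only inspects a single call form. A real tool would have to walk every form in a file and account for local bindings that shadow top-level names.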
Of course, if you’re not willing to disallow the redefinition of top-level names, there are still ways to perform the check. They’re not quite as obvious, and I don’t actually know any of them, but I’m certain it’s possible because the SBCL compiler for Common Lisp can very often emit warnings if you call a function with too few arguments.
It would be really nice if languages like Clojure and Python could take a leaf out of SBCL’s book and start doing some straightforward checking of function argument count. It’s not applicable in every case, but where it doesn’t work it can just be disabled with no downside. And over the rest of my life, I calculate that it will save me seven years and seven days of trouble.
[1] Technically, if top-level variables were immutable. Clojure functions already are, in the sense that once you have a reference to a function it’s set in stone, but if you’re calling a function by name (which is what happens almost all of the time), the referent of that name (and hence the meaning of the function call) can still change.