Static Function Arglist Mismatch

An unsettling amount of my time working at Prismatic has been spent looking for errors whose ultimate cause turns out to be a function statically applied to the wrong number of arguments. By this I mean that you can tell that the function is applied to too many or few arguments just by looking at the function definition and call site, without doing any sophisticated control flow analysis or reasoning about the values of any variable. For example:

(defn foo [x y z] ...)
...
(foo x)

Things aren’t always this easy, of course. If we want to be able to check something like this trivially, there are a few wrinkles:

Declaration site


The function needs to have a static number of arguments at the declaration site — variadic functions are harder to check. If the declaration looks like (defn foo [x y z] ...), the compiler should be able to know exactly how many arguments the function takes. But if it’s like (defn foo [x y z & rest]) instead, knowing whether a given number of arguments is an appropriate input is much harder and probably impossible in general.

Call site


The function also needs to be called with a static number of arguments — which is to say, we need to provide an argument list in the source code rather than using a list variable. (foo 5 6 7) is statically applied to three arguments, so if its declaration statically requires three arguments we know this code is wrong. But something like (apply foo list-of-arguments) applies foo to a dynamic number of arguments, and we have to find out the length of list-of-arguments to know whether the call is correct. This is of course impossible in general.

The entire codebase


This one is probably less of an issue in practice than the other two, but it’s much harder to check for, and I expect that it is the main reason function argument counting is not usually implemented.

Basically, suppose we have something like this:

(defn foo [x y z] ...)
(defn foo [x y] ...)
(foo 5 6)

This is perfectly legal in Clojure and in most other languages, and it means that we can’t just nominate a single declaration form as the definitive place to find foo’s argument count. And while in this particular case it’s pretty obvious that the function being called has an arity of exactly 2, there are obfuscations we can perform that make it much less clear whether a function has been redefined between its first definition and any particular call site. With the other two restrictions you can check whether they apply and simply turn off the static checking when they do; here, it is very difficult (and maybe impossible) to determine whether a definition has been superseded, which means that static argument counting can disallow valid code.

(defn foo [x y z] ...)
;; something here that redefines foo
;; so that it can take two arguments
(foo 5 6) ;; this is valid!

If the compiler can’t tell that this unspecified code in the middle is redefining foo… it will disallow this code even though it’s technically valid and won’t crash.


So does this mean that in a dynamic language we can’t have the extremely helpful sanity checking of counting a function invocation’s arguments and comparing that to the definition? Actually, no!

The thing that really prevents us from performing this check is the third caveat, simply because it’s difficult to detect whether the caveat is applicable or not. But this caveat doesn’t inescapably apply to all dynamic languages: it’s a side effect of tying the top-level namespace to the execution model. If top-level functions were immutable[1], there wouldn’t even be the possibility of caveat 3, and we could safely protect programmers from passing the wrong number of arguments.

Of course, if you’re not willing to disallow the redefinition of top-level names, there are still ways to perform the check. They’re not quite as obvious, and I don’t actually know any of them, but I’m certain it’s possible because the SBCL compiler for Common Lisp can very often emit warnings if you call a function with too few arguments.
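
One very conservative option (certainly not what SBCL does) is to simply refuse to check any name that is defined more than once. Here is a minimal sketch of that idea in Python, using the standard library’s ast module. It is an illustration rather than a robust tool: it only handles the easy cases from above, namely fixed-arity top-level defs and direct calls by name with a literal argument list, and it gives up entirely on redefined names (rebinding by plain assignment isn’t handled at all).

# arity_check.py: a sketch of static argument-count checking for Python
# source, using only the standard library's ast module.
import ast
import sys

def check(source):
    tree = ast.parse(source)
    arities = {}   # name -> required number of positional arguments
    skip = set()   # names we refuse to check

    # Caveat 1 (declaration site): only record top-level defs whose arity is static.
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            args = node.args
            if node.name in arities or node.name in skip:
                # Caveat 3 (the entire codebase): a redefined name is not checkable.
                arities.pop(node.name, None)
                skip.add(node.name)
            elif args.vararg or args.kwarg or args.defaults or args.kwonlyargs:
                skip.add(node.name)
            else:
                arities[node.name] = len(args.posonlyargs) + len(args.args)

    # Caveat 2 (call site): only check calls made by name with a literal argument list.
    problems = []
    for node in ast.walk(tree):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in arities
                and not node.keywords
                and not any(isinstance(arg, ast.Starred) for arg in node.args)):
            expected = arities[node.func.id]
            actual = len(node.args)
            if actual != expected:
                problems.append("line %d: %s takes %d argument(s), called with %d"
                                % (node.lineno, node.func.id, expected, actual))
    return problems

if __name__ == "__main__":
    for problem in check(open(sys.argv[1]).read()):
        print(problem)

Even something this naive would flag the (foo x) call from the top of this post, translated into Python.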

It would be really nice if languages like Clojure and Python could take a leaf out of SBCL’s book and start doing some straightforward checking of function argument count. It’s not applicable in every case, but where it doesn’t work it can just be disabled with no downside. And over the rest of my life, I calculate that it will save me seven years and seven days of trouble.


  1. technically, if top-level variables were immutable. Clojure functions already are in the sense that once you have a reference to a function it’s set in stone, but if you’re calling a function by name (which is what happens almost all of the time), the referent of that name (and hence the meaning of the function call) can still change. 

Type-directed termination

I’ve been distracted from Serpent lately, not just by work but by a fascinating theoretical idea.

The Inspiration


A couple of months ago I came across this great paper[1] detailing a way to do guardedness checking in a type system.

(A quick detour, because there are going to be a lot of strange words here. “Coprogramming” is a fancy way to talk about programs that are supposed to be infinite loops (like an operating system or game). “Guardedness” is a property of coprograms that can be used to test whether your coprogram is “productive”, which is a particular way for it to be well-behaved. For more information, see previous posts, and if I ever find a nice introductory article I will announce it somewhere.)

Anyway. Type-directed guardedness checks are great because traditional productivity checkers are based on syntactic guardedness, which means (roughly) they just consider the program’s text without thinking about what’s actually happening in the program. This is okay in the sense that it never produces any false positives: if a syntactic guardedness check returns “true” for your program, then your program is definitely “well-behaved” (in the specific sense of productivity). But failing to take into account the program’s semantics means we misclassify a lot of well-behaved things as badly behaved. Usually a false negative can be rescued by rewording the program so that the guardedness is obvious to the compiler, but this often produces code that is harder to write and harder to read.
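
(Guardedness and productivity are usually discussed in total languages like Coq or Agda, but here is a rough Python analogy, mine rather than the paper’s, to make “productive” concrete: a definition is productive if each element of its infinite output shows up after only finitely many steps of work.)

from itertools import islice

def ones():
    # Productive: an element is yielded before any further work happens,
    # so a consumer asking for the next item always gets one.
    while True:
        yield 1

def keep_if(pred, stream):
    # Not obviously productive: if pred never holds again, a consumer
    # asking for the next item waits forever, even though the code
    # "looks" like an ordinary stream transformer.
    for x in stream:
        if pred(x):
            yield x

print(list(islice(ones(), 5)))   # [1, 1, 1, 1, 1]
evens = keep_if(lambda n: n % 2 == 0, ones())
# next(evens) would loop forever: no element of ones() is even.

A syntactic guardedness check plays the role of rejecting every definition shaped like keep_if outright, including the many particular uses that really are productive.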

Atkey and McBride’s framework doesn’t directly take program semantics into account, but it does bring productivity checking into the type system, which is almost as good. Basically, each infinite structure has a type that represents “how much of it has already been completed”, and this allows the type system to control how much of the structure you’re allowed to use. An incompletely constructed infinite sequence has a type that only supports accessing the pieces that have already been definitely placed, but after you’ve constructed it (i.e. outside the definition) it has a type that supports retrieving arbitrary pieces. This ensures that information never depends on Information From The Future, which means your infinite data structure is “productive”, or well-behaved.

So that’s why Atkey and McBride’s paper is exciting. What’s this idea of mine that I mentioned in the introduction?

Termination checking


So, it turns out that termination checking (proving that a computation that isn’t supposed to be an infinite loop, well, isn’t an infinite loop) is also traditionally done using syntactic checks. This has the same drawbacks as productivity checking: lots of false negatives, and rewriting the false negatives so that the checker can see they’re well-behaved gives you code that’s a lot uglier and more distracting.
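
As a rough analogy (again in Python, and again mine rather than anything from the literature): the first function below clearly terminates, because its argument shrinks toward the base case, but a purely syntactic check that only accepts recursive calls on a structural piece of the argument (something like n - 1) can’t see that n // 2 qualifies. The usual workaround is the second version, which is structurally decreasing at the cost of an extra “fuel” argument that has nothing to do with the problem.

def log2_floor(n):
    # Terminates for any n: n // 2 is strictly smaller than n when n > 1,
    # but that fact is arithmetic, not syntactic.
    if n <= 1:
        return 0
    return 1 + log2_floor(n // 2)

def log2_floor_fuel(n, fuel):
    # The reworded version: fuel - 1 is a structural piece of fuel, so a
    # syntactic termination check is satisfied.
    if fuel == 0 or n <= 1:
        return 0
    return 1 + log2_floor_fuel(n // 2, fuel - 1)

print(log2_floor(1024))             # 10
print(log2_floor_fuel(1024, 1024))  # 10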

Now it might be obvious from the way we name these things that type theorists consider recursion and corecursion to be closely related. (Precisely, in category theory the relationship is called being “dual” to each other. As I am not an expert category theorist, I won’t go too far into that in this post.) The relationship is pretty simple to verbally define, but its implications are not necessarily easy to grasp: I can’t quite visualize what arrows I need to flip to transform one into the other. But I have the powerful intuition that, if I can figure out how to “flip the arrow” to turn recursion into corecursion, I can use the same “arrow-flipping” procedure on the productivity checking method described in this paper and get…

a method of termination checking. One that’s type-directed rather than relying on dumb syntactic checks.

This is pretty exciting, because when programming in Idris I upset the termination checker a lot more often than the productivity checker. It’s exciting because here is an academic problem that I don’t know the answer to, and it’s been a while since I’ve seen one of those. It’s exciting because as far as I can tell from a cursory Google Scholar search, no one has done this yet.

A part of me would really love to reread this paper a couple more times, until I understand exactly what they’ve done and (hopefully) how they came up with the idea. That part of me would go on to perhaps do some research on the category-theoretic formulation of recursion and corecursion, so that she can visualize the arrow to flip to turn one into the other. That part of me would be ecstatic if it turned out there was a publishable paper somewhere in here, but even if it turns out someone else has already done it, she’d feel that the independent discovery was worth her time, if only as a recreational activity.

Harsh Reality


Perhaps my intuition is mistaken, and there is no straightforward transformation of this productivity checker into a termination checker. But before I even think about that, I need to remember Serpent.

If I really wanted to, I’m sure I could convince myself that abandoning Serpent is the right choice here. And I’m sorely tempted, but… I’m trying to set a precedent for myself. There are important skills that I need but don’t have, and I’m not going to develop those skills by returning to my old ways of abandoning projects every time I come across an idea that excites me.

So as much as I’d love to drop back into type theory, and think about clock variables and flipping arrows… I think it’s pretty important that I finish Serpent first.


  1. Atkey and McBride, Productive Coprogramming with Guarded Recursion

C is declarative, kind of

Inconsequential musing of the day:

If you look at, say, a Python file — or Bash or Ruby or almost any other scripting language — you can’t find any information[1] about the names defined in a program without actually running the script. This is because in Python, the only way to associate a name with a value is to execute a line of code that causes it to get some value (maybe an assignment statement, maybe a function definition, maybe an eval that contains one of these things). The association of top-level names to values is inherently a procedural process, defined only by the language’s execution model.

Contrast this with the situation in C. C has actual declarations, which associate certain information with a name completely independently of the execution model, and in fact the execution model usually doesn’t[2] cause values to become associated with top-level names at all. C’s top-level namespace is specified declaratively: it states directly which names are associated with which values.

Does this matter? In practice, probably not much. In theory, a declarative program layout should be easier to manipulate through automated tooling (for example, to list out the documentation for all of the top-level names defined in a module/file without running any of its code). But most reasonable Python programs are pretty declarative at the top level anyway: you rarely see things like

# webserver.py
turing_machines = [...]

if does_it_halt(turing_machines[0]):
    def post(blah, bleh):
        blee
if does_it_halt(turing_machines[2]):
    def get(bloo):
        bluh

globals()[running_time(turing_machines[1])].__doc__ = \
  ''.join(accepted_language(turing_machines[1]))
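
In the common case where the top level really is declarative in practice, the tooling benefit mentioned above is easy to make concrete. Here’s a minimal sketch (webserver.py just names the hypothetical file above): it lists every top-level function and its docstring without executing any of the module’s code. Run against the pathological version above, it would silently report nothing, since post and get are buried inside if statements.

import ast

with open("webserver.py") as f:
    tree = ast.parse(f.read())

# Walk only the top level: in a declaratively laid-out module, this is
# exactly the set of functions the file defines.
for node in tree.body:
    if isinstance(node, ast.FunctionDef):
        print(node.name, "-", ast.get_docstring(node) or "(no docstring)")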

On the other hand, languages like C do make it a lot easier to catch typos, which I consider to be an appreciable benefit. Not that typos are a particularly difficult class of bug to track down.

But mostly I just found it interesting to think about how C’s model is, in a way, more declarative than lots of more modern languages.


  1. In general. Of course most specific scripts aren’t quite so horrible. 
  2. I think there are a couple of exceptions to this having to do with extern and/or static top-level variables and perhaps also function pointers (can those be defined at the top level?). But the top level of a basic C program consists entirely of function declarations and no assignment.

Impressions of Mercury

I’ve been using Mercury for a while and I think I’ve gotten enough experience to talk a bit about my first impressions.

  • state variables — useful but hacky. Basically, Mercury’s version of Haskell’s side effect sequestering involves threading a useless unique variable through all IO predicates (much like GHC’s implementation of Haskell’s IO monad, actually). So !IO (the name can be anything, not just IO) is preprocessed into something like IO1, IO2, except that the next time you write it it becomes IO2, IO3 and so on. The entire exercise seems a bit pointless to me since it’s all a hack to get things to occur in the order you wrote them. It’s no worse than any other language, though — it just doesn’t tickle my aesthetic sense of “a principled way to handle side effects”.
  • DCG — a bit like state variables. Instead of writing foo(A, B, !Var) :- do1(!Var), do2(B, BOut), do3(BOut, A, !Var)., you write foo(A, B) --> do1, { do2(B, BOut) }, do3(BOut). It’s not clear why you need to have both of these, or why it’s preferred to use state variables instead of DCG for writing IO predicates. DCG is kind of fun to write though.
  • I wish there were a way to get multiple clauses to desugar to nested if statements instead of plain disjunctions. The Haskell style where you write f ConstructorOne = ...; f (ConstructorTwo bleh) = ...; f _ = ... produces nondeterministic or multi-solution predicates, when you usually want either deterministic or semideterministic (at most one solution).
  • Thinking about determinism is fun, though. See Typechecker-directed Epiphanies — I’m getting to the point where I understand how the determinism checker works.
  • Modes are interesting. Essentially, any given predicate might have several different arguments that make sense as “outputs”. In Prolog, if I remember correctly, you could just use any argument as an output, even if the code you wrote to define the predicate causes an infinite loop when you try to solve for that particular variable. In Mercury, you have to declare some set of modes for each predicate — a mode is basically “If you want to solve for this set of variables, you need this other set of variables to have a known form, and this mode of the predicate has at least (zero|one) solution and (can|can’t) have multiple solutions.” And then, this is the cool part: Mercury checks to see that each mode you declare is valid. So if you say :- mode foo(in, out, out) is nondet. :- mode foo(in, out, in) is semidet., Mercury will actually check that foo can provide zero or more values for the second and third arguments if the first argument is known, and that it can provide zero or one value for the second argument if the first and third are known. And this is just scratching the surface; there are even cooler things you can do by defining custom “insts” (instantiation states). I could probably write a whole blog post about how amazing modes are.
  • I’m not actually making very good use of the logic programming style — almost all of my predicates are deterministic or semideterministic and could easily be rewritten as functions. It’s kind of fun to say eval(Arg, Env, ArgOut), eval(Body, [ArgOut | Env], AppOut) instead of AppOut = eval(Body, [eval(Arg, Env) | Env]), but it’s a superficial kind of fun, like wearing a Halloween costume instead of normal street clothes. I wonder if this is because I’m just not good at taking advantage of logic programming or because logic programming isn’t very good for this kind of problem?
  • Unfortunately, a lot of the cooler prospects of the language are still unimplemented. Things like unique modes on data type fields and partial instantiation states are just not supported yet — as I understand it, if you try to write code that uses these features, you’ll get a compile-time or runtime error. This is really disappointing, but luckily the language as it exists now is already pretty nice to program in.

Typechecker-directed epiphanies

As I get more experience with powerful type systems like those of Idris and Mercury, I’m noticing a strange phenomenon. These type checkers deal with powerful logics, but the checkers themselves are not very smart; they can only verify a property if it’s syntactically obvious. And if you don’t have much experience with such a type checker, and haven’t learned the quirks of exactly what “syntactically obvious” means, programming in that language can be a chore. It’s been a while since I felt lost in Idris, but since I’ve lately been learning Mercury, I’m getting reacquainted with the despair of “Why won’t you accept this obviously-correct code, compiler?”

But as you get more familiar with what exactly the compiler checks, your relationship with it starts to change. The question changes from “Why can’t the stupid compiler see this?” to “Well, how do I know my code is correct?” And something that very often happens is, once I’ve thought for a few moments about how I know what’s going on, it’s immediately clear how to rewrite the code so that the compiler knows what’s going on.

But more importantly: sometimes, when I’ve taken a moment to think rigorously, I realize that my code isn’t correct. Maybe the Mercury predicate that is supposed to be deterministic actually has multiple solutions — not just the way I wrote it but the way I meant it, and I have to come up with a different plan for writing the predicate because my old one is flawed. And I would never have noticed it before because as far as I can tell, the code is obviously correct.

I admit that stupid typecheckers verifying complicated properties is a bit of a usability problem for people inexperienced with the typechecker. I have felt that pain and frustration, and I know it isn’t pleasant.

But I think it might be worth it. Once the pain goes away, you’re left with an obligation to strengthen your understanding of your code — to go from “This obviously does what I want it to do” to “Here, Stupid Compiler, you can’t understand my higher level thought processes, so let me walk you through it.”