Gradual typing for Clojure

dissoc

Today’s programming landscape is dynamic and polyglot. Interlanguage interoperability is a mainstay — it’s almost unavoidable that a given piece of code will be used without the knowledge of the original author.

So it’s surprising to learn most static type systems completely ignore the pragmatics of running statically typed code in the real world. Who can use this code? Can they break the type invariants? These questions are answered through the narrow vision of compile-time sources, without considering what can actually happen at runtime.

Gradual typing is designed to keep static invariants reliable in the real world — it preserves the invariants of well-typed code beyond the compile-time sandbox type checkers assume.

Crowdfunding campaign

I am enhancing Typed Clojure with gradual typing. You can support me by funding or sharing the associated crowdfunding campaign. To learn more, keep reading.

Gradual typing cares deeply about interlanguage invariants. While static type systems define and verify invariants within a language, gradual type systems preserve these invariants across language boundaries.

Gradual typing is widely applicable to many combinations of languages. There is usually a “more typed” and respectively “less typed” language — Idris and Haskell are a plausible combination.

The role of a gradual type system is to protect a language’s static invariants from foreign code. Often this means protecting a typed language from an untyped language.

We now explore the details of gradual typing.

It takes two to tango, so let Typed Clojure be our “typed” language — the language with finer grained static invariants — and Clojure our “untyped” language.

The main feature of a gradually typed language is the runtime mediator. As the gatekeeper to the typed land, it uses runtime contracts to ensure typed and untyped values play nicely.

You can think of each land as a file written in the corresponding language.

We represent the mediator as an orange line separating “typed land” (left) and “untyped land” (right).

Language boundary

Gradual typing was independently invented around 2006 by four papers. Siek & Taha’s “micro” gradual typing applies to single expressions, while Tobin-Hochstadt & Felleisen’s “macro” approach only applies to entire modules. We concentrate on the latter, as featured in Typed Racket.

Whenever a value — typed or untyped — crosses the language boundary, it is the mediator’s job to apply an appropriate contract.

We represent an untyped value with the Clojure logo. When the mediator wraps some untyped value in contracts, we represent this as a ring of orange.

Untyped example

Similarly, the Typed Clojure logo stands for a well-typed Typed Clojure value, with contracts represented by a ring of orange.

Typed example

Importing untyped values into typed land requires a contract wrapping the untyped code.

Import untyped boundary

The goal is to restrict the untyped value to behave exactly as the annotation ascribed to it by the programmer, otherwise throw a contract violation. Now the typed language can “forget” the value is untyped and treat it like any other typed value.

An exported typed value also needs a contract.

Export typed boundary

Isn’t the typed value already type-safe by definition? Yes. Why do we need a contract? All usages in typed land are type checked, but exporting a typed value leaves it vulnerable to unknown code. To preserve the static invariants for typed value in untyped land, we need a contract.

That’s the extent of the runtime mediator’s job at the language boundary. Pretty straightforward really.

Of course, the devil’s in the details — especially (as usual) with functions. What is a function contract? How do function contracts protect typed functions? Or verify untyped functions? Let’s take a closer look at some invocation sites to appreciate some of runtime mediator’s choices.

The simplest case is an untyped function applied to an untyped value.

Untyped invoke untyped

Recall the goal of gradual typing — to ensure the invariants of typed code are preserved.

The output is simply an untyped value. Since there is no interaction with typed code, we don’t need to check inputs or outputs.

Similarly, there’s not much to do when typed code is applied to typed code.

Typed invoke typed

By elimination, this situation can only occur in typed land — if it were in untyped land, then both the function and the argument would be wrapped in contracts by the mediator as they crossed the language boundary. Therefore, this invocation has been type checked already and requires no further validation.

Is it possible for a typed function to be applied to an untyped value — without contracts?

Bad typed invoke untyped

No — at least one of these values must have contracts from crossing the language boundary.

In the first situation, the typed function has crossed the language boundary into untyped land.

Export typed boundary

Now it’s plausible to apply the operator.

Gradually typed invoke untyped

What kind of contract is attached to the typed function? Checking the input is very similar to the mediator accepting an untyped value — it is wrapped with a contract.

This picture represents an untyped value flowing into a (contracted) typed function. While similar to the mediator, don’t confuse the two.

Import untyped boundary

Now notice the return value of the invocation. A typed function returns a typed value, right? Yes. Why is the output contracted? We need to protect typed values from further interactions with untyped code.

Say we return a typed function.

(fn [n :- Int] :- Int
  (+ 1 n))

The following can happen if the function is left bare in untyped land.

((fn [n :- Int] :- Int
   (+ 1 n))
 1.2)
;=> 2.2

Clearly this is bad. 1.2 is not an Int. Neither is 2.2. Adding contracts preserves the static invariants.

((fn [n :- Int] :- Int
   {:pre [(integer? n)]}
   (+ 1 n))
 1.2)
;=> Contract violation ...

It turns out we can elide the contract on the return value of typed functions if they are flat values, that is a value that can be checked immediately — like integers, strings, or immutable data structures. If the return type is a function like [Int -> Int] or anything that might require a higher-order contract like Any, we must be proactive and add a contract.

The second case where a typed function can be applied to an untyped value is in typed land —

Typed invoke gradually untyped

— where the untyped value was imported into typed land via the mediator.

Import untyped boundary

Remember, typed code can consider an untyped value as typed if the mediator assigns it a contract. So, if the argument is now typed, we require no more runtime checking — the invocation is statically typed.

The final two cases are similar.

Firstly, exporting a typed value can be used by an untyped function.

Untyped invoke gradually typed

Secondly, importing an untyped function wraps it in a function contract, which ensures the plain typed parameter is protected before it enters the untyped function.

Gradually untyped invoke typed

The return value also needs checking on each invocation — we can’t trust an untyped function to do the right thing.

Finally for comparison, here are the interesting combinations.

Gradually typed invoke untyped

Typed invoked gradually untyped

Gradually untyped invoke typed

Untyped invoke gradually untyped


In practice, a gradual typing system must consider much more than what we’ve outlined here.

If you want to support the effort to bring gradual typing to Clojure, please fund the crowdfunding campaign.

19 Jun 2015