Clojure’s binding form is amazingly useful, but as with any very long length of rope, you can hang yourself with it in a cinch. So, let’s review a couple of traps that I’ve personally fallen into while using binding, and that you should be aware of.
This is super-simple, and it’s the first thing one learns upon encountering binding, but you can still get bitten by sloppily assuming that an established binding will migrate to another thread, or by not understanding the concurrency semantics of a function you’re calling within your binding form. Consider:
user=> (def *foo* 5)
#'user/*foo*
user=> (defn adder [param]
         (+ *foo* param))
#'user/adder
user=> (binding [*foo* 10]
         (doseq [v (pmap adder (repeat 3 5))]
           (println v)))
10
10
10
nil
So, we have a var *foo* holding a default value, and a function adder that just adds its argument to the current thread-local value of *foo*, returning the result. This is obviously just illustrative; you can assume that adder is a function call into an opaque library you’re using that takes some arguments and perhaps pulls some configuration or other data from the values bound into some var it specifies as being part of its API.
The problem here is that adder is being invoked in threads other than the thread that is establishing the binding on *foo*; therefore, the value of *foo* that adder sees is always the default, 5, and each call returns 10 rather than 15.
The lesson? Bindings do not migrate across thread boundaries. One of the great things about Clojure is that you can “do concurrency” using a variety of easy-to-use primitives (e.g. pmap is absolutely the cat’s nuts, in that it’s a dead-simple way to almost-transparently parallelize computation over a dataset). The ironic downside is that, whereas thread boundaries are painfully obvious in other languages because of all the ceremony one needs to go through to get results, things like pmap have so little ceremony that it’s easy to forget the basics.
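To make the thread boundary visible without any help from the concurrency primitives, here is a minimal sketch that starts a plain java.lang.Thread by hand. It assumes Clojure 1.3 or later, where a var has to be declared ^:dynamic before binding will accept it, and run-in-thread is a hypothetical helper written for this illustration, not a library function. A raw thread is used on purpose: newer Clojure releases convey the caller's bindings into some built-in helpers such as future, so the pmap transcript above may print different values there.

```clojure
;; Assumption: Clojure 1.3+, so the var must be marked ^:dynamic.
(def ^:dynamic *foo* 5)

(defn adder [param]
  (+ *foo* param))

;; Hypothetical helper: runs f on a brand-new thread and returns its result.
(defn run-in-thread [f]
  (let [result (atom nil)
        task   (fn [] (reset! result (f)))
        t      (Thread. ^Runnable task)]
    (.start t)
    (.join t)          ; wait for the thread so the result is available
    @result))

;; The binding is established on the calling thread only; the new thread
;; has no thread-local binding for *foo* and falls back to the root value 5.
(def plain-result
  (binding [*foo* 10]
    (run-in-thread #(adder 5))))

;; Same call made directly on the binding thread, for comparison.
(def same-thread-result
  (binding [*foo* 10]
    (adder 5)))
```

Here plain-result is 10 (5 + 5) while same-thread-result is 15 (10 + 5), which is the whole trap in two numbers.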
One solution to the problem illustrated above would be to replace adder with a function, make-adder, that explicitly captures the bound value of *foo* and returns a new function that does the adding using that captured value:
user=> (defn make-adder []
         (let [foo-value *foo*]
           #(+ foo-value %)))
#'user/make-adder
user=> (binding [*foo* 10]
         (doseq [v (pmap (make-adder) (repeat 3 5))]
           (println v)))
15
15
15
nil
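The same capture idea can be checked against a hand-started thread, where nothing is copied over for you. The sketch below assumes Clojure 1.3 or later (hence ^:dynamic) and again uses a hypothetical run-in-thread helper. It also shows bound-fn from clojure.core as an alternative when you cannot change the function being called: it wraps a function so that the bindings in effect at creation time are reinstalled whenever it runs.

```clojure
;; Assumption: Clojure 1.3+, so the var must be marked ^:dynamic.
(def ^:dynamic *foo* 5)

;; Captures the value of *foo* at call time and closes over it.
(defn make-adder []
  (let [foo-value *foo*]
    #(+ foo-value %)))

;; Reads *foo* at the moment it is invoked, on whatever thread that is.
(defn adder [param]
  (+ *foo* param))

;; Hypothetical helper: runs f on a brand-new thread and returns its result.
(defn run-in-thread [f]
  (let [result (atom nil)
        task   (fn [] (reset! result (f)))
        t      (Thread. ^Runnable task)]
    (.start t)
    (.join t)
    @result))

;; Option 1: capture the value in a closure before crossing the boundary.
(def closure-result
  (binding [*foo* 10]
    (let [add (make-adder)]
      (run-in-thread #(add 5)))))

;; Option 2: leave adder alone and carry the bindings along with bound-fn.
(def bound-fn-result
  (binding [*foo* 10]
    (run-in-thread (bound-fn [] (adder 5)))))
```

Both closure-result and bound-fn-result come out as 15. Capturing the value is the lighter option when you own the function; bound-fn is handy when the var is read deep inside code you do not control.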
Parenthetically, it’s very much worth noting that all of the wonderful ref/transaction machinery in Clojure is implemented using thread-local bindings. That means that if you try to pmap a function across some set of refs in the course of a transaction (or otherwise attempt to poke at refs in a concurrent environment), things will go very wrong for you. There are ways around this, but they (last I checked) involve manually copying the thread-local bindings associated with any running transaction across thread boundaries – in general, it’s not worth the hassle.
Lazy seqs often escape the scope of binding forms, so capture the value of any bound vars you care about explicitly
As wonderful as lazy sequences are, how and when they dereference bound vars isn’t always obvious, and is entirely dependent upon how and when those lazy sequences are used/materialized. Consider, assuming *foo* is bound to 5 by default as in our first example:
user=> (defn some-fn []
         (lazy-seq [*foo*]))
#'user/some-fn
user=> (binding [*foo* 10]
         (some-fn))
(5)
What’s going on here? The lazy-seq macro returns a lazy sequence, which will evaluate the sequence-producing form provided to it on demand. In this case, that happens when the REPL prints the result, after the binding form has returned, by which point *foo* has reverted to its default value.
This may become clearer with this example:
user=> (binding [*foo* 10]
         (doall (some-fn)))
(10)
doall forces the full evaluation of a lazy sequence, and in this case, because that evaluation is being performed within the scope of the binding form, the sequence sees the bound value of *foo* and the returned sequence is found to have the value we expect.
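The two transcripts can be put side by side without relying on when the REPL happens to print. This is a minimal sketch assuming Clojure 1.3 or later (so *foo* is declared ^:dynamic); the names escaped, forced and realized-inside? are made up for the illustration.

```clojure
;; Assumption: Clojure 1.3+, so the var must be marked ^:dynamic.
(def ^:dynamic *foo* 5)

(defn some-fn []
  (lazy-seq [*foo*]))

;; The lazy seq leaves the binding form without its body having run.
(def escaped
  (binding [*foo* 10]
    (some-fn)))

;; doall walks the seq while the binding is still in effect.
(def forced
  (binding [*foo* 10]
    (doall (some-fn))))

;; Confirms that simply calling some-fn inside binding evaluates nothing yet.
(def realized-inside?
  (binding [*foo* 10]
    (realized? (some-fn))))

;; The body of `escaped` runs here, on first access, outside any binding.
(def escaped-value (first escaped))
(def forced-value  (first forced))
```

escaped-value is 5 and forced-value is 10, and realized-inside? is false: the body of lazy-seq only runs when something first asks for an element.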
These are obviously simplistic examples; the real-world scenario that this applies to is where you might be writing a library, and part of that library’s public API is some number of bindable vars that callers can use to configure the behaviour of the library’s functions, etc. This is super-useful, especially for libraries where there are a ton of knobs and levers: rather than forcing callers to provide a configuration object on every function call (and therefore forcing you to thread that configuration through all helper functions, etc.), using bindings for such things allows callers to change only the defaults they care about, and allows you to code the implementation of the library in a straightforward way.
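As a rough picture of what such a library might look like, here is a small sketch assuming Clojure 1.3 or later. Everything in it is hypothetical: *separator*, *uppercase?*, render-item and render are invented names, not part of any real library. Note that render fully consumes its sequence before returning, so it reads the vars while the caller's binding is still in effect.

```clojure
(require '[clojure.string :as str])

;; Hypothetical configuration vars forming part of the library's public API.
(def ^:dynamic *separator* ", ")
(def ^:dynamic *uppercase?* false)

;; Helper reads the configuration directly; nothing is threaded through.
(defn- render-item [item]
  (let [s (str item)]
    (if *uppercase?* (str/upper-case s) s)))

;; str/join consumes the mapped seq eagerly, inside the caller's binding.
(defn render [items]
  (str/join *separator* (map render-item items)))

;; Callers override only the knobs they care about.
(def default-output (render ["a" "b"]))

(def custom-separator-output
  (binding [*separator* " | "]
    (render ["a" "b"])))

(def uppercase-output
  (binding [*uppercase?* true]
    (render ["a" "b"])))
```

The three results are "a, b", "a | b" and "A, B". If render returned the lazy result of map instead of joining it, the traps described in this section would apply to it as well.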
The lesson? If you are going to use bound values of vars, you need to make sure you capture those bindings before returning any lazy seqs that use those bound values. Aside from using doall as shown above (which defeats the point of using lazy seqs), the solution looks a lot like the make-adder function from the first section (notice a trend?):
user=> (defn some-fn []
         (let [foo-val *foo*]
           (lazy-seq [foo-val])))
#'user/some-fn
user=> (binding [*foo* 10]
         (some-fn))
(10)
some-fn is now explicitly capturing the bound value of the *foo* var; this ensures that, regardless of when, where, or on which thread the lazy seq is materialized, the values it contains are those that were bound by the caller of some-fn. This is almost always what you want to have happen.
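The same pattern shows up with everyday sequence functions, since map is lazy too. The sketch below assumes Clojure 1.3 or later; scale-lazy and scale-captured are hypothetical names used only to contrast the two styles.

```clojure
;; Assumption: Clojure 1.3+, so the var must be marked ^:dynamic.
(def ^:dynamic *foo* 5)

;; Reads *foo* whenever an element is realized, which may be long after
;; the caller's binding form has returned.
(defn scale-lazy [xs]
  (map #(* *foo* %) xs))

;; Captures *foo* once, up front, while the caller's binding is in effect.
(defn scale-captured [xs]
  (let [factor *foo*]
    (map #(* factor %) xs)))

;; Neither seq is realized inside the binding form.
(def leaky (binding [*foo* 10] (scale-lazy [1 2 3])))
(def safe  (binding [*foo* 10] (scale-captured [1 2 3])))

;; Realization happens here, outside any binding.
(def leaky-values (vec leaky))
(def safe-values  (vec safe))
```

leaky-values is [5 10 15] because the multiplication ran against the root value, while safe-values is [10 20 30], the result the caller asked for.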
Too many do not fully realize the degree of flexibility that vars and binding provide to the capable programmer. As is often the case though, power comes with responsibility, and whether one is writing libraries, using them, or casually using binding in localized ways in application code, it needs to be handled with care.