String Interpolation in Clojure

Update: An updated version of the string interpolation function is now available as part of the core.incubator project, nestled safely in the clojure.core.strint namespace (as of version 0.1.1-SNAPSHOT of core.incubator).  Seriously, use it instead.


It’s strange how some days or weeks have running themes. One theme for me this week programming-wise has been string interpolation:

  • I mentioned it in the #clojure channel on freenode earlier this week (sounds like Rich Hickey isn’t a fan of the concept in general, yet),
  • Miles and I talked about it some in connection with the Clojure templating system he’s been working on (plug: after recording another episode of the Strictly Professional podcast),
  • and just this morning, I noticed a post by Vassil Dichev about how one might implement string interpolation in Scala.

I’ve become weary of format of late, and all of the other formats out there aren’t any more pleasant – variadic (and even keyword or named-argument) string replacement is just a dull tool compared to real interpolation.

The Scala implementation post was the last straw for me, especially because (with all due respect to Vassil, as he’s doing very well with the materials he has at his disposal) it showcases so many of the aspects of Scala that I came to dislike in the course of using it for a year or so: the tortured syntax; the rope, nay, the barbed wire that is implicit conversions; the bear trap of traits.

A Clojure Implementation

OK, enough flame-bait. What I’m really here to do is show how easy it is to add string interpolation to Clojure, and how simple its implementation is:

(ns commons.clojure.strint
 (:use [clojure.contrib.duck-streams :only (slurp*)]))

(defn- silent-read
  [s]
  (try
    (let [r (-> s java.io.StringReader. java.io.PushbackReader.)]
      [(read r) (slurp* r)])
    (catch Exception e))) ; this indicates an invalid form -- s is just string data

(defn- interpolate
  ([s atom?]
    (lazy-seq
      (if-let [[form rest] (silent-read (subs s (if atom? 2 1)))]
        (cons form (interpolate (if atom? (subs rest 1) rest)))
        (cons (subs s 0 2) (interpolate (subs s 2))))))
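  ([#^String s]
    ;; Use the earliest marker in s; taking the max of the two indexes would
    ;; skip over a ~{...} that appears before a ~(...), and vice versa.
    (let [starts (remove neg? [(.indexOf s "~{") (.indexOf s "~(")])]
      (if (empty? starts)
        [s]
        (let [start (apply min starts)]
          (lazy-seq (cons
                      (subs s 0 start)
                      (interpolate (subs s start) (= \{ (.charAt s (inc start)))))))))))))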

(defmacro <<
  [string]
  `(str ~@(interpolate string)))

Don’t mind the namespace – that’s just where we put extensions to Clojure-the-language. The public macro << (named as an homage to heredocs) takes a single string argument, and emits a str invocation that concatenates the string data and evaluated expressions contained within that argument.
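Two pieces of the code above may be worth seeing in isolation. The sketch below assumes a more recent Clojure in which clojure.core/slurp accepts an open reader (so slurp* from clojure.contrib is not needed). The name read-head is made up for illustration; it mirrors what silent-read does but is not part of the code above. The last form is a hand-written guess at the shape of the str call that << is described as emitting, not output captured from the macro.

```clojure
(defn read-head
  "Reads one Clojure form from the start of s. Returns [form remainder],
   or nil when the head of s is not a readable form.
   Hypothetical helper for illustration only."
  [s]
  (try
    (let [r (java.io.PushbackReader. (java.io.StringReader. s))]
      ;; read consumes a single form; slurp drains whatever the reader has left
      [(read r) (slurp r)])
    (catch Exception _ nil)))

;; A list form followed by plain text:
(def list-result (read-head "(dec n) bottles"))

;; A bare symbol stops at the closing brace, which stays in the remainder:
(def symbol-result (read-head "n} bottles"))

;; An unbalanced head cannot be read, so a caller would treat it as string data:
(def bad-result (read-head ") not a form"))

;; Expected shape of an expansion: literal chunks and read forms, in order,
;; handed to str. Written by hand for "There's ~{n} bottles of beer on the wall...".
(def n 99)
(def by-hand (str "There's " n " bottles of beer on the wall..."))
```

Returning the remainder alongside the form is what lets interpolate carry on scanning the rest of the string for further markers.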

Example Usage

First, let’s get a value we can refer to:

commons.clojure.strint=> (def n 99)

You can do simple value replacement:

commons.clojure.strint=> (<< "There's ~{n} bottles of beer on the wall...")
"There's 99 bottles of beer on the wall..."

And evaluate arbitrary code:

commons.clojure.strint=> (<< "There's ~(dec n) bottles of beer on the wall...")
"There's 98 bottles of beer on the wall..."
commons.clojure.strint=> (<< "There's ~(seq (range n 90 -1)) bottles of beer on the wall...")
"There's (99 98 97 96 95 94 93 92 91) bottles of beer on the wall..."

You can use any functions or macros you have available in your Clojure environment:

commons.clojure.strint=> (defn- some-function [] {:name "Chas" :zip-code "01060"})
#'commons.clojure.strint/some-function
commons.clojure.strint=> (<< "My name is ~(:name (some-function)), it's nice to meet you.")
"My name is Chas, it's nice to meet you."

…including interop with Java methods:

commons.clojure.strint=> (<< "You have approximately ~(.intValue 5.5) minutes left.")
"You have approximately 5 minutes left."

Caveats

First, let’s say what’s wrong with this implementation compared to, say, Ruby’s string interpolation (I may be missing other points, I’m no Ruby hacker):

  1. Strings cannot be used within interpolated expressions; e.g. this will cause a straightforward parse exception:
    commons.clojure.strint=> (<< "~(str n "another string")")
    #<CompilerException java.lang.IllegalArgumentException:
         Wrong number of args passed to: strint$-LT--LT-
    

    The Clojure reader sees this as providing four arguments to the << macro: the string "~(str n ", the symbols another and string, and the string ")". Being able to use strings within interpolated expressions would require a “native” Clojure reader macro for interpolated strings, or the ability to define reader macros in “userspace” (Clojure’s read table cannot currently be modified from Clojure code; this is an intentional design decision).

    Update: pmjordan mentioned on hackernews that you can get around this by escaping the nested strings, like so:

    commons.clojure.strint=> (<< "~(str n \" another string\")")
    "99 another string"
    

    Very true, and very useful in a pinch, but I would definitely consider it to be a wart (and an issue that is insurmountable from Clojure userland right now).

  2. Heredocs aren’t available. That’s a far more general shortcoming compared to other languages, but is still related to string interpolation. This is significantly mitigated by the fact that Clojure strings are multiline already, but it would be nice in some circumstances to be able to specify a block of text using different delimiters for one-off templating, etc.
  3. Lazy sequences need to be made strict in order for them to print as they do at a REPL (thus the additional seq invocation around (range n 90 -1) in the example above).
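
The first and third caveats are easy to poke at in isolation. The sketch below uses only clojure.core, and the var names are invented for the example. A map call stands in for the range used earlier, on the assumption that newer Clojure versions may return a range object that is not a lazy seq.

```clojure
;; Caveat 1: read the unescaped call as text to see how the reader splits it.
(def unescaped-call "(<< \"~(str n \"another string\")\")")
(def call-form (read-string unescaped-call))
;; Everything after the << symbol is an argument, and there is more than one.
(def arg-count (dec (count call-form)))

;; Caveat 3: str on a lazy seq typically does not show its elements, so force
;; it with seq (or print it with pr-str) before concatenating.
(def lazy-numbers (map inc [1 2 3]))
(def strict-text (str (seq lazy-numbers)))
(def printed-text (pr-str lazy-numbers))
```

Either seq or pr-str gives the familiar parenthesised form; seq keeps the interpolated expression shorter, which is presumably why it appears in the example above.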

Advantages

I’m sure a lot of people will look at this implementation and say, “so what?”. Well, it’s got a lot going for it:

  1. Simple implementation. Unless you’ve got a Pavlovian aversion to parentheses (but are somehow immune to piles of braces?), it’s very comprehensible.
  2. It’s user-land code. Many languages would require a compiler extension or modifications to the language core to pull this off.
  3. The interpolation happens at compile-time! The only processing that occurs at runtime is the concatenation of the chunks of each string, but all of the string and expression parsing happens before your code using the << macro would hit a customer’s server or desktop. This is decidedly in contrast with the Scala interpolation implementation, where all of the string parsing is done at runtime; to my knowledge, doing anything else would require a compiler plugin there.
  4. It’s fully composable with all other Clojure code. There’s no restriction on where you can use the << macro, and no restriction on what Clojure (or Java!) code you can include in interpolation expressions.
  5. There’s no magic. Many languages make it very easy to inject magical – as in, opaque – behaviour into your code. The Scala interpolation implementation is no different – to get that special behaviour out of a String, one must call a magical method i in order to rope in the machinery around the InterpolatedString implicit conversion. On the other hand, all of the effects and actors involved in the << macro are local, and its semantics and calling conventions are exactly the same as any other Clojure macro.
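
To make the compile-time point concrete, here is a deliberately simplified stand-in for <<. It is not the implementation shown earlier: it only understands ~{name} markers and splits the template with a regular expression. The names split-template and simple<< are hypothetical. Because the splitting runs inside the macro, macroexpand-1 should already show a plain str call with no template left in it.

```clojure
(require '[clojure.string :as string])

(defn split-template
  "Returns the literal chunks of s interleaved with a symbol for each
   ~{name} marker. Runs at macroexpansion time when called from simple<<."
  [s]
  (let [marker   #"~\{([^}]*)\}"
        ;; limit -1 keeps a trailing empty chunk, so there is always
        ;; exactly one more literal than there are markers
        literals (string/split s marker -1)
        names    (map (comp symbol second) (re-seq marker s))]
    (concat (interleave literals names) [(last literals)])))

(defmacro simple<<
  "Simplified interpolation macro (assumption: ~{name} markers only)."
  [template]
  `(str ~@(split-template template)))

;; The expansion contains only literal strings and symbols:
(def expansion (macroexpand-1 '(simple<< "n is ~{n}!")))

;; At runtime, all that is left to do is the concatenation:
(def result (let [n 99] (simple<< "n is ~{n}!")))
```

The same macroexpand-1 check can be pointed at the real << macro to confirm that no string parsing is left for runtime.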

Exhale…

So, hopefully that puts string interpolation behind me. I’d love to see something like this become a reader macro in Clojure someday (maybe in conjunction with heredoc support), but in the meantime, this will make a lot of one-off templating jobs a whole lot easier in Clojure compared to using the usual variadic string replacement methods that are otherwise available.


4 Responses to String Interpolation in Clojure

  1. Vagif says:

    It may be ok as a learning exercise. But for everyday usage I would suggest the StringTemplate library. It plays very nicely with Clojure maps:

    (ns com.yourbusiness.templates
      (:require [clojure.contrib.str-utils2 :as s])
      (:import [org.antlr.stringtemplate StringTemplate]))
    
    (defn template
      [#^String txt #^java.util.Map context]
      (let [t (StringTemplate. txt)]
        (.setAttributes t context)
        (.toString t)))
    
    ;; Usage:
    (template "Hello $user$. Today is $date$" {"user" "Joe" "date" todays-date}) 
    
    • Chas Emerick says:

      StringTemplate is a great library, no doubt about it. However, there’s a lot to be said for having the template parsing happen at compile-time, being able to include arbitrary expressions (rather than explicitly lining them up as arguments, as with Clojure’s existing format function), and not having to depend on an additional library (i.e. you can either grab the code above and drop it into your project…and hopefully it or some variation of it will end up being included in Clojure itself, or maybe the contrib library(ies)).

  2. francoisdevlin says:

    cemerick, I’ve hacked both Ruby & Clojure, so I can comment on the way Ruby handles strings.

    Ruby has a concept of a ‘string’, as opposed to a “string”. ‘This is strict input, no escapes’ “This is input with escapes”

    I think that the suggested \” method is the best solution in Clojure.

  3. Pingback: Specifying default slot values for defrecord classes in Clojure | cemerick
