Scala isn’t complicated; it’s clever

I’ve been away from Scala for a long while now — a little more than three years.  v2.7.1 was the last rev I used significantly, if memory serves.  I enjoyed my time with it, but it just wasn’t the best fit for me.

Anyway, I’ve learned that sometime in that era, Scala added support for the use of Unicode arrows as optional alternatives to the usual ASCII approximations it’s always had.  For example, this:

for (x <- 1 to 10) { ... }

is equivalent to this:

for (x ← 1 to 10) { ... }

The second example uses a Unicode ← instead of the ASCII <-; other arrows (corresponding to -> and =>) also have Unicode equivalents in Scala (→ and ⇒).  There are various other Unicode characters that have been proposed to serve similar roles, generally with positive discussion around them.  This syntactic flexibility isn’t defined in a library, or in private code, but in the language itself.

I’m aghast.  This blog now sports a “WTF” category.

An early Scala-specific keyboard design.

In the ticket that suggested the addition of one of these equivalencies, various potential problems are pointed out, and presumably ignored.  Perhaps people yearn for APL; unfortunately, suitable keyboards are not widely available.

I’ve no issue with Unicode identifiers: I work with a language that imposes no restrictions at all on the characters used, so programmers who want to use Japanese, Cherokee, Sanskrit or Greek characters in their identifiers can have at it.  Being able to opt in to such things is unequivocally good.  However, that language wisely tries to avoid doing clever things, like aliasing the core anonymous function operator to the λ character for everyone [1].

There have been various discussions, controversy, and gnashing of teeth about whether Scala is a complicated programming language.  I won’t get into that; that’s an argument over semantics that I don’t think is framed properly to begin with.

Maybe, though, most of us can agree that, if Scala isn’t complicated, it is at least clever, and things like this are just the most facile examples of that nature.  We are left to our personal biases as to whether or not that is commendable, desirable, or good.


[1] Discussions like this must mention Fortress, and its duality of language representation: one writes Fortress in ASCII, but the language defines a sort of markup within that character set so as to produce quite elegant renderings of code suitable for publishing.  See an example here.

String Interpolation in Clojure

Update: An updated version of the string interpolation function is now available as part of the core.incubator project, nestled safely in the clojure.core.strint namespace (as of version 0.1.1-SNAPSHOT of core.incubator).  Seriously, use it instead.


It’s strange how some days or weeks have running themes. One theme for me this week programming-wise has been string interpolation:

  • I mentioned it in the #clojure channel on freenode earlier this week (sounds like Rich Hickey isn’t a fan of the concept in general, yet),
  • Miles and I talked about it some in connection with the Clojure templating system he’s been working on (plug: after recording another episode of the Strictly Professional podcast),
  • and just this morning, I noticed a post by Vassil Dichev about how one might implement string interpolation in Scala

I’ve become weary of format of late, and all of the other formats out there aren’t any more pleasant – variadic (and even keyword or named-argument) string replacement is just a dull tool compared to real interpolation.

The Scala implementation post was the last straw for me, especially because (with all due respect to Vassil, as he’s doing very well with the materials he has at his disposal) it showcases so many of the aspects of Scala that I came to dislike in the course of using it for a year or so: the tortured syntax; the rope, nay, the barbed wire that is implicit conversions; the bear trap of traits.

A Clojure Implementation

OK, enough flame-bait. What I’m really here to do is show how easy it is to add string interpolation to Clojure, and how simple its implementation is:

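(ns commons.clojure.strint)

(defn- silent-read
  "Reads one form from the head of s. Returns [form unread-remainder],
   or nil when the head of s is not a readable form."
  [s]
  (try
    (let [r (-> s java.io.StringReader. java.io.PushbackReader.)]
      ;; clojure.core/slurp accepts a Reader as of Clojure 1.2, so
      ;; clojure.contrib.duck-streams/slurp* is no longer needed
      [(read r) (slurp r)])
    (catch Exception e))) ; this indicates an invalid form -- s is just string data

(defn- interpolate
  "Yields a lazy seq of literal strings and read forms."
  ([s atom?]
    (lazy-seq
      ;; skip "~{" for an atom, or just "~" so that "(" starts the form
      (if-let [[form rest] (silent-read (subs s (if atom? 2 1)))]
        (cons form (interpolate (if atom? (subs rest 1) rest))) ; drop the closing "}"
        (cons (subs s 0 2) (interpolate (subs s 2))))))
  ([^String s]
    ;; take the earliest marker; (max ...) would skip a marker that
    ;; precedes one of the other kind
    (if-let [start (->> ["~{" "~("]
                     (map #(.indexOf s ^String %))
                     (remove #(== -1 %))
                     sort
                     first)]
      (lazy-seq (cons
                  (subs s 0 start)
                  (interpolate (subs s start) (= \{ (.charAt s (inc start))))))
      [s])))

(defmacro <<
  [string]
  `(str ~@(interpolate string)))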

Don’t mind the namespace – that’s just where we put extensions to Clojure-the-language. The public macro << (named as an homage to heredocs) takes a single string argument, and emits a str invocation that concatenates the string data and evaluated expressions contained within that argument.
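
To make "emits a str invocation" concrete, here is a hand-written sketch of the forms that two simple << calls should reduce to. These are written out by hand for illustration rather than produced by the macro, and the names simple and computed are made up for this example:

```clojure
;; Hand-expanded equivalents, written for illustration only.
(def n 99)

;; (<< "There's ~{n} bottles of beer on the wall...") should reduce to:
(def simple (str "There's " n " bottles of beer on the wall..."))

;; (<< "There's ~(dec n) bottles of beer on the wall...") should reduce to:
(def computed (str "There's " (dec n) " bottles of beer on the wall..."))

(println simple)
(println computed)
```

The literal chunks stay as strings and the interpolated pieces are ordinary forms, so whatever is in scope at the call site is what gets evaluated.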

Example Usage

First, let’s get a value we can refer to:

commons.clojure.strint=> (def n 99)

You can do simple value replacement:

commons.clojure.strint=> (<< "There's ~{n} bottles of beer on the wall...")
"There's 99 bottles of beer on the wall..."

And evaluate arbitrary code:

commons.clojure.strint=> (<< "There's ~(dec n) bottles of beer on the wall...")
"There's 98 bottles of beer on the wall..."
commons.clojure.strint=> (<< "There's ~(seq (range n 90 -1))
                              bottles of beer on the wall...")
"There's (99 98 97 96 95 94 93 92 91) bottles of beer on the wall..."

You can use any functions or macros you have available in your Clojure environment:

commons.clojure.strint=> (defn- some-function [] {:name "Chas" :zip-code 01060})
#'commons.clojure.strint/some-function
commons.clojure.strint=> (<< "My name is ~(:name (some-function)), it's nice to meet you.")
"My name is Chas, it's nice to meet you."

…including interop with Java methods:

commons.clojure.strint=> (<< "You have approximately ~(.intValue 5.5) minutes left.")
"You have approximately 5 minutes left."

Caveats

First, let’s say what’s wrong with this implementation compared to, say, Ruby’s string interpolation (I may be missing other points, I’m no Ruby hacker):

  1. Strings cannot be used within interpolated expressions; e.g. this will cause a straightforward parse exception:
    commons.clojure.strint=> (<< "~(str n "another string")")
    #<CompilerException java.lang.IllegalArgumentException:
         Wrong number of args passed to: strint$-LT--LT-
    

    The Clojure reader sees this as providing four arguments to the << macro: the string "~(str n ", the symbols another and string, and the string ")". Being able to use strings within interpolated expressions would require a “native” Clojure reader macro for interpolated strings, or the ability to define reader macros in “userspace” (Clojure’s read table cannot be modified in Clojure code right now – this is an intentional design decision).

    Update: pmjordan mentioned on hackernews that you can get around this by escaping the nested strings, like so:

    commons.clojure.strint=> (<< "~(str n \" another string\")")
    "99 another string"
    

    Very true, and very useful in a pinch, but I would definitely consider it to be a wart (and an issue that is insurmountable from Clojure userland right now).

  2. Heredocs aren’t available. That’s a far more general shortcoming compared to other languages, but is still related to string interpolation. This is significantly mitigated by the fact that Clojure strings are multiline already, but it would be nice in some circumstances to be able to specify a block of text using different delimiters for one-off templating, etc.
  3. Lazy sequences need to be made strict in order for them to print as they do at a REPL (thus the additional seq invocation around (range n 90 -1) in the example above).
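
The first and third caveats can be checked at a REPL without the macro at all. The sketch below is only an illustration: it uses read-string to count the forms the reader produces for the unescaped and escaped variants, and it uses map (not range, whose printing behaviour differs in newer Clojure versions) to show what str does with an unrealized lazy sequence. The var names are made up for this example.

```clojure
;; Caveat 1: count the forms the reader produces for each variant.
(def unescaped (read-string "(<< \"~(str n \"another string\")\")"))
(def escaped   (read-string "(<< \"~(str n \\\" another string\\\")\")"))

;; Everything after the << symbol is what the macro would receive.
(println (count (rest unescaped))) ; two strings with two symbols between them
(println (count (rest escaped)))   ; a single string

;; Caveat 3: str on a lazy seq falls back to the default toString,
;; while wrapping it in seq gives something that prints as a list.
(def lazy-nums (map inc [1 2 3]))
(println (.startsWith (str lazy-nums) "clojure.lang.LazySeq@"))
(println (str (seq lazy-nums)))
```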

Advantages

I’m sure a lot of people will look at this implementation and say, “so what?”. Well, it’s got a lot going for it:

  1. Simple implementation. Unless you’ve got a Pavlovian aversion to parentheses (but are somehow immune to piles of braces?), it’s very comprehensible.
  2. It’s user-land code. Many languages would require a compiler extension or modifications to the language core to pull this off.
  3. The interpolation happens at compile-time! The only processing that occurs at runtime is the concatenation of the chunks of each string, but all of the string and expression parsing happens before your code using the << macro would hit a customer’s server or desktop. This is decidedly in contrast with the Scala interpolation implementation, where all of the string parsing is done at runtime; to my knowledge, doing anything else would require a compiler plugin there.
  4. It’s fully composable with all other Clojure code. There’s no restriction on where you can use the << macro, and no restriction on what Clojure (or Java!) code you can include in interpolation expressions.
  5. There’s no magic. Many languages make it very easy to inject magical – as in, opaque – behaviour into your code. The Scala interpolation implementation is no different – to get that special behaviour out of a String, one must call a magical method i in order to rope in the machinery around the InterpolatedString implicit conversion. On the other hand, all of the effects and actors involved in the << macro are local, and its semantics and calling conventions are exactly the same as any other Clojure macro.
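
The compile-time point is easy to see with a much smaller macro. The following is a toy sketch, not the << implementation: join-parts is a hypothetical macro that splits its string argument on "|" while the macro is being expanded, so the code it emits is nothing more than a str call over the already-split pieces.

```clojure
(require '[clojure.string :as string])

;; Hypothetical toy macro for illustration; it is not part of the post's code.
(defmacro join-parts
  [s]
  ;; The split runs during macroexpansion, on the literal string...
  (let [parts (string/split s #"\|")]
    ;; ...so the emitted form only concatenates the finished pieces.
    `(str ~@parts)))

;; Inspect the emitted form without evaluating it.
(println (macroexpand-1 '(join-parts "a|b|c")))

;; At runtime only the str call remains.
(println (join-parts "a|b|c"))
```

The << macro follows the same shape: the parsing lives in the macro body, and only the concatenation survives into the compiled code.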

Exhale…

So, hopefully that puts string interpolation behind me. I’d love to see something like this become a reader macro in Clojure someday (maybe in conjunction with heredoc support), but in the meantime, this will make a lot of one-off templating jobs a whole lot easier in Clojure compared to using the usual variadic string replacement methods that are otherwise available.

…recommended by 4 out of 5 surveyed seasoned programmers…

In a thread on the Google Group dedicated to discussing languages hosted on the JVM (e.g. Scala, Groovy, JRuby, et al.), a fellow named Jon Harrop asked whether something like F# (an OCaml / Standard ML derivative that targets the .NET CLR) would find any traction if it were made available for the JVM. Well, some unremarkable discussion ensued about the costs associated with developing languages, how existing efforts attract funding, etc., and then things turned towards the question of “Why not just use Scala?”, since Scala does fold in a lot of functional programming primitives.

Mr. Harrop’s replies centered on various aspects of ML-style languages that he misses in Scala, and aspects of Scala that he finds irritating. All fine and good — hey, everyone has their own preferences — until he unveiled this nugget (emphasis mine):

OCaml and F# have shown that ML’s approach to structured programming using modules, variant types and pattern matching and extensive type inference is almost always preferable to OOP. When given the choice between OOP and FP, seasoned programmers rarely choose OOP.

Zealotry isn’t anything new — you can probably find inverse statements right now in some Smalltalk newsgroups, or someone agitating about the uniform superiority of s-expressions in a Lisp or Scheme forum. The odd thing about this is that Mr. Harrop is not exactly a random troll — he seems fairly well-respected in the F#/OCaml/ML community, is a prolific writer, and looks to be writing a book on F# for Microsoft Press.

Stuff like this makes the whole facade about software development being akin to engineering even more farcical than one might initially imagine. Can we please recognize that there is a difference between spirited advocacy and demagoguery? I’ve certainly been guilty of the latter on occasion (usually much to my later regret), but it’s particularly irksome to find those that are apparently unaware of the distinction at all.

Scala Makes Me Think

(…or, “Oh, Dear, Wasn’t I Thinking Before?”)

As my friends will attest, I really enjoy programming languages. I’m one of those language fetishists that talk about “expressiveness” and “concision”, and yes, I’m one of those very strange fellows who blurt out bad Lisp jokes while getting odd looks from innocent bystanders. And while my bread and butter is built in Java, I often find myself yearning for a more expressive language while deploying, customizing, or integrating PDFTextStream (there I go again with the “expressiveness” bit). That yearning can reach almost pathological extremes at times, prompting me to go so far as to sponsor projects that make it possible to use Java libraries (including PDFTextStream) from within Python.

Fortunately, things don’t always have to be so hard. Case in point, I recently dove head-first into Scala, a language that combines object orientation and functional programming into one very tasty stew. Scala has a number of characteristics that make it interesting aside from its merging of OO and FP mechanisms:

  • it is statically-typed, and provides moderately good type inference that enables one to skip most type declarations and annotations
  • it is compiled, which provides a minimum level of performance (sure, it’s actually byte-compiled, but let’s not quibble right now)
  • and the real kicker: it compiles down to Java class files (or .NET IL), thereby enabling it to be hosted on a JVM (or .NET’s CLR), and call (and be called by) other Java (or .NET) libraries

There’s a lot to like here, for programmers from many walks of life, and I could go on and on about how Scala has single-handedly created and filled a great niche of delivering most of the raw power of purely functional languages like Haskell and ML within a JVM-hosted environment with respectable performance. But what has really impressed me has been the way that Scala has improved how I work. In short, it’s made me really think about development again.

I generally have two working styles. In a classic statically-typed environment (say, Java or C#), I tend to generate pretty clean designs, but my level of productivity is very low. I attribute both of these characteristics to the copious amount of actual work (i.e. finger-typing) that has to go into writing Java or C# code, even with the best of tools. See, while I’m typing (and typing, and typing), I’m thinking two, three, four steps ahead, figuring out the design of the next chunk of code. The verbosity of the language gives me time to reason about the next step while my fingers are working off the previous results.

In a dynamically-typed environment (say, Python or Scheme), I tend to be extraordinarily productive, but unless I consciously step back and purposefully engage in design, the code I write is much more complex. In such environments, there’s less finger-typing going on, so I don’t have a natural backlog allowing me to think about the code before it’s already on the screen. Further, I know I can get from point A to point B relatively easily in many circumstances, so I end up skipping the design step, switching into Cowboy Coder mode, and hacking at things until everything works. Oddly enough, in certain circles, this isn’t so much frowned upon as it is recommended.

Scala is statically-typed, so the naive observer might speculate that my working style in Scala would be much the same as in Java. However, I’ve found that working with Scala has prompted (forced?) me to consciously step back and think about everything, at every step along the way: class hierarchies, type relationships in general, testing strategies, eliminating state where possible…the amount of actual thinking I’ve done while working with Scala has far outstripped the amount of reasoning that typically goes into any similar period of coding. Unsurprisingly, this has led to quite the spike in code quality, which translates into productivity through fewer bugs and less rework.

I attribute this to the strong, static typing that Scala enforces, combined with the type inference that Scala provides. The former forces me to reason about what I’m doing (as it does in Java, for instance), but because the latter eliminates so much of the finger-typing associated with static typing in other environments, I’m given the opportunity to realize that a concrete design phase would yield tremendous benefits, regardless of the scope of code in question. I suspect I would find working in Haskell or ML to be a similar experience, but because those languages don’t easily interoperate with the libraries I need to do my work, I’ve never really given them a chance.

Thankfully, I don’t think I’ll have to. Scala is a great environment, and even more important than its technical merits, its design has led me to engage in a more thoughtful, more conscious development process.