Case-insensitive filesystems vs. AOT-compiled Clojure

I recently ran into a problem with ahead-of-time (AOT) compilation of some Clojure code that took me a little while to figure out. Here, I’d like to leave behind some breadcrumbs for anyone else who happens to run into the same problem.

Consider a namespace at ./foo/myns.clj:

(ns foo.myns)
(defn hi [])
(defn HI [])

Let the fun begin. AOT-compile the namespace into the cwd:

[catapult:/tmp] chas% java -Dclojure.compile.path=. -cp clojure-1.2.0-RC3.jar:. clojure.lang.Compile foo.myns
Compiling foo.myns to .
[catapult:/tmp] chas% find foo
foo
foo/myns$hi.class
foo/myns$loading__4403__auto__.class
foo/myns.clj
foo/myns__init.class

Uh-oh.  There are two defined fns, hi and HI, but there’s only one function classfile, ostensibly for hi (note that each Clojure function AOT-compiles down to its own classfile).

[catapult:/tmp] chas% java -cp clojure-1.2.0-RC3.jar:. clojure.main
Clojure 1.2.0-RC3
user=> (require 'foo.myns)
java.lang.NoClassDefFoundError: foo/myns$hi (wrong name: foo/myns$HI) (NO_SOURCE_FILE:0)
user=> foo.myns$hi
java.lang.NoClassDefFoundError: foo/myns$hi (wrong name: foo/myns$HI) (NO_SOURCE_FILE:0)

What’s going on here?

It took me a while (probably a few hours) of false starts before things clicked:

  • My disk is formatted as Mac OS Extended (a.k.a. HFS+), but not the case-sensitive variety. The filesystem is case-preserving, in that it retains case information in filenames (just for display purposes, as far as I can tell), but lookups ignore case.
  • When the compiler writes out the classfiles for each fn in the namespace, the second one it writes (for HI) lands on the same file as the first (foo/myns$hi.class). The filename keeps its original casing, but the contents now define a class with a different name as far as the classloader is concerned.
  • The (require 'foo.myns) invocation uses the classloader to find the namespace’s init class (foo/myns__init.class in the listing above), which goes on to load the classes associated with each of the functions defined in that namespace. (Alternatively, the .clj file is found and loaded if it is newer than the classfiles or if there are no matching classfiles.)
  • The classloader finds a classfile for the foo.myns/hi function, but the class’s internally-defined name (foo/myns$HI) doesn’t match the requested name (foo/myns$hi), so an exception is thrown.

As a sanity check, I mounted a ramdisk, formatted it with case-sensitive HFS+, and voilà, I could AOT-compile myns.clj and require it without a problem.  Filesystem case-sensitivity isn’t usually something one has to worry about with most languages: if you can’t create identical-except-for-case source files on disk, then you are saved from ever compiling to identical-except-for-case class or object files.  However, because of how Clojure maps code to classfiles (remember, one classfile per function, with source files [usually!] designating namespaces), two functions in a single source file can collide on disk, and it’s not until one attempts to load a Clojure namespace from AOT-compiled classfiles that one runs up against any trouble.
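If you want to know up front whether a given output directory sits on a case-insensitive filesystem, a quick probe is possible. The following is only a sketch: the function name case-insensitive-dir? is made up for this post, and it assumes you can create a temporary file in the directory being checked.

```clojure
(import 'java.io.File)

(defn case-insensitive-dir?
  "Hypothetical helper: returns true if `dir` appears to live on a
  case-insensitive filesystem. Creates a temp file whose name contains
  lower-case letters, then asks whether the upper-cased name resolves
  to an existing file as well."
  [dir]
  (let [probe (File/createTempFile "caseprobe" ".tmp" (File. (str dir)))]
    (try
      (let [upper (File. (.getParentFile probe)
                         (.toUpperCase (.getName probe)))]
        (.exists upper))
      (finally
        ;; Clean up the probe file whether or not the check succeeded.
        (.delete probe)))))

;; Usage: check the directory you intend to pass as clojure.compile.path.
(case-insensitive-dir? ".")
```

On a default HFS+ volume like mine this should return true; on the case-sensitive ramdisk it should return false.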

Solutions

The easiest solution is to simply not AOT-compile your Clojure code.  Avoiding AOT carries no runtime performance penalty (though initialization of each namespace will be slower), and source distributables will always be smaller.  Of course, there are many nontechnical and some technical reasons why AOT-compilation is a necessity in various circumstances, so source distributions certainly won’t be right for everyone.

At some point while reading this, you’ve probably said under your breath, “Well, you shouldn’t have function names that differ only in case anyway!”  I agree wholeheartedly, and it must be said that the above situation is highly exceptional.

However, sometimes one isn’t choosing function names by hand; this is often the case when generating code from some dataset.  In my case, I was generating functions for handling PDF tokens, many of which differ only in case (e.g. Tj and TJ).  The general solution in a circumstance like this is for your macro (or other code-generating facility) to mangle the function names being emitted so that they are guaranteed to be unique even when case is ignored.
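As an illustration of one possible mangling scheme (not necessarily the one I ended up using), the sketch below lower-cases each name and appends a string of 1s and 0s recording which characters were originally upper-case. The names case-safe-name and def-token-fns are invented for this example, and the scheme assumes ASCII token names.

```clojure
(require '[clojure.string :as string])

(defn case-safe-name
  "Hypothetical mangler: lower-cases `s` and appends one digit per
  character, 1 where the original character was upper-case and 0
  otherwise. Two ASCII names that differ only in case get different
  suffixes, and the result contains no upper-case letters."
  [s]
  (let [mask (apply str (map #(if (Character/isUpperCase (char %)) 1 0) s))]
    (str (string/lower-case s) "-" mask)))

(defmacro def-token-fns
  "Hypothetical code generator: defines one zero-argument fn per token
  string, named with `case-safe-name`, that returns the token."
  [& tokens]
  `(do ~@(for [t tokens]
           `(defn ~(symbol (case-safe-name t)) [] ~t))))

(case-safe-name "Tj") ; => "tj-10"
(case-safe-name "TJ") ; => "tj-11"
```

With this in place, (def-token-fns "Tj" "TJ") would define tj-10 and tj-11, which compile to distinct classfiles regardless of the filesystem.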

A trickier question is whether the Clojure compiler should do something to prevent this scenario when function names collide accidentally.  Off the top of my head, I can imagine a check at AOT-compile time: if a classfile already exists on disk at the target path but its filename differs in case from the one about to be written, that would be sufficient cause to raise an exception.  That change, or something similar, may yet come to pass; until then, hopefully the above will be helpful.
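In the meantime, a similar check can be approximated in user code before AOT-compiling. The sketch below is not something the compiler does; it works on names rather than on files on disk, and case-collisions and ns-case-collisions are names I made up for it. It groups names by their munged, lower-cased form and reports any group with more than one member.

```clojure
(require '[clojure.string :as string])

(defn case-collisions
  "Returns the groups of `names` (symbols or strings) that would map to
  the same classfile name on a case-insensitive filesystem, i.e. that
  are equal after munging and lower-casing."
  [names]
  (->> names
       (group-by #(string/lower-case (munge (str %))))
       vals
       (filter #(> (count %) 1))))

(defn ns-case-collisions
  "Applies `case-collisions` to every var name interned in the already
  loaded namespace named by `ns-sym`."
  [ns-sym]
  (case-collisions (keys (ns-interns ns-sym))))

(case-collisions '[hi HI bye]) ; => ([hi HI])
```

Calling (ns-case-collisions 'foo.myns) after loading the namespace from source would flag hi and HI; a build script could throw an exception whenever the result is non-empty.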
