Continuous Deployment of Clojure Web Applications: Clojure Conj slides and an epic screencast

It was a privilege and honor to give a talk at last month’s (first (clojure-conj)) on a topic that is often a pain point for many Clojure developers and teams, the continuous deployment of Clojure web applications.

Based on discussions I’ve had with Clojure developers over the past year or two, it seems that many (if not most) find the domain of application deployment to be incredibly complicated (it is), populated with too many potential “solutions” to choose from (there are), of which few seem to provide gentle on-ramps for new users (it’s true).  My talk was probably the least technical and most high-level of the conference, but based on the feedback I’ve heard so far (crazy selection bias in play!), many people seem to have come away from it with a better grasp on the problem overall, and perhaps some reasonably concrete potential paths to sanity.

Here are the slides from that talk (download/view as PDF):

(Special credit goes to Brian Carper, who was a good enough sport to allow me to call out his clever/scary app deployment architecture based on emacs, swank, screen, apache, mod_proxy, and jetty as the ultimate example of what not to do. ;-))

One of those paths is a Clojure-based toolchain, provided primarily by pallet and jclouds (which I’ve mentioned before).  If you’re already a Clojure programmer, it’s a very easy option to slip into compared to, say, Chef or Puppet. I provided an outline of using pallet and jclouds in my high-level talk, but I wanted to provide an addendum to that with a very detailed, step-by-step introduction to them, along with how one can “close the loop” on continuous deployment using the Hudson continuous integration server.

I figured I’d knock out a short-ish screencast, but quickly discovered that there was way more material than I anticipated…and ended up with a 50-minute screencast that provides a bunch of background on pallet and jclouds, walks through a basic web application build and deployment process, and sets up Hudson from scratch to drive the whole thing.  In hindsight, the whole thing may be far too long (and maybe too introductory?) for most people’s tastes; in any case, I do hope some find it helpful.

Please heed the disclaimer included in the sample project:

…what is contained herein and discussed in the screencast is by no means an ideal web application deployment process.  It is only meant to be an easy-to-understand first step into using pallet, jclouds, and Hudson. You are strongly encouraged to learn about these tools and customize (or likely replace) this basic approach for your own needs.

Without any further ado, here’s the sample project repository:

and the accompanying screencast (or watch in HD on Vimeo):

As an aside, I’d be remiss if I didn’t mention the excellent introduction to pallet that Nate Young posted just yesterday.

Finally, here are some links to additional resources you may find helpful if you do decide to start using jclouds and pallet for your application deployments:

  • The main pages for pallet and jclouds
  • Pallet’s formal documentation; insofar as typical usage of jclouds is via pallet (for us lucky Clojure developers, anyway), what’s here covers nearly all you need to know to get started.
  • Take a look at the pallet-examples repository for more sample projects.
  • Hugo Duncan (the lead of the pallet project) blogs about pallet from time to time.
  • When you do need to stray outside of pallet proper to use jclouds- (and cloud-) specific APIs, you’ll find some good Clojure wrappers waiting for you.  These include the blobstore API wrapper, as well as wrappers for Amazon’s EBS and Elastic IP service (disclosure: I wrote the latter two). It doesn’t look like autodoc for these wrappers is published, unfortunately; thankfully, the relevant namespaces (org.jclouds.blobstore,, and org.jclouds.elastic-ip) are well-documented. ns-publics and doc are your friends here.
  • One will inevitably start using jclouds’ APIs directly; its javadocs are helpful, though there’s a ton of stuff there that you fundamentally don’t need to worry about as a user. NodeMetadata is the most important common class (the class of the object in pallet’s request map in the :target-node slot).
  • The #jclouds and #pallet channels on freenode are always populated with helpful people. Don’t be afraid to ask. :-)

Provisioning, administration, and deployment of CouchDB, Java, Tomcat, etc., made easy with Pallet

Note: there may still be relevant bits in here, but usage of Pallet and jclouds has changed since this was first published.  See this post for links to an up-to-date, comprehensive example project, a screencast, and other goodies.

As I briefly mentioned in my last post, I’ve been working with Pallet to enable automated administration of, among other things, CouchDB. If you’re wondering why I’m using Pallet instead of, say, Puppet or Chef, you can read the “Why Write Another Tool?” section in Hugo Duncan’s recent post on Pallet. My answer to that question is that I wanted a tool that would provide automated:

  • Provisioning,
  • Administration & configuration, and
  • Application deployment

…all in one piece of kit that would neatly interoperate with the rest of our development stack (JVM, Clojure, Maven, Hudson, etc., etc). Pallet is the only option I found that threads that needle.

From bare metal to ready-for-production app deployment in 5 minutes or 5 paragraphs…

Using Pallet, we can automate everything necessary to provision and configure the resources needed to run our application. The following code defines, spins up, and configures an EC2 node; the steps listed below correspond almost exactly with each line of the defnode configuration that forms the majority of the code:

  1. Use a specific Ubuntu AMI on a particular instance size
  2. Use a standard firewall / security group configuration
  3. Configure an “admin user” with a specific username that has only one authorized key (mine).
  4. Tweak apt so that it’s “sane”. <snark>I like being able to install useful software, so multiverse it is.</snark>
  5. Install the Sun JDK
  6. Install the Tomcat application server
  7. Install CouchDB and set two properties in its local.ini file (one to disable the javascript view server reduce limit – don’t ape that if you don’t know what you’re doing – and one to change its default storage location to a different directory).
  8. Create the aforementioned CouchDB storage directory.
  9. Deploy our application as the ROOT application in tomcat and restart it (I’ve omitted the part that sets security policy in the same block, which is what actually necessitates the app server restart).

(I’ve simplified certain things in this rendition, but what I’ve elided are details that are pretty esoteric and/or miscellaneous – i.e. installing unlimited-strength crypto policy files in the installed JDK, setting VM parameters for Tomcat, etc.)

(defn- sane-package-manager []
  (pallet.resource.package/package-manager :universe)
  (pallet.resource.package/package-manager :multiverse)
  (pallet.resource.package/package-manager :update))

(pallet.core/defnode master
  [:ubuntu :X86_32 :size-id "m1.small"
   :image-id "ami-bb709dd2"
   :inbound-ports [22 80 443]]
  :bootstrap [(pallet.crate.admin/automated-admin-user +admin-username+)
  :configure [( :sun)
                [:query_server_config :reduce_limit] "false"
                [:couchdb :database_dir] +couchdb-root+)
              ( +couchdb-root+
                :owner "couchdb:couchdb" :mode 600)]
  :deploy [(pallet.resource.service/with-restart "tomcat*"
             (pallet.crate.tomcat/deploy-local-file "/path/to/my/warfile.war" "ROOT"))])

(def service (jcompute/compute-service "ec2" "AWS_ID" "AWS_SECRET_KEY" :ssh :log4j))

(pallet.core/with-admin-user [+admin-username+]
  (jcompute/with-compute-service [service]
    (pallet.core/converge {master 1} :configure :deploy)))

(Note that jcompute is an alias for the compute namespace provided by the excellent jclouds library, which Pallet uses for cloud-agnostic infrastructure provisioning as well as cloud-specific stuff, like EBS volume and snapshot management, elastic IP management, etc.)

Want to spin up 10 nodes instead of one? Change {master 1} to {master 10}. Other changes are similarly straightforward. Want to deploy an application update to existing nodes instead of creating new nodes? Instead of using converge, execute (pallet.core/lift master :deploy).

There’s obviously a lot going on behind the scenes, but this is what the day-to-day configuration and usage of Pallet looks like. Using it means that I never have to use a command line or fiddly manual AWS tooling like their console or ElasticFox, or cobble together some combination of Chef/Puppet with Capistrano/Fabric and a pile of shell scripts to get a complete provision/configure/deploy solution.

Huge thanks to Hugo (who let me play in his sandbox) and Adrian Cole (the crazy man behind jclouds) for making this all possible.

Clearing some hurdles automating CouchDB administration

I ran into a couple of administration issues with CouchDB while working on support for it in the excellent Pallet project¹, so I thought I’d leave some breadcrumbs for those that follow.

(Note that these issues were experienced with CouchDB 0.10.0 on Ubuntu Karmic. They may be resolved in later versions of CouchDB or Ubuntu, but those are the versions we’re targeting for now.)

Broken Directory Permissions

First, Karmic’s couchdb package is broken, insofar as key directories that CouchDB uses don’t have the right ownership or mode. The symptom of this is that CouchDB will not stop properly when one invokes /etc/init.d/couchdb stop. This is a known issue, and will hopefully be resolved for Ubuntu Lucid. Rumor has it that some versions of CentOS have the same issue.

The fix is simple:

chown -R couchdb:couchdb /var/log/couchdb /var/run/couchdb /var/lib/couchdb /etc/couchdb
chmod 0770 /var/log/couchdb /var/run/couchdb /var/lib/couchdb /etc/couchdb

That’s a bit of a carpet-bombing, but certainly won’t do any harm, and does the trick (adjust for the install dir you have, e.g. perhaps prefixing everything with /usr/local).
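
For installs under a different prefix, the same fix can be parameterized. What follows is only a minimal sketch: the function names couchdb_dirs and fix_couchdb_perms and the prefix argument are illustrative assumptions, not part of the CouchDB package or of Pallet.

```shell
# Hypothetical helper: print the four CouchDB directories under an optional
# install prefix (empty for the Ubuntu package, e.g. /usr/local for a source build).
couchdb_dirs() {
  prefix="${1:-}"
  for d in var/log/couchdb var/run/couchdb var/lib/couchdb etc/couchdb; do
    printf '%s/%s\n' "$prefix" "$d"
  done
}

# Apply the ownership and mode fix to each of those directories that exists.
# Must be run as root; the couchdb user and group are assumed to exist already.
fix_couchdb_perms() {
  couchdb_dirs "${1:-}" | while IFS= read -r dir; do
    [ -d "$dir" ] || continue
    chown -R couchdb:couchdb "$dir"
    chmod 0770 "$dir"
  done
}
```

Calling fix_couchdb_perms with no argument targets the packaged locations; fix_couchdb_perms /usr/local targets a source install. Directories that don't exist are skipped rather than treated as errors.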

CouchDB only detaches when started from a full shell

This is where the world will learn that I’m mostly an idiot when it comes to shell stuff and sysadmin in general. Thanks go to Hugo Duncan for giving me a key hint that allowed me to get past this one.

In short, pallet was doing the equivalent of this in order to invoke the scripts it generates for configuration management, etc. (assuming here that your user has NOPASSWD in /etc/sudoers):

ssh -t user@host 'sudo /etc/init.d/couchdb start'

So, we’re allocating a tty, which many services need around in order to fork and detach properly (such as Tomcat via jsvc, for example). However, the CouchDB server that is started with this command dies along with the ssh session. Go ahead, give it a shot. If you really want proof, you can do this to see that the server is running before the session is closed out:

ssh -t user@host 'sudo /etc/init.d/couchdb start; sleep 1; curl http://localhost:5984'

Of course, if you log into an environment with a full interactive session, starting CouchDB and then logging out will leave the server running as one would expect.

The solution is painfully simple in this case – just don’t invoke /etc/init.d/couchdb start as an ssh exec command. Whatever you’re using for configuration management, have it run in a full interactive shell session. That’s exactly what Pallet is now doing for all of its configuration executions.
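
Whichever way the server gets started, it’s worth confirming from a separate session that it actually outlived the one that launched it. Here is a rough sketch of such a check, assuming curl is installed on the node; the function name couchdb_alive, the default URL, and the retry count are illustrative choices rather than anything Pallet or CouchDB provides.

```shell
# Poll a CouchDB URL until it answers or the attempts run out.
# Returns 0 as soon as the server responds, 1 if it never does.
couchdb_alive() {
  url="${1:-http://localhost:5984}"
  attempts="${2:-5}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl fail on HTTP errors; -s and -o /dev/null keep it quiet.
    if curl -s -f -o /dev/null "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}
```

Run on the node after the starting session has closed, something like couchdb_alive || echo "CouchDB is not running" makes the failure described above easy to spot.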


The CouchDB crate in pallet is now pretty well bullet-proofed…or so I hope. :-)

¹ Pallet is a tool/framework for compute node provisioning as well as configuration management and general sysadmin automation. I’m not aware of any similar provisioning automation frontends (except for jclouds, which Pallet wraps / uses), but I’d otherwise characterize Pallet as a mashup of chef + capistrano, but written in Clojure (yay!).