Adding Gzip compression to a Clojure webapp in 30 seconds

As you might have seen, I'm working on a new web project, which happens to involve shipping a metric ton of content to each user's browser upon visiting the meat of the site.  We're talking about something like 1.5MB of HTML, JavaScript, and CSS, and that's after best-effort minification and such.

Clearly, Gzipping the whole mess is called for.  I've never worked on any high-volume sites that called for such measures, so this is a new requirement for me.  The site's backend is implemented in Clojure though, so my first instinct was to Google "gzip ring clojure" (Ring being the thoroughly spectacular Clojure web framework), whereupon I found Michael Stephens' ring-gzip-middleware project.  Seems simple enough: Ring request handlers are just functions, and you can apply middleware trivially via function composition, so ring-gzip-middleware provides a function that wraps your assembled Ring handler(s) to compress outgoing responses appropriately.

Thankfully, I stopped short just in the nick of time: while Ring middleware is a damn fine hammer, Gzip response compression shouldn't be a nail in this scenario.  I didn't want to have to read through entirely unrelated infrastructure-related bits in my codebase henceforth, no matter how elegantly they folded themselves in.  There remains value in the notion of separation of concerns.

Again thankfully, Clojure web apps are Java web apps, universally deployed in a servlet container like Tomcat, Jetty, Glassfish, and so on. So, I have quite the tasty menu to choose from.

Container-provided Gzip compression

Most if not all Java servlet containers provide Gzip response compression out-of-the-box.  Tomcat, for example, requires simply adding a couple of attributes to the Connector element in its server.xml file.  Jetty handily supports serving Gzipped static resources via its default servlet (as I read its docs, by sending a pre-compressed .gz copy of a file when one exists alongside it, rather than compressing on the fly); just set its gzip init parameter to true in your web.xml[1] file, and you're done:

[sourcecode language="xml"]
<servlet>
  <servlet-name>default</servlet-name>
  <servlet-class>org.eclipse.jetty.servlet.DefaultServlet</servlet-class>
  <init-param>
    <param-name>gzip</param-name>
    <param-value>true</param-value>
  </init-param>
</servlet>
[/sourcecode]

I think you can leave out the <servlet-class> element there; I didn't experiment much on this path, because:

  1. I needed to be able to Gzip dynamically-generated content, and
  2. I use Jetty in my development environment, but deploy to Tomcat in production, so I wanted a general-purpose solution.
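
For reference, the Tomcat route mentioned above would look roughly like the following in server.xml.  This is a sketch rather than something I've deployed: the port and timeout values are just the stock defaults, the minimum size and MIME type list are illustrative, and the MIME type attribute is spelled compressableMimeType in older Tomcat releases (newer ones prefer compressibleMimeType), so check the HTTP connector docs for your version.

[sourcecode language="xml"]
<!-- Sketch only: the compression-related values below are illustrative assumptions. -->
<!-- Older Tomcat releases spell the last attribute "compressableMimeType";
     newer ones prefer "compressibleMimeType". -->
<Connector port="8080" protocol="HTTP/1.1"
           connectionTimeout="20000"
           redirectPort="8443"
           compression="on"
           compressionMinSize="2048"
           compressableMimeType="text/html,text/css,application/javascript"/>
[/sourcecode]

That only covers Tomcat, of course, which is exactly the second problem above.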

Thus, what I consider to be the ideal approach:

Gzip Servlet Filters

Servlet filters are independent, composable components that can dynamically modify requests and responses, the circa-1999 Java counterpart to Ring middleware.  The difference is largely in packaging and context: while Ring middleware is just a function that can be folded into a codebase programmatically, servlet filters are specified statically as part of an application's web.xml file, and in general are not modified from within the servlet at runtime.  (Things can get interesting when you implement servlet filters using Clojure, but that's perhaps a topic for another post.)

There are various Gzip compression servlet filter implementations floating around the 'nets, including one particularly bad example that appeared in some magazine in 2004 that has an unreasonable amount of Google juice associated with it for some reason.  Use any of them, and your Clojure web application will be Gzip-ready, regardless of which servlet container you deploy to.  For my money, the best one is provided by the Jetty project, simply because it's impossible to argue with its provenance given that Jetty is used everywhere: if there were a problem with its Gzip servlet filter implementation, it certainly would've been found out by now.

Using it is cake; add the corresponding dependency to your pom.xml file:

[sourcecode language="xml"]
<dependency>
  <groupId>org.mortbay.jetty</groupId>
  <artifactId>jetty-util</artifactId>
  <version>6.1.26</version>
</dependency>
[/sourcecode]

and add it to your web.xml file with a corresponding URL mapping:

[sourcecode language="xml"]
<filter>
  <filter-name>jetty-gzip</filter-name>
  <filter-class>org.mortbay.servlet.GzipFilter</filter-class>
</filter>

<filter-mapping>
  <filter-name>jetty-gzip</filter-name>
  <url-pattern>/*</url-pattern>
</filter-mapping>
[/sourcecode]

Done.

All my content is now Gzip-compressed, both dynamically-generated and static (because I always use the servlet container's default servlet for serving up static resources).  I didn't have to make any changes to my codebase, and I'll never once be reminded of Gzip compression when I fiddle with my Ring handlers.

The Jetty Gzip filter has various options for tweaking which MIME types and sizes of content should be included and for excluding specific user agents from receiving Gzipped content, but I'll just leave the defaults alone for now (i.e. compress everything for everyone).
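
If you do want to narrow things down, the options go in as init-params on the filter definition.  Here's a minimal sketch, assuming the mimeTypes and minGzipSize parameter names as I read them in the GzipFilter Javadoc for Jetty 6; the values are purely illustrative, so verify both names and behavior against the version you actually depend on:

[sourcecode language="xml"]
<!-- Sketch only: parameter names are taken from the Jetty 6 GzipFilter Javadoc as I
     recall it, and the values are illustrative assumptions, not recommendations. -->
<filter>
  <filter-name>jetty-gzip</filter-name>
  <filter-class>org.mortbay.servlet.GzipFilter</filter-class>
  <init-param>
    <!-- Only compress responses with these content types -->
    <param-name>mimeTypes</param-name>
    <param-value>text/html,text/css,application/javascript</param-value>
  </init-param>
  <init-param>
    <!-- Skip responses whose known content length is below this many bytes -->
    <param-name>minGzipSize</param-name>
    <param-value>1024</param-value>
  </init-param>
</filter>
[/sourcecode]

This would take the place of the bare filter definition shown earlier; the filter-mapping stays as it was.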

Postscript: Wait, what's with all the XML in this "Clojure web app"?

Some people are allergic to parentheses; some are allergic to XML; I choose to find peace in both as appropriate. :-)

That Jetty Gzip filter does a good job of something I don't want to think about.  Just as core vs. context is a useful frame in business affairs, I think it's handy when thinking about how to approach software development.  I don't get points for having a "pure Clojure" stack for my web application, especially if it means giving up a reasonable separation of concerns, or dealing with FUD creeping up in my head about whether a reimplementation of a fundamentally commodity operation is really up to spec.  It may very well be, but there's nothing to be gained when that bet pans out if safer alternatives exist for such matters of context.

Thus, I chose the ~5-year-old Gzip filter from Jetty, just like I often choose to use boringly reliable Java-land libraries (e.g. Spring Security) and tools (e.g. Maven and Eclipse) to support far more interesting things for which I use bleeding-edge kit like Clojure.

Footnotes
  1. You know what a web.xml file is, right?  Every Java web application has one, whether it's generated by your build process or you create it yourself.  The latter is generally preferable IMO, simply because you can take advantage of all the goodies that it opens up for you.  You can read generalities about web.xml files all over the web; there's a sample Clojure web project over here that contains a simple example, and I talk about them a bit in my post and screencast here.