Cargo-culting to success and understanding

In doing my part to further the fortune-cookie bullshit cycle that is Twitter, I tossed out this nugget yesterday:

Little did I know that it would spur such conversation, debate, DMs, emails, and so on.  I was going to expound on the original tweet in eight parts, but thought better of it.

Cargo culting is an actual thing, of course.  Thanks to Justin Sheehy for this f’rinstince:


He rightly pointed out that real cargo culting is all about mimicking the “trappings of others, not their essential behaviour”.

Mimicking triviality certainly happens — witness the fads that drift in and out of tech culture around the most trivial things, e.g. the super-minimal desk/workspace, actually using a hammock for “hammock time”, the revolutions in agile/xp/lean/scrum/etc. nomenclature, and so on — but I don’t think that’s what most people mean by “cargo culting”, at least when it comes to software, programming, and related design.

By “cargo-culting”, I was generally referring to doing something without properly knowing why it might be good or bad.  In most cases, doing this is a recipe for disaster, especially if the person doing it is of the incurious sort that is looking for permanent shortcuts.  “Keep adding indexes if your query is slow”, “go look in the ‘pattern’ book when trying to design something”, “use Mongo if you need to be web-scale”.

(The dangerous bit is that we all occasionally cargo-cult things, implicitly, simply because we are social creatures, and we’re inevitably influenced by certain patterns and design philosophies and technical approaches being baked into the fabric of our time and place in the world.  My sense is that many discontinuous innovations can be strongly correlated with someone becoming explicitly aware of these undercurrents, rationally re-evaluating their bases, and offering an alternative better suited to modern times and uses.)

What I was tweeting about is a different thing, though.

Especially when I’m groping into a new domain, I often ape others’ work with abandon, and with only a dim view of the ‘why’ of the motivating design problems and decisions.  Doing so produces technical, design, and understanding debt, but I take it on willingly.  Making constant progress is often more important and more useful to me to than methodically building a formal understanding of the theory or practice related to my current task.  As I go along in the work, I continually look for ways to understand those decisions I previously adopted wholesale.  Because of the bias towards constant progress, I generally have the benefit of having a working system I built in front of me, so I have a tangible sense of the impact of those decisions.  I then can carry on having understood the original ‘why’ motivating them; and, if I’m feeling adventurous, I can use that understanding to usefully re-evaluate those decisions, and maybe make different ones to yield a better result.

Maybe I’m being too loose with the terminology, but the first part of this process certainly sounds like “cargo-culting” in software to me.  The difference is that:

  1. I explicitly acknowledge that I’m taking a shortcut, with the distinct intention of returning to the topic later.  (Not doing so would be an abject failure on my part.)  This is the “first approximation” part of the notion: the shortcut is a bootstrapping mechanism, not a final destination.
  2. I am extremely selective when it comes to whose work I’ll choose to look at seriously.  Code pasted into a Stack Overflow answer doesn’t qualify, nor does whatever is found in most popular “technical” books.  Libraries and systems built by people who have spent decades, careers working on problems similar to those I’m facing? Getting closer; but even given that sort of “population”, care must be taken to match up the matrix of requirements as closely as possible.  e.g. if I’m in the neighborhood of distributed systems, taking hints from someone focused on embedded applications may be fraught with peril.

I’ve never been able to read a dissertation or book or three, and *foom*, produce out of whole cloth something well-designed, efficient, and extensible — but I’ve seen others do just that.  So, I know that what I’ve discussed here is an inefficient path, at least in certain domains and for certain people.  I think it is a natural response to attempting to build nontrivial things given fifty-ish years of software and engineering history to consider.

Finally, the terminology.  Perhaps given this notion of historical artifact, a better phrase than “cargo culting” might be “anthropological reconstructive software archaeology”?  Doesn’t quite have the same ring to it though, eh?

On the stewardship of mature software

I just flipped the switch on v2.5.0 of PDFTextStream.  It’s a fairly significant release, representing hundreds of distinct improvements and bugfixes, most in response to feedback and experiences reported by Snowtide customers.  If you find yourself needing to get data out of some PDF documents, you might want to give it a look…especially if existing open source libraries are falling down on certain documents or aren’t cutting it performance-wise.

But, this piece isn’t about PDFTextStream, not really.  After prepping the release last night, I realized that PDFTextStream is ten years old, by at least one reckoning: though the first public release was in early 2004, I started the project two years prior, in early 2002, ten years ago. Ten years.

It’s interesting to contemplate that I’m chiefly responsible for something that is ten years old, that is relied upon by lots of organizations internally, and by lots of companies as part of their own products.  Aside from the odd personal retrospectives that can be had by someone in my situation (e.g. friends of mine have children that are around the same age as PDFTextStream; am I better or worse off having “had” the latter when I did instead of a son or daughter?), some thought has to be given to what the longevity and particular role of PDFTextStream (or, really, any other piece of long-lived software) implies and requires.

I don’t know if there are any formal models for determining the maturity of a piece of software, but it seems that PDFTextStream should qualify by at least some measures, in addition to its vintage.  So, for your consideration, some observations and opinions from someone that has their hand in a piece of mature software:

Mature software transcends platforms and runtimes

PDFTextStream is in production on three different classes of runtimes: all flavours of the JVM, both Microsoft and Mono varieties of the .NET CLR, and the CPython implementation of Python.  This all flows from a single codebase, which reminds me many kinds mature systems (sometimes referred to as “legacy” once they’re purely in maintenance mode — a stage of life that PDFTextStream certainly hasn’t entered yet) that, once constructed, are often lifted out of their original runtime/platform/architecture to sit on top of whatever happens to be the flavour of the month, without touching the source tree.

Often, the effort required to make this happen simply isn’t worth it; the less mature a piece of software is, the easier it is at any point to port it by brute force, e.g. rewriting something in C# or Haskell that was originally written in Java.  This is how lots of libraries made the crossing from the JVM to .NET (NAnt and NHibernate are two examples off the top of my head).

However, the more mature a codebase, and the more challenging the domain, the more unthinkable such a plan becomes. For example, the prospect of rewriting PDFTextStream in C# to target .NET — or, if I had my druthers, rewriting PDFTextStream in Clojure to satisfy my geek id — is absolutely terrifying.  All those years of fixes and tweaks in the PDFTextStream sources…trying to port all of them to a new implementation would constitute both technical and business suicide.

In PDFTextStream’s case, going from its Java sources to a .NET assembly is fairly straightforward given the excellent IKVM cross-compiler.  However, there’s no easy Java->Python transpiler to reach for, and a bytecode cross-compiler wasn’t available either.  The best solution was to invest in making it possible to efficiently load and use a JVM from within CPython (via JNI).  With that, PDFTextStream, derived from Java sources, ran without a hitch in production CPython environments. Maybe it was a hack, but it was, in relative terms, easier and safer than any alternative, and had no downsides in terms of performance or capabilities.

(I eventually nixed the CPython option a few years ago due to a lack of broad commercial interest.)

Thou shalt not break mature APIs

When I first started programming in Java, I sat aghast in the ominous glow of java.util.Date. It was a horror then, and remains so. The whole thing has been marked as deprecated since 1997; and, despite the availability of all sorts of better options, it has not been removed from the standard library.  Similar examples abound throughout the JRE, and all sorts of decidedly mature libraries.

For some time, I attributed this to sloth, or pointy-haired corporate policies, or accommodation of such characteristics amongst the broad userbase, or…god, I dunno, what are those guys thinking? In the abstract, if the physician’s creed is to “do no harm”, it seems that the engineer’s should be “fix what’s broken”; so, continual improvement should be the law of the land, API compatibility be damned.

Of course, it was naïve for me to think so.  Brokenness is often in the eye of the beholder, and formal correctness is a rare thing outside of mathematics.  Thus, the urge one has to “make things better” must be tempered by an understanding of the knock-on effects for whoever is living downstream of you.  In particular, while making “fixes” to APIs that manifest breaking changes — either in terms of signatures or semantics — might make you feel better, there are repercussions:

  • You’ll absolutely piss off all of your customers and users.  They had working code that now doesn’t work. Whether you are charging them money or benefiting from their trust, you are now asking them to take time out of their day to help you feel better about yourself.
  • Since their code is broken already, your customers and users might see this as the perfect opportunity to make their own changes to not have to cope with your self-interested “fixes” anymore.  Surely you can imagine the scene:

    Sarah: “Hey Gene, the new version of FooLib changes the semantics of the Bar(string) function. Do you want me to fix it now?”

    Gene: “Sheesh, again? Well, weren’t you looking at BazLib before?”

    Sarah: “Yeah; BazLib isn’t quite as slick, but Pete over in Accounts said he’s not had any troubles with it.”

    Gene: “I’m sold. Stick with the current version of FooLib for now, but next time you’re in that area of the code, swap it out for BazLib instead.”

This is why semantic versioning is so important: when used and understood properly, it allows you to communicate a great deal of information in a single token.  It’s also why I can often be found urging people to make good breaking changes in v0.0.X releases of libraries, and why PDFTextStream hasn’t had a breaking change in 6 years.

Of course there are parts of PDFTextStream’s API that I’m not super proud of; I’ve learned a ton over the course of its ten year existence, and there are a lot of things I’d do differently if I knew then what I know now.  However, overall, it works, and it works very well, and it would be selfish (not to mention a bad business decision) to start whacking away at changes that make the API aesthetically more pleasant, or of marginally higher quality, but which make customers miss a beat.

It seems to me that a good guideline might be that any breaking change needs to be accompanied by a corresponding 10x improvement in capability in order to be justifiable.  This ties up well with the notion that a product new to the market must be 10x better than its competition in order to win; insofar as a new version of the same product with API breakage can potentially be considered as foreign as competing products, that new version is a new product.

Managing risk is Job #1

If your hand is on the tiller of some mature software — or, some software that you would like to see live long enough to qualify as mature — your first priority at all times is to manage, a.k.a. minimize, risk for your users and customers.

As Prof. Christensen might say, software is hired to do a job.  Now, “managing risk” isn’t generally the job your software is hired to do, e.g. PDFTextStream’s job is to efficiently extract content from any PDF document that is thrown at it, and do so faster and more accurately than the other alternatives.  But, implicit in being hired for a job is not only that the task at hand will be completed appropriately, but that the thing being hired to do that job doesn’t itself introduce risk.

The scope of software as risk management is huge, and goes way beyond technical considerations:

  • API risk, as discussed above in the “breakage” section
  • Platform risk. Aside from doubling the potential market for PDFTextStream, offering it on .NET in addition to the JVM serves a purpose in mitigating platform risk for our customers on the JVM: they know that, if they end up having to migrate to .NET, they won’t have to go find, license, and learn a new PDF content extraction library.  In fact, because PDFTextStream licenses are sold in a platform-agnostic way, such a migration won’t cost a customer of ours a penny.  Of course, the same risk mitigation applies to our .NET customers, too.
  • Purchasing risk. Buying commercial software outside of the consumer realm can be a minefield: tricky licensing, shady sales tactics, pricing jumping all over the map (generally up), and so on.  PDFTextStream has had one price increase in eight years, and its licensing and support model hasn’t changed in six.  Our pricing is always public, as is our discount schedule.  When one of our customers needs to expand their installation, they know what they’re getting, how much it’s going to cost, and how much it’ll cost next time, too.

Even if one is selling a component library (which PDFTextStream essentially is), managing risk effectively for customers and users can be a key way to offer a sort of a whole product.  Indeed, for many customers, managing risk is something that you must do, or you will simply never be hired for that job, no matter how well you fulfill the explicit requirements.

Clojure Atlas (Preview!)

Today, I’m opening up a “preview” site for Clojure Atlas, a new side project of mine that I’m particularly excited about.

Clojure Atlas is an experiment in visualizing a programming language and its standard library.  I’ve long been frustrated with the limitations of text in programming, and this is my attempt to do something about it.  From the site:

While Clojure Atlas has a number of raisons d’être, it fundamentally exists because I’ve consistently thought that typical programming language and API references – being, in general, walls of text and alphabetized links – are really poor at conveying the most important information: not the minutiae of function signatures and class hierarchies, but the stuff that’s “between the lines”, the context and interrelationships between such things that too often are only discovered and internalized by bumping into them in the course of programming. This is especially true if we’re learning a language and its libraries (really, a never-ending process given the march of progress), and what’s standing in our way is not, for example, being able to easily access the documentation or signature for a particular known function, but discovering the mere existence of a previously-unknown function that is perfect for our needs at a given moment.

This is just a preview – all sizzle and no steak, as it were.  I’m working away at the ontology that drives the visualization and user experience, but I want to get some more early (quiet) feedback from a few folks to make sure I’m not committing egregious sins in various ways before throwing open the doors to the world.

In the meantime, if you’re really interested, follow @ClojureAtlas, and/or sign up for email updates on the site.

…wherein I feel the pain of being a generalist

I’ve lately been in a position of offering occasional advice to Lee Spector, a former professor of mine, on various topics related to Clojure, which he’d recently discovered and (as far as I can tell) adopted with some enthusiasm.  I think I’d been of some help to him – that is, until the topic of build tooling came up.

He wanted to “export” a Processing sketch – written in Clojure against the Processing core library and the clj-processing wrapper – to an applet jar, which is the most common deployment path in that sphere.  Helpfully, the Processing “IDE” (I’m not sure what it’s actually called; the that one launches on OS X that provides a Java-ish code editor and an integrated build/run environment) provides a one-button-push export feature that wraps up a sketch into an applet jar and an HTML file one can copy to a web server for easy viewing.

It’s an awesome, targeted solution and clearly hits the sweet spot for people using the Processing kit.

Stepping out of the manicured garden of comes with a cost, though; you’ve lost that vertically-integrated user experience, and have to tangle with everything the JVM ecosystem has to throw at you.  There is no big, simple button to push to get your ready-to-deploy artifact out of your development environment.

So, Lee asked for some direction on how to regain that simple deployment process; my response pointing at the various build tooling options in the JVM neighborhood ended up provoking pain more than anything else due to some fundamental mismatches between our expectations and background.  You can read the full thread here, but I’ll attempt to distill the useful bits here.

Do I really need to bother with this ‘build’ shit?

Building software for deployment is a damn tricky problem, but it’s far more of a people problem than a technical problem: the diversity and complexity of solutions is such that the only known-good solution is immersive exposure to documentation, examples, and community.

In response, Lee eventually compared the current state of affairs with regard to build/deployment issues in Clojure and on the JVM as if one were asking a novelist to learn how to construct a word processor’s interface before being able to write:

This is, I know, a caricature, but imagine a word processing app that came with no GUI because hey, people have different GUI preferences and a lot of people are going to want things to look different. So here’s a word processing app but if you really want to use it and actually see your document you have to immerse yourself in the documentation, examples, and community of GUI design and libraries. This is not helpful to the novelist who wants a word processor! On the other hand if you provide a word processor with a functioning GUI but also make it customizable, or even easy to swap in entirely different GUI components, then that’s all for the good. I (and many others, including many who are long-term/professional programmers, but just not in this ecosystem) are like the novelists here. We want a system that allows us to write Clojure code and make it go (including producing an executable), and any default way of doing that that works will be great. Requiring immersive exposure to documentation, examples, and community to complete a basic step in “making it go” seems to me to be an unnecessary impediment to a large class of potential users.

My response was probably inevitable, as steeped in the ethos of the JVM as I am; nevertheless, Lee’s perspective ended up allowing me to elucidate more clearly than I ever have why I use tools like Maven rather than far simpler (and yes, more approachable) tools like Leiningen, Cake, et al.:

At one time, there were only a few modes of distribution (essentially: build executable for one, perhaps two platforms, send over the wire, done).  That time is long past though, and software developers cannot afford to be strict domain experts that stick to their knitting and write code: the modes of distribution of one’s software are at least as critical as the software itself. Beyond that, interests of quality and continuity have pushed the development and adoption of practices like continuous integration and deployment, which require a rigor w.r.t. configuration management and build tooling as serious as one pays to one’s “real” domain.

To match up with your analogy, programmers are not simply novelists, but must further concern themselves with printing, binding, and channel distribution.

Within that context, I much prefer tools and practices that can ramp from fairly simple cases (as described in my blog), up to the heights of build automation, automated functional testing, and continuous deployment.  One should not have to switch horses at various stages of the growth of a project just to accommodate changing tooling requirements.  Thus, I encourage the use of maven, which has (IMO) the least uneven character along that spectrum; ant, with the caveat that you’ll strain to beat it into shape for more sophisticated tasks, especially in larger projects; and gradle, which appears to be well on its way to being competitive with maven in most if not all circumstances.

In all honesty, I envy Lee and those with similar sensibilities…

The first step to recovery is realizing you have a problem

The complexity that is visited upon us when writing software is enough; in an ideal world, we shouldn’t have to develop all this extraneous expertise in how to build, package, and deploy that software as well.  There are a few things in software that I know how to do really well that make me slightly unique, and I wish I could concentrate on those rather than becoming a generalist in this, yet another vector, which is fundamentally a means to an end.  History and circumstance seem to be stacked against me at the moment, though.

Especially in comparison with monocultures like the .NET and iOS worlds, which have benevolent stewards that helpfully provide well-paved garden paths for such mundane activities, those of us aligned with more “open” platforms like the JVM, Ruby, Python, etc. are constantly pulled in multiple directions by the allure of the shiniest tech on the world and the dreary reality that our vision consistently outpaces our reach when it comes to harnessing the gnarly underbelly of that snazzy kit in any kind of sensible way.  Along the way, the most pernicious thing happens: like the apocryphal frog in a warming pot, we find ourselves lulled into thinking that the state of affairs is normal and reasonable and perfectly sane.

Of course, there’s nothing sane about it…but I’m afraid that doesn’t mean a real solution is at hand.  Perhaps knowing that I’m the frog now is progress enough for now.

The placebo effect is what makes the software world go ’round

I’ve been of the opinion for some time now that software development, regardless of the methodology followed or the tools used, is not an engineering discipline (unfortunately), but rather is a craft.  I recently laid out that opinion in some detail, which was quickly followed by many people (both in the tubes and in private communications) mostly disagreeing, some suggesting that I just don’t understand engineering particularly well.

In general, I’ll quickly concede that last point, but I keep running into people who ostensibly do understand engineering, and who also reject the notion of software development being a variety of it.  As an example, if I had known of Terence Parr‘s essay “Why writing software is not like engineering” before writing on the same topic with essentially the same premise, I would have simply tweeted a link or something.

A more pointed, and in my opinion, brilliant restatement of this thesis was delivered by Linda Rising in a talk at the Philly ETE 2010 conference in April of this year.  The focus of the talk was far, far broader than the topic of how to classify and characterize software development, and very much worth a listen (audio of the talk is available).  But for now, the key relevant quote comes at 25:15:

One of the things we can learn from psychology is that they do real experiments…whereas in our industry, we do not.  We cannot call ourselves “engineering”; [our work] is not based on science.  In fact, last year at the Agile conference I gave a talk1 that said “I think that mostly what runs this industry is what you’d call the ‘placebo effect’.”

Think about [what would happen] if drug testing were done the way we decide to do anything in software development.  We’d bring in a bunch of pills, and we’d say, “Oh, look at this one, it’s blue! I love blue! And it has a nice shape, oh my goodness, it’s hexagonal! Hexagonal things have always had a special place in my heart.  I think these hexagonal blue ones will be very powerful in solving this disease.  Let’s go with the hexagonals.

I’ll give this hexagonal blue one to all my friends.  I’ll tell them, “This is it, this hexagonal blue one will solve all your problems.  My team tried it, and we really liked it, so your team tried it, you’ll really like it too.”

And that’s how we run projects.  There’s certainly no double-blind controlled experiments.  Is agile any good?  Oh yeah, it is, there’s so much excitement, and  so much buzz, and everybody seems to think so!

Most if not all software developers will recognize the parallels between their profession and that parable.  Why did your team choose C++, or Rails, or F#, or Clojure?  There are often some tangible, technical considerations involved in such choices.  However, as Ms. Rising covers in the talk, human beings are generally not capable of making “rational” choices, but we are stellar when it comes to rationalizing the choices we have made.  So, even our most fundamental decisions – which tools and methods to use – are not made with the rigor that would be demanded of decisions made in an engineering setting.  Of course, this is entirely separate from how we go about actually building things using those tools, which can only be characterized as relative chaos.  How and why we are able to make such abundantly arbitrary choices is something I addressed in my last post on this topic.

And here is where I appeal to a somewhat more respected authority.  Aside from her expertise in the software methodology space, Ms. Rising’s credentials are fairly impeccable, apparently having had some significant involvement in the software development process attached to the design of the Boeing 777 airliner.  So, I’d say she’s in a fairly secure position to be making informed comparisons between engineering, science, and what we do when we build software systems.  Just one more data point, really, but a strong one.


1 A description of this talk is here, but I wasn’t able to find a recording of it.  However, there is an InfoQ interview with Ms. Rising on the topic.

Programming and software development, medium-rare

In a past life as a “software engineer” on contract, a favorite analogy of coworkers was to compare software development to construction (perhaps influenced by the early 2000’s housing boom?). Projects were like houses, plans were needed to properly architect the end result, and schedules were laid down to ensure that the “foundations” were done before we started “framing” the building and “furnishing” the details. Conceptualizing software development in such a way is common, and there’s a long history of people involved in what has casually been called “software engineering” thinking that what they do is, or should be, related to “old-world” engineering.

This is largely both nonsense and decidedly unnerving.

I’m not like my grandfathers

Both of my grandfathers were involved in engineering; knowing something of what they did makes me even more sure that what I do is not related to engineering.

One was an electrician, more a tradesman than an engineer, who worked in the construction of large commercial buildings in downtown Hartford and Denver. Each day, he would look at the blueprints painstakingly drafted by the project’s architects and engineers, and go about making those plans a reality – installing high-voltage switching equipment, stringing miles of Romex (or, whatever they used back then), and doing all of it in hazardous conditions.

My other grandfather was, as far as I remember, something of a process engineer at a subsidiary of Olin, helping to design the manufacturing processes that would heat, pound, roll, cut, stamp, and test thousands of varieties of copper and stainless steel foils, strips, and other formulations for later inclusion in all sorts of products, both industrial- and consumer-related.

These men’s careers were very different, but they were involved in what are clearly engineering tasks.

An art’s constraints are what define the art

There’s a lot that separates my discipline from my grandfathers’, but I think the most significant is that, as someone who builds software, I have far more discretion in how I achieve my end results than they had in their work. The degree to which this is the case cannot be overstated, but I’m at a loss for words as to how to concisely characterize it. Materials in the real world behave in ways that are, in the modern age anyway, understood: electricity and copper and steel and wood have known physical characteristics and respond in known ways to all of the forces that might be applied to them.1

In contrast, the world of software has so many degrees of freedom in so many vectors, the possibilities are functionally limitless. This is both a blessing and a curse, as it means that the programmer is something of a god within her domain, free to redefine the its fundamental laws at will. Given this context, it’s no wonder that software projects fail at an astounding rate: we simply have nothing akin to the known natural constraints that are conveniently provided for our real-world engineer friends, so we have no option but to discover those constraints as we go along.

The software community’s response to this has been to erect artificial constraints in an effort to make it possible to get things done without simply going insane: machine memory models with defined semantics, managed allocation, garbage collection, object models, concurrency models, type systems, static analysis, frameworks, best practices, software development methodologies for all stripes and inclinations. This is natural, and good, and the only way we could possibly make sense of this thing called software given how young it is.

Yet, even after all this edifice, ask 100 software developers how to build a website, and you will get at least 500 answers – and the people that respond with a hesitant “It depends” are likely the most clueful of the group. If my grandfather had responded “It depends” to a question about how to produce a 1-ton spool of 2mm-thick copper strip, he’d have been fired on the spot.2

Confessions and snotty ego trips

Interlude: If Top Chefs were programmers

Susur’s work is remarkably intricate and exhibits a delicate precision unmatched by his peers. He prefers the crystalline perfection of a perfectly-balanced type system, and so chooses Haskell for most of his work.

Living up to his “Obi-Wan” moniker, Jonathan makes the most familiar t hings amazing, and people are never sure how. His secret is that he usually works in some lisp or scheme, which he then uses to generate whatever comforting form his customers prefer, often Java or C#.

Marcus likes to surprise people with the most esoteric things possible: his pièce de résistance is written using his own prolog variant, implemented in Factor. People love it either for the ballsiness of it all, or because his stuff works so well they don’t notice.

Rick is a madman, insisting on using C++ for everything, even web applications.

Okay, so software development isn’t engineering. It’s probably safe to say it’s a craft, though (in contrast to pure art like painting or sculpting, but that’s a different blog post). In particular, its similarities to cooking are legion, something that I noticed while watching one of the few bits of television I bother with (and a guilty pleasure), Top Chef (though I redeem myself oh-so-slightly by preferring the Masters incarnation by a mile). *blush*

Watching this show with an eye towards the methods and attitudes of the chefs is like watching a group of programmers struggle to build quality software. Functionally no constraints in process, methodology, and materials? Check. Infinitely many ways to get the job done depending on the skill of the craftsman, the tastes of the customers, and the specifics of the materials? Check. Identiying the best craftsmen is difficult, and most of those at the top of their field have a murky combination of ill-defined qualities? Check. A wide disparity between the effectiveness of individuals in the field? Check. At the edges, people experiment with hacks that sometimes come to be part of everyone’s repertoire? Check. Technical capability is not a necessary requirement for success, due to fluctuating trends (less charitably called “fashion”) and a variety of potentially-compensating personal characteristics? Check. Less mature (and often less capable) members of the community are given to snotty ego trips, bad attitudes, and frequent tantrums? Check.

Given all of the similarities, I think the fact that it’s difficult to assess the quality of individuals in either field is the most striking (and perhaps a key indicator that a field has not (or intrinsically cannot) internalized a certain minimum degree of rigor in its methods). It’s telling that TopCoder predated Top Chef by a decade. Great hackers and those guys that can hit the high notes are few and far between, mixed in with scads of cooks that come to work stoned and “programmers” that put HTML on their resumés that still manage to get hired, somewhere. These are fields that are far from science, far from engineering, and well within the lush, rolling hills of craft.

Enough already. So what?

First, words matter and it’s important to call a spade a spade. Thoughtlessly (or disingenuously) call software development “engineering”, and people walk off with all sorts of notions that are inappropriate. This is particularly damning with customers and other nontechnical “civilians”.

Second, craft, as revered a concept as it is, is not desirable when businesses, livelihoods, and actual lives are at stake. I’ve never worked on aeronautics software or the like, but despite its codified standards, its rigorous testing protocols, and access to millions and billions of dollars in resources, we’ve crashed satellites into planets because of something as triflingly simple as conversion between metric and standard measures. Similar examples are everywhere. We need software development to be built with an engineering discipline, because everything we do depends upon it.

Of course, I have no tidy answers for that challenge. I think that if we can pull ourselves out of this primordial ooze of twiddling bits and get to a point where we describe how to do things relevant to the domains we’re working in, then I think there’s a chance for the species to survive. Ideally, domain experts should be doing the bulk of the “programming” work, given that communication between such experts and programmers is probably the source of at least half, if not the vast majority of software design flaws. This is an old chestnut, dating back to the 4GL programming languages (now 5GL?), declarative programming, expert systems, and so on.

Programmers’ work will be done when we’ve put ourselves out of a job

Roughly, I think the point is to solve for X:

blueprint:building :: X:code 3

One strain of real-life examples – such as Pipes, DabbleDB, and WuFoo – allow people to collect and process data without touching a line of code. The best specimens of these kinds of systems are often labeled with the “tool” slur by programmers, but the fact is such things allow for greater aggregate productivity and less possibility for error than any similar purpose-built application. Meta-software that can deliver the same for any given domain is the future, and the result will be that programmers, as they are known today, should cease to exist in commercial settings.

The craft of programming will survive though, and continue to be the source of innovation. After all, there’s no shortage of demand for great chefs, even with all these McDonald’s and Olive Gardens dotting the landscape.

1 There are obviously still unknowns in the physical world, but those are irrelevant insofar as we’re contrasting engineering and software development within the scope of commercial projects. Research matters are an entirely separate issue in both camps.
2 Another good question to ask: Can you estimate time and materials for projects…and be correct within a 100% margin of error? If so, congratulations, yours might be an engineering discipline.
3 From an informal, but very high quality discussion on this topic in #clojure IRC:

The beauty of letterpress and craft and old arts faithfully renewed

Having worked primarily with PDF documents and all the minutiae of their fonts and such over the years, I’ve come to have a great appreciation for typography. This appreciation has led me down some interesting paths, most notably when I visited the Wilson Printing Office a few years ago, originally built in 1816 in what is now Old Deerfield Village (about a half hour’s drive north). It’s a quaint old building in a quaint old village, exactly what you’d expect in New England:

Inside, you can find a very old, manually-operated, movable-type letterpress. It’s entirely functional, and luckily, visitors are allowed to operate the behemoth.

I was reminded of this recently when I stumbled across this video, where the proprietor of Firefly Press (located in Somerville, MA) talks about his love of letterpress, the state of his craft, and how he expects it to die eventually, simply because people will forget how to do it:

The artfulness, the care, and the precision of the work exhibited there is remarkable. Seeing it makes me want to retire and build a letterpress from scratch, and start pumping out lovingly-crafted stationary and such (although remember, self-sufficiency is the road to poverty).

As Sisyphean as it might seem, I try to bring as much of that spirit as I can to what I do. Despite the sometimes soul-sucking pop culture of software development, the drumbeat of get-it-done-fast that comes on every vector, and the never-ending treadmill of “new” technologies that parade across social news sites, I try to bring a craft to the code I write, the systems I build, and the experiences I assemble for my customers.

I’m heartened that it seems that I’m not alone in this. There are many like me that seem to have re-discovered what’s important and relevant to building sustainable systems – and discovered that, yes, it’s possible to keep that separate from the trendy, the immediate requirements, the moment’s conveniences. Computation, after all, doesn’t appear to change much. Lambdas and pointers are likely to be there, waiting for future generations just as they serve us today – modulo some (hopefully slight) packaging that helps with interacting with the broader world.

(It appears that this perspective may be necessary [though surely not sufficient!] in order to build successful, and not merely elegant systems that solve today’s and tomorrow’s pressing problems. No one will cheer much anymore when an application is delivered that is largely built while sitting on one hand [a Spolkyism there, I believe, referring to IDE wizards and such].)

Something said in that video really rung out to me, reminding me of the traditions of LISP that exist, and where I happen to stand in relation to them:

The old guys got it remarkably right.

Of course, the old vanguards have faded for the most part; many have turned to Clojure and other modern lisps. Thankfully though, that sense of craft and the original intent and spirit of “the old guys'” work is there and alive.

A quick search for ‘letterpress’ uncovers a host of shops, plying their craft, making beautiful things out of cloth and cotton and wood and steel. May that continue to be the case 100 years hence, too.

Activity is not Progress (or, ‘Did you really need to shave that yak’)

Anyone who is accountable for any sufficiently-complex objective is constantly having their focus being pulled away from that larger goal by a thousand different fiddly tasks. Christened as yak shaving some time ago by a fellow at the MIT media lab, the concept has become a favorite shorthand in various programming and software development circles. I only heard of it this year, but it’s helped to coalesce my thinking about focused work and the relationship between activity and progress.

In particular, I think it’s helpful to occasionally check one’s activity using what I’d call “root objective analysis”.

Many people in technical fields are familiar with root cause analysis, where a problem or failure is analyzed in such a way as to determine its root cause. There are lots of flavors of root cause analysis, with Five Whys being popular among programmers due to the Joel Effect and probably some loose association between Five Whys and the lean development/startup methodologies that are all the rage these days.

In contrast, root objective analysis runs in the “opposite direction”, so to speak: for any given activity, you trace the likely causal link between that activity you’re engaged in, and the progress you want to make. In short: “Is what you’re doing right now getting you closer to your end goal?”1 If you do this right, or at all, you’ll go down fewer dead-ends, waste less time, and prioritize the yaks you do shave so that you get to your desired end state sooner rather than later.

There’s obviously a lot of fuzziness in any kind of speculative analysis like this; if there weren’t, then project management would always bring jobs in on time and within budget. However, if your work often leads you far afield of your “main line” of focus, then asking yourself the question above from time to time may help you to ensure that every yak shaving you engage in is necessary, as opposed to a distraction caused by confusing activity for progress.

Yak shaving close to home

A yak shaving that is near and dear to my heart is the fable of the software developer and the PDF documents (not surprising, since we talk to a lot of developers who have worked with lots of PDF documents). There are many variations, but the most extreme goes something like this:

  1. Joe the developer needs to get some chunk of data into his company’s database (maybe it’s financial data, maybe he’s working with excerpts of academic journal articles – such details are mostly irrelevant)
  2. The data is only available in PDF documents, and there’s a lot of them. Thousands, perhaps millions of chunks of data in as many different PDF documents.
  3. Joe’s first thought is that he needs to build a function to extract text from these PDFs so that he can get at the data he needs.  But, after…
    • reading the 1,000+ page PDF specification,
    • adding support for the 8 different versions of the spec,
    • adding support for a half-dozen encryption protocols, and
    • adding support for extracting Chinese (or Japanese, or Korean, or Icelandic with its lovely ð (“eth”) character) along with the embedded fonts that go along with it
  4. …Joe now has spent nearly a year building a one-off PDF text extraction library that (again, depending on the version of the fable) fails on 24% of the documents his company needs to access, and still doesn’t run fast enough to finish in the batch window he has to work with.

Seriously, scouts-honor, I’ve heard this story at least 5 times…and each time right before or right after the developer/company in question purchased PDFTextStream to replace their homebrew PDF library. That, my friends, is activity without progress, yak shaving at its most epic.

1 Careful and clueful readers will recognize this as little more than a distilled version of OODA, the granddaddy of all decision-making formalisms.

Automated Quality Control, Part II

In my last post about quality control, I detailed the challenges we face in testing PDFTextStream in order to minimize hard faults, and some of the patchwork testing ’strategy’ that we employed in the early days. Now, I’d like to walk you though our specific design goals and technical solutions that went into building our current automated quality control environment.

For some months, most of the PDF documents we tested PDFTextStream against were retrieved from the Internet using a collection of scripts. These scripts controlled a very simple workflow:

  1. Query for PDF URLs In this step, a search engine (usually Yahoo, simply because I like its web services API) is queried for URLs that reference PDF documents that contain a search term or phrase.
  2. Download PDFs All of the URLs retrieved from the search engine are downloaded.
  3. Test PDFs with PDFTextStream PDFTextStream is then tested against each of the PDF documents that were successfully downloaded.
  4. Report failures, suspicious results Any errors thrown by PDFTextStream are reported, along with any spurious log messages that might indicate a ’soft failure’.

This approach is solid. It makes it possible to test PDFTextStream against random collections of PDF documents, thanks to the nature of search engine results. However, while the general approach is effective in principle, our implementation of it was unenviable for some time:

  • Being a collection of scripts, the process was manual, so testing runs happened only when someone was ‘at the helm’. This involved providing query strings for the search engine access phase, nursing the downloads in various ways, and then picking through the test results (failures weren’t ‘reported’ so much as they were spit out to a log file, which then had to be grepped through in order to find interesting nuggets).
  • Since the process was manual, it couldn’t scale. That’s obviously bad, and led to significant restrictions on the number of PDF documents that could be reasonably tested in a given period. Beyond that, it led to our test box(es) sitting idle much of the time.
  • Since failures (and ’soft failures’) weren’t actually being reported or even recorded anywhere in any useful way, it was impossible (or really, really hard) to know what failures to concentrate on after the testing was finished. One always wants to focus on the bugs that are causing the most trouble, but we couldn’t readily tell which failures were most common, or even which of two different kinds of failures were more common than the other. This makes prioritizing work very difficult, and much like throwing darts blindly.

So, drawing from these lessons, we set out to design and build a quality control environment. To me, the emphasis on ‘environment’ here is shorthand for a number of qualities that the system resulting from this effort should exhibit:

  • Autonomy Each component of the environment (usually called a node) should operate asynchronously, moving through the workflow presented earlier without any intervention, assistance, or monitoring, either from other systems or components or from people.
  • Scalability Each node (and each group of nodes) should be able to saturate all available resources available to it — CPU capacity, bandwidth, disk, etc. Our aim here is to maximize the number of PDF documents PDFTextStream can be tested against in a given period, so having any resources of any kind sitting idle is simply wasteful.
  • Auditability Any any moment, we should be able to know what every node in the environment is doing, what it’s going to do next, and what it’s done since its inception. Further, we should be able to generate reports on what kinds of faults PDFTextStream has thrown, on which PDF documents, which build of PDFTextStream was used in each test, etc. This makes it very simple to determine which errors should be focussed on, and which can be put on the back burner.

Those that know such fields would recognize these design principles as being very similar to those that are relied upon in multi-agent systems or distributed computation systems programming. That is not accidental: from the start, we recognized that in order to test PDFTextStream to the degree that we thought necessary, we would need to test it against millions of PDF documents. That simply was not going to happen with any kind of manual, or even scheduled system (such as simply running those old scripts from cron). Between that requirement and the notion that we need to have multiple ‘nodes’ running simultaneously in order to utilize all of the resources we have available, it was a no-brainer to use some of the concepts that are taken for granted by those that are steeped in the multi-agent systems field, for example.

So, there’s the design goals of our automated quality control environment, in broad strokes. It retains the fundamental workflow that was implemented long ago in that patchwork of scripts, but includes design principles that make the environment efficient, manageable, and effective in terms of pushing PDFTextStream to its limits.

Automated Quality Control, Part I

Quality control is critical to the success of a business, and in turn, to the success of its customers as well. This is doubly true in the case of software businesses and products, where problems and defects are rarely obvious. In this first post in a series about Snowtide’s approach to quality control, I touch on some of the specific challenges we face in ensuring product quality.

In PDFTextStream’s early days, before we were promoting it widely as a reliable, high-performance library suitable for enterprise applications, our quality control was weak. We had a few beta users, who stumbled across errors and PDF’s that caused PDFTextStream to fail. We had a few scripts that harvested PDF’s from various places (all of the PDF’s linked from a particular webpage, for example), which we could then test against PDFTextStream. For some time in the early days, our testing and quality control ‘procedures’ were . . . weak.

Most software, especially that which is built for some constrained, well-defined purpose, can be tested very readily. Start it up, click a few buttons, browse through a few pages, run some test data, maybe do some stress testing if you have time. Does it work? Yes? Great, deploy it. Otherwise, fix the problem, and repeat.

However, because of the nature of PDF documents, the fact that PDFTextStream didn’t fail on PDF #1 didn’t mean that it wouldn’t fail on PDF #2. Further, just because PDFTextStream didn’t fail on PDF #10,000 didn’t mean it wouldn’t fail on PDF #10,001. Since there is no notion of validity when it comes to PDF documents, there is no way to say that PDFTextStream is Fully Tested™. This is completely different than the circumstances one might find when testing an XML parser, or a web server, or a web application, or an accounting utility, where the inputs and outputs are well-defined and specified.

Our circumstances are, however, very similar to those I would imagine exist when testing a new car, or a light bulb, or an iPod. There, you can only hope to minimize the likelihood of failure, since the nature of the product is such that it will fail eventually. Similarly, PDFTextStream (or any other PDF library or viewer) will never be able to claim that it will never emit an error, because the PDF document format inherently allows for a wide degree of variability.

Once you move past the notion of proving the lack of defects in a piece of software to accepting that the attainable goal is to minimize the likelihood of defects as much as possible, there is a very straightforward approach to quality control: test, test, test. Not to eliminate the possibility of faults, but to minimize their occurrence to a reliable, quantifiable percentage of the total tests performed.

So, it is with this focus that we embarked on building an automated quality control environment. Our plan was simple: build a system that will test PDFTextStream against astronomical numbers of PDF documents — orders of magnitude more than any customer would ever dream of processing — then see where PDFTextStream fails, and fix the problems.

Over the next few weeks, I’ll be posting the highlights from our experiences rolling out this automated QC environment, as well as some notable instances where failures discovered by this process have prompted significant improvements in PDFTextStream.

Update: Part II is here.