On Defensive Programming

tl;dr: unless you work in an environment where the direct consequences of a production bug are death or sanction by the SEC, please do not do it.

030328-M-0000X-005 Southern Iraq (Mar. 27, 2003) -- Kuwaiti firefighters fight to secure a burning oil well in the Rumaila oilfields, set ablaze by Iraqi military forces. Efforts are underway to extinguish fires and protect the region from environmental disaster. Protection of these fields is considered critical to the Iraqi people in support of Operation Iraqi Freedom. Operation Iraqi Freedom is the multi-national coalition effort to liberate the Iraqi people, eliminate Iraqi's weapons of mass destruction, and end the regime of Saddam Hussein. U.S. Marine Corps photo by Pfc. Mary Rose Xenikakis. (RELEASED)

The longer version is based on something I repeat like a stuck record: when you’re developing something from scratch, make it correct, then make it fast. A corollary to the correct -> fast thing is that a misbehaving program should make its failure mode obvious so it can be remediated.

Yeah, I’m mostly talking about Java and in particular a couple of things I’ve seen lately which seem to be informed by either:

  • a misunderstanding of how ExecutorService implementations handle unchecked exceptions
  • an obsession with returning *something* from a non-void method, even if something is null or “ihavenoidea”

1. Exceptions

O.K, let’s be clear about this:  ExecutorService implementations (and for that matter, Jetty queuedthreadpools) will swallow up unchecked exceptions and carry on as before. You do not have to worry about OS thread death. You do not have to worry about suppressing some exception you think may happen. Let it spew a stack trace because that gives somebody who picks up a bug some clue about how a specific routine failed.

If your component relies on a client (HTTP, database, whatever) then it’s fair-game to use a try/catch block here, because I don’t care what library your implementation uses; bubbling up checked exceptions here is lazy and also may become redundant if you change your implementation. So do this:

try {
    client.makeCall(someArgument);
} catch (IOException e) {
    log.error("Couldn't make call!", e);
    throw new RuntimeException(e);
}

// or if you like Guava and javac does too!
try {
    client.makeCall(someArgument);
} catch (IOException e) {
    log.error("Couldn't make call!", e);
    Throwables.propagate(e);
}

Basically, propagate internal checked exceptions as some kind of runtime exception. This leverages a questionable property of Java, but the alternatives are:

  1. force client code to handle null or “?WTF?”
  2. force client code to handle IOException or TotallyUnrelatedInternalException

If the underlying failure is unrecoverable anyway, don’t bubble it up at the level of the compiler. Let the battle-tested logic inside your executor or application server handle shit hitting fans.

2. Return values

Let’s say we have an interface like this:

public interface IStripper {
    
    String stripThing(String originalThing);
}

What do you expect null or generally bad input to do? Something like this?

public GoodStripperImpl implements IStripper {
    
    @Override
    public String stripThing(String originalString) {
        return originalString.substring(0, 8);
    }
}

Or how about this?

public PoorStripperImpl implements IStripper {
    
    @Override
    public String stripThing(String originalString) {
        return originalString == null ? "???" : originalString.substring(0, 8);
    }
}

How much fun would it be to root-cause a bug when all you have to go on is fucking “???”? Again, if the call-site is invoked inside a solid thread-pool implementation, this kind of pessimism is not only unwarranted but suppresses the nub of the issue. And if you’re confident that a null string is possible input, how about an empty one, or one that’s too short? Or contains characters in an encoding you didn’t expect? This springs to mind.

Saving bad input like this is subtle and insidious: it can circumvent the guarantees given by interfaces and if a call chain is written by the same person who wrote the original implementation, you can end up with a pile of code the depends on null-ness or ‘???’-ness, which is a fantastic way of ending up with spaghetti in Java. Please refrain from doing it. Even worse, don’t make this kind of handiwork wake someone up at 4am only for them to realise a useful stack trace has been replaced with a distant NPE and a garbled string.

Why are you venting about this now?

Because I felt like it and I hit some arbitrary style-I-don’t-like threshold. But also because I’ve worked in Erlang shops where the opportunity cost of even minuscule downtime involved conversations with the CFO; it had a similar gravity to a misreporting fuckup. You often hear people in these circles discussing error-kernels* and the let-it-crash philosophy, and I know this approach is gaining some traction. It makes sense that microservices take on some of those traits. Either way, it’s totally at-odds with the notion of defensive programming.

Don’t try and catch failure based on some poorly informed conjecture, expect every single method call to error and have something solid (e.g. java.util.concurrent, Jetty, actors, whatever) take responsibility for it.


*Actually that Erlang programming rules page is pretty much universally applicable

Leave a Reply

Your email address will not be published. Required fields are marked *