What kind of K8s pod do I really need?

I’m sure you’ve been there; by some twist of fate you’ve ended up working in a place where Kubernetes has appeared on the radar and now you’ve been dropped into a brave new world where pushing a code change involves something called a pod. More nomenclature. Shit just got real. While there’s no shortage of detailed and nuanced explorations of the plethora of resource types in both the official Kubernetes documentation and elsewhere on the internet, it’s hard (in words at least) to give a cohesive top-down view of how all these resources fit together, and why you’d choose one over the other.

With that in mind, I went and did a thing: (hope it helps somebody)


Rails thread safety… ain’t a thing

So because of reasons I’ve started writing Ruby again. That mostly makes me happy. Developers prostrate themselves at the foot of the gods of UX and customers’ acceptance criteria but often neglect the whole developer experience and how palatable their own tooling is. Despite some frankly pretty shaky underpinnings, there’s a lot to like in the Ruby ecosystem if you’re just trying to get shit done.

Anyway, just pinning a note mainly for my own benefit* about the liberal use of the ||= conditional assignment operator in Rails (or Rack-based apps in general). Go ahead and run this:

50.times do |i|
  Thread.new { $x ||= i }
end

sleep 1

puts $x

If there were no visibility or atomicity concerns, you’d always expect 0, right? Go ahead and run it a few more times. If you ran it with JRuby, you probably got a nonzero result straight away. Ok, so there’s a bit of bad news:

  1. ||= does not magically compile down to a CMPXCHG instruction or somesuch magic**. It’s just sugar for ‘if the operand is nil, assign the right-hand side; otherwise return the current value’. Multiple concurrently executing threads can therefore interleave the read and the write, and end up referencing the same address before we’re done
  2. There’s no well-defined memory model describing atomicity of variable reassignment or value visibility in Ruby; it’s all architecture-, interpreter- and interpreter-version-dependent
  3. The Rack spec gives no indication whether app.call() could be invoked concurrently or not
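To make point 1 concrete, here’s a rough desugaring; the parser does this internally, but the two-step read-then-write shape is the whole problem:

```ruby
# `$x ||= i` behaves like `$x || ($x = i)`: a read, then a conditional write.
# A thread can be preempted between those two steps, which is the race.
$x = nil
$x || ($x = 42)   # read: nil is falsy, so the write happens
first = $x        # => 42
$x || ($x = 99)   # read: 42 is truthy, so the write is skipped
second = $x       # => 42
puts [first, second].inspect  # => [42, 42]
```

Single-threaded, the two steps always run back to back and everything looks atomic; it’s only under preemption that the gap between them shows.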

Hrmph. So are we screwed? In the case of something like a Rails app where you’re knowingly concurrently mutating something, it’s good to wrap it in a Mutex; for the above example, that’d boil down to something like this:

lock = Mutex.new

50.times do |i|
  Thread.new { lock.synchronize { $x ||= i } }
end

sleep 1

puts $x

Of course, this sort of programming is horrendously deadlock/livelock prone just as in any other imperative language, regardless of type system, not to mention the fact that you just killed all the parallelism in this case; Ruby’s out-of-the-box concurrency tools are pretty coarse-grained, unfortunately.

You haven’t answered the question

Yeah, I was trying to waffle and do some veiled trolling at the same time. No, you’re not screwed. Follow a few rules of thumb and you’ll be fine:

  • Rails controllers & models seem to be reused on a per-thread basis, so if you’re sharing, e.g., class variables, you’ll at worst end up with one instance per thread (and the number of threads is normally capped by the server, which creates them preemptively)
  • Anything at the ActionPack level or lower should be expected to be called concurrently and regenerated per-request
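And if you’d rather not share mutable state at all, thread-local storage via Thread.current is the stdlib escape hatch; a small sketch (the :request_id key is just an illustrative name):

```ruby
# Each thread writes to its own slot in Thread.current, so there's nothing
# shared to race on - at the cost of no cross-thread visibility.
threads = 5.times.map do |i|
  Thread.new do
    Thread.current[:request_id] = i   # private to this thread
    Thread.current[:request_id]       # block's return value, read back safely
  end
end
results = threads.map(&:value)
puts results.sort.inspect  # => [0, 1, 2, 3, 4]
```

No thread ever sees another’s slot, which is exactly the per-thread isolation the bullet points above describe.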

Rails was declared ‘thread-safe’ around 2.2.x IIRC, but that was achieved by sticking a giant mutex around the app.call() entry point, which is kind of intellectually running away from the problem. Since then the global lock has been peeled back, but documentation on changes to the threading model in the interim seems a bit thin on the ground.

I don’t want to overtly troll too much, because the challenge of concurrent programming is definitely nontrivial; the original mutex-based cop-out is not an admonishment of the Rails framework. It’s one of those things where solid foundations matter (i.e. not using C Ruby), and one reason JRuby continues to be relevant. It took Sun, and subsequently Oracle, plus the many concurrency luminaries on their payroll, billions of dollars and man-hours to come up with the JMM: a formidable stab at guaranteeing memory consistency across platforms where instruction ordering is not guaranteed (x86, ARM, I’m looking at you). To their credit, Sun, the main vendor, readily agreed to mandate this in a portable, runtime-agnostic spec rather than opaquely baking the behaviour into their reference implementation. And even with all that goodwill, political clout and money, it took them until Java 5 to get it right. Kinda.

So yeah, congrats- you’ve beaten the odds and the threat of irrelevance and now your app has to scale; it’s time to go tackle some hard problems. Concurrency is one of them 😉


*because c’mon, nobody else fuckin’ reads this
**on MRI? What are you smoking?

Bloom filters

For some reason the subject of Bloom Filters has come up a few times recently in conversations and if you’re not solid on them, they’re definitely worth exploring both for their simplicity and because they are foundational elements in many PB-sized data stores.

Alright, what are they?

The really really short answer is that they’re a fast, space-efficient way of asking whether a thing can be found within a huge set of other things. A Bloom Filter can give one of two answers:

  1. Maybe
  2. No, definitely not

If you need to test for set membership where a set may consist of millions/billions/trillions+ of entries, they’re the go-to data structure. If you use some kind of NoSQL database, you’re probably leveraging them already.

How do they do that?

A Bloom filter is named after Burton Howard Bloom, a seemingly publicity-shy MIT grad who came up with the idea that hashing could be harnessed as a means not just of testing identity, but also set membership [i.e. can I find object x in collection y?] at arbitrary scale.

Ok, so let’s agree that a hash function takes a bunch of data and produces some number. Let’s also agree, for the purposes of this discussion, that the function is deterministic; otherwise we’ll be here all day. Most languages provide a facility for identifying primitives based on a hash, for example in Java:

"Hello".hashCode(); // will return 69609650

A Bloom Filter builds on this by fitting the output to a slot in a fixed-size bitmap, initially all zeroed out. For example, let’s say it has a size of 8*:


modding the hash value (69609650 % 8) says we should flip the bit at index 2:


What we have so far is, in essence, a crappy hash table. A Bloom Filter builds on this by using multiple hash functions; with just one, any single collision produces a false positive, whereas with k functions a false positive requires all k bit positions to collide at once.

Imagine then that we have two more hash functions whose modded outputs, for our “Hello” string, flip the bits at indices 5 and 7:
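The bookkeeping for this toy 8-slot example takes a couple of lines of Ruby; the first hash value below is Java’s real hashCode for “Hello”, while the other two are made-up stand-ins for our extra hash functions:

```ruby
bitmap = Array.new(8, 0)               # 8-slot bitmap, all zeroed
hashes = [69609650, 13, 15]            # three hash outputs for "Hello"
hashes.each { |h| bitmap[h % 8] = 1 }  # mod 8 lands on indices 2, 5 and 7
p bitmap  # => [0, 0, 1, 0, 0, 1, 0, 1]
```

Inserting an item is nothing more than setting its k bit positions; any positions that were already 1 simply stay 1.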


Devising good-quality hash functions is a black art, and potentially expensive depending on what kind of objects you’re hashing. Commonly used implementations such as Google’s Guava take an idea from a paper by Kirsch and Mitzenmacher showing that two base hash outputs h1 and h2 can be combined as h1 + i*h2, with i as the loop variable, to simulate any number of hash functions with enough entropy to do away with a bunch of unique implementations. There’s a little more to it but the core code is here.

The Test

We’re pretty much done- I was trying to build up to a great climactic dénouement but frankly, my writing style is too shoddy and Bloom Filters are too simple for that :(

There’s one last thing- how do we test for membership? We just put our object through the hash functions again. If all the bits at the corresponding indices are 1s, it’s a maybe. If any of them are still zeroed, it’s a no.

So if we see “Hello” again:


…we’re in business. Maybe. If we see “Goodbye”, and we don’t hit all ones:


…no dice.

And that really is it; the most elegant data structures are always simple. Wikipedia has some useful material on false positive probabilities if you are in the position of sizing one of these things. If you were just curious, hopefully you can see how Bloom Filters are a cheap and effective way of reducing costly lookups on main data stores- or even a probabilistic replacement for them in the case where data is infinite or can’t be replayed.
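For the curious, the whole machine fits in a few lines of Ruby. A minimal sketch using the Kirsch–Mitzenmacher combination with CRC32 and truncated MD5 as the two base hashes; the class name, sizes and hash choices here are mine, not any particular library’s:

```ruby
require 'zlib'
require 'digest'

# A toy Bloom filter. The i-th bit index for an item is (h1 + i * h2) mod m,
# so two base hashes stand in for k independent hash functions.
class BloomFilter
  def initialize(bits: 1024, hashes: 3)
    @m = bits
    @k = hashes
    @bitmap = Array.new(bits, 0)   # all zeroed, as in the walkthrough
  end

  def add(item)
    indices(item).each { |idx| @bitmap[idx] = 1 }
  end

  # true  => "maybe in the set"
  # false => "no, definitely not"
  def maybe?(item)
    indices(item).all? { |idx| @bitmap[idx] == 1 }
  end

  private

  # Two base hashes: CRC32, and the first 8 hex digits of MD5 as an integer
  def indices(item)
    h1 = Zlib.crc32(item.to_s)
    h2 = Digest::MD5.hexdigest(item.to_s)[0, 8].to_i(16)
    (0...@k).map { |i| (h1 + i * h2) % @m }
  end
end
```

With that in hand, bf = BloomFilter.new; bf.add("Hello") makes bf.maybe?("Hello") return true, while anything never added comes back false with all but vanishing probability.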

*I chose a power of 2 because many commercial implementations do this. x % y is the same as x & (y - 1) where y is a power of 2, and a single logical AND is much cheaper. The modulo operation is on a hot path, being called for every hash function in the filter; filters themselves are often called many thousands of times a second in place of brute-forcing the backing data store, so small constants like this add up.
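A quick sanity check of that identity, using the hashCode from earlier:

```ruby
h = 69609650       # Java's "Hello".hashCode()
m = 8              # power of 2
puts h % m         # => 2
puts h & (m - 1)   # => 2; same slot, but via a single AND
```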