On interviewing

I drafted this prior to the present “Hi, I’m x but can’t do y on a whiteboard” thing, but it now seems especially prescient. Go Prescience!

tl; dr

It’s a bit more nuanced than that tweet, but this writer thinks you should know, if not have completely instant recall of some stuff. The programming interview is partially broken, but with a bit of discipline and transparency on the part of interviewers, we can make it better. This one won’t be ‘finished’; it’s something I’ll keep polishing, because that’s the nature of tech interviews.

?

I wonder how many hits the Wikipedia bubble sort page got after that. Anyway… I’ve worked in a few shops now, and I’m almost embarrassed to say how many tech interviews I’ve been in, either as an interlocutor or a witness where the questions posed are worryingly inconsistent or the rush to give everyone a feel for candidate x spooks them. Due to current restrictions on my freedom to write, these technical rants now cost me two laywers’ time in London / NY to make sure I’m on solid legal ground and as such, I’m not going to divulge any particular techniques or habits used when testing candidates for technical roles. What I can say, borne out of the frustration of failed interview loops is what it would do you no harm to learn, regardless of the tech role you’re seeking. It’s a much less exhaustive list than you think; the difference is you have to want it / give a shit. But if you really do want to jump into an engineering gig, then go for it (and keep trying to go for it if it doesn’t work out first time round); life’s too damn short to wonder what if. This industry, laden with 20-something WASPish males is crying out for people to stop wondering what if. Alright Christophe, what should I know?

Big- Oh*

I’m not sadistic enough to ask for proofs on a whiteboard, but one of the things about operating at Web Scale is that you become resource-constrained again. One of the responsibilities that come with this interesting challenge is being cognisant of how much time and/or space something takes up when a million people are doing it, as opposed to when 5 of your best friends are doing it to fake some usage stats for a VC 😉

For example, this takes O(1) time:

var x = 0;

But this takes O(n) time:

for (var i = 0; i < n; i++) {
// do something with n[i] probably
}

e.t.c. I really like people who have a handle on how much time/space something uses at the back of their heads.

When you operate at web-scale you also have to know when something just ain’t feasible; the Big-O cheat sheet will enlighten you if you’re not already sure. Basically, you can absorb most of the knowledge you need to cross the threshold of employability from a decent undergrad DS textbook (Sedgewick, Skienna, and of course Knuth if you’re feeling flush). It’s probably helpful to note that none of them really require more than high-school math.

Data Structures

I bet I flunked more DS modules than you. I fucking hated them. If you have the emotional intelligence or cynicism to tell when a lecturer is blindly reciting a {McGraw Hill / Pearson / whatever} slide deck, it can be a handicap, because:

  1. said lecturer is in no way enthused, and you know it
  2. said lecturer may / may not have instant recall of many of the the structures themselves, and you can sense it

I saw through those cowboys instantly. In fact, I had so little time for them that I looked up their pay scales and figured out that my ETF trading on its own was sufficient to negate the benefits of trying to climb any academic ladder. And so I failed and then grudgingly passed…

With all that said and despite the pervasiveness of horrendous teaching, there are few utterly obvious ones you should have instant recall of, and a couple you should at least be aware of from textbooks. The mandatory, do-not-waste-my-time-otherwise ones:

  • Fixed-size arrays
  • Stacks
  • Linked lists (just learn this parrot-fashion if it comes to it, somebody may ask you how to implement one an interview). Hint: if you look at a Lisp/Scheme implementation, they might even look elegant. The best data structures always are.
  • Maps/dictionaries/hash tables/whatever you wanna call them. How do you hash a thing into a bucket and look it up again? If your first taste of programming was in a dynamic language like JS, Ruby, Python, e.t.c you’ve probably taken this for granted. Lots of places spend lots of time, and pay lots of people lots of money to optimise these kinds of things. Sexier now?

There are some data structures / algorithms where I wouldn’t bother rote-learning the implementation details but would take care to be aware of:

  • Structures with a total-ordering; at least be aware of something like a red-black or AVL tree, even if you can’t recall how you add / remove an element from one from first principles. If you know about skip lists, I probably want to hire you already.
  • Probably something about sorting but that seems to be a hardball question even for ‘senior’ engineers these days, so relax. I would suggest you at least know merge sort, just because it introduces you to the world of theoretically embarrassingly parallel algorithms, and it’s just very elegant. I care about elegance.
  • ANYTHING about persistent data structures (e.g. HAMTs, binomial heaps, any casual reference to Okasaki) puts you ahead of 99.9% of your competition.

By the way, you should understand the time / space complexities for all the aforementioned structures; some job specs may say something about asymptotic analysis, which in practice means: “can you tell me the big-O for some operation?”. While you can learn them parrot-fashion, dragging yourself through an implementation, or rolling your own will stick it in your head for longer (forever?). If somebody says, “how does x perform?” in an interview, that’s your cue.

Bits & bytes

There are 8 bits in a byte. Just fucking learn that. Super-softball question, super easy way to stick a nail in your coffin. It’s like a free ‘5 bucks for turning up’ kind of thing.

Bits & bytes questions are somewhat contingent on byte ordering and platform, e.g. 32 / 64 bit; one of the virtues of hosted languages like Java is that nowadays you can assume 64 bit memory width, everything being signed and big-endian memory addressing. Also remember all primitive Java types have a well-defined size. Whatever language you’re using, you should know the sizes of these fundamental types. So if somebody asks you how you turn the integer -123 into 123, you can confidently say it’s ~i + 1. Also if somebody asks how you represent a signed number in Java, make sure you convert if necessary:

TYPE x = blah;
UNSIGNED TYPE z = y & x.SIZE;

Also make sure you get bit-shifting, because if you do, again you put yourself ahead of a bunch of people who passed this by in college. This kinda thing is crucial when you need to hash something, Let’s say you want to take the upper and lower 32 bits of a Java long:


long x = ...;
int y = (int) x;
int y = (int) x >>> 32;

Make sense? No? First, go learn the sizes of these fundamental types (here they are in Java), then go understand bit-shifting. Ask on SO or something- it really isn’t as terrifying as you think.

Sexy things

Of course, they all have lovely structures. But regardless of whether you’re a seasoned pro or a grad, there are certain things that the market, given the current hardware/business/political constraints likes right now. They’re not necessarily significantly more complex than the stuff you’ll learn in a rigorous CS course, but the market just places more value on them right now:

Probabilistic data structures

That’s a frightening way of describing something like:

  • Skip lists
  • Bloom filters
  • Count-min-sketch

Scary names, but they’re a) beautiful and b) likely to scare the shit out of the bottom 50% of interviewers- in which case you saved yourself a lot of time and worked out you’re too good for that place anywho.

Async <ANYTHING!>

Yeah, really. But the problem with async something is, unless you’re working for a startup or agency which harbours perpetual greenfield projects, it typically has to fit in with a big investment in language / framework X.

One common question I ask when I see a node-heavy CV is, “when wouldn’t you use node?”, trying to imply a differentiation between certain workloads (you see how, apart from gut-feeling, this may pass you by unless you’re comfortable with Big-O?). I don’t think I’ve really had a decent solid answer from any candidates. The glib answer is to go here and read another of my diatribes. The more sane suggestion is to get your head around an event notification system like epoll and understand how it differs from its predecessors. The even less instructive but more reliable suggestion is to develop a large-scale system using a nonblocking framework like node from the ground-up and see what problems you run into. If you’ve done that, expect some questions about the above from a decent place. If they just quiz you on the latest trends on npm then you are not working with top-tier developers.

More?

For the most part, the ideal tech interview should be reassuringly boring but reflect the fact that while constraints and technologies change, the underlying maths and CS don’t (much). If you’re being hit with a tonne of riddles or things which were open computer-science problems for long periods of time (i.e. with non-obvious ways of decomposing the problem), don’t feel bad- the interviewer probably just wanted to make themselves feel smart. Things like detecting cycles in linked lists fall into this category.

Please also understand a huge caveat with this shopping list of CS topics- I believe improving your grasp of these things will multiply your chances of acing a tech interview but algorithmic ability and problem decomposition are necessary but not sufficient prerequisites for becoming a great programmer; it’s unlikely, for example, that you’ll go through an interview loop where your solid whiteboard implementation of a linked list is undone by an admission that you’ve never used something like LISP or ML in anger, even though in my experience such programmers tend to produce qualitatively better code in more mundane but commonplace languages like Java or Python. I still think that an understanding of things like parsers, compilers, consensus algorithms plus FP and pointers underpin what mostly qualifies as great software, open or otherwise. But that’s for another post or ten…

This is hopefully something of a living document, so I’ll add/amend bits as necessary.


 

*not that one. Grow up 😉