JavaScript / CSS minification for JRuby

I didn't think I'd have to do this, but after seeing that the ruby-yui-compressor gem forks a java process every time, I thought that, whilst using JRuby, that's unequivocally Doing It Wrong™. So I wrapped the YUI Java compressor library in a gem and called it JMinify. If anybody needs to compress their assets and they're using JRuby, knock yourselves out.

Filed under  //   ruby  

Comments [0]

Parallel map: more JRuby concurrency mischief

Like my last post this is more for my future benefit but if anybody else finds it useful then that's cool too. Unlike the last one the fruits of my tinkering yielded a nice linear speedup.

Ok, let's parallelize Array#map. We'll break down the task as follows:

  1. Split the array into chunks
  2. Execute the chunks in asynchronously, in parallel, waiting for them all to complete
  3. Merge the chunks into a new array and return it

How many chunks is optimal? There's no definitive answer; In the past I've opted for a very large number of small sub-arrays, e.g. for concurrent divide & conquer reductions where the minimal array length was some low power of the number of processors (I've played with associative reduction algorithms in the past). For our #pmap method I'm just going to split the original array up into as many chunks as there are logical cores on my machine. How do you find that out in JRuby? Java to the rescue again:

Now we need a pool of workers to assign tasks to. As parallel mapping is strictly CPU bound, a thread pool with fixed thread count but an unbounded work queue is probably most appropriate:

That's our ExecutorService up & running, we just need to do a bit of housekeeping before we can write our Array#pmap method. This is where Java's baroque-complexity boilerplate rears its ugly head (wouldn't it be nice if Executors could take lambdas as arguments for mass invocation?). Basically we implement the Callable interface- I instantiate my Task implementation with a block which the executor calls when it executes:

So now we're good to go. Array#pmap here takes an executor as a first argument because I didn't want the class to be responsible for starting / shutting down the work queue, but that's just an implementation decision.

Once the original array is mapped to a set of Task classes, they can be handed to the executor. I call ExecutorService#invokeAll because it blocks until all the submitted work is done. It returns an array of FutureTasks which can be dereferenced immediately (we want the method to return something!):

So does all that tomfoolery actually buy you any more performance? Time for a highly unscientific benchmark, incrementing an array of the first million Fixnums:

Not bad- I ran it on a Sandy Bridge i5 with 2 cores and 4 CPU threads.

More important than a cheesy parallel mapping imeplementation, I've learned (or is that re-learned?) two axioms about playing with concurrency in JVM hosted languages:

  1. As in the previous post, executors expect some kind of anonymous inner class as an argument in the absence of closures. You need to be aware of the cost of converting a ruby closure to a Java Runnable, Callable, Future etc; think of I/O-bound problems where allocating extra objects for each request/event is almost certainly not a good thing. You'll definitly save time writing in straight Java as you won't have to worry whether or not the closure you just used is going to throw all your performance gains out the window.
  2. Don't discount the JVM's ability to make a fool of your benchmarks. The one above isn't worth the screen real-estate it occupies due to its simplicity and duration relative to startup / shutdown time of the VM. Both JIT and Server modes take time to hit a 'quiescent state' where performance stabilises. Don't just look at wall-time, profile your code properly. Lots has been said about this elsewhere so I won't rehash what others have articulated better, but be aware HotSpot has its own performance quirks.

The example in its entirety is here.

Filed under  //   concurrency   jruby   ruby  

Comments [0]

Easy thread safety with JRuby

EDIT: Beauty, as they say, is pain. In exchange for liberating your code from locks, piling the work onto a queue is approximately ~10x slower than a traditional approach. A quick profile suggests that the expense of block creation is non-trivial. It's still nicer to look at though, right?

More for my benefit, but it's handy that JRuby lets you use asynchronous Java queues. Want to make something like this thread-safe?

Just wrap the assignment in a Runnable (JRuby coerces blocks/Procs/lambdas into Runnables for you) and submit it to a single-threaded Executor. All calls to inc and dec execute one at a time, in-order:

Tasks are applied FIFO, @value stays consistent without locks- sound familiar?

Filed under  //   ruby  

Comments [0]

Avoiding stack overflow in Ruby with trampolines

Languages that don't support tail-call elimination out of the box can make people think twice about using recursion. Ever received a SystemStackError when doing something like this?

Ok, the example's contrived, but you've no doubt come across cases where you'd like to avoid using an iterative solution even if it's just for the sake of elegance. Luckily in Ruby this can be easily fixed by:

  1. Having your method return a thunk rather than a direct tail-call
  2. Using a trampoline to avoid growing the stack

A thunk is essentially just the suspended application of a function. Taking our example, it's just a matter of transforming the tail call like so:

The trampoline implementation is equally trivial. It takes a thunk as an argument and iteratively calls it until something other than a continuation is returned:

So now if we start the ball rolling with our trampoline call:

We can run increment as long as we like without blowing the stack. As there's no base case it'll continue indefinitely- a worst-case example. Using continuation-passing here trades a time-penalty in higher-order function calls for the advantage of not increasing the amount of space needed to perform the computation. Not too hard really, is it?

Alternatively we could use Kernel#callcc to replay the computation until we have a return value. Here we get a continuation object to use with callcc { ... }, give @k the correct execution context by reassigning the thunk as before, then call it until we're done:

If you want to get a proper grasp of CPS you could do a lot worse than get hold of Dybvig's book and go through all the call/cc examples.

Filed under  //   ruby  

Comments [2]

SI- Because rvm and rbenv are overkill

It's said that more than any other programming community Lispers tend to dislike working on problems that have already been solved. But this dogmatic avoidance of duplication doesn't seem to have permeated the rubyists' zeitgeist to the same degree.

I'm not in the habit of regularly compiling ruby interpreters for the hell of it- it's something I'd do maybe twice a year tops. I recently rebuilt rubinius and ruby 1.9 for Lion and that'll probably be it for 6 months. If you keep all your interpreters in userland and you aren't such an idiot that you can't install them yourself, how is it that the nebulous concept of a ruby version manager has come into vogue?

Switching interpreters shouldn't be any more complex than having a script to symlink the executable paths for you:

A couple of clarifications:

  • I install all my rubies into ~/.ruby/
  • The current ruby gets all its executables symlinked into ~/.ruby/current/bin
  • I added ~/.ruby/current/bin onto the end of my $PATH
  • That is all

I have it saved in /usr/local/bin/si. SI is short for 'switch interpreter'. A monkey can use it, although none have been known to do so.

The hash of interpreters is hard-coded and I can't even remember what else I did to get it working but that's not the point. It's a 15 minute job I tackled and forgot about 2 years ago. Let's use some initiative, stop writing version managers and do some bloody work.

Filed under  //   diatribe   ruby  

Comments [0]