Chris Mowforth http://chris.mowforth.com And we are the dreamers of dreams posterous.com

A quick preview of the actor API
Fri, 18 May 2012 02:15:00 -0700
http://chris.mowforth.com/a-quick-preview-of-the-actor-api

Since the actor / future API is starting to mature, I thought it'd be nice to show what concurrent programming in JavaScript actually looks like. To begin with, here's the canonical actor ping-pong example; two actors collaboratively incrementing a counter without explicitly sharing any state:

// Shamelessly pulled from the Rubinius example
// at http://rubini.us/doc/en/systems/concurrency/
var ping = new actor(function(msg) {
  while (true) {
    if (msg === 1000) {
      console.log(msg);
      break;
    }
    this.reply(msg++);
  }
});

var pong = new actor(function(msg) {
  while (true) {
    if (msg === 1000) {
      console.log(msg);
      break;
    }
    this.reply(msg++);
  }
});

ping.send(1);

You can also safely hot-swap the message handler inside an actor. When you call upgrade, the supplied function gets pushed onto an internal stack and used as the new handler. Calling downgrade pops it off and uses the previous function (or the original one if you're at the bottom of the stack).

var x = new actor(function(msg) {
  console.log('Foo!');
});

x.send('something'); // => 'Foo!'

x.upgrade(function(msg) {
  console.log('Bar!');
});

x.send('something'); // => 'Bar!'

x.downgrade();

x.send('something'); // => 'Foo!'

Because message processing is mutually exclusive (any given actor will only process one message in its mailbox at any one time), this is generally quite safe to perform on a live system: no restarts! I say generally because messages passed to actors must be immutable, and neither JavaScript nor Java can enforce that strongly enough to stop the ignorant or the strong-willed from throwing caution to the wind. For now, you just have to be disciplined. But it's a promising start, anyway.
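
The upgrade / downgrade stack described above can be sketched in a few lines. This is a Ruby stand-in written purely for illustration, not Rhinode's implementation: the class name SwappableHandler and the method deliver are invented here, and a Mutex stands in for the actor's one-message-at-a-time guarantee.

```ruby
# Hypothetical sketch: a handler stack with mutually exclusive delivery.
class SwappableHandler
  def initialize(&original)
    @original = original   # handler used when the stack is empty
    @stack    = []         # handlers pushed by upgrade
    @lock     = Mutex.new  # one message (or one swap) at a time
  end

  # Push a new handler; it takes over for subsequent messages.
  def upgrade(&handler)
    @lock.synchronize { @stack.push(handler) }
    self
  end

  # Pop back to the previous handler, or the original at the bottom.
  def downgrade
    @lock.synchronize { @stack.pop }
    self
  end

  # Run the current handler on msg and return its result.
  def deliver(msg)
    @lock.synchronize { (@stack.last || @original).call(msg) }
  end
end

x = SwappableHandler.new { |_msg| 'Foo!' }
puts x.deliver('something')
x.upgrade { |_msg| 'Bar!' }
puts x.deliver('something')
x.downgrade
puts x.deliver('something')
```

Because a swap and a delivery take the same lock, a handler can never be replaced halfway through processing a message, which is the property that makes hot-swapping tolerable on a live system.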

http://files.posterous.com/user_profile_pics/1183432/egypt-image-2-770731601.jpg http://posterous.com/users/5erAnzdAY4cF Chris Mowforth m0wfo Chris Mowforth
Rhinode can talk to the internets!
Thu, 03 May 2012 12:56:00 -0700
http://chris.mowforth.com/rhinode-can-talk-to-the-internets

I've held off talking about Rhinode up until now, primarily because going into detail about the actor system, STM and concurrent I/O in JavaScript is all a bit meaningless without a concrete example to flesh it out. Well that's no longer an impediment, because Rhinode now has a structured and extensible (albeit incomplete) event system and a TCP library emulating the core functions from the net module in node.js. What does this mean in practice? For starters it means running this example from the node.js documentation no longer throws a stack trace:

var net = require('net');
var server = net.createServer(function(c) { //'connection' listener
  console.log('server connected');
  c.on('end', function() {
    console.log('server disconnected');
  });
  c.write('hello\r\n');
  c.pipe(c);
});
server.listen(8124, function() { //'listening' listener
  console.log('server bound');
});

More profoundly, each 'connection' event gets executed inside an actor whose work can be distributed across Rhinode instances, all transparently to the developer. Put another way, I've hit one of my short-term goals for this little toy project: to provide an almost seamless way for developers to transplant their JS to the JVM, scaling to multiple cores in the process, without having to resort to the somewhat less refined tools currently available to node.js. Happy hacking. Lots more to come...

GitHub repo is here.

Installing Solaris 11 from a customized PXE server
Thu, 03 May 2012 01:15:00 -0700
http://chris.mowforth.com/installing-solaris-11-from-a-customized-pxe-s

While the Oracle documentation goes to great lengths explaining the process of setting up a netboot server in Solaris, if you're bootstrapping off a machine running another flavour of UNIX/Linux, this isn't particularly helpful. Ok, go:

The basic strategy is as follows:

  1. Set up the PXE server
  2. Drop the Solaris grub binary, kernel and boot archive into the server
  3. Set up an HTTP server to deliver the remainder of the system

First make sure you have a functional PXE server. This article covers most of the fundamental points, just ignore the bit about dropping a Debian kernel. I'm using dnsmasq for DHCP/DNS resolution and tftpd-hpa for my TFTP server. For the purposes of this exercise I'll assume the root of the TFTP server is /tftpboot.

When you've done that, grab a copy of the Solaris Automated Install .iso (referred to as AI in the Oracle doc). Extract its contents somewhere. Our PXE server's only role is to serve up the customized Oracle grub binary and kernel so the client can load a minimal Solaris subsystem. For that to happen we need to copy boot/platform, boot/grub/pxegrub and boot/grub/menu.lst to the root of the tftp server. The /tftpboot substructure should look something like this:

/tftpboot
../boot
..../grub/menu.lst
..../grub/pxegrub
../platform
..../i86pc/amd64/boot_archive
..../i86pc/kernel/amd64/unix

And the grub config will look something like this (make sure the IP and path in the last argument point to where the system will be downloaded over http):

timeout 60

default 0

title Solaris
kernel /tftpboot/platform/i86pc/kernel/amd64/unix -B install_media=http://192.168.1.2/solaris
module /tftpboot/platform/i86pc/amd64/boot_archive
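
If the server address or webroot path changes often, the stanza can be generated rather than edited by hand. The following is only a sketch in Ruby: the helper name grub_menu and its keyword arguments are made up for illustration, and the output simply mirrors the config shown above.

```ruby
# Hypothetical helper: builds the menu.lst text for a Solaris netboot entry.
def grub_menu(server_ip:, http_path:, tftp_root: '/tftpboot', timeout: 60)
  # install_media must point at the extracted .iso served over http
  media = "http://#{server_ip}/#{http_path}"
  <<~MENU
    timeout #{timeout}

    default 0

    title Solaris
    kernel #{tftp_root}/platform/i86pc/kernel/amd64/unix -B install_media=#{media}
    module #{tftp_root}/platform/i86pc/amd64/boot_archive
  MENU
end

puts grub_menu(server_ip: '192.168.1.2', http_path: 'solaris')
```

Keeping the IP and path in one place makes it harder for the kernel argument and the web server location to drift apart.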

Make sure dnsmasq knows to serve up the grub image for PXE boots. Note that forcing a 150 option down the wire seems to be crucial for SunOS; maybe it was just the machines I tested it on:

dhcp-option-force=150,/tftpboot/boot/grub/menu.lst # This seems to be crucial
dhcp-boot=/tftpboot/boot/grub/pxegrub,netbootserver,192.168.1.1

Now you need to serve up the bulk of the OS over http. Use whatever web server you prefer (I'm using nginx) and drop the extracted .iso folder into the webroot (I renamed it solaris). Double check you can reach the URL referred to in the grub config.

At this stage you should be good to go; the AI installs a basic headless system. If you want gnome, you'll have to run:

pfexec pkg install solaris-desktop

I find netbooting straight to grub gives you the flexibility to configure & boot any number of OSes easily without having to mess around with the tftp or dhcp config; just download a netbootable kernel of whatever operating system you need, then add a couple more lines to your grub config.

DNS resolution using JNDI
Sun, 22 Apr 2012 11:56:00 -0700
http://chris.mowforth.com/dns-resolution-using-jndi

This is the kind of area where standing on the shoulders of JDK-giants pays dividends: the JNDI interface for looking up DNS records is surprisingly simple:

import java.util.Hashtable;
import javax.naming.NamingEnumeration;
import javax.naming.NamingException;
import javax.naming.directory.Attribute;
import javax.naming.directory.Attributes;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;

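class DNSLookup {

   // Lazily created JNDI context, shared by every call to resolve()
   private static DirContext provider;
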
   private static void setupProvider() {
       Hashtable<String,String> env = new Hashtable<String, String>();
       env.put("java.naming.factory.initial", "com.sun.jndi.dns.DnsContextFactory");
       try {
         provider = new InitialDirContext(env);
       } catch (NamingException e) { System.out.println(e); }
   }

   public static String[] resolve(String domain, String record) {
       if (provider == null) setupProvider();

       try {
           Attributes query = provider.getAttributes(domain, new String[] { record });
           Attribute records = query.get(record);
           NamingEnumeration recordData = records.getAll();
           int size = records.size();
           String[] data = new String[size];
           int i = 0;
           while (i < size) {
               data[i] = recordData.next().toString();
               i++;
           }
           return data;
       } catch (NamingException e) {
           System.out.println(e);
           return null;
       }
   }

}

Create a context, sling in a domain and record type, and off you go. Another reason why implementing rhinode has involved so little work on my part... I've not looked, but I suspect the original node.js implementation is a little longer than 40 LOC :)

JavaScript / CSS minification for JRuby
Tue, 21 Feb 2012 19:56:00 -0800
http://chris.mowforth.com/javascript-css-minification-for-jruby

I didn't think I'd have to do this, but after seeing that the ruby-yui-compressor gem forks a process every time, I decided that, on JRuby, that's unequivocally Doing It Wrong™. So I wrapped the YUI Java compressor library in a gem and called it JMinify. If anybody needs to compress their assets and they're using JRuby, knock yourselves out.

Parallel map: more JRuby concurrency mischief
Sun, 25 Dec 2011 13:31:00 -0800
http://chris.mowforth.com/parallel-map-more-jruby-concurrency-mischief

Like my last post this is more for my future benefit but if anybody else finds it useful then that's cool too. Unlike the last one the fruits of my tinkering yielded a nice linear speedup.

Ok, let's parallelize Array#map. We'll break down the task as follows:

  1. Split the array into chunks
  2. Execute the chunks asynchronously, in parallel, waiting for them all to complete
  3. Merge the chunks into a new array and return it

How many chunks is optimal? There's no definitive answer; in the past I've opted for a very large number of small sub-arrays, e.g. for concurrent divide & conquer reductions where the minimal array length was some low power of the number of processors (I've played with associative reduction algorithms in the past). For our #pmap method I'm just going to split the original array up into as many chunks as there are logical cores on my machine. How do you find that out in JRuby? Java to the rescue again:

$cores = java.lang.Runtime.getRuntime.availableProcessors

Now we need a pool of workers to assign tasks to. As parallel mapping is strictly CPU bound, a thread pool with fixed thread count but an unbounded work queue is probably most appropriate:

java_import java.util.concurrent.Executors
queue = Executors.newFixedThreadPool($cores)

That's our ExecutorService up & running; we just need to do a bit of housekeeping before we can write our Array#pmap method. This is where Java's baroque-complexity boilerplate rears its ugly head (wouldn't it be nice if Executors could take lambdas as arguments for mass invocation?). Basically we implement the Callable interface: I instantiate my Task implementation with a block which the executor calls when it executes:

class Task
  include Callable

  def initialize(&block)
    @work = block
  end

  def call
    @work.call
  end
  
end

So now we're good to go. Array#pmap here takes an executor as a first argument because I didn't want the class to be responsible for starting / shutting down the work queue, but that's just an implementation decision.

Once the original array is mapped to a set of Task instances, they can be handed to the executor. I call ExecutorService#invokeAll because it blocks until all the submitted work is done. It returns a list of Futures which can be dereferenced immediately (we want the method to return something!):

class Array

  def pmap(executor, &block)
    # Round the chunk size up so we never ask for zero-length slices
    # (each_slice(0) raises) and never produce more chunks than cores
    slice_size = [(size / $cores.to_f).ceil, 1].max

    # Parcel out the work into chunks, each executed sequentially
    tasks = each_slice(slice_size).map do |slice|
      Task.new { slice.map(&block) }
    end

    # Execute them all, block until they're done
    results = executor.invokeAll(tasks)

    # Dereference and merge all the Futures
    results.reduce([]) { |memo, obj| memo + obj.get }
  end

end
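
For comparison, the same split / execute / merge shape can be sketched with plain Ruby threads instead of a Java executor. This is an illustrative variant, not the code benchmarked below: threaded_map and its workers argument are names invented here. On MRI the global interpreter lock stops threads from running Ruby code in parallel, so any speedup for CPU-bound work should only be expected on a runtime with truly parallel threads such as JRuby.

```ruby
# Split an array into at most `workers` chunks, map each chunk in its own
# thread, then stitch the per-chunk results back together in order.
def threaded_map(array, workers = 4, &block)
  return [] if array.empty?

  # Round up so the chunk size is never zero
  slice_size = (array.size / workers.to_f).ceil

  threads = array.each_slice(slice_size).map do |slice|
    Thread.new { slice.map(&block) }
  end

  # Thread#value joins the thread and returns the block's result
  threads.flat_map(&:value)
end

p threaded_map((1..10).to_a, 3) { |n| n + 1 }
```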

So does all that tomfoolery actually buy you any more performance? Time for a highly unscientific benchmark, incrementing an array of the first million Fixnums:

chris@think-chris:~/Documents/Experimentation$ jruby pmap.rb
"Splendid new Array#pmap"
"That took 0.287166921s"
"Plain old sequential Array#map"
"That took 0.537166921s"

Not bad: I ran it on a Sandy Bridge i5 with 2 cores and 4 CPU threads.

More important than a cheesy parallel mapping implementation, I've learned (or is that re-learned?) two axioms about playing with concurrency in JVM hosted languages:

  1. As in the previous post, executors expect some kind of anonymous inner class as an argument in the absence of closures. You need to be aware of the cost of converting a ruby closure to a Java Runnable, Callable, Future etc; think of I/O-bound problems where allocating extra objects for each request/event is almost certainly not a good thing. You'll definitely save time writing in straight Java as you won't have to worry whether or not the closure you just used is going to throw all your performance gains out the window.
  2. Don't discount the JVM's ability to make a fool of your benchmarks. The one above isn't worth the screen real-estate it occupies due to its simplicity and duration relative to startup / shutdown time of the VM. Both JIT and Server modes take time to hit a 'quiescent state' where performance stabilises. Don't just look at wall-time, profile your code properly. Lots has been said about this elsewhere so I won't rehash what others have articulated better, but be aware HotSpot has its own performance quirks.
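
One cheap step towards the second point is to run the code a few times untimed before measuring, and to look at several timings rather than one. The sketch below is an assumption-laden illustration, not the benchmark used above: timed_runs, warmup and runs are names made up here, and it is no substitute for a proper profiler.

```ruby
require 'benchmark'

# Run the block `warmup` times untimed, then return the wall-clock
# timings (in seconds) of `runs` measured executions.
def timed_runs(warmup: 5, runs: 10)
  warmup.times { yield }
  Array.new(runs) { Benchmark.realtime { yield } }
end

data  = (1..100_000).to_a
times = timed_runs { data.map { |n| n + 1 } }
puts "min #{times.min.round(6)}s, median #{times.sort[times.size / 2].round(6)}s"
```

Reporting the minimum and median rather than a single run makes it easier to spot when the first few iterations are dominated by startup and compilation.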

The example in its entirety is here.

Easy thread safety with JRuby
Thu, 22 Dec 2011 13:22:00 -0800
http://chris.mowforth.com/thread-safe-attrwriters-in-jruby

EDIT: Beauty, as they say, is pain. In exchange for liberating your code from locks, piling the work onto a queue is roughly 10x slower than a traditional approach. A quick profile suggests that the expense of block creation is non-trivial. It's still nicer to look at though, right?

More for my benefit, but it's handy that JRuby lets you use asynchronous Java queues. Want to make something like this thread-safe?

class Something

  def initialize(value)
    @value = value
  end

  # Not thread-safe

  def inc
    @value += 1
  end

  def dec
    @value -= 1
  end

end

Just wrap the assignment in a Runnable (JRuby coerces blocks/Procs/lambdas into Runnables for you) and submit it to a single-threaded Executor. All calls to inc and dec execute one at a time, in-order:

require 'java'

java_import java.util.concurrent.Executors

class Something

  def initialize(value)
    @queue = Executors.newSingleThreadExecutor
    @value = value
  end

  def inc
    @queue.execute { @value += 1 }
  end

  def dec
    @queue.execute { @value -= 1 }
  end

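  # Ruby never calls a method named finalize on its own, so callers
  # must invoke this explicitly to stop the executor thread
  def shutdown
    @queue.shutdown
  end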
end

Tasks are applied FIFO and @value stays consistent without locks. Sound familiar?
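
The same single-consumer idea can be sketched without any Java classes, using Ruby's own Queue and one worker thread. This is an illustrative stand-in rather than the JRuby code above: the class name Counter and the :stop sentinel are choices made for this sketch.

```ruby
require 'thread'

# A single worker thread drains a queue of lambdas, so @value is only
# ever modified from that one thread.
class Counter
  def initialize(value)
    @value  = value
    @queue  = Queue.new
    @worker = Thread.new do
      # Queue#pop blocks until a job (or the :stop sentinel) arrives
      while (job = @queue.pop) != :stop
        job.call
      end
    end
  end

  def inc
    @queue << -> { @value += 1 }
  end

  def dec
    @queue << -> { @value -= 1 }
  end

  # Let pending jobs finish, stop the worker and return the final value.
  def shutdown
    @queue << :stop
    @worker.join
    @value
  end
end

counter = Counter.new(0)
8.times.map { Thread.new { 1000.times { counter.inc } } }.each(&:join)
puts counter.shutdown
```

As in the executor version, inc and dec return immediately; the updated value is only guaranteed to be visible once the queued work has been drained.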

Notable omissions in the James-Joyce CS Section
Mon, 05 Dec 2011 10:33:00 -0800
http://chris.mowforth.com/notable-omissions-in-the-james-joyce-cs-secti

Don't get me wrong: the JJ has most of the works you'll need as a CS undergrad, especially if Java's the only thing you're exposed to (although what that says about the reader I leave as an exercise). However it's not that hard to push the boundaries and find a few glaring seminal works either omitted completely or severely lacking in quantity.

It's not sufficient to have a solitary copy of a significant text and 20 copies of a lesser one when the latter is not only objectively better but more likely to age well, a particularly acute problem in a CS section. "Underwater basket-weaving Synergies with Java" or whatever might get mediocre kids through exams in 2011, and maybe that's all the CSI department cares about, in which case I'm wasting my time convincing them that shelling out for a few more copies of SICP every so often would get a decent return on investment. But on the off-chance that they give a shit about something other than producing fodder for cube-farms I'm compiling a list of what I think is currently found wanting.

An omission qualifies as the absence or shortage of physical copies on the shelf; I don't care if they have a link to an electronic copy on the catalogue search, I'm referring to dead trees:

When I find any more I'll update the list.

Avoiding stack overflow in Ruby with trampolines
Tue, 29 Nov 2011 11:22:00 -0800
http://chris.mowforth.com/avoiding-stack-overflow-in-ruby-with-trampoli

Languages that don't support tail-call elimination out of the box can make people think twice about using recursion. Ever received a SystemStackError when doing something like this?

def increment(num=0)
  increment(num + 1)
end

Ok, the example's contrived, but you've no doubt come across cases where you'd like to avoid using an iterative solution even if it's just for the sake of elegance. Luckily in Ruby this can be easily fixed by:

  1. Having your method return a thunk rather than a direct tail-call
  2. Using a trampoline to avoid growing the stack

A thunk is essentially just the suspended application of a function. Taking our example, it's just a matter of transforming the tail call like so:

return lambda { increment(num + 1) }

The trampoline implementation is equally trivial. It takes a thunk as an argument and iteratively calls it until something other than a continuation is returned:

def trampoline(&thunk)
  thunk = thunk.call while thunk.respond_to?(:call)
  thunk
end

So now if we start the ball rolling with our trampoline call:

trampoline { increment }

We can run increment as long as we like without blowing the stack. As there's no base case it'll continue indefinitely: a worst-case example. Using continuation-passing here trades a time-penalty in higher-order function calls for the advantage of not increasing the amount of space needed to perform the computation. Not too hard really, is it?
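
To see the trampoline actually terminate, it helps to give the recursion a base case. The sketch below reuses the trampoline from above; count_up and its limit argument are hypothetical names added for this example.

```ruby
# Like increment, but stops at `limit` and returns a plain number
# instead of another thunk.
def count_up(num = 0, limit = 100_000)
  return num if num >= limit
  lambda { count_up(num + 1, limit) }
end

# Keep calling thunks until something that isn't callable comes back.
def trampoline(&thunk)
  thunk = thunk.call while thunk.respond_to?(:call)
  thunk
end

puts trampoline { count_up }
```

Each call to count_up returns before the next one starts, so the stack depth stays constant however large the limit is; a directly recursive version of the same depth would typically raise SystemStackError.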

Alternatively we could use Kernel#callcc to replay the computation until we have a return value. Here we get a continuation object to use with callcc { ... }, give @k the correct execution context by reassigning the thunk as before, then call it until we're done:

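require 'continuation' # needed for callcc on Ruby 1.9 and later

def trampoline(&thunk)
  callcc { |k| @k = k }   # remember this point so we can jump back to it
  thunk = thunk.call
  @k.call if thunk.respond_to?(:call)
  thunk
end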

If you want to get a proper grasp of CPS you could do a lot worse than get hold of Dybvig's book and go through all the call/cc examples.

LiveConnect on OS X Lion
Mon, 31 Oct 2011 22:23:00 -0700
http://chris.mowforth.com/liveconnect-on-os-x-lion

For anybody who's ever had to get java <-> javascript interaction working but was wondering why Apple's JDK couldn't find netscape.javascript.JSObject, the elusive plugin.jar file is kept in /Library/Java/Home/lib. If you install it into your local maven repository you can get yourself into all kinds of trouble writing applets in clojure.

The disruptor pattern for the uninitiated
Mon, 31 Oct 2011 10:39:00 -0700
http://chris.mowforth.com/the-disruptor-pattern-for-the-uninitiated

The concept of a single-threaded business-logic processor (BLP) which feeds from, and into concurrent and durable disruptors started a large discussion on HN the other day. The LMAX technical paper gives a more elaborate explanation of the architecture but if you can't face reading through it then consider this discussion on StackOverflow.

I can see its utility in a problem domain like trading platforms, but I don't think it's a good solution for embarrassingly parallel applications like signal processing, cryptography etc.

Better living through fib functions
Mon, 03 Oct 2011 08:31:00 -0700
http://chris.mowforth.com/better-living-through-fib-functions-55461

...or why posting benchmarks of a fib function in language X:

  1. is a slightly puerile approach to discussing performance and scalability
  2. ignores any number of reasons why people would choose node.js, other than throwing out received knowledge for the hell of it

But with the discussion raging (it now appears Haskell is the cure after all) I thought I'd venture a clojure solution:

(ns fib.core
  (:use [lamina.core]
        [aleph.http]))

(defn fib-seq []
  ((fn rfib [a b]
     (cons a (lazy-seq (rfib b (+ a b)))))
       0 1))

(defn response []
  (str (reduce + (take 40 (fib-seq)))))

(defn better-living-thru-fib [channel request]
  (enqueue channel {:status 200
                    :headers {"Content-Type" "text/plain"}
                    :body (response)}))

(start-http-server better-living-thru-fib {:port 1337})

And the requisite benchmark of questionable utility:

chris-mac:fib chris$ time curl http://localhost:1337
real 0m0.047s
user 0m0.014s
sys 0m0.008s

With that we've proved:

  1. A fib function can indeed be written in clojure, and its return value piped to a web browser
  2. Um, that's about it

Presumably the OP will be enraged to find that clojure also has its own web server (piggybacking off aleph here). That said I didn't realise retro CGI scripting was so avant-garde in 2011; guess I'm not 'deck' enough to realise but then I don't have a fixie or an asymmetric haircut either :/

Really I haven't dabbled with node much because I just can't warm to javascript; I find the language constructs native to lisp dialects (mostly clojure and scheme) much more elegant and alluring, but it's a subjective thing. Having grown accustomed to continuations, first-class concurrency primitives, pure functions, lambdas etc, JS (and by implication node) feels like distinctly barren linguistic territory.

I only wrote this post because I was at a loose end but it is surely proof by contradiction that this dialogue achieves nothing. You need more than contrived benchmarks to qualify the shortcomings in a language. Leave them to college kids like me with nothing better to do ;)

Useful reading material for STM
Sun, 02 Oct 2011 10:01:00 -0700
http://chris.mowforth.com/useful-reading-material-for-stm

Mainly for my benefit but if anybody else is interested:

I'll add more once I get through this lot.

SI- Because rvm and rbenv are overkill
Wed, 17 Aug 2011 16:57:00 -0700
http://chris.mowforth.com/si-because-rvm-and-rbenv-are-overkill

It's said that more than any other programming community Lispers tend to dislike working on problems that have already been solved. But this dogmatic avoidance of duplication doesn't seem to have permeated the rubyists' zeitgeist to the same degree.

I'm not in the habit of regularly compiling ruby interpreters for the hell of it- it's something I'd do maybe twice a year tops. I recently rebuilt rubinius and ruby 1.9 for Lion and that'll probably be it for 6 months. If you keep all your interpreters in userland and you aren't such an idiot that you can't install them yourself, how is it that the nebulous concept of a ruby version manager has come into vogue?

Switching interpreters shouldn't be any more complex than having a script to symlink the executable paths for you:

#!/Users/chris/.ruby/current/bin/ruby

RUBY_HOME = ENV['RUBY_HOME']

RUBIES = {:jruby => 'jruby-1.6.0', :rubinius => 'rubinius/1.2', :mri => 'mri19'}
GEM_EXECUTABLES = {:rubinius => 'gems/bin', :mri => ''}

if ARGV.empty?
  p 'Available interpreters:'
  RUBIES.each { |key,value| p key }
  exit
end

ruby = ARGV.first.to_sym # the hashes above are keyed by symbol

update_symlinks = lambda { |ruby|
  ruby_path = RUBY_HOME + '/' + RUBIES[ruby]
  current_bin_path = RUBY_HOME + '/current/bin'
  
  executables = Dir.glob("#{ruby_path}/bin/*")

  system("rm -rf #{current_bin_path}/*")

  executables.each do |x|
    system("ln -s #{x} #{current_bin_path}/")
  end
  
  gem_bins = "#{ruby_path}/#{GEM_EXECUTABLES[ruby]}"
  gem_bin_path = RUBY_HOME + '/current/gem_bin'
  
  system("rm -rf #{gem_bin_path}")
  
  system("ln -s #{gem_bins} #{gem_bin_path}")
}

# Actually perform the switch
update_symlinks.call(ruby)

A couple of clarifications:

  • I install all my rubies into ~/.ruby/
  • The current ruby gets all its executables symlinked into ~/.ruby/current/bin
  • I added ~/.ruby/current/bin onto the end of my $PATH
  • That is all
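
The symlinking step can also be done with Ruby's FileUtils instead of shelling out to rm and ln, which sidesteps quoting problems if a path contains spaces. This is only a sketch of that one step: relink_bin is a hypothetical helper, not part of the script above.

```ruby
require 'fileutils'

# Hypothetical helper: make current_bin contain one symlink per
# executable found in ruby_bin, replacing whatever was there before.
def relink_bin(ruby_bin, current_bin)
  FileUtils.mkdir_p(current_bin)
  FileUtils.rm_f(Dir.glob(File.join(current_bin, '*')))
  Dir.glob(File.join(ruby_bin, '*')).each do |exe|
    FileUtils.ln_s(exe, File.join(current_bin, File.basename(exe)))
  end
end
```

With the layout described above it would be called along the lines of relink_bin("#{ENV['RUBY_HOME']}/mri19/bin", "#{ENV['RUBY_HOME']}/current/bin").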

I have it saved in /usr/local/bin/si. SI is short for 'switch interpreter'. A monkey can use it, although none have been known to do so.

The hash of interpreters is hard-coded and I can't even remember what else I did to get it working but that's not the point. It's a 15 minute job I tackled and forgot about 2 years ago. Let's use some initiative, stop writing version managers and do some bloody work.

More scripting with Clojure
Mon, 15 Aug 2011 16:37:00 -0700
http://chris.mowforth.com/more-scripting-with-clojure

The old man wanted to pay $49.99 for some p.o.s shareware app to compare and remove duplicate files in Windows. Seeing this as something of a challenge, I got a stopwatch and went to see what alternative I could come up with in half an hour or so:

(ns compo.core
  (:use [clojure.set :only [difference]])
  (:gen-class))

(defn checksum [file]
  (let [input (java.io.FileInputStream. file)
        digest (java.security.MessageDigest/getInstance "MD5")
        stream (java.security.DigestInputStream. input digest)
        bufsize (* 1024 1024)
        buf (byte-array bufsize)]

  (while (not= -1 (.read stream buf 0 bufsize)))
  (apply str (map (partial format "%02x") (.digest digest)))))

(defn list-dir [dir]
  (reverse (sort-by #(.lastModified %)
                    (remove #(.isDirectory %)
                            (file-seq (java.io.File. dir))))))

(defn find-dupes [root]
  (prn "Computing checksums...")
  (let [files (list-dir root)]
    (let [summed (zipmap (pmap #(checksum %) files) files)]
      (difference
       (into #{} files)
       (into #{} (vals summed))))))

(defn remove-dupes [files]
  (prn (str "Found " (count files) " duplicate files which can be removed."))
  (doseq [f files]
    (prn (str (.toString f) " - [y/n]"))
    (if-let [choice (= (read-line) "y")]
      (.delete f))))

(defn -main [& args]
  (if (empty? args)
    (println "Enter a root directory")
    (remove-dupes (find-dupes (first args))))
  (System/exit 0))

Turns out in clojure, the answer is just enough to do the job.

Sometimes less is more (thanks to this post for computing an md5 on a file). And throwing in pmap performs the checksumming in parallel without any extra thought required. Job.
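
For readers more at home in Ruby, the core of the approach (checksum every file, keep one per checksum, report the rest) can be sketched as follows. This is a loose translation written for illustration, not a port of the program above: the function name find_dupes is reused for convenience, it keeps the oldest file in each group, and unlike pmap it computes the checksums one at a time.

```ruby
require 'digest'

# Return the files under root whose MD5 matches an older file.
def find_dupes(root)
  files = Dir.glob(File.join(root, '**', '*')).select { |f| File.file?(f) }

  files.group_by { |f| Digest::MD5.file(f).hexdigest }
       .values
       # Within each group keep the oldest file (path breaks ties) and
       # report every other member as a duplicate
       .flat_map { |group| group.sort_by { |f| [File.mtime(f), f] }.drop(1) }
end
```

Something like find_dupes(ARGV.first).each { |f| puts f } would list the candidates; deleting them is left to the caller, as in the prompt-driven remove-dupes above.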

Installing Grand Central Dispatch on Linux
Sun, 31 Jul 2011 02:29:00 -0700
http://chris.mowforth.com/installing-grand-central-dispatch-on-linux

I've been curious about getting libdispatch and the blocks runtime compiling in Ubuntu for some time but naively guessed that it wouldn't be for the faint-hearted since the only useful result yielded by Google was this thread on SO. How wrong I was!

For this exercise I pulled the iso for natty server from the ubuntu site and did a fresh install in a VirtualBox VM. The only extra I added during install was an SSH daemon so I could use the terminal on my Mac.

Libdispatch needs llvm/clang, libkqueue and the blocks runtime which are already available through apt-get in natty so let's install them.

You'll also need libpthread-workqueue0. I had to download the .deb packages from oneiric here but they installed without any hassle:

# Core dependencies
sudo apt-get install clang libblocksruntime-dev libkqueue-dev

# Libpthread
sudo dpkg -i libpthread-workqueue0_0.7-1ubuntu1_i386.deb
sudo dpkg -i libpthread-workqueue-dev_0.7-1ubuntu1_i386.deb

I compiled libdispatch itself from source. Don't grab it from MacOSforge, download the tarball used to make the .deb package for oneiric. The installation will also need make, autoconf, autogen and libtool. Just to save yourself any hassle trying to solve missing header issues, install the build-essential package and gcc-multilib while you're at it:

sudo apt-get install make autoconf autogen libtool build-essential gcc-multilib

You should be set to compile libdispatch now. There should already be a configure file in the 'libdispatch' root folder so let's install:

CC=clang ./configure
make
sudo make install

Make sure you force clang as the compiler, not gcc. The end of the install message should tell you where the libs were installed; for me it was /usr/local/lib. Ubuntu rather stupidly ignores (or erases) the LD_LIBRARY_PATH variable if you set it through the shell, so add a file under /etc/ld.so.conf.d/ containing this path if you don't have one already, then run sudo ldconfig to refresh the linker cache.

You should have Grand Central set up by now. Let's write a hello world programme and see if it works:

#include <dispatch/dispatch.h>
#include <stdio.h>

int main() {
  dispatch_queue_t queue = dispatch_queue_create(NULL, NULL);

  dispatch_sync(queue, ^{
    printf("Hello, world from a dispatch queue!\n");
  });

  dispatch_release(queue);

  return 0;
}

And to compile:

clang -o hi hello.c -fblocks -ldispatch

And if all is well you should get the message printed to the screen when you run ./hi. I haven't tried anything computationally intensive yet, and I wouldn't be surprised if there's a performance discrepancy between Linux and FreeBSD / OS X, since on Linux libkqueue is just a wrapper around epoll(). But hello world works, and that's half the battle, right?

 

Fri, 29 Jul 2011 10:38:00 -0700 CMFunctionalAdditions- multicore ruby-like utilities for Objective-C http://chris.mowforth.com/cmfunctionaladditions-multicore-ruby-like-uti-64127

Hot on the heels of the last post, I've decided to say a bit more about CMFunctionalAdditions. As I mentioned last time, the project is the result of functional and syntactic-sugar withdrawal symptoms. In a nutshell, CMFunctionalAdditions is my take on being able to call the following in Objective-C without altering the receiver:

  • Map
  • Map with index (each_with_index.map...)
  • Reduce (inject, fold, whatever)
  • Filter (select)
  • Remove (reject, delete, whatever)
  • Partition (the ruby flavour, not the clojure flavour)
  • Unique
  • Flatten
  • Take (take_while)
  • tbc

Additionally, many of the linear-time methods above run in O(n ∕ P) time or better with Grand Central Dispatch, where P is the number of processors. I want concurrency to be transparent to the end user wherever possible. A picture speaks a thousand words, so to rip the examples from the CLI demo...

// We're using this array to play with:
NSArray* sample = [NSArray arrayWithObjects:@"foo",
                                            @"bar",
                                            @"baz",
                                            @"teapot",
                                            nil];


// Shall we map them so f(x) -> "SOMETHING#{x}"?
NSArray* mapped = [sample mapWithBlock:^id(id obj) {
    return [NSString stringWithFormat:@"SOMETHING%@", obj];
}];


// Let's map x with its index
NSArray* mappedIndex = [sample mapWithIndexedBlock:
  ^id(NSUInteger idx, id obj) {
      // NSUInteger varies in width between platforms, so cast it for %lu
      return [NSString stringWithFormat:@"%@ AT INDEX %lu",
                                        obj,
                                        (unsigned long)idx];
}];


// How about reducing it?
id reduced = [sample reduceWithBlock:^id(id memo, id obj) {
    return [NSString stringWithFormat:@"%@-%@", memo, obj];
} andAccumulator:@""];


// But we don't like teapots; let's remove them
BOOL (^discriminator)(id obj) = ^(id obj) {
  return [obj isEqual:@"teapot"];
};

NSArray* teapotFree = [sample removeWithPredicate:
                              discriminator];


// Even better, let's segregate them
NSArray* segregated = [sample partitionWithBlock:
                              discriminator];


// Now let's break *sample up into
// an array of 2-element NSArrays
NSArray* chunked = [sample splitWithSize:2];

The framework currently requires Snow Leopard or better but Lion might become a prerequisite (see last post). iOS 4 should work but I haven't tried it. I'm also attempting compilation on Linux as this is being written. As the only dependencies are Foundation and libdispatch (compiled with llvm) it'd be a nice bonus.

Proper documentation is in the pipeline. Any feedback would be appreciated.

Sun, 24 Jul 2011 06:25:00 -0700 Parallel reduction in Objective-C http://chris.mowforth.com/parallel-reduction-in-objective-c

Ruby and Clojure have spoilt me and coming back to Objective-C has starved me of a heap of nice ways to manipulate collections.

Born out of this frustration and a drive to use GCD and blocks for something useful, I've started creating my own framework of categories to add into NSArray, NSDictionary and who knows what else right now (that's for another post).

After toiling for an afternoon and realising that you can now create explicitly concurrent dispatch queues in Lion, I managed to work out a divide & conquer reduce algorithm for associative functions:

- (id)reduceWithBlock:(id (^)(id memo, id obj))block andAccumulator:(id)accumulator withBaseLength:(NSUInteger)baseLength
{
    // The accumulator seeds every sub-reduction, so it should be the
    // identity value for the block (0 for addition, @"" for joining strings)
    __block id acc = [[accumulator copy] autorelease];
    NSUInteger base_job_length = baseLength;
    if (!baseLength) {
        // Calculate the base job length
        NSUInteger num_processors = [[NSProcessInfo processInfo] processorCount];
        base_job_length = (NSUInteger)([self count] / sqrt(num_processors)); // L = n / sqrt(P)
        // Guard against a zero-length base case for very small arrays
        if (base_job_length == 0) base_job_length = 1;
    }
    
    if ([self count] <= base_job_length) {
        // If the size of the array is <= base case, reduce it serially
        for (id obj in self) { acc = block(acc, obj); }
    } else {
        // If the array length is > base case, divide & conquer
        NSArray* sub_jobs = [self splitIntoSubArraysOfLength:base_job_length];
        
        dispatch_queue_t result_queue = dispatch_queue_create(NULL, DISPATCH_QUEUE_CONCURRENT);
        
        dispatch_apply([sub_jobs count], result_queue, ^(size_t i) {
            // Reduce each sub-array independently...
            id partial = [[sub_jobs objectAtIndex:i] reduceWithBlock:block andAccumulator:accumulator withBaseLength:base_job_length];
            // ...then fold it into the shared result one thread at a time,
            // otherwise concurrent iterations race on acc
            @synchronized (sub_jobs) {
                acc = block(acc, partial);
            }
        });
        dispatch_release(result_queue);
    }
    
    return acc;
    
}

I mentioned OS X Lion; specifically, look at the dispatch_queue_create call:

// In Snow Leopard we'd be stuck with this:
dispatch_queue_t result_queue = dispatch_queue_create(NULL, NULL);

// In Lion we can set the queue to dispatch work concurrently:
dispatch_queue_t result_queue = dispatch_queue_create(NULL, DISPATCH_QUEUE_CONCURRENT);

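Since we assume the user has provided an associative function, the array can be carved up and each piece reduced independently, which means we can use the shiny new DISPATCH_QUEUE_CONCURRENT constant in 10.7 to send work off in a parallel manner (thanks for the tip, Kazuki). Strictly speaking, not caring about the order in which the recursive calls return also requires the function to be commutative, since the partial results are folded in completion order.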

For this to work you need a means of splitting the array into sub-arrays of ever decreasing size, until they hit a limit where it's practical to reduce them serially:

- (NSArray*)splitIntoSubArraysOfLength:(NSUInteger)size
{
    __block NSMutableArray* sub;
    NSUInteger count = [self count];
    NSUInteger remainder = count % size;
    // One sub-array per full `size` elements, plus one more for any leftovers
    NSUInteger total = (count / size) + (remainder ? 1 : 0);
    
    sub = [NSMutableArray arrayWithCapacity:total];
    
    // A serial queue: the iterations run one at a time, so index i is
    // always the next free slot in `sub`
    dispatch_queue_t result_queue = dispatch_queue_create(NULL, NULL);
    
    dispatch_apply(total, result_queue, ^(size_t i) {
        NSUInteger step = i * size;
        NSUInteger upper_bound = size;
        // The final sub-array may be shorter than `size`
        if (step + upper_bound > count) upper_bound = count - step;
        NSRange range = NSMakeRange(step, upper_bound);
        NSArray* sub_array = [self subarrayWithRange:range];
        
        [sub insertObject:sub_array atIndex:i];
    });
    
    dispatch_release(result_queue);
    
    return sub;
}

Right now I'm naïvely assuming that the smallest unit of work is when a sub-array satisfies the inequality:

l ≤ n ∕ P² ; {l, n, P} ⊂ ℤ ∖ {0}

Where l is the base case, n is the length of the original NSArray to be conquered and P is the number of processors.
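
To put rough numbers on that, take n = 1,000,000 elements on a P = 4 core machine. These are purely illustrative figures, not measurements:

```latex
l \le \frac{n}{P^{2}} = \frac{10^{6}}{4^{2}} = 62\,500
\qquad\Longrightarrow\qquad
\left\lceil \frac{n}{l} \right\rceil = 16 \text{ sub-arrays at the top level}
```

Note that the reduce listing above actually divides by sqrt(P) rather than P², which for the same figures gives a base length of 500,000 and only two sub-arrays.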

Given the fact that n will typically be vastly greater than P on most personal computers and mobile devices you'd intuitively think that mandating a relatively large base case is the sensible thing to do- the smaller the job, the more time is spent recursively splitting arrays and dispatching jobs rather than doing any real work.

I'm purposefully omitting any benchmarking because you know what Disraeli said, but from my somewhat confined set of experiments I'm seeing a ≃1.4x increase in execution speed of associative functions. This isn't production quality and won't be making its way into CMFunctionalAdditions but it illustrates the potential of higher-order functions and Grand Central Dispatch in Objective-C.

Mon, 25 Apr 2011 05:20:00 -0700 Prüfer Sequence Algorithm in Clojure http://chris.mowforth.com/prufer-sequence-algorithm-in-clojure

Because implementing it is the best way of studying for an exam. Yeah. Ok, I'll go and do some work now.

Edit: Thanks to Rasmus Svensson for suggesting I store the adjacent vertices as sets.

(defn prufer
  "Convert a labelled tree to a Prüfer Sequence."
  ([graph] (prufer graph []))
  ([graph prufer-sequence]
    (loop [g graph p prufer-sequence]
      (if (> (count g) 2) ; A Prüfer sequence is always of length n - 2
        ;v finds the lowest-valued leaf node
        (let [v (apply min-key first (filter #(= 1 (count (last %))) g))]
          ; Remove the leaf's key and its occurrences in the edge sets,
          ; then append its neighbour's label to the end of Prüfer sequence p.
          (recur
            (into {} (for [[k value] (dissoc g (first v))] [k (disj value (first v))]))
            (concat p [(last (last v))])))
        p))))


; Trivial 5-vertex graph where the keys represent labels and
; values represent each node's neighbours i.e. a vertex with one neighbour is a leaf
(def g {1 #{2 3}, 2 #{1 4}, 3 #{1 5}, 5 #{3}, 4 #{2}})

(println (prufer g))

Wed, 13 Apr 2011 12:49:00 -0700 Selective reinvention of the wheel http://chris.mowforth.com/selective-reinvention-of-the-wheel

It is no measure of health to be well adjusted to a profoundly sick society. (Jiddu Krishnamurti)

Why do programmers hemorrhage endless man-hours setting up some kind of CMS, configuring servers and toying with the intricacies of themes just to write the occasional blog post? It's not like any IT pro I'd actually want to have a pint with ordinarily needs to placate their narcissism that much. Without wanting it to sound like a shameless plug for Tumblr, Posterous et al, aren't blogs as a service (new buzzword, eh?) a classic example of the 80/20 rule? I fail to see what value you add to your views by wasting your weekends on some batshit Rails config just so you can masochistically watch it fall over if you ever find yourself at the top of Hacker News.

That is all.
