Chris Mowforth http://chris.mowforth.com And we are the dreamers of dreams posterous.com
Tue, 21 Feb 2012 19:56:00 -0800 JavaScript / CSS minification for JRuby http://chris.mowforth.com/javascript-css-minification-for-jruby

I didn't think I'd have to do this, but after seeing that the ruby-yui-compressor gem forks a java process every time, I thought that, whilst using JRuby, that's unequivocally Doing It Wrong™. So I wrapped the YUI Java compressor library in a gem and called it JMinify. If anybody needs to compress their assets and they're using JRuby, knock yourselves out.

http://files.posterous.com/user_profile_pics/1183432/egypt-image-2-770731601.jpg http://posterous.com/users/5erAnzdAY4cF Chris Mowforth m0wfo Chris Mowforth
Sun, 25 Dec 2011 13:31:00 -0800 Parallel map: more JRuby concurrency mischief http://chris.mowforth.com/parallel-map-more-jruby-concurrency-mischief

Like my last post this is more for my future benefit but if anybody else finds it useful then that's cool too. Unlike the last one the fruits of my tinkering yielded a nice linear speedup.

Ok, let's parallelize Array#map. We'll break down the task as follows:

  1. Split the array into chunks
  2. Execute the chunks asynchronously, in parallel, waiting for them all to complete
  3. Merge the chunks into a new array and return it

How many chunks is optimal? There's no definitive answer; in the past I've opted for a very large number of small sub-arrays, e.g. for concurrent divide & conquer reductions where the minimal array length was some low power of the number of processors (I've played with associative reduction algorithms in the past). For our #pmap method I'm just going to split the original array up into as many chunks as there are logical cores on my machine. How do you find that out in JRuby? Java to the rescue again:

require 'java'
$cores = java.lang.Runtime.getRuntime.availableProcessors
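
A rough sketch of the chunking arithmetic in plain Ruby, for anyone without a JVM to hand. Etc.nprocessors stands in for the Java call above, and chunk_size is a hypothetical helper that is not part of the original example. Rounding up is an assumption on my part; it keeps the slice size from hitting zero on short arrays.

```ruby
require 'etc'

# Hypothetical helper: size of each slice when splitting `length`
# items across `workers` chunks. Rounds up and never returns 0.
def chunk_size(length, workers)
  [(length.to_f / workers).ceil, 1].max
end

cores  = Etc.nprocessors            # logical cores, as reported by the OS
data   = (1..10).to_a
chunks = data.each_slice(chunk_size(data.size, cores)).to_a

puts "#{cores} cores, #{chunks.size} chunks"
```

With this rounding the number of chunks never exceeds the number of cores, though the last chunk may be shorter than the rest.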

Now we need a pool of workers to assign tasks to. As parallel mapping is strictly CPU bound, a thread pool with fixed thread count but an unbounded work queue is probably most appropriate:

java_import java.util.concurrent.Executors
queue = Executors.newFixedThreadPool($cores)

That's our ExecutorService up & running, we just need to do a bit of housekeeping before we can write our Array#pmap method. This is where Java's baroque-complexity boilerplate rears its ugly head (wouldn't it be nice if Executors could take lambdas as arguments for mass invocation?). Basically we implement the Callable interface- I instantiate my Task implementation with a block which the executor calls when it executes:

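java_import java.util.concurrent.Callable

# Wraps a Ruby block so an executor can run it as a Callable
class Task
  include Callable

  def initialize(&block)
    @work = block
  end

  # Invoked by the executor on one of its worker threads
  def call
    @work.call
  end

end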

So now we're good to go. Array#pmap here takes an executor as a first argument because I didn't want the class to be responsible for starting / shutting down the work queue, but that's just an implementation decision.

Once the original array is mapped to a set of Task instances, they can be handed to the executor. I call ExecutorService#invokeAll because it blocks until all the submitted work is done. It returns a list of Futures which can be dereferenced immediately (we want the method to return something!):

class Array

  def pmap(executor, &block)
    return [] if empty?

    # Round the slice size up so it is never zero, even for short arrays
    slice_size = (size.to_f / $cores).ceil

    # Parcel out the work into chunks; each chunk is mapped sequentially
    tasks = each_slice(slice_size).map do |slice|
      Task.new { slice.map(&block) }
    end

    # Execute them all, block until they're done
    results = executor.invokeAll(tasks)

    # Dereference and merge all the Futures
    results.reduce([]) { |memo, obj| memo + obj.get }
  end

end
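
For comparison, here is a minimal sketch of the same split / execute / merge shape using plain Ruby threads in place of the ExecutorService. threaded_pmap and its workers parameter are hypothetical names, not part of the original example. On MRI the global interpreter lock means a CPU-bound block won't actually get faster this way; the sketch only shows the structure.

```ruby
# Hypothetical stand-in for Array#pmap using plain Ruby threads
# instead of a Java ExecutorService.
def threaded_pmap(items, workers, &block)
  return [] if items.empty?
  slice_size = [(items.size.to_f / workers).ceil, 1].max

  # One thread per chunk; each chunk is mapped sequentially.
  threads = items.each_slice(slice_size).map do |slice|
    Thread.new { slice.map(&block) }
  end

  # Thread#value blocks until the thread finishes, much like Future#get,
  # and collecting in thread order preserves the original ordering.
  threads.flat_map(&:value)
end

p threaded_pmap((1..10).to_a, 4) { |x| x + 1 }
```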

So does all that tomfoolery actually buy you any more performance? Time for a highly unscientific benchmark, incrementing an array of the first million Fixnums:

chris@think-chris:~/Documents/Experimentation$ jruby pmap.rb
"Splendid new Array#pmap"
"That took 0.287166921s"
"Plain old sequential Array#map"
"That took 0.537166921s"

Not bad- that's roughly a 1.9x speedup, on a Sandy Bridge i5 with 2 cores and 4 CPU threads.

More important than a cheesy parallel mapping implementation, I've learned (or is that re-learned?) two axioms about playing with concurrency in JVM hosted languages:

  1. As in the previous post, executors expect some kind of anonymous inner class as an argument in the absence of closures. You need to be aware of the cost of converting a ruby closure to a Java Runnable, Callable, Future etc; think of I/O-bound problems where allocating extra objects for each request/event is almost certainly not a good thing. You'll definitely save time writing in straight Java as you won't have to worry whether or not the closure you just used is going to throw all your performance gains out the window.
  2. Don't discount the JVM's ability to make a fool of your benchmarks. The one above isn't worth the screen real-estate it occupies due to its simplicity and duration relative to startup / shutdown time of the VM. Both JIT and Server modes take time to hit a 'quiescent state' where performance stabilises. Don't just look at wall-time, profile your code properly. Lots has been said about this elsewhere so I won't rehash what others have articulated better, but be aware HotSpot has its own performance quirks.
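
One way to act on the second point is to warm the code up before timing it and to look at more than a single run. The sketch below is only an illustration: warmed_median and its warmups / runs parameters are hypothetical, and it is no substitute for a proper profiler.

```ruby
require 'benchmark'

# Hypothetical helper: run the block a few times untimed so the JIT
# can warm up, then return the median of several timed runs (seconds).
def warmed_median(warmups: 5, runs: 7)
  warmups.times { yield }
  timings = Array.new(runs) { Benchmark.realtime { yield } }
  timings.sort[runs / 2]
end

data = (1..100_000).to_a
puts format('map: %.6fs', warmed_median { data.map { |x| x + 1 } })
```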

The example in its entirety is here.

Thu, 22 Dec 2011 13:22:00 -0800 Easy thread safety with JRuby http://chris.mowforth.com/thread-safe-attrwriters-in-jruby

EDIT: Beauty, as they say, is pain. In exchange for liberating your code from locks, piling the work onto a queue is approximately 10x slower than a traditional approach. A quick profile suggests that the expense of block creation is non-trivial. It's still nicer to look at though, right?

More for my benefit, but it's handy that JRuby lets you use asynchronous Java queues. Want to make something like this thread-safe?

class Something

  def initialize(value)
    @value = value
  end

  # Not thread-safe

  def inc
    @value += 1
  end

  def dec
    @value -= 1
  end

end

Just wrap the assignment in a Runnable (JRuby coerces blocks/Procs/lambdas into Runnables for you) and submit it to a single-threaded Executor. All calls to inc and dec execute one at a time, in-order:

require 'java'

java_import java.util.concurrent.Executors

class Something

  def initialize(value)
    @queue = Executors.newSingleThreadExecutor
    @value = value
  end

  def inc
    @queue.execute { @value += 1 }
  end

  def dec
    @queue.execute { @value -= 1 }
  end

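  # Ruby never calls this automatically; call it yourself when you're
  # done, otherwise the executor's thread keeps the JVM alive
  def finalize
    @queue.shutdown
  end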
end

Tasks are applied FIFO, @value stays consistent without locks- sound familiar?
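
The same single-consumer idea can be sketched without the JVM. In the version below a Queue drained by one worker thread plays the role of the single-threaded executor; QueuedCounter and its shutdown method are hypothetical names chosen for illustration, not a port of anything in JRuby.

```ruby
# Hypothetical plain-Ruby analogue: a Queue drained by one worker
# thread plays the role of the single-threaded executor.
class QueuedCounter
  def initialize(value)
    @value  = value
    @jobs   = Queue.new
    @worker = Thread.new do
      # Jobs run one at a time, in FIFO order; nil is the stop signal.
      while (job = @jobs.pop)
        job.call
      end
    end
  end

  def inc
    @jobs << -> { @value += 1 }
  end

  def dec
    @jobs << -> { @value -= 1 }
  end

  # Signal the worker to stop once the queued jobs are done,
  # wait for it, and return the final value.
  def shutdown
    @jobs << nil
    @worker.join
    @value
  end
end

counter = QueuedCounter.new(0)
10.times.map { Thread.new { 100.times { counter.inc } } }.each(&:join)
puts counter.shutdown # prints 1000
```

Only the worker thread ever touches @value, which is why no lock is needed around the increments.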

Mon, 05 Dec 2011 10:33:00 -0800 Notable omissions in the James-Joyce CS Section http://chris.mowforth.com/notable-omissions-in-the-james-joyce-cs-secti

Don't get me wrong- the JJ has most of the works you'll need as a CS undergrad, especially if Java's the only thing you're exposed to (although what that says about the reader I leave as an exercise). However it's not that hard to push the boundaries and find a few glaring seminal works either omitted completely or severely lacking in quantity.

It's not sufficient to have a solitary copy of a significant text and 20 copies of a lesser one when the latter is not only objectively better but more likely to age well, a particularly acute problem in a CS section. "Underwater basket-weaving Synergies with Java" or whatever might get mediocre kids through exams in 2011, and maybe that's all the CSI department cares about, in which case I'm wasting my time convincing them that shelling out for a few more copies of SICP every so often would get a decent return on investment. But on the off-chance that they give a shit about something other than producing fodder for cube-farms I'm compiling a list of what I think is currently found wanting.

An omission qualifies as the absence or shortage of physical copies on the shelf; I don't care if they have a link to an electronic copy on the catalogue search, I'm referring to dead trees:

When I find any more I'll update the list.

Tue, 29 Nov 2011 11:22:00 -0800 Avoiding stack overflow in Ruby with trampolines http://chris.mowforth.com/avoiding-stack-overflow-in-ruby-with-trampoli

Languages that don't support tail-call elimination out of the box can make people think twice about using recursion. Ever received a SystemStackError when doing something like this?

def increment(num=0)
  increment(num + 1)
end

Ok, the example's contrived, but you've no doubt come across cases where you'd like to avoid using an iterative solution even if it's just for the sake of elegance. Luckily in Ruby this can be easily fixed by:

  1. Having your method return a thunk rather than a direct tail-call
  2. Using a trampoline to avoid growing the stack

A thunk is essentially just the suspended application of a function. Taking our example, it's just a matter of transforming the tail call like so:

return lambda { increment(num + 1) }

The trampoline implementation is equally trivial. It takes a thunk as an argument and iteratively calls it until something other than a thunk (anything that doesn't respond to call) is returned:

def trampoline(&thunk)
  thunk = thunk.call while thunk.respond_to?(:call)
  thunk
end

So now if we start the ball rolling with our trampoline call:

trampoline { increment }

We can run increment as long as we like without blowing the stack. As there's no base case it'll continue indefinitely- a worst-case example. Using continuation-passing here trades a time-penalty in higher-order function calls for the advantage of not increasing the amount of space needed to perform the computation. Not too hard really, is it?
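
To see the trampoline return a value, here is a small sketch with a base case. sum_to is a hypothetical example function; the trampoline is the one defined above, repeated so the snippet runs on its own.

```ruby
# Same trampoline as above.
def trampoline(&thunk)
  thunk = thunk.call while thunk.respond_to?(:call)
  thunk
end

# Hypothetical example: sum 1..n with an accumulator, returning a
# thunk in place of each tail call.
def sum_to(n, acc = 0)
  return acc if n.zero?
  lambda { sum_to(n - 1, acc + n) }
end

puts trampoline { sum_to(100_000) } # prints 5000050000
```

Each call to sum_to returns straight away, so the stack never gets deeper than a couple of frames however large n is.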

Alternatively we could use Kernel#callcc to replay the computation until we have a return value. Here we get a continuation object to use with callcc { ... }, give @k the correct execution context by reassigning the thunk as before, then call it until we're done:

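require 'continuation' # needed for callcc on Ruby 1.9+

def trampoline(&thunk)
  callcc { |k| @k = k }                # capture the point to jump back to
  thunk = thunk.call
  @k.call if thunk.respond_to?(:call)  # replay until we get a plain value
  thunk
end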

If you want to get a proper grasp of CPS you could do a lot worse than get hold of Dybvig's book and go through all the call/cc examples.

Mon, 31 Oct 2011 22:23:00 -0700 LiveConnect on OS X Lion http://chris.mowforth.com/liveconnect-on-os-x-lion

For anybody who's ever had to get java <-> javascript interaction working but was wondering why Apple's JDK couldn't find netscape.javascript.JSObject, the elusive plugin.jar file is kept in /Library/Java/Home/lib. If you install it into your local maven repository you can get yourself into all kinds of trouble writing applets in clojure.

Mon, 31 Oct 2011 10:39:00 -0700 The disruptor pattern for the uninitiated http://chris.mowforth.com/the-disruptor-pattern-for-the-uninitiated

The concept of a single-threaded business-logic processor (BLP) which feeds from, and into concurrent and durable disruptors started a large discussion on HN the other day. The LMAX technical paper gives a more elaborate explanation of the architecture but if you can't face reading through it then consider this discussion on StackOverflow.

I can see its utility in a problem domain like trading platforms, but I don't think it's a good solution for embarrassingly parallel applications like signal processing, cryptography etc.

Mon, 03 Oct 2011 08:31:00 -0700 Better living through fib functions http://chris.mowforth.com/better-living-through-fib-functions-55461

...or why posting benchmarks of a fib function in language X:

  1. is a slightly puerile approach to discussing performance and scalability
  2. ignores any number of reasons why people would choose node.js, other than throwing out received knowledge for the hell of it

But with the discussion raging (it now appears Haskell is the cure after all) I thought I'd venture a clojure solution:

(ns fib.core
  (:use [lamina.core]
        [aleph.http]))

(defn fib-seq []
  ((fn rfib [a b]
     (cons a (lazy-seq (rfib b (+ a b)))))
   0 1))

(defn response []
  (str (reduce + (take 40 (fib-seq)))))

(defn better-living-thru-fib [channel request]
  (enqueue channel {:status 200
                    :headers {"Content-Type" "text/plain"}
                    :body (response)}))

(start-http-server better-living-thru-fib {:port 1337})
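
For readers who don't speak Clojure, the lazy sequence and the sum can be sketched in Ruby with an Enumerator. This is an illustrative translation only; it leaves out the HTTP server, and fib_seq / response simply mirror the names used above.

```ruby
# Infinite Fibonacci sequence as a lazy Enumerator (0, 1, 1, 2, ...).
def fib_seq
  Enumerator.new do |out|
    a, b = 0, 1
    loop do
      out << a
      a, b = b, a + b
    end
  end
end

# Sum of the first 40 terms, as in the response function above.
def response
  fib_seq.take(40).reduce(:+).to_s
end

puts response # prints 165580140
```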

And the requisite benchmark of questionable utility:

chris-mac:fib chris$ time curl http://localhost:1337
real 0m0.047s
user 0m0.014s
sys 0m0.008s

With that we've proved:

  1. A fib function can indeed be written in clojure, and its return value piped to a web browser
  2. Um, that's about it

Presumably the OP will be enraged to find that clojure has its own web server (piggybacking off aleph here). That said I didn't realise retro CGI scripting was so avant-garde in 2011; guess I'm not 'deck' enough to realise but then I don't have a fixie or an asymmetric haircut either :/

On a serious note I haven't dabbled with node much because I just can't warm to javascript; I find the language constructs native to lisp dialects (for me that's clojure and scheme) much more elegant and alluring, but it's a subjective thing.

I only wrote this post because I was at a loose end but it is surely proof by contradiction that this dialogue achieves nothing. You need more than contrived benchmarks to qualify the shortcomings in a language. Leave them to college kids like me with nothing better to do.

Sun, 02 Oct 2011 10:01:00 -0700 Useful reading material for STM http://chris.mowforth.com/useful-reading-material-for-stm

Mainly for my benefit but if anybody else is interested:

I'll add more once I get through this lot.

Wed, 17 Aug 2011 16:57:00 -0700 SI- Because rvm and rbenv are overkill http://chris.mowforth.com/si-because-rvm-and-rbenv-are-overkill

It's said that more than any other programming community Lispers tend to dislike working on problems that have already been solved. But this dogmatic avoidance of duplication doesn't seem to have permeated the rubyists' zeitgeist to the same degree.

I'm not in the habit of regularly compiling ruby interpreters for the hell of it- it's something I'd do maybe twice a year tops. I recently rebuilt rubinius and ruby 1.9 for Lion and that'll probably be it for 6 months. If you keep all your interpreters in userland and you aren't such an idiot that you can't install them yourself, how is it that the nebulous concept of a ruby version manager has come into vogue?

Switching interpreters shouldn't be any more complex than having a script to symlink the executable paths for you:

#!/Users/chris/.ruby/current/bin/ruby

RUBY_HOME = ENV['RUBY_HOME']

RUBIES = {:jruby => 'jruby-1.6.0', :rubinius => 'rubinius/1.2', :mri => 'mri19'}
GEM_EXECUTABLES = {:rubinius => 'gems/bin', :mri => ''}

if ARGV.empty?
  p 'Available interpreters:'
  RUBIES.each { |key,value| p key }
  exit
end

ruby = ARGV.first.to_sym   # the hashes above are keyed by symbol
abort "Unknown interpreter: #{ruby}" unless RUBIES.key?(ruby)

update_symlinks = lambda { |ruby|
  ruby_path = RUBY_HOME + '/' + RUBIES[ruby]
  current_bin_path = RUBY_HOME + '/current/bin'
  
  executables = Dir.glob("#{ruby_path}/bin/*")

  system("rm -rf #{current_bin_path}/*")

  executables.each do |x|
    system("ln -s #{x} #{current_bin_path}/")
  end
  
  gem_bins = "#{ruby_path}/#{GEM_EXECUTABLES[ruby]}"
  gem_bin_path = RUBY_HOME + '/current/gem_bin'
  
  system("rm -rf #{gem_bin_path}")
  
  system("ln -s #{gem_bins} #{gem_bin_path}")
}

update_symlinks.call(ruby)

A couple of clarifications:

  • I install all my rubies into ~/.ruby/
  • The current ruby gets all its executables symlinked into ~/.ruby/current/bin
  • I added ~/.ruby/current/bin onto the end of my $PATH
  • That is all

I have it saved in /usr/local/bin/si. SI is short for 'switch interpreter'. A monkey can use it, although none have been known to do so.
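
The symlinking step can also be done without shelling out. The following is a minimal sketch using FileUtils, tried out in a throwaway directory so it touches nothing real; switch_bin is a hypothetical helper and is not part of si.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: fill current_bin with symlinks to the
# executables in ruby_bin, replacing whatever was there before.
def switch_bin(ruby_bin, current_bin)
  FileUtils.mkdir_p(current_bin)
  FileUtils.rm_f(Dir.glob(File.join(current_bin, '*')))
  Dir.glob(File.join(ruby_bin, '*')).each do |exe|
    FileUtils.ln_s(exe, File.join(current_bin, File.basename(exe)))
  end
end

# Try it in a temporary directory rather than a real ~/.ruby
Dir.mktmpdir do |home|
  bin = File.join(home, 'mri19', 'bin')
  FileUtils.mkdir_p(bin)
  FileUtils.touch(File.join(bin, 'ruby'))
  switch_bin(bin, File.join(home, 'current', 'bin'))
  puts File.readlink(File.join(home, 'current', 'bin', 'ruby'))
end
```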

The hash of interpreters is hard-coded and I can't even remember what else I did to get it working but that's not the point. It's a 15 minute job I tackled and forgot about 2 years ago. Let's use some initiative, stop writing version managers and do some bloody work.

Mon, 15 Aug 2011 16:37:00 -0700 More scripting with Clojure http://chris.mowforth.com/more-scripting-with-clojure

The old man wanted to pay $49.99 for some p.o.s shareware app to compare and remove duplicate files in Windows. Seeing this as something of a challenge, I got a stopwatch and went to see what alternative I could come up with in half an hour or so:

(ns compo.core
  (:use [clojure.set :only [difference]])
  (:gen-class))

(defn checksum [file]
  (let [input (java.io.FileInputStream. file)
        digest (java.security.MessageDigest/getInstance "MD5")
        stream (java.security.DigestInputStream. input digest)
        bufsize (* 1024 1024)
        buf (byte-array bufsize)]

    (while (not= -1 (.read stream buf 0 bufsize)))
    (apply str (map (partial format "%02x") (.digest digest)))))

(defn list-dir [dir]
  (reverse (sort-by #(.lastModified %)
                    (remove #(.isDirectory %)
                            (file-seq (java.io.File. dir))))))

(defn find-dupes [root]
  (prn "Computing checksums...")
  (let [files (list-dir root)]
    (let [summed (zipmap (pmap #(checksum %) files) files)]
      (difference
       (into #{} files)
       (into #{} (vals summed))))))

(defn remove-dupes [files]
  (prn (str "Found " (count files) " duplicate files which can be removed."))
  (doseq [f files]
    (prn (str (.toString f) " - [y/n]"))
    (if-let [choice (= (read-line) "y")]
      (.delete f))))

(defn -main [& args]
  (if (empty? args)
    (println "Enter a root directory")
    (remove-dupes (find-dupes (first args))))
  (System/exit 0))

Turns out in clojure, the answer is just enough to do the job.

Sometimes less is more (thanks to this post for computing an md5 on a file). And throwing in pmap performs the checksumming in parallel without any extra thought required. Job.
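
The core of the approach (checksum every file, keep one per checksum, report the rest) can be sketched in Ruby as well. find_dupes here is a hypothetical helper. As a simplification it keeps the first path in sorted order, whereas the Clojure version orders files by modification time, and it computes the checksums sequentially.

```ruby
require 'digest/md5'
require 'find'
require 'tmpdir'

# Hypothetical helper: group regular files under root by MD5 digest and
# return every file after the first in each group (the removable copies).
def find_dupes(root)
  files = Find.find(root).select { |path| File.file?(path) }.sort
  files.group_by { |path| Digest::MD5.file(path).hexdigest }
       .values
       .flat_map { |group| group.drop(1) }
end

# Small demonstration in a temporary directory
Dir.mktmpdir do |dir|
  File.write(File.join(dir, 'a.txt'), 'same')
  File.write(File.join(dir, 'b.txt'), 'same')
  File.write(File.join(dir, 'c.txt'), 'different')
  puts find_dupes(dir).map { |path| File.basename(path) }.inspect
end
```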

Sun, 31 Jul 2011 02:29:00 -0700 Installing Grand Central Dispatch on Linux http://chris.mowforth.com/installing-grand-central-dispatch-on-linux

I've been curious about getting libdispatch and the blocks runtime compiling in Ubuntu for some time but naively guessed that it wouldn't be for the faint-hearted since the only useful result yielded by Google was this thread on SO. How wrong I was!

For this exercise I pulled the iso for natty server from the ubuntu site and did a fresh install in a VirtualBox VM. The only extra I added during install was an SSH daemon so I could use the terminal on my Mac.

Libdispatch needs llvm/clang, libkqueue and the blocks runtime which are already available through apt-get in natty so let's install them.

You'll also need libpthread-workqueue0. I had to download the .deb packages from oneiric here but they installed without any hassle:

# Core dependencies
sudo apt-get install clang libblocksruntime-dev libkqueue-dev

# Libpthread
sudo dpkg -i libpthread-workqueue0_0.7-1ubuntu1_i386.deb
sudo dpkg -i libpthread-workqueue-dev_0.7-1ubuntu1_i386.deb

I compiled libdispatch itself from source. Don't grab it from MacOSforge, download the tarball used to make the .deb package for oneiric. The installation will also need make, autoconf, autogen and libtool. Just to save yourself any hassle trying to solve missing header issues, install the build-essential package and gcc-multilib while you're at it:

sudo apt-get install make autoconf autogen libtool build-essential gcc-multilib

You should be set to compile libdispatch now. There should already be a configure file in the 'libdispatch' root folder so let's install:

CC=clang ./configure
make
sudo make install

Make sure you force clang as the compiler, not gcc. The end of the install message should tell you where the libs were installed, for me it was /usr/local/lib. Ubuntu rather stupidly erases the LD_LIBRARY_PATH variable if you set it through the shell, so add a file in /etc/ld.so.conf.d/ pointing to this location if you don't have one already, then run sudo ldconfig to refresh the linker cache.

You should have Grand Central set up by now. Let's write a hello world programme and see if it works:

#include <dispatch/dispatch.h>
#include <stdio.h>

int main() {
  dispatch_queue_t queue = dispatch_queue_create(NULL, NULL);

  dispatch_sync(queue, ^{
    printf("Hello, world from a dispatch queue!\n");
  });

  dispatch_release(queue);

  return 0;
}

And to compile:

clang -o hi hello.c -fblocks -ldispatch

And if all is well you should get the message printed to the screen. I haven't tried anything computationally intensive yet and I wouldn't be surprised if there's a performance discrepancy between FreeBSD / OS X, since libkqueue is just a wrapper around epoll(). But hello world works, and that's half the battle, right?

 

Fri, 29 Jul 2011 10:38:00 -0700 CMFunctionalAdditions- multicore ruby-like utilities for Objective-C http://chris.mowforth.com/cmfunctionaladditions-multicore-ruby-like-uti-64127

Hot on the heels of the last post, I've decided to say a bit more about CMFunctionalAdditions. As I mentioned last time the project is the result of functional and syntactic-sugar withdrawal symptoms. In a nutshell, CMFunctionalAdditions is my take on being able to call the following in Objective-C without altering the receiver:

  • Map
  • Map with index (each_with_index.map...)
  • Reduce (inject, fold, whatever)
  • Filter (select)
  • Remove (reject, delete, whatever)
  • Partition (the ruby flavour, not the clojure flavour)
  • Unique
  • Flatten
  • Take (take_while)
  • tbc

Additionally many of the linear-time methods above run in O(n ∕ P) time or better with Grand Central Dispatch. I want concurrency to be transparent to the end user wherever possible. A picture speaks a thousand words so to rip the examples from the CLI demo...

// We're using this array to play with:
NSArray* sample = [NSArray arrayWithObjects:@"foo",
                                            @"bar",
                                            @"baz",
                                            @"teapot",
                                            nil];


// Shall we map them so f(x) -> "SOMETHING#{x}"?
NSArray* mapped = [sample mapWithBlock:^id(id obj) {
    return [NSString stringWithFormat:@"SOMETHING%@", obj];
}];


// Let's map x with its index
NSArray* mappedIndex = [sample mapWithIndexedBlock:
  ^id(NSUInteger idx, id obj) {
      return [NSString stringWithFormat:@"%@ AT INDEX %lu",
                                        obj,
                                        (unsigned long)idx];
}];


// How about reducing it?
id reduced = [sample reduceWithBlock:^id(id memo, id obj) {
    return [NSString stringWithFormat:@"%@-%@", memo, obj];
} andAccumulator:@""];


// But we don't like teapots; let's remove them
BOOL (^discriminator)(id obj) = ^(id obj) {
  return [obj isEqual:@"teapot"];
};

NSArray* teapotFree = [sample removeWithPredicate:
                              discriminator];


// Even better, let's segregate them
NSArray* segregated = [sample partitionWithBlock:
                              discriminator];


// Now let's break *sample up into
// an array of 2-element NSArrays
NSArray* chunked = [sample splitWithSize:2];

The framework currently requires Snow Leopard or better but Lion might become a prerequisite (see last post). iOS 4 should work but I haven't tried it. I'm also attempting compilation on Linux as this is being written. As the only dependencies are Foundation and libdispatch (compiled with llvm) it'd be a nice bonus.

Proper documentation is in the pipeline. Any feedback would be appreciated.

Sun, 24 Jul 2011 06:25:00 -0700 Parallel reduction in Objective-C http://chris.mowforth.com/parallel-reduction-in-objective-c

Ruby and Clojure have spoilt me and coming back to Objective-C has starved me of a heap of nice ways to manipulate collections.

Born out of this frustration and a drive to use GCD and blocks for something useful, I've started creating my own framework of categories to add into NSArray, NSDictionary and who knows what else right now (that's for another post).

After toiling for an afternoon and realising that you can now create explicitly concurrent dispatch queues in Lion, I managed to work out a divide & conquer reduce algorithm for associative functions:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
- (id)reduceWithBlock:(id (^)(id memo, id obj))block andAccumulator:(id)accumulator withBaseLength:(NSUInteger)baseLength
{
    __block id acc = [[accumulator copy] autorelease];
    NSUInteger base_job_length = baseLength;
    if (!baseLength) {
        // Calculate the base job length
        NSUInteger num_processors = [[NSProcessInfo processInfo] processorCount];
        base_job_length = (NSUInteger)([self count] / sqrt(num_processors)); // (L = n / p squared)
    }
    
    if ([self count] <= base_job_length) {
        // If the size of the array is <= base case, reduce it serially
        for (id obj in self) { acc = block(acc, obj); }
    } else {
        // If the array length is > base case, divide & conquer
        NSArray* sub_jobs = [self splitIntoSubArraysOfLength:base_job_length];
        
        dispatch_queue_t result_queue = dispatch_queue_create(NULL, DISPATCH_QUEUE_CONCURRENT);
        
        dispatch_apply([sub_jobs count], result_queue, ^(size_t i) {
            acc = block(acc, [[sub_jobs objectAtIndex:i] reduceWithBlock:block andAccumulator:accumulator withBaseLength:base_job_length]);
        });
        dispatch_release(result_queue);
    }
    
    return acc;
    
}

I mentioned OS X Lion; specifically look at line 18:

// In Snow Leopard we'd be stuck with this:
dispatch_queue_t result_queue = dispatch_queue_create(NULL, NULL);

// In Lion we can set the queue to dispatch work concurrently:
dispatch_queue_t result_queue = dispatch_queue_create(NULL, DISPATCH_QUEUE_CONCURRENT);

Since we assume the user has provided an associative function, we don't care about the order in which our recursive calls return. This means we can use the shiny new DISPATCH_QUEUE_CONCURRENT constant in 10.7 to send work off in a parallel manner (thanks for the tip, Kazuki).

For this to work you need a means of splitting the array into sub-arrays of ever decreasing size, until they hit a limit where it's practical to reduce them serially:

- (NSArray*)splitIntoSubArraysOfLength:(NSUInteger)size
{
    __block NSMutableArray* sub;
    NSUInteger count = [self count];
    NSUInteger remainder = count % size;
    NSUInteger total = (count / size) + (remainder ? 1 : 0); // one extra sub-array for any leftover elements
    
    sub = [NSMutableArray arrayWithCapacity:total];
    
    dispatch_queue_t result_queue = dispatch_queue_create(NULL, NULL);
    
    dispatch_apply(total, result_queue, ^(size_t i) {
        NSUInteger step = i * size;
        NSUInteger upper_bound = size;
        if (step + upper_bound > [self count]) upper_bound = remainder;
        NSRange range = NSMakeRange(step, upper_bound);
        NSArray* sub_array = [self subarrayWithRange:range];
        
        [sub insertObject:sub_array atIndex:i];
    });
    
    dispatch_release(result_queue);
    
    return sub;
}

Right now I'm naïvely assuming that the smallest unit of work is when a sub-array satisfies the inequality:

l ≤ n ∕ P² ; {l, n, P} ⊂ ℤ ∖ {0}

Where l is the base case, n is the length of the original NSArray to be conquered and P is the number of processors.

Given the fact that n will typically be vastly greater than P on most personal computers and mobile devices you'd intuitively think that mandating a relatively large base case is the sensible thing to do- the smaller the job, the more time is spent recursively splitting arrays and dispatching jobs rather than doing any real work.

I'm purposefully omitting any benchmarking because you know what Disraeli said, but from my somewhat confined set of experiments I'm seeing a ≃1.4x increase in execution speed of associative functions. This isn't production quality and won't be making its way into CMFunctionalAdditions but it illustrates the potential of higher-order functions and Grand Central Dispatch in Objective-C.
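
The overall shape of the reduction (split, reduce each piece independently, combine the partial results) can be sketched in a few lines of Ruby. This is an illustration under stated assumptions rather than a port: parallel_reduce is a hypothetical name, it uses one plain thread per slice instead of GCD, slice_size must be at least 1, and the accumulator has to be a neutral element for the function because it seeds every slice.

```ruby
# Hypothetical sketch: reduce each slice in its own thread, then
# combine the partial results. Only valid for associative functions,
# and `identity` must be a neutral element (0 for +, 1 for *).
def parallel_reduce(items, identity, slice_size, &fn)
  return items.reduce(identity, &fn) if items.size <= slice_size

  partials = items.each_slice(slice_size).map do |slice|
    Thread.new { slice.reduce(identity, &fn) }
  end.map(&:value)

  # Partials are combined in slice order, so the function
  # does not need to be commutative.
  partials.reduce(identity, &fn)
end

puts parallel_reduce((1..1000).to_a, 0, 250) { |memo, x| memo + x } # prints 500500
```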

Mon, 25 Apr 2011 05:20:00 -0700 Prüfer Sequence Algorithm in Clojure http://chris.mowforth.com/prufer-sequence-algorithm-in-clojure

Because implementing it is the best way of studying for an exam. Yeah. Ok, I'll go and do some work now.

Edit: Thanks to Rasmus Svensson for suggesting I store the adjacent vertices as sets.

(defn prufer
  "Convert a labelled tree to a Prüfer sequence."
  ([graph] (prufer graph []))
  ([graph prufer-sequence]
   (loop [g graph p prufer-sequence]
     (if (> (count g) 2) ; A Prüfer sequence is always of length n - 2
       ; v is the lowest-labelled leaf node (a [label neighbours] entry)
       (let [v (apply min-key first (filter #(= 1 (count (last %))) g))]
         ; Remove the leaf's key and its occurrences in the edge sets,
         ; then append its neighbour's label to the Prüfer sequence p.
         (recur
          (into {} (for [[k value] (dissoc g (first v))] [k (disj value (first v))]))
          (concat p [(last (last v))])))
       p))))


; Trivial 5-vertex graph where the keys represent labels and
; values represent each node's neighbours i.e. a vertex with one neighbour is a leaf
(def g {1 #{2 3}, 2 #{1 4}, 3 #{1 5}, 5 #{3}, 4 #{2}})

(println (prufer g))
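
For comparison, here's a rough Ruby port of the same procedure. It's a sketch, not a translation anyone has blessed: it assumes the graph is a hash of label => Set of neighbour labels, and it copies the sets first so the caller's graph isn't modified.

```ruby
require 'set'

# Convert a labelled tree (hash of label => Set of neighbour labels)
# into its Prüfer sequence.
def prufer(graph)
  # Copy the adjacency sets so the caller's graph is left untouched.
  g = graph.transform_values(&:dup)
  sequence = []
  while g.size > 2
    # The lowest-labelled leaf is the smallest key with exactly one neighbour.
    leaf = g.select { |_, neighbours| neighbours.size == 1 }.keys.min
    neighbour = g[leaf].first
    sequence << neighbour
    # Drop the leaf and remove it from its neighbour's edge set.
    g.delete(leaf)
    g[neighbour].delete(leaf)
  end
  sequence
end

g = { 1 => Set[2, 3], 2 => Set[1, 4], 3 => Set[1, 5], 5 => Set[3], 4 => Set[2] }
p prufer(g)  # => [2, 1, 3]
```

For the 5-vertex tree, leaves 4, 2 and 1 are removed in that order, which leaves the three-element sequence you'd expect from n - 2.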

Wed, 13 Apr 2011 12:49:00 -0700 Selective reinvention of the wheel http://chris.mowforth.com/selective-reinvention-of-the-wheel

It is no measure of health to be well adjusted to a profoundly sick society. (Jiddu Krishnamurti)

Why do programmers hemorrhage endless man-hours setting up some kind of CMS, configuring servers and toying with the intricacies of themes just to write the occasional blog post? It's not like any IT pro I'd actually want to have a pint with ordinarily needs to placate their narcissism that much. Without wanting it to sound like a shameless plug for Tumblr, Posterous et al, aren't blogs as a service (new buzzword eh?) a classic example of the 80/20 rule? I fail to see what value you add to your views by wasting your weekends on some batshit Rails config just so you can masochistically watch it fall over if you ever find yourself at the top of Hacker News.

That is all.

Sun, 07 Nov 2010 08:07:00 -0800 You can even get phished inside Facebook now http://chris.mowforth.com/you-can-even-get-phished-inside-facebook-now

This pisses me off, mainly because both the implementation and the social engineering are trivial:

[Screenshot: Screen_shot_2010-11-07_at_17]

It's well within Facebook's power to provide some kind of LSI to root out nefarious users who create these applications purely to farm access tokens. Not trusting any user input is one of the key tenets of application security and, although it's quite abstract, a Facebook app is still a form of user-supplied data.

In the meantime, if you find a friend posting a link to a nondescript application like the one above, don't click the 'Allow' button, and report it if you have time.

Thu, 04 Nov 2010 06:58:00 -0700 Generating unique identifiers in ActiveRecord models http://chris.mowforth.com/generating-unique-identifiers-in-activerecord

A recurring need in a lot of web applications is to give records a form of unique identification (other than their database primary key). This can be useful, for example, to stop users sequentially poking their way through IDs or to generate an access token subsequently used for authentication.

The basic mechanism (take a unique string and assign it to a given column if no other record has that value) often creates a bad code smell if a lot of models make use of it, but it's easily DRYed up in Rails. Initially I looked at Phusion's default_value_for plugin but, as the name implies, it just assigns a default value to the column and doesn't check for uniqueness.

So I put together Identifier, which lets you avail of a similar class method, has_identifier(:column_name) { Block.generating.uid }

It behaves in largely the same fashion as default_value_for, but has_identifier will recursively call the block until it yields a unique value, and :column_name is automatically protected from mass-assignment:

class User < ActiveRecord::Base

  has_identifier(:access_token) { SecureRandom.hex(32) }

end
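
To illustrate the retry-until-unique idea outside Rails, here's a minimal sketch. It is not Identifier's actual code: `unique_identifier` is a hypothetical helper, a Set stands in for the uniqueness query against the database, and I've used a bounded loop instead of recursion so a block that never yields a fresh value can't spin forever.

```ruby
require 'securerandom'
require 'set'

# Keep calling the block until it yields a value not already taken.
# `taken` stands in for a uniqueness check against the database;
# the real plugin's internals may differ.
def unique_identifier(taken, max_attempts: 100)
  max_attempts.times do
    candidate = yield
    return candidate unless taken.include?(candidate)
  end
  raise "no unique value after #{max_attempts} attempts"
end

existing = Set["aaa", "bbb"]
values = %w[aaa bbb ccc].each
token = unique_identifier(existing) { values.next }
puts token  # the first two candidates collide, so this prints "ccc"

puts unique_identifier(existing) { SecureRandom.hex(32) }.length  # 64 hex characters
```

With a generator like SecureRandom.hex(32) a collision is vanishingly unlikely, so in practice the block runs once; the retry only matters for short or low-entropy identifiers.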

Hack away here.

Tue, 14 Sep 2010 02:21:00 -0700 Automatically setting proxy configuration in the shell http://chris.mowforth.com/automatically-setting-proxy-configuration-in

I'll make it quick. It'd be nice to have the proxy settings from the OS X network preference pane reflected in an http_proxy variable when they're needed; it's useful if your laptop regularly finds itself in more than one location. Unfortunately most of the solutions Google yields are complex exercises in technical masturbation, so here's mine: 9 lines of code, no settings stored in a file somewhere in /etc, no symlinks, no messing around:

Save .proxy_export.rb in your home folder:

Then add this to the end of your .bash_profile

Job done. As Albert said, "Make everything as simple as possible, but not simpler."
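
A minimal sketch of what such a script could look like (not necessarily the original nine lines). It assumes that `scutil --proxy` prints lines such as "HTTPEnable : 1", "HTTPProxy : host" and "HTTPPort : 8080"; `proxy_export` is a hypothetical helper name, and defaulting to port 80 is my own assumption.

```ruby
# ~/.proxy_export.rb (sketch): print an export line when an HTTP proxy is enabled.
# Assumes `scutil --proxy` output contains "Key : value" lines.
def proxy_export(scutil_output)
  # Collect simple "Key : value" pairs; nested entries like "<array> {" are skipped.
  settings = scutil_output.scan(/^[ \t]*(\w+)[ \t]*:[ \t]*(\S+)[ \t]*$/).to_h
  return nil unless settings["HTTPEnable"] == "1" && settings["HTTPProxy"]
  port = settings.fetch("HTTPPort", "80") # assumption: fall back to port 80
  "export http_proxy=http://#{settings['HTTPProxy']}:#{port}"
end

if __FILE__ == $PROGRAM_NAME
  line = proxy_export(`scutil --proxy 2>/dev/null`)
  puts line if line
end
```

Under the same assumptions, the line at the end of .bash_profile could be something like `eval "$(ruby ~/.proxy_export.rb)"`, which evaluates whatever export statement the script prints and does nothing when no proxy is set.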

Thu, 06 May 2010 09:23:00 -0700 Messing with WebSockets http://chris.mowforth.com/messing-with-websockets

Well, here's my contribution to the [so far] rather hacky collection of Ruby WebSocket clients. It's MacRuby only, as it uses NSInputStreams rather than traditional Ruby TCP sockets, along with the new Grand Central calls in 0.6. Using it is just a case of creating an instance of Websocket, calling #push on it and implementing a callback to receive whatever gets sent back:

More refinement to come, and maybe a demo. Hack away here.
