2008-07-02

Sorting out Sorting

I believe that refactoring in the functional milieu is as important as it is in the object-oriented world. There's usually room for further improvement of one's definitions. Often, it is no more than the trivial application of selective inlining of definitions that clarify code that turned out more cluttered and opaque that it could have been, possibly from working at a remove from the problem domain by dint of excessive generalization.

More rarely this refactoring occurs at a higher conceptual level than is known in object-oriented programming: instead of indentifying and rooting out repeated code it abstracts common patterns of recursion and applications into higher-order, closed combinators. This is somewhat in tension with the drive to write code as simple as possible, and in a sense it goes in opposition to the previous refactoring.

With practice, category theory guides one's decisions, so that the code one writes comes out "pre-refactored"; in general, having black-box higher-order combinators means that one effectively programs (or in principle could program) in an incomplete (in the technical sense) but total fragment of the language.

Looking back to the code I wrote the last time, I had opportunity for performing both kinds of refactoring. One good chance to apply the first one I took advantage of was the lifting of swap to lists. Inlining all the definitions, it came out like:

let swap = function
| [x; y] when x > y -> [y; x]
| l                 -> l

made especially simple by the application of a pattern guard. This function is the natural (again, in a technical sense) lifting of the min-max binary operator to lists, as is apparent from the code. Now exchange reduces to:

let exchange p = join % List.map swap % pairup $ p

without undue clutter generated by the domain injections and projections. I carried out the second kind of refactoring when I noticed that the recursion schemata for both bitonic and sort were identical. The gist of it is that after an even-odd shuffle (carried out by unzip), the result of two parallel invocations of the procedure on each half are merged back into a complete network. This parallel invocation is, in a sequential setting, a function application to both members of a pair:

let par f (x, y) = f x, f y

(incidentally, par is Spanish for "pair"). Now call this recursion schema the shuffle:

let rec shuffle merge = function
| []  -> []
| [x] -> [x]
| l   -> merge % par (shuffle merge) % unzip $ l

The trivial networks pass unchanged; otherwise, my description is word for word (backwards!) the transformation performed on the network. Let me recall Knuth's recipe:

[W]e can construct a bitonic sorter of order p […] by first sorting the bitonic subsequences ⟨z1, z3, z5, …⟩ and ⟨z2, z4, z6, …⟩ independently, then comparing and interchanging z1:z2, z3:z4, … (Knuth, Sorting and Searching, p 232)

Now, taking advantage of the abstracted recursion scheme, I applied again the first kind of refactoring. "Comparing and interchanging" the sorted pair of sequences ended up like this:

let bitonic p = shuffle exchange % uncurry List.rev_append $ p

And the complete sort like this:

let sort l = shuffle bitonic l

This schema, sort ≡ shuffle merge for a suitable merge phase is due to Batcher. A marvelous exposition is section 4.6 of Jayadev Misra's paper Powerlist: A Structure for Parallel Recursion, which not only inspired this program, but about which I hope to write more in the future.

No comments: