2008-12-17

Proof-directed Debugging: A Real Example

I was reading this excellent introduction to the K programming language (go ahead, get immersed into it. I'll wait for you) when I got piqued by the strange function where:

  &1 2 3 4     / where: returns the number of units specified
0 1 1 2 2 2 3 3 3 3

  &0 1 1 0 0 1 / where: an important use to get the indices of the 1s
1 2 5

Beyond the need (or not) for such a function, my immediate thought was, How would I write that in OCaml? Sadly, it seems I've become monolingual enough that I have to translate most of what I "hear" into my own internal language. The complaint is neither here nor there, the point is that I reasoned pretty operationally: "I have 1 zeroes, 2 ones, 3 twos and 4 threes", hence:

let where l =
  let rec go n l = function
  | []      -> l
  | x :: xs -> go (succ n) (rep_append x n l) xs

where n is the index of the current element x, and rep_append appends a list of x repeated n times onto l:

  and rep_append e n l =
    if n == 0 then l else
    rep_append e (pred n) (e :: l)
  in
  List.rev (go 0 [] l)

with an accumulating parameter tacked in by tail-recursive reflex. I smelled something wrong when the compiler told me that:

val where : 'a list -> 'a list = <fun>

"Polymorphic?" I thought, and I tried:

# where [1;2;3;4] ;;
- : int list = [2; 3; 3; 4; 4; 4]

Yes, wrong, but…

Of course, I didn't debug the function: the type was too wrong, and it was too obvious a clue that I needed look no further than go. Type-checking by hand, I saw that, at the end, the returned list has type α list; that must be because rep_append returns an α list. Type-checking that, its result type is α list because it appends e to an α list, so that e must have type α instead of int. That variable takes the value from x, the head of the argument list, but that is the number of times to repeat n, that very element's index.

The arguments are swapped: I wrote above rep_append appends a list of x repeated n times, when I should've written rep_append appends a list of n repeated x times. The correct function is:

let where l =
  let rec go n l = function
  | []      -> l
  | x :: xs -> go (succ n) (rep_append n x l) xs
  and rep_append e n l =
    if n == 0 then l else
    rep_append e (pred n) (e :: l)
  in
  List.rev (go 0 [] l)

(the difference is in go's call to rep_append) whose type:

val where : int list -> int list = <fun>

is satisfactory. The function works correctly, too, but that was to be expected.

This is a typical example of proof-directed debugging. That is, a concrete answer to the question, What good is a strong statically-typed language? It's not only that the types prevent you from connecting the pieces together when they don't fit, even though that's 90% of their usefulness. It is also, and especially, the fact that the type of a function is a proof that it does what you think it does. If the type is wrong, the function must be wrong too, automatically, and you don't need to run it to know that.

Much as with a real proof, the trick is to work backwards to see where did an error creep in. A classical example are the trick "proofs" that purport to demonstrate that 1 = -1 or some such, "proofs" that crucially depend on dividing by zero. Working mathematicians everywhere are confronted daily with this, and therein lies the rub.

A crucial objection to the discipline of programming with a strong, statically-typed language is the view that "you need a degree in Mathematics to program in it". It is often repeated that it is "more productive" (easier, in non-obfuscated terms) to program in a dynamic language, and to debug in run-time, or using test-driven development.

The pragmatics of using a statically-typed language are not as onerous as that might suggest. Unfortunately you really have to try it for yourself to see that (but the same is true of the extreme and agile methodologies.) It looks difficult, and it is difficult to program with a strict type system if you're not used to it. But, and this is important, the best languages (i.e., not Java) lend you a very helpful hand: you don't have to prove any theorems, because the compiler does it for you. I checked a proof, I didn't have to write one. That's the compiler's job. And this particular check, that every term has a given type, is not very difficult once you have the final term; you just have to acquire the knack of working back the types of everything until you get to the source.

So, don't just take my word for it, give it a spin. You'll just have to trust me in that strong statically-typed languages are an effective alternative to TDD and agile methodologies.

By the way, that function is ridiculously operational. What would SPJ do? Easy: just be lazy!:

let where l = concat % map (fun (i, x) -> take x $ all i) % index $ l

That is (from right to left): index the list's elements; convert each pair of an index and a repetition count into the corresponding list, and flatten the result. Of course, this requires a couple of definitions:

let (%) f g x = f (g x)

let ($) f x = f x

let rec iota =
  let rec go l i =
    if i == 0 then l else
    let i = pred i in
    go (i :: l) i
  in go []

let all e = let rec l = e :: l in l

let rec take n l =
  if n == 0 then [] else
  match l with
  | []      -> []
  | x :: xs -> x :: take (pred n) xs

let index l = combine (iota (length l)) l

all but the last of which are entirely general. This goes to show that the Haskeller's smug remark that OCaml's standard prelude is rather poor is not really smug at all but painfully true.

Edit: Aaaaargh! I made the same mistake twice. It's obvious that I can't think and code at the same time. The second version of the function where is now correct.

2008-12-02

Stream of Consciousness

Wouldn't you like to write code like this:

reading "ebooks/345.txt" (
   lines
|> skip_until (matches "^\\*\\*\\* START OF")
|> skip
|> filter is_empty
|> words
|> fmap String.uppercase
|> take 20
|> to_list
)

and have it spit:

- : string list =
["DRACULA"; "BY"; "BRAM"; "STOKER"; "1897"; "EDITION"; "TABLE"; "OF";
 "CONTENTS"; "CHAPTER"; "1"; "JONATHAN"; "HARKER'S"; "JOURNAL"; "2";
 "JONATHAN"; "HARKER'S"; "JOURNAL"; "3"; "JONATHAN"]

(yes, it's the Project Gutemberg ebook for Bram Stoker's Dracula, text version; a suitably large corpus to run tests with). That would read: "Safely open the named file as input; parse it into lines, skip until finding the line that matches the regular expression; skip that line. Avoid empty lines, and split the remainder into words. Change the words to upper case, and take the first twenty and make a list of those." Well, this can be done with very nice code. But first, I'd like to do away with the somewhat "normal" bits.

Regular expressions in OCaml live in the Str module; they're low-level and inconvenient to use. A little wrapping-up makes them more palatable:

let matches patt =
  let re = Str.regexp patt in
  fun str -> Str.string_match re str 0

Note that I close over the compilation of the regular expression, so that repeated calls to matches with the same regexp don't waste effort. Of course, is_empty is simply:

let is_empty = matches "^[\t ]*$"

The mysterious "piping arrow" is nothing more than function composition in disguise:

let (|>) f g x = g (f x)

This is the reverse (or categorical) composition; its more popular sibling (one could say, its twin) is:

let (%) f g x = f (g x)

The magic sauce in the program is the use of OCaml's imperative streams. They are like lists, but with a lazy tail which is accessed by destructively taking the head. Like lazy lists, they allow for computation with infinite data, and they use just as much as they need. Unlike lazy lists, they don't support persistence, that is, the ability to keep reference to arbitrary versions of a value. For stream processing of text files they're a perfect match. The idea is to work with functions from streams to streams, or stream transformers. This is made very pleasing by the CamlP4 syntax extension for streams and stream parsers. To use it, you must compile your code with camlc -pp camlp4o, or #load "camlp4o.cma" (or #camlp4o if you're using findlib) in your source file, as explained in the manual.

By far the simplest transformer is the one that skips the head of a stream, and returns the (now headless) result:

let skip str = match str with parser
| [< '_ >] ->  str
| [< >]    -> [< >]

Here the single quote indicates that we're talking about a concrete stream element, as opposed to a substream. It's natural to extend this to a transformer that keeps dropping elements as long as they satisfy a test or predicate p:

let rec skip_while p str = match str with parser
| [< 'x when p x >] -> skip_while p str
| [< >]             -> str

This is the first pattern common to all transformers: return the passed in stream once we're done consuming, or chopping off, elements at its head. This is signalled by the use of the empty pattern [< >] which matches any stream. Of course the opposite filter is also handy and comes for free:

let skip_until p = skip_while (not % p)

We now know how skip_until (matches "^\\*\\*\\* START OF") |> skip works. A variation in the beheading game is to drop a number of elements from the stream:

let rec drop n str = match str with parser
| [< '_ when n > 0 >] -> drop (pred n) str
| [< >]               -> str

(note that this function is purposefully partial). The converse is to take a finite number of elements from a stream:

let rec take n str = match str with parser
| [< 'x when n > 0 >] -> [< 'x; take (pred n) str >]
| [< >]               -> [< >]

(note that this function is also purposefully partial). This is the second pattern common to all transformers: insert a substream on the right side of a matching by recursively invoking the transformer. Similar to this code is the filtering of a stream to weed out the elements satisfying a predicate:

let rec filter p str = match str with parser
| [< 'x when p x >] ->        filter p str
| [< 'x          >] -> [< 'x; filter p str >]
| [< >]             -> [< >]

The fact that in the first matching the recursive call to filter is outside a stream bracket pair means that it will eagerly call itself recursively until it finds an element not matching the predicate.

It's not too difficult to devise a fold over the elements of a stream:

let rec fold f e str = match str with parser
| [< 'x >] -> fold f (f e x) str
| [<    >] -> e

Of course, infinite streams cannot be folded over in finite time, but that's to be expected. This lets me write useful operations in a really concise way:

let length str = fold (fun n _ -> succ n) 0 str

let to_list str = List.rev (fold (fun l x -> x :: l) [] str)

With this, a pipeline like skip_until (matches "^\\*\\*\\* START OF") |> skip |> filter is_empty |> take 20 |> to_list just works, if only we had something to feed it. It could be handy to zip two streams together. Unfortunately, I can find no straightforward way to synchronize the advance of both streams, as there isn't a parallel match of two stream parsers. If you run camlp4o on a file containing the definitions so far, you'll see that the syntax gets rewritten (or expanded) to direct OCaml code that calls into the functions marked For system use only, not for the casual user in stream.mli. It is possible to emulate the result of a hypothetical stream parser doing the zipping by writing directly:

let rec zip str str' =
  match Stream.peek str, Stream.peek str' with
  | Some x, Some y ->
    Stream.junk str;
    Stream.junk str';
    Stream.icons (x, y) (Stream.slazy (fun () -> zip str str'))
  | _  -> Stream.sempty

With a simple stream like:

let nat () = Stream.from (fun i -> Some i)

(why is it a function? Also, note Conal Elliot's remark in his recent post about streams being functions over the natural numbers), a transformer for numbering a stream is as easy as (nat |> zip) (). Another way to build streams is to use the syntax:

let rec repeat e = [< 'e; repeat e >]

Stream expressions are not only used in parsers, as the right hand side of the matchings in the preceding functions hint at. This is possibly the simplest infinite stream you can have.

As it is well-known to Haskellers, streams have algebraic and categorical structure. For instance, they are functors; in other words, they can be mapped over their elements with a function that transforms them pointwise:

let rec fmap f str = match str with parser
| [< 'x >] -> [< 'f x; fmap f str >]
| [<    >] -> [< >]

In this case, the call to f is lazy, and not performed until the head of the resulting stream is requested. This can be good or not; for a strict evaluation of the head, a let binding must be used.

Streams also form a monad, entirely analogous to the list monad. The unit simply returns the one-element stream:

let return x = [< 'x >]

The bind concatenates the streams resulting from the application of a function to the each element in turn:

let rec bind str f = match str with parser
| [< 'x >] -> [< f x; bind str f >]
| [<    >] -> [< >]

Note that the difference between fmap and bind is a single quote! This is a direct consequence of the fact that every monad is a functor, via the identity fmap f x = bind x (return . f). Now words is straightforward:

let words =
  let re = Str.regexp "[\n\t ]+" in fun str ->
  bind str (Str.split re  |>  Stream.of_list)

It only remains one last piece of the puzzle: how does an input channel get transformed into a stream of lines? With a transformer (a parser, actually), of course. This is a rather simple one (especially compared to the imperative monstrosity it replaced; you don't want to know), once you dissect it into its components:

let lines =
  let buf = Buffer.create 10 in
  let get () = let s = Buffer.contents buf in Buffer.clear buf; s
  in
  let islf = parser
  | [< ''\n' >] -> ()
  | [< >]       -> ()
  in
  let rec read str = match str with parser
  | [< ''\n' >] ->           [< 'get (); read str >]
  | [< ''\r' >] -> islf str; [< 'get (); read str >]
  | [< 'c    >] -> Buffer.add_char buf c; read str
  | [<       >] -> [< >]
  in read

The characters in a line are accumulated in the buf buffer. Once a line is complete, get resets the buffer and returns its contents. The sub-parser islf tries to eat the LF in a CRLF pair. Since it includes an empty match, it cannot fail. The recursive parser read scans for a LF, or for a CR or CRLF terminator, accumulating characters until a complete line is included in the result stream. Invoking islf it in the right-hand side of the CR match in read is safe, in the sense that it cannot lead to back-tracking.

At last we get to the point where we can read from a file. For that Stream.of_channel directly and efficiently turns the contents of a file into a stream of characters:

let reading file k =
  let inch = open_in_bin file in
  try
    let res = k (Stream.of_channel inch) in
    close_in inch; res
  with e -> close_in inch; raise e

where k is the continuation that is safely wrapped in an exception handler that closes the file even in the event of an error. Safe input-output in the face of exceptions is an idiom borrowed from LISP, the well-known WITH-OPEN-FILE.

You could find various ways to exercise this small DSL; by way of example, a word count:

reading "ebooks/345.txt" (
   lines
|> skip_until (matches "^\\*\\*\\* START OF")
|> skip
|> filter is_empty
|> words
|> length
)
- : int = 163718

Numbering the first lines in the book:

reading "ebooks/345.txt" (
   lines
|> skip_until (matches "^\\*\\*\\* START OF")
|> skip
|> filter is_empty
|> fmap String.uppercase
|> zip (nat ())
|> take 10
|> to_list
)
- : (int * string) list =
[(0, "DRACULA"); (1, "BY"); (2, "BRAM STOKER"); (3, "1897 EDITION");
 (4, "TABLE OF CONTENTS"); (5, "CHAPTER");
 (6, "  1  JONATHAN HARKER'S JOURNAL");
 (7, "  2  JONATHAN HARKER'S JOURNAL");
 (8, "  3  JONATHAN HARKER'S JOURNAL");
 (9, "  4  JONATHAN HARKER'S JOURNAL")]

Which of course is not very accurate, as the first line should be numbered 1, not 0, and the line numbers don't correspond to the text but to the filtered stream. Let's try something else: define the arrow parts of the pair projection functions:

let first  f (x, y) = (f x, y)
and second g (x, y) = (x, g y)

Now, number the lines as read, and then apply the processing to the second component of the paired-up lines:

reading "ebooks/345.txt" (
   lines
|> zip (nat ())
|> skip_until (snd |> matches "^\\*\\*\\* START OF")
|> skip
|> filter (snd |> is_empty)
|> (fmap % second) String.uppercase
|> take 10
|> to_list
)
- : (int * string) list =
[(34, "DRACULA"); (36, "BY"); (38, "BRAM STOKER"); (41, "1897 EDITION");
 (46, "TABLE OF CONTENTS"); (49, "CHAPTER");
 (51, "  1  JONATHAN HARKER'S JOURNAL");
 (52, "  2  JONATHAN HARKER'S JOURNAL");
 (53, "  3  JONATHAN HARKER'S JOURNAL");
 (54, "  4  JONATHAN HARKER'S JOURNAL")]

I hope this helps in reducing Haskell envy a little bit.

2008-11-09

(One Flew Over The) Cuckoo's Hash

In a conversation over reddit, reader nostrademons suggested Cuckoo Hashing as a hash scheme with constant-time worst-case look-ups. References to the algorithm abound, but implementations are scarce, so I set out to write code. It turned out to be more difficult than I anticipated.

The original article describing Cuckoo Hashing explains the basic scheme better than the Wikipedia article does, in my opinion. It uses, however, two hash tables with probes going first to one and then to the other. A later article by the inventor shows a simpler scheme with only one key space. This paper is straightforward, and I set out to write code under its specifications. Instead of a hash map I preferred writing a hash set; it is possible to build the former with the latter storing key-value pairs.

This requires parameterizing the hash table with not only a data type but with an equality predicate and a hash function over that type; in effect this makes the hash table not polymorphic but an existential type, witnessed by those functions:

type 'a hash = {
  equals  : 'a -> 'a -> bool;
  hash    : 'a -> int;
  mutable index : int;
  mutable count : int;
  mutable table : 'a option array;
}

Of index I will say more later, but count is the number of elements stored in table, where the presence or absence of an element at a given position is given by the use of an option. Thanks to one of the (few) strange corners of OCaml's type system I can make both parameters optional by relying on the polymorphic equality and hash functions built-in to the standard library:

let create ?(equals=(=)) ?(hash=Hashtbl.hash) size =
  { equals = equals; hash = hash; index = 1; count = 0; table = Array.make size None }

Note that create and not make is the builder function name in the standard module Hashtbl. The easy part of implementing a hash table are the administrative functions dealing with the specifics of an imperative data type. Clearing a table is very simple:

let clear h =
  h.index <- 1;
  h.count <- 0;
  Array.fill h.table 0 (Array.length h.table) None

And simpler still is to copy a table into another:

let copy h =
  { h with table = Array.copy h.table }

Iterating over the elements in the table is also straightforward:

let iter f h =
  Array.iter (function Some w -> f w | _ -> ()) h.table

With that, a left fold is equally natural:

let fold f e h =
  let res = ref e in
  iter (fun w -> res := f !res w) h;
  !res

The number of elements stored in the table is given by:

let length h = h.count

Again, the name copies the one given to the analogous function in Hashtbl. What about that index lurking in the table? To explain it, I must take a detour.

Hashing is hard

The second paper by Pagh I linked to above assumes that:

We have access to hash functions h1 and h2 such that any function value hi(x) is equal to a particular value in {1, …, r} with probability 1/r, and the function values are independent of each other […] Note that this is only possible if hash functions are somehow chosen in a random fashion. However, in this lecture note we will not describe how to do this.

And herein lies the first rub. How to choose two functions at random? The first paper speaks of a universal family of hash functions, which is no more and no less than a set of functions indexed by some mechanism with the property that the results of each member in the family is independent of each other. I couldn't find much data about how to build such a family, but I wasn't too diligent about it. What I did find was that cryptographic hashes are a good approximation to a universal family when keyed by the index.

A suitable index is readily available as the i for each of the two probes into the Cuckoo table. Independent of this, I had read about a new cryptographic hash family by Schneier et al., the Skein hash. What struck me about it is that it uses a really simple Feistel network to build the mixing rounds:

let ror i x = (x lsr (31 - i)) lor (x lsl i)

let round r (x, y) = let s = x + y in s, s lxor (ror r y)

With some fudging of the values given in that paper, I built a 4-round shuffle:

let shuffle k x = snd (round 22 (round 19 (round 30 (round 7 (k, x)))))

Now each of the two hashes would be keyed by a "random" number and fed to the shuffle:

let keys  = [| 599290962; 771645345 |]

(If you look up those numbers, you'll see they're not that random, and they have something to do with the largest prime 230 - 35 that fits in an OCaml 31-bit machine integer). With that, the indexed hash looks like this:

let hash i h x =
  let p = h.hash x in
  let q = shuffle keys.(i) p in
  let r = (p + h.index * q)  land  0x3fffffff in
  r mod Array.length h.table

And again, what's with the index? Well, there's another rub in Pagh's papers, of which I will say more later on; suffice to say that it serves as a way to build a double hash (indeed, multiple) in the case that insertion fails. This results in a multi-parametric hash family h(k, j, i), where k is the key, j is the index or hashing "round", and i selects one of both Cuckoo hashes.

I tested these hash functions for collisions with a smallish dictionary of words and found it satisfactory. I haven't found a simple, certified implementation of a hash family, and the few references that I've looked up were vague and inconclusive, or relied heavily on strong cryptographic primitives. This is the first failure of Pagh's "algorithm", in that it is under-specified. This hash might fail to fulfill the paper's preconditions as I quoted above, and so this code is not fit for productive usage!

The good: deletions

The surprise came when I had to implement deletion from the table. When I wrote on reddit, I was under the impression that it would need lazy deletion to skip over deleted entries. This turned out to be false, as I found by implementing the table with a variant type denoting both empty and missing slots, and seeing that both paths were identical. Further thought convinced me that, indeed, insert below does not inspect slots, and so there is no need to mark them as deleted as opposed to simply empty. So, the following code, even if not shown in the paper, posed no problem:

let remove h v =
  try for i = 0 to 1 do
    let x = hash i h v in
    match h.table.(x) with
    | Some w when h.equals v w ->
      h.table.(x) <- None; raise Exit
    | _ -> ()
  done with Exit -> ()

As you will see, it is exactly analogous to the look-up below.

The ugly: look-ups

The raison d'être of the Cuckoo scheme is constant-time lookup:

let mem h v =
  try for i = 0 to 1 do
    let x = hash i h v in
    match h.table.(x) with
    | Some w when h.equals v w -> raise Exit
    | _ -> ()
  done; false with Exit -> true

This tries twice to find the element v, first by hashing it with hash 0, then with hash 1. By the postcondition of insertion, it is guaranteed that, if not found in two tries the key is simply not stored in the table. The code looks needlessly imperative, with an exception replacing a short-circuiting or, but it is symmetrical to that for deletion above, and what you would write with imperative loops and breaks. Now for the last hurdle.

The bad: insertions

Pagh shows how to perform the basic shuffle of elements around the table so that the two-probe look-up succeeds. What he doesn't show is how to grow the table, or what to do if insertion finds a loop in the random graph induced by the hashes. He glosses over this with:

This is continued until the procedure finds a vacant position or has taken too long. In the latter case, new hash functions are chosen, and the whole data structure is rebuilt ("rehashed") … In this case the insertion procedure will loop n times before it gives up and the data structure is rebuilt ("rehashed") with new, hopefully better, hash functions.

(emphasis mine). This is where I started looking for implementations, to see how they resolved this oversight. I found none, and so here's my attempt. If the hash function above is good (random) enough, this procedure terminates; as I can't guarantee that the hash I've given above is universal, this function can very well fail, although with an out-of-memory exception and not an infinite loop. Caveat emptor!

First, I'll bound the number of attempts at rehashing with a new set of hash functions before giving up and growing the table:

let add h v =
  let attempts = ref 0 in

I use an auxiliary function go; insert is a wrapper for it that gets it started with an initial hash value and a number of attempts equal to the size of the table:

  let rec insert v =
    go (Array.length h.table) (hash 0 h v) v

This function go takes the number n of attempts left, the current index pos and the value v to insert. If there aren't any more attempts left, there's no other choice than to rehash and try again to insert the value that failed. Otherwise, the current slot's content gets exchanged with the value to be inserted, and the old value, if any, is inserted recursively with the first hash that changes its position:

  and go n pos v =
    if n = 0 then begin rehash h; insert v end else
    let x = h.table.(pos) in
    h.table.(pos) <- Some v;
    match x with
    | None   -> ()
    | Some w ->
      let j = hash 0 h w in
      go (pred n) (if pos = j then hash 1 h w else j) w

This is essentially the inner loop of the algorithm insert given in the paper. It is now necessary to flesh out the mysterious rehash.

The first paper analyzes the algorithm in depth, and it shows that, given that the hashes are universal and independent, a fill factor of less than 0.5 implies that the probability of a cycle is less than the number of entries in the table, and so by the pigeonhole it cannot happen. To account for the imperfect universality of the hashes, I try at most 5 attempts at rehashing, with successively higher values of index. I've found empirically that odd values of the multiplier work best. If the maximum number of attempts is reached, or the table is over-full, I simply double its size. Then, I iterate over the old table, successively inserting the found values in the new.

  and rehash h =
    let table  = h.table in
    let length = Array.length table in
    let size   =
      if !attempts < 5 && 2 * h.count < length
      then begin
        h.index <- h.index + 2;
        incr attempts;
        length
      end else begin
        attempts := 0;
        2 * length
      end in
    h.table <- Array.make size None;
    Array.iter (function Some v -> insert v | _ -> ()) table

It is here that a pathologically bad hash family could lead to problems: insert calls rehash that calls insert that could call rehash again that… Here the procedure would double the table again and again, until it exceeds the limits of OCaml arrays. With that, the insertion is carried out if the entry is not already present in the table:

  in if not (mem h v) then begin
    insert v;
    h.count <- h.count + 1
  end

As you can see, the correctness of the algorithm relies on the statistical properties of the hash family, which in my view is a flimsier foundation than linear chaining. This makes the latter attractive in the presence of hash functions of arbitrary, and not always good, quality: at worst you have an O(n) look-up, instead of non-termination.

Trying it out

This said, the code works surprisingly well. To test it, I've found a list of the 500 most frequent English words:

let words = [
"the";"of";"to";"and";"a";"in";"is";"it";"you";"that";
"he";"was";"for";"on";"are";"with";"as";"I";"his";"they";
"be";"at";"one";"have";"this";"from";"or";"had";"by";"not";
"but";"some";"what";"there";"we";"can";"out";"other";"were";"all";
"your";"when";"up";"use";"word";"how";"said";"an";"each";"she";
"which";"do";"their";"time";"if";"will";"way";"about";"many";"then";
"them";"would";"write";"like";"so";"these";"her";"long";"make";"thing";
"see";"him";"two";"has";"look";"more";"day";"could";"go";"come";
"did";"my";"sound";"no";"most";"number";"who";"over";"know";"water";
"than";"call";"first";"people";"may";"down";"side";"been";"now";"find";
"any";"new";"work";"part";"take";"get";"place";"made";"live";"where";
"after";"back";"little";"only";"round";"man";"year";"came";"show";"every";
"good";"me";"give";"our";"under";"name";"very";"through";"just";"form";
"much";"great";"think";"say";"help";"low";"line";"before";"turn";"cause";
"same";"mean";"differ";"move";"right";"boy";"old";"too";"does";"tell";
"sentence";"set";"three";"want";"air";"well";"also";"play";"small";"end";
"put";"home";"read";"hand";"port";"large";"spell";"add";"even";"land";
"here";"must";"big";"high";"such";"follow";"act";"why";"ask";"men";
"change";"went";"light";"kind";"off";"need";"house";"picture";"try";"us";
"again";"animal";"point";"mother";"world";"near";"build";"self";"earth";"father";
"head";"stand";"own";"page";"should";"country";"found";"answer";"school";"grow";
"study";"still";"learn";"plant";"cover";"food";"sun";"four";"thought";"let";
"keep";"eye";"never";"last";"door";"between";"city";"tree";"cross";"since";
"hard";"start";"might";"story";"saw";"far";"sea";"draw";"left";"late";
"run";"don't";"while";"press";"close";"night";"real";"life";"few";"stop";
"open";"seem";"together";"next";"white";"children";"begin";"got";"walk";"example";
"ease";"paper";"often";"always";"music";"those";"both";"mark";"book";"letter";
"until";"mile";"river";"car";"feet";"care";"second";"group";"carry";"took";
"rain";"eat";"room";"friend";"began";"idea";"fish";"mountain";"north";"once";
"base";"hear";"horse";"cut";"sure";"watch";"color";"face";"wood";"main";
"enough";"plain";"girl";"usual";"young";"ready";"above";"ever";"red";"list";
"though";"feel";"talk";"bird";"soon";"body";"dog";"family";"direct";"pose";
"leave";"song";"measure";"state";"product";"black";"short";"numeral";"class";"wind";
"question";"happen";"complete";"ship";"area";"half";"rock";"order";"fire";"south";
"problem";"piece";"told";"knew";"pass";"farm";"top";"whole";"king";"size";
"heard";"best";"hour";"better";"true";"during";"hundred";"am";"remember";"step";
"early";"hold";"west";"ground";"interest";"reach";"fast";"five";"sing";"listen";
"six";"table";"travel";"less";"morning";"ten";"simple";"several";"vowel";"toward";
"war";"lay";"against";"pattern";"slow";"center";"love";"person";"money";"serve";
"appear";"road";"map";"science";"rule";"govern";"pull";"cold";"notice";"voice";
"fall";"power";"town";"fine";"certain";"fly";"unit";"lead";"cry";"dark";
"machine";"note";"wait";"plan";"figure";"star";"box";"noun";"field";"rest";
"correct";"able";"pound";"done";"beauty";"drive";"stood";"contain";"front";"teach";
"week";"final";"gave";"green";"oh";"quick";"develop";"sleep";"warm";"free";
"minute";"strong";"special";"mind";"behind";"clear";"tail";"produce";"fact";"street";
"inch";"lot";"nothing";"course";"stay";"wheel";"full";"force";"blue";"object";
"decide";"surface";"deep";"moon";"island";"foot";"yet";"busy";"test";"record";
"boat";"common";"gold";"possible";"plane";"age";"dry";"wonder";"laugh";"thousand";
"ago";"ran";"check";"game";"shape";"yes";"hot";"miss";"brought";"heat";
"snow";"bed";"bring";"sit";"perhaps";"fill";"east";"weight";"language";"among";
]

One possibility is to preallocate enough space for them so as to avoid growing the table:

# let h = create 1000 ;;
val h : '_a hash =
  {equals = <fun>; hash = <fun>; index = 1; count = 0;
   table =
    [|None; None; None; None; None; None; None; None; None; None; None; None;
      None; None; None; None; None; None; ...|]}
# List.iter (add h) words ;;
- : unit = ()
# Array.length h.table ;;
- : int = 1000
# h.index ;;
- : int = 1

This shows that it never needed a rehash. The other possibility is to start small, and let the algorithm grow as needed:

# let h = create 16 ;;
val h : '_a hash =
  {equals = <fun>; hash = <fun>; index = 1; count = 0;
   table =
    [|None; None; None; None; None; None; None; None; None; None; None; None;
      None; None; None; None|]}
# List.iter (add h) words ;;
- : unit = ()
# Array.length h.table ;;
- : int = 1024
# h.index ;;
- : int = 3

Now, both rehashing and growing were necessary. Of course, the table does contain the entries:

# length h ;;
- : int = 500
# List.filter (not % mem h) words ;;
- : string list = []

Acknowledgements: to nostrademons that, as I've come to expect from reddit users, made me think more and harder than I would otherwise had.

Wordle

Are wordles a novel visualization technique, or the Applet-equivalent of fridge magnet poetry? I'm not decided; meanwhile, I've compiled one of all my posts in the last year or so. Pure vanity, perhaps, but with pretty results:

2008-10-31

Group-By Redux

I sometimes like to defunctionalize a recursive function to see what shadow it projects against the wall of the cave. OCaml being strict, the tail recursive, eager, "imperative" version of the function is dual to the "pure", functional one. I've written before about the chain from direct style to CPS to defunctionalization, so I'll not make this a tutorial but a worked-out example. Starting from the last version of group ~by:

let rec group ~by =
  let rec split e = function
  | x :: xs when by e x ->
    let g, ys = split e xs in
    x :: g, ys
  | l       -> [], l
  in function
  | []      -> []
  | x :: xs ->
    let g, ys = split x xs in
    (x :: g) :: group ~by ys

Defunctionalization is a somewhat error-prone technique to apply by hand, but it is methodical and practice makes things relatively smooth. The first thing I like to do is to convert the function to A-normal form:

let rec group ~by l =
  let rec split e l = match l with
  | x :: xs when by e x ->
    let g, ys = split e xs in
    let g' = x :: g in
    g', ys
  | l       -> [], l
  in match l with
  | []      -> []
  | x :: xs ->
    let g, ys = split x xs in
    let g' = x :: g in
    let gs = group ~by ys in
    g' :: gs

Note how each function call is explicitely let-bound. Also, I've moved the parameter l so that it is explicitely named and it doesn't interfere with the next steps. Then, every recursive call receives the rest of the computation as an explicit continuation, using the introduced bindings:

let rec group ~by l k =
  let rec split e l k = match l with
  | x :: xs when by e x ->
    split e xs (fun (g, ys) -> k (x :: g, ys))
  | l       -> k ([], l)
  in match l with
  | []      -> k []
  | x :: xs ->
    split x xs (fun (g, ys) -> group ~by ys (fun gs -> k ((x :: g) :: gs)))

The CPS conversion of split is correct, but that of group went too far since the recursive call is buried in the initial continuation for split. This is why I mean by error-prone; I don't always get it right on the first attempt. Indeed, the call to split is irrelevant to the recursive call to group, and should be kept in its place. Also, I'll make group a helper so that its continuation is kept encapsulated:

let group ~by l =
  let rec split e l k = match l with
  | x :: xs when by e x ->
    split e xs (fun (g, ys) -> k (x :: g, ys))
  | l       -> k ([], l)
  in
  let rec group l k = match l with
  | []      -> k []
  | x :: xs ->
    let g, ys = split x xs (fun x -> x) in
    group ys (fun gs -> k ((x :: g) :: gs))
  in group l (fun x -> x)

Now defunctionalization involves reifying every continuation as an explicit data structure. I have two functions, so I'll need two types. Each has an initial identity continuation. Then split invokes its continuation with x of polymorphic type α free. Also, group invokes its continuation with x :: g free of polymorphic type α list. This dictates my types:

type 'a split_cont =
| SInit
| SSplit of 'a * 'a split_cont
and 'a group_cont =
| GInit
| GGroup of 'a list * 'a group_cont

Each continuation is now changed into a worker function that simulates invoking it with the supplied free values; a pair (g, ys) for split's, a list gs for that of group:

let rec split_apply (g, ys) = function
| SInit         -> (g, ys)
| SSplit (x, k) -> split_apply (x :: g, ys) k
and group_apply gs = function
| GInit         -> gs
| GGroup (g, k) -> group_apply (g :: gs) k

Now the continuations are replaced in the bodies of the functions by explicit constructors:

let group ~by l =
  let rec split e l k = match l with
  | x :: xs when by e x ->
    split e xs (SSplit (x, k))
  | l       -> split_apply ([], l) k
  in
  let rec group l k = match l with
  | []      -> group_apply [] k
  | x :: xs ->
    let g, ys = split x xs SInit in
    group ys (GGroup (x :: g, k))
  in group l GInit

Everything works, as it's easy to verify:

# group ~by:(fun x y -> y - x < 3) (iota 13);;
- : int list list = [[0; 1; 2]; [3; 4; 5]; [6; 7; 8]; [9; 10; 11]; [12]]

Here's where the magic occurs: note that in split_apply, the second member ys of the pair is passed around unchanged; this means that I can pull it out of the continuation argument and return it directly from split:

let rec split_apply g = function
| SInit         -> g
| SSplit (x, k) -> split_apply (x :: g) k
and group_apply gs = function
| GInit         -> gs
| GGroup (g, k) -> group_apply (g :: gs) k

Now both functions are identical modulo renaming, with α group_cont = α list split_cont. And α split_cont is isomorphic to α list, with SInit[], and SSplit:: or cons. Under this isomorphism, both functions split_apply and group_apply are List.rev_append in disguise! Applying the isomorphism to split and group, with the proviso that List.rev_append l [] = List.rev l:

let group ~by l =
  let rec split e l g = match l with
  | x :: xs when by e x ->
    split e xs (x :: g)
  | l       -> List.rev g, l
  in
  let rec group l gs = match l with
  | []      -> List.rev gs
  | x :: xs ->
    let g, ys = split x xs [] in
    group ys ((x :: g) :: gs)
  in group l []

And this is essentially the first version of group ~by I've written before.

2008-10-29

Extreme Recursion

For a data processing application, I needed a way to group data records together according to some criteria. This is the "reduce" phase of the map-reduce transformation, or the group-by phase of a select-project-group query.

My specific problem called for a way to group records in an input list into output sublists by testing them against the first such record considered. More specifically, the records were timestamped, and I had to group them into one-hour segments. If the leading record was timestamped on, say, 12:23:15PM October 29th, 2008, it would be the first in a group including records timestamped up to 1:23:14PM October 29th, 2008.

There's a venerable technique, the control break with advanced reading to solve this problem. I reasoned that I needed to keep track of the current group, and of the list of as-yet-unexamined records. Using accumulating parameters, the following wrote itself:

let group ~by = function
| []      -> []
| x :: xs ->
  let rec scan g gs e = function
  | []      -> List.rev ((e :: List.rev g)  ::  gs)
  | x :: xs ->
    if by e x
    then scan (x :: g) gs e xs
    else scan [] ((e :: List.rev g)  ::  gs) x xs
  in scan [] [] x xs

The grouping function is by, and an empty list contains no groups. Otherwise, the first element x in the list would be the group leader e in the "advanced read", used to find how far that group would extend with a call to scan. This function keeps track of the current group g and the list of found groups gs, and examines the input list so far. If it's empty, it puts the group leader e at the head of the current group, and outputs the result as the last group of the output list. Since the accumulating parameters are built head-first, this requires a couple of reversals. Otherwise, the current record x is tested to see if it belongs in the same group as the leader e; if so, it is added to the current group; otherwise, the current group is closed and a new one is started with that record as the new group leader. In any case, recursion is in tail position, which betrays that this algorithm is imperative and sequential in nature.

All in all, correct by design:

# group ~by:(fun e x -> x - e < 3) [1;2;3;4;5;6;7;8;9;10;11;12];;
- : int list list = [[1; 2; 3]; [4; 5; 6]; [7; 8; 9]; [10; 11; 12]]

and deeply unsatisfactory. What would a Haskeller do? That is, what would be a non-operational, divide-and-conquer truly recursive solution look like? Well, I reasoned, given a record e, split the remaining records into the group led by it and the remainder, and output the group and the result of grouping the remainder by its group leader. After some back-and-forth refactoring and a couple of tries, I came up with:

let group ~by =
  let rec split e l = match l with
  | []      -> [], []
  | x :: xs ->
    if not (by e x) then [], l else
    let g, ys = split e xs in
    x :: g, ys
  in
  let rec scan e l =
    let g, ys = split e l in
    (e :: g) :: match ys with
    | []      -> []
    | x :: xs -> scan x xs
  in function [] -> [] | x :: xs -> scan x xs

I actually sketched scan first: I supposed the function split was given as I described above, recurred on the remainder, and tacked the found group on the result. Refactoring the common e :: g that I originally wrote on both arms of the match ys with… resulted in the rather odd-looking function above. Then, writing split was relatively easy: an empty list gives an empty group and remainder, but a list with a record must be tested for membership on the current group. If not a member, the group is closed and the remainder includes that record; otherwise, split the rest and tack the record at the beginning of the found group.

In truth, I reversed that conditional on a later rewrite, because the arms of the if looked really unbalanced. I didn't think I had improved matters, however: first, OCaml, not being lazy, doesn't really benefit from using right recursion; second, this code is both longer and —to my old-fashioned, structured-programming brain— more roundabout.

But a bit of refactoring never hurts: note that the top-level match (the very last one) and that inside scan are exactly the same. In this case, factoring out that code involved a recursive call made directly to group; this made scan a non-recursive helper that I manually inlined, thus:

let rec group ~by =
  let rec split e l = match l with
  | []      -> [], []
  | x :: xs ->
    if not (by e x) then [], l else
    let g, ys = split e xs in
    x :: g, ys
  in function
  | []      -> []
  | x :: xs ->
    let g, ys = split x xs in
    (x :: g) :: group ~by ys

A guard pattern makes things a bit more functional, even to the point where I can merge two patterns into one:

let rec group ~by =
  let rec split e = function
  | x :: xs when by e x ->
    let g, ys = split e xs in
    x :: g, ys
  | l       -> [], l
  in function
  | []      -> []
  | x :: xs ->
    let g, ys = split x xs in
    (x :: g) :: group ~by ys

Which is exactly right, and what I should've come up with in the first place! For comparison, here's again the first function, with a bit of clean-up done to it:

let group ~by =
  let snoc x xs = x :: List.rev xs in
  let rec scan g gs e = function
  | []      -> snoc e g  ::  gs
  | x :: xs ->
    if by e x
    then scan (x :: g) gs e xs
    else scan [] (snoc e g  ::  gs) x xs
  in function
  | []      -> []
  | x :: xs -> List.rev (scan [] [] x xs)

A clear mongrel of a function. Refactoring functional code is not complete, I think, until the code is "just right", which sometimes is never. Hence the need to step back and rethink the whole point to the function in question; there shouldn't be a single ugly function in a program. The problem with this is that whereas I couldn't explain what I mean by "ugly" or "pretty" function, I know when I've reached aesthetic closure; and yet I'd like to know, what is the "process" by which great Haskell programmers write their code?

2008-10-15

Top Drawing

I wrote the last time about simple ways to "just get something onto the screen", and I ended by exploring a simple Top Draw program. I hinted at the ease with which Top Draw allows compositing a complex "painting", taking more or less by fiat the examples included with the program. I since took advantage of the basic rosette drawing subroutine to make a wallpaper that I could actually install:

A desktop wallpaper generated with Top Draw

In my opinion there are two advantages to using Top Draw:

  1. It abstracts the compositing paradigm with a Photoshop-like Layer object,
  2. It allows to interleave procedural drawing with arbitrary Core Image filters,

thus leveraging the built-in capabilities of Mac OS X Quartz subsystem.

I first refactored my script so that the drawing code is independent of the global transform being applied:


function wheel(m, n) {
  if (m < 1 || n < 5) return;
  var color = new Color();
  color.alpha = 0.3;
  var a = Math.PI / n;
  var k = Math.cos(a), s = Math.sin(a);
  var t = (k + s) / (k - s);
  var d = k + s * t;
  var r = Math.pow(k - s, m);
  for (var e = 0; e != m; e++) {
    color.saturation = 0.5*(1 + e/m); // 0.5 -- 1.0
    color.brightness = 0.5*(2 - e/m); // 1.0 -- 0.5
    for (var i = 0; i != n; i++) {
      var b = (2*i + e) * a;
      var x = r * Math.cos(b), y = r * Math.sin(b);
      color.hue = i / n;
      desktop.fillStyle = color;
      desktop.beginPath();
      desktop.moveTo(x, y);
      desktop.lineTo(d * (k * x - s * y), d * (k * y + s * x));
      desktop.lineTo(t * x, t * y);
      desktop.lineTo(d * (k * x + s * y), d * (k * y - s * x));
      desktop.lineTo(x, y);
      desktop.fill();
    }
    r /= k - s;
  }
}

Gone are centers and offsets in each parameter to the drawing commands, as gone is the global scaling of the drawing. Then, I use random number generators to place the rosettes all over the screen:

desktop.fillLayer(new Color(0));
var db = desktop.bounds;

var xrnd = new Randomizer(0, db.width );
var yrnd = new Randomizer(0, db.height);

In order to give some depth to the final result, I'll blur each drawing a bit so that newer rosettes are sharper than older ones:

var f = new Filter("CIGaussianBlur");
f.setKeyValue("inputRadius", 2);

I'll draw a number of rosettes, four in this case, in random places:

for (var i = 1; i <= 4; i++) {
  var cx = xrnd.intValue;
  var cy = yrnd.intValue;

I make sure that the rosette fits the screen with the given center:

  var sz = Math.max(cx, cy, db.width - cx, db.height - cy);

In order to isolate each drawing from the modifications to the global transform, I save the drawing context prior to changing scale and center:

  desktop.save();
  desktop.translate(db.x + cx, db.y + cy);
  desktop.scale(sz, sz);

I then draw the rosette and restore the context. I could easily give random parameters to the drawing routine, but I find that by having uniform shapes the rosettes nicely stack one over another:

  wheel(8, 10);
  desktop.restore();

Finally, I apply the prepared filter to the resulting image:

  desktop.applyFilter(f);
}

As you can see, it's really easy to get good results with remarkably little code. Keep hitting Command-R until you find a result you like, and install it on your desktop.

2008-10-13

Procedural Drawing

Pretty pictures, again. There are now a number of "toys" with which to "recreate" the experience of turning on the old computer (C-64, Spectrum, Apple II or what had you) and being greeted with a BASIC prompt. I've written before about how SDL can give you a simple, direct (!) buffer to which to write pixels; the problem it has is that it does not know anything about drawing any object more complicated than a pixel. I'd like to explore some alternatives.

I was inspired by this image and wanted to replicate it. I happen to have Mathematica at hand, and sometimes it's my fall-back choice. After some fudging, I came up with:

wheel[m_Integer?Positive, n_Integer?Positive] :=
 With[{a = Pi/n},
  With[{ka = Cos[a], sa = Sin[a]},
   With[{ta = (ka + sa)/(ka - sa)},
    With[{d = ka + sa*ta},
     Graphics[
      Table[
       With[{r = 1/(ka - sa)^e},
        Table[
          With[{k = Cos[a (2 i + e)], s = Sin[a (2 i + e)]},
          {Hue[i/n, 0.5*(1 + e/m), 0.5*(2 - e/m)],
           Polygon[
            r {{k, s},
             d {k*ka - s*sa, s*ka + k*sa},
             ta {k, s}, 
             d {k*ka + s*sa, s*ka - k*sa},
             {k, s}}]}],
        {i, 0,  n - 1}]],
      {e, 0, m - 1}]]]]]]

I won't delve into the semantics of this function, except to say that it is pure (it returns a Graphics object), and that I tend to use immutable bindings introduced by With as much as possible, but this not always affords me readable code. An invocation of wheel[12, 16] generates the following:

color daisy drawn with Mathematica

Pretty! I was satisfied with this for maybe twelve hours. I thought that it could be nice if it could be animated, or something. Doing this in OCaml was out of the question; I have no intention of installing Cairo and a host of dependencies and bindings just to be able to. Using Processing would have been an alternative; I find it, however, crude and unresponsive. Flash would have been ideal, as I'm impressed with Krazydad's wizardry. This would have the unfortunate prerequisite of me having to learn to program Flash in the first place, which is something I have not in my plans for the foreseeable future. I could have used Java, I could…

I remembered I had downloaded Top Draw. I don't think it has animation capabilities; it sits, I think, squarely in the category of "toys". It has, however, a strong graphics model, with filters and compositing inherited from Quartz. Its documentation is succint but informative, and it's Javascript, so I would feel at home. It turns out that Mathematica's retained model means that I could forget about scaling and layout; with Top Draw, those considerations must be taken into account:

var bounds = desktop.bounds;
var cx = bounds.x + bounds.width  / 2;
var cy = bounds.y + bounds.height / 2;

I had a choice between showing most of the graphic or filling most of the screen; it's a matter of a min over a max:

var sz = Math.max(bounds.width, bounds.height) / 2;

Again, the drawing should be generated with a simple call:

desktop.fillLayer(new Color(0));
wheel(12, 16);

The function wheel is where the fun begins (pun not intended). It takes two parameters, the number m of layers and the number n of rhombs to place around the circle. It starts by validating the inputs:

function wheel(m, n) {
  if (m < 1 || n < 5) return;

Each rhomb has a color determined by its position on the graphic. Each would be placed two n-ths of the way around the circle, or α = π / n. Some trigonometry shows that the diagonal of the rhomb is t = (1 + tan α)/(1 - tan α), and d is the distance to the center of the rhomb. As you can see, the rhombs grow exponentially with a factor of 1/(cos α - sin α); in order to make the entire drawing fit the prescribed size I must calculate the starting radius r accordingly:

  var color = new Color();
  var a = Math.PI / n;
  var k = Math.cos(a), s = Math.sin(a);
  var t = (k + s) / (k - s);
  var d = k + s * t;
  var r = sz * Math.pow(k - s, m);

Each tier of rhombs e has a particular saturation and brightness computed as a fraction of the total:

  for (var e = 0; e != m; e++) {
    color.saturation = 0.5*(1 + e/m); // 0.5 -- 1.0
    color.brightness = 0.5*(2 - e/m); // 1.0 -- 0.5

The rhombs spiral out of the center, since each tier is started at an angle of α from the previous one. This makes the new rhombs interlock between a pair of preexisting rhombs at the current point x, y:

    for (var i = 0; i != n; i++) {
      var b = (2*i + e) * a;
      var x = r * Math.cos(b), y = r * Math.sin(b);

(As you might have noticed, all angles are relative to the semicircle; that is, they are actually half-angles). The hue is relative to the position around the circle:

      color.hue = i / n;
      desktop.fillStyle = color;
      desktop.beginPath();

Two of the rhomb's four corners are aligned with the current point; the two others are at an angle of α at either side. Instead of computing six sines and cosines, I use the formulas for the sum and difference of angles:

      desktop.moveTo(cx + x, cy + y);
      desktop.lineTo(cx + d * (k * x - s * y), cy + d * (k * y + s * x));
      desktop.lineTo(cx + t * x, cy + t * y);
      desktop.lineTo(cx + d * (k * x + s * y), cy + d * (k * y - s * x));
      desktop.lineTo(cx + x, cy + y);
      desktop.fill();
    }

As I've indicated above, the drawing is done relative to the center of the desktop. And that's it. It only remains to scale the radius for the next tier:

    r /= k - s;
  }
}

Now it is clear that is important that n > 4 so that 0 < cos α - sin α < 1. The result of this program is essentially identical to the previous one:

color daisy drawn with Top Draw

With the added bonus that now I can start playing with filters and tweaks in real-time until I'm satisfied, or tired, or both. Let the games begin!

2008-08-31

Just draw something on the screen!

Nostalgia for times past sometimes leads to frustration and anger. Yes, times are harder for would-be pixel artists, but nothing a little work can't solve. Let's do fractals in OCaml!

Suppose you've already installed the wonderful findlib (vielen Dank, Herr Stolpmann!). Suppose, furthermore, you've succeded in compiling OcamlSDL. This might look like a heavy prerequisite to fulfill before showing pretty graphics, but it's a job you have to do only once. Then, something like plotting Newton fractals requires less than ninety lines of code, and the result is this:

I find natural to abstract iteration patterns into higher-order functions. In this case, centering and scaling the pixel plane over the complex plane (the so-called screen-space transformation) is tedious but repetitive code that could be reused. Furthermore, the details of striding, color conversion and such are the same from program to program, where the only thing that varies is the actual calculation performed on each point:

let iter_surface ?(mag=1.0) f g surface =
  let info   = Sdlvideo.surface_info surface
  and pixels = Sdlvideo.pixel_data_32 surface in
  let size   = min info.Sdlvideo.w info.Sdlvideo.h in
  let scale  = mag /. float size
  and cx, cy = info.Sdlvideo.w / 2, info.Sdlvideo.h / 2 in
  for i = 0 to info.Sdlvideo.h - 1 do
    let b = i * info.Sdlvideo.pitch / 4
    and y = float (cy - i) *. scale in
    for j = 0 to info.Sdlvideo.w - 1 do
      let x = float (j - cx) *. scale in
      let c = f x y in
      pixels.{b + j} <- Sdlvideo.map_RGB surface c
    done;
    g i
  done

The magnification mag specifies how many real units the shortest dimension spans. The outer loop iterates with i for every scanline, whose base address into the pixels array is b. The inner loop iterates with j the index to successive pixels, and from both i and j the coordinates x, y are computed. With those the function f is called, and the result c (an RGB triple of type Sdlvideo.color) is mapped and assigned to the pixel. At the end of every scanline, the function g is called so that the main program can display progress.

The simplest SDL event loop waits for the user to quit the application:

let rec event_loop () =
  match Sdlevent.wait_event () with
  | Sdlevent.QUIT -> ()
  | _ -> event_loop ()

Anything more complicated is easy to accomodate. Now, there are a myriad schemes for color rendering of mathematical data, but using the color wheel is very common, especially since the complex argument maps naturally to a hue value. Converting HSV colors to RGB is somewhat involved, but the following code is entirely reusable:

let quomod d x =
  let f, e = modf (float d *. (x -. floor x)) in truncate e, f

let byte_of_frac x =
  let mask x = lnot (x asr 30) land 0xff in
  let n = truncate (ldexp x 8) in
  (n land (mask n))  lor  mask (n - 256)

let hsv_to_rgb (h, s, v) =
  let epsilon = 1e-3 in
  let w = byte_of_frac v in 
  if abs_float s < epsilon then (w, w, w) else
  let (e, f) = quomod 6 h in
  let x = byte_of_frac (v *. (1. -. s))
  and y = byte_of_frac (v *. (1. -. s *. f))
  and z = byte_of_frac (v *. (1. -. s *. (1. -. f))) in
  match e with
  | 0 -> (w, z, x)
  | 1 -> (y, w, x)
  | 2 -> (x, w, z)
  | 3 -> (x, y, w)
  | 4 -> (z, x, w)
  | 5 -> (w, x, y)
  | _ -> assert false

quomod scales the fractional part of x by the integer d, and splits the result into an entier part e and a fractional part f. This clasifies the hue into a sextant on which the fractional part linearly interpolates. To map a real number in the interval [0… 1] into [0… 256), byte_of_frac scales the fraction by 28 and clips the result to a byte by constructing the appropriate bit-mask.

The code so far doesn't incorporate any specific functionality, and could be shipped off to a standard module to use again and again. The particular application that I had in mind, Newton fractals, are the plot on the complex plane of the basins of attraction to the complex roots of a polynomial, as found by applying the Newton method.

Polynomials and their derivatives are especially easy to evaluate at a point in one pass using Horner's method, when represented as the list of their coefficients with the independent term at the head, a so-called dense representation of polynomials:

open Complex

let eval p x =
  List.fold_left (fun (y, dy) p ->
    (add p (mul y x), add y (mul dy x)))
    (zero, zero) p

With that, the Newton iteration is very simple:

let nr_root ?(max_iter=32) ?(epsilon=1e-6) p x =
  let rec go (x, i as res) =
    if i = max_iter then res else
    let y, dy = eval p x in
    if norm y < epsilon then res else
    let dx = div y dy in
    go (sub x dx, succ i)
  in go (x, 0)

The root finder returns the number of iterations required, up to a maximum and/or specified precision, together with the closest estimate to the root. This allows me to "color" the fractal according to the time needed to converge to a root. With the chosen representation, the polynomial zn - 1 (which represents, by definition, the n-th roots of unity) is straightforward to construct:

let pow n x =
  let rec go i l =
    if i = 0 then l else go (pred i) (zero :: l)
  in go n [x]

let nth_roots n = (neg one) :: pow (n - 1) one

These twenty lines are all the mathematical machinery required. It only remains to put all the pieces together:

let () =
  let max_iter = 128 in
  Sdl.init [`EVERYTHING];
  let screen = Sdlvideo.set_video_mode ~w:640 ~h:480 ~bpp:32 [`SWSURFACE] in
  Sdlwm.set_caption ~title:"nrfractal" ~icon:"";
  let p = nth_roots 3 in
  iter_surface ~mag:2.0 (fun x y ->
    let (z, n) = nr_root ~max_iter p { re = x; im = y } in
    let hue = Complex.arg z *. 0.159154943091895346 +. 0.5
    and vlu = float (max_iter - n) /. float max_iter in
    hsv_to_rgb (hue, 1., vlu) )
    (fun i -> if i land 7 = 0 then Sdlvideo.flip screen)
    screen;
  Sdlvideo.flip screen;
  event_loop ();
  Sdl.quit ()

The iteration limit max_iter controls the fractal detail. I first initialize the SDL library and create a big enough window, with a descriptive title. Then, I build a polynomial p representing the third roots of unity, and invoke iter_surface with a function that finds p's root starting from the complex value corresponding to the coordinates of the point to be plotted. I convert the complex angle of the returned approximation z to a hue value (by dividing by 2π and normalizing to the range [0… 1]); analogously, the number of iterations determines the vividness, or value of the color. This is converted to an RGB triple and returned. Every eight scanlines the screen is refreshed with the contents of the backing buffer. Once every point is plotted, the event loop is entered until the user decides to quit.

So, not counting the generic, reusable code, less than 40 lines of code suffice to just draw something on the screen. Have fun!

2008-08-17

Aperiodic Tilings

When you thought that your obsession had run its course and left you in peace, finally, along comes something new and shiny to once more lead you astray. I've written half a dozen posts about sorting networks and genetic algorithms; I thought I had put that topic to rest. What am I to do when I find the Tilings Encyclopedia later the same week? How could I resist this aperiodic fest of self-similar tilings?

The nice thing about the Encyclopedia is not that the tilings are extensively cross-referenced and annotated with the relevant bibliographies; that's all very useful, I don't doubt, for its intended audience; but what I immediately liked was that the tilings are presented as rewriting rules. That should make it easy to replicate them, I thought; I wasn't very wrong, but I find that a functional program is not finished until its structure is strong and regular enough to let it stand on its own legs. After refactoring and restructuring and rewriting the code for a week or so, I had a robust, general program that let me add new tilings, write the tilings in different formats and so on; my last post deals with the details of the back-end mechanism.

Look at the page for the Amman-Beenker rhomb triangle tiling, for instance (go ahead, I'll wait for you). The tiling uses three primitives: two oriented triangles and a parallelogram. Each is "split" or rewritten into a configuration of the same tiles. The process by which the plane is tiled by rewriting is called inflation; the rate of expansion of a primitive tile into a configuration of tiles at each step is called the inflation coefficient.

In typical OCaml fashion, the tyranny of early bound top-level definitions means that I have to sweat the small details first, so that the intended objective is reached in a grand finale by a simple function that synthesizes all the bottom-up development. That's is at odds with writing an engaging literate program, I'm afraid, but it has to be done. The first ingredient, then, in my recipe for pretty pictures is something that lets me put the pieces where they should go. A little bit of linear algebra, in short.

I'll use the most direct representation possible of affine transformations, an affine matrix T such that a vector v gets mapped to vT. Only, I'll be sloppy and identify points on the plane (x, y) ∈ R2 with homogeneous vectors (x, y, 1) ∈ A2. I always use row vectors and multiply them on the left, which means that the last column of the matrix T is fixed to (0, 0, 1)T, so I don't need to represent it. To simplify things, I flatten the matrix in column-major order:

type xform = float array

With that, the symmetries of the dihedral group D2 are simple constant matrices:

let identity = [|  1.;  0.;  0.;  0.;  1.;  0. |]
and fliph    = [|  1.;  0.;  0.;  0.; -1.;  0. |]
and flipv    = [| -1.;  0.;  0.;  0.;  1.;  0. |]
and reflect  = [| -1.;  0.;  0.;  0.; -1.;  0. |]
and qrot     = [|  0.; -1.;  0.;  1.;  0.;  0. |]

The affine transforms are equally direct:

let scale s = [| s; 0.; 0.; 0.; s; 0. |]
and translate (x, y : point) = [| 1.; 0.; x; 0.; 1.; y |]
and skew (x, y : point) = [| 1.; x; 0.; y; 1.; 0. |]
and rotate a =
  let s, c = sin a, cos a in
  [| c; -. s; 0.; s; c; 0. |]

Note that I use homogeneous scaling. The composition of two transforms is simply the product of the corresponding matrices. I denote the composition with the % operator, but perhaps a better choice would have been an infix symbol that highlighted the non-commutative nature of transformations:

let ( % ) (t : xform) (u : xform) : xform =
  [|
    t.(0) *. u.(0) +. t.(3) *. u.(1);
    t.(1) *. u.(0) +. t.(4) *. u.(1);
    t.(2) *. u.(0) +. t.(5) *. u.(1) +. u.(2);
    t.(0) *. u.(3) +. t.(3) *. u.(4);
    t.(1) *. u.(3) +. t.(4) *. u.(4);
    t.(2) *. u.(3) +. t.(5) *. u.(4) +. u.(5);
  |]

To apply a transformation to a point is as simple as vector-matrix multiplication:

let apply (t : xform) (x, y : point) : point =
  (x *. t.(0) +. y *. t.(1) +. t.(2),
   x *. t.(3) +. y *. t.(4) +. t.(5))

The underlying symmetries of the tilings I'll reproduce are polygonal (pentagonal, octagonal, decagonal and so on), so I find easier to work in degrees:

let pi = 3.141592653589793238462643383276
let deg x = x *. 0.017453292519943295769236907684
let rad x = x *. 57.295779513082320876798154814169

The gist of a tiling is captured in the following signature:

module type TILING = sig
  type t
  val name  : string
  val shape : t -> point list
  val color : t -> rgbcolor
  val seed  : (t * xform) list
  val rules : (t * xform) -> (t * xform) list
end

The type t abstracts away the tiles used. name is the full name of the substitution, for purposes of meaningfully labeling the output. shape t returns the polygon corresponding to the tile t as a list of points, and color t returns an RGB triple for that tile. seed is a list of placed tiles with which the inflation process will start. Finally, the meat of the tiling is encoded in rules, that for each tile t returns the expansion as a list of placed tiles. By convention, I'll scale the resulting expansion so that the transformations are local to each tile; in other words, I'll avoid the need to keep track of the overall scale. With that, the core algorithm is captured in the following function:

let iterate f l n =
  let rec go l n =
    if n == 0 then l else
    go (List.concat (List.map f l)) (pred n)
  in go l n

This rather generic function repeatedly expands a tiling l by applying n times the rules f. Interpreted under the list monad, iterate is nothing more than functional iteration of the monadic function f; but clearly bringing the entire machinery of the non-deterministic monad for just this is overkill.

With this, I have all the necessary pieces to actually implement the drawing algorithm. In keeping with the theme of the previous post, I'll output EPS code to a temporary file which I'll then open with Preview. The Generator is parameterized by a TILING:

module Generator (S : TILING) = struct

But, there's a snag to solve: as I've set up things, the final size of the tiling will be that of the seed, and I use, in general, shapes with sides of length 1. This will make the final tiling the size of a pixel! Precisely this is the drive behind using "abstract" back-ends: I can draw the tiling once to get its "real" size, rescale it, and then draw it for real:

  module Tiling (G : GRAP) = struct
    module M = Monad(G.M)
    open M.Op

    let draw = M.mapm_ (fun (t, x) ->
      let vs = List.map (apply x) (S.shape t) in
         G.color (S.color t)
      >> G.poly ~fill:true vs
      >> G.gray 0.
      >> G.poly ~fill:false vs)
  end

Remember that a GRAP back-end contains the monad M that implements the back-end. Drawing a tiling involves taking each placed tile (t, x), transforming the polygon S.shape t by the placement transform x, and drawing it. Again, this is completely generic on the "drawing" method. I take advantage of this genericity first to find the bounding box of the tiling:

  let bounds tiling =
    let module T = Tiling(Bounds) in
    Bounds.bounds (T.draw tiling)

Then, I draw the tiling simply by setting an appropriate line width and using the correct back-end:

  let show width height tiling =
    let module T = Tiling(Eps) in
    Eps.show width height (T.M.seq (Eps.weight 0.5) (T.draw tiling))

This pair of functions is the pay-off of so much refactoring and abstraction! Finally, generate ties all together by finding the bounding box, scaling the tiling and outputting it:

  let transform x' = List.map (fun (t, x) -> (t, x % x'))

  let generate dim levels =
    let tiling = iterate S.rules S.seed levels in
    let bbox   = bounds tiling in
    let width  = bbox.Bounds.right -. bbox.Bounds.left
    and height = bbox.Bounds.bottom -. bbox.Bounds.top in
    let size   = dim /. width in
    let xform  = translate (-. bbox.Bounds.left, -. bbox.Bounds.top) % scale size in
    show (width *. size) (height *. size) (transform xform tiling)
end

This function has a rather arbitrary design decision: whether to scale to the bounding box of the seed, or to that of the final tiling. If the expansion were confined to the boundary of the parent tile, it would make no difference (except for the work involved!); not all tilings, however, are so tidy. One of the most famous aperiodic tilings, Penrose's, overstep the boundaries of the parent tile.

The job is done, except for the actual tilings. First, the Amman-Beenker rhomb-triangle tiling:

module AmmanBeenkerRhombTriangle : TILING = struct
  type t = Tu | Td | Pa

  let name = "Amman-Beenker"

  let rho  = sqrt 2.
  and rho' = sqrt 0.5

  let shape = function
  | Tu -> [0., 0.; rho', rho'; rho, 0.; 0., 0.]
  | Td -> [0., 0.; rho, 0.; rho', -. rho'; 0., 0.]
  | Pa -> [0., 0.; 1., 0.; 1. +. rho', -. rho'; rho', -. rho'; 0., 0.]

  let color = function
  | Tu -> (0.976, 0.4, 0.063)
  | Td -> (0.973, 0.796, 0.4)
  | Pa -> (0.804, 0.796, 0.855)

  let seed = 
    let xform = rotate (deg 45.) in
    [Tu, xform; Td, xform]

  let rules (t, x) =
    let x' = scale (rho -. 1.) % x in match t with
  | Tu -> [
    Tu, rotate (deg (-135.)) % translate (1., 1.) % x';
    Pa, rotate (deg (-90.)) % translate (1. +. rho', 1. +. rho') % x';
    Tu, rotate (deg 135.) % translate (2. +. rho', rho') % x';
    Pa, rotate (deg 0.) % translate (1. +. rho', rho') % x';
    Td, rotate (deg 180.) % translate (1. +. rho, 0.) % x'; 
  ]
  | Td -> [
    Td, rotate (deg 135.) % translate (1., -1.) % x';
    Pa, rotate (deg 135.) % translate (1. +. rho', -1. -. rho') % x';
    Td, rotate (deg (-135.)) % translate (2. +. rho', -. rho') % x';
    Pa, rotate (deg 45.) % translate (1. +. rho', -. rho') % x';
    Tu, rotate (deg 180.) % translate (1. +. rho, 0.) % x'; 
  ]
  | Pa -> [
    Pa, x';
    Td, rotate (deg 0.) % translate (1., 0.) % x';
    Tu, rotate (deg 135.) % translate (2. +. rho, -1.) % x';
    Pa, rotate (deg 0.) % translate (1. +. rho, -1.) % x';
    Td, rotate (deg 180.) % translate (1. +. rho +. rho', -1. -. rho') % x';
    Tu, rotate (deg (-45.)) % translate (rho', -. rho') % x';
    Pa, rotate (deg (-90.)) % translate (1. +. rho, 0.) % x';
  ]
end

There's not much to say about this code, except that it encodes precisely the data presented in its Encyclopedia page. Let me just add that I found encoding the transformations quite error-prone; but then again I'm not a visual thinker. The result is:

let () =
  let module G = Generator(AmmanBeenkerRhombTriangle) in
  G.generate 256. 4

This tile could make excellent backgrounds; note that the weaving of triangles and parallelograms creates a somewhat disturbing optical illusion. Another beautiful tiling is the Robinson triangle tiling:

module RobinsonTriangle : TILING = struct
  type t = Su | Sd | Au | Ad

  let name = "Robinson.Triangle"

  let phi  = 0.25 *. (sqrt 5. +. 1.)
  let phi' = 0.25 *. (sqrt 5. -. 1.)
  and eta  = sqrt (0.125 *. (5. +. sqrt 5.))
  and eta' = sqrt (0.125 *. (5. -. sqrt 5.))

  let shape = function
  | Su -> [0., 0.; 2. *. phi , 0.; phi ,    eta'; 0., 0.]
  | Sd -> [0., 0.; 2. *. phi , 0.; phi , -. eta'; 0., 0.]
  | Au -> [0., 0.; 2. *. phi', 0.; phi',    eta ; 0., 0.]
  | Ad -> [0., 0.; 2. *. phi', 0.; phi', -. eta ; 0., 0.]

  let color = function
  | Su -> (0., 0., 0.376)
  | Sd -> (0.808, 0.796, 0.89)
  | Au -> (1., 0.8, 0.4)
  | Ad -> (1., 0.4, 0.)

  let seed =
    let t = rotate (deg 36.) in
    [ Su, t; Sd, t ]

  let rules (t, x) =
    let x' = scale (0.5 /. phi) % x in match t with
  | Su -> [
    Su, rotate (deg (-144.)) % translate (0.5 +. phi, eta) % x';
    Ad, rotate (deg ( -36.)) % translate (0.5 +. phi, eta) % x';
    Sd, reflect % translate (1. +. 2. *. phi, 0.) % x';
  ]
  | Sd -> [
    Sd, rotate (deg 144.) % translate (0.5 +. phi, -. eta) % x';
    Au, rotate (deg  36.) % translate (0.5 +. phi, -. eta) % x';
    Su, reflect % translate (1. +. 2. *. phi, 0.) % x';
  ]
  | Au -> [
    Au, rotate (deg 108.) % translate (1., 0.) % x';
    Su, rotate (deg (-108.)) % translate (0.5, 2. *. eta *. phi) % x';
  ]
  | Ad -> [
    Ad, rotate (deg (-108.)) % translate (1., 0.) % x';
    Sd, rotate (deg 108.) % translate (0.5, -2. *. eta *. phi) % x';
  ]
end

The seed brings together to obtuse triangles to form a parallelogram; there is, however, no way to "cut" the tile into a rectangular tile with matching borders, so it won't work as a background:

let () =
  let module G = Generator(RobinsonTriangle) in
  G.generate 256. 6

Finally, the classic Penrose rhomb tiling:

module PenroseRhomb : TILING = struct
  type t = R0 | R1

  let name = "Penrose.Rhomb"

  let phi  = 0.25 *. (sqrt 5. +. 1.)
  let phi' = 0.25 *. (sqrt 5. -. 1.)
  and eta  = sqrt (0.125 *. (5. +. sqrt 5.))
  and eta' = sqrt (0.125 *. (5. -. sqrt 5.))

  let shape = function
  | R0 -> [0., 0.; eta', -. phi ; 0., -2. *. phi ; -. eta', -. phi ; 0., 0.]
  | R1 -> [0., 0.; eta , -. phi'; 0., -2. *. phi'; -. eta , -. phi'; 0., 0.]

  let color = function
  | R0 -> (0.973, 0.796, 0.4)
  | R1 -> (0.08, 0.027, 0.4)

  let seed = [
    R0, identity;
    R0, rotate (deg 72.);
    R0, rotate (deg 144.);
    R0, rotate (deg 216.);
    R0, rotate (deg 288.);
  ]

  let rules (t, x) =
    let x' = scale (2. *. phi') % x in match t with
  | R0 -> [
    R0, reflect % translate (0., -2. *. phi) % x';
    R1, rotate (deg 36.) % translate (eta', -. phi) % x';
    R0, rotate (deg 144.) % translate (0., -1. -. 2. *. phi) % x';
    R0, rotate (deg (-144.)) % translate (0., -1. -. 2. *. phi) % x';
    R1, rotate (deg (-36.)) % translate (-. eta', -. phi) % x';
  ]
  | R1 -> [
    R1, rotate (deg 108.) % translate (-. eta', phi' -. 0.5) % x';
    R1, rotate (deg (-108.)) % translate (eta', phi' -. 0.5) % x';
    R0, rotate (deg 108.) % translate (0., -1.) % x';
    R0, rotate (deg (-108.)) % translate (0., -1.) % x';
  ]
end

As you can see, the tiles overflow their parents. I used a "sun" seed to generate a pentagonally-symmetrical result:

let () =
  let module G = Generator(PenroseRhomb) in
  G.generate 256. 4

I've uploaded the complete program, with the code in the last post and in this, to the OCamlCore Forge. There are a number of really nice tilings I'll maybe some day implement noted in comments at the end of the file. Or, find the ones you like and add a snippet to the forge. Have fun!

2008-08-15

Monads are Plug-ins

As Haskell gains popularity, a dozen articles or more a month about them make the front page at the Programming reddit. Most deal with the necessity of using and understanding monads to control side-effects in Haskell. This is only natural and to be expected since, in Haskell, they're a language design decision, and as such as inescapable as laziness and as structural as the predefined operator precedence.

My background is in OCaml (gratuitous biographical aside: I found a notebook with OCaml code I wrote in '96), and so I view the IO and attendant monads with some disdain. Let me explain myself: my view is that using them to control side-effects is a rather pedestrian application of monads.

There's another, much higher-level view of them, and that is monads as categorical constructions. Category theory, by its very nature as a mathematical framework, is a language in which to express relations and operations between things. It's the mathematical equivalent of the GoF's patterns. So, reading an article like this one left me with the slightly uneasy feeling of having simultaneously not understood much of it (because I don't know Haskell, and because the article has no context to speak of) and knowing that an isomorphism between the IO monad and anything else cannot allow you to "weave in and out of using IO": freeness demands that the "other" side of the isomorphism contain, encode or fake enough of IO so as to be undistinguishable from it (the type of IO' as presented in the article lent me evidence of this). This is where category theory comes handy: understanding by pattern-matching.

The practical upshot of this is that it enables a third view of monads, Moggi's, as capsules of meaning that are, if not composable, certainly pluggable. First in Notions of computation and monads, then in Monads and Effects (whence IO and friends), Moggi shows that monads are the right tool for the (hard) job of making your code independent of, or at least parameterizable by, semantics. Monads are plug-ins.

In what way are monads plug-ins? Well, a plug-in must have two things: specific behavior that you can invoke to augment your program in unforeseen ways even after you've written it, and a standard interface by which the program can call it. Monads have a rather minimalist interface:

module type MONAD = sig
  type 'a m
  val return : 'a -> 'a m
  val bind   : 'a m -> ('a -> 'b m) -> 'b m
end

First comes a type, α m: this is the "protocol" by which your program and the plug-in communicate without one having to "know too much" (or more than the standardized interface allows) about the other. It lets the monad say, in effect: "for me to be arbitrary but compatible, you'll have to accept that I do things my way". The monad gets to have its "private space" in which to do its job, without fear that the program will mess up its stuff.

Then comes the method return, that the program uses to pass values down to the monad. It lets the code say, "here, have this value for your own use". Finally, there's the operation bind that lets the program and the monad maintain a conversation. With it, the program can say to the monad "oh, when you are done with anything, here's another thing I need you to do", to which the monad responds "OK, I'll take that and give you back the result". This "wiring" aspect to the bind is emphasized in it being usually called >>=, an infix operator.

This is all nice, operational and anthropomorphizing, but hardly illuminating. The fact that the monad functions as a protocol is formalized in the monad laws which every concrete, "plug-in" monad must abide to:

  1. bind (return a) k  ≡  k a
  2. bind m return  ≡  m
  3. bind (bind m k) h  ≡  bind m (fun x -> bind (k x) h)

The first law states that return on the left of bind does nothing. The second law states that return on the right of bind does nothing. The third law means that a bind of binds associates freely. But that's exactly what a monoid is: a "thing" with an associative operation that has a neutral, "do-nothing" element on the left and right of it.

Consider by way of example the concatenation of strings, denoted by ^. There is a special string, "" or the empty string, that "doesn't change" a string when catenated to it: for all strings s, "" ^ s is the same as s, and so is s ^ "". Also, it doesn't matter which string you concatenate before and which after; it's the order in which you concatenate that matters. In symbols, s ^ (t ^ u) is the same as (s ^ t) ^ u.

You might think, what have monoids to do with monads? The answer is that monads "behave" algebraically as monoids. You might think next, aren't monads monoids, then? Well, no. The crucial difference between a monoid and a monad is that, packed inside the return and the bind operations of a monad there is meaning tucked in. It could send output to a console. It could update a memory location. It could try simultaneously different possible computations and return any, or all of them. It could record transactional data. The key is that the monad laws are a contract that allow you, no matter how complicated the "stuff" the monad does underneath, operate with it in a structured, predictable way. Arbitrary behavior under a fixed interface. A plug-in, in short.

If you followed me so far you might have a nagging doubt by now. How do you "escape" the monad, once you use it? That is, return serves to put plain old values in a "wrapper" the monad can use but, how can you get back what the monad did with it? Where's the function with type α m → α? The bad news is that there isn't any… in general. That is, there is no guaranteed way to get back a value from an arbitrary monad. However, many useful monads do provide an escape to "retrieve" the useful work the monad did.

At this point I should show a concrete monad. A classical example (one that I'll eventually need to use) is the state monad. It allows to "plug-in" statefulness to an otherwise "pure" or "stateless" function:

module State (S : sig type t end) = struct
  type 'a m = St of (S.t -> 'a * S.t)
  let return a = St (fun s -> (a, s))
  let bind (St m) f = St (fun s ->
    let (x, s') = m s in
    let (St m') = f x in
    m' s')

  let get = St (fun s -> (s, s))
  let put = fun s -> St (fun _ -> ((), s))

  let eval (St m) = fun s -> fst (m s)
  let run  (St m) = fun s -> snd (m s)
end

The innards are perhaps not very illuminating on a first reading, so let me show its type:

module State : functor (S : sig type t end) -> sig
  type 'a m = St of (S.t -> 'a * S.t)
  val return : 'a -> 'a m
  val bind   : 'a m -> ('a -> 'b m) -> 'b m
  val get    : S.t m
  val put    : S.t -> unit m
  val eval   : 'a m -> S.t -> 'a
  val run    : 'a m -> S.t -> S.t
end

This monad is multi-parametric, in the sense that not only the values your function computes are arbitrary (the α in the monad's type), but the states you might want to track are too. Suppose you want to tally the arithmetic operations a function uses to calculate a value. The state can be as simple as a single integer, or as complex as a tuple recording separate counts for additions and subtractions, multiplications, divisions, square roots and so on. Whichever the case, the key to this monad is that an impure function can be split into a pure part and a state transformer part that takes an input state and returns an output state together with the value of the function. With that, it's not difficult to see that if return is to be a neutral element, it must pass its input state unchanged. In other words, the effect of a value is nil. In yet other words, a value is always pure. Also, bind must keep track of intermediate states between effects, and thread them appropriately.

Immutable state is no state at all but just another value, so this monad comes with a pair of operations get and put that allow your program to retrieve the state at some point of your function, and modify it. Also, after your computation terminates, you should be able to retrieve not only its final value with eval but also to access the final state with run. This variant assumes you'll want one or the other; it is possible to get both at once without having to re-compute the function.

Note that get is a monadic value, not a function. This might not make sense unless you look into the monad and see how it is constructed. For a less operational view, think of get as representing, by itself, whatever the state is at that point in the program. Since it's a monadic value, the only useful thing to do with it is to bind it to a function:

module St = State(struct type t = int end)

let incr = St.bind St.get (fun s -> St.put (succ s))

With that, it's simple to write a monadic addition operator that counts how many times it is invoked:

let ( +! ) mx my =
  St.bind mx   (fun x ->
  St.bind my   (fun y ->
  St.bind incr (fun _ -> St.return (x + y))))

And using it is equally straightforward:

let test () =
  let msum l = List.fold_right (fun x m -> St.return x +! m) l (St.return 0)
  in St.run (msum [1;2;3;4]) 0

This is all quite tedious to repeat each time I need a monad. Fortunately, most of the operations you can build out of monadic bind and monadic return are structural, and so generic on the particular monad. Enter the Monad functor:

module Monad (M : MONAD) = struct
  include M

  let seq m f = bind m (fun _ -> f)

  let join m = bind m (fun m -> m)

  let fmap f m = bind m (fun x -> return (f x))

  let liftm = fmap

  let liftm2 f m m' =
    bind m  (fun x ->
    bind m' (fun y ->
    return (f x y)))

  let liftm3 f m m' m'' =
    bind m   (fun x ->
    bind m'  (fun y ->
    bind m'' (fun z ->
    return (f x y z))))

  let mapm f l =
    List.fold_right (liftm2 (fun x xs -> f x :: xs)) l (return [])

  let sequence l = mapm (fun x -> x) l

  let mapm_ f l =
    List.fold_right (fun x -> seq (f x)) l (return ())

  let sequence_ l = mapm_ (fun x -> x) l

  module Op = struct
    let ( >>= ) = bind
    let ( >>  ) = seq
  end
end

Again, looking at the types is more illuminating than squinting at the implementation:

module Monad : functor (M : MONAD) -> sig
  type 'a m = 'a M.m
  val return    : 'a -> 'a m
  val bind      : 'a m -> ('a -> 'b m) -> 'b m
  val seq       : 'a m -> 'b m -> 'b m
  val join      : 'a m m -> 'a m
  val fmap      : ('a -> 'b) -> 'a m -> 'b m
  val liftm     : ('a -> 'b) -> 'a m -> 'b m
  val liftm2    : ('a -> 'b -> 'c) -> 'a m -> 'b m -> 'c m
  val liftm3    : ('a -> 'b -> 'c -> 'd) -> 'a m -> 'b m -> 'c m -> 'd m
  val mapm      : ('a -> 'b) -> 'a m list -> 'b list m
  val sequence  : 'a m list -> 'a list m
  val mapm_     : ('a -> 'b m) -> 'a list -> unit m
  val sequence_ : 'a m list -> unit m
  module Op : sig
    val ( >>= ) : 'a m -> ('a -> 'b m) -> 'b m
    val ( >> )  : 'a m -> 'b m -> 'b m
  end
end

seq allows you to invoke the monad on the left not for its value but for its effect. join lets you "squash down" a monadic monad to the underlying monad; the fact that it is generic and independent of the underlying semantics means that the monad itself, viewed as a categorical functor, is idempotent, that is, it's useless to monadize a monad. fmap "maps" a function inside a monad, which effectively makes a functor out of a monad. liftmn is an indexed family of mappers for functions of one, two… arguments, I've stopped at three but you could go on as you need.

A list of monads can be composed together to turn them "inside out" into a monad of lists. That is what mapm and sequence do, commuting the monad and the list, with or without transforming the list's elements first. Also, a list of monads can be evaluated in sequence purely by their side effects, that is where the mapm_ and sequence_ come in handy (the names of these functions are exactly the same as that of their Haskell counterparts).

Lastly, the two very common operations return and seq have very convenient operators associated with them. With all this, I can write the following:

module StM = Monad(St)
open StM.Op

let count2 op x y = incr >> StM.liftm2 op x y

let ( +! ) = count2 ( + )
and ( -! ) = count2 ( - )
(* and so on *)

This is the big payoff monads offer: functions can be generic on the structure of the monad, and so independent of its semantics. Monads are plug-ins.

Let me try with another very different monad, a very simple Out:

module Out = struct
  type 'a m = Out of (out_channel -> 'a)
  let return a = Out (fun _ -> a)
  let bind (Out m) f = Out (fun out ->
      let x = m out in
      let (Out m') = f x in
      m' out)

  let fmt fmt =
    let ship s = Out (fun out ->
      output_string out s; output_char out '\n')
    in Printf.kprintf ship fmt

  let write name (Out m) =
    let out = open_out_bin name in
    try let x = m out in close_out out; x
    with e -> close_out out; raise e
end

This monad interleaves formatted writing operations, fmt, as a side effect of whatever the code using it does. What is special of this monad is that the "escape" function, write ships out all the accumulated shows to a named output file, safely closing it even in the event of an exception.

With this, I'm ready to show you how to actually consume plug-ins from your program. Suppose you want to have a simple graphics DSL. I have a program with exactly such a need, and as a MacOS X user, the easiest way I found to satisfy it was to generate EPS and to use /usr/bin/open to handle the data for me. But I didn't want to encapsulate (!) my drawing code inside a PostScript interface, but needed it to be as "abstract" as possible (my original back-end was a Mathematica one). I settled on a simple hybrid:

type point = float * float

type rgbcolor = float * float * float

module type GRAP = sig
  module M : MONAD
  type t = unit M.m
  val weight : float -> t
  val gray   : float -> t
  val color  : rgbcolor -> t
  val save   : t -> t
  val dot    : point -> t
  val line   : point -> point -> t
  val rect   : ?fill:bool -> point -> point -> t
  val poly   : ?fill: bool -> point list -> t
end

The module is not parameterized by a monad, but it does include a monad that will provide the "glue" to whatever back-end I want to write. As you can see, it is rather primitive but includes operations for drawing in color. It is worth mentioning that this is not a so-called retained interface, but an immediate one, and so save runs the drawing operations on a "sub-context" that is restored when all the commands are shipped out. Also, the type that carries the drawing operations is an alias for the unit monadic value, as drawing doesn't really have a value. My first implementation was the EPS back-end:

let mm = ( *. ) (72. /. 25.4)

module Eps = struct
  module M = Monad(Out)
  type t = unit M.m
  open Out
  open M.Op

In this case, all drawing is to be output to a file, so the monad that implements the plug-in driver is Out. The code uses fmt extensively to format the PostScript commands. For instance, an EPS file must specify a bounding box against which all drawing will be clipped to:

  let iso_time () =
    let tm = Unix.gmtime (Unix.time ()) in
    Printf.sprintf "%04d%02d%02dT%02d%02d%02d"
      (tm.Unix.tm_year + 1900) (tm.Unix.tm_mon + 1) tm.Unix.tm_mday
      tm.Unix.tm_hour tm.Unix.tm_min tm.Unix.tm_sec

  let eps margin width height drawing =
    let app_name = Filename.basename Sys.executable_name in
       fmt "%%!PS-Adobe-3.0 EPSF-3.0"
    >> fmt "%%%%BoundingBox: 0 0 %d %d"
      (truncate (ceil (width  +. 2. *. margin)))
      (truncate (ceil (height +. 2. *. margin)))
    >> fmt "%%%%Creator: %s" app_name
    >> fmt "%%%%CreationDate: %s" (iso_time ())
    >> fmt "%%%%DocumentData: Clean7Bit"
    >> fmt "%%%%EndComments"
    >> drawing
    >> fmt "showpage"
    >> fmt "%%%%EOF"

The monad is used purely for its side-effects, namely writing to a file, so I don't bind, I always seq. After the drawing is complete, I actually want to see it on screen; for that, I write a temporary EPS file that I pass to /usr/bin/open, which in turn opens Preview, which converts the file to PDF and shows it:

  let show ?(margin=0.) width height f =
    let name = Filename.temp_file "grap" ".eps" in
    write name (eps margin width height f);
    ignore (Unix.system (Printf.sprintf "/usr/bin/open %s" name))

Indirect, but a showcase of the power of UNIX at work. The drawing operations themselves are pretty straightforward:

  let weight w        = fmt "%g setlinewidth" w
  and gray   w        = fmt "%g setgray" w
  and color (r, g, b) = fmt "%g %g %g setrgbcolor" r g b

  let save f = fmt "gsave" >> f >> fmt "grestore"

  let moveto (x, y) = fmt "%g %g moveto" x y
  let lineto (x, y) = fmt "%g %g lineto" x y
  let path f = fmt "newpath" >> f >> fmt "closepath"
  let draw   = fmt "stroke"
  let paint  = fmt "fill"

As is save, path is a higher-order function that constructs a complete path with the drawing commands. This comes handy:

  let dot (x, y) =
    path (fmt "%g %g currentlinewidth 1.5 mul 0 360 arc" x y) >> paint

  let line p q =
    fmt "newpath" >> moveto p >> lineto q >> draw

  let rect ?(fill=false) (x, y) (x', y') =
    let w, h = x' -. x, y' -. y in
    fmt "%g %g %g %g rect%s"
      x y w h (if fill then "fill" else "stroke")

Polygons are built out of a sequence of points. The first one is the starting point and must be passed to moveto, the rest are to be arguments of linetos. A mapping would do, except for the fact that the code is in monadic style, and must use mapm_:

  let polyline closing l = match l with
  | []      -> return ()
  | p :: ps -> path (moveto p >> M.mapm_ lineto ps) >> closing

  let poly ?(fill=false) = polyline (if fill then paint else draw)
end

Note also that I avoid invoking graphics primitives if the polygon is empty. With this, a simple example works as expected:

let () =
  let module M = struct
    open Eps
    open Eps.M.Op
    let draw p = show 100. 100. (poly p >> (M.mapm_ dot (List.tl p)))
  end in M.draw (List.map (fun i ->
    let a = float i *. 2. *. pi /. 10. in
    let c, s = cos a, sin a in
    (50. +. 20. *. c, 50. +. 20. *. s)
  ) (iota 11))

Success! A decagon with fat vertices. The only problem with an EPS back-end is that it is difficult to get the optimal width and height for an image. In other words, it would be ideal to have some auto-scaling. For that, I have to know the size of a drawing before turning it into PostScript code. Enter the Bounds back-end:

module Bounds = struct
  type bounds = { top : float; left : float; bottom : float; right : float; }

  let empty =
    { top = infinity; left = infinity;
      bottom = neg_infinity; right = neg_infinity; }

I keep track of the bounding box of the commands seen by the interface, as a rectangle given by two pairs of coordinates. The initial, empty bounds are such that I can accumulate a point simply by a series of minimum and maximum operations:

  let add b (x, y) =
    { top    = min y b.top;
      left   = min x b.left;
      bottom = max y b.bottom;
      right  = max x b.right; }

Now the commands will have to preserve the bounding box seen so far. This is where the State monad comes in:

  module St = State(struct type t = bounds end)

  module M = Monad(St)
  type t = unit M.m
  open M.Op

A command that doesn't paint to the output area will be a nop. A command that adds a point p updates the bounds with that point and changes the state accordingly (cf with incr above):

  let nop      = M.return ()
  and draw   p = St.get >>= fun b -> St.put (add b p)

Again, I'm not interested in the final value of the drawing, only in the final bounding box, that is, the state after a series of commands are applied to the empty bounding box:

  let bounds f = St.run f empty

The commands themselves are trivial: those that merely alter the graphics state are nops, those that draw accumulate points on the bounding box:

  let weight       _ = nop
  and gray         _ = nop
  and color        _ = nop
  and save         f = f
  and dot          p = draw p
  and line       p q = draw p >> draw q
  and rect ?fill p q = draw p >> draw q
  and poly ?fill   l = M.mapm_ draw l
end

Now the previous code runs almost unmodified, except that the call to Eps.draw is replaced by a call to Bounds.bounds:

let b =
  let module M = struct
    open Bounds
    open Bounds.M.Op
    let draw p = bounds (poly p >> (M.mapm_ dot (List.tl p)))
  end in M.draw (List.map (fun i ->
    let a = float i *. 2. *. pi /. 10. in
    let c, s = cos a, sin a in
    (50. +. 20. *. c, 50. +. 20. *. s)
  ) (iota 11))

Which results in:

val b : Bounds.bounds =
  {Bounds.top = 30.9788696740969272; Bounds.left = 30.;
   Bounds.bottom = 69.0211303259030728; Bounds.right = 70.}

This allows me to scale up the drawing to fit the boundaries of the EPS file, or to adjust the size of the EPS file so that no space is wasted. The important thing is that the code that contains the drawing commands is left unchanged. Monads are plug-ins.

Next, I'll show you pretty pictures: