2007-10-03

Algebraic Data Types and Java Generics (II)

See also Part I.

Update 2007-10-05: I've reworked the code to make it simpler. The insight I had is that, since Java is rank-1 polymorphic, there is no way I can meaningfully abstract over the particular monad being used; however, the monadic class itself is the monadic type constructor. Hence, the type parameters are somewhat simpler.

There are many definitions of monads on the web, many very formal but that don't provide much insight if you're not formally inclined; some very poetic and even somewhat evocative but not very accurate, morally speaking. For our purposes, a monad takes values of type A and "wraps" them in, or marks them as values of type Monad<A>. This wrapper is mostly opaque and inviolable, except for a number of operations that makes using monads less than hopeless:

  • Given a value x of type A, unit(x) wraps it and turns it into a monadic value of type Monad<A>
  • Given a function f that takes values of type A and returns values of type Monad<B>, bind(f, x) "pipes" a value x of type Monad<A> into f, and returns the corresponding value of type Monad<B>. In effect, bind "lifts" f from taking plain values to accepting monadic values, keeping everything wrapped in the same paper, so to speak.

The intent of this rather puzzling machinery is one of representing "special" operations on values by way of their encapsulation in "tainted" procedures that cannot be escaped. The specialness of these operations might reside on their interacting with the environment, or on having "extra" structure on which they depend (state, privilege certificates, etcetera).

These operations must satisfy a number of properties, called the monad laws, with which we won't be concerned here. A monad could have just a little bit of extra structure: if there is a monadic value, call it zero, that satisfies bind(f, zero) = zero for all functions f, the monad is said to be a monad with zero. This is analogous to regular multiplication of numbers. In this case, the monad usually has another operation that "adds" two monadic values into a third, such that adding any value to the zero leaves that value unchanged; this is the monadic addition, and the extended monad is called a monad with plus.

As I wrote in concluding Part I, the Option type gives rise naturally to a monad. This monad is usually conceptualized as the failure monad, which encapsulates computations that might fail to return a value.

Before we start, we need some type-theoretic definitions. We've seen the type-safe functional object:

interface Function<X, Y> {
  Y apply(X value);
}

A functor is a parametric type (think of it as a "container type") with an operation map that, given a function f that transforms objects, applies it "inside itself":

interface Functor<X> {
  <Y> Functor<Y> map(Function<X, Y> f);
}

Another detail to take into account is that functions are strange with respect to specialization and subtyping. The more specific the return type is, the more specific the function returning it; however, the more specific the function the less specific its argument must be. This is called contravariance, and our functional parameters must take it into account.

To our monads. First, we must define what shape they will have in Java. As a container of values of type X, it can be viewed as a functional. Also, any monad should, at the least, provide a bind (since unit is a kind of "static factory", it cannot be included in the interface signature):

interface Monad<X> extends Functor<X> {
  <Y> Monad<Y> bind(Function<? super X, ? extends Monad<Y>> f);
}

The type of bind is complicated by the fact that it must express the covariance-contravariance of its functional argument. A MonadPlus extends a Monad with zero with an operation plus:

interface MonadPlus<X> extends Monad<X> {
  MonadPlus<X> add(MonadPlus<X> x);
}

(again, the zero is a static constructor, and cannot be included in the interface.) Now for casting an Option into a Monadic shape (it will actually be a MonadPlus):

abstract class Option<X> implements MonadPlus<X> {

The interface obligations for MonadPlus will be kept implicit; we will use the same trick as before to defer functionality to inner classes. The first case is that of the Some type constructor. It naturally subclasses Option:

  public static class Some<X> extends Option<X> {
    private final X value;
    private Some(X value) { this.value = value; }

The Functor interface requires implementing map. Naturally, it transforms the value and wraps it in a new Some:

    public <Y> Option<Y> map(Function<X, Y> f) {
      return new Some<Y>(f.apply(this.value));
    }

The Monad interface requires implementing bind. For a Some(x) option, it suffices to apply f to its value and return the result:

    public <Y> Monad<Y> bind(Function<? super X, ? extends Monad<Y>> f) {
      return f.apply(this.value);
    }

To comply with MonadPlus interface, the plus operation must return the first argument that is not zero. Since Some is not the zero of the monad, it returns itself:

    public MonadPlus<X> add(MonadPlus<X> _) {
      return this;
    }
  }

The other constructor is None:

  public static class None<X> extends Option<X> {
    private None() { }

The map on it returns None, as tere is no value to supply to f:

    public <Y> Option<Y> map(Function<X, Y> _) {
      return new None<Y>();
    }

Note: I could have returned this here, as there is conceptually just one value None. However, the type of this is parameterized by X while the result must be parameterized by Y. The alternative is to use an unchecked cast, return (None<Y>) this;. This is type-safe but uglier.

Ditto for bind:

    public <Y> Monad<Y> bind(Function<? super X, ? extends Monad<Y>> _) {
      return new None<Y>();
    }

The plus returns its argument, as the left side is this of class None, which by definition is the zero of the monad:

    public MonadPlus<X> add(MonadPlus<X> x) {
      return x;
    }
  }

As before, we seal the Option class from extension:

  private Option() { }

And let code create optional values using static constructors:

  public static <X> Option<X> unit(X x) { return new Some<X>(x); }
  public static <X> Option<X> zero(   ) { return new None<X>( ); }
}

This completes a type-safe implementation of the Option monad (equivalent to Haskell's MaybeMonad) in Java. We can test its usage:

public class OptionTest {
  public static void main(String[] args) {
    Function<Integer, Option<Integer>> f = new Function<Integer, Option<Integer>>() {
      public Option<Integer> apply(Integer x) {
        System.out.println("n = "+x);
        return Option.zero();
      }
    };
    Function<Integer, Option<Integer>> g = new Function<Integer, Option<Integer>>() {
      public Option<Integer> apply(Integer x) {
        System.out.println("n = "+x);
        return Option.unit(x + 10);
      }
    };
    final Option<Integer> zero = Option.zero();
    zero.add(Option.unit(10)).bind(g).bind(f).bind(g);
  }
}

Note that we have to help Java's type checker by assigning Option.zero() to a variable of the right type.

No comments: