This is for the record, since the derivations took me a while and I'd rather not lose them.

A functor is the signature:

module type FUNCTOR = sig
  type 'a t
  val fmap : ('a -> 'b) -> ('a t -> 'b t)
end

satisfying the following laws:

Identity:    fmap id ≡ id
Composition: fmap (f ∘ g) ≡ fmap f ∘ fmap g
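As a concrete warm-up (mine, not part of the development above), the option type forms a functor, and the two laws can be spot-checked on sample values:

```ocaml
(* A concrete functor instance: the option type. *)
module OptionFunctor = struct
  type 'a t = 'a option
  let fmap f = function None -> None | Some x -> Some (f x)
end

let () =
  let open OptionFunctor in
  (* Identity: fmap id ≡ id, on samples *)
  assert (fmap (fun x -> x) (Some 3) = Some 3);
  assert (fmap (fun x -> x) None = (None : int option));
  (* Composition: fmap (f ∘ g) ≡ fmap f ∘ fmap g, on a sample *)
  let f x = x + 1 and g x = x * 2 in
  assert (fmap (fun x -> f (g x)) (Some 3) = fmap f (fmap g (Some 3)))
```

Such checks are of course no substitute for the equational proofs that follow; they only exercise the laws on particular points.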

An applicative structure or idiom is the signature:

module type APPLICATIVE = sig
  type 'a t
  val pure : 'a -> 'a t
  val ap : ('a -> 'b) t -> ('a t -> 'b t)
end

satisfying the following laws:

Identity:     ap (pure id) ≡ id
Composition:  ap (ap (ap (pure (∘)) u) v) w ≡ ap u (ap v w)
Homomorphism: ap (pure f) ∘ pure ≡ pure ∘ f
Interchange:  ap u (pure x) ≡ ap (pure (λf.f x)) u
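Again as an illustration of my own (the names are mine), the option type is an applicative structure, and the pointful laws can be spot-checked directly:

```ocaml
(* A concrete applicative instance: the option type. *)
module OptionApplicative = struct
  type 'a t = 'a option
  let pure x = Some x
  let ap u v = match u, v with Some f, Some x -> Some (f x) | _ -> None
end

let () =
  let open OptionApplicative in
  let f x = x * 3 in
  (* Identity: ap (pure id) ≡ id, on a sample *)
  assert (ap (pure (fun x -> x)) (Some 5) = Some 5);
  (* Homomorphism: ap (pure f) (pure x) ≡ pure (f x) *)
  assert (ap (pure f) (pure 2) = pure (f 2));
  (* Interchange: ap u (pure x) ≡ ap (pure (λf.f x)) u *)
  let u = Some f in
  assert (ap u (pure 2) = ap (pure (fun f -> f 2)) u)
```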

An applicative functor is the structure:

module type APPLICATIVE_FUNCTOR = sig
  type 'a t
  include FUNCTOR     with type 'a t := 'a t
  include APPLICATIVE with type 'a t := 'a t
end

that is simultaneously a functor and an applicative structure, satisfying the additional law:

Fmap: fmap ≡ ap ∘ pure

A monad is the structure:

module type MONAD = sig
  type 'a t
  val return : 'a -> 'a t
  val bind : ('a -> 'b t) -> ('a t -> 'b t)
end

satisfying the following laws:

Right unit:    bind return ≡ id
Left unit:     bind f ∘ return ≡ f
Associativity: bind f ∘ bind g ≡ bind (bind f ∘ g)
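One more concrete instance of mine, in the same operator order as the signature (function first, action second): the option type is a monad, and the three laws can be spot-checked on samples:

```ocaml
(* A concrete monad instance: the option type, with bind taking the
   function first, as in the MONAD signature above. *)
module OptionMonad = struct
  type 'a t = 'a option
  let return x = Some x
  let bind f = function None -> None | Some x -> f x
end

let () =
  let open OptionMonad in
  let f x = if x > 0 then Some (x - 1) else None in
  let g x = Some (x * 2) in
  (* Right unit: bind return ≡ id *)
  assert (bind return (Some 7) = Some 7);
  (* Left unit: bind f ∘ return ≡ f *)
  assert (bind f (return 7) = f 7);
  (* Associativity: bind f ∘ bind g ≡ bind (bind f ∘ g) *)
  assert (bind f (bind g (Some 3)) = bind (fun x -> bind f (g x)) (Some 3))
```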

Every monad is an applicative functor:

module ApplicativeFunctor (M : MONAD)
  : APPLICATIVE_FUNCTOR with type 'a t = 'a M.t
= struct
  type 'a t = 'a M.t
  open M
  let fmap f = bind (fun x -> return (f x))
  let pure = return
  let ap u v = bind (fun f -> fmap f v) u
end

This can be proved by easy (but tedious) equational reasoning. That the proof is rigorous is Wadler's celebrated result.
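Before the proofs, the construction can be exercised on a concrete monad. This is a self-contained sketch: the MONAD signature is repeated from above, OptionMonad is my own illustrative instance, and the functor body is the one from the text minus the signature ascription:

```ocaml
module type MONAD = sig
  type 'a t
  val return : 'a -> 'a t
  val bind : ('a -> 'b t) -> ('a t -> 'b t)
end

module OptionMonad = struct
  type 'a t = 'a option
  let return x = Some x
  let bind f = function None -> None | Some x -> f x
end

(* The construction from the text, without the sealing signature. *)
module ApplicativeFunctorOfMonad (M : MONAD) = struct
  type 'a t = 'a M.t
  open M
  let fmap f = bind (fun x -> return (f x))
  let pure = return
  let ap u v = bind (fun f -> fmap f v) u
end

module A = ApplicativeFunctorOfMonad (OptionMonad)

let () =
  assert (A.fmap succ (Some 1) = Some 2);
  assert (A.ap (A.pure succ) (Some 1) = Some 2);
  (* Fmap law: fmap ≡ ap ∘ pure, on a sample *)
  assert (A.fmap succ (Some 41) = A.ap (A.pure succ) (Some 41))
```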

Proof of the Functor Identity:

fmap id
≡ { definition }
  bind (return ∘ id)
≡ { composition }
  bind return
≡ { Monad Right unit }
  id

Proof of the Functor Composition:

fmap f ∘ fmap g
≡ { definition }
  bind (return ∘ f) ∘ bind (return ∘ g)
≡ { Monad Associativity }
  bind (bind (return ∘ f) ∘ return ∘ g)
≡ { Monad Left unit }
  bind (return ∘ f ∘ g)
≡ { definition }
  fmap (f ∘ g)

A number of naturality conditions are simple equations between λ-terms. I'll need these later:

Lemma 1 (Yoneda):

fmap f ∘ (λh. fmap h x)
≡ { defn. ∘, β-reduction }
  λg. fmap f (fmap g x)
≡ { defn. ∘ }
  λg. (fmap f ∘ fmap g) x
≡ { Functor Composition }
  λg. fmap (f ∘ g) x
≡ { abstract }
  λg. (λh. fmap h x) (f ∘ g)
≡ { defn. ∘, η-contraction }
  (λh. fmap h x) ∘ (f ∘)

Lemma 2:

fmap f ∘ return
≡ { definition }
  bind (return ∘ f) ∘ return
≡ { Monad Left unit }
  return ∘ f

Lemma 3:

bind f ∘ fmap g
≡ { definition fmap }
  bind f ∘ bind (return ∘ g)
≡ { Monad Associativity }
  bind (bind f ∘ return ∘ g)
≡ { Monad Left unit }
  bind (f ∘ g)

Lemma 4:

bind (fmap f ∘ g)
≡ { definition fmap }
  bind (bind (return ∘ f) ∘ g)
≡ { Monad Associativity }
  bind (return ∘ f) ∘ bind g
≡ { definition fmap }
  fmap f ∘ bind g

The Applicative Functor condition is easy to prove and comes in as a handy lemma:

ap ∘ pure
≡ { definition }
  λu. λv. bind (λf. fmap f v) (return u)
≡ { Monad Left unit, β-reduction }
  λu. λv. fmap u v
≡ { η-contraction }
  fmap

Proof of the Applicative Identity:

ap (pure id)
≡ { Applicative Functor }
  fmap id
≡ { Functor Identity }
  id

Proof of the Applicative Homomorphism:

ap (pure f) ∘ pure
≡ { Applicative Functor }
  fmap f ∘ pure
≡ { Lemma 2, defn. pure }
  pure ∘ f

Proof of the Applicative Interchange:

ap (pure (λf.f x)) u
≡ { Applicative Functor }
  fmap (λf.f x) u
≡ { definition }
  bind (return ∘ (λf.f x)) u
≡ { defn. ∘, β-reduction }
  bind (λf. return (f x)) u
≡ { Lemma 2 }
  bind (λf. fmap f (return x)) u
≡ { definition }
  ap u (pure x)

The proof of the Applicative Composition condition is the least straightforward of the lot, as it requires ingenuity to choose the reduction to apply at each step. I started with a long, tedious derivation that required forward and backward reasoning; at the end I refactored it into byte-sized lemmas to simplify it as much as I could. As a heuristic, I always try to start from the most complicated expression, to avoid having to guess where and what to abstract (that is, applying elimination rules requires neatness, while applying introduction rules requires backtracking):

ap (ap (ap (pure (∘)) u) v) w
≡ { Applicative Functor }
  ap (ap (fmap (∘) u) v) w
≡ { definition }
  bind (λf. fmap f w) (bind (λf. fmap f v) (fmap (∘) u))
≡ { Lemma 3 with f := λf. fmap f v, g := (∘) }
  bind (λf. fmap f w) (bind ((λf. fmap f v) ∘ (∘)) u)
≡ { Monad Associativity with f := λf. fmap f w, g := (λf. fmap f v) ∘ (∘) }
  bind (bind (λf. fmap f w) ∘ (λf. fmap f v) ∘ (∘)) u
≡ { defn. ∘ (at right) }
  bind (λf. (bind (λf. fmap f w) ∘ (λf. fmap f v)) (f ∘)) u
≡ { defn. ∘, β-reduction }
  bind (λf. bind (λf. fmap f w) (fmap (f ∘) v)) u
≡ { Lemma 3 with f := λf. fmap f w, g := (f ∘) }
  bind (λf. bind ((λf. fmap f w) ∘ (f ∘)) v) u
≡ { Lemma 1 }
  bind (λf. bind (fmap f ∘ (λf. fmap f w)) v) u
≡ { Lemma 4 with g := λf. fmap f w }
  bind (λf. fmap f (bind (λf. fmap f w) v)) u
≡ { definition }
  ap u (ap v w)
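The law can also be spot-checked mechanically on the option instance, as a sanity check on the derivation (not a proof; the definitions below are the text's construction inlined for option, and `compose` stands in for (∘)):

```ocaml
(* The monad-derived applicative operations, inlined for option. *)
let return x = Some x
let bind f = function None -> None | Some x -> f x
let fmap f = bind (fun x -> return (f x))
let pure = return
let ap u v = bind (fun f -> fmap f v) u
let compose f g x = f (g x)

let () =
  let u = Some (fun x -> x + 1)
  and v = Some (fun x -> x * 2)
  and w = Some 5 in
  (* Composition: ap (ap (ap (pure (∘)) u) v) w ≡ ap u (ap v w) *)
  assert (ap (ap (ap (pure compose) u) v) w = ap u (ap v w));
  (* The law also holds when a component is None. *)
  assert (ap (ap (ap (pure compose) u) v) None = ap u (ap v (None : int option)))
```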

What is this good for? I don't really know; Haskellers can't seem to live without them, while I can't seem to justify their application. I suspect laziness has a lot to do with it; in any case, the machinery is more palatable with the appropriate combinators:

module Functor (F : FUNCTOR) = struct
  include F
  let ( <$> ) = fmap
end

module Applicative (A : APPLICATIVE) = struct
  include A
  let ( <*> ) = ap
end

module Monad (M : MONAD) = struct
  include M
  include (ApplicativeFunctor (M) : APPLICATIVE_FUNCTOR with type 'a t := 'a t)
  let ( <$> ) = fmap
  let ( <*> ) = ap
  let ( >>= ) m f = bind f m
end
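With those operators in scope, applicative style reads quite naturally. A standalone usage sketch, with the derived operators inlined for the option monad so the snippet is self-contained:

```ocaml
(* The derived operators, inlined for option. *)
let return x = Some x
let bind f = function None -> None | Some x -> f x
let fmap f = bind (fun x -> return (f x))
let ( <$> ) = fmap
let ( <*> ) u v = bind (fun f -> fmap f v) u
let ( >>= ) m f = bind f m

let () =
  (* Applicative style: lift a binary function over two options. *)
  assert ((( + ) <$> Some 1 <*> Some 2) = Some 3);
  assert ((( + ) <$> None <*> Some 2) = None);
  (* Monadic style with >>= *)
  assert ((Some 3 >>= fun x -> Some (x + 1)) = Some 4)
```

Both `<$>` and `<*>` are left-associative operators of the same precedence in OCaml, so `f <$> u <*> v` parses as `(f <$> u) <*> v`, which is exactly the lifting idiom.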