Monadt – Algebraic Data Types and Monads in Ruby, Part 2: Monads

In yesterday’s post, I introduced monadt, a gem that adds algebraic data types (ADTs) and monads to Ruby. Today I’m going to dive into how monadt provides monad support, specifically the imperative-looking syntactical sugar you get in languages like Haskell and F#.

I’m not going to cover how monads work in this post, but I will suggest reading either (or both) of the following:

  1. Learn You a Haskell: A Fistful of Monads
  2. F# for Fun and Profit: Dr. Frankenfunctor and the Monadster

Just the Basics

Monadt provides an API for defining and using monads in Ruby, and it includes several common monads out of the box: Maybe, Either, Reader, State, Async, AsyncEither, and ReaderStateEither.

Let’s take a closer look at the AsyncEither monad. Briefly, the Either data structure represents having one of two values: Left X or Right Y. In a monadic context, Either is usually used for the choice between a success value and an error value. In monadt, Either is defined as follows using monadt’s ADT syntax:


class Either
  Left = data :left
  Right = data :right
end

When you use Either in a monad, you perform a series of calculations, opting out early if any of the steps returns an error value.
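To make the short-circuiting concrete, here is a minimal sketch in plain Ruby. It does not use monadt: Left and Right are ordinary Structs standing in for the ADT above, and either_bind, parse_int, safe_div, and divide_strings are hypothetical names invented for this illustration.

```ruby
# Stand-ins for the Either cases; monadt generates its own classes via its ADT syntax.
Left  = Struct.new(:value)
Right = Struct.new(:value)

# bind for Either: run the block only on a Right, pass a Left through untouched.
def either_bind(either)
  either.is_a?(Right) ? yield(either.value) : either
end

# Hypothetical step: parse a decimal string or report an error.
def parse_int(str)
  str.match?(/\A-?\d+\z/) ? Right.new(str.to_i) : Left.new("not a number: #{str}")
end

# Hypothetical step: integer division that reports division by zero as a Left.
def safe_div(x, y)
  y.zero? ? Left.new("division by zero") : Right.new(x / y)
end

def divide_strings(a, b)
  either_bind(parse_int(a)) do |x|
    either_bind(parse_int(b)) do |y|
      safe_div(x, y)
    end
  end
end

ok     = divide_strings("10", "2")    # every step succeeds
failed = divide_strings("10", "zero") # second parse fails, so safe_div never runs
```

Here ok is a Right holding 5, while failed is the Left produced by the second parse_int call; nothing after the failing step is executed.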

The Async monad is designed to streamline performing sequences of asynchronous operations. Basically, it lets you write imperative-looking asynchronous code instead of using a bunch of continuation blocks.

Putting these concepts together, we can use an AsyncEither monad, which asynchronously retrieves either a success value or a failure value. If any asynchronous step fails, the failure value at that point will be returned.
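One way to picture the combination is a Fiber that produces an Either when resumed. The sketch below is my own simplification, not monadt's implementation: AsyncEitherSketch, lookup, and spouse_record are hypothetical names, and it assumes each Fiber finishes in a single resume (a real asynchronous runtime would have to handle intermediate yields).

```ruby
Left  = Struct.new(:value)
Right = Struct.new(:value)

module AsyncEitherSketch
  # ma is a Fiber producing an Either; blk maps a success value to another such Fiber.
  def self.bind(ma, &blk)
    Fiber.new do
      either = ma.resume
      # On Right, run the next step; on Left, stop and hand the failure back.
      either.is_a?(Right) ? blk.call(either.value).resume : either
    end
  end

  def self.return(a)
    Fiber.new { Right.new(a) }
  end
end

# Hypothetical "asynchronous" step: look a key up in a Hash.
def lookup(table, key)
  Fiber.new do
    table.key?(key) ? Right.new(table[key]) : Left.new("missing key: #{key}")
  end
end

USERS = {
  "alice" => { spouse: "bob" },
  "bob"   => { spouse: "alice" }
}.freeze

def spouse_record(users, name)
  AsyncEitherSketch.bind(lookup(users, name)) do |user|
    AsyncEitherSketch.bind(lookup(users, user[:spouse])) do |spouse|
      AsyncEitherSketch.return(spouse)
    end
  end
end

found   = spouse_record(USERS, "alice").resume # Right with bob's record
missing = spouse_record(USERS, "carol").resume # Left from the first lookup
```

Nothing runs until the outermost Fiber is resumed, and the first Left ends the chain.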

To see this in practice, imagine a Ruby method get_url that returns a Fiber which, when resumed with Fiber#resume, produces the JSON contents of a URL as a Hash (wrapped in an Either, so that a failed request can be reported as a Left). Given such a method, we might construct a sequence of operations as follows:


def get_bookface_spouse_data(user)
  # Each block runs only if the preceding request produced a Right value.
  AsyncEither.bind (get_url "www.mydata.com/user-info?uid=#{user}") do |user_info|
    AsyncEither.bind (get_url "www.bookface.com/profile/#{user_info[:bookface_id]}") do |profile|
      AsyncEither.bind (get_url "www.mydata.com/user-info?bfid=#{profile[:spouse_id]}") do |spouse_data|
        AsyncEither.return spouse_data
      end
    end
  end
end

If the request to mydata.com for uid user fails, get_url will return Left "url could not be retrieved #{message}". Consequently, get_url will not be called the second time (using the bookface.com URL) and the method will complete with the Left value.

However, if the first call succeeds, get_url will return a Right wrapping the user info Hash, which is passed into the block given to AsyncEither.bind() (the block has the signature T -> AsyncEither<U>, that is, T -> Fiber<Either<U>>). In our example, that block reads the :bookface_id key from the user info to retrieve the bookface profile data.

If this fails, then again you will short-circuit early with a Left value containing error information. If it succeeds, the user’s bookface profile will be used to retrieve their spouse’s ID, which is then used to locate the stored data for a spouse.

While the above code clearly demonstrates the way nested bind() calls work, I find it hard to read. Haskell and F# provide syntactic sugar to make the code look more imperative, which greatly improves readability. Fortunately, there is a way to emulate this syntactic sugar in Ruby:


def get_bookface_spouse_data2(user)
  Monad.async_either do |m|
    user_info = m.bind (get_url "www.mydata.com/user-info?uid=#{user}")
    profile = m.bind (get_url "www.bookface.com/profile/#{user_info[:bookface_id]}")
    spouse_data = m.bind (get_url "www.mydata.com/user-info?bfid=#{profile[:spouse_id]}")
    m.return spouse_data
  end
end

get_bookface_spouse_data2() is equivalent to get_bookface_spouse_data(), but it looks imperative. We’ll take a look at how this works later, but first let’s see how to define your own monadic types.

Defining Additional Monads

To define your own monad, define bind() and return() as class methods:


class MyMonad
  class << self
    def bind(ma, &blk) # blk is a lambda/proc of signature a -> mb
      # use ma and blk to return something of type mb
    end
    
    def return(a)
      # turn a into ma
    end
  end
end

Then you can call Monadt::Monad.do_m():


Monadt::Monad.do_m(MyMonad) do |m|
  x = m.bind method1(5)
  y = m.bind method2("hello", x)
  m.return (x + y)
end

do_m() takes as its argument a Class which is presumed to have bind() and return() defined. It also takes a block which is the monadic context to execute. The input argument to the block is a special object that knows how to call bind() and return(). The method Monad.async_either() used in our example is just a convenience method that calls Monad.do_m(AsyncEither).
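As a concrete instance of the bind()/return() contract described above, here is a small hypothetical monad. Nilable is a name made up for this sketch (it is not part of monadt): any non-nil object counts as a success and nil counts as the failure. The example exercises it with explicit nested bind() calls only, so it runs without the gem.

```ruby
# Hypothetical monad: a non-nil value is a success, nil is the failure.
class Nilable
  class << self
    # ma is a plain value or nil; blk maps a non-nil value to the next value (or nil).
    def bind(ma, &blk)
      ma.nil? ? nil : blk.call(ma)
    end

    # Wrapping is the identity here, because plain values are already "in" the monad.
    def return(a)
      a
    end
  end
end

config = { db: { host: "localhost" } }

host = Nilable.bind(config[:db]) do |db|
  Nilable.bind(db[:host]) do |h|
    Nilable.return(h.upcase)
  end
end

port = Nilable.bind(config[:cache]) do |cache|
  Nilable.return(cache[:port]) # never runs: config[:cache] is nil
end
```

host ends up as "LOCALHOST", and port is nil because the first lookup already failed.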

Under the Hood

The do_m() method uses Ruby’s Fiber and a pattern I call “yield abuse” in order to mimic the monadic syntactical sugar you see in Haskell or F#. Fibers let you pass control back and forth between the monad implementation code and the user code, with the monad implementation code having the option to opt out early. By stringing the Fiber block execution through a series of recursive lambdas, we can create a code flow that looks like standard monadic execution. You can pull off the same trick in JavaScript.
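The following is a simplified sketch of that idea, not monadt's actual source. The names EitherM, Stepper, and do_m_sketch are made up for this post. The user's block runs inside a Fiber, each m.bind call suspends the Fiber and hands the monadic value to a recursive lambda, and the monad's own bind() decides whether the Fiber is ever resumed.

```ruby
Left  = Struct.new(:value)
Right = Struct.new(:value)

# A minimal Either monad with class-level bind/return.
module EitherM
  def self.bind(ma, &blk)
    ma.is_a?(Right) ? blk.call(ma.value) : ma
  end

  def self.return(a)
    Right.new(a)
  end
end

# The object passed to the user's block as `m`.
class Stepper
  def initialize(monad)
    @monad = monad
  end

  # Suspend the user's block; the driver resumes it with the unwrapped value.
  def bind(ma)
    Fiber.yield(ma)
  end

  def return(a)
    @monad.return(a)
  end
end

def do_m_sketch(monad, &blk)
  fiber = Fiber.new { blk.call(Stepper.new(monad)) }
  step = lambda do |input|
    result = fiber.resume(input)
    if fiber.alive?
      # Suspended inside m.bind: result is the monadic value given to bind.
      # The monad may call the block (continue) or not (opt out early).
      monad.bind(result) { |value| step.call(value) }
    else
      # The block finished: result is its final value, e.g. from m.return.
      result
    end
  end
  step.call(nil)
end

def safe_div(x, y)
  y.zero? ? Left.new("division by zero") : Right.new(x / y)
end

ok = do_m_sketch(EitherM) do |m|
  a = m.bind safe_div(100, 5)
  b = m.bind safe_div(a, 2)
  m.return a + b
end

reached = []
failed = do_m_sketch(EitherM) do |m|
  a = m.bind safe_div(1, 0)
  reached << :after_failure # never runs: the Fiber is not resumed after a Left
  m.return a
end
```

In the second call, EitherM.bind sees a Left and never invokes the continuation, so the suspended Fiber is simply abandoned and the Left becomes the result of the whole block.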

Function-Style Functors

Many of the standard monad types have functions as the monad/functor type that wraps the interior type. So for example, the state monad type m a is traditionally a function that takes in a state, and returns a tuple of a generated value and an updated state: state -> output * state. This is going to look pretty nasty in Ruby unless you find a way to turn normal methods into Procs (the closest thing Ruby has to a function object). Fortunately there is the funkify gem, which I’ve written about before. This handy gem will let you turn standard Ruby methods into Procs that can be partially applied, e.g. so you can have a method


def add(x,y)
  x + y
end

and partially apply it like so:


add_two = add(2) # add_two is a Proc taking one argument

Using funkify, you can generate pretty decent-looking state monad blocks:


def state_func_1(arg, state)
  #...
  [value_1, updated_state_1]
end

def state_func_2(arg1, arg2, state)
  # ...
  [value_2, updated_state_2]
end

Monad.run_state(initial_value) do |m|
  x = m.bind (state_func_1 arg1)
  y = m.bind (state_func_2 arg2, x)
  m.return x * y
end
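For readers who want to see the state -> [value, state] shape without any gems, here is a rough sketch using Proc#curry from Ruby core in place of funkify. StateM and next_id are hypothetical names, and the binds are written as explicit nested blocks rather than through monadt's run_state.

```ruby
module StateM
  # ma is a lambda: state -> [value, new_state].
  # blk maps a value to the next such lambda.
  def self.bind(ma, &blk)
    lambda do |state|
      value, next_state = ma.call(state)
      blk.call(value).call(next_state)
    end
  end

  # Produce a value without touching the state.
  def self.return(a)
    ->(state) { [a, state] }
  end
end

# A state function with the state as its last argument, curried so that
# supplying the leading argument leaves a lambda of shape state -> [value, state].
next_id = ->(prefix, counter) { ["#{prefix}-#{counter}", counter + 1] }.curry

program = StateM.bind(next_id.("user")) do |a|
  StateM.bind(next_id.("order")) do |b|
    StateM.return([a, b])
  end
end

# Nothing runs until an initial state is supplied.
value, final_state = program.call(1)
```

Running program with an initial counter of 1 threads the counter through both steps, leaving value as ["user-1", "order-2"] and final_state as 3.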

No Multiverses…Yet

One significant limitation of the current version of monadt is that the List monad is not supported by do_m(). The list monad requires the ability to retry bind functions for different inputs, which would require a partial rewind through the Fiber block. Unfortunately, this is not supported by Ruby. My coworker Job and I have some ideas for how to get around this, but we haven’t implemented them yet.
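To illustrate the difference, the sketch below (with the hypothetical name ListM) shows that a list monad works fine with explicit nested blocks, because bind can call the same block once per element. A Fiber, by contrast, can only move forward: once it has run past a suspension point, it cannot be resumed from that point again with a different value.

```ruby
module ListM
  # Call the block once for every element and concatenate the results.
  def self.bind(list, &blk)
    list.flat_map(&blk)
  end

  def self.return(a)
    [a]
  end
end

# With explicit blocks, the inner block is re-run for each x.
pairs = ListM.bind([1, 2]) do |x|
  ListM.bind(%w[a b]) do |y|
    ListM.return("#{x}#{y}")
  end
end

# A Fiber is single-shot: there is no way to go back to an earlier yield.
fiber = Fiber.new do
  x = Fiber.yield(:waiting_at_bind)
  "continued with #{x}"
end

fiber.resume                # runs up to the suspension point
first_run = fiber.resume(1) # continues with x = 1 and finishes

rewind_error = nil
begin
  fiber.resume(2)           # attempt to "retry" the same point with x = 2
rescue FiberError => e
  rewind_error = e
end
```

pairs comes out as ["1a", "1b", "2a", "2b"], while the second attempt to continue the Fiber raises a FiberError because the Fiber has already terminated.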

Conclusion

I was pleasantly surprised at how well monadic control flow can be expressed in Ruby. The dynamic typing and monkey patching, fraught with danger though they may be, actually make it possible to implement complex features like typeclasses in Ruby. And Fibers give you a lot of flexibility in how a code block executes. I encourage you to consider monadt for projects that contain structured data you might otherwise express loosely as a Hash or an Array. And even if monadt doesn’t make sense to include in your particular project, I encourage you to dig deeper into the control flow features of Ruby and see how you might be able to leverage them to streamline the logic in your application. Happy (slightly more functional) programming!