Monday 11 October 2010

Solving Polymorphic Problems using Scala Type Classes

Last week I attended Scala Lift Off London. There were a couple of sessions on Type Classes in Scala, which I found very interesting. In fact, they offer a great solution to a problem that I encountered recently on a Java project (although I'm using Scala code for this entire post).

The project had a number of domain objects that could be 'published'. Here, published meant converted to an XML representation, which was then passed to a remote server for processing. Multiple domain objects can be published at the same time in a 'publish package'. The challenge was that these domain objects shared no common interface or base class. For example:

class Page {
  // Page attributes and methods
}

class Story {
  // Story attributes and methods
}

// Other domain objects: image, component etc.

Now, there are a number of ways that these classes could be handled for publishing. An initial suggestion might be to add a Publishable interface to each domain object. However, this is a bad idea for a number of reasons:

  • It couples knowledge of publishing to domain objects that are used for many other purposes.
  • It complicates the domain objects for every user who doesn't need publishing.
  • It may not be possible to change the domain objects if they are owned by another project.

So, how do we go about adding these domain objects to the publish package? There are a number of approaches, none of which is ideal. The first is to add a separate method to the publish package for each class that can be published:

class PublishPackage {
  def add(page: Page) = ...  // convert to XML and add to package
  def add(story: Story) = ... // convert to XML and add to package
  ... // And so on
}

This creates a publish package that has to change every time a new publishable type appears, and it will soon grow beyond a maintainable size. Another approach is to have some kind of dynamic class-based lookup:

class PublishPackage {
  private val mappers: Map[Class[_], PublishMapper] = ...

  def add(obj: AnyRef) = ... // Lookup mapper by object class and call method on it
}

This is much better, as the logic to map each domain object to its publish XML is now outside the publish package. However, it still has some issues: we need to update the map of mappers each time we have a new domain object type to publish, which may involve changing publish code or configuration. There is also a runtime risk, in that the compiler will let us pass any object to the add method even if it can't be published, so we lose some level of type safety. We also have to write additional code in the add method to handle this case.
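To make the runtime risk concrete, here is a minimal sketch of the class-based lookup. It is only an illustration under a few assumptions: mappers return a plain String rather than real XML, Image is a hypothetical domain type with no registered mapper, and the RuntimeLookupSketch wrapper and contents method exist only for the example.

```scala
object RuntimeLookupSketch {
  class Page
  class Story
  class Image // hypothetical domain type with no registered mapper

  // Untyped mapper: it can only accept AnyRef, so nothing ties it to a domain type.
  trait PublishMapper {
    def formatForPublish(content: AnyRef): String // String stands in for XML here
  }

  class PublishPackage(mappers: Map[Class[_], PublishMapper]) {
    private var entries = Vector.empty[String]

    // Compiles for any AnyRef; an unsupported type is only detected when this runs.
    def add(obj: AnyRef): Unit =
      mappers.get(obj.getClass) match {
        case Some(mapper) => entries = entries :+ mapper.formatForPublish(obj)
        case None =>
          throw new IllegalArgumentException("No mapper for " + obj.getClass.getName)
      }

    def contents: Vector[String] = entries
  }

  // The registry that has to be updated by hand for every new publishable type.
  val mappers: Map[Class[_], PublishMapper] = Map(
    classOf[Page] -> new PublishMapper {
      def formatForPublish(content: AnyRef): String = "<page/>"
    },
    classOf[Story] -> new PublishMapper {
      def formatForPublish(content: AnyRef): String = "<story/>"
    }
  )
}
```

Calling add with a new Image compiles without complaint and only fails once the lookup misses, which is exactly the gap in type safety described above.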

This general problem (wanting to handle objects polymorphically when they don't share a common base) occurs all too frequently in software, especially when dealing with legacy code bases or objects from external dependencies. Fortunately, Scala Type Classes give us an elegant solution. What we will do is take the second solution described above, but use Type Classes to solve the problems of runtime safety and the need to update the map of mappers.

First, we change the publish package so that it accepts publish-ready XML, with no specific or general binding to the domain objects:

import scala.xml.NodeSeq

class PublishPackage {
  def +(nodes: NodeSeq) = ... // Add nodes to publish package
}

Next we declare a common trait that will be implemented by all the mappers:

trait PublishMapper[T] {
  def formatForPublish(content: T): NodeSeq
}

Next we declare the Type Class instances (implicit objects in this case) that implement each of the mappers. They are defined as implicit so that the compiler can select the correct one for the type of domain object being used:

implicit object PagePublish extends PublishMapper[Page] {
  def formatForPublish(content: Page) = ...
}

implicit object StoryPublish extends PublishMapper[Story] {
  def formatForPublish(content: Story) = ...
} 

Next we need to declare the implicit conversion method. It takes an implicit PublishMapper, which it calls to convert the domain object into the NodeSeq that is added to the publish package:

implicit def objectToPublishXml[T](t: T)(implicit m: PublishMapper[T]): NodeSeq = m.formatForPublish(t)

Let's talk about this one in more detail. It's a parameterised function marked as implicit, so the compiler can use it to convert any T to a NodeSeq, provided that an implicit PublishMapper instance for that same type T is available. If those conditions are met, the formatForPublish method is called on the mapper and the result is returned.
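Putting the pieces so far together, the whole arrangement can be sketched as one self-contained object. This is an illustration rather than the project's real code: it assumes Scala 2, uses a hypothetical Xml case class as a stand-in for NodeSeq so it runs without the scala-xml module, and the mapper bodies, the contents method and the demo method are placeholders added for the example.

```scala
import scala.language.implicitConversions

object TypeClassSketch {
  // Stand-in for scala.xml.NodeSeq (an assumption, to avoid the scala-xml dependency).
  final case class Xml(value: String)

  class Page
  class Story

  class PublishPackage {
    private var entries = Vector.empty[Xml]
    // Accepts publish-ready XML only; it knows nothing about domain objects.
    def +(nodes: Xml): PublishPackage = { entries = entries :+ nodes; this }
    def contents: Vector[Xml] = entries
  }

  // The type class: how to turn a T into publish XML.
  trait PublishMapper[T] {
    def formatForPublish(content: T): Xml
  }

  // Type class instances, one per publishable domain type.
  implicit object PagePublish extends PublishMapper[Page] {
    def formatForPublish(content: Page): Xml = Xml("<page/>")
  }

  implicit object StoryPublish extends PublishMapper[Story] {
    def formatForPublish(content: Story): Xml = Xml("<story/>")
  }

  // Applies only when an implicit PublishMapper[T] can be found for T.
  implicit def objectToPublishXml[T](t: T)(implicit m: PublishMapper[T]): Xml =
    m.formatForPublish(t)

  def demo(): Vector[Xml] = {
    val p = new PublishPackage
    p + new Page  // compiler inserts objectToPublishXml(new Page)(PagePublish)
    p + new Story // compiler inserts objectToPublishXml(new Story)(StoryPublish)
    p.contents
  }
}
```

The + method asks for Xml, the argument is a Page, so the compiler looks for a conversion and finds objectToPublishXml, which in turn needs a PublishMapper[Page] and finds PagePublish.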

Now that we have this in place, we can just write the following code:

val somePage = new Page
val someStory = new Story

val p = new PublishPackage
p + somePage
p + someStory

So, now we don't have to modify any existing code when we want to support a new publish type. We just implement a new PublishMapper for it, and the compiler will notice that the new implicit conversion is available. Additionally, we are now fully type safe, as the compiler will report an error if we try to add any domain object to the publish package for which there is no publish mapper.
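As a rough sketch of that extension story, the example below keeps the publish code in one object and adds support for a new type from a separate one. The assumptions are the same as before (Scala 2, a hypothetical Xml stand-in for NodeSeq), and the Publishing and ImageSupport objects, the Image and Video types and the demo method are all made up for illustration.

```scala
import scala.language.implicitConversions

// Existing publish code, left untouched when new types are added.
object Publishing {
  final case class Xml(value: String) // stand-in for scala.xml.NodeSeq

  trait PublishMapper[T] {
    def formatForPublish(content: T): Xml
  }

  class PublishPackage {
    private var entries = Vector.empty[Xml]
    def +(nodes: Xml): PublishPackage = { entries = entries :+ nodes; this }
    def contents: Vector[Xml] = entries
  }

  implicit def objectToPublishXml[T](t: T)(implicit m: PublishMapper[T]): Xml =
    m.formatForPublish(t)
}

// New code: a hypothetical Image type and its mapper, defined elsewhere.
object ImageSupport {
  import Publishing._

  class Image
  class Video // hypothetical type that has no mapper

  implicit object ImagePublish extends PublishMapper[Image] {
    def formatForPublish(content: Image): Xml = Xml("<image/>")
  }

  def demo(): Vector[Xml] = {
    val p = new PublishPackage
    p + new Image
    // p + new Video // does not compile: no PublishMapper[Video] is available
    p.contents
  }
}
```

Nothing in Publishing changes to support Image, and uncommenting the Video line turns the missing mapper into a compile error rather than a runtime failure.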

As we have hopefully seen, Scala Type Classes are an elegant solution to a very common problem, preserving type safety for a minimal amount of extra code. In fact, I'd argue that the tiny amount of Scala complexity here saves far more code than we would otherwise have to write for class-based dispatch and runtime type checking.
