Choosing between a class and a record

Question

I tend to see no issue with only requiring constraints on functions. The issue is, I suppose, that your data structure no longer models precisely what you intend it to. On the other hand, if you think of it as a data structure first and foremost, then that should matter less.

I feel like I still don’t necessarily have a good grasp on the question, and this is about as vague as can be, but my rule of thumb tends to be that typeclasses are things that obey laws (or model meaning), and datatypes are things that encode a certain quantity of information.

When we want to layer behavior in complex ways, I’ve found that typeclasses start off enticingly but can get painful quickly, and that switching to dictionary-passing makes things more straightforward. Which is to say that when we want implementations to be interoperable, we should fall back to a uniform dictionary type.

This is take two, expanding a bit on a concrete example, but still just sort of spinning ideas…

Suppose we want to model probability distributions over the reals. Two natural representations come to mind.

A) Typeclass-driven

class PDist a where
        sample :: a -> Gen -> Double

B) Dictionary-driven

data PDist = PDist (Gen -> Double)

The former lets us do

data NormalDist = NormalDist Double Double -- mean, var
instance PDist NormalDist where...

data LognormalDist = LognormalDist Double Double
instance PDist LognormalDist where...

The latter lets us do

mkNormalDist :: Double -> Double -> PDist...
mkLognormalDist :: Double -> Double -> PDist...

In the former, we can write

data SumDist a b = SumDist a b
instance (PDist a, PDist b) => PDist (SumDist a b)...

in the latter we can simply write

sumDist :: PDist -> PDist -> PDist

So what are the tradeoffs? Typeclass-driven lets us specify what distributions we’re given. The tradeoff is that we have to construct an algebra of distributions explicitly, including new types for their combinations. Dictionary-driven doesn’t let us restrict the distributions we’re given (or even check that they’re well-formed), but in return we can do whatever the heck we want.
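To make the comparison concrete, here is a minimal sketch of both encodings with a sum combinator. Several things in it are assumptions rather than part of the discussion above: Gen is stood in for by an infinite list of uniform draws in (0, 1); splitGen, stdNormal and runPDist are hypothetical helpers; the class is renamed PDistClass, because a class and a type cannot share a name within one Haskell module; and "sum" is read as adding two draws taken from separate halves of the generator.

```haskell
-- Hypothetical stand-in for the unspecified Gen type:
-- an infinite supply of uniform draws in (0, 1).
newtype Gen = Gen [Double]

-- Deal the supply into two interleaved halves so that two
-- components can draw without sharing inputs.
splitGen :: Gen -> (Gen, Gen)
splitGen (Gen us) = (Gen (everyOther us), Gen (everyOther (drop 1 us)))
  where
    everyOther (x : _ : rest) = x : everyOther rest
    everyOther xs             = xs

-- One standard normal draw via the Box-Muller transform.
stdNormal :: Gen -> Double
stdNormal (Gen (u1 : u2 : _)) = sqrt (-2 * log u1) * cos (2 * pi * u2)
stdNormal _                   = error "stdNormal: generator exhausted"

-- A) Typeclass-driven (class renamed to avoid clashing with the type below).
class PDistClass a where
  sample :: a -> Gen -> Double

data NormalDist = NormalDist Double Double -- mean, var

instance PDistClass NormalDist where
  sample (NormalDist m v) g = m + sqrt v * stdNormal g

-- Each combinator needs its own type and its own instance.
data SumDist a b = SumDist a b

instance (PDistClass a, PDistClass b) => PDistClass (SumDist a b) where
  sample (SumDist a b) g = sample a ga + sample b gb
    where (ga, gb) = splitGen g

-- B) Dictionary-driven.
data PDist = PDist (Gen -> Double)

runPDist :: PDist -> Gen -> Double
runPDist (PDist f) = f

mkNormalDist :: Double -> Double -> PDist
mkNormalDist m v = PDist (\g -> m + sqrt v * stdNormal g)

-- Each combinator is an ordinary function on the one type.
sumDist :: PDist -> PDist -> PDist
sumDist (PDist f) (PDist h) = PDist draw
  where
    draw g = f ga + h gb
      where (ga, gb) = splitGen g
```

Note how the type of SumDist (NormalDist 0 1) (NormalDist 5 4) records exactly what was combined, while sumDist (mkNormalDist 0 1) (mkNormalDist 5 4) is just another PDist.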

Furthermore, we can write a parseDist :: String -> PDist relatively easily, but we have to go through some angst to do the equivalent for the typeclass approach.
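As a rough illustration of why the dictionary side is easy, here is one way such a parser might look. The input syntax ("normal <mean> <var>"), the Maybe in the result type (to report bad input), the list-based Gen stand-in, the stdNormal helper, and the choice of lognormal parameters (mean and variance of the underlying normal) are all assumptions made for this sketch.

```haskell
import Text.Read (readMaybe)

-- Hypothetical stand-in for the unspecified Gen type:
-- an infinite supply of uniform draws in (0, 1).
newtype Gen = Gen [Double]

-- One standard normal draw via the Box-Muller transform.
stdNormal :: Gen -> Double
stdNormal (Gen (u1 : u2 : _)) = sqrt (-2 * log u1) * cos (2 * pi * u2)
stdNormal _                   = error "stdNormal: generator exhausted"

data PDist = PDist (Gen -> Double)

mkNormalDist :: Double -> Double -> PDist
mkNormalDist m v = PDist (\g -> m + sqrt v * stdNormal g)

-- Assumed parameterisation: mean and variance of the underlying normal.
mkLognormalDist :: Double -> Double -> PDist
mkLognormalDist m v = PDist (\g -> exp (m + sqrt v * stdNormal g))

-- Every branch returns the same type, so no wrapper is needed.
parseDist :: String -> Maybe PDist
parseDist s = case words s of
  ["normal", m, v]    -> mkNormalDist <$> readMaybe m <*> readMaybe v
  ["lognormal", m, v] -> mkLognormalDist <$> readMaybe m <*> readMaybe v
  _                   -> Nothing
```

With the typeclass encoding, the result type would depend on the contents of the string, so the parser would have to return something like an existential wrapper around "some instance of the class", which amounts to rebuilding the dictionary type by hand.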

So this is, in a sense, the typed/untyped or static/dynamic tradeoff at another level. We can give it a twist though, and argue that the typeclass, along with associated algebraic laws, specifies the semantics of a probability distribution. And the PDist type can indeed be made an instance of the PDist typeclass. Meanwhile, we can resign ourselves to using the PDist type (rather than typeclass) nearly everywhere, while thinking of it as isomorphic to the tower of instances and datatypes necessary to use the typeclass more “richly.”

In fact, we can even define basic PDist functions in terms of typeclass functions, e.g. mkNormalPDist m v = PDist (sample $ NormalDist m v). So there’s lots of room in the design space to slide between the two representations as necessary…
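A small sketch of that bridge, under some assumptions: the class is renamed PDistClass (a class and a type cannot share a name in one module), Gen is a hypothetical list-based stand-in, and stdNormal and toPDist are helper names introduced here for illustration.

```haskell
-- Hypothetical stand-in for the unspecified Gen type:
-- an infinite supply of uniform draws in (0, 1).
newtype Gen = Gen [Double]

-- One standard normal draw via the Box-Muller transform.
stdNormal :: Gen -> Double
stdNormal (Gen (u1 : u2 : _)) = sqrt (-2 * log u1) * cos (2 * pi * u2)
stdNormal _                   = error "stdNormal: generator exhausted"

class PDistClass a where
  sample :: a -> Gen -> Double

data NormalDist = NormalDist Double Double -- mean, var

instance PDistClass NormalDist where
  sample (NormalDist m v) g = m + sqrt v * stdNormal g

data PDist = PDist (Gen -> Double)

-- The dictionary type is itself an instance: sampling runs the stored function.
instance PDistClass PDist where
  sample (PDist f) = f

-- Forget the concrete type of any instance, keeping only its sampler.
toPDist :: PDistClass a => a -> PDist
toPDist = PDist . sample

-- The definition from the text, unchanged apart from the class rename.
mkNormalPDist :: Double -> Double -> PDist
mkNormalPDist m v = PDist (sample $ NormalDist m v)
```

Here toPDist is the general direction of travel: any value of any instance can be collapsed into the uniform type whenever interoperability matters more than knowing exactly which distribution we hold.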
