Tuesday, April 1, 2025

Applicative functors are cooler than monads

These notes were written a few years ago to test my own understanding. The notes are not rigorous and are hand-wavy and probably wrong. Hopefully they at least capture some useful intuition. The notation is informal.

In this post I use “applicative” as shorthand for “applicative functor”. 

I’ve noticed that, in most explanations of the hierarchy functors < applicatives < monads, attention is paid mostly to the first and the third, whereas applicatives are glossed over and unmotivated, treated as merely a stepping stone in the construction of the monad machinery. As such, I found it difficult to develop an intuition for them for their own sake. This post outlines some of my notes on the properties of applicatives, how they differ from functors and monads, and why we might care.

Functors are the most straightforward of the three. They are containers — here, I use “container” hand-wavily to mean a derived type that conceptually represents a basic type that's been augmented/lifted/tagged with some information or behaviour. Other tutorials call this a context for the type, or a box, or a computation which evaluates to the underlying value. 

Functors have a “map” mechanism that essentially forwards function calls on the functor instance to function calls on the underlying value. Whenever we map a function onto a container in Python, we are conceptually treating the container as a functor. Similarly, numpy arrays/vectors act in a functor-like fashion when we invoke functions and arithmetic operations on them that execute element-wise. We can map several functions in sequence onto a functor, and this corresponds to performing a single map of the composition of those functions onto the functor. Importantly, the mapped function cannot modify the functor structure — it can’t change a three-element list to a four-element list, or convert an empty Optional<T> to a populated one.
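
Here is a minimal sketch of the idea in Python, using a hypothetical fmap helper (an illustration, not a faithful implementation):

    # A minimal sketch of functor-style mapping for lists (fmap is a
    # hypothetical helper name, not a standard library function)
    def fmap(f, xs):
        # forward f element-wise; the structure (here, the length) never changes
        return [f(x) for x in xs]

    double = lambda x: 2 * x
    inc = lambda x: x + 1

    # mapping two functions in sequence...
    a = fmap(inc, fmap(double, [1, 2, 3]))
    # ...is the same as a single map of their composition
    b = fmap(lambda x: inc(double(x)), [1, 2, 3])
    assert a == b == [3, 5, 7]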

Monads are… well, I’m not going to write a monad tutorial here, but their usefulness derives solely from their behaviour in pipelines of functions. Monads are derived types which can behave in interesting and dynamic ways when piped through a chain of composed functions; their values can change at any point in the pipeline, as can the monadic structure itself, in response to partial results from earlier in the pipeline. For the list monad, the values in the list can change (as in the list functor), but also the length of the list can change dynamically based on earlier operations applied to the list. Monads provide a lot of flexibility.
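
For a taste of that flexibility, here is a rough sketch of the list monad’s bind in Python (bind is a hypothetical helper name):

    # A rough sketch of the list monad's bind: each element maps to a whole
    # list, and the results are flattened, so the output length depends on
    # the values flowing through the pipeline
    def bind(xs, f):
        return [y for x in xs for y in f(x)]

    # drop odd numbers, duplicate even ones
    step = lambda x: [x, x] if x % 2 == 0 else []
    print(bind([1, 2, 3], step))  # [2, 2] -- a three-element list became two elements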

HOT TAKE: I think monads are massively overrated – at least, in the way that they're often presented in online tech communities: as a fundamental "change-your-brain-and-code-better" concept that is worthy of being learned by your typical procedural programmer. I spent many, many hours grappling with monad tutorials and Reddit explanations, and working through the underlying theory of monoids and endofunctors and all that jazz. By the end, I felt like I had a working grasp of the core principles. And my reaction was... that's it? I might be missing something, but to me, monads appear to be a nifty abstraction for when you're working in a purely functional language with chains of function calls, where you're otherwise limited in your ability to maintain state. Learning about monads did not change the way I think or the way I program in non-purely-functional languages, where you can just have state.

What about applicatives? Most explanations describe applicatives purely in terms of their mechanics: whereas a functor container accepts a function which it "forwards" onto its "contents", an applicative container accepts a container of functions which it forwards onto its contents (in a manner that obeys certain laws). What often isn’t explained is why we would even want to do this. To me, it feels like an unnatural thing to do — is it just applying multiple functions at once? If so, couldn’t we achieve this using the functor machinery, by applying functions one by one? How does a function even get into a container in the first place?

In a currying environment, applicatives generalise functors to multi-parameter functions 

A functor takes a function of one argument and maps it “element-wise” to each “element” of the container. For instance, we can do map(double, [1,2,3]) and the double function is forwarded element-wise to the list to get [2,4,6]. Or map(double, Optional(5)), which forwards double to the single “element” of the optional container. But what if we want to map the binary function multiply(a,b) element-wise to two parallel lists? We can’t do map(multiply, [1,2,3], [5,7,9]) because map takes exactly two arguments: a function and a single container. We could try map(map(multiply, [1,2,3]), [5,7,9]), but due to currying, map(multiply, [1,2,3]) produces the list of partial functions [multiply(1,…), multiply(2,…), multiply(3,…)], and map gives us no way to apply a list of functions to [5,7,9]. But wait — what if we could? This is exactly the power that the applicative map function gives us — we have a container of functions and a container of elements, and the applicative map function gives us a mechanism to apply the functions to the elements.
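
Here is a minimal sketch in Python, with functools.partial standing in for currying and a zip-wise (element-wise) list applicative; ap and fmap are hypothetical names:

    from functools import partial

    def fmap(f, xs):
        return [f(x) for x in xs]

    def ap(fs, xs):
        # applicative map: a container of functions applied to a container of values
        return [f(x) for f, x in zip(fs, xs)]

    def multiply(a, b):
        return a * b

    # the "currying" stage: mapping multiply over [1,2,3] yields partial functions
    partials = fmap(lambda a: partial(multiply, a), [1, 2, 3])
    print(ap(partials, [5, 7, 9]))  # [5, 14, 27]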

So this is one way to think of applicatives: they are a simple and natural extension to functors, a sort of theoretical bit of glue/machinery, which provides a way to map multi-parameter functions to containers in a currying-based programming environment.

Unlike functors, applicatives can induce structural changes to the container. Unlike monads, these changes can be determined statically (i.e. ahead of time, without running the program)

In the previous example, we were applying multiply to a pair of three-element lists, element-wise, to produce a three-element result. This element-wise (zip-like) combination is one valid way to map a binary function onto two lists, but it is not the only one. We could instead define our applicative map function to be an outer product (i.e. Cartesian product), producing a nine-element result instead: the result would contain the product of each element of the first list with each element of the second.
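
A sketch of the Cartesian variant (incidentally, this is the behaviour Haskell picks for its default list applicative; the zip-wise behaviour lives in a separate ZipList type):

    from functools import partial

    def multiply(a, b):
        return a * b

    def ap_cartesian(fs, xs):
        # apply every function to every element: len(fs) * len(xs) results
        return [f(x) for f in fs for x in xs]

    fs = [partial(multiply, a) for a in [1, 2, 3]]
    print(ap_cartesian(fs, [5, 7, 9]))
    # [5, 7, 9, 10, 14, 18, 15, 21, 27] -- nine elements from two three-element inputs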

If we define the applicative like this, then the mapping on applicatives can produce a result that is structurally different to either of the inputs. Importantly, the mapped function still doesn’t know anything about the structure of the applicative type. It’s the logic built into the applicative map function itself that defines how its structure changes, and it always changes in a manner that is known ahead of time based on the structure of the inputs.

This is not the case with monads, because the structure of the result of a monadic bind can change dynamically based on the values of the monads as well as their monadic structure. For example, consider a "list-shaped" applicative [a,b,c] whose elements are functions. Then map([a,b,c], [1,2,3]) could, I think, have either three elements or nine elements or no elements (are there other possibilities that satisfy the applicative laws? I don't think so, but I haven't thought too deeply about it), and we would know which one of these is the case by looking at the definition of map. But if [a,b,c] and [1,2,3] were being combined monadically, then we could not reason in the same way about the size of the result without knowing the values of a,b,c.
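
A sketch of the contrast, reusing the hypothetical list helpers from above:

    # with ap, the output structure follows from the input structures alone;
    # with bind, it can also depend on the input values
    def ap(fs, xs):
        return [f(x) for f in fs for x in xs]    # size is always len(fs) * len(xs)

    def bind(xs, f):
        return [y for x in xs for y in f(x)]     # size depends on what f returns

    rep = lambda x: [x] * x  # x copies of x
    print(len(bind([1, 2, 3], rep)))  # 6
    print(len(bind([0, 0, 0], rep)))  # 0 -- same structure in, different size out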

Summary

  • Functors cannot change their container structure under mapping — a three-element list remains three elements under any mapped function. 
  • Applicatives can change their structure under mapping, in a way that can always be known ahead of time (i.e. statically) according to the fixed structures of the inputs to the map. 
  • Monads can change their structure under mapping dynamically according to the structures and the values of the inputs. As such, the structure of the result of a monadic operation can’t generally be determined statically because we don’t know the values of the inputs to the operation until we actually compute them. 

Unlike monads, applicatives are always composable 

Applicatives and monads sit in a type hierarchy in which all monads are also applicatives. From this, one might conclude that everything that works for an applicative must also work for a monad. But this would be wrong. 

It is a fact about applicatives that they always compose: if we have two applicative types A and B, then the composed type A<B<T>> is also an applicative. It may be helpful to play around with a concrete example of composed applicatives, such as Optional<List<T>>, and to consider the ways we can act on it.
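
Here is a rough sketch of that composition in Python, modelling Optional<T> as "a value or None" (all names hypothetical):

    from functools import partial

    def ap_list(fs, xs):
        return [f(x) for f, x in zip(fs, xs)]            # zip-wise list applicative

    def ap_optional(f, x):
        return None if f is None or x is None else f(x)  # Optional applicative

    def ap_composed(fs, xs):
        # the applicative for Optional<List<T>> falls out mechanically:
        # the outer Optional ap lifts the inner list ap
        return ap_optional(None if fs is None else partial(ap_list, fs), xs)

    def multiply(a, b):
        return a * b

    fs = [partial(multiply, a) for a in [1, 2, 3]]
    print(ap_composed(fs, [5, 7, 9]))    # [5, 14, 27]
    print(ap_composed(None, [5, 7, 9]))  # None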

But monads do not, in general, compose. For details on this, see https://stackoverflow.com/questions/7040844/applicatives-compose-monads-dont. My first thought was: how is this possible? Applicatives compose, and all monads are applicatives, so all monads must compose.

What is actually happening here is that applicative composition can be considered as a function A -> B -> C, which maps two applicative types to their composition. We would like to push this function down the type hierarchy to monads. But, as I wrote on this blog a few years ago, functions are contravariant in their input types: if M is a subtype of A, then a function which requires an M is *not* a subtype of a function which requires an A. And while the composition function happily accepts monads as inputs (every monad is an applicative, after all), all it promises to return is an applicative, not a monad. So composition of applicatives can’t be “pushed down” to composition of monads.

This is quite technical and I only mention it because it helped clarify things for me personally. More concretely, we can interpret the non-composability of monads as a result of the bind functions of the two input monads not being combinable in a natural way to form a bind function for the composed type.
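
To make that slightly more concrete, here is a sketch of where a bind for the composed type Optional<List<T>> runs into trouble (hypothetical code, an illustration rather than a proof):

    # f maps a plain value to an Optional<List<U>>. Binding it over an
    # Optional<List<T>> leaves us holding a List<Optional<List<U>>>, and we
    # must invent a rule for Nones *inside* the list. Nothing forces one
    # choice, and whatever we pick must still be checked against the monad laws.
    def bind_composed(oxs, f):
        if oxs is None:
            return None
        results = [f(x) for x in oxs]         # List<Optional<List<U>>>
        if any(r is None for r in results):   # arbitrary choice: any inner None
            return None                       # collapses the whole result
        return [y for r in results for y in r]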

Maybe this is a mainstream view now?


So just as I finished pasting and formatting this post, I Googled "applicatives". Just one day ago on the orange website, there was a great discussion in which someone else makes the point I'm trying to make here:

For some reason everyone likes to talk about Monads, but really the other types here are just as interesting. For example, Applicatives are less dynamic than Monads in that you can't `flatMap`/`bind` to decide on the "next" thing to evaluate based on the previous value, but in exchange you get a "static" tree (or graph) of Applicatives that lends itself much better to static analysis, optimization, parallelism, and so on.

[...] Monads seem to have this strange aura around them that attracts certain kinds of personalities, but really they're just one abstraction in a whole toolkit of useful abstractions, and there are many cases where Applicative or some other construct are much more suited

The whole thread has other great comments too. Here's one:

I like to characterize the high level difference as something like "Monads are computations that can _call out to_ other computations and integrate their results", "Applicatives can run computations in parallel and combine the resulting structures / side effects", and "Functors allow you to lift pure functions into a computation". At each step, highlighting both how you are getting less powerful (because you can implement functors in terms of applicatives, and applicatives in terms of monads, but not the other way around), and how having less power can help you reason better about your programs (pros/cons of applicative vs. monadic parsers are a good example here).

I was just about to type something like this out here before I saw this comment – that there is, in general, a really interesting inverse relationship between the power of a machine and how easy it is to reason about it. Great stuff. And it's why, these days, I wish I could stop using Python and just use Go instead ...
