What is the Scala for-comprehension?

2020-03-19

scala functional

What is the for-comprehension?

You've started writing Scala and someone tells you in a review: “you should use a for-comprehension here”. You might ask a question or try a different approach, but then the reviewer says: “the for-comprehension is just map & flatMap”. But really, what does that mean?

In Scala, and in other programming languages with a functional leaning, some data structures exist to wrap a piece of data in a specific context. The context depends on the data structure. Here are some of these data structures, with the context each one addresses:

Option[T] -> T might be null
List[T] -> there are 0 or more T
Future[T] -> you’ll have a T, eventually

But what do these data structures have in common? Well, they can all be used in a for-comprehension. They have other things in common, but for the purpose of this post, let’s set those aside. So what is a for-comprehension, if it can be used on both Option and List?

Really, it’s just map and flatMap (and sometimes foreach and withFilter). These two methods, map and flatMap, are ways to transform the data in the structure. Look at the signature of map and flatMap on Option. map takes a function as an argument, and that function is a transformation from one type A (the type of the piece of data in the structure) to another type of your choosing, B. flatMap does the same, but its function returns the B already wrapped in the data structure we’re working with. In the case of Option, that would be an Option[B].
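To make those two shapes concrete, here is a simplified sketch of what map and flatMap do for Option, written as standalone functions. The names mapOption and flatMapOption are made up for illustration; the real methods live on Option itself and their declarations in the standard library differ in detail.

```scala
// Hypothetical helpers that mirror the behaviour of Option#map and Option#flatMap.
def mapOption[A, B](option: Option[A])(f: A => B): Option[B] =
  option match {
    case Some(a) => Some(f(a)) // transform the value and keep it wrapped
    case None    => None       // nothing to transform
  }

def flatMapOption[A, B](option: Option[A])(f: A => Option[B]): Option[B] =
  option match {
    case Some(a) => f(a) // f already returns an Option, so no extra wrapping
    case None    => None
  }

val mapped: Option[Int] = mapOption(Option("abc"))(_.length)
// Some(3)
val flatMapped: Option[Int] =
  flatMapOption(Option("abc"))(s => if (s.isEmpty) None else Some(s.length))
// Some(3)
```

The only difference between the two is who does the wrapping: map wraps the result for you, flatMap trusts the function to do it.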

Let’s see some examples

Let’s see this map method in practice. First, our data:

val data: String = "a piece of data"
val liftedInOption: Option[String] = Option(data)
val nullInOption: Option[String] = Option(null)
val stringToInt: String => Int = (value: String) => value.length

We’ve got an Option with a String in it, an Option built from null (which Option(...) turns into None), and a function to turn a String into an Int.
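A small check, as a sketch, of what building an Option from null actually produces. The value names here are only for illustration. Note that the null handling comes from the Option(...) constructor; Some(...) does no such check.

```scala
val fromNull: Option[String]  = Option(null)              // Option(...) converts null to None
val fromValue: Option[String] = Option("a piece of data") // Some("a piece of data")
val someNull: Option[String]  = Some(null)                // Some(...) keeps the null inside

val fromNullIsEmpty: Boolean  = fromNull.isEmpty  // true
val fromValueIsEmpty: Boolean = fromValue.isEmpty // false
val someNullIsEmpty: Boolean  = someNull.isEmpty  // false: the null is still in there
```

So when a value may be null, wrapping it with Option(...) rather than Some(...) is what makes the later calls to map safe.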

Let’s see what happens when we call map on an Option with a piece of data in it. Here are 3 equivalent ways:

val withMap1: Option[Int] = liftedInOption.map(stringToInt)
// withMap1: Option[Int] = Some(15)
val withMap2: Option[Int] = liftedInOption.map(value => value.length)
// withMap2: Option[Int] = Some(15)
val withMap3: Option[Int] = liftedInOption.map(_.length)
// withMap3: Option[Int] = Some(15)

In this case, we’re working with an Option[String], so the map method takes a function String => ?. Our function turns a String into an Int, so when we give it to map, we go from an Option[String] to an Option[Int]. Now, what happens if the String we wrapped in the Option was null (like nullInOption)? Do we have to check for null before calling length?

val nullMap1: Option[Int] = nullInOption.map(stringToInt)
// nullMap1: Option[Int] = None
val nullMap2: Option[Int] = nullInOption.map(value => value.length)
// nullMap2: Option[Int] = None
val nullMap3: Option[Int] = nullInOption.map(_.length)
// nullMap3: Option[Int] = None

We don’t. That’s because the whole purpose of the Option data structure is to handle the missing case for you: the function given to map only runs when there is a value. It’s there to help you work with pieces of data that may be null. What’s flatMap then? As we’ve said earlier, it’s the same as map, but it expects the result of our function to be an Option[?].

Let’s see how it works. In this example, we have a function that returns the length only if the data inside the Option starts with "a piece"; otherwise we get None. We can apply it to our Option using flatMap:

def parse(value: String): Option[Int] = {
  if (value.startsWith("a piece")) Some(value.length) else None
}
val withFlatMap1: Option[Int] = liftedInOption.flatMap(parse)
// withFlatMap1: Option[Int] = Some(15)
val withFlatMap2: Option[Int] = liftedInOption.flatMap(value => if (value.startsWith("a piece")) Some(value.length) else None)
// withFlatMap2: Option[Int] = Some(15)

val nullFlatMap1: Option[Int] = nullInOption.flatMap(parse)
// nullFlatMap1: Option[Int] = None
val nullFlatMap2: Option[Int] = nullInOption.flatMap(value => if (value.startsWith("a piece")) Some(value.length) else None)
// nullFlatMap2: Option[Int] = None
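To see why flatMap is needed here, it may help to compare it with what map would give for the same function. This is a small sketch; text and parseLength are stand-ins for liftedInOption and parse above, renamed so the snippet stands on its own.

```scala
val text: Option[String] = Option("a piece of data")

def parseLength(value: String): Option[Int] =
  if (value.startsWith("a piece")) Some(value.length) else None

// map wraps the Option returned by parseLength in another Option.
val nested: Option[Option[Int]] = text.map(parseLength)
// Some(Some(15))

// flatMap applies the function and removes one level of nesting.
val flat: Option[Int] = text.flatMap(parseLength)
// Some(15)

// Flattening the nested result by hand gives the same value as flatMap.
val flattened: Option[Int] = nested.flatten
// Some(15)
```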

I use Option because it’s a data structure whose goal is relatively simple: dealing with nulls. OK, that’s all great, but what does the for-comprehension have to do with all that? Well, sometimes you’ll have multiple operations to perform on the piece of information inside an Option. That would look like this:

liftedInOption.flatMap { value =>
  parse(value).map { length =>
    List.range(0, length)
  }
}
// res0: Option[List[Int]] = Some(
//   List(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14)
// )

Here we chained a flatMap and a map. It’s not too bad, but we only have two chained methods. With more, it would rapidly become unreadable. That’s where the for-comprehension comes into play: it helps with chaining those map and flatMap calls. In a for-comprehension, each <- is translated into a call to flatMap, except the last one, which becomes a map over the yield expression. On every line you have access to the values computed above it. The example above could be written like so:

for {
  value <- liftedInOption // we extract `value`
  length <- parse(value) // we have access to `value` from above (`flatMap`)
} yield List.range(0, length) // we have access to `value` and `length` (`map`)
// res1: Option[List[Int]] = Some(
//   List(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14)
// )

Again, if any of these computations gives us a None, the rest of the computations are short-circuited and we end up with a None.
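Here is a sketch with three steps, showing the for-comprehension next to the chain of flatMap and map it stands for, and the short-circuit in action. The half function is a hypothetical extra step added for this illustration, and parseLength plays the role of parse above.

```scala
def parseLength(value: String): Option[Int] =
  if (value.startsWith("a piece")) Some(value.length) else None

// Hypothetical extra step: only even numbers can be halved.
def half(n: Int): Option[Int] =
  if (n % 2 == 0) Some(n / 2) else None

def sugared(input: Option[String]): Option[String] =
  for {
    value  <- input
    length <- parseLength(value)
    halved <- half(length)
  } yield s"$value -> $halved"

// The same computation written by hand: flatMap, flatMap, then a final map.
def desugared(input: Option[String]): Option[String] =
  input.flatMap { value =>
    parseLength(value).flatMap { length =>
      half(length).map { halved =>
        s"$value -> $halved"
      }
    }
  }

val even: Option[String]    = sugared(Option("a piece of cake!")) // 16 characters
// Some("a piece of cake! -> 8")
val odd: Option[String]     = sugared(Option("a piece of data"))  // 15 characters, half gives None
// None
val missing: Option[String] = sugared(None)                       // nothing runs at all
// None
```

With three steps the nesting in desugared is already harder to follow than the flat list of generators in sugared.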

Glueing things together

In our examples, we’ve worked with Option, but what’s the behavior with List, for example?

for {
  index <- List(1, 2, 3, 4, 5)
  duplicated <- List(10, 20, 30)
} yield duplicated
// res2: List[Int] = List(
//   10,
//   20,
//   30,
//   10,
//   20,
//   30,
//   10,
//   20,
//   30,
//   10,
//   20,
//   30,
//   10,
//   20,
//   30
// )

For each element in the first List, we generate a new List with 10, 20, 30 in it, and flatMap concatenates all of them into a single List.
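Yielding both values makes the pairing easier to see. The sketch below uses a shorter first List to keep the output small, and puts the hand-written flatMap and map version next to it.

```scala
val pairs: List[(Int, Int)] = for {
  index <- List(1, 2)
  value <- List(10, 20, 30)
} yield (index, value)
// List((1,10), (1,20), (1,30), (2,10), (2,20), (2,30))

// The same thing written by hand.
val explicit: List[(Int, Int)] =
  List(1, 2).flatMap(index => List(10, 20, 30).map(value => (index, value)))
```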

Can we mix flatMap and map from different data structures inside the same for-comprehension?

for {
  value <- liftedInOption
  index <- List(1, 2, 3, 4, 5)
} yield value
// error: type mismatch;
//  found   : List[String]
//  required: Option[?]
//   index <- List(1, 2, 3, 4, 5)
//   ^

We can’t. The first line in the for-comprehension sets the type of the data structure we’re working with (Option in the example above), and the rest of the computations have to align with it.
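One possible way to make the two line up, sketched below, is to convert the Option into a List first with toList, so that every generator works on the same data structure. Whether that conversion makes sense depends on what you want the result to be; maybeText is a stand-in for liftedInOption.

```scala
val maybeText: Option[String] = Option("a piece of data")

// Every generator is now a List, so the result is a List too.
val labelled: List[String] = for {
  value <- maybeText.toList
  index <- List(1, 2, 3)
} yield s"$index: $value"
// List("1: a piece of data", "2: a piece of data", "3: a piece of data")

// An empty Option becomes an empty List, so nothing is produced.
val fromEmpty: List[String] = for {
  value <- Option.empty[String].toList
  index <- List(1, 2, 3)
} yield s"$index: $value"
// List()
```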

You can also define intermediate values and use them in the rest of the for-comprehension:

for {
  index <- List(1, 2, 3, 4, 5)
  msg = "A new message"
  duplicated <- List(msg, "some other message")
} yield duplicated
// res4: List[String] = List(
//   "A new message",
//   "some other message",
//   "A new message",
//   "some other message",
//   "A new message",
//   "some other message",
//   "A new message",
//   "some other message",
//   "A new message",
//   "some other message"
// )
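The same works with Option. The sketch below also adds an if guard, which is where the withFilter method mentioned at the start comes in: a guard keeps the value only when the condition holds. The input strings are made up for illustration.

```scala
val trimmedLength: Option[Int] = for {
  value <- Option("  a piece of data  ")
  trimmed = value.trim // intermediate value, no flatMap involved
  if trimmed.nonEmpty  // guard, handled through withFilter
} yield trimmed.length
// Some(15)

val blank: Option[Int] = for {
  value <- Option("   ")
  trimmed = value.trim
  if trimmed.nonEmpty  // the guard fails, so the result is None
} yield trimmed.length
// None
```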

We’ve seen Option and List. We’re not going to look at the effect on other types like Future because there are too many, but the takeaway here is quite simple:

map helps you run a computation on a piece of data inside the structure
flatMap is like map but the result of the computation has to be in the same data structure
the meaning of those operations is entirely up to the data structure (Option deals with null and the List deals with multiple elements)

What’s next?

Well, there’s nothing like doing it yourself to really wrap your head around something like this, but I hope this post was easy to follow. We also did not say the word, but the data structures we’ve been using in our for-comprehensions are, in the abstract, called Monads. You can implement map and flatMap on your own data structures to use them in a for-comprehension.
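As a starting point for trying it yourself, here is a minimal sketch of a home-made container that works in a for-comprehension. Maybe, Just and Empty are hypothetical names invented for this example; it is a stripped-down imitation of Option, not how the standard library defines it.

```scala
// Hypothetical container: either holds a value (Just) or nothing (Empty).
sealed trait Maybe[+A] {
  def map[B](f: A => B): Maybe[B] = this match {
    case Just(a) => Just(f(a))
    case Empty   => Empty
  }
  def flatMap[B](f: A => Maybe[B]): Maybe[B] = this match {
    case Just(a) => f(a)
    case Empty   => Empty
  }
}
final case class Just[+A](value: A) extends Maybe[A]
case object Empty extends Maybe[Nothing]

// map and flatMap are all the compiler needs for this for-comprehension.
val sum: Maybe[Int] = for {
  a <- Just(1)
  b <- Just(2)
} yield a + b
// Just(3)

val shortCircuited: Maybe[Int] = for {
  a <- Just(1)
  b <- (Empty: Maybe[Int])
} yield a + b
// Empty
```

Adding a withFilter method would be the next step if you wanted to use if guards with this type.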

You can read more here:

the for-comprehension: https://docs.scala-lang.org/tour/for-comprehensions.html
the Monad (in cats): https://typelevel.org/cats/typeclasses/monad.html#flatmap