What is the Scala for-comprehension?
You’ve started writing Scala and someone tells you in a review: “you should use a for-comprehension here”. You might ask a question or try a different approach, but then the reviewer says: “the for-comprehension is just map & flatMap”. But really, what does that mean?
In Scala, and in other programming languages with a functional leaning, some data structures exist to wrap a piece of data in a specific context. The context depends on the data structure. Here are some of these data structures, with the context each one addresses:
- Option[T] -> T might be null
- List[T] -> there are 0 or more T
- Future[T] -> you’ll have a T, eventually
But what do these data structures have in common? Well, they can all be used in a for-comprehension. They have other things in common, but for the purpose of this post, let’s set those aside. So what is a for-comprehension, if it can be used on both Option and List?
Really, it’s just map and flatMap (and sometimes foreach and withFilter). These two methods, map and flatMap, are ways to transform the data in the structure. Look at the signatures of map and flatMap on Option. map takes a function as an argument, and that function is a transformation from one type A (the type of the piece of data in the structure) to another type of your choosing, B. flatMap does the same, but the function returns a B that is already wrapped in the data structure we’re working with. In the case of Option, that would be an Option[B].
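To make the difference between the two shapes concrete, here is a minimal sketch. The names describe, onlyPositive and start are made up for this illustration; only map and flatMap come from the standard library.

```scala
// A function that returns a plain value, and one that returns an Option.
val describe: Int => String = n => s"got $n"
val onlyPositive: Int => Option[Int] = n => if (n > 0) Some(n) else None

val start: Option[Int] = Some(3)

// map: give it an A => B, get back an Option[B]
val mapped: Option[String] = start.map(describe)         // Some("got 3")

// flatMap: give it an A => Option[B], still get back an Option[B]
val flatMapped: Option[Int] = start.flatMap(onlyPositive) // Some(3)

// map with an Option-returning function nests the structure instead
val nested: Option[Option[Int]] = start.map(onlyPositive) // Some(Some(3))
```

The last line is the reason flatMap exists: with map alone, every step that returns an Option would add one more layer of nesting.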
Let’s see some examples
Let’s see this map method in practice. First, our data:
val data: String = "a piece of data"
val liftedInOption: Option[String] = Option(data)
val nullInOption: Option[String] = Option(null)
val stringToInt: String => Int = (value: String) => value.length
We’ve got an Option with a String in it, an Option built from null (which ends up being None), and a function to turn a String into an Int.
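A quick aside on that null case, as a small sketch: it is Option(...) (that is, Option.apply) that turns null into None. Some(...) does no such check. The names fromNull and wrappedNull are only used for this illustration.

```scala
val fromNull: Option[String] = Option(null)   // Option.apply checks for null
val wrappedNull: Option[String] = Some(null)  // Some wraps whatever it is given

val fromNullIsEmpty: Boolean = fromNull.isEmpty        // true: this is None
val wrappedNullIsEmpty: Boolean = wrappedNull.isEmpty  // false: a Some holding null
```

So when a value may be null, wrapping it with Option(...) rather than Some(...) is what gives you the safety shown in the rest of this post.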
Let’s see what happens when we call map on an Option with a piece of data in it. Here are 3 equivalent ways:
val withMap1: Option[Int] = liftedInOption.map(stringToInt)
// withMap1: Option[Int] = Some(15)
val withMap2: Option[Int] = liftedInOption.map(value => value.length)
// withMap2: Option[Int] = Some(15)
val withMap3: Option[Int] = liftedInOption.map(_.length)
// withMap3: Option[Int] = Some(15)
In this case, we’re working with an Option[String], so the map method takes a function String => ?. Our function turns a String into an Int, so when we give it to map, we go from an Option[String] to an Option[Int]. And now, what happens if the String given to the Option is null (like nullInOption)? Do we have to check for null before calling length?
val nullMap1: Option[Int] = nullInOption.map(stringToInt)
// nullMap1: Option[Int] = None
val nullMap2: Option[Int] = nullInOption.map(value => value.length)
// nullMap2: Option[Int] = None
val nullMap3: Option[Int] = nullInOption.map(_.length)
// nullMap3: Option[Int] = None
We don’t. That’s because checking for null is the whole purpose of the Option data structure. It’s there to help you work with pieces of data that may be null. What’s flatMap then? As we’ve said earlier, it’s the same as map, but it expects the result of our function to be an Option[?].
Let’s see how it works. In this example, we have a function that returns the length only if the data inside the Option starts with "a piece"; otherwise we get None. We can apply it to our Option using flatMap:
def parse(value: String): Option[Int] = {
  if (value.startsWith("a piece")) Some(value.length) else None
}
val withFlatMap1: Option[Int] = liftedInOption.flatMap(parse)
// withFlatMap1: Option[Int] = Some(15)
val withFlatMap2: Option[Int] = liftedInOption.flatMap(value => if (value.startsWith("a piece")) Some(value.length) else None)
// withFlatMap2: Option[Int] = Some(15)
val nullFlatMap1: Option[Int] = nullInOption.flatMap(parse)
// nullFlatMap1: Option[Int] = None
val nullFlatMap2: Option[Int] = nullInOption.flatMap(value => if (value.startsWith("a piece")) Some(value.length) else None)
// nullFlatMap2: Option[Int] = None
I use Option because it’s a data structure whose goal is relatively simple: dealing with nulls. OK, that’s all great, but what does the for-comprehension have to do with all that? Well, sometimes you’ll have multiple operations to perform on the piece of information inside an Option. That would look like this:
liftedInOption.flatMap { value =>
  parse(value).map { length =>
    List.range(0, length)
  }
}
// res0: Option[List[Int]] = Some(
// List(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14)
// )
Here we chained a flatMap and a map. It’s not too bad, but we only have two chained methods. If we had more, it would rapidly become unreadable. That’s where the for-comprehension comes into play. It helps with the chaining of those map and flatMap calls. In a for-comprehension, each <- is a call to flatMap, except the last one, which together with yield becomes a call to map. On each line you have access to the values you’ve computed above it. The example above could be written like so:
for {
  value <- liftedInOption // we extract `value`
  length <- parse(value) // we have access to `value` from above (`flatMap`)
} yield List.range(0, length) // we have access to `value` and `length` (`map`)
// res1: Option[List[Int]] = Some(
// List(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14)
// )
Again, if any of these computations gives us a None, the rest of the computation is short-circuited and we end up with a None.
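To make the short-circuit concrete, here is a small sketch with three steps, next to the same chain written by hand. The helpers parseInt, positive and half are made up for this illustration; each one may return None.

```scala
import scala.util.Try

// Hypothetical helpers, each of which may fail and return None.
def parseInt(raw: String): Option[Int] = Try(raw.trim.toInt).toOption
def positive(n: Int): Option[Int] = if (n > 0) Some(n) else None
def half(n: Int): Option[Int] = if (n % 2 == 0) Some(n / 2) else None

// Three generators: the first two become flatMap, the last one becomes map.
def halfOfPositive(raw: String): Option[Int] =
  for {
    n <- parseInt(raw)
    p <- positive(n)
    h <- half(p)
  } yield h

// The same computation with the calls written out.
def halfOfPositiveByHand(raw: String): Option[Int] =
  parseInt(raw).flatMap { n =>
    positive(n).flatMap { p =>
      half(p).map { h => h }
    }
  }

val allGood: Option[Int] = halfOfPositive("42")      // Some(21)
val notANumber: Option[Int] = halfOfPositive("abc")  // None: stops at parseInt
val negative: Option[Int] = halfOfPositive("-4")     // None: stops at positive
val odd: Option[Int] = halfOfPositive("7")           // None: stops at half
```

Whichever step returns None, the steps after it are never run, and the nesting in the hand-written version shows why the for-comprehension is easier to read once there are more than two steps.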
Glueing things together
In our examples, we’ve worked with Option, but what’s the behavior with List, for example?
for {
  index <- List(1, 2, 3, 4, 5)
  duplicated <- List(10, 20, 30)
} yield duplicated
// res2: List[Int] = List(
// 10,
// 20,
// 30,
// 10,
// 20,
// 30,
// 10,
// 20,
// 30,
// 10,
// 20,
// 30,
// 10,
// 20,
// 30
// )
For each element in the first List, we generate a new List with 10, 20, 30 in it, and all those lists are flattened into a single one.
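With List, the same translation to flatMap and map applies. Here is a sketch that keeps both values in the yield, so you can see every combination being produced. The names numbers, letters, pairs and pairsByHand are only for this illustration.

```scala
val numbers: List[Int] = List(1, 2)
val letters: List[String] = List("a", "b", "c")

// Every element of `numbers` is combined with every element of `letters`.
val pairs: List[(Int, String)] =
  for {
    n <- numbers
    l <- letters
  } yield (n, l)
// List((1,a), (1,b), (1,c), (2,a), (2,b), (2,c))

// The same thing with the calls written out.
val pairsByHand: List[(Int, String)] =
  numbers.flatMap(n => letters.map(l => (n, l)))
```

For Option, flatMap meant "stop if there is nothing"; for List, it means "do this for every element and concatenate the results".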
Can we mix flatMap and map from different data structures inside the same for-comprehension?
for {
  value <- liftedInOption
  index <- List(1, 2, 3, 4, 5)
} yield value
// error: type mismatch;
// found : List[String]
// required: Option[?]
// index <- List(1, 2, 3, 4, 5)
// ^
We can’t. In truth, the first line in the for-comprehension sets the type of the data structure we’re working with (Option in the example above) and the rest of the computations have to align with it.
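One way around this, sketched below, is to convert everything to the same data structure first. Here the Option is turned into a List with toList, so every line of the for-comprehension works with List. The names maybeValue, combined and fromNone are only for this illustration.

```scala
val maybeValue: Option[String] = Option("a piece of data")

val combined: List[String] =
  for {
    value <- maybeValue.toList   // Option[String] becomes a List with 0 or 1 element
    index <- List(1, 2, 3)
  } yield s"$index: $value"
// List("1: a piece of data", "2: a piece of data", "3: a piece of data")

// Starting from None gives an empty List, so nothing is produced.
val fromNone: List[String] =
  for {
    value <- Option.empty[String].toList
    index <- List(1, 2, 3)
  } yield s"$index: $value"
// List()
```

Which structure you convert to is a design choice: going to List keeps the "0 or more" meaning, at the cost of losing the "at most one" guarantee that Option gave you.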
You can also set variables and use them in the rest of the for-comprehension:
for {
  index <- List(1, 2, 3, 4, 5)
  msg = "A new message"
  duplicated <- List(msg, "some other message")
} yield duplicated
// res4: List[String] = List(
// "A new message",
// "some other message",
// "A new message",
// "some other message",
// "A new message",
// "some other message",
// "A new message",
// "some other message",
// "A new message",
// "some other message"
// )
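The withFilter and foreach methods mentioned at the start show up in two other forms of the for-comprehension: an if guard is translated to withFilter, and a for without yield is translated to foreach. A minimal sketch, with names (evens, evensByHand, collectAll) made up for this illustration:

```scala
import scala.collection.mutable.ListBuffer

// A guard keeps only the elements that satisfy the condition.
val evens: List[Int] =
  for {
    n <- List(1, 2, 3, 4, 5, 6)
    if n % 2 == 0
  } yield n * 10
// List(20, 40, 60)

// The same thing with the calls written out.
val evensByHand: List[Int] =
  List(1, 2, 3, 4, 5, 6).withFilter(n => n % 2 == 0).map(n => n * 10)

// Without yield, the body only runs for its side effects (foreach).
def collectAll(): List[Int] = {
  val seen = ListBuffer.empty[Int]
  for (n <- List(1, 2, 3)) seen += n
  seen.toList
}
```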
We’ve seen Option and List. We’re not going to look at the effect on other types like Future, because there are too many, but the takeaway here is quite simple:
- map helps you run a computation on a piece of data inside the structure
- flatMap is like map, but the result of the computation has to be in the same data structure
- the meaning of those operations is entirely up to the data structure (Option deals with null and List deals with multiple elements)
What’s next?
Well, there’s nothing like doing it yourself to really wrap your head around something like this, but I hope this post was easy to follow. We also did not say the word, but the data structures we’ve been using in our for-comprehensions are, in the abstract, called Monads. You can implement map and flatMap on your own data structures to use them in a for-comprehension.
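As a starting point for trying it yourself, here is a minimal sketch of a made-up wrapper called Box (it does not exist in the standard library). It holds exactly one value, and defining map and flatMap on it is enough for the compiler to accept it in a for-comprehension with yield.

```scala
// Hypothetical wrapper, only for illustration: it always holds one value.
final case class Box[A](value: A) {
  def map[B](f: A => B): Box[B] = Box(f(value))
  def flatMap[B](f: A => Box[B]): Box[B] = f(value)
}

val boxedSum: Box[String] =
  for {
    a <- Box(2)   // flatMap
    b <- Box(3)   // map, together with the yield
  } yield s"sum is ${a + b}"
// Box("sum is 5")
```

This Box has no interesting context of its own, which makes it a good blank canvas: try changing map and flatMap to log, count, or fail, and see how the for-comprehension’s behavior changes. Note that using an if guard in the for-comprehension would also require a withFilter method.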
You can read more here:
- the for-comprehension: https://docs.scala-lang.org/tour/for-comprehensions.html
- the Monad (in cats): https://typelevel.org/cats/typeclasses/monad.html#flatmap