(1) 🔄 `for` vs `map`

Example

Suppose we have a list of numbers, and we want to get the square of each number.

Using a for expression:

val numbers = List(1, 2, 3, 4, 5)
val squaresWithFor = for (n <- numbers) yield n * n
// squaresWithFor: List[Int] = List(1, 4, 9, 16, 25)

Here, we use a for expression (a for with yield, also called a for comprehension) to iterate through the numbers list, apply the square operation (n * n) to each element n, and collect the results into a new list.

Using the map function:

val numbers = List(1, 2, 3, 4, 5)
val squaresWithMap = numbers.map(n => n * n)
// squaresWithMap: List[Int] = List(1, 4, 9, 16, 25)

In this example, we use the map function to apply the same square operation to each element in the numbers list. The map method accepts a function that defines how to transform each element in the list.

map

In Scala, map is generally preferred over a for loop for simple element-by-element transformations of a collection, mainly for the following reasons:

Note:
Points 5 and 6 below sound similar, but they refer to two different concepts: point 5 is about parallel processing across the cores of a single machine, while point 6 is about distributed processing across multiple machines. Please read carefully.

  1. Functional Programming Style: Scala is a language that supports functional programming. Functional programming emphasizes using immutable data and side-effect-free functions. map is a functional programming method that returns a new collection without modifying the original collection, consistent with the philosophy of functional programming.

  2. Expressiveness and Conciseness: Using map can implement the traversal and transformation of a collection in one line of code, making the code more concise and readable. In contrast, using a for loop might require more lines of code and may not be as intuitive in some cases.

  3. Fewer Errors: Since map is a higher-order function, it abstracts the iteration process, thereby reducing the potential for errors, such as index errors, when traversing a collection.

  4. Ease of Chaining Operations: map can easily be chained with other collection operations (such as filter, reduce, etc.), making complex data processing steps more concise.

  5. Combination with Parallel Collections:

    • This refers to parallel processing on a single machine. Scala offers parallel collections (like ParArray, ParVector, etc.), which allow operations like map to be executed in parallel across multiple cores or threads. Since Scala 2.13, they are shipped as the separate scala-parallel-collections module rather than as part of the standard library.
    • Using parallel collections can speed up processing, as different parts of a collection can be processed by different processor cores simultaneously.
    • This is mainly about making efficient use of multi-core processing power on a single machine.

  6. Suitable for Distributed Computing Environments:

    • This refers to distributed computing across multiple machines. In distributed environments (like Apache Spark), map can be used to operate on datasets distributed across different machines.
    • In such cases, the map operator is suitable for distributed processing, as the processing of each element is independent and does not need to interact with other elements, making it suitable for parallel processing across different nodes in a network.
    • This focuses on data processing and computational capabilities across multiple machines.
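
Points 1 and 4 above can be illustrated with a short sketch. The values and names below (numbers, doubled, sumOfEvenSquares) are chosen only for illustration and assume nothing beyond the standard library:

```scala
val numbers = List(1, 2, 3, 4, 5)

// map returns a new list; the original list is left untouched.
val doubled = numbers.map(_ * 2)
// numbers is still List(1, 2, 3, 4, 5); doubled is List(2, 4, 6, 8, 10)

// Chaining: keep the even numbers (2, 4), square them (4, 16), then sum them.
val sumOfEvenSquares = numbers.filter(_ % 2 == 0).map(n => n * n).reduce(_ + _)
// sumOfEvenSquares is 20
```

Because every step returns a new collection, each intermediate result can be inspected or reused without affecting the others.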

for

When to Use for

In Scala, for expressions (for with the yield keyword) are more convenient when working with multiple collections or multi-step operations. Here is a specific example where a for expression reads better than using map alone:

Suppose you have two lists, one of numbers and the other of their descriptions. You want to generate a new list containing combinations of numbers and their corresponding descriptions that meet specific criteria.

val numbers = List(1, 2, 3, 4, 5)
val descriptions = List("one", "two", "three", "four", "five")

// Using a for expression to combine the two lists, and selecting only those combinations where the number is greater than 2
val result = for {
  (number, description) <- numbers zip descriptions
  if number > 2
} yield s"$number is $description"

// Result: List("3 is three", "4 is four", "5 is five")

In this example, the for expression allows you to conveniently traverse the two collections (combined using the zip method) and use an if condition internally to filter elements.

To achieve the same functionality using the map method, you first need to zip the two lists together, then use the filter method to select elements that meet the condition, and finally use the map method to format the final string. Here’s how it’s done:


val numbers = List(1, 2, 3, 4, 5)
val descriptions = List("one", "two", "three", "four", "five")

// First, combine numbers and descriptions together
val zipped = numbers zip descriptions

// Then keep only the elements where the number is greater than 2
val filtered = zipped.filter { case (number, _) => number > 2 }

// Finally, map over the filtered list to generate the final strings
val resultWithMap = filtered.map { case (number, description) => s"$number is $description" }

// Result: List("3 is three", "4 is four", "5 is five")
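
If the three-step version feels verbose, the filter and map steps can be fused. The sketch below uses collect, a standard collection method that the walkthrough above does not cover, so treat it as one possible alternative rather than the recommended form. The names withFor and withCollect are illustrative:

```scala
val numbers = List(1, 2, 3, 4, 5)
val descriptions = List("one", "two", "three", "four", "five")

// The for expression from the earlier example, for comparison.
val withFor = for {
  (number, description) <- numbers zip descriptions
  if number > 2
} yield s"$number is $description"

// collect takes a partial function: the guard does the filtering,
// the right-hand side does the mapping.
val withCollect = numbers.zip(descriptions).collect {
  case (number, description) if number > 2 => s"$number is $description"
}
// Both are List("3 is three", "4 is four", "5 is five")
```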

Underlying Implementation of for Expressions

In Scala, for expressions are syntactic sugar: the compiler translates them into calls to map, flatMap, and withFilter (and to foreach when there is no yield). Understanding this translation is key to understanding how for expressions work and what they cost at runtime.

Simple for Loop

For a simple for loop, such as:

for (x <- list) yield x * 2

The compiler translates it to:

list.map(x => x * 2)
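
As a quick check of this translation, the sketch below compares both forms on a small list. It also shows the case without yield, which is translated to foreach rather than map because the loop runs only for its side effects. The list contents and the seen buffer are illustrative assumptions:

```scala
import scala.collection.mutable.ListBuffer

val list = List(1, 2, 3)

// With yield: equivalent to list.map(x => x * 2).
val viaFor = for (x <- list) yield x * 2
val viaMap = list.map(x => x * 2)
// Both are List(2, 4, 6)

// Without yield: equivalent to list.foreach(x => seen += x * 2).
val seen = ListBuffer.empty[Int]
for (x <- list) seen += x * 2
// seen now contains 2, 4, 6
```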

Nested for Loops

When a for expression has several generators (the equivalent of nested loops), such as:

for {
  x <- list1
  y <- list2
} yield x * y

The compiler translates it using flatMap and map:

list1.flatMap(x => list2.map(y => x * y))
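
To make the ordering of results concrete, here is a small sketch with made-up input lists. The outer generator (x) varies slowest, exactly as in a nested loop:

```scala
val list1 = List(1, 2)
val list2 = List(10, 20, 30)

val products = for {
  x <- list1
  y <- list2
} yield x * y

// The hand-written translation produces the same list.
val productsViaFlatMap = list1.flatMap(x => list2.map(y => x * y))
// Both are List(10, 20, 30, 20, 40, 60)
```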

Using if Filters

If the for expression includes if filters, like:

for {
  x <- list
  if x > 2
} yield x * 2

The compiler translates it into a combination of withFilter (a lazy variant of filter that does not build an intermediate collection) and map:

list.withFilter(x => x > 2).map(x => x * 2)

withFilter

  • Is lazy: it does not traverse the collection immediately
  • Does not create a new intermediate collection; the predicate is applied only when a following operation such as map, flatMap, or foreach runs
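
The difference from filter can be observed by counting how often the predicate is called. This is a minimal sketch; the calls counter and the other names are illustrative, and the mutable var is used only to make the evaluation order visible:

```scala
val list = List(1, 2, 3, 4)
var calls = 0 // counts how often the predicate runs

// filter runs the predicate on every element right away and builds a new List.
val eager = list.filter { x => calls += 1; x > 2 }
val callsAfterFilter = calls // 4

// withFilter only records the predicate; nothing has been evaluated yet.
calls = 0
val pending = list.withFilter { x => calls += 1; x > 2 }
val callsAfterWithFilter = calls // 0

// The predicate runs once a following operation such as map is applied.
val doubled = pending.map(x => x * 2)
val callsAfterMap = calls // now greater than 0
// doubled is List(6, 8)
```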

Comprehensive Example

In more complex for expressions, these translations can be combined, such as:

for {
  x <- list1
  y <- list2
  if x > y
} yield x + y

This is translated by the compiler to:

list1.flatMap(x => list2.withFilter(y => x > y).map(y => x + y))
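
The following sketch runs both forms on made-up input lists to confirm that they agree:

```scala
val list1 = List(1, 2, 3, 4)
val list2 = List(2, 3)

val sums = for {
  x <- list1
  y <- list2
  if x > y
} yield x + y

val sumsDesugared = list1.flatMap(x => list2.withFilter(y => x > y).map(y => x + y))
// The pairs with x > y are (3, 2), (4, 2), and (4, 3), so both are List(5, 6, 7)
```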

Understanding these translations helps to gain a deeper understanding of Scala’s functional features and enables you to use for expressions more effectively, as well as to manually optimize them when necessary.
