(3) 💤 lazy

Lazy Evaluation

Overview

  • Lazy Evaluation is a programming technique or strategy that calculates expressions only when the values are truly needed.
  • Scala supports lazy evaluation through lazy val, view, and lazy collections like LazyList.

🥨 Layman’s Overview

Imagine you’re preparing a big meal for your friends. Your menu has many dishes, but you’re not sure if everyone will eat every dish. If you prepare all the dishes in advance, there might be a lot of food left uneaten, leading to waste. This is similar to performing all calculations in programming in advance, even if they may not be needed in the end.

Here, the concept of "lazy" evaluation is like changing your strategy, deciding to start preparing a dish only when someone requests it. This way, you won’t waste time and resources preparing food that won’t be eaten. You only start cooking a dish when your friend says, "I want that dish."

Pros and Cons of Lazy Evaluation

Performance Improvement

Using the lazy keyword can improve performance because it allows delaying the execution of resource-intensive or computationally intensive operations until they are actually needed. The main benefits of lazy initialization include:

  1. Reduction of unnecessary computations: If an expensive operation or object creation is never used, using lazy can avoid these unnecessary computations or resource allocations. This is especially useful when initializing objects with high overhead.

  2. Faster program startup: If a program has many resource-intensive or computationally intensive initialization operations, the program’s startup time may be long. By marking these operations as lazy, you can avoid them during program startup, speeding up the process. These operations are only executed when they are actually needed.

  3. Avoiding circular dependency issues: In some cases, the initialization of two or more objects depends on each other, which can lead to circular dependency issues. Using lazy initialization can help resolve such problems because the actual initialization is deferred until all related objects are accessed.

  4. Handling potentially infinite structures: Here, the word "potentially" is crucial. These structures are designed to be potentially infinite, but in reality, at any given time, we only process a part of them. For example, an infinite sequence that theoretically extends indefinitely, but we only calculate specific parts when needed. This is especially relevant when dealing with concepts like streams or iterators. These structures allow us to write programs in a seemingly infinitely running manner, but in reality, we only perform calculations when necessary.

Drawbacks

In a multi-threaded environment, using lazy initialization can introduce thread safety issues and synchronization overhead for the following reasons:

  1. Thread safety issues: In a multi-threaded environment, if multiple threads simultaneously attempt to access the same lazy initialized variable, there can be a race condition. Since a lazy variable is initialized only on its first access, if two threads almost simultaneously access this variable, they might both try to initialize it. This can result in the variable being initialized multiple times or the initialization process being interrupted, leading to data inconsistency or program errors.

  2. Synchronization overhead: To avoid thread safety issues, Scala’s lazy initialization implementation usually includes some form of synchronization mechanism, such as locks. This means that every time a lazy variable is accessed, there might be lock acquisition and release operations involved. In a high-concurrency environment, this can result in significant performance overhead, as multiple threads might be blocked while waiting to acquire locks.

  3. Complex initialization logic: If the initialization logic of a lazy variable itself is complex and time-consuming, and involves access to other shared resources, it can increase the risk of deadlocks. Handling complex lazy initialization in a multi-threaded environment requires special care to avoid deadlocks and other concurrency issues.

Lazy Variables (Lazy Val)

  • Use lazy val to define variables with lazy initialization.
  • The value of the variable is computed on the first access and then cached for subsequent use.
  • It provides performance improvement in scenarios where variables may not be used

View

Definition

  • View is a feature of Scala collections that provides a way to delay transformation operations on collections (such as map, filter, flatMap, etc.).
  • When these operations are applied to a View, they are not executed immediately; instead, they are stored as a sequence of operations and are executed only when elements are accessed.

Use Cases

Using view is more efficient in scenarios where large amounts of data need to be processed, and only a portion of the results is required. It allows for the delayed execution of data transformation and filtering operations until the data is actually needed. This can significantly reduce unnecessary computations, especially when dealing with large datasets. Without view, even if only a small portion of the results is needed, the entire dataset would be computed and stored, leading to resource wastage.

i.e., it is used to optimize chained operations on collections, avoiding the creation of multiple intermediate collections, reducing memory usage, and improving performance.

Steps

  1. Saving transformation operations: When transformation operations like map, filter, flatMap, etc., are applied to a View, these operations are not executed immediately. Instead, they are saved as a sequence of operations. During this stage, the original collection remains unchanged, and no new collections are created.

  2. Delayed execution: These saved operations are executed only when elements of the View are actually accessed. For example, when you attempt to access or iterate over the results of the View, these operations are applied to the original collection in the order they were saved, and the corresponding elements are computed.

  3. Efficiency optimization: This delayed execution allows View to avoid immediate and unnecessary computation of the entire collection, especially in scenarios where only a portion of the results is required. It helps optimize performance, especially when multiple transformation operations are involved, as it avoids creating multiple intermediate collections.

Example of Batch Data Processing and Conditional Filtering

Design a scenario to demonstrate the advantages of view, focusing on its lazy evaluation feature. Consider the following scenario:

Suppose you are processing a list containing a large amount of data. Your task is to perform a series of transformations and filtering operations on this data, but you only need a small portion of the results, e.g., finding the first 10 elements that satisfy a specific condition.

In this scenario, we can compare the performance difference between using view and not using view:

Using view:

val numbers = (1 to 1000000).toList
val view = numbers.view.map(_ * 2).filter(_ > 5)

// Assume we only need the first 10 elements that satisfy the condition
val resultWithView = view.take(10).toList

In this example, view delays the execution of map and filter operations on the entire list. When take(10) is called, only the first 10 elements that satisfy the condition are actually computed. This means that map and filter operations are not executed on the entire list,

saving a significant amount of unnecessary computation.

Without view:

val numbers = (1 to 1000000).toList
val processedNumbers = numbers.map(_ * 2).filter(_ > 5)

// Similarly, we only need the first 10 elements that satisfy the condition
val resultWithoutView = processedNumbers.take(10)

In this example, map and filter operations are immediately executed on the entire list, creating a new processed list. Despite needing only the first 10 elements, the entire list is computed and stored, leading to performance overhead, especially with large datasets.

Code

In actual execution, using view is approximately 12 times faster:

package week2

object ViewVsNoView extends App {
  import scala.collection.View

  def time[R](block: => R): Long = {
    val start = System.nanoTime()
    block
    System.nanoTime() - start
  }

  val numbers = (1 to 1000000).toList

  // With view
  val timeWithView = time {
    val view = numbers.view.map(_ * 2).filter(_ > 5)
    val resultWithView = view.take(10).toList
  }

  // Without view
  val timeWithoutView = time {
    val processedNumbers = numbers.map(_ * 2).filter(_ > 5)
    val resultWithoutView = processedNumbers.take(10)
  }

  println(s"Time with View: $timeWithView nanoseconds")
  println(s"Time without View: $timeWithoutView nanoseconds")
}

//  Time with View: 6729800 nanoseconds  
//  Time without View : 85503600 nanoseconds

LazyList

Definition

  • LazyList is a lazy collection class in Scala that represents a potentially infinite list where elements are computed on-demand.
  • Each element is computed only when first accessed, and the computed result is cached for subsequent use.

Use Cases

  • Used for creating potentially infinite sequences or efficiently processing large collections.

Example of Filtering Prime Numbers

Using LazyList and the Sieve of Eratosthenes algorithm to generate a sequence of prime numbers.

LazyList.from(1).filter(isPrime)(1)

This expression does not compute the entire sequence and then take the second element. Due to the lazy nature of the sequence, this expression works efficiently:

Comparison

If you were to use a Range or a regular list, the expression would first compute and generate a list of all elements within the range that satisfy the isPrime condition. Subsequently, when using .apply(1) or a similar indexing method, it would retrieve the second element from this precomputed list.

In this non-lazy evaluation scenario, the computation efficiency is lower, especially when only a few elements from the sequence are needed. Even if only one or a few elements are required, the entire sequence would still be fully computed and stored. This is why lazy evaluation (such as using LazyList or similar lazy structures) is often more efficient when dealing with large datasets or potentially infinite sequences.

Code

In actual execution, this example with Lazy List is approximately 12 times faster:

object LazyListvsRange extends App {
  def isPrime(n: Int): Boolean = {
    if (n <= 1) return false
    if (n <= 3) return true
    if (n % 2 == 0 || n % 3 == 0) return false
    var i = 5
    while (i * i <= n) {
      if (n % i == 0 || n % (i + 2) == 0) return false
      i += 6
    }
    true
  }

  val startTimeLazyList = System.nanoTime()
  val resultLazyList = LazyList.from(1).filter(isPrime)(1)
  val endTimeLazyList = System.nanoTime()
  val durationLazyList = endTimeLazyList - startTimeLazyList

  println(s"Result with LazyList: $resultLazyList, Time taken: $durationLazyList nanoseconds")

  val startTimeRange = System.nanoTime()
  val resultRange = (1 to 10000000).filter(isPrime)(1)
  val endTimeRange = System.nanoTime()
  val durationRange = endTimeRange - startTimeRange

  println(s"Result with Range: $resultRange, Time taken: $durationRange nanoseconds")
}

LazyList vs View

  • Timing of Execution:

    • View saves a sequence of transformation operations that are applied to the original collection only when elements are accessed.
    • LazyList computes each element and caches the result upon the first access, suitable for potentially infinite sequences.
  • Caching:

    • View typically has lower caching – it does not cache computed results, recalculating each time it is accessed.
    • LazyList may have higher caching – it caches the computed result for each element.
  • Use Cases:

    • View is suitable for optimizing chained transformation operations on finite collections.
    • LazyList is suitable for generating and operating on potentially infinite sequences or scenarios where caching results is needed.

In summary, both View and LazyList are used for lazy evaluation, but they differ in caching behavior, timing of execution, and optimal use cases.

Pitfalls

Is def considered lazy in Scala?

In Scala, the def keyword is used to define functions or methods and is not directly considered lazy evaluation. However, functions or methods defined using def do have a form of delayed computation because they are executed only when called. While this shares some similarity with lazy evaluation, it’s not the same concept.

  1. On-demand computation: Functions or methods defined using def are executed only when called, similar to lazy evaluation, as computation is deferred until needed.

  2. Recomputation on each call: Unlike lazy val, functions or methods defined using def are recomputed every time they are called. lazy val, on the other hand, computes its value once upon first access and then caches it for subsequent use.

i.e.,

  • In Scala, def is primarily used for defining functions and methods, and while it has a feature of on-demand computation, it’s not a typical representation of lazy evaluation.
  • However, functions or methods defined with def share some similarity with lazy evaluation because they defer computation until they are actually called.
  • The main difference is that def functions or methods are recomputed every time they are called, whereas lazy val computes once and caches the result for reuse.

Is val x: LazyList already lazy, or do you need lazy val x: LazyList?

In Scala, when you declare a variable of type LazyList, the LazyList itself is already lazy, regardless of whether you use the lazy val keyword. The laziness of a LazyList is determined by its internal implementation, where each element is computed only upon first access. This means that even if you use a regular val to define a LazyList, it still maintains the lazy evaluation behavior.

Leave a Reply

Your email address will not be published. Required fields are marked *