Last updated on

K-Window Minimum

Points
100 pts
Topics
Data Structures, Trees, Recursion, Debugging
Files to Submit
src/main/scala/kwindow/KWindow.scala
src/main/scala/kwindow/AVLOrderedSet.scala

Introduction

In this mini-lab, we will illustrate the famous equation “Programs = Algorithms + Data Structures”.

The $k$-window Minimum problem asks, for a sequence of items xs of size at least $k$, compute a sequence, ms, where each element is the minimum value of the corresponding $k$-window of the original sequence.

For example, for the following sequence of integers: $$ 1, 4, 6, 2, 5 $$ and $k = 3$, the corresponding $k$-window minimum sequence is $1, 2, 2$.

Step by Step

We can consider each $3$-window of the sequence and find the minimum as the following table demonstrates:

SequenceWindowMinimum
1, 4, 6, 2, 51, 4, 61
1, 4, 6, 2, 54, 6, 22
1, 4, 6, 2, 56, 2, 52
Is it just too simple?

Of course, in Scala, a simple solution to this problem is to find the minimum over a sliding window:

def kWindowMinSimple[A](xs: Seq[A], k: Int)(using ord: Ordering[A]): Seq[A] =
  require(k > 0)
  require(xs.size >= k)
  xs.sliding(k).map(_.min).toSeq

kwindow/./src/test/scala/KWindow.scala

However, this is very inefficient! We are re-calculating the minimum of every window, so our simple solution cannot handle longer sequences.

In this task, we will implement a more efficient solution to this problem with the help of an Ordered Set data structure.

What are Ordered Sets?

Ordered Sets are an extension of Multi-Sets, supporting the following operations:

  1. Insert a new value into the set.
  2. Remove a value from the set.
  3. Check whether the set contains a value.
  4. Get the minimum value of all values in the set.

Note that unlike Scala Set, ordered sets can have multiple copies of the same value, and the remove operation removes only one instance of the value (if it exists).

In Scala, we define the OrderedSet interface as a trait with the above operations as follows:

/** An ordered set of items of type `A`. Supports querying, insertion, deletion
  * of items as well as getting the minimum item.
  */
trait OrderedSet[A]:
  /** Returns the size of the tree. */
  def size: Int

  def isEmpty: Boolean = size == 0

  /** Returns a new [[OrderedSet]] with one instance of `item` inserted. */
  def insert(item: A): OrderedSet[A]

  /** Returns a new [[OrderedSet]] with one instance of `item` removed. */
  def remove(item: A): OrderedSet[A]

  /** Returns whether `item` is contained in the set. */
  def contains(item: A): Boolean

  /** Returns the minimum item contained in the tree.
    * @throws NoSuchElementException
    *   if the tree is empty.
    */
  def getMin: A

kwindow/./src/main/scala/kwindow/OrderedSet.scala

Other than the four main operations (insert, remove, contains and getMin), the Scala ordered set also supports getting the size and query whether the set isEmpty.

Implementing K-Window using Ordered Sets

Now that we know what an ordered set is, let’s try to solve $k$-window minimum with using ordered sets!

💪 Task: Implement kWindowMin (50 pts)

Implement kWindowMin. Your function should have the following signature:

/** Given a window size `k` > 0 and a sequence of items `xs` (of length at least
  * `k`), return a sequence that is a k-window minimum sequence of `xs`.
  */
def kWindowMin[A](
    xs: Seq[A],
    k: Int,
    emptySet: OrderedSet[A]
): Seq[A] =
  require(k > 0)
  require(xs.length >= k)
  ???

kwindow/./src/main/scala/kwindow/KWindow.scala

You are free to chose your algorithm, but it must use the provided emptySet, and its complexity must be no more than $\mathcal{O}(n \times c)$, where $n$ is the number of elements in xs and $c$ is the cost of removing or adding an element to the ordered set. We suggest the following algorithm:

  1. Construct an initial ordered set s from the first k elements of xs.
  2. For each remaining element e of the sequence, remove the oldest element of s and add e to the ordered set.
  3. At each step, extract the minimum of the set.

Note that you are given, in addition to the sequence xs and k, an empty OrderedSet[A].

You can run tests for this task using the testOnly command.

sbt:kwindow> testOnly -- *kWindowMin:*

Note that the tests are run with a NaiveOrderedSet implementation, which is not yet using an efficient ordered set, which we will work with in the next task!

We shall move to the more efficient AVLOrderedSet implementation in the next section, when we are sure our kWindowMin implementation is tested.

AVL Trees

Ordered Sets are usually implemented as binary search trees, where the item at the root of the tree is larger than or equal to all of the items in its left subtree, and smaller than or equal to all of the items in its right subtree. We call this condition the binary search tree (BST) condition of the tree.

Examples of BSTs

The following trees satisfy the BST condition:

Trees satisfying the BST condition

But the following trees do not

Trees not satisfying the BST condition

Note that a child sub-tree that is empty is simply omitted from the diagram.

On a binary search tree, obtaining the minimum value is simple: it is on the leftmost node of the tree!

We will be looking at a particular kind of BST implementation called AVL Trees.

Valid AVL Trees are binary search trees satisfying the following requirements:

The height of a tree is defined to be the longest path (in number of nodes) from the root to an empty node.

For simplicity, we sometimes omit the “valid” part and say just “AVL tree”.

Examples

The following trees are valid AVL trees (i.e., they verify the properties above):

AVL trees examples. Note that the first tree is an empty tree.

But the following trees are not AVL trees:

Trees that are not AVL Trees. Note the third example, where the sub-tree rooted at 10 is not an AVL tree.

In Scala, a Tree is defined as either Empty, or a Node with one value and two sub-Trees:

/** A binary tree containing items of type `A`. */
enum Tree[+A]:
  case Empty
  case Node(item: A, left: Tree[A], right: Tree[A])

  /** The height of the tree. */
  lazy val height: Int = this match
    case Empty                   => 0
    case Node(item, left, right) => left.height.max(right.height) + 1

kwindow/./src/main/scala/kwindow/AVLOrderedSet.scala

An AVLOrderedSet, being a binary search tree, also needs to know how the ordering of the elements are defined:

/** AVL tree is a balanced Binary Search Tree, implementing the [[OrderedSet]]
  * trait.
  *
  * It contains a [[Tree]] of items, sorted by the given
  * [[scala.math.Ordering]].
  */
case class AVLOrderedSet[A] private (tree: Tree[A])(using
    private val ord: Ordering[A]
) extends OrderedSet[A]:
  // ...

kwindow/./src/main/scala/kwindow/AVLOrderedSet.scala

… where Ordering[A] defines the ordering of A through its compare, lt (less than) and gt (greater than) methods.

💪 Task: Is a Tree a valid AVL Tree? (20 pts)

Implement the function isValidAVLTree, that takes a Tree and an Ordering, and returns whether the tree satisfies the requirements of an AVL tree.

/** Returns whether the given [[Tree]] is a valid AVL tree with respect to the
  * given [[Ordering]].
  */
def isValidAVLTree[A](tree: Tree[A])(using ord: Ordering[A]): Boolean =
  ???

kwindow/./src/main/scala/kwindow/AVLOrderedSet.scala

You can run tests for this task using the testOnly command.

sbt:kwindow> testOnly -- *isValidAVLTree*

Using kWindowMin with AVL Trees

With our kWindowMin implementation ready and tested, let’s try to combine it with an efficient ordered set implementation, so that our solution can fully handle tens of thousands of elements.

💪 Task: Fix the AVL Tree! (30 pts)

This is the final task of this exercise. You should only continue when you have passed all testcases from the last tasks!

We asked an anonymous CS-214 TA to provide you with an implementation of Ordered Set operations for AVL trees, that you can freely use. However, due to lack of sleep and overusage of coffee, the implementation is rushed and not well tested, so there might be a bug in one of the methods in the AVLOrderedSet class! The TA promises however that you only have to change at most one line of code to get it working…

Your task is to debug the AVL tree implementation so that kWindowMin tests using AVLOrderedSets pass. You can run tests for this task using the testOnly command.

sbt:kwindow> testOnly -- "*kWindowMin with AVLOrderedSet: *"

You should find details on the implementation of AVL Tree in the docstrings within AVLOrderedSet.scala.