Last updated on

Purely functional data structures

Welcome to the data structure exercises of CS-214!

Data structures are vital in programming: well-designed data structures make the retrieval and storage of information efficient.

In this exercise set, you will implement several data structures and use them in practice.

As before, exercises or questions marked ⭐️ are the most important.

We strongly encourage you to solve the exercises on paper first, in groups. After completing a first draft on paper, you may want to check your solutions on your computer. To do so, you can download the scaffold code (ZIP).

The data structure zoo

You are going to implement a few data structures on your own. For each data structure, we will provide you with its definition, an interface for the behavior of the data structure, and tests. You should read and understand the definition and complete your implementation according to the provided interface. Start on paper before writing code!

Throughout this exercise set, we care about purely functional data structures, without mutation. If you start using vars, while loops, or mutation, something is wrong!

Warm-up: Lists as stacks

A stack is a container that stores items in a first-in-last-out (FILO) manner. It supports two operations: push and pop. push adds an item onto the top of a stack, and pop removes an item from the top. The earliest item that was pushed into a stack lies at the bottom of it; the most-recently pushed item is at the top. When popping items, the later an item comes, the earlier it is popped. That’s why stacks are FILO.

We say “adds” and “removes”, but there should be no mutation: push and pop should return new stacks, leaving the original stacks unchanged. It would more precise (but a bit pedantic) to say that push, for example, creates a new stack with the same elements as the previous stack, plus one additional element at the top.

A stack is simply a list in disguise!

Your task is to implement a stack that conforms to the following interface:

abstract class Stack[X]:
  /** Push an element into the stack. */
  def push(elem: X): Stack[X]

  /** Pop an element from the stack. Return None if the stack is empty. */
  def pop: Option[(X, Stack[X])]

  /** Check whether the stack is empty. */
  def isEmpty: Boolean

src/main/scala/ds/Stack.scala

The methods are quite self-explanatory. We provide you with the following scaffold:

case class StackImpl[X](data: List[X]) extends Stack[X]:
  def push(elem: X): Stack[X] = ???
  def pop: Option[(X, Stack[X])] = ???
  def isEmpty: Boolean = ???

src/main/scala/ds/Stack.scala

Additionally, you have to implement the apply method for the Stack object, which returns an empty stack.

object Stack:
  /** Create an empty stack. */
  def apply[X](): Stack[X] =
    ???

src/main/scala/ds/Stack.scala

To test your implementation, run testOnly -- "*stack:*" in sbt console.

Double-ended queues ⭐

A Queue is a first-in-first-out (FIFO) container. In contrary to a stack, the earlier an item is pushed into a queue, the earlier it comes out of the queue.

A Double-Ended Queue (deque) is a more generalised queue that allows adding and removing elements from both ends.

A deque is typically implemented by two stacks, back-to-back. One stack stores elements from the front and the other from the back. For instance, the following deque:

    Front             Back
    1  ->  2  ->  3  ->  4

can be represented as (1 :: 2 :: Nil, 4 :: 3 :: Nil), if we use lists to implement stacks.

To understand how this deque works, let’s walk through an example of constructing it from an empty deque:

  1. Start with an empty deque: (Nil, Nil).
  2. After pushFirst(2): (2 :: Nil, Nil).
  3. After pushFirst(1): (1 :: 2 :: Nil, Nil).
  4. After pushLast(3): (1 :: 2 :: Nil, 3 :: Nil).
  5. After pushLast(4): (1 :: 2 :: Nil, 4 :: 3 :: Nil).

Now we have the deque shown in the diagram above. When popping from either end, if that end’s list is non-empty, it’s straightforward:

For popFirst() when the front list is non-empty (x :: xs, back):

For popLast() when the back list is non-empty (front, x :: xs):

For example, starting with (1 :: 2 :: Nil, 4 :: 3 :: Nil):

  1. After popFirst(): Returns 1, the new state is (2 :: Nil, 4 :: 3 :: Nil).
  2. After popLast(): Returns 4, the new state is (2 :: Nil, 3 :: Nil).

The interesting part comes when we try to pop from an end that’s empty. If we try to popFirst() when the front list is empty, we need to restructure the deque by:

For example, starting with (Nil, 4 :: 3 :: 2 :: 1 :: Nil):

  1. Reverse the back list: 1 :: 2 :: 3 :: 4 :: Nil.
  2. Make it the new front list: (1 :: 2 :: 3 :: 4 :: Nil, Nil).
  3. Then, popFirst() returns 1, and the new state is (2 :: 3 :: 4 :: Nil, Nil).

It is similar for popLast() when the back list is empty.

Given the deque (1 :: 2 :: Nil, 5 :: 4 :: 3 :: Nil), what happens after:

  1. popFirst()?
  2. Then popFirst() again?
  3. Then popFirst() one more time?
Solution
  1. After popFirst(): Returns 1, the new state is (2 :: Nil, 5 :: 4 :: 3 :: Nil).
  2. After popFirst() again: Returns 2, the new state is (Nil, 5 :: 4 :: 3 :: Nil).
  3. After popFirst() one more time: first restructure the deque to (3 :: 4 :: 5 :: Nil, Nil), then return 3, the new state is (4 :: 5 :: Nil, Nil).

Given the empty deque (Nil, Nil), what’s the state after:

  1. pushLast(1)
  2. pushFirst(2)
  3. pushLast(3)
  4. popFirst()
Solution
  1. After pushLast(1): (Nil, 1 :: Nil).
  2. After pushFirst(2): (2 :: Nil, 1 :: Nil) .
  3. After pushLast(3): (2 :: Nil, 3 :: 1 :: Nil).
  4. After popFirst(): Returns 2, the new state is (Nil, 3 :: 1 :: Nil).

Starting with (Nil, 3 :: 2 :: 1 :: Nil), what happens when you call popFirst()? Walk through the restructuring steps.

Solution
  1. Front list is empty, so we need to restructure.
  2. Reverse back list: 1 :: 2 :: 3 :: Nil.
  3. Make it front list: (1 :: 2 :: 3 :: Nil, Nil).
  4. Then popFirst() returns 1, the new state is (2 :: 3 :: Nil, Nil).

Now, your task is to implement the Deque data structure according to the following interface:

abstract class Deque[X]:
  /** Push an element to the front. */
  def pushFirst(elem: X): Deque[X]

  /** Pop an element from the front. */
  def popFirst: Option[(X, Deque[X])]

  /** Push an element to the back. */
  def pushLast(elem: X): Deque[X]

  /** Pop an element from the back. */
  def popLast: Option[(X, Deque[X])]

  /** Check whether the deque is empty. */
  def isEmpty: Boolean

src/main/scala/ds/Deque.scala

pushFirst and pushLast push an item to the front-end and back-end of the queue respectively. popFirst and popLast remove an item from the respective ends. When the deque is empty, both popFirst and popLast return None. Otherwise, they return the removed item and the resulting deque.

We have provided you with a scaffold:

case class DequeImpl[X](front: List[X], back: List[X]) extends Deque[X]:
  def pushFirst(elem: X): Deque[X] = ???
  def popFirst: Option[(X, Deque[X])] = ???
  def pushLast(elem: X): Deque[X] = ???
  def popLast: Option[(X, Deque[X])] = ???
  def isEmpty: Boolean = ???

src/main/scala/ds/Deque.scala

You should also complete the apply method of the Deque object which creates an empty deque:

object Deque:
  /** Create an empty deque. */
  def apply[X](): Deque[X] =
    ???

src/main/scala/ds/Deque.scala

To test your implementation, implement it on the computer, fire up an sbt console and run testOnly -- "*deque:*".

Quadtrees

In this exercise, you will implement a quadtree, a hierarchical data structure used to partition a 2D space into smaller regions. Quadtrees are particularly useful for spatial indexing, such as efficiently managing points in a large map or image.

We have seen quad trees many times during the semester. This time, it’s your turn to implement them! You can read up about them on Wikipedia.

In this exercise, your task is to first understand quadtrees on your own, then implement them according to our requirements.

You need to implement a quadtree that stores a set of points on a 2D space. The definition of Point is:

case class Point(x: Int, y: Int)

src/main/scala/ds/QuadTree.scala

We have also set up the Boundary class for you: it represents a rectangle region in 2D space.

case class Boundary(bottomLeft: Point, topRight: Point):
  /** Check whether this boundary is empty. */
  def isEmpty: Boolean =
    !(bottomLeft.x < topRight.x && bottomLeft.y < topRight.y)

  /** Check whether this boundary contains a point. */
  def contains(p: Point): Boolean =
    (p.x >= bottomLeft.x && p.x < topRight.x) && (p.y >= bottomLeft.y && p.y < topRight.y)

  /** Intersect this boundary with another. */
  def intersect(other: Boundary): Boundary =
    val startX = bottomLeft.x `max` other.bottomLeft.x
    val endX = topRight.x `min` other.topRight.x
    val startY = bottomLeft.y `max` other.bottomLeft.y
    val endY = topRight.y `min` other.topRight.y
    Boundary(Point(startX, startY), Point(endX, endY))

src/main/scala/ds/QuadTree.scala

This time, we will implement the data structure in a style where data and logic are separated. Specifically:

The first step is to define the data type:

// hint: define tree nodes as a enum?
type QuadTree = Nothing

src/main/scala/ds/QuadTree.scala

Then comes the logic part: we want you to define the following operations on a quadtree:

trait QuadTreeOps[T]:
  /** Create an empty quadtree on a `boundary` with a `capacity`. */
  def empty(boundary: Boundary, capacity: Int): T

  /** Insert a point into a quadtree. */
  def insert(tree: T, p: Point): T

  /** Return the points that fall within a boundary. */
  def query(tree: T, boundary: Boundary): List[Point]

src/main/scala/ds/QuadTree.scala

Specifically, you have to complete this scaffold:

object QuadTreeImpl extends QuadTreeOps[QuadTree]:
  def empty(boundary: Boundary, capacity: Int): QuadTree = ???
  def insert(tree: QuadTree, p: Point): QuadTree = ???
  def query(tree: QuadTree, boundary: Boundary): List[Point] = ???

src/main/scala/ds/QuadTree.scala

Here,

We have seen multiple variants of quadtrees. Here, we assume that:

  • Each quadtree has a boundary: a rectangle corresponding to the whole space that the quadtree describes: this is why the constructor takes a boundary parameter. Attempting to insert a point that doesn’t fall within the boundary of the quadtree should return the original tree, unmodified.
  • Each undivided quadtree has a capacity, above which it should split into smaller quadtrees. This is the capacity parameter of the empty constructor: up to capacity elements can be inserted before the quadtree splits.

To test your implementation on the computer, run testOnly -- "*quadtree:*".

A little spellchecker ⭐

In this section, we will first implement a Trie data structure, and then use it to implement a spell checker.

Tries

Trie is a data structure for efficient string lookup.

Your first task is to understand how trie works on your own, and then implement this data structure according to the following interface:

abstract class Trie[X]:
  def find(key: String): Option[X]
  def delete(key: String): Trie[X]
  def insert(key: String, value: X): Trie[X]

src/main/scala/ds/Trie.scala

The methods are self-explanatory.

We have provided you with a scaffold:

case class TrieImpl[X](next: Map[Char, TrieImpl[X]], value: Option[X]) extends Trie[X]:
  def find(key: String): Option[X] = None
  def delete(key: String): TrieImpl[X] = this
  def insert(key: String, value: X): TrieImpl[X] = this

src/main/scala/ds/Trie.scala

In addition, you should complete the following method which returns an empty trie tree:

object TrieImpl:
  def apply[X](): TrieImpl[X] =
    ???

src/main/scala/ds/Trie.scala

You can test your trie implementation on the computer with testOnly -- "*trie:*".

Spellchecking with tries ⭐

Now you will use the trie you just implemented in practice–building a spell checker!

The spell checker takes a text as input and returns a list of misspelled words:

trait SpellChecker:
  /** Spell-check the input text. Return a list of misspelled words. */
  def checkText(text: String): List[String] =
    tokenize(text).filterNot(checkWord)

  /** Check whether the `word` is in `dict`. */
  def checkWord(word: String): Boolean

src/main/scala/ds/SpellCheck.scala

We have set up the utility functions for splitting the text into words and normalizing them. Your main task is to implement the checkWord function, which determines whether the input word is well spelled.

The TrieSpellChecker is an implementation of this spell-checking interface using tries. You should complete the following scaffold:

class TrieSpellChecker(dict: Dict) extends SpellChecker:
  val trie: Trie[Unit] = build(dict)
  private def build(dict: Dict): Trie[Unit] = ???

  def checkWord(word: String): Boolean =
    // now this spell-checker is ALWAYS sound, though quite not complete :)
    false

src/main/scala/ds/SpellCheck.scala

The Dict is simply a List of strings:

type Dict = List[String]

src/main/scala/ds/SpellCheck.scala

Therefore, in checkWord you need to check whether the input word is inside the dictionary. Use a trie tree to do that. DO NOT use List.contains!

To test your implmentation, run testOnly -- "*spellcheck:*" in the sbt console.