Last updated on
Pattern Matching
Welcome to week 3 of CS-214 — Software Construction!
This exercise set is intended to help you practice pattern matching.
As usual, ⭐️ indicates the most important exercises and questions and 🔥 indicates the most challenging ones. Exercises or questions marked 🧪 are intended to build up to concepts used in this week’s lab.
You do not need to complete all exercises to succeed in this class, and you do not need to do all exercises in the order they are written.
We strongly encourage you to solve the exercises on paper first, in groups. After completing a first draft on paper, you may want to check your solutions on your computer. To do so, you can download the scaffold code (ZIP).
In previous exercises and labs, we used if
conditionals and .
field accessors to write functions on data types such as lists or trees. This week, we’ll use pattern matching to make these functions more succinct and readable. We’ll move from this:
def reduceIf(f: (Int, Int) => Int)(l: IntList): Int =
if l.isEmpty then throw IllegalArgumentException("Empty list!")
else if l.tail.isEmpty then l.head
else f(l.head, reduceIf(f)(l.tail))
src/main/scala/patmat/IntList.scala
to this:
def reduceMatch(f: (Int, Int) => Int)(l: IntList): Int =
l match
case IntNil => throw IllegalArgumentException("Empty list!")
case IntCons(hd, IntNil) => hd
case IntCons(hd, tl) => f(hd, reduceMatch(f)(tl))
src/main/scala/patmat/IntList.scala
Most functional programmers find the second one much more readable, because it aligns the way the data is destructed (taken apart into a head and a tail) and the way the data is constructed (assembled from a head and a tail):
def constructDestruct =
IntCons(1, IntCons(2, IntNil)) match
case IntCons(a, IntCons(b, IntNil)) =>
println(f"Found $a and $b")
case _ => throw Exception("Not possible!")
src/main/scala/patmat/IntList.scala
Previously, we wrote IntNil()
for empty IntList
s. Now that we know about enum
s and case classes, we can use the more succinct and convenient syntax IntNil
(no parentheses).
If you find yourself looking for more examples of pattern matching after completing this set, consider revisiting week 1 and week 2 exercises and redefining all the functions with pattern matching — it’s a great exercise.
Weekdays
Days of the week are a great example of simple enumerations. So, which day is tomorrow? Let’s implement a function to find out.
The enum
representing weekdays is defined as follows:
enum Weekday:
case Monday
case Tuesday
case Wednesday
case Thursday
case Friday
case Saturday
case Sunday
src/main/scala/patmat/Weekday.scala
Complete the following two functions:
-
next
returns the next day of the week:def next(d: Weekday): Weekday = ???
src/main/scala/patmat/WeekdayOps.scala
-
prev
returns the previous day of the week:def prev(d: Weekday): Weekday = ???
src/main/scala/patmat/WeekdayOps.scala
Want to test your code? Run testOnly WeekdayOpsTest
in sbt
.
This exercise is taken from Logical Foundations, a book about mathematical proofs of programs, and translated into Scala.
Tri-booleans ⭐️
By now you’re very familiar with Booleans. But in real life, not every thing is Yes
or No
: sometimes we just don’t know! Tri-boolean logic helps with this by adding an indeterminate value, Maybe
:
enum TriBool:
case Yes
case No
case Maybe
src/main/scala/patmat/TriBool.scala
Maybe
is like a Boolean value that is not yet known.
-
Take the time to think of how
Maybe
combines withYes
andNo
.-
If I have two objects, one which is for sure blue, and one that may be blue, then can I say that I have at least one blue item? Is that for sure or maybe? What does this entail about the “or” operator on tri-Booleans?
-
If I have the same two objects (one blue, one maybe blue), can I conclusively say that not all of my objects are blue? Can I promise that they are all blue? What does this entail about the “and” operator on tri-Booleans?
-
-
Implement the following operations (we expect that you’ll find pattern-matching very nice for that!)
def neg(b: TriBool): TriBool = ???
src/main/scala/patmat/TriBoolOps.scala
def and(b1: TriBool, b2: TriBool): TriBool = ???
src/main/scala/patmat/TriBoolOps.scala
def or(b1: TriBool, b2: TriBool): TriBool = ???
src/main/scala/patmat/TriBoolOps.scala
def nand(b1: TriBool, b2: TriBool): TriBool = ???
src/main/scala/patmat/TriBoolOps.scala
Want to test your code? Run
testOnly TriBoolOpsTest
insbt
.nand
is a very surprising operator. If you’re not familiar with it, inspect the test cases, or read more about it! -
Try proving on paper the following properties of these functions:
- For any
a
,neg(neg(a)) = a
. - For any
a
andb
,neg(and(a, b)) = or(neg(a), neg(b))
. - For any
a
andb
,neg(or(a, b)) = and(neg(a), neg(b))
.
- For any
Contexts 🧪
Now that you have a bit of experience with pattern matching, let’s use it to construct more complex types. In this exercise we’ll study contexts, which are essentially lists of keys and values. This exercise is particularly useful for this week’s lab.
Our contexts associate names (String
s) with values (Int
s), and let us do the following:
- Create an empty context;
- Add an additional key/value pair (called a “binding”) to a context;
- Look up a key to retrieve the corresponding value;
- Remove a key and its corresponding value.
For instance, we can have a context which associates "x"
with 1
and "y"
with 2
.
We can then look up the keys: looking up "x"
in the context produces 1
, looking up "y"
produces 2
, and looking up "z"
or any other name results in a not-found error.
We can add more mapping in the context: after associating "z"
with 3
, looking "z"
up in the new context will result in 3
, instead of an error.
Think of what Scala types and features you may use to represent a context and write them down before proceeding.
Reveal our choice of representation
We’ll start by representing contexts using an enum
(and we’ll see another representation later):
enum Context:
case Empty
case Cons(name: String, value: Int, tail: Context)
src/main/scala/patmat/EnumContext.scala
We have two cases:
Empty
, which represents an empty context;Cons
, which forms a new context by associating a new name with its value in an existing context.
For example, the context that associates "x"
with 1
and "y"
with 2
can be represented as
Cons("x", 1, Cons("y", 2, Empty))
, or Cons("y", 2, Cons("x", 1, Empty))
.
If you want to experiment with contexts in the playground, make sure to add import EnumContext.*
to bring Cons
and Empty
into the worksheet’s scope.
In our representation, how do you represent a new context with the additional binding associating "z"
to 3
on top of the other bindings in our example?
Solution
Cons("z", 3, Cons("x", 1, Cons("y", 2, Empty)))
When multiple bindings in a context have the same name,
only the outermost is effective.
For instance, given a context Cons("a", 1, Cons("b", 2, Cons("a", 3, Empty)))
,
looking up "a"
this context should return 1
instead of 3
.
Implement the following three functions:
-
empty
returns an empty context:def empty: Context = ???
src/main/scala/patmat/EnumContext.scala
-
cons
forms a new context by associating a new pair of name-value in an existing context:def cons(name: String, value: Int, rem: Context) = ???
src/main/scala/patmat/EnumContext.scala
-
lookup
looks a name up in a given context:def lookup(ctx: Context, name: String): LookupResult = ???
src/main/scala/patmat/EnumContext.scala
Notice the return type: we used another
enum
, calledLookupResult
, to capture two possible results: either finding a value, or indicating that the name could not be found.enum LookupResult: case Ok(v: Int) case NotFound
src/main/scala/patmat/EnumContext.scala
-
erase
drops all bindings with the givenname
in the context:def erase(ctx: Context, name: String): Context = ???
src/main/scala/patmat/EnumContext.scala
(As a bonus, try writing a function that removes just the first binding for a given name!)
-
Finally,
filter
drops all bindings that fail the testpred
(i.e. for whichpred
evaluated to `false):def filter(ctx: Context, pred: (String, Int) => Boolean): Context = ???
src/main/scala/patmat/EnumContext.scala
Want to test your code? Run testOnly EnumContextTest
in sbt
.
Tree Mapping and Tree Reducing
In week 1 we worked with trees using a clunky API of .left
, .right
, and .isEmpty
. This time, let’s do it the right way. And, to mix things up a bit, we’ll see a tree that contains values only at the leaves, instead of in every node (revisiting the examples from week 1 is another great exercise, though!)
The IntTree
enum
is defined as the following:
enum IntTree:
case Leaf(value: Int)
case Branch(left: IntTree, right: IntTree)
src/main/scala/patmat/IntTree.scala
Here is an example:
What is the enum
representation of the tree below?
Solution
Branch(Branch(Leaf(1), Leaf(2)), Leaf(3))
Want to test your code? Run testOnly IntTreeOpsTest
in sbt
.
Tree Mapping 🧪
Let’s start with the treeMap
function:
def treeMap(tree: IntTree, op: Int => Int): IntTree =
???
src/main/scala/patmat/IntTreeOps.scala
It takes a tree and a function, and creates a new tree by applying the function on the value of each leaf node in the tree.
For instance, let’s name the tree we just saw in the diagram as t
.
The following diagram dipicts the computation of treeMap(t, x => x + 1)
.
Tree Reducing
The second task is the treeReduce
function:
def treeReduce(tree: IntTree, op: (Int, Int) => Int): Int =
???
src/main/scala/patmat/IntTreeOps.scala
Given a tree and an operator, the function reduces the values in the tree with the operator.
To see what that means, let’s illustrate the call treeReduce(t, (a, b) => a + b)
in diagrams,
which essentially computes the sum of all values in the leaf node of t
.
Intuitively, it is a bottom-up aggregation of values in the leaves.
The values flow from the bottom to the root.
At each branch node, the aggregated results from the two children are merged using op
, the operator supplied to treeReduce
.
But many of us prefer to think of it as a “pull” operation: to compute its own value, each internal node “pulls” the values of the subtrees by making recursive call.
A bit more formally,
- For each branch node, e.g.
Branch(left, right)
, The function should recurse on bothleft
andright
and merge the results withop
. - For the leaf node, e.g.
Leaf(value)
, the value is returned.
Map and Reduce
The last task is the treeMapReduce
function, which first maps the stores in each leaf with the mapper
,
then reduces these mapped values using reducer
.
Start with a version that maps values to strings, then reduces the strings on the leaf to the result string:
def treeMapReduce(tree: IntTree, mapper: Int => String, reducer: (String, String) => String): String =
???
src/main/scala/patmat/IntTreeOps.scala
Now implement a version on Double
s:
def treeMapReduceDouble(tree: IntTree, mapper: Int => Double, reducer: (Double, Double) => Double): Double =
???
src/main/scala/patmat/IntTreeOps.scala
Are there similarities? Soon we will learn how to unify the two.
IntList
Let’s implement functions on IntList
again, this time with pattern matching.
Before starting this part of exercises,
we suggest you implementing several IntList
questions from previous weeks’ exercises using pattern matching
to warm yourself up!
Want to check answers in this section on computer?
Test them with testOnly IntListOpsTest
.
Polish Notation, take 2
Now that we know pattern matching, we can make a much nicer version of the polishEval
function from week 1.
-
Create appropriate types to encode atoms (operators and numbers) and Polish notation expressions (lists of atoms):
// Replace these types with `enum`s type PolishNotationAtom = Unit type PolishNotation = Unit
src/main/scala/patmat/PolishNotation.scala
-
Rewrite the two provided examples to use your notation:
def plusOneTwo: PolishNotation = // + 1 2 ??? def plusTwoTimesThreeFour: PolishNotation = // + 2 * 3 4 ???
src/main/scala/patmat/PolishNotation.scala
-
Rewrite the
polishEval
function so that it uses pattern matching:def polishEval(l: PolishNotation): (Int, PolishNotation) = ???
src/main/scala/patmat/PolishNotation.scala
The
InvalidExpression
error is still needed, but you should not need theInvalidOperatorNumber
exception (instead, negative numbers can be treated just like nonnegative ones). -
Compare your new version with the original
if
-based implementation. Which one is more readable?
zipWith
⭐️
map
, fold
, and other functions that we have previously studied operate on a single list. We have already seen that some of these generalize to trees; here, we will see how one of them (map
) generalizes to multiple lists. This function is called zipWith
in Scala:
def zipWith(l1: IntList, l2: IntList, op: (Int, Int) => Int): IntList =
???
src/main/scala/patmat/IntListOps.scala
Here is one possible specification for zipWith
, using l[n]
to mean “the n-th element of list l
”:
Given two lists
xs
andys
and an operatorop
, letzs
bezipWith(xs, ys, op)
. Then,zs
should have lengthmin(length(xs), length(ys))
, and thei
-th element ofzs
should beop(xs[i], ys[i])
for alli
.
Note the part about the length of zs
. For instance, zipWith(IntNil, IntCons(1, Nil), (x, y) => x + y)
should equal to IntNil
.
Implement zipWith
. Do you see a connection with map
?
A modular implementation of zipWith
⭐️
zipWith
does a lot of work at once, and we can make it a bit more modular. For this, let’s define an intermediate type IntIntList
of lists of pairs of ints:
enum IntIntList:
case IntIntNil
case IntIntCons(xy: (Int, Int), xs: IntIntList)
import IntIntList.*
src/main/scala/patmat/IntList.scala
-
Define a function
zip
to construct a list of pairs from a pair of lists:def zip(l1: IntList, l2: IntList): IntIntList = ???
src/main/scala/patmat/IntListOps.scala
-
Define a function
unzip
to construct a pair of lists from a list of pairs:def unzip(l: IntIntList): (IntList, IntList) = ???
src/main/scala/patmat/IntListOps.scala
-
Define a function
map2to1
on lists of pairs that produces a list of ints:def map2to1(op: (Int, Int) => Int)(l: IntIntList): IntList = ???
src/main/scala/patmat/IntListOps.scala
-
Rewrite
zipWith
using (some of) the functions above.def zipThenWith(l1: IntList, l2: IntList, op: (Int, Int) => Int): IntList = ???
src/main/scala/patmat/IntListOps.scala
-
Use
zip
to implement a functionmovingWindow
on lists that returns sequences of consecutive pairs in its input list. For example,movingWindow
applied to the lista b c d e
should produce the listab bc cd de
.def movingWindow(l: IntList): IntIntList = ???
src/main/scala/patmat/IntListOps.scala
-
What relation is there between
zip
andunzip
? What properties can you prove?
extractSecond
⭐️
Enumerations and case classes are often useful to distinguish multiple kinds of results. In an imperative language this would often be done with an exception: a function returns a result, and may throw various exceptions if the result cannot be computed. In Scala (and functional programming), we tend to use case classes instead, with one case per kind of result.
For example, a function that extracts the second element of a list might return a type like the following:
enum ExtractResult:
case SecondElem(i: Int)
case NotLongEnough
case EmptyList
src/main/scala/patmat/IntListOps.scala
When the input list is empty, extractSecond
returns EmptyList
;
when the input list is not long enough to have a second element (i.e. it only has one element),
extractSecond
returns NotLongEnough
;
finally, when the input list has a second element, the function returns it.
Implement extractSecond
:
def extractSecond(l: IntList): ExtractResult =
???
src/main/scala/patmat/IntListOps.scala
How would you implement extractSecond
with if
?
Jot down your implementation and compare it with the pattern-matching-based one.
Is one more readable?
Natural Numbers 🔥
There are lots of encodings for natural numbers (integers greater than or equal to zero). Today we will study one of the simplest ones:
enum UnaryNat:
case Zero
case Succ(pred: UnaryNat)
src/main/scala/patmat/UnaryNat.scala
There are two cases:
Zero
represents the zero, 0;Succ
representsn+1
given an existing natural numbern
.
One may think of a natural encoded in this way as a Matryoshka doll:
Zero
is a smallest, solid doll,Succ
wraps a doll inside a larger doll.
Then, the number represented by a UnaryNat
is the degree of nesting of the doll (how many nested dolls there are in addition to the core one). For example, 2
is represented as Succ(Succ(Zero))
.
What is the representation of 5?
Answer
Succ(Succ(Succ(Succ(Succ(Zero)))))
This is the second time that we mention Matryoshka dolls in this class. The first one was for lists… can you see a connection between lists and numbers encoded in this way? To see it more clearly, write the type definition for IntList
s with an enum
, and compare it to the definition above. What differences are there?
Going further: what does the length
function on lists translate to on UnaryNat
s?
Answer
UnaryNat
s are just like lists, but without the data. Zero
is like the Nil
case and Succ
the Cons
case, with pred
being the “tail
” of the number. And the length
of a UnaryNat
is its representation as a native Int
!
Implement the following functions:
-
Conversion to/from integers:
-
Here is an implementation of
fromInt
. Is anything wrong with it?def fromInt(n: Int): UnaryNat = n match case 0 => Zero case _ => Succ(fromInt(n - 1))
src/main/scala/patmat/UnaryNatOps.scala
-
Implement
toInt
.def toInt(n: UnaryNat): Int = ???
src/main/scala/patmat/UnaryNatOps.scala
-
-
Addition between
UnaryNat
sdef add(n: UnaryNat, m: UnaryNat): UnaryNat = ???
src/main/scala/patmat/UnaryNatOps.scala
-
Multiplication between
UnaryNat
s; for this, we have already provided you with an erroneous implementation: help us fix it… or write your own from scratch!def multiply(n: UnaryNat, m: UnaryNat): UnaryNat = m match case Zero => n case Succ(m0) => add(n, multiply(n, m0))
src/main/scala/patmat/UnaryNatOps.scala
- Run the test (
testOnly -- "*multiply*"
) to find a failing test case. Use it to pinpoint the problem. - Use
println
to inspect how the function is recursively invoked.
- Run the test (
-
Subtraction between
UnaryNat
s, saturating at zero (this means thatminus(x, y) == max(0, x - y)
). For instance,minus(Succ(Zero), Succ(Succ(Zero))) == Zero
.def minus(n: UnaryNat, m: UnaryNat): UnaryNat = ???
src/main/scala/patmat/UnaryNatOps.scala
-
Parity checks:
def isEven(n: UnaryNat): Boolean = ???
src/main/scala/patmat/UnaryNatOps.scala
def isOdd(n: UnaryNat): Boolean = ???
src/main/scala/patmat/UnaryNatOps.scala
Bonus: How would you implement
isEven
andisOdd
withif
? Compare the pattern-match-based and if-based implementations: which ones look better to you in terms of conciseness and readability?
To test your answers on your computer, run testOnly UnaryNatOpsTest
.
Contexts as Binary Search Trees
The contexts that we have seen up to this point are a bit inefficient: lookups take $\mathrm{O}(n)$, where $n$ is the number of bindings. Let’s do better. We want a data structure that allows us to do fast insertions, retrievals (lookups), and removals.
- Think about the requirements above. Are some of these operations already fast with function contexts? How about with linked-list contexts?
In this exercise, we will explore “binary search trees”, a kind of tree data structure that exploits the fact that keys can be ordered (sorted) to distribute the names into a sorted data structure in which lookups are more efficient (especially when compared to searching through every element in a linked list!).
In general, binary search trees (BSTs) can store sets of any types of values, or mappings from any types of keys to values. In this exercise we consider BSTs that associate strings to integers (like our previous contexts).
Definition
So, what are BSTs? Let’s see some examples!
-
The following tree stores the bindings
"a" → 1
,"b" → 2
and"c" → 0
. (We use"a" → 1
to mean a binding that associates"a"
(a key) with1
(a value)).Each box denotes a node, whose left part is the key and right part the value.
-
This one stores
"a" → 1
,"b" → 3
,"c" → 2
,"d" → 4
:
Look carefully at these examples. What properties (“invariants”) do their nodes have? Do you see a common pattern?
Show the answer
BST keys are ordered: a node with key k
and two children left
and right
has the BST property if all keys in its left
subtree are less than k
, and all keys in its right
subtree are greater than k
. Notice that there are no constraints on the values: in the examples above, the strings and are ordered but the values are not.
Construct a BST that stores "a" → 1
, "b" → 3
, "c" → 2
, "d" → 4
and "e" → 0
.
Beware: The answer is not unique: any tree that conforms to the principle and stores these bindings works.
Reference tree
Here is the Scala enum
that we’ll use to represent BSTs:
enum BSTContext:
case Leaf
case Branch(k: String, v: Int, l: BSTContext, r: BSTContext)
src/main/scala/patmat/BST.scala
It closely mirrors the structure of the trees pictured above:
- A
Branch
is a node, with thek
-v
pair and two subtrees. Leaf
represents missing subtrees. They are not drawn on the diagrams above: when a subtree is missing in the diagram, there is implicitly a leaf in its position.
Compare this type to the EnumContext that we saw previously. How does it differ?
Basic operations
- Implement
lookup
on BST contexts. If a branch’s root does not match the key that you are looking for, how can you (quickly) determine whether to explore the left or right subtree?
def lookup(bst: BSTContext, key: String): LookupResult =
???
src/main/scala/patmat/BSTOps.scala
- Implement
insert
:
def insert(bst: BSTContext, key: String, v: Int): BSTContext =
???
src/main/scala/patmat/BSTOps.scala
To get you a feeling of why looking up bindings in a BST is fast, consider the following questions:
-
In the worst case, how many steps it will take to lookup a BST of depth $d$? (The depth of a BST, or its height, is the length of the longest path from the root to a leaf.)
Answer
$d$.
-
How many bindings can a BST of depth $d$ store?
Answer
$2^d-1$.
-
If we were to store that many bindings in a context-like structure, what would be the worst-case lookup time?
answer
$O(2^d-1)$.
Rotations
The performance of lookup
in a BST context depends on how deep the node containing the corresponding key is: in the worst case (when the node is the furthest one from the root), a lookup can take time proportional to the depth of the tree.
This is bad! There is nothing in our implementation above that guarantees that a BST will be balanced, so we have no guarantees that the data will be evenly packed together: we may get some very long paths, and some very short ones. In fact, let’s look at an example:
- Draw the tree that results from calling
insert
with the following bindings, in order:"a" → 1
,"b" → 2
,"c" → 1
,"d" → 2
,"e" → 0
,"f" → 5
. What shape do you get?
It would be much preferable to produce balanced BSTs. A full balanced BST is beyond the scope of this exercise, but we can look at the most important operation: rotations.
The following diagram illustrates two rotations (left and right), with the root node colored in blue:
Take the left rotation as an example: it transforms an input tree in the form of the left hand side into the right hand side. The tree root changes from k1
to k2
.
- Why do rotations preserve the BST property?
- What does the tree look like if we right-rotate the aforementioned degenerated BST?
- What if we right-rotate twice?
Most self-balancing BSTs use rotations to preserve balance across insertions (famous examples include AVL trees and Red-black trees).
You should now be ready to implement rotations:
def rotateLeft(tree: BSTContext): BSTContext =
???
src/main/scala/patmat/BSTOps.scala
def rotateRight(tree: BSTContext): BSTContext =
???
src/main/scala/patmat/BSTOps.scala
Can you implement rotateLeft
with if
s instead of pattern matching?
Which version is more readable?
Contexts as Functions 🔥
Previously, we have seen how to represent contexts as Scala enums and as binary trees. Now you will try another alternative: representing them with functions!
type Context =
String => LookupResult
src/main/scala/patmat/FuncContext.scala
This exercise takes a lot of thinking, so we recommend pondering with other students, and reading the hint below.
Hint
It may seem odd to represent data with functions. To understand better what this means, you can think of it this way: what we are doing is reducing a type (contexts) to its most fundamental operation (lookups). This becomes a bit more obvious when we think of the lookup
method:
lookup
has type(Context, String) => LookupResult
.- If we curried it, it would have type
Context => String => LookupResult
- Now if we partially apply it, we get a new function
lookupCurried(ctx)
of typeString => LookupResult
This last type is precisely what we call a “context” in this exercise.
Implement the following four functions for function-based contexts:
def empty: Context =
???
src/main/scala/patmat/FuncContext.scala
def cons(name: String, value: Int, rem: Context): Context =
???
src/main/scala/patmat/FuncContext.scala
def lookup(ctx: Context, name: String): LookupResult =
???
src/main/scala/patmat/FuncContext.scala
def erase(ctx: Context, name: String): Context =
???
src/main/scala/patmat/FuncContext.scala
Want to test your code? Run testOnly FuncContextTest
in sbt
.
The following additional questions can help further your understanding:
- Compare the efficiency of
cons
,lookup
anderase
for enum-based and function-based contexts. When would you use one, and when would you use the other? - Are there operations that you can implement on function contexts but not enum contexts?
- Can you convert a function context into an enum context? How about the reverse?
- What happens when you
println
an enum-based context? (Try in the playground!)