Last updated on
Week 1 Debrief: Scala, recursion exercises, and find
lab
Congratulations on completing your first week of CS-214! Here is a round-up of interesting questions, tips for exercises and labs, and general notes about the course.
But first: thanks to all of you who reported typos in the labs and exercise to help us improve them!
Administrivia
-
Following a show-of-hands vote on Wednesday, the CS-214 midterm date has been set to Wednesday, November 6th. Save the date!
The time has not been set yet. Please get in touch with us on Ed if you have another conflicting midterm.
-
We have posted solutions to exercise set 1.
-
We have a troubleshooting guide for setup problems in the
degrees
lab. Start there if you run into issues! -
CS-214 labs are typically due on Fridays, and
degrees
was no exception. If you missed thedegrees
deadline, don’t panic:degrees
doesn’t count.find
(released last Wednesday) does count, however, and is due next Friday. Don’t miss that deadline!As a reminder,
degrees
covers the following:- How to use the SBT build tool and start up Visual Studio Code and the Metals extension
- How to quickly experiment with your code using a worksheet.
- How to run specific tests with SBT.
Interesting links and Ed questions
- Scala syntax
- Exercises
- Setup issues
- Git
A tip for find
: you can use if
inside val
!
Remember that, in Scala, if
is an expression. This means that you can use it in all contexts:
(if b1 then 1 else 2) + (if b2 then 10 else 20)
find.worksheet.sc
val result =
if x == y then
println("They are equal!")
true
else false
println(result)
find.worksheet.sc
How can the Scala compiler be written in Scala?
We saw on Wednesday that the Scala compiler is written in Scala. How is that possible? It’s thanks to a process called “bootstrapping”. Take a minute to think about it before reading on!
Spoilers!
In short, to write a Scala compiler in Scala, we:
- First, we write a compiler for a very small subset of Scala (call it nanoScala), in another language (for example, Java or C).
- Then, we write a compiler for nanoScala in nanoScala itself. Using the compiler in the previous step, we can now compile this compiler. Additionally, we can now use this compiled compiler to compile itself. At this point, we can stop using Java or C — it’s all nanoScala.
- Then, we start adding features to our nanoScala. We modify the compiler nanoScala compiler to accept a larger language, let’s say µScala. Since this compiler is still written in nanoScala, we can use the binary from the previous step to compile it. At this point we have a compiler that can compile µScala, so we can start using µScala features in our compiler… which is now written in µScala.
- Rinse and repeat, until we have implemented all features of Scala!
This is exactly how the Scala 3 compiler was written: from scratch, starting with a Scala 2 codebase.
Having a compiler compile itself leads to all sorts of fascinating questions and challenges, not least of which is the questions of trusting trust.
And it’s not just compilers! Build systems also have this chicken-and-egg problem. SBT is written in Scala: how is it compiled?
More on bootstrapping
Bootstrapping is amazing (we get to write the compiler for our awesome language, in our awesome language!), but it does come with downsides…
-
We can’t just keep the last compiler: we have to maintain the source code of all of the intermediate compilers, to be able to repeat the process. Another win for version control systems!
-
Relying on a bootstrapped binary compiler is a security issue: unless we have the whole chain of intermediate compilers and we rebuild it from scratch, how do we know what the compiler even does? This is the problem of trusting trust, linked to above. Fortunately, there are and mitigations.
Debugging with the substitution model
The substitution model that we studied in class is a great debugging tool.
As an example, let’s try to walk through one of the problematic implementations of collectEven
in recursion
exercise.
Being a good CS-214 student, we’ve thought about the approach on paper. We’ve written down some pseudocode for what we want to do, given a list l
(unfortunately, that pseudocode is wrong!):
- If
l
is empty, there’s nothing to do, we just return an empty list. - If
l
is non-empty, we consider its head element. a. If the head is odd, we can collect even numbers from the tail, and then add back the head withIntCons
. b. Otherwise, we can just call ourselves on the same list.
So we write some code:
def collectEven0(l: IntList): IntList =
if l.isEmpty then IntNil()
else
if l.head % 1 == 0 then IntCons(l.head, collectEven0(l.tail))
collectEven0(l)
collectEven.worksheet.sc
Unfortunately this implementation doesn’t pass the tests. In fact, it even hangs forever! What to do?
Let’s first skim through the code to check for easy typos.
Here, we want to check whether l.head
is even, but mistakenly tried to find the
remainder modulo 1 instead of 2! Nice catch, fixed:
def collectEven1(l: IntList): IntList =
if l.isEmpty then IntNil()
else
if l.head % 2 == 0 then IntCons(l.head, collectEven1(l.tail))
collectEven1(l)
collectEven.worksheet.sc
Unfortunately, that’s not enough: the code still loops forever… and we can’t spot anything obvious.
Time to use the substitution model to help us understand what’s going on.
Let’s pick an example, say IntCons(2, IntCons(3, IntNil()))
. We’ll write it as [2 3]
for succinctness. Let’s write down every step of our algorithm.
Going through the steps...
collectEven([2 3])
- Substitute the function body…
if [2 3] is empty then IntNil() else if [2 3].head is even then IntCons([2 3].head, collectEven([2 3].tail)) else collectEven([2 3])
[2 3]
is not empty, so we can reduce theif
expression by theelse
branchif [2 3].head is even then IntCons([2 3].head, collectEven([2 3].tail)) else collectEven([2 3])
[2 3].head
is just 2, which is even, so we reduce to thethen
branch…IntCons(2, collectEven([2 3].tail))
[2 3].tail
is[3]
:IntCons(2, collectEven([3]))
- Let’s substitute the function body again.
IntCons(2, { if [3] is empty then IntNil() else if [3].head is even then IntCons([3].head, collectEven([3].tail)) else collectEven([3]) })
[3]
is not empty, reduceif
.IntCons(2, { if [3].head is even then IntCons([3].head, collectEven([3].tail)) else collectEven([3]) })
[3].head
is3
, which is odd, so we reduce to theelse
branch…IntCons(2, collectEven([3]))
… wait, we’ve seen this before!
After a few steps, we run into an infinite loop. Looking at our hand written algorithm again, we see that in the odd number case, we should just drop the head, and recursively continue with just the tail. Making some fixes…
def collectEven2(l: IntList): IntList =
if l.isEmpty then IntNil()
else
if l.head % 2 == 0 then IntCons(l.head, collectEven2(l.tail))
collectEven2(l.tail)
collectEven.worksheet.sc
Good news: The tests finish! But they are still not passing…
Looking at the tests shows stat with input [2 3]
we return []
, an empty list.
Let’s try the substitution method again, with the same input.
Going through the steps...
collectEven([2 3])
- Substitute the function body…
if [2 3] is empty then IntNil() else if [2 3].head is even then IntCons([2 3].head, collectEven([2 3].tail)) else collectEven([2 3].tail)
[2 3]
is not empty, so we can reduce theif
expression by theelse
branchif [2 3].head is even then IntCons([2 3].head, collectEven([2 3].tail)) else collectEven([2 3].tail)
[2 3].head
is just 2, which is even, so we reduce to thethen
branch…IntCons(2, collectEven([2 3].tail))
[2 3].tail
is[3]
:IntCons(2, collectEven([3]))
- Let’s substitute the function body again.
IntCons(2, { if [3] is empty then IntNil() else if [3].head is even then IntCons([3].head, collectEven([3].tail)) else collectEven([3].tail) })
[3]
is not empty, reduceif
.IntCons(2, { if [3].head is even then IntCons([3].head, collectEven([3].tail)) else collectEven([3].tail) })
[3].head
is3
, which is odd, so we reduce to theelse
branch…IntCons(2, collectEven([3].tail))
[3].tail
is the empty list[]
.IntCons(2, collectEven([]))
- Let’s substitute the function body again.
IntCons(2, { if [] is empty then IntNil() else if [].head is even then IntCons([].head, collectEven([].tail)) else collectEven([].tail) })
- This time,
[]
is empty, so we have theelse
branch substituted:IntCons(2, IntNil())
The result is [2]
, which should be correct, but our Scala code is not agreeing. This means that we must have misunderstood something about Scala itself. And indeed, once we look carefully, we notice a missing else
in front of collectEven
!
def collectEven2(l: IntList): IntList =
if l.isEmpty then IntNil()
else
if l.head % 2 == 0 then IntCons(l.head, collectEven2(l.tail))
collectEven2(l.tail)
collectEven.worksheet.sc
In Scala, an if
without an else
is not an early return
, so the results of the first two lines are simply ignored: we always return l.tail
.
With better tests, we would have found the problem even faster: for example, collectEven(IntNil())
would have raised an exception when executing l.tail
. We fix the problem, and…
def collectEven3(l: IntList): IntList =
if l.isEmpty then IntNil()
else
if l.head % 2 == 0 then IntCons(l.head, collectEven3(l.tail))
else collectEven3(l.tail)
collectEven.worksheet.sc
Hooray, our tests are passing now! 🎉 The substitution model has helped us:
- Spot the bug in our thinking, and
- Spot the bug in our understanding of Scala syntax!
We picked a difficult case here (tests were hanging!), but usually a failing test will tell you which input is failing as well.
However, try to come up with inputs on your own as well! It will help to think about edge cases (negative numbers, empty lists, …)… and you won’t always have CS-214 staff to provide you with good test cases!