Last updated on

2024-09-15 debrief: Scala, recursion exercises, and find lab

Congratulations on completing your first week of CS-214! Here is a round-up of interesting questions, tips for exercises and labs, and general notes about the course.

But first: thanks to all of you who reported typos in the labs and exercise to help us improve them!

Administrivia

Interesting links and Ed questions

A tip for find: you can use if inside val!

Remember that, in Scala, if is an expression. This means that you can use it in all contexts:

(if b1 then 1 else 2) + (if b2 then 10 else 20)

find.worksheet.sc

val result =
  if x == y then
    println("They are equal!")
    true
  else false
println(result)

find.worksheet.sc

How can the Scala compiler be written in Scala?

We saw on Wednesday that the Scala compiler is written in Scala. How is that possible? It’s thanks to a process called “bootstrapping”. Take a minute to think about it before reading on!

Spoilers!

In short, to write a Scala compiler in Scala, we:

  1. First, we write a compiler for a very small subset of Scala (call it nanoScala), in another language (for example, Java or C).
  2. Then, we write a compiler for nanoScala in nanoScala itself. Using the compiler in the previous step, we can now compile this compiler. Additionally, we can now use this compiled compiler to compile itself. At this point, we can stop using Java or C — it’s all nanoScala.
  3. Then, we start adding features to our nanoScala. We modify the compiler nanoScala compiler to accept a larger language, let’s say µScala. Since this compiler is still written in nanoScala, we can use the binary from the previous step to compile it. At this point we have a compiler that can compile µScala, so we can start using µScala features in our compiler… which is now written in µScala.
  4. Rinse and repeat, until we have implemented all features of Scala!

This is exactly how the Scala 3 compiler was written: from scratch, starting with a Scala 2 codebase.

Having a compiler compile itself leads to all sorts of fascinating questions and challenges, not least of which is the questions of trusting trust.

And it’s not just compilers! Build systems also have this chicken-and-egg problem. SBT is written in Scala: how is it compiled?

More on bootstrapping

Bootstrapping is amazing (we get to write the compiler for our awesome language, in our awesome language!), but it does come with downsides…

  • We can’t just keep the last compiler: we have to maintain the source code of all of the intermediate compilers, to be able to repeat the process. Another win for version control systems!

  • Relying on a bootstrapped binary compiler is a security issue: unless we have the whole chain of intermediate compilers and we rebuild it from scratch, how do we know what the compiler even does? This is the problem of trusting trust, linked to above. Fortunately, there are and mitigations.

Debugging with the substitution model

The substitution model that we studied in class is a great debugging tool. As an example, let’s try to walk through one of the problematic implementations of collectEven in recursion exercise.

Being a good CS-214 student, we’ve thought about the approach on paper. We’ve written down some pseudocode for what we want to do, given a list l (unfortunately, that pseudocode is wrong!):

  1. If l is empty, there’s nothing to do, we just return an empty list.
  2. If l is non-empty, we consider its head element. a. If the head is odd, we can collect even numbers from the tail, and then add back the head with IntCons. b. Otherwise, we can just call ourselves on the same list.

So we write some code:

def collectEven0(l: IntList): IntList =
  if l.isEmpty then IntNil()
  else
    if l.head % 1 == 0 then IntCons(l.head, collectEven0(l.tail))
    collectEven0(l)

collectEven.worksheet.sc

Unfortunately this implementation doesn’t pass the tests. In fact, it even hangs forever! What to do?

Let’s first skim through the code to check for easy typos. Here, we want to check whether l.head is even, but mistakenly tried to find the remainder modulo 1 instead of 2! Nice catch, fixed:

def collectEven1(l: IntList): IntList =
  if l.isEmpty then IntNil()
  else
    if l.head % 2 == 0 then IntCons(l.head, collectEven1(l.tail))
    collectEven1(l)

collectEven.worksheet.sc

Unfortunately, that’s not enough: the code still loops forever… and we can’t spot anything obvious. Time to use the substitution model to help us understand what’s going on. Let’s pick an example, say IntCons(2, IntCons(3, IntNil())). We’ll write it as [2 3] for succinctness. Let’s write down every step of our algorithm.

Going through the steps...
  1. collectEven([2 3])
  2. Substitute the function body…
    if [2 3] is empty then IntNil()
    else
      if [2 3].head is even
      then IntCons([2 3].head, collectEven([2 3].tail))
      else collectEven([2 3])
    
  3. [2 3] is not empty, so we can reduce the if expression by the else branch
    if [2 3].head is even
    then IntCons([2 3].head, collectEven([2 3].tail))
    else collectEven([2 3])
    
  4. [2 3].head is just 2, which is even, so we reduce to the then branch…
    IntCons(2, collectEven([2 3].tail))
    
  5. [2 3].tail is [3]:
    IntCons(2, collectEven([3]))
    
  6. Let’s substitute the function body again.
    IntCons(2, {
      if [3] is empty then IntNil()
      else
        if [3].head is even
        then IntCons([3].head, collectEven([3].tail))
        else collectEven([3])
    })
    
  7. [3] is not empty, reduce if.
    IntCons(2, {
      if [3].head is even
      then IntCons([3].head, collectEven([3].tail))
      else collectEven([3])
    })
    
  8. [3].head is 3, which is odd, so we reduce to the else branch…
    IntCons(2, collectEven([3]))
    

    … wait, we’ve seen this before!

After a few steps, we run into an infinite loop. Looking at our hand written algorithm again, we see that in the odd number case, we should just drop the head, and recursively continue with just the tail. Making some fixes…

def collectEven2(l: IntList): IntList =
  if l.isEmpty then IntNil()
  else
    if l.head % 2 == 0 then IntCons(l.head, collectEven2(l.tail))
    collectEven2(l.tail)

collectEven.worksheet.sc

Good news: The tests finish! But they are still not passing… Looking at the tests shows stat with input [2 3] we return [], an empty list.

Let’s try the substitution method again, with the same input.

Going through the steps...
  1. collectEven([2 3])
  2. Substitute the function body…
    if [2 3] is empty then IntNil()
    else
      if [2 3].head is even
      then IntCons([2 3].head, collectEven([2 3].tail))
      else collectEven([2 3].tail)
    
  3. [2 3] is not empty, so we can reduce the if expression by the else branch
    if [2 3].head is even
    then IntCons([2 3].head, collectEven([2 3].tail))
    else collectEven([2 3].tail)
    
  4. [2 3].head is just 2, which is even, so we reduce to the then branch…
    IntCons(2, collectEven([2 3].tail))
    
  5. [2 3].tail is [3]:
    IntCons(2, collectEven([3]))
    
  6. Let’s substitute the function body again.
    IntCons(2, {
      if [3] is empty then IntNil()
      else
        if [3].head is even
        then IntCons([3].head, collectEven([3].tail))
        else collectEven([3].tail)
    })
    
  7. [3] is not empty, reduce if.
    IntCons(2, {
      if [3].head is even
      then IntCons([3].head, collectEven([3].tail))
      else collectEven([3].tail)
    })
    
  8. [3].head is 3, which is odd, so we reduce to the else branch…
    IntCons(2, collectEven([3].tail))
    
  9. [3].tail is the empty list [].
    IntCons(2, collectEven([]))
    
  10. Let’s substitute the function body again.
    IntCons(2, {
      if [] is empty then IntNil()
      else
       if [].head is even
       then IntCons([].head, collectEven([].tail))
       else collectEven([].tail)
    })
    
  11. This time, [] is empty, so we have the else branch substituted:
    IntCons(2, IntNil())
    

The result is [2], which should be correct, but our Scala code is not agreeing. This means that we must have misunderstood something about Scala itself. And indeed, once we look carefully, we notice a missing else in front of collectEven!

def collectEven2(l: IntList): IntList =
  if l.isEmpty then IntNil()
  else
    if l.head % 2 == 0 then IntCons(l.head, collectEven2(l.tail))
    collectEven2(l.tail)

collectEven.worksheet.sc

In Scala, an if without an else is not an early return, so the results of the first two lines are simply ignored: we always return l.tail.

With better tests, we would have found the problem even faster: for example, collectEven(IntNil()) would have raised an exception when executing l.tail. We fix the problem, and…

def collectEven3(l: IntList): IntList =
  if l.isEmpty then IntNil()
  else
    if l.head % 2 == 0 then IntCons(l.head, collectEven3(l.tail))
    else collectEven3(l.tail)

collectEven.worksheet.sc

Hooray, our tests are passing now! 🎉 The substitution model has helped us:

We picked a difficult case here (tests were hanging!), but usually a failing test will tell you which input is failing as well.

However, try to come up with inputs on your own as well! It will help to think about edge cases (negative numbers, empty lists, …)… and you won’t always have CS-214 staff to provide you with good test cases!