Week 5 Debrief: anagrams tips, unit tests versus integration tests, and VSCode scrollback buffers

Congrats on completing your fifth week of CS-214! Here is a round-up of interesting questions, tips for exercises and labs, and general notes about the course.

Administrivia

Interesting Ed questions

Reading tests

It can be overwhelming when sbt prints a long list of failing tests. In general, we recommend the following process:

  1. Start by testing each of your functions on small inputs, in a worksheet. (If you have trouble configuring worksheets, ask us in help sessions!)
  2. Once you’re confident that your code is correct, locate the corresponding tests:
    1. Open the test suite files under src/test/scala/.
    2. Identify the ones that use your function, and try to get a sense of what they do.
  3. Run only the relevant tests, using testOnly. (Reminder about how to use testOnly.)
  4. If a test fails, clear your terminal, then rerun just that one, using testOnly.
  5. If it’s not clear what’s happening, copy the test to your worksheet, and experiment there.

Unit tests versus integration tests

Unit tests test individual functions in isolation, whereas integration tests check whether functions work well together. Consider the following test in anagrams, for example:

test(f"sentenceOccurrences: $input (${points}pts)"):
  assertEquals(sentenceOccurrences(List(input)), CharMultiSet.from(normalizeString(input)))

This test exercises sentenceOccurrences and CharMultiSet.from together. That is unavoidable: since we let you choose your representation for the MultiSet, we can’t hardcode an expected answer and have to rely on your implementation of from. As a result, if your from is incorrect, the “expected” result printed by sbt test can be wrong too.

We recommend that you start by testing each function in isolation, using a worksheet or your own unit tests.
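To make the difference concrete, here is a minimal sketch under stated assumptions. The list-of-pairs representation and the names Occurrences, fromString and sentenceCounts are hypothetical stand-ins chosen for illustration; they are not the lab's actual definitions. The first check hardcodes its expected value, while the second computes it with the very function it depends on.

```scala
// Hypothetical stand-ins for illustration only, NOT the lab's definitions.
type Occurrences = List[(Char, Int)]

// Stand-in for a `from`-style constructor: counts each character
// and returns the counts sorted by character.
def fromString(s: String): Occurrences =
  s.toList.groupBy(identity).map((c, cs) => (c, cs.length)).toList.sortBy(_._1)

// Stand-in for a function built on top of `fromString`.
def sentenceCounts(sentence: List[String]): Occurrences =
  fromString(sentence.mkString)

@main def unitVersusIntegration(): Unit =
  // Unit test: the expected value is written out by hand,
  // so a bug in `fromString` cannot hide.
  assert(fromString("abba") == List(('a', 2), ('b', 2)))

  // Integration-style test: the expected value is computed with `fromString`,
  // so a bug in `fromString` affects both sides of the comparison.
  assert(sentenceCounts(List("ab", "ba")) == fromString("abba"))
```

If fromString were wrong, the second check could still pass, or fail with a misleading expected value; only the first one pins the behavior down independently.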

Computing subsets

A tricky part of the anagrams lab is the subsets function. To solve it:

  1. Make sure you understand the definition: m1 is a subset of m2 if every element of m1 also appears in m2, with a multiplicity in m2 at least as large as in m1.

    1. Is m1 = List(('a', 1)) a subset of m2 = List(('a', 3), ('b', 4))?
    2. Is m1 = List(('a', 5), ('b', 3)) a subset of m2 = List(('a', 3), ('b', 4))?
    3. Is m1 = List(('a', 1), ('b', 4), ('c', 2)) a subset of m2 = List(('a', 3), ('b', 4))?
    Solution
    1. Yes.
    2. No (multiplicity of a too high).
    3. No (c not in m2).
  2. Consider how you might enumerate subsets of a given multiset. What are the subsets of List(('a', 3))?

    Solution

    They are List(), List(('a', 1)), List(('a', 2)), List(('a', 3)).

  3. Assume your multiset is ('a', 3) :: rest. How can you combine the subsets of List(('a', 3)) and those of rest?
    This last question should lead you to write the complete comprehension.
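The definition in step 1 can be checked mechanically. The sketch below assumes the list-of-pairs representation used in the questions above, with positive multiplicities; the names Occurrences and isSubset are hypothetical and not part of the lab. It only tests the subset relation and does not enumerate subsets, which remains the exercise.

```scala
// Hypothetical helper for illustration; assumes a list-of-pairs representation
// in which every multiplicity is positive.
type Occurrences = List[(Char, Int)]

// m1 is a subset of m2 if every character of m1 appears in m2
// with a multiplicity at least as large.
def isSubset(m1: Occurrences, m2: Occurrences): Boolean =
  val counts = m2.toMap
  m1.forall((c, n) => counts.getOrElse(c, 0) >= n)

@main def checkSubsetQuiz(): Unit =
  val m2 = List(('a', 3), ('b', 4))
  println(isSubset(List(('a', 1)), m2))                     // true
  println(isSubset(List(('a', 5), ('b', 3)), m2))           // false: too many 'a'
  println(isSubset(List(('a', 1), ('b', 4), ('c', 2)), m2)) // false: 'c' not in m2
```

A check like this can also serve as a sanity test in a worksheet: every multiset returned by your subsets function should satisfy it with respect to the original multiset.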

Dealing with long error messages in VSCode

The terminal in VSCode has a limited “scrollback buffer”. That means that if a command produces too much output (e.g. if sbt prints a long stack trace, maybe due to a stack overflow), VSCode will start discarding the earlier parts of the output as new output arrives.

One solution to this problem is to increase the number of lines VSCode stores. This can be done by changing the “Terminal › Integrated: Scrollback” option (terminal.integrated.scrollback) in VSCode’s settings:

Integrated Scrollback option

Alternatively, you can save a command’s output to a file. For example, in Bash, the following writes the standard output of sbt test to test_output.txt (replacing any previous contents):

$ sbt test > test_output.txt

Use the tee command (Tee-Object in PowerShell on Windows) if you want to show the output and write it to a file at the same time:

$ sbt test | tee test_output.txt

Help, my test is running out of memory!

The anagrams.MultiSetTest.subsets: (500a, 500b) (3pts) test should be rather quick if your implementation is correct. If it is not, however, our testing framework tries to print a readable diff between the actual and expected results, and computing that diff for such a large set can be very slow. If you run into this problem, change this:

def testSubsets[A](message: String, points: Int, result: MultiSet[A], expected: Seq[MultiSet[A]]) =
  test(f"subsets: $message (${points}pts)"):
    assertEquals(result.subsets.toSet, expected.toSet)

to this (a plain assert only reports that the comparison failed, without computing a diff):

def testSubsets[A](message: String, points: Int, result: MultiSet[A], expected: Seq[MultiSet[A]]) =
  test(f"subsets: $message (${points}pts)"):
    assert(result.subsets.toSet == expected.toSet)