Week 5 Debrief: anagrams tips, unit tests versus integration tests, and VSCode scrollback buffers
Congrats on completing your fifth week of CS-214! Here is a round-up of interesting questions, tips for exercises and labs, and general notes about the course.
Administrivia
- The unguided callback starts this week!
  - Read the instructions.
  - Submit your team on Moodle by Fri, Oct 18, 23:00.
- We have posted a study guide to help you prepare for the midterm.
- We are still working through your feedback. Thanks a lot for helping improve this course through our internal polls and through the retour indicatif!
Interesting Ed questions
- Scala
  - Explaining the interaction of for and assignments
  - Should I try to always use tail recursion? (tl;dr: No! If you are not recursing deeply, then tail recursion will often be an unnecessary complication.)
- What are the subsets of multiset List((a, 2), (b, 2), (c, 2))?
Reading tests
It can be overwhelming when SBT prints a long list of failing tests. In general, we recommend the following process:
- Start by testing each of your functions on small inputs, in a worksheet. (If you have trouble configuring worksheets, ask us in help sessions!)
- Once you’re confident that your code is correct, locate the corresponding tests:
  - Open the test suite files under src/test/scala/.
  - Identify the ones that use your function, and try to get a sense of what they do.
- Run only the relevant tests, using testOnly. (Reminder about how to use testOnly.)
- If a test fails, clear your terminal, then rerun just that one, using testOnly.
- If it’s not clear what’s happening, copy the test to your worksheet, and experiment there.
Unit tests versus integration tests
Unit tests test individual functions in isolation, whereas integration tests check whether functions work well together. Consider the following test in anagrams, for example:
test(f"sentenceOccurrences: $input (${points}pts)"):
  assertEquals(sentenceOccurrences(List(input)), CharMultiSet.from(normalizeString(input)))
This test checks sentenceOccurrences and CharMultiSet.from together. It’s unavoidable: since we let you choose your representation for the MultiSet, we can’t hardcode an answer; we have to rely on your implementation of from. But as a result, if your from is incorrect, then the “expected” result printed by sbt test can be wrong.
We recommend that you start by testing each function in isolation, using a worksheet or your own unit tests.
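To see why an “expected” value computed by your own code can mislead, here is a minimal sketch. The list-of-pairs representation and the deliberately buggy buggyFrom function are assumptions made up for this illustration; they are not the lab's reference code.

```scala
// Hypothetical representation, chosen only for this sketch.
type CharMultiSet = List[(Char, Int)]

// A deliberately buggy stand-in for `from`: it forgets to count repetitions
// and gives every character a multiplicity of 1.
def buggyFrom(s: String): CharMultiSet =
  s.toList.distinct.sorted.map(c => (c, 1))

// Unit-style expectation: written by hand, independent of our own code.
val expectedByHand: CharMultiSet = List(('a', 2), ('b', 1))

@main def expectedValueDemo(): Unit =
  // Integration-style expectation: computed by `buggyFrom`, so it inherits the bug.
  val expectedFromImplementation = buggyFrom("aab")
  println(expectedFromImplementation) // prints List((a,1), (b,1))
  // Only the hand-written expectation exposes the bug.
  assert(expectedFromImplementation != expectedByHand)
```

A check against a hand-written value like expectedByHand is the kind of unit test you can run in a worksheet before turning to the provided test suite.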
Computing subsets
A tricky part of the anagrams lab is the subsets function. To solve it:
- Make sure you understand the definition: m1 is a subset of m2 if all elements of m1 are in m2 with larger or equal multiplicities.
  - Is m1 = List(('a', 1)) a subset of m2 = List(('a', 3), ('b', 4))?
  - Is m1 = List(('a', 5), ('b', 3)) a subset of m2 = List(('a', 3), ('b', 4))?
  - Is m1 = List(('a', 1), ('b', 4), ('c', 2)) a subset of m2 = List(('a', 3), ('b', 4))?

  Solution
  - Yes.
  - No (multiplicity of a too high).
  - No (c not in m2).
- Consider how you might enumerate subsets of a given multiset. What are the subsets of List(('a', 3))?

  Solution
  They are List(), List(('a', 1)), List(('a', 2)), List(('a', 3)).
- Assume your multiset is ('a', 3) :: rest. How can you combine the subsets of List(('a', 3)) and those of rest?
This last question should lead you to writing the complete comprehension.
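The first two steps can be checked mechanically. The sketch below assumes a multiset is a list of (element, multiplicity) pairs, as in the examples above; the names isSubset and singleElementSubsets are hypothetical helpers introduced for this illustration, and your own representation may differ. The combination step from the last question is deliberately left out.

```scala
// Assumption: a multiset of characters is a list of (element, multiplicity)
// pairs, as in the examples above. Your own representation may differ.
type MultiSet = List[(Char, Int)]

// m1 is a subset of m2 if every element of m1 occurs in m2
// with a larger or equal multiplicity.
def isSubset(m1: MultiSet, m2: MultiSet): Boolean =
  val available = m2.toMap // element -> multiplicity in m2
  m1.forall((c, n) => available.getOrElse(c, 0) >= n)

// Subsets of a multiset with a single element c of multiplicity n:
// take c zero times (the empty multiset), then once, twice, ..., n times.
def singleElementSubsets(c: Char, n: Int): List[MultiSet] =
  Nil :: (1 to n).toList.map(k => List((c, k)))

@main def subsetsDemo(): Unit =
  val m2: MultiSet = List(('a', 3), ('b', 4))
  // The three questions from the first step.
  assert(isSubset(List(('a', 1)), m2))
  assert(!isSubset(List(('a', 5), ('b', 3)), m2))
  assert(!isSubset(List(('a', 1), ('b', 4), ('c', 2)), m2))
  // The question from the second step.
  val expected = List(Nil, List(('a', 1)), List(('a', 2)), List(('a', 3)))
  assert(singleElementSubsets('a', 3) == expected)
```

Experimenting with small inputs like these in a worksheet is a good way to build intuition before writing the full comprehension.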
Dealing with long error messages in VSCode
The terminal in VSCode has a limited “scrollback buffer”. That means that if a command produces too much output (e.g. if sbt prints a long stack trace, maybe due to a stack overflow), VSCode will start discarding the earlier parts of the output as new output arrives.
One solution to this problem is to simply increase the number of lines VSCode stores. This can be done by changing the “Terminal › Integrated: Scrollback” option (terminal.integrated.scrollback) in VSCode’s settings.
Alternatively, you can save a command’s output to a file. For example, in Bash, the following will write the output of sbt test to test_output.txt (replacing any previous contents):
$ sbt test > test_output.txt
Use the tee command (Tee-Object in PowerShell on Windows) if you want to show the output and write it to a file at the same time:
$ sbt test | tee test_output.txt
Help, my test is running out of memory!
The anagrams.MultiSetTest.subsets: (500a, 500b) (3pts) test should be rather quick if your implementation is correct. If it’s not, however, then our testing framework will try to print a pretty message, which won’t work well: computing a nice diff for such a large set can be very slow. If you run into this problem, change this:
def testSubsets[A](message: String, points: Int, result: MultiSet[A], expected: Seq[MultiSet[A]]) =
  test(f"subsets: $message (${points}pts)"):
    assertEquals(result.subsets.toSet, expected.toSet)
to this:
def testSubsets[A](message: String, points: Int, result: MultiSet[A], expected: Seq[MultiSet[A]]) =
  test(f"subsets: $message (${points}pts)"):
    assert(result.subsets.toSet == expected.toSet)