Last updated on

Week 1: Structural Recursion and Version Control

Welcome to the first exercise session of CS-214, Software Construction!

This exercise set will help you practice writing Scala code, get used to some functional programming style, use recursion, and get familiar with version control concepts and with Git.

Exercise sessions are meant to be done together in groups, on whiteboards, on blackboards, or on paper. Read more in the course policies. The second part of this exercise set is one of the few exceptions to the rule, as Git exercises are better done in a terminal… but you are still encouraged to do them in groups!

Exercises marked with ⭐️ are the most important ones. Exercises marked with 🔥 are the most challenging ones. Exercises or questions marked 🧪 are intended to build up to concepts used in this week’s lab.

You do not need to complete all exercises to succeed in this class, and you do not need to do all exercises in the order they are written.

In the following weeks, we will encourage you to do the exercises before the lab. However, as this is your first week in CS-214, make sure to:

  1. Go through the Tools Setup to install everything you need.
  2. Do the degrees lab, as a tutorial on how to use SBT, VSCode and Metals.
  3. Only then, do the exercises.

Structural Recursion

For each exercise below, we provide the signature of the functions you need to write. The body is replaced with ???: that’s the part you need to come up with! We strongly recommend that you solve these exercises on paper first.

We strongly encourage you to solve the exercises on paper first, in groups. After completing a first draft on paper, you may want to check your solutions on your computer. To do so, you can clone the course exercises repository.

Recursion on lists

In this exercise, you will write recursive functions that operate on lists. These lists are represented as instances of the IntList class. An IntList can be either:

Hence, the list 1, 2 is represented by IntCons(1, IntCons(2, IntNil())).

IntList objects have three operations — .head, .tail, and .isEmpty — demonstrated in the following REPL session:

scala> import recursion.*

scala> IntCons(2, IntNil()).head
val res0: Int = 2

scala> IntCons(2, IntNil()).tail
val res1: recursion.IntList = IntNil()

scala> val oneTwo = IntCons(1, IntCons(2, IntNil()))
val oneTwo: recursion.IntCons = IntCons(1,IntCons(2,IntNil()))

scala> oneTwo.isEmpty
val res2: Boolean = false

scala> oneTwo.head
val res3: Int = 1

scala> oneTwo.tail
val res4: recursion.IntList = IntCons(2,IntNil())

scala> oneTwo.tail.isEmpty
val res5: Boolean = false

scala> oneTwo.tail.head
val res6: Int = 2

scala> oneTwo.tail.tail
val res7: recursion.IntList = IntNil()

scala> oneTwo.tail.tail.isEmpty
val res8: Boolean = true

These three operations are all that you need (in addition to recursion) to solve all of the exercises below.

In all of the exercises below, if you’re not sure where to start, ask yourself: what should my function return when given an empty list? What about a non-empty list?

length ⭐️

Implement a function that computes the length of an IntList:

def length(l: IntList): Int =
  ???

recursion/src/main/scala/recursion/listOps.scala

A linked list is a bit like a Matryoshka doll (called a “poupée russe” in French): you have to open the larger one to access the smaller ones inside. So ask yourself: how would you count how many nested dolls a large Matryoshka doll contains? Can you translate this process into code?

In class, you learned the substitution model. On paper, try evaluating the expression length(IntCons(1, IntCons(2, IntCons(3, IntNil())))) using the substitution model. Use this for inspiration to implement length.

def length(l: IntList): Int =
  if l.isEmpty then 0
  else 1 + length(l.tail)

recursion/src/main/scala/recursion/listOps.scala

allPositiveOrZero ⭐️

Implement a function that determines if all values in an IntList are positive or zero:

def allPositiveOrZero(l: IntList): Boolean =
  ???

recursion/src/main/scala/recursion/listOps.scala

What should this function return for an empty list? Why?

def allPositiveOrZero(l: IntList): Boolean =
  if l.isEmpty then true
  else l.head >= 0 && allPositiveOrZero(l.tail)

recursion/src/main/scala/recursion/listOps.scala

countPositive

Implement a function that counts the number of positive values in an IntList:

def countPositive(l: IntList): Int =
  ???

recursion/src/main/scala/recursion/listOps.scala

def countPositive(l: IntList): Int =
  if l.isEmpty then 0
  else (if l.head > 0 then 1 else 0) + countPositive(l.tail)

recursion/src/main/scala/recursion/listOps.scala

sum

Implement a function that computes the sum of all elements in an IntList:

def sum(l: IntList): Int =
  ???

recursion/src/main/scala/recursion/listOps.scala

def sum(l: IntList): Int =
  if l.isEmpty then 0
  else l.head + sum(l.tail)

recursion/src/main/scala/recursion/listOps.scala

product

Implement a function that computes the product of all elements in an IntList:

def product(l: IntList): Int =
  ???

recursion/src/main/scala/recursion/listOps.scala

There are multiple reasonable choices for empty lists, but one leads to a simpler function. Can you figure out which?

def product(l: IntList): Int =
  if l.isEmpty then 1
  else l.head * product(l.tail)

recursion/src/main/scala/recursion/listOps.scala

anyOdd

Implement a function that determines if any value in an IntList is odd:

def anyOdd(l: IntList): Boolean =
  ???

recursion/src/main/scala/recursion/listOps.scala

You can use x % 2 == 0 to check whether a number is even, but does it work to use x % 2 == 1 to check whether a number is odd?

What should this function return for an empty list? Why?

def anyOdd(l: IntList): Boolean =
  if l.isEmpty then false
  else (l.head % 2 != 0) || anyOdd(l.tail)

recursion/src/main/scala/recursion/listOps.scala

decrement

Implement a function that creates a new list whose values correspond to the ones in the original list, decremented by one:

def decrement(l: IntList): IntList =
  ???

recursion/src/main/scala/recursion/listOps.scala

Hint

This exercise requires you to construct a new list! Use the IntNil() and IntCons() constructors for that purpose, and follow the structure of the original list.

Hint

Think of the structure of the computation: how would it look with Matryoshka dolls? In particular, think of what you need to do if you are given an empty list, and separately of what to do if you are given a non-empty list.

def decrement(l: IntList): IntList =
  if l.isEmpty then IntNil()
  else IntCons(l.head - 1, decrement(l.tail))

recursion/src/main/scala/recursion/listOps.scala

collectEven ⭐️

Write a function that collects all even values from an IntList:

def collectEven(l: IntList): IntList =
  ???

recursion/src/main/scala/recursion/listOps.scala

def collectEven(l: IntList): IntList =
  if l.isEmpty then IntNil()
  else if l.head % 2 == 0 then IntCons(l.head, collectEven(l.tail))
  else collectEven(l.tail)

recursion/src/main/scala/recursion/listOps.scala

min

Write a function min that retrieves the minimum of a list. If min is called on an empty list, it should throw an IllegalArgumentException using throw IllegalArgumentException("Empty list!").

def min(l: IntList): Int =
  ???

recursion/src/main/scala/recursion/listOps.scala

def min(l: IntList): Int =
  if l.isEmpty then throw IllegalArgumentException("Empty list!")
  else if l.tail.isEmpty then l.head
  else
    val m = min(l.tail)
    (if l.head < m then l.head else m)

recursion/src/main/scala/recursion/listOps.scala

increment ⭐️

Implement a function that creates a new list whose values correspond to the ones in the original list incremented by one:

def increment(l: IntList): IntList =
  ???

recursion/src/main/scala/recursion/listOps.scala

def increment(l: IntList): IntList =
  if l.isEmpty then IntNil()
  else IntCons(l.head + 1, increment(l.tail))

recursion/src/main/scala/recursion/listOps.scala

subtract

Implement a function that performs right-associative subtraction. That is, 1, 2, 3, 4, 5 should become 1 - (2 - (3 - (4 - 5))) = 3.

If subtract is called on an empty list, it should throw an IllegalArgumentException using throw IllegalArgumentException("Empty list!"):

def subtract(l: IntList): Int =
  ???

recursion/src/main/scala/recursion/listOps.scala

def subtract(l: IntList): Int =
  if l.isEmpty then throw IllegalArgumentException("Empty list!")
  else if l.tail.isEmpty then l.head
  else l.head - subtract(l.tail)

recursion/src/main/scala/recursion/listOps.scala

removeOdd ⭐️

Write a function that creates a new list with all odd values removed:

def removeOdd(l: IntList): IntList =
  ???

recursion/src/main/scala/recursion/listOps.scala

Can you find an algebraic relationship between collectEven and removeOdd?

def removeOdd(l: IntList): IntList =
  if l.isEmpty then IntNil()
  else if l.head % 2 != 0 then removeOdd(l.tail)
  else IntCons(l.head, removeOdd(l.tail))

recursion/src/main/scala/recursion/listOps.scala

countEven

Implement a function that counts the number of even values in an IntList:

def countEven(l: IntList): Int =
  ???

recursion/src/main/scala/recursion/listOps.scala

Could you write countEven using collectEven and length?

def countEven(l: IntList): Int =
  if l.isEmpty then 0
  else (if l.head % 2 == 0 then 1 else 0) + countEven(l.tail)

recursion/src/main/scala/recursion/listOps.scala

/** `countEven` using `collectEven` and `length` */
def countEven2(l: IntList): Int =
  length(collectEven(l))

recursion/src/main/scala/recursion/listOps.scala

multiplyBy2

Implement a function that creates a new list whose values correspond to the ones in the original list multiplied by two:

def multiplyBy2(l: IntList): IntList =
  ???

recursion/src/main/scala/recursion/listOps.scala

def multiplyBy2(l: IntList): IntList =
  if l.isEmpty then IntNil()
  else IntCons(2 * l.head, multiplyBy2(l.tail))

recursion/src/main/scala/recursion/listOps.scala

anyNegative

Implement a function that determines if any value in an IntList is negative:

def anyNegative(l: IntList): Boolean =
  ???

recursion/src/main/scala/recursion/listOps.scala

def anyNegative(l: IntList): Boolean =
  if l.isEmpty then false
  else l.head < 0 || anyNegative(l.tail)

recursion/src/main/scala/recursion/listOps.scala

allEven

Implement a function that determines if all values in an IntList are even:

def allEven(l: IntList): Boolean =
  ???

recursion/src/main/scala/recursion/listOps.scala

def allEven(l: IntList): Boolean =
  if l.isEmpty then true
  else (l.head % 2 == 0) && allEven(l.tail)

recursion/src/main/scala/recursion/listOps.scala

multiplyOdd

Implement a function that computes the product of all odd numbers in an IntList:

def multiplyOdd(l: IntList): Int =
  ???

recursion/src/main/scala/recursion/listOps.scala

def multiplyOdd(l: IntList): Int =
  if l.isEmpty then 1
  else (if l.head % 2 != 0 then l.head else 1) * multiplyOdd(l.tail)

recursion/src/main/scala/recursion/listOps.scala

horner ⭐️

Given a polynomial:

$$a_0 + a_1x + a_2x^2 + a_3x^3 + \cdots + a_nx^n$$

We represent it as a list of coefficients $[a_0, a_1, a_2, \cdots, a_n]$.

Using the Horner’s rule:

$$a_0 + x \bigg(a_1 + x \Big(a_2 + x \big(a_3 + \cdots + x(a_{n-1} + x \, a_n) \cdots \big) \Big) \bigg)$$

Write a function that evaluates a polynomial given its list of coefficients and a value for $x$:

def horner(x: Int, l: IntList): Int =
  ???

recursion/src/main/scala/recursion/listOps.scala

def horner(x: Int, l: IntList): Int =
  if l.isEmpty then 0
  else l.head + x * horner(x, l.tail)

recursion/src/main/scala/recursion/listOps.scala

capAt0

Implement a function that creates a copy of a list with all numbers greater than 0 replaced with zeroes.

def capAtZero(l: IntList): IntList =
  ???

recursion/src/main/scala/recursion/listOps.scala

def capAtZero(l: IntList): IntList =
  if l.isEmpty then IntNil()
  else IntCons(if l.head > 0 then 0 else l.head, capAtZero(l.tail))

recursion/src/main/scala/recursion/listOps.scala

removeZeroes

Implement a function that creates a copy of a list with all zeroes removed.

def removeZeroes(l: IntList): IntList =
  ???

recursion/src/main/scala/recursion/listOps.scala

def removeZeroes(l: IntList): IntList =
  if l.isEmpty then IntNil()
  else if l.head == 0 then removeZeroes(l.tail)
  else IntCons(l.head, removeZeroes(l.tail))

recursion/src/main/scala/recursion/listOps.scala

reverseAppend

Implement a function that appends the reversed version of an IntList to another. For example, calling reverseAppend with the lists 1, 2, 3 and 7, 8, 9 should produce 3, 2, 1, 7, 8, 9:

def reverseAppend(l1: IntList, l2: IntList): IntList =
  ???

recursion/src/main/scala/recursion/listOps.scala

def reverseAppend(l1: IntList, l2: IntList): IntList =
  if l1.isEmpty then l2
  else reverseAppend(l1.tail, IntCons(l1.head, l2))

recursion/src/main/scala/recursion/listOps.scala

reverse ⭐️

Implement a function that reverses an IntList:

def reverse(l: IntList): IntList =
  ???

recursion/src/main/scala/recursion/listOps.scala

Hint

Consider reusing reverseAppend.

def reverse(l: IntList): IntList =
  reverseAppend(l, IntNil())

recursion/src/main/scala/recursion/listOps.scala

takeWhilePositive

Implement a function that creates a new list containing the same elements as its input list, except that it discards every number starting from the first nonpositive number (either 0 or negative). That is, 1, 2, -3, 4 should produce just 1, 2.

def takeWhilePositive(l: IntList): IntList =
  ???

recursion/src/main/scala/recursion/listOps.scala

def takeWhilePositive(l: IntList): IntList =
  if l.isEmpty then IntNil()
  else if l.head > 0 then IntCons(l.head, takeWhilePositive(l.tail))
  else IntNil()

recursion/src/main/scala/recursion/listOps.scala

append

Implement a function that concatenates two lists:

def append(l1: IntList, l2: IntList): IntList =
  ???

recursion/src/main/scala/recursion/listOps.scala

Bonus: Can you write it using just reverse and reverseAppend?

def append(l1: IntList, l2: IntList): IntList =
  if l1.isEmpty then l2
  else IntCons(l1.head, append(l1.tail, l2))

def appendUsingReverseAppend(l1: IntList, l2: IntList): IntList =
  reverseAppend(reverse(l1), l2)

recursion/src/main/scala/recursion/listOps.scala

collectMultiples

Write a function that collects all the multiples of an integer d in a list l:

def collectMultiples(d: Int, l: IntList): IntList =
  ???

recursion/src/main/scala/recursion/listOps.scala

def collectMultiples(d: Int, l: IntList): IntList =
  if l.isEmpty then IntNil()
  else if l.head % d == 0 then IntCons(l.head, collectMultiples(d, l.tail))
  else collectMultiples(d, l.tail)

recursion/src/main/scala/recursion/listOps.scala

last

Implement a function that retrieves the last element from an IntList.

If last is called on an empty list, it should throw an IllegalArgumentException using throw IllegalArgumentException("Empty list!").

def last(l: IntList): Int =
  ???

recursion/src/main/scala/recursion/listOps.scala

def last(l: IntList): Int =
  if l.isEmpty then throw IllegalArgumentException("Empty list!")
  else if l.tail.isEmpty then l.head
  else last(l.tail)

recursion/src/main/scala/recursion/listOps.scala

init

Implement a function that creates a copy of an IntList with the last element removed (this function is sometimes called butlast).

If init is called on an empty list, it should throw an IllegalArgumentException using throw IllegalArgumentException("Empty list!").

def init(l: IntList): IntList =
  ???

recursion/src/main/scala/recursion/listOps.scala

def init(l: IntList): IntList =
  if l.isEmpty then throw IllegalArgumentException("Empty list!")
  else if l.tail.isEmpty then IntNil()
  else IntCons(l.head, init(l.tail))

recursion/src/main/scala/recursion/listOps.scala

minMax ⭐️

The goal of this exercise is to write a function that retrieves the smallest and largest elements from an IntList as a pair.

Because your functions must return 2 values, you will need to return a pair. A pair containing the integers 1 and 2 is written as (1, 2). Similarly, the type of a pair containing two integers is written as (Int, Int). Here is a REPL session demonstrating how to create and use pairs:

scala> val pair = (1, 2)
val pair: (Int, Int) = (1,2)

scala> pair._1
val res9: Int = 1

scala> pair._2
val res10: Int = 2

scala> val (first, second) = pair
val first: Int = 1
val second: Int = 2

To find the smallest or largest of two integers, you can use the scala.math.min and scala.math.max functions.

If minMax is called on an empty list, it should throw an IllegalArgumentException using throw IllegalArgumentException("Empty list!").

def minMax(l: IntList): (Int, Int) =
  ???

recursion/src/main/scala/recursion/listOps.scala

def minMax(l: IntList): (Int, Int) =
  if l.isEmpty then
    throw IllegalArgumentException("Empty list!")
  else if l.tail.isEmpty then
    (l.head, l.head)
  else
    val (min, max) = minMax(l.tail)
    (scala.math.min(min, l.head), scala.math.max(max, l.head))

recursion/src/main/scala/recursion/listOps.scala

Lists as sets

Lists are ordered collections of objects; as a result, most set operations can be defined on them too:

contains ⭐️

Write a contains(l, n) function that checks whether list l contains value n:

def contains(l: IntList, n: Int): Boolean =
  ???

recursion/src/main/scala/recursion/listOps.scala

def contains(l: IntList, n: Int): Boolean =
  if l.isEmpty then false
  else (l.head == n) || contains(l.tail, n)

recursion/src/main/scala/recursion/listOps.scala

isSubset

Write an isSubset(l, L) function that checks whether list l is a subset of list L (that is, all elements of l are also contained in L):

def isSubset(l: IntList, L: IntList): Boolean =
  ???

recursion/src/main/scala/recursion/listOps.scala

def isSubset(l: IntList, L: IntList): Boolean =
  if l.isEmpty then true
  else contains(L, l.head) && isSubset(l.tail, L)

recursion/src/main/scala/recursion/listOps.scala

intersection

Write an intersection(l, L) function that constructs a new list whose elements are the same as those of l in the same order, but with all elements not contained in L removed:

def intersection(l: IntList, L: IntList): IntList =
  ???

recursion/src/main/scala/recursion/listOps.scala

def intersection(l: IntList, L: IntList): IntList =
  if l.isEmpty then IntNil()
  else if contains(L, l.head) then IntCons(l.head, intersection(l.tail, L))
  else intersection(l.tail, L)

recursion/src/main/scala/recursion/listOps.scala

difference ⭐️

Write a difference(l, L) function that constructs a new list whose elements are the same as those of l in the same order, but with all elements contained in L removed:

def difference(l: IntList, L: IntList): IntList =
  ???

recursion/src/main/scala/recursion/listOps.scala

def difference(l: IntList, L: IntList): IntList =
  if l.isEmpty then IntNil()
  else if contains(L, l.head) then difference(l.tail, L)
  else IntCons(l.head, difference(l.tail, L))

recursion/src/main/scala/recursion/listOps.scala

Recursion on trees

Now, we’ll explore recursion on binary trees. A binary tree is a tree where each node has at most two children. An IntTree can be either:

Hence, a tree with two non-empty nodes, a root node with value 1 with a left child with value 2, is represented by IntBranch(1, IntBranch(2, IntEmptyTree(), IntEmptyTree()), IntEmptyTree()).

IntTree objects have four operations — .value, .left, .right and .isEmpty — demonstrated in the following REPL session:

scala> import recursion.*

scala> IntBranch(2, IntEmptyTree(), IntEmptyTree()).value
val res11: Int = 2

scala> IntBranch(2, IntEmptyTree(), IntEmptyTree()).left
val res12: recursion.IntTree = IntEmptyTree()

scala> IntBranch(2, IntEmptyTree(), IntEmptyTree()).right
val res13: recursion.IntTree = IntEmptyTree()

scala> val tree = IntBranch(1, IntBranch(2, IntEmptyTree(), IntEmptyTree()), IntEmptyTree())
val tree: recursion.IntBranch = IntBranch(1,IntBranch(2,IntEmptyTree(),IntEmptyTree()),IntEmptyTree())

scala> tree.isEmpty
val res14: Boolean = false

scala> tree.value
val res15: Int = 1

scala> tree.left
val res16: recursion.IntTree = IntBranch(2,IntEmptyTree(),IntEmptyTree())

scala> tree.left.isEmpty
val res17: Boolean = false

scala> tree.left.value
val res18: Int = 2

scala> tree.left.left
val res19: recursion.IntTree = IntEmptyTree()

scala> tree.left.right
val res20: recursion.IntTree = IntEmptyTree()

scala> tree.right
val res21: recursion.IntTree = IntEmptyTree()

scala> tree.right.isEmpty
val res22: Boolean = true

treeSize

Implement a function that computes the size of an IntTree. The size of a tree is the number of nodes it contains.

def treeSize(t: IntTree): Int =
  ???

recursion/src/main/scala/recursion/treeOps.scala

Here too, the substitution model can help: start with an example tree, and ask yourself how to “reduce” the expression treeSize(example) step by step.

def treeSize(t: IntTree): Int =
  if t.isEmpty then 0
  else 1 + treeSize(t.left) + treeSize(t.right)

recursion/src/main/scala/recursion/treeOps.scala

treeDepth ⭐️

Implement a function that computes the depth of an IntTree. The depth of a tree is the number of edges on the longest path between the root and a leaf. Our example tree above has a depth of 2.

To find the largest of two integers, you can use the scala.math.max function.

def treeDepth(t: IntTree): Int =
  ???

recursion/src/main/scala/recursion/treeOps.scala

def treeDepth(t: IntTree): Int =
  if t.isEmpty then 0
  else 1 + scala.math.max(treeDepth(t.left), (treeDepth(t.right)))

recursion/src/main/scala/recursion/treeOps.scala

treeSum 🧪

Implement a function that computes the sum of all values in an IntTree, similarly to the sum function for IntLists:

def treeSum(t: IntTree): Int =
  ???

recursion/src/main/scala/recursion/treeOps.scala

def treeSum(t: IntTree): Int =
  if t.isEmpty then 0
  else t.value + treeSum(t.left) + treeSum(t.right)

recursion/src/main/scala/recursion/treeOps.scala

treeAllEven

Implement a function that determines if all values in an IntTree are even:

def treeAllEven(t: IntTree): Boolean =
  ???

recursion/src/main/scala/recursion/treeOps.scala

def treeAllEven(t: IntTree): Boolean =
  if t.isEmpty then true
  else (t.value % 2 == 0) && treeAllEven(t.left) && treeAllEven(t.right)

recursion/src/main/scala/recursion/treeOps.scala

treeIncrement ⭐️

Implement a function that increases each value in an IntTree by one:

def treeIncrement(t: IntTree): IntTree =
  ???

recursion/src/main/scala/recursion/treeOps.scala

def treeIncrement(t: IntTree): IntTree =
  if t.isEmpty then IntEmptyTree()
  else IntBranch(t.value + 1, treeIncrement(t.left), treeIncrement(t.right))

recursion/src/main/scala/recursion/treeOps.scala

Consider porting other list exercises to work on trees!

treeShow 🧪

Implement a function that creates a string containing all values in an IntTree, in pre-order. This means that the value of each node must be printed before its children, and then the values of the left subtree must be printed before the values of the right subtree.

For our example tree, the output should be:

1
2

Your function must return a String.

To concatenate two strings, you can use the + operator:

scala> "Hello " + "Ada"
val res23: String = Hello Ada

To convert an integer to a string, you can use the toString method:

scala> 3.toString
val res24: String = 3

In Scala, like in Java, line breaks in strings are represented by the \n character.

Implement treeShow:

def treeShow(t: IntTree): String =
  ???

recursion/src/main/scala/recursion/treeOps.scala

How would you modify your function to collect the values in post-order? Try comparing the execution of the original pre-order function and the post-order variant on an example, using the substitution model.

Finally, how would you modify your function to collect the values in-order?

def treeShow(t: IntTree): String =
  if t.isEmpty then ""
  else t.value.toString() + "\n" + treeShow(t.left) + treeShow(t.right)

recursion/src/main/scala/recursion/treeOps.scala

def treeShowPostOrder(t: IntTree): String =
  if t.isEmpty then ""
  else treeShowPostOrder(t.left) + treeShowPostOrder(t.right) + t.value + "\n"

recursion/src/main/scala/recursion/treeOps.scala

def treeShowInOrder(t: IntTree): String =
  if t.isEmpty then ""
  else treeShowInOrder(t.left) + t.value + "\n" + treeShowInOrder(t.right)

recursion/src/main/scala/recursion/treeOps.scala

Recursion on Strings

In this exercise, you will write recursive functions that operate on strings. A string can be thought of as a list of characters. Like lists, we can apply the same structural recursion principles to strings. Each string has the isEmpty, head and tail methods similar to IntLists:

scala> "Hello Ada".tail
val res25: String = ello Ada

scala> "Hello Ada".tail.head
val res26: Char = e

scala> "Hello Ada".tail.isEmpty
val res27: Boolean = false

scala> " ".isEmpty
val res28: Boolean = false

scala> "".isEmpty
val res29: Boolean = true

Note the type of the head method: it returns a Char, not a String. A Char is a single character, whereas a String is a sequence of characters. A String is surrounded by double quotes, whereas a Char is surrounded by single quotes.

For the following exercises, you cannot use any other method than isEmpty, head and tail on strings.

stringLength

Implement a function that computes the length of a string:

def stringLength(s: String): Int =
  ???

recursion/src/main/scala/recursion/stringOps.scala

def stringLength(s: String): Int =
  if s.isEmpty then 0
  else 1 + stringLength(s.tail)

recursion/src/main/scala/recursion/stringOps.scala

capitalizeString

Implement a function that capitalizes every character in a string:

def capitalizeString(s: String): String =
  ???

recursion/src/main/scala/recursion/stringOps.scala

You will need the methods toUpper and toString of Char:

scala> 'c'.toUpper
val res30: Char = C

scala> 'c'.toString
val res31: String = c
def capitalizeString(s: String): String =
  if s.isEmpty then ""
  else s.head.toUpper.toString + capitalizeString(s.tail)

recursion/src/main/scala/recursion/stringOps.scala

isBlank

Implement a function that checks if a string is blank (i.e., if it only contains whitespace characters or is empty):

def isBlank(s: String): Boolean =
  ???

recursion/src/main/scala/recursion/stringOps.scala

You may find the .isWhitespace method of Char useful:

scala> 'c'.isWhitespace
val res29: Boolean = false

scala> ' '.isWhitespace
val res30: Boolean = true
def isBlank(s: String): Boolean =
  if s.isEmpty then true
  else if !s.head.isWhitespace then false
  else isBlank(s.tail)

recursion/src/main/scala/recursion/stringOps.scala

wordCount ⭐️

In this exercise, we call “word” any sequence of non-whitespace characters. For example, " Bonjour ! " has two words. Implement a function that counts the number of words in a string:

def wordCount(s: String): Int =
  ???

recursion/src/main/scala/recursion/stringOps.scala

Example run:

scala> wordCount("  hello ")
val res32: Int = 1

scala> wordCount("  hello   world !")
val res33: Int = 3

As before, you may use c.isWhitespace to check if a character c counts as whitespace.

Hint

Consider defining a helper function discardWord that skips over the first word of a string and returns the corresponding suffix (in other words, discardWord("abc def") should return " def").

def discardWord(s: String): String =
  ???

recursion/src/main/scala/recursion/stringOps.scala

def wordCount(s: String): Int =
  if s.isEmpty then 0
  else if s.head.isWhitespace then wordCount(s.tail)
  else 1 + wordCount(discardWord(s))

recursion/src/main/scala/recursion/stringOps.scala

def discardWord(s: String): String =
  if s.isEmpty || s.head.isWhitespace then s
  else discardWord(s.tail)

recursion/src/main/scala/recursion/stringOps.scala

caesarCipher ⭐️

Implement a function that encrypts a string using the Caesar cipher method. The Caesar cipher is an ancient form of encryption, where each letter in a string is shifted by a fixed number of places down the alphabet.

In this exercise, we assume that the input string only contains English lowercase letters. In addition, we shift the letters in a circular way, so that z shifted by 1 becomes a, z shifted by 2 becomes b, etc.

Every English letter has a corresponding ASCII and Unicode code which is an integer value. Codes of English letters are the same in ASCII and Unicode; the English lowercase letter ‘a’ corresponds to the number 97, ‘b’ is 98, and so on up to ‘z’ which is 122.

In Scala, you can get the ASCII value of an English letter using the toInt method, and convert it back to a character using the toChar method:

scala> 'a'.toInt
val res34: Int = 97

scala> 97.toChar
val res35: Char = a

Here is a REPL session demonstrating how to encrypt a single character:

scala> val aCode = 'a'.toInt
     | val zCode = 'z'.toInt
     | val numLetters = zCode - aCode + 1
val aCode: Int = 97
val zCode: Int = 122
val numLetters: Int = 26

scala> val dCode = 'd'.toInt
val dCode: Int = 100

scala> val dPlus3 = (dCode + 3 - aCode) % numLetters + aCode
val dPlus3: Int = 103

scala> dPlus3.toChar
val res36: Char = g

scala> val dPlus25 = (dCode + 25 - aCode) % numLetters + aCode
val dPlus25: Int = 99

scala> dPlus25.toChar.toString
val res37: String = c

And here are two additional examples of how your caesarCipher function should behave:

scala> caesarCipher("abc", 2)
val res38: String = cde

scala> caesarCipher("abz", 3)
val res39: String = dec

Implement the caesarCipher function accordingly:

def caesarCipher(s: String, shift: Int): String =
  ???

recursion/src/main/scala/recursion/stringOps.scala

def caesarCipher(s: String, shift: Int): String =
  val aCode = 'a'.toInt
  val zCode = 'z'.toInt
  val numLetters = zCode - aCode + 1
  if s.isEmpty then ""
  else
    val charCode = s.head.toInt
    if charCode >= aCode && charCode <= zCode then
      val shiftedCharCode = (charCode + shift - aCode) % numLetters + aCode
      shiftedCharCode.toChar.toString + caesarCipher(s.tail, shift)
    else
      throw IllegalArgumentException("Only lowercase English letters are allowed")

recursion/src/main/scala/recursion/stringOps.scala

reverseString ⭐️

Implement a function that reverses a string:

def reverseString(s: String): String =
  ???

recursion/src/main/scala/recursion/stringOps.scala

def reverseString(s: String): String =
  if s.isEmpty then ""
  else reverseString(s.tail) + s.head

recursion/src/main/scala/recursion/stringOps.scala

Polish Notation 🔥

Polish Notation (PN) is a way to write arithmetic expressions where every operator comes before its operands. For example, instead of writing 3 + 4, you would write + 3 4.

Your task is to implement a PN evaluator for an IntList where operators are represented by negative integers: we’ll deal with just addition and multiplication for now:

val Add = -1
val Multiply = -2

recursion/src/main/scala/recursion/listOps.scala

For example, the expression 3 + 4 would be represented as IntCons(Add, IntCons(3, IntCons(4, IntNil()))) and the expression 3 * 4 would be represented as IntCons(Mul, IntCons(3, IntCons(4, IntNil()))).

Here is a REPL session demonstrating how the polishEval function should behave:

scala> // List:   [Add, 3, 4]
scala> // Prefix: + 3 4
scala> // Infix:  3 + 4
scala> val expr1 = IntCons(Add, IntCons(3, IntCons(4, IntNil())))
val expr1: recursion.IntCons = IntCons(-1,IntCons(3,IntCons(4,IntNil())))

scala> polishEval(expr1)
val res40: (Int, recursion.IntList) = (7,IntNil())

scala> // List:   [Multiply, Add, 5, 3, 4]
scala> // Prefix: * + 5 3 4
scala> // Infix:  (5 + 3) * 4
scala> val expr2 = IntCons(Multiply, IntCons(Add, IntCons(5, IntCons(3, IntCons(4, IntNil())))))
val expr2: recursion.IntCons = IntCons(-2,IntCons(-1,IntCons(5,IntCons(3,IntCons(4,IntNil())))))

scala> polishEval(expr2)._1
val res41: Int = 32

Before starting to write code, think about how you would evaluate a PN expression. You’ll need to look at the first element of the list to determine what to do next. If it’s an operator, you’ll need to evaluate its arguments (where are they?). If it’s a number, you’ll need to return it. If it’s not a number or a known operator, you’ll need to throw an InvalidOperatorException exception. This time, when encountering an empty list throw a EmptyListException.

Write a few examples on paper and try to figure out how you would evaluate them. Then, try to translate your steps into pseudo-code.

Only once you have a clear idea of how to solve the problem, start implementing polishEval:

def polishEval(l: IntList): (Int, IntList) =
  ???

recursion/src/main/scala/recursion/listOps.scala

The idea behind Polish notation is that an operator will act on the next two parts of the list. These parts could either be simple numbers or other prefixed expressions. Using this pattern will help you break down and evaluate the list.

def polishEval(l: IntList): (Int, IntList) =
  if l.isEmpty then
    throw EmptyListException()
  else if l.head >= 0 then
    (l.head, l.tail)
  else if l.head == Add then
    val (leftValue, afterLeft) = polishEval(l.tail)
    val (rightValue, afterRight) = polishEval(afterLeft)
    (leftValue + rightValue, afterRight)
  else if l.head == Multiply then
    val (leftValue, afterLeft) = polishEval(l.tail)
    val (rightValue, afterRight) = polishEval(afterLeft)
    (leftValue * rightValue, afterRight)
  else
    throw InvalidOperatorException(l.head)

recursion/src/main/scala/recursion/listOps.scala

Git I: Single-user version control

The following questions, exercises, and resources are meant to guide your exploration of version control systems.

If you ask a complex Git question on Ed, it will often be much easier for us to help you debug if you include a copy of your repo. For this, post your repo as a git bundle in a private Ed post (look up the documentation of git bundle to know what it does).

Make sure you have good backups before experimenting with your computer on the command line!

Browsing documentation

The official Git tutorial is distributed as manual (man) pages. On macOS, GNU/Linux, and WSL, you can simply type man command to get more information about command. On Windows, you may prefer to browse the online Git documentation instead.

  1. Locate the commands that we discussed during the lecture in man gittutorial, man gittutorial-2, and man giteveryday.

  2. Look up concepts and commands you don’t know in man gitglossary.

Configuration ⭐️

  1. To use Git, you need to configure your user name and email. Look up the corresponding git config commands online, or follow the instructions in man gittutorial. This ensures that your commits carry the right author information.

  2. git commit and other interactive commands sometimes need to open a text editor. Use git config to set the core.editor variable globally to your preferred editor. (The demo given in class used Emacs. On GNU/Linux, Git typically defaults to nano, which is reasonably intuitive, while on macOS it defaults to vi, which is great but not too beginner-friendly. For this course, we recommend VSCode.)

  1. Use git config with the --global flag to apply the change across all repositories of the currently logged-in user:

    git config --global user.name "Your Name"
    git config --global user.email you@yourdomain.example.com
    

    Alternatively, you can use --local to configure only the particular repository in which you run the commands, or --system to configure git for all repositories on the current machine (including for repositories created by other users of the same computer).

  2. If you want to use VSCode as your git editor, you can follow the instructions found on VSCode as Git editor:

    Here are the steps to do so:

    1. Make sure you can run
      code --help
      

      from the command line and you get help.

      • if you do not see help, please follow these steps:
        • macOS: Select Shell Command: Install ‘Code’ command in path from the Command Palette.
        • Windows: Make sure you selected Add to PATH during the installation.
        • Linux: Make sure you installed Code via our new .deb or .rpm packages.
    2. From the command line, run git config --global core.editor "code --wait"

    If you want to use Vim as your git editor, use:

    git config --global core.editor "vim"
    

    If you want to use Emacs as your git editor, use:

    git config --global core.editor "emacs"
    

    (If you want to open Emacs within the terminal, you can replace emacs with emacs -nw).

    If you want to use Nano as your git editor, use:

    git config --global core.editor "nano"
    

Reproducing the first Git demo

  1. ⭐️ Re-watch the 15-minutes Git demo that was given at the end of the first Git lecture and follow along on your own terminal (use “Git bash” on Windows). Which commands were necessary? Which were just for demo purposes? (If you run into any issues, try reading the corresponding manual pages.)

    (If you prefer to read, you can follow this written transcript instead.)

  2. Notice the file called .gitignore. What is this file for? What would git status have shown after sbt run if sbt new hadn’t created it? (Try it! Replicate the commands, then delete .gitignore and see what git status shows.)

  1. (No solution provided.)

  2. The .gitignore file specifies intentionally untracked files that Git should ignore (run man gitignore for more details). For example, MacOS creates a .DS_Store file in each directory, but you don’t want to maintain this file in your Git repository, because it’s unrelated to your project. It would be annoying if it keept showing up in the “untracked files” section of git status, so you can ask Git to ignore it by adding .DS_Store to .gitignore.

    In fact, if you open the .gitignore file created by sbt new, you will find that it already contains .DS_Store. You will also find target/ (the /, which means that the whole target directory (generated by sbt run) will be ignored by Git (it’s enough to specify target/ to exclude every file in it; there is no need to list every file that it contains).

    If you run sbt run, delete the .gitignore file, and then run git status, you will that all the files that were previously ignored show up again (for example, the target/ directory).

Browsing history

  1. Clone a large repository, using the git clone command. For example:

    git clone https://github.com/microsoft/vscode.git
    
  2. Read up on the git blame command. What does it do? Try it on the command line on a file in the repository you just cloned.

  3. Graphical interfaces can be useful to browse complex history. Find out how to perform a git blame and a git log restricted to a set of lines in your favorite editor, as demonstrated at the end of the demo (the demo used Emacs with magit as the Git UI; you can use VSCode with the Gitlens plugin).

  1. No solution provided.

  2. The git blame command shows every line in a source file annotated with information about the last commit that modified it. Pretty convenient!

  3. VSCode does not have a git blame interface out of box. Instead, you need to install extensions like GitLens.

    For more info about how to use the built-in Git interface in VSCode, you may find Using Git source control in VS Code useful.

    If you have GitLens installed, you can press “Ctrl + Shift + P” (or click: Help → Show All Commands) to open the prompt. Type “GitLens: Toggle File Blame” to use git blame for the currently opened file (even better: give it a key binding so you don’t need to type the command every time!).

    vscode-gitlens-git-blame

    To show the history restricted to a set of lines in a file, you can type “GitLens: Show Line History View”. On the left side you can find “LINE HISTORY”.

    vscode-gitlens-lines-history

    If you click any of these commits in the line history, you can see the diff before and after the commit.

    vscode-gitlens-diff.png

Tracking history and creating patches ⭐️

  1. Make a new clone of the course’s exercise repository, and chose an exercise to work on.

  2. ⭐️ Make a commit after completing each exercise. Use git commit -m for some of the commits, and git commit without -m for others. What happens when you leave out the -m?

  3. If you forget to commit between two exercises, find a Git command or a tool in the VSCode UI that lets you stage only part of your changes (hint).

  4. ⭐️ Use git format-patch to share the solution to one exercise with a classmate. Use git am to apply a patch received from a classmate.

  5. 🔥 What happens if you try to apply the patch after you’ve already started solving the same exercise? Look up how to handle the resulting situation, which is known as a conflict in Git.

Alternatively, you can pick one of your classes that doesn’t use git, and initialize a new git repository to hold a copy of your exercise solutions for that class.

  1. git clone git@gitlab.epfl.ch:cs214/exercises.git

  2. Here is an example:

    # After finishing the "length" exercise:
    git add recursion/src/main/scala/playground.worksheet.sc
    git commit -m "recursion: Finish length exercise"
    # Continue working on recursion exercises ...
    

    Using -m "<message>" with git commit specifies the commit message without opening a text editor.

    Running git commit without -m opens a file containing information about the current commit in the default editor (this will be a terminal program unless you configure Git otherwise): type your commit message, save it and exit. If you set Git’s core.editor setting to point to VSCode instead, then running git commit without -m will open a VSCode window (there, you can use the tick icon at the top right of the editor window to confirm the commit message and exit).

  3. In VSCode, you can stage a portion of a file by making a text selection in a file and then choosing “Stage Selected Ranges” from the “Command Palette” or from the diff editor context menu (right-click). More info: Staging in VSCode.

  4. Here is an example:

    git format-path -1 # Create a patch for the last commit
    

    Then, send the generated patch file to your classmate.

    To apply the patch, run:

    git am <path_to_the_patch_file>
    

If you chose to create a new Git repo to track another class’ exercises, you can use the following:

git init cs123-solutions
cd cs123-solutions
unzip ~/Downloads/cs123-week1-exercises.zip
git add -A
git commit -m "Add week 1 exercises"

Exploring large codebases ⭐️

  1. On your computer, locate at least one free software (“open source”) application that you use regularly, and find out how to get its source code. Is it hosted in a Git repository?

  2. Many projects are hosted on Git platforms such Github or Gitlab. Does Git require you to use such a platform?

  3. Clone a git repository of an application that you use and explore the history. Find Git commands to answer the following questions:

    1. What happened in the last three days?
    2. Who contributed to writing the README file?
    3. What is the most recent commit that mentions a bug fix? How did the fix work? When was the buggy code introduced (in which commit?) How long did the bug remain?
  4. 🔥 Sometimes you want more information than Git alone can provide directly. Find a way to answer the following questions:

    1. How many C source files does the repository contain?
    2. Which files had the most changes over the last 2 years?
    3. Which is the most commonly edited files in commits that mention a bugfix?
    4. Which was the largest commit (in terms of changed lines) in the last 3 months?
  1. Many programs on your machine are open source. In fact, Git itself is one! Even better, Git’s source code is hosted in a Git repository. Let’s have a look! The repository is hosted here:

    https://git.kernel.org/pub/scm/git/git.git/

  2. No: many repositories are hosted on other platforms, or on no platform at all: Git does not require a central repository, and you can clone from any copy of a repository:

    $ mkdir a
    $ cd a
    $ git init
    Initialized empty Git repository in /home/cpc/cs214/a/.git/
    $ echo "Hello" > example.txt
    $ git add -A; git commit -m "Add example.txt"
    [main (root-commit) bc6cc40] Add example.txt
     1 file changed, 1 insertion(+)
     create mode 100644 example.txt
    $ cd ..
    $ git clone a b
    Cloning into 'b'...
    done.
    $ ls b
    example.txt
    
  3. Here are commands to answer these questions:

    1. Clone the repository:

      $ git clone https://git.kernel.org/pub/scm/git/git.git
      Cloning into 'git'...
      remote: Enumerating objects: 374926, done.
      remote: Counting objects: 100% (646/646), done.
      remote: Compressing objects: 100% (99/99), done.
      remote: Total 374926 (delta 562), reused 602 (delta 547), pack-reused 374280
      Receiving objects: 100% (374926/374926), 123.81 MiB | 11.69 MiB/s, done.
      Resolving deltas: 100% (282875/282875), done.
      
    2. Use the git log --pretty=oneline command to show one commit per line:

      $ git log --pretty=oneline --since="3 days"
      ed155187b429a2a6b6475efe1767053df37ccfe1 Sync with Git 2.46.1
      9cf95c0ca06376c2797664a7f2a1d1b276729070 The sixteenth batch
      77cf81e98866c79722284258c9b605a70fe27382 Merge branch 'bl/trailers-and-incomplete-last-line-fix'
      bf42b2390177dae176c0f88f332ac0f5fd822cf2 Merge branch 'rj/cygwin-has-dev-tty'
      41390eb3e6411925365a4932ca4b5c12e609c779 Merge branch 'rs/diff-exit-code-fix'
      da1c402a47e574c13fe2f0f205a2f0eec2224bb7 Merge branch 'jc/doc-skip-fetch-all-and-prefetch'
      19de221f3638517b7c1411b478cb7eb77db09ce1 Merge branch 'ds/doc-wholesale-disabling-advice-messages'
      17ae0b824911640dcccda9ff2f3a312ebe9dbacb Merge branch 'jk/sparse-fdleak-fix'
      0299251319669eba8c343f56c4bd5393148674da Merge branch 'ds/scalar-no-tags'
      a731929aa8016750c09bccc67c68feaf1259ce90 Git 2.46.1
      8ef5549b0664ab0acc02d49564dd5b679fade359 Merge branch 'rj/compat-terminal-unused-fix' into maint-2.46
      8b4bb65a8fc4864525c9c9b1752cde1ca3d6ac63 Merge branch 'jc/config-doc-update' into maint-2.46
      d3d7c8dfb810c44b6f6046134bdbc42110b7bc47 Merge branch 'aa/cat-file-batch-output-doc' into maint-2.46
      118c74d143ec5d05ca16756bea4ed2ba2a7ad1e9 Merge branch 'cl/config-regexp-docfix' into maint-2.46
      bb57f055aeede67cb2ff68a6cf96dbd9dc8a6321 Merge branch 'jc/coding-style-c-operator-with-spaces' into maint-2.46
      480124470caf9401e05f9278258c108795d66cad Merge branch 'ps/stash-keep-untrack-empty-fix' into maint-2.46
      be344f36318c7f17e852ecb7aadab1247ccc67c7 Merge branch 'ps/index-pack-outside-repo-fix' into maint-2.46
      bc799320482a1ee7fb0e20cb648fe25dcb30b7ab Merge branch 'jk/free-commit-buffer-of-skipped-commits' into maint-2.46
      57974d46a4d5a079471c4f7eaa5933b1922299e4 Sync with 'maint'
      f8ca6d006426c0a46216bc21cd9950d1df3f9bf1 The fifteenth batch
      f286e0a01c1e8c1b401d0290b8fc0f9847cfbd06 Merge branch 'kl/cat-file-on-sparse-index'
      b64f24972651dd7a46b0563a60c62bab7afbd68d Merge branch 'jk/messages-with-excess-lf-fix'
      143682ec43d5772ee9327ed84eb0cdc007b1f489 Merge branch 'ps/pack-refs-auto-heuristics'
      3bf057a0cd6a75125a5bd239372f9fc4fea6c338 Merge branch 'tb/multi-pack-reuse-fix'
      04595eb4079e33496124a132832d28aba4a94dfc Merge branch 'gt/unit-test-oid-array'
      63b5fcdde90361195f7f6ade199c90d91e2ecf53 Merge branch 'ps/index-pack-outside-repo-fix'
      3265304f944ed95dfa94d2e834bdd615e4cfb0f0 Merge branch 'jc/mailinfo-header-cleanup'
      6074a7d4ae6b658c18465f10bbbf144882d2d4b0 Another batch of topics for 2.46.1
      d2b936f1dc1a48e6fbe43aabcdf66fb06e648c1a Merge branch 'jc/grammo-fixes' into maint-2.46
      1b9a1246ef1ea532c7e030e47777035ba0d12a9b Merge branch 'jc/tests-no-useless-tee' into maint-2.46
      9e2cb073ec4a32417953df5b48fb698d0cb3979c Merge branch 'jc/how-to-maintain-updates' into maint-2.46
      b4e826a7203a6cc1152d84df818f79456eaba457 Merge branch 'ps/bundle-outside-repo-fix' into maint-2.46
      41c952ebacf7e3369e7bee721f768114d65e50c4 Merge branch 'jc/patch-id' into maint-2.46
      712d970c0145b95ce655773e7cd1676f09dfd215 Merge branch 'jk/apply-patch-mode-check-fix' into maint-2.46
      
    3. Use the git shortlog --summary command to print authors:

      $ git shortlog --numbered --summary README.md
           6	Matthieu Moy
           4	Johannes Schindelin
           3	Junio C Hamano
           2	Jeff King
           1	Benjamin Dopplinger
           1	Doug Ilijev
           1	Eric Wong
           1	Josh Soref
           1	Philip Oakley
           1	Thomas Gummerer
           1	Ævar Arnfjörð Bjarmason
      
    4. Use the --grep flag to look for the word “bug”:

      $ git log -1 --no-merges --grep="bug"
      commit a219a6739cc14b52a7ba08170eebe9cf11505667
      Author: Jeff King <peff@peff.net>
      Date:   Wed Aug 28 00:00:49 2024 -0400
      
          config.mak.dev: enable -Wunused-parameter by default
      
          Having now removed or annotated all of the unused function parameters in
          our code base, I found that each instance falls into one of three
          categories:
      
            1. ignoring the parameter is a bug (e.g., a function takes a ptr/len
               pair, but ignores the length). Detecting these helps us find the
               bugs.
      
            2. the parameter is unnecessary (and usually left over from a
               refactoring or earlier iteration of a patches series). Removing
               these cleans up the code.
      
            3. the function has to conform to a specific interface (because it's
               used via a function pointer, or matches something on the other side
               of an #ifdef). These ones are annoying, but annotating them with
               UNUSED is not too bad (especially if the compiler tells you about
               the problem promptly).
      
          Certainly instances of (3) are more common than (1), but after finding
          all of these, I think there were enough cases of (1) that it justifies
          the work in annotating all of the (3)s.
      
          And since the code base is now at a spot where we compile cleanly with
          -Wunused-parameter, turning it on will make it the responsibility of
          individual patch writers going forward.
      
          Signed-off-by: Jeff King <peff@peff.net>
          Signed-off-by: Junio C Hamano <gitster@pobox.com>
      

      Look at how nice this commit message is!

    5. Read the message above: the commit we found describes an action that was taken after fixing a number of bugs / unused parameters. So we want to look at previous commits. For this, we use git log commit_id:

      $ git log a219a6739cc14b52a7ba08170eebe9cf11505667 --no-merges --pretty=oneline
      

      Browsing the history, we find dozens of commits removing unused parameters. Let’s take one of them:

      4756494504 * pack-bitmap: drop unused parameters from select_pseudo_merges()
      

      We see the following message:

      $ git log -1 4756494504 --stat
      commit 47564945046f8cda634ee72c5b3838c1b0aa4582
      Author: Jeff King <peff@peff.net>
      Date:   Sat Aug 17 03:29:37 2024 -0400
      
          pack-bitmap: drop unused parameters from select_pseudo_merges()
      
          We take the array of indexed_commits (and its length), but there's no
          need. The selection is based on ref reachability, not the linearized set
          of commits we're packing.
      
          Signed-off-by: Jeff King <peff@peff.net>
          Signed-off-by: Junio C Hamano <gitster@pobox.com>
      
       pack-bitmap-write.c | 2 +-
       pseudo-merge.c      | 3 +--
       pseudo-merge.h      | 3 +--
       3 files changed, 3 insertions(+), 5 deletions(-)
      

      To investigate the source of the issue, we find which lines were affected:

      $ git log --patch -1 4756494504
      […]
      diff --git a/pseudo-merge.c b/pseudo-merge.c
      index 1d7f5381a4..c952a7cba9 100644
      --- a/pseudo-merge.c
      +++ b/pseudo-merge.c
      @@ -425,8 +425,7 @@ static void sort_pseudo_merge_matches(struct pseudo_merge_matches *matches)
          QSORT(matches->unstable, matches->unstable_nr, commit_date_cmp);
       }
      
      -void select_pseudo_merges(struct bitmap_writer *writer,
      -			  struct commit **commits, size_t commits_nr)
      +void select_pseudo_merges(struct bitmap_writer *writer)
       {
          struct progress *progress = NULL;
          uint32_t i;
      

      We can now blame the corresponding lines (the ~ indicates that we want the blame as of the immediately preceding commit):

      $ git blame -L428,430 4756494504~ pseudo-merge.c
      faf558b23ef (Taylor Blau 2024-05-23 17:26:42 -0400 428) void select_pseudo_merges(struct bitmap_writer *writer,
      faf558b23ef (Taylor Blau 2024-05-23 17:26:42 -0400 429) 			  struct commit **commits, size_t commits_nr)
      faf558b23ef (Taylor Blau 2024-05-23 17:26:42 -0400 430) {
      

      We see that the code was last modified in May 2024. Using a line history confirms that the code was in fact introduced at that time:

      $ git log -L428,430:pseudo-merge.c 4756494504~
      commit faf558b23ef55b18f395d1f7a7c2714ccc1320e2
      Author: Taylor Blau <me@ttaylorr.com>
      Date:   Thu May 23 17:26:42 2024 -0400
      
          pseudo-merge: implement support for selecting pseudo-merge commits
      
          Teach the new pseudo-merge machinery how to select non-bitmapped commits
          for inclusion in different pseudo-merge group(s) based on a handful of
          criteria.
      
          Note that the selected pseudo-merge commits aren't actually used or
          written anywhere yet. This will be done in the following commit.
      
          Signed-off-by: Taylor Blau <me@ttaylorr.com>
          Signed-off-by: Junio C Hamano <gitster@pobox.com>
      
      diff --git a/pseudo-merge.c b/pseudo-merge.c
      --- a/pseudo-merge.c
      +++ b/pseudo-merge.c
      @@ -3,0 +423,3 @@
      +void select_pseudo_merges(struct bitmap_writer *writer,
      +			  struct commit **commits, size_t commits_nr)
      +{
      
  4. Here are Bash scripts to answer these questions. You can also use Powershell, or any other scripting language — or you can use Scala scripts!

    1. We can count files with find and wc:

      $ find . -name '*.c' | wc --lines
      587
      
    2. We can tally changed files by asking Git to print changed files for each commit and then sorting and counting them:

      $ git log --no-merges --format= --name-only --since="2 years" \
        | sort | uniq -c | sort -rn | head -n 10
          128 sequencer.c
          110 config.c
          107 Makefile
           98 refs.c
           95 GIT-VERSION-GEN
           90 commit-graph.c
           83 diff.c
           81 cache.h
           75 builtin/rebase.c
           74 builtin/log.c
      
    3. We can use the log command with the grep flag to find mentions of bug or fix. We use the pattern [bB]ug\|[fF]ix to mean either bug, or fix, allowing for Bug and Fix as well:

      $ git log --no-merges --format= --name-only --grep='[bB]ug\|[fF]ix' \
        | sort | uniq -c | sort -rn | head -n10
          409 Makefile
          292 diff.c
          235 cache.h
          212 gitk
          185 gitweb/gitweb.perl
          184 git-svn.perl
          171 revision.c
          166 sequencer.c
          158 sha1_file.c
          154 Documentation/config.txt
      
    4. We can get the number of added and removed lines in each commit using git log --numstat, and we can remove all other information using --format=:

      $ git log --no-merges --format=":: %H" --since="3 months" --numstat
      

      We can then sum the added and deleted counts, using an Awk script:

      $ git log --no-merges --format=":: %H" --since="3 months" --numstat \
        | grep -v '^$' \
        | awk '/::/{ print sum, commit; commit=$0; sum=0; } !/::/{ sum += $1 + $2; }' \
        | sort -rn | head -n10
      3611 :: 47d2691ae95453743b35e4bfe2f2d81db0f484e1
      1807 :: 03763e68fb7fc4b37dd2d184dde501618e6c171d
      1732 :: 15b02a3d4b04a617196d7f5dfb8f9c48329a9f48
      1028 :: db5104501bb126fe304dfc21de04d700d76498a0
      899 :: 68f66648def4bf31455c50e5aa4b87fb90b4b9d7
      866 :: de86879acedd3d6be01ac912bd6b9bc6601d03d7
      810 :: 9200fe2a93876edc8aeae4805b62f6014cba05ea
      801 :: ebe8720ed4729a367e96514463af49f4f1e9fd0e
      750 :: ebd7b1ebd5a332acd7a59309e272d15e97a34b71
      749 :: dadb75a2ddaba42f713eec52bea29891c0674a15
      

      We conclude that the largest commit was 47d2691ae95453743b35e4bfe2f2d81db0f484e1:

      $ git log --shortstat -1 47d2691ae95453743b35e4bfe2f2d81db0f484e1
      commit 47d2691ae95453743b35e4bfe2f2d81db0f484e1
      Author: Peter Krefting <peter@softwolves.pp.se>
      Date:   Thu Oct 26 21:30:12 2023 +0100
      
          git-gui: sv.po: Update Swedish translation (576t0f0u)
      
          Signed-off-by: Peter Krefting <peter@softwolves.pp.se>
          Signed-off-by: Johannes Sixt <j6t@kdbg.org>
      
       1 file changed, 1842 insertions(+), 1769 deletions(-)
      

      The second largest was 03763e68fb7fc4b37dd2d184dde501618e6c171d:

      $ git log --shortstat -1 03763e68fb7fc4b37dd2d184dde501618e6c171d
      commit 03763e68fb7fc4b37dd2d184dde501618e6c171d
      Author: Jeff King <peff@peff.net>
      Date:   Wed Jul 10 04:37:55 2024 -0400
      
          chainlint.pl: check line numbers in expected output
      
          While working on chainlint.pl recently, we introduced some bugs that
          showed incorrect line numbers in the output. But it was hard to notice,
          since we sanitize the output by removing all of the line numbers! It
          would be nice to retain these so we can catch any regressions.
      
          The main reason we sanitize is for maintainability: we concatenate all
          of the test snippets into a single file, so it's hard for each ".expect"
          file to know at which offset its test input will be found. We can handle
          that by storing the per-test line numbers in the ".expect" files, and
          then dynamically offsetting them as we build the concatenated test and
          expect files together.
      
          The changes to the ".expect" files look like tedious boilerplate, but it
          actually makes adding new tests easier. You can now just run:
      
            perl chainlint.pl chainlint/foo.test |
            tail -n +2 >chainlint/foo.expect
      
          to save the output of the script minus the comment headers (after
          checking that it is correct, of course). Whereas before you had to strip
          the line numbers. The conversions here were done mechanically using
          something like the script above, and then spot-checked manually.
      
          It would be possible to do all of this in shell via the Makefile, but it
          gets a bit complicated (and requires a lot of extra processes). Instead,
          I've written a short perl script that generates the concatenated files
          (we already depend on perl, since chainlint.pl uses it). Incidentally,
          this improves a few other things:
      
            - we incorrectly used $(CHAINLINTTMP_SQ) inside a double-quoted
              string. So if your test directory required quoting, like:
      
                 make "TEST_OUTPUT_DIRECTORY=/tmp/h'orrible"
      
              we'd fail the chainlint tests.
      
            - the shell in the Makefile didn't handle &&-chaining correctly in its
              loops (though in practice the "sed" and "cat" invocations are not
              likely to fail).
      
            - likewise, the sed invocation to strip numbers was hiding the exit
              code of chainlint.pl itself. In practice this isn't a big deal;
              since there are linter violations in the test files, we expect it to
              exit non-zero. But we could later use exit codes to distinguish
              serious errors from expected ones.
      
            - we now use a constant number of processes, instead of scaling with
              the number of test scripts. So it should be a little faster (on my
              machine, "make check-chainlint" goes from 133ms to 73ms).
      
          There are some alternatives to this approach, but I think this is still
          a good intermediate step:
      
            1. We could invoke chainlint.pl individually on each test file, and
               compare it to the expected output (and possibly using "make" to
               avoid repeating already-done checks). This is a much bigger change
               (and we'd have to figure out what to do with the "# LINT" lines in
               the inputs). But in this case we'd still want the "expect" files to
               be annotated with line numbers. So most of what's in this patch
               would be needed anyway.
      
            2. Likewise, we could run a single chainlint.pl and feed it all of the
               scripts (with "--jobs=1" to get deterministic output). But we'd
               still need to annotate the scripts as we did here, and we'd still
               need to either assemble the "expect" file, or break apart the
               script output to compare to each individual ".expect" file.
      
          So we may pursue those in the long run, but this patch gives us more
          robust tests without too much extra work or moving in a useless
          direction.
      
          Signed-off-by: Jeff King <peff@peff.net>
          Signed-off-by: Junio C Hamano <gitster@pobox.com>
      
       74 files changed, 913 insertions(+), 894 deletions(-)
      

Being a good Gitizen

  1. Browse a few popular repositories on Github or Gitlab. Look in particular at the commit messages. Do they use a consistent format? If you’re not sure where to look, a good example is the Spark repository.

  2. ⭐️ Some projects take Git commits and change logs very seriously. Pick a free software project and find its contributor guidelines. What do they recommend doing for commits? If you don’t find great examples, see these GNU guidelines, these PostgreSQL ones, and these from Git itself.

  1. Almost all large projects have strict guidelines on how to write commit messages, to ensure that the codebase remains easy to browse and comprehensible. In Spark, the commit headers all have the format [SPARK-id][tag] message, for example [SPARK-48824][SQL] Add Identity Column SQL syntax.

  2. Detailed commit messages are very valuable: make sure that your own commit messages are precise and informative!