Last updated on

Webapps

Welcome to week 9 of CS-214 — Software Construction!

This exercise set is intended to reinforce some concepts used in the webapp lab: serialization, and state machines.

As usual, ⭐️ indicates the most important exercises and questions and 🔥 indicates the most challenging ones. Exercises or questions marked 🧪 are intended to build up to concepts used in this week’s lab.

You do not need to complete all exercises to succeed in this class, and you do not need to do all exercises in the order they are written.

We strongly encourage you to solve the exercises on paper first, in groups. After completing a first draft on paper, you may want to check your solutions on your computer. To do so, you can download the scaffold code (ZIP).

Serialization / deserialization 🧪

The traditional way to handle exceptional results or errors in functional programming is the Option type, which we studied in the calculator lab. At times, this style can be a bit heavy. We’ll see why it may still be worth using in the monads week, but for this week’s exercises (and lab!) let’s use exceptions instead.

We’ll illustrate the issue with a serialization problem.

Serialization is the process of transforming data within a program into a form that can be transmitted over a network, stored in files, etc. Deserialization is the reverse operation. Once your program has state, it’s pretty natural to want to save it, and for that you need serialization.

A common serialization format is JSON. In Scala, a JSON value can be represented like this:

package ujson

enum Value:
  case Null
  case True
  case False
  case Str(s: String)
  case Num(d: Double)
  case Arr(items: ArrayBuffer[Value])
  case Obj(items: LinkedHashMap[String, Value])

A serialization function for type T is a function of type T => ujson.Value. A deserialization function is a function of type ujson.Value => Try[T]. The type Try[T] models either a Success(t: T) or a Failure, and comes with a very convenient constructor Try(…) which runs the computation , and returns a Failure if it raises an exception, and a Success otherwise. A wire is a pair of a serialization and a deserialization function; it can be represented thus:

trait Wire[T]:
  def serialize(t: T): ujson.Value
  def deserialize(js: ujson.Value): Try[T]

src/main/scala/serialization/Wire.scala

As an example, here are a serializer and a deserializer for a pair of an integer and a string:

object IntStringWire extends Wire[(Int, String)]:
  def serialize(t: (Int, String)): ujson.Value =
    ujson.Arr(
      ujson.Num(t._1),
      ujson.Str(t._2)
    )

  def deserialize(js: ujson.Value): Try[(Int, String)] =
    Try {
      val arr = js.arr // .arr throws an exception if the input isn't an array
      val (fst, snd) = (arr(0).num, arr(1).str)
      if !fst.isValidInt then
        throw IllegalArgumentException(f"Not an int: $fst")
      (fst.toInt, snd)
    }

src/main/scala/serialization/Wire.scala

And if we are given a serializer and deserializer for some type T, we can use it to create a serializer and deserializer for Option[T]:

case class OptionWire[T](w: Wire[T]) extends Wire[Option[T]]:
  def serialize(t: Option[T]): ujson.Value =
    t match
      case None =>
        ujson.Arr(ujson.Str("none"))
      case Some(t) =>
        ujson.Arr(ujson.Str("some"), w.serialize(t))

  def deserialize(js: ujson.Value): Try[Option[T]] =
    Try {
      val arr = js.arr
      val tag = arr(0).str
      if arr.size == 1 && arr(0).str == "none" then
        None
      else if arr.size == 2 && arr(0).str == "some" then
        Some(w.deserialize(arr(1)).get)
      else
        throw IllegalArgumentException(f"Unexpected: ${arr.toList}")
    }

src/main/scala/serialization/Wire.scala

  1. Read the documentation of the Try type.

  2. Skim through the examples provided in the documentation of the ujson library (you can skip the last section on Converting To/From Scala Data Types).

  3. Consider the following type:

    enum Formula:
      case Lit(b: Boolean)
      case Var(name: String)
      case FnCall(fn: String, args: List[Formula])
    

    src/main/scala/serialization/Wire.scala

    How would you serialize a value of this type? Think about representation choices on paper first. Try a few different examples by hand. Then, write the serializer and deserializer.

    object FormulaWire extends Wire[Formula]:
      import Formula.*
    
      def serialize(e: Formula): ujson.Value =
        ???
    
      def deserialize(js: ujson.Value): Try[Formula] =
        ???
    

    src/main/scala/serialization/Wire.scala

  4. All encoder/decoder pairs should verify deserialize(serialize(v)) === Success(v) for all v. Use this equation to test your serializer and deserializer together.

    (This is necessary but not sufficient for correctness; do you see why?)

Isolating state: state machines 🧪

Most stateful applications are thoroughly stateful: there’s state everywhere, and anything can change under your feet at any time. A better model, when you must have state, is to contain it in one place. This is the state machine model, where you have a single piece of state, and a function to update it.

You have already seen many examples of this pattern, without knowing it:

Another way to put this is that we have a state and instructions that are interpreted to update the state. In that sense, foldLeft and foldRight are interpreters for a language whose instructions are the elements of the list being processed.

This week’s lab will demonstrate how powerful this model is for building reliable single- and multi-user applications for desktop, mobile, and web environments. For now, let’s get used to thinking in terms of state machines. We’ll use the following model:

trait StateMachine[State, Event, Result]:
  def init: State
  def update(s: State, e: Event): State // Sometimes called *transition*
  def finish(s: State): Result

src/main/scala/rec/StateMachines.scala

def processEvents[S, E, R](sm: StateMachine[S, E, R])(events: List[E]): R =
  sm.finish(events.foldLeft(sm.init)(sm.update))

src/main/scala/rec/StateMachines.scala

Why do we bother with all this if it all boils down to a foldLeft? Because in general we are interested in modeling streaming processes, where events are user inputs, messages on a network, interrupts, etc. In that case, instead of a foldLeft, we would store the state in a var and update it every time we receive an event from the user, the network, a sensor, etc. (and there would be no finish stage)

Here is a simple example: counting how many times the word "CS" is followed by "214" in a list of words. The state has two parts:

case class State(lastWord: Option[String], occurencesCount: Int)

src/main/scala/rec/StateMachines.scala

lastWord remembers which word we saw last, and occurences stores the ongoing count.

The events and results are simpler: events are simply words, and the final result is a number:

type Event = String

src/main/scala/rec/StateMachines.scala

type Result = Int

src/main/scala/rec/StateMachines.scala

We can now define the state machine itself:

object PatternMatchingSM extends StateMachine[State, Event, Result]:

src/main/scala/rec/StateMachines.scala

… it starts with no previous word and a count of 0:

def init = State(None, 0)

src/main/scala/rec/StateMachines.scala

… every time we receive a new word, it updates its state to increment the count if needed, and replaces the last word:

def update(s: State, e: Event) =
  State(
    Some(e),
    s.occurencesCount
      + (if s.lastWord == Some("CS") && e == "214" then 1 else 0)
  )

src/main/scala/rec/StateMachines.scala

… and finally, when the stream is exhausted, it just returns the final count:

def finish(s: State) = s.occurencesCount

src/main/scala/rec/StateMachines.scala

Now, how do we use this state machine? With processEvents!

val countMatchesSM: List[String] => Int =
  processEvents(PatternMatching.PatternMatchingSM)

src/main/scala/rec/StateMachines.scala

Notice the currying: we could have also written things this way:

def countMatchesSM2(l: List[String]): Int =
  processEvents(PatternMatching.PatternMatchingSM)(l)

src/main/scala/rec/StateMachines.scala

… and then called the resulting function as usual:

scala> countMatchesSM(List("A", "special", "CS", "214", "exercise"))
1

If you find yourself unsure of how this function really works, try tracing processEvents — it helps a lot to see exactly which states are being traversed.

Counting words ⭐️

This isn’t in fact the first time that we see state machines — we’ve seen them once before, when we wrote wordCount using foldLeft:

case class WordCountState(count: Int, lastWasWS: Boolean)

def wordCount(l: List[Char]): Int =
  l.foldLeft[WordCountState](WordCountState(0, true))((state: WordCountState, c: Char) =>
    val cIsWS = c.isWhitespace
    val count = state.count + (if state.lastWasWS && !cIsWS then 1 else 0)
    WordCountState(count, cIsWS)
  ).count

src/main/scala/rec/StateMachines.scala

Rewrite this function as a state machine.

Counting parentheses

When studying parallelism, we solved the parentheses-balancing problem in the following way:

def isBalanced(str: List[Char]): Boolean =
  def loop(str: List[Char], numOpen: Int): Int =
    if numOpen < 0 then numOpen
    else
      str match
        case Nil           => numOpen
        case '(' :: next   => loop(next, numOpen + 1)
        case ')' :: next   => loop(next, numOpen - 1)
        case other :: next => loop(next, numOpen)
  loop(str, 0) == 0

src/main/scala/rec/StateMachines.scala

Rewrite this function as a state machine.