Last updated on
Webapps
Welcome to week 9 of CS-214 — Software Construction!
This exercise set is intended to reinforce some concepts used in the webapp
lab: serialization, and state machines.
As usual, ⭐️ indicates the most important exercises and questions and 🔥 indicates the most challenging ones. Exercises or questions marked 🧪 are intended to build up to concepts used in this week’s lab.
You do not need to complete all exercises to succeed in this class, and you do not need to do all exercises in the order they are written.
We strongly encourage you to solve the exercises on paper first, in groups. After completing a first draft on paper, you may want to check your solutions on your computer. To do so, you can download the scaffold code (ZIP).
Serialization / deserialization 🧪
The traditional way to handle exceptional results or errors in functional programming is the Option
type, which we studied in the calculator lab. At times, this style can be a bit heavy. We’ll see why it may still be worth using in the monads week, but for this week’s exercises (and lab!) let’s use exceptions instead.
We’ll illustrate the issue with a serialization problem.
Serialization is the process of transforming data within a program into a form that can be transmitted over a network, stored in files, etc. Deserialization is the reverse operation. Once your program has state, it’s pretty natural to want to save it, and for that you need serialization.
A common serialization format is JSON. In Scala, a JSON value can be represented like this:
package ujson
enum Value:
case Null
case True
case False
case Str(s: String)
case Num(d: Double)
case Arr(items: ArrayBuffer[Value])
case Obj(items: LinkedHashMap[String, Value])
A serialization function for type T
is a function of type T => ujson.Value
. A deserialization function is a function of type ujson.Value => Try[T]
. The type Try[T]
models either a Success(t: T)
or a Failure
, and comes with a very convenient constructor Try(…)
which runs the computation …
, and returns a Failure
if it raises an exception, and a Success
otherwise. A wire is a pair of a serialization and a deserialization function; it can be represented thus:
trait Wire[T]:
def serialize(t: T): ujson.Value
def deserialize(js: ujson.Value): Try[T]
src/main/scala/serialization/Wire.scala
As an example, here are a serializer and a deserializer for a pair of an integer and a string:
object IntStringWire extends Wire[(Int, String)]:
def serialize(t: (Int, String)): ujson.Value =
ujson.Arr(
ujson.Num(t._1),
ujson.Str(t._2)
)
def deserialize(js: ujson.Value): Try[(Int, String)] =
Try {
val arr = js.arr // .arr throws an exception if the input isn't an array
val (fst, snd) = (arr(0).num, arr(1).str)
if !fst.isValidInt then
throw IllegalArgumentException(f"Not an int: $fst")
(fst.toInt, snd)
}
src/main/scala/serialization/Wire.scala
And if we are given a serializer and deserializer for some type T
, we can use it to create a serializer and deserializer for Option[T]
:
case class OptionWire[T](w: Wire[T]) extends Wire[Option[T]]:
def serialize(t: Option[T]): ujson.Value =
t match
case None =>
ujson.Arr(ujson.Str("none"))
case Some(t) =>
ujson.Arr(ujson.Str("some"), w.serialize(t))
def deserialize(js: ujson.Value): Try[Option[T]] =
Try {
val arr = js.arr
val tag = arr(0).str
if arr.size == 1 && arr(0).str == "none" then
None
else if arr.size == 2 && arr(0).str == "some" then
Some(w.deserialize(arr(1)).get)
else
throw IllegalArgumentException(f"Unexpected: ${arr.toList}")
}
src/main/scala/serialization/Wire.scala
-
Read the documentation of the Try type.
-
Skim through the examples provided in the documentation of the
ujson
library (you can skip the last section onConverting To/From Scala Data Types). -
Consider the following type:
enum Formula: case Lit(b: Boolean) case Var(name: String) case FnCall(fn: String, args: List[Formula])
src/main/scala/serialization/Wire.scala
How would you serialize a value of this type? Think about representation choices on paper first. Try a few different examples by hand. Then, write the serializer and deserializer.
object FormulaWire extends Wire[Formula]: import Formula.* def serialize(e: Formula): ujson.Value = ??? def deserialize(js: ujson.Value): Try[Formula] = ???
src/main/scala/serialization/Wire.scala
-
All encoder/decoder pairs should verify
deserialize(serialize(v)) === Success(v)
for allv
. Use this equation to test your serializer and deserializer together.(This is necessary but not sufficient for correctness; do you see why?)
Isolating state: state machines 🧪
Most stateful applications are thoroughly stateful: there’s state everywhere, and anything can change under your feet at any time. A better model, when you must have state, is to contain it in one place. This is the state machine model, where you have a single piece of state, and a function to update it.
You have already seen many examples of this pattern, without knowing it:
-
Hardware circuits, e.g. the CPUs in your computer and mobile phone, are designed as combinational circuits (i.e., purely functional) circuits wiring together state elements (latches, flip-flops/registers).
-
All uses that we’ve made of
foldLeft
andfoldRight
take a starting statez
and update it once per item in their input list.
Another way to put this is that we have a state and instructions that are interpreted to update the state. In that sense, foldLeft
and foldRight
are interpreters for a language whose instructions are the elements of the list being processed.
This week’s lab will demonstrate how powerful this model is for building reliable single- and multi-user applications for desktop, mobile, and web environments. For now, let’s get used to thinking in terms of state machines. We’ll use the following model:
trait StateMachine[State, Event, Result]:
def init: State
def update(s: State, e: Event): State // Sometimes called *transition*
def finish(s: State): Result
src/main/scala/rec/StateMachines.scala
def processEvents[S, E, R](sm: StateMachine[S, E, R])(events: List[E]): R =
sm.finish(events.foldLeft(sm.init)(sm.update))
src/main/scala/rec/StateMachines.scala
Why do we bother with all this if it all boils down to a foldLeft
? Because in general we are interested in modeling streaming processes, where events are user inputs, messages on a network, interrupts, etc. In that case, instead of a foldLeft
, we would store the state in a var
and update it every time we receive an event from the user, the network, a sensor, etc. (and there would be no finish
stage)
Here is a simple example: counting how many times the word "CS"
is followed by "214"
in a list of words. The state has two parts:
case class State(lastWord: Option[String], occurencesCount: Int)
src/main/scala/rec/StateMachines.scala
lastWord
remembers which word we saw last, and occurences
stores the ongoing count.
The events and results are simpler: events are simply words, and the final result is a number:
type Event = String
src/main/scala/rec/StateMachines.scala
type Result = Int
src/main/scala/rec/StateMachines.scala
We can now define the state machine itself:
object PatternMatchingSM extends StateMachine[State, Event, Result]:
src/main/scala/rec/StateMachines.scala
… it starts with no previous word and a count of 0:
def init = State(None, 0)
src/main/scala/rec/StateMachines.scala
… every time we receive a new word, it updates its state to increment the count if needed, and replaces the last word:
def update(s: State, e: Event) =
State(
Some(e),
s.occurencesCount
+ (if s.lastWord == Some("CS") && e == "214" then 1 else 0)
)
src/main/scala/rec/StateMachines.scala
… and finally, when the stream is exhausted, it just returns the final count:
def finish(s: State) = s.occurencesCount
src/main/scala/rec/StateMachines.scala
Now, how do we use this state machine? With processEvents
!
val countMatchesSM: List[String] => Int =
processEvents(PatternMatching.PatternMatchingSM)
src/main/scala/rec/StateMachines.scala
Notice the currying: we could have also written things this way:
def countMatchesSM2(l: List[String]): Int =
processEvents(PatternMatching.PatternMatchingSM)(l)
src/main/scala/rec/StateMachines.scala
… and then called the resulting function as usual:
scala> countMatchesSM(List("A", "special", "CS", "214", "exercise"))
1
If you find yourself unsure of how this function really works, try tracing processEvents
— it helps a lot to see exactly which states are being traversed.
Counting words ⭐️
This isn’t in fact the first time that we see state machines — we’ve seen them once before, when we wrote wordCount
using foldLeft
:
case class WordCountState(count: Int, lastWasWS: Boolean)
def wordCount(l: List[Char]): Int =
l.foldLeft[WordCountState](WordCountState(0, true))((state: WordCountState, c: Char) =>
val cIsWS = c.isWhitespace
val count = state.count + (if state.lastWasWS && !cIsWS then 1 else 0)
WordCountState(count, cIsWS)
).count
src/main/scala/rec/StateMachines.scala
Rewrite this function as a state machine.
Counting parentheses
When studying parallelism, we solved the parentheses-balancing problem in the following way:
def isBalanced(str: List[Char]): Boolean =
def loop(str: List[Char], numOpen: Int): Int =
if numOpen < 0 then numOpen
else
str match
case Nil => numOpen
case '(' :: next => loop(next, numOpen + 1)
case ')' :: next => loop(next, numOpen - 1)
case other :: next => loop(next, numOpen)
loop(str, 0) == 0
src/main/scala/rec/StateMachines.scala
Rewrite this function as a state machine.