Type safety

What is type safety

Type safety means making use of what we know about our values at compile time to minimize the consequences of most mistakes.

Avoid null

  • Reason: a null can sit in any variable or value, be passed into functions, be assigned to other variables, and be stored in collections. As a result, nulls cause errors far away from where they were introduced, and those errors are difficult to track down.
  • Option[T]: used to represent a value that may or may not exist

For example, suppose we need a class Address in which street2 is optional:

case class Address(street1: String,
                   street2: Option[String],
                   city: String,
                   state: String,
                   zip: String)
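  • val: declares and initializes a value in one go, so it never sits in an uninitialized (null) state. Here collectFirst returns an Option and getOrElse supplies the default: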
def findHaoyiEmail(items: Seq[(String, String)]) = {
  val email = items
    .collectFirst { case (name, value) if name == "Haoyi" => value }
    .getOrElse("Not Found")
  doSomething(email)
}
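
To read an optional field such as street2, callers have to handle the empty case explicitly. A minimal sketch of that, assuming a hypothetical formatLabel helper that is not part of the original example:

```scala
object AddressSketch {
  case class Address(street1: String,
                     street2: Option[String],
                     city: String,
                     state: String,
                     zip: String)

  // Hypothetical helper: builds a one-line label and skips street2 when it is absent.
  def formatLabel(a: Address): String = {
    val streets = a.street1 +: a.street2.toSeq // toSeq is empty for None
    (streets ++ Seq(a.city, a.state, a.zip)).mkString(", ")
  }
}
```

Because street2 is an Option[String], there is no way to use it as a String without first deciding what to do when it is missing.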

Avoid exceptions

  • Reason:
    • The compiler doesn’t complain if you don’t handle an exception.
    • The function signature doesn’t say which exceptions can be thrown.
    • Exceptions interrupt the program flow by jumping back up the call stack to the nearest handler.
    • Exceptions are also a rather bad fit for applications with a lot of concurrency. For instance, if an Actor running on some other thread throws an exception, you cannot handle it by catching it; you want to receive a message denoting the error condition instead.
  • Option[T]: Use when we know that a value may be absent but don’t care about the reason why.
For example, when performing division, instead of returning an Int or failing with an exception when the divisor is zero, we return Option[Int]:
def divide(a: Int, b: Int): Option[Int] =
  if (b == 0) None
  else Some(a / b)

When indexing a list, instead of returning a value A for that index or failing with an exception when that index doesn’t exist, we return Option[A]:

def index[A](xs: List[A], i: Int): Option[A] =
  xs.lift(i) // None when i is out of bounds, Some(element) otherwise
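
Functions that return Option compose without any explicit null or exception checks. A minimal sketch, assuming a hypothetical ratioAt function that combines the two functions above:

```scala
object OptionSketch {
  def divide(a: Int, b: Int): Option[Int] =
    if (b == 0) None else Some(a / b)

  def index[A](xs: List[A], i: Int): Option[A] =
    xs.lift(i) // None when i is out of bounds

  // Hypothetical combination: divides the i-th element by the j-th element.
  // The result is None as soon as any step yields None.
  def ratioAt(xs: List[Int], i: Int, j: Int): Option[Int] =
    for {
      a <- index(xs, i)
      b <- index(xs, j)
      q <- divide(a, b)
    } yield q
}
```

The for-comprehension stops at the first None, so a missing index and a zero divisor are handled by the same code path.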
  • Try[T]: Use when a computation may throw. An instance of Try[A] that represents a successful computation is a Success[A], simply wrapping a value of type A. If it represents a computation in which an error has occurred, it is a Failure[A], wrapping a Throwable, i.e. an exception or other kind of error. If we know that a computation may result in an error, we can use Try[A] as the return type of our function. The drawback is that the signature does not say which type of exception to expect.

For example, let’s assume we want to write a function that parses the entered url and creates a java.net.URL from it:

import java.net.URL
import scala.util.Try

def parseURL(url: String): Try[URL] = Try(new URL(url))

As you can see, we return a value of type Try[URL]. If the given url is syntactically correct, this will be a Success[URL]. If the URL constructor throws a MalformedURLException, however, it will be a Failure[URL]. Hence, parseURL("http://danielwestheide.com") will result in a Success[URL] containing the created URL, whereas parseURL("garbage") will result in a Failure[URL] containing a MalformedURLException.
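
The caller can then handle both outcomes without a try/catch block. A minimal sketch, assuming two hypothetical helpers, describe and protocolOf, built on top of parseURL:

```scala
import java.net.URL
import scala.util.{Failure, Success, Try}

object TrySketch {
  def parseURL(url: String): Try[URL] = Try(new URL(url))

  // Hypothetical helper: describes the outcome without throwing.
  def describe(url: String): String =
    parseURL(url) match {
      case Success(u) => s"host: ${u.getHost}"
      case Failure(e) => s"invalid URL: ${e.getClass.getSimpleName}"
    }

  // Hypothetical helper: Some(protocol) on success, None when parsing failed.
  def protocolOf(url: String): Option[String] =
    parseURL(url).map(_.getProtocol).toOption
}
```

Converting with toOption discards the exception, which is appropriate only when the reason for the failure does not matter.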

  • Either[L, R]: Represents either a Left value or a Right value. By convention, Left holds the failure and Right holds the success value; map and flatMap operate on the Right side (Scala 2.12 and later).
object EitherStyle extends App {
  def parse(s: String): Either[Exception, Int] =
    if (s.matches("-?[0-9]+")) Right(s.toInt)
    else Left(new NumberFormatException(s"$s is not a valid integer."))

  def reciprocal(i: Int): Either[Exception, Double] =
    if (i == 0) Left(new IllegalArgumentException("Cannot take reciprocal of 0."))
    else Right(1.0 / i)

  def stringify(d: Double): String = d.toString

  def magic(s: String): Either[Exception, String] =
    parse(s).flatMap(reciprocal).map(stringify)

  magic("123") match {
    case Left(_: NumberFormatException) => println("not a number!")
    case Left(_: IllegalArgumentException) => println("can't take reciprocal of 0!")
    case Left(_) => println("got unknown exception")
    case Right(s) => println(s"Got reciprocal: $s")
  }
}
  • Custom sealed trait: Use to represent results with more than one failure mode. Because the hierarchy is sealed, the compiler can check that a match covers every case.
object EitherStyle extends App {
  sealed abstract class Error
  final case class NotANumber(string: String) extends Error
  final case object NoZeroReciprocal extends Error

  def parse(s: String): Either[Error, Int] =
    if (s.matches("-?[0-9]+")) Right(s.toInt)
    else Left(NotANumber(s))

  def reciprocal(i: Int): Either[Error, Double] =
    if (i == 0) Left(NoZeroReciprocal)
    else Right(1.0 / i)

  def stringify(d: Double): String = d.toString

  def magic(s: String): Either[Error, String] =
    parse(s).flatMap(reciprocal).map(stringify)

  magic("123") match {
    case Left(NotANumber(_)) => println("not a number!")
    case Left(NoZeroReciprocal) => println("can't take reciprocal of 0!")
    case Right(s) => println(s"Got reciprocal: $s")
  }
}

Avoid side effects

As an example, let’s consider a piece of code like:

var result = 0
for (i <- 0 until 10) {
  result += i
}
if (result > 10) result = result + 5
println(result) // 50 
makeUseOfResult(result)

We initialize result to some placeholder value, and then use side effects to modify result to get it ready for the makeUseOfResult function. If you leave out one of the mutations, makeUseOfResult gets invalid input and does the wrong thing:

var result = 0
for (i <- 0 until 10) {
  result += i
}
// the "if (result > 10)" step was forgotten
println(result) // 45
makeUseOfResult(result) // getting invalid input!

In this case, we should eliminate the side effects and give the different “stages” of result different names:

val summed = (0 until 10).sum
val result = if (summed > 10) summed + 5 else summed 

println(result) // 50 
makeUseOfResult(result)

As a result, leaving out one stage of the computation results in a compile error:

val summed = (0 until 10).sum 

println(result) // Compilation Failed: not found: value result 
makeUseOfResult(result)

Avoid Strings in favor of Structured data

For example, imagine we take in a string with user names and phone-numbers as comma-separated values, and want to find everyone whose phone number starts with 415 indicating they’re from California:

val phoneBook =
  """ 
    |Haoyi,6172340123 
    |Bob,4159239232 
    |Charlie,4159239232 
    |Alice,8239239232 """.stripMargin
val lines = phoneBook.split("\n") 
val californians = lines.filter(_.contains(",415")) 
// Bob,4159239232 
// Charlie,4159239232

However, when the text format changes, e.g. to tab-separated values, your code silently starts returning nothing:

val phoneBook =
  """
    |Haoyi  6172340123
    |Bob  4159239232
    |Charlie  4159239232
    |Alice  8239239232 """.stripMargin
val lines = phoneBook.split("\n")
val californians = lines.filter(_.contains(",415"))
// (empty: no line contains ",415" any more)

In such situations, the better thing to do would be to first parse the phoneBook into some structured data format (here a Seq[(String, String)]) you expect, before working with the structured data to get your answer:

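val parsedPhoneBook: Seq[(String, String)] =
  for (
    // trim each line and skip blank ones, such as the first line of the literal
    line <- phoneBook.split("\n").toSeq.map(_.trim).filter(_.nonEmpty)
  ) yield {
    // a line without exactly two comma-separated fields fails here with a MatchError
    val Array(name, phone) = line.split(",")
    (name, phone)
  }
// Seq(
// ("Haoyi", "6172340123"),
// ("Bob", "4159239232"),
// ("Charlie", "4159239232"),
// ("Alice", "8239239232")
// )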

parsedPhoneBook.filter(_._2.startsWith("415"))
// Seq( 
// ("Bob", "4159239232"), 
// ("Charlie", "4159239232") 
// )

In such a scenario, if the input data format changes unexpectedly, your code fails when computing parsedPhoneBook. Thus, data-format errors can only happen in one place, and after that the code is “safe”.
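
The parsing step can also report a bad format as a value instead of failing. A minimal sketch, assuming a hypothetical Entry case class and hypothetical parseLine and parseAll functions that are not part of the original example:

```scala
object PhoneBookSketch {
  final case class Entry(name: String, phone: String)

  // Parses one "name,phone" line; Left carries a description of the bad line.
  def parseLine(line: String): Either[String, Entry] =
    line.trim.split(",") match {
      case Array(name, phone) => Right(Entry(name.trim, phone.trim))
      case _                  => Left(s"malformed line: '$line'")
    }

  // Parses all non-blank lines; the result is the first malformed line (Left)
  // or every parsed entry in order (Right).
  def parseAll(text: String): Either[String, Seq[Entry]] = {
    val lines = text.split("\n").toSeq.map(_.trim).filter(_.nonEmpty)
    lines.foldLeft[Either[String, Seq[Entry]]](Right(Seq.empty)) { (acc, line) =>
      for {
        entries <- acc
        entry   <- parseLine(line)
      } yield entries :+ entry
    }
  }

  // Works only on structured data; format problems were already caught above.
  def californians(entries: Seq[Entry]): Seq[Entry] =
    entries.filter(_.phone.startsWith("415"))
}
```

With this shape, a switch to tab-separated input produces a Left from parseAll rather than a silently empty result.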

Choosing a data structure

For example, we need to store a phone book containing users and phone numbers, along with a lookup function that gets a user’s phone number based on their name.

def lookup(name: String): ???

We can use one of the following data structures to store the phone book:

  • Seq[(String, String)]: use when duplicate entries are allowed.
val phoneBookSeq = Seq(("a", "123"), ("a", "123"), ("a", "456"), ("b", "123"))
// Seq(("a", "123"), ("a", "123"), ("a", "456"), ("b", "123"))

The lookup function returns Seq[String], or (String, Seq[String]) (the first number plus the rest) if every name is guaranteed to have at least one phone number.

  • Set[(String, String)]: use when each (name, phone number) pair is unique.
val phoneBookSet = Set(("a", "123"), ("a", "123"), ("a", "456"), ("b", "123"))
// Set(("a", "123"), ("a", "456"), ("b", "123"))

The lookup function returns Set[String], or (String, Set[String]) (one number plus the rest) if every name is guaranteed to have at least one phone number.

  • Map[String, String]: use when phone numbers may be duplicated but each name appears only once.
val phoneBookMap = Map("a" -> "123", "a" -> "123", "a" -> "456", "b" -> "123")
// Map("a" -> "456", "b" -> "123")

The lookup function returns Option[String], or String if every name is guaranteed to have a phone number.
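
The choice of data structure therefore determines the return type of lookup. A minimal sketch of the three variants, where the names lookupSeq, lookupSet and lookupMap are hypothetical:

```scala
object LookupSketch {
  val phoneBookSeq: Seq[(String, String)] =
    Seq(("a", "123"), ("a", "123"), ("a", "456"), ("b", "123"))
  val phoneBookSet: Set[(String, String)] = phoneBookSeq.toSet
  val phoneBookMap: Map[String, String] = Map("a" -> "456", "b" -> "123")

  // Duplicates allowed: every matching number is returned, in order.
  def lookupSeq(name: String): Seq[String] =
    phoneBookSeq.collect { case (n, phone) if n == name => phone }

  // Unique pairs: each matching number is returned once.
  def lookupSet(name: String): Set[String] =
    phoneBookSet.collect { case (n, phone) if n == name => phone }

  // One number per name: the result is either present or absent.
  def lookupMap(name: String): Option[String] =
    phoneBookMap.get(name)
}
```

In each case the return type tells the caller how many results to expect, so that knowledge does not have to live in comments or documentation.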

Avoid Integer Constants

val ERROR_CODE_BAD_REQUEST = 400 
val ERROR_CODE_NOT_FOUND = 404 
val ERROR_CODE_INTERNAL_SERVER_ERROR = 500
  • Advantage:
    • An Int takes minimal memory to store or pass around
    • A named constant is more obvious than a magic number
  • Disadvantage: Nothing stops you from doing arithmetic on an ERROR_CODE constant by mistake, or from passing an arbitrary Int into a place where an ERROR_CODE constant is required. Both compile and only show up as errors at runtime:
responseErrorCode(ERROR_CODE_BAD_REQUEST + 100) // compiles, but 500 is not what was meant
val currentErrorCode: Int = 15 // compiles, but 15 is not a valid error code
  • Solution: Use case object
sealed abstract class ErrorCode(val value: Int) // val makes the code readable as .value

object ErrorCode {
  case object BadRequest extends ErrorCode(400)
  case object NotFound extends ErrorCode(404) 
  case object InternalServerError extends ErrorCode(500)
} 

responseErrorCode(ErrorCode.BadRequest + 10) // Compilation Failed 
val currentErrorCode: ErrorCode = 15 // Compilation Failed
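
A function that accepts an ErrorCode can then rely on receiving one of the three known values. A minimal sketch, assuming a hypothetical body for responseErrorCode (the original only shows calls to it) and exposing the constructor parameter as a val:

```scala
object ErrorCodeSketch {
  sealed abstract class ErrorCode(val value: Int)

  object ErrorCode {
    case object BadRequest extends ErrorCode(400)
    case object NotFound extends ErrorCode(404)
    case object InternalServerError extends ErrorCode(500)
  }

  // Hypothetical implementation. Because ErrorCode is sealed, the compiler
  // warns if one of the cases below is left out.
  def responseErrorCode(code: ErrorCode): String = code match {
    case ErrorCode.BadRequest          => s"${code.value} Bad Request"
    case ErrorCode.NotFound            => s"${code.value} Not Found"
    case ErrorCode.InternalServerError => s"${code.value} Internal Server Error"
  }
}
```

The underlying Int is still available through value when it has to be written to the wire, but it can no longer be confused with an arbitrary number.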

Avoid String Constants

val ERROR_BAD_REQUEST = "Bad request" 
val ERROR_NOT_FOUND = "Not found"
val ERROR_INTERNAL_SERVER_ERROR = "Internal server error"
  • This has the same problem as using integers: you can call .substring, .length, .toUpperCase and all other String methods on these ERROR constants, and the results are meaningless and definitely not what you want. Similarly, you can pass all sorts of Strings into places where an ERROR constant is required:
responseError(ERROR_BAD_REQUEST.substring(1, 2)) // compiles, but passes the meaningless string "a"
val currentError: String = "a" // compiles, but "a" is not a valid error
  • Solution: Use case object
sealed abstract class Error(val value: String) // val makes the message readable as .value

object Error {
  case object BadRequest extends Error("Bad request")
  case object NotFound extends Error("Not found")
  case object InternalServerError extends Error("Internal server error")
}

responseError(Error.BadRequest.substring(1, 2)) // Compilation Failed
val currentError: Error = 15 // Compilation Failed

Box Integer IDs

For example, suppose you have a function that deploys a machine. If the id is an Int or a String, passing the id of a user instead of the id of a machine does not result in a compile error. Worse, it may cause no error at runtime either and instead silently deploy the wrong machine.

Solution: Box each kind of id in its own case class

case class UserId(id: Int)
case class MachineId(id: Int)

def deploy(machineId: MachineId): Unit = ??? // body omitted
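
With boxed ids, mixing up the two kinds of id is rejected by the compiler. A minimal sketch, assuming a hypothetical body for deploy and the optional extends AnyVal, which lets the compiler avoid allocating the wrapper in many cases:

```scala
object BoxedIdSketch {
  final case class UserId(id: Int) extends AnyVal
  final case class MachineId(id: Int) extends AnyVal

  // Hypothetical body: the original only gives the signature.
  def deploy(machineId: MachineId): String =
    s"deploying machine ${machineId.id}"

  val user = UserId(7)
  val machine = MachineId(7)

  val ok = deploy(machine)
  // deploy(user) // does not compile: found UserId, required MachineId
}
```

Both ids wrap the same Int value here, yet only the MachineId is accepted by deploy.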

References

http://www.lihaoyi.com/post/StrategicScalaStylePracticalTypeSafety.html

https://typelevel.org/cats/datatypes/either.html

https://danielwestheide.com/blog/the-neophytes-guide-to-scala-part-6-error-handling-with-try/
