Errors and Partial Functions

---

# Disclaimer
Beyond my general disclaimer, these are personal notes that may be out of date.  Read generously.

---

## Wheel of Fortune

When someone spins the wheel, which of the following would you consider to be an "Error"?

- $500
- $1000
- WEAK_SPIN
- JACKPOT
- BANKRUPT
- WHEEL_ON_FIRE
- LOSE_TURN

---

Most people will divide this list in the following way:

Valid:

- $500
- $1000
- BANKRUPT
- LOSE_TURN
- JACKPOT

Errors:

- WEAK_SPIN
- WHEEL_ON_FIRE

---

There's another way you might model this, however:

Valid:

- $500
- $1000
- BANKRUPT
- LOSE_TURN
- JACKPOT
- WEAK_SPIN

Errors:

- WHEEL_ON_FIRE

The idea here is that `spin-wheel` returns a `SpinWheelResult` instead of a `Wedge`.  `SpinWheelResult` can then include `WheelLogicViolation` among its possible values, making `WEAK_SPIN` a proper return value *instead of an error*.
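A minimal Python sketch of this second model.  The names `spin_wheel`, `WheelOnFire`, the `rotations` parameter, and the "one full rotation" rule are assumptions made for illustration; only `Wedge`, `SpinWheelResult`, and `WheelLogicViolation` come from the notes above.

```python
from enum import Enum
from typing import Union


class Wedge(Enum):
    FIVE_HUNDRED = "$500"
    ONE_THOUSAND = "$1000"
    BANKRUPT = "BANKRUPT"
    LOSE_TURN = "LOSE_TURN"
    JACKPOT = "JACKPOT"


class WheelLogicViolation(Enum):
    WEAK_SPIN = "WEAK_SPIN"


class WheelOnFire(Exception):
    """Still modeled as an error: the game cannot continue."""


# A spin yields either a wedge or a rule violation; both are ordinary values.
SpinWheelResult = Union[Wedge, WheelLogicViolation]


def spin_wheel(rotations: float, landed_on: Wedge, on_fire: bool = False) -> SpinWheelResult:
    if on_fire:
        raise WheelOnFire()
    if rotations < 1.0:  # assumed rule: a valid spin completes one full rotation
        return WheelLogicViolation.WEAK_SPIN
    return landed_on
```

Callers now branch on `WEAK_SPIN` the same way they branch on `BANKRUPT`, while `WHEEL_ON_FIRE` stays outside the return type.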

---

The core idea here is that nothing is *inherently* an error.  Rather, we choose to model things certain ways because they modularize our concepts well - we like to think of a `Wedge` as the output of spinning the wheel because it results in a conceptually concise "happy-path", but this is purely an artifact of what is an ergonomic way of *thinking about things*.

---

## Partial Functions

There is, however, a more formal way of thinking about what an error is.

---

A function `f: Domain -> Codomain` is simply a mapping from elements of a Domain to elements of a Codomain.

A **total** function is simply one that is defined for all elements of the Domain, in contrast to a **partial** function, which is undefined for certain inputs.

---

### Example: Division

The division function `/: (R,R) -> R` is undefined for the denominator value of `0`, and is therefore a **partial** function.

If we had instead defined it as `/: (R,R\{0}) -> R`, it would be a **total** function.
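The two descriptions can be sketched in Python, using `Fraction` as a stand-in for the reals.  `NonZero` and `divide_total` are hypothetical names introduced here, and Python only checks the restriction at runtime.

```python
from fractions import Fraction


def divide(numerator: Fraction, denominator: Fraction) -> Fraction:
    # Partial: raises ZeroDivisionError when denominator == 0.
    return numerator / denominator


class NonZero:
    """Hypothetical wrapper whose instances are guaranteed to be nonzero."""

    def __init__(self, value: Fraction) -> None:
        if value == 0:
            raise ValueError("NonZero cannot wrap 0")
        self.value = value


def divide_total(numerator: Fraction, denominator: NonZero) -> Fraction:
    # Total over its declared domain: every NonZero denominator has a quotient.
    return numerator / denominator.value
```

Note that the partiality has not vanished: it has moved into constructing a `NonZero`, which is where the caller now has to deal with `0`.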

What are some other examples?

---

### Arbitrariness of Totality

As we can see from the division example, Totality is mostly a *modeling* property that follows from how you choose to *describe* your function - in a very real sense, both of those functions are division, *and* both are the division that we're all familiar with.

The question of whether or not the function is "total" is simply a matter of how you want to describe its possible inputs and outputs.

---

### Partiality-Error Equivalence

The fact that partial functions exhibit the same kind of "modeling arbitrariness" as errors is not a coincidence.  Indeed, everything that we think of as an *Error* can be formalized as a *cause of Partiality* - put another way, it's *Errors* that make total functions *partial*.

The division function that accepts `0` has a `DivideByZero` error.  The function that asks a database to look up a particular value has a `NetworkTimeout` error when the internet connection gets cut.

---

## Pure & IO-based Partiality

---

### Pure Partiality

The "pure" set of concerns revolves around the *input* values that are passed into your functions.  There are typically two sources of error:

- **Static Enforcement**: The point of describing your Domain and Codomain is to only have those values enter and exit your function.  A dynamic type system, while allowing you to describe your function, does nothing to stop invalid values from reaching it at runtime.
    - Example: Python will allow you to try and divide a string.

- **Imprecise Definition**: Your Domain may be specified in a way such that there are values with no corresponding mapped values in the Codomain.
    - Example: Passing in 0 to the denominator of a division function.
    - Example: Querying a database for an ID that doesn't exist.

In today's appendix, we'll discuss Dependent Typing, a kind of type system powerful enough to eliminate the divide-by-0 problem.  This raises an interesting question: is it possible to eliminate the second class of errors entirely?

It's actually a consequence of the *Halting Problem* that type systems cannot be powerful enough to do that in general.

---

### IO Partiality

The second source of Partiality is of fundamental importance, because computing must run on *real world* machines.  It has two main sources:

- **Physical Layer**: every function ultimately runs against a real machine despite our virtualized programming model.  Therefore, *every single function is Partial*, since there are physical constraints (such as `OutOfMemory`) that can cause potentially any function to fail.
- **Side-Effect**: Virtually every function that is of *practical use* will perform some side-effect in the real world that *may fail*.
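As a small illustration of the Side-Effect case, here is a Python function that accepts every string as a path and can still fail.  `read_config` is a hypothetical name.

```python
def read_config(path: str) -> str:
    # Every str is a valid argument in the "pure" sense, yet the call is
    # partial: the file may be missing, unreadable, or on a failed disk.
    # Those failures surface as OSError subclasses such as FileNotFoundError.
    with open(path, encoding="utf-8") as handle:
        return handle.read()
```

No tightening of the input type removes this failure, because it depends on the state of the world at the moment of the call.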

---

## Practice: Query a Database

Let's say you've got a database, and you have to query a table to fetch a particular record by its ID.  What are the possible errors?  How would you classify each of them by the type of Partiality they introduce?

---

## Error Representations

Before concluding, let's do a quick survey of the different kinds of error representations that are commonly used, and discuss the kinds of Partiality they tend to represent.
--

- Sentinel Values: Returning `null`, `false`, `-1`, etc.
--

- Convention-based Sum-types: Returning `val, err := fn()` as in golang
--

- Exceptions and Panics: Java's `throw Exception` or golang's `panic()`
--

- Sum-types and Monads: Scala's `Try`, Haskell's `Maybe`, etc.

When you run through a bunch of different use cases, you start noticing that there doesn't seem to be much rhyme or reason to which error representation people pick for a particular use case - inconsistency and personal preference/familiarity dictate many of these choices.
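To make the comparison concrete, here is the same partial function, parsing a TCP port number, written in each of the four styles.  This is a Python sketch; every name in it is hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


def _is_port(text: str) -> bool:
    return text.isascii() and text.isdigit() and int(text) <= 65535


def port_sentinel(text: str) -> int:
    """Sentinel value: -1 stands for 'invalid'."""
    return int(text) if _is_port(text) else -1


def port_pair(text: str) -> Tuple[int, Optional[str]]:
    """Convention-based pair, in the style of Go's `val, err`."""
    return (int(text), None) if _is_port(text) else (0, "invalid port")


def port_raising(text: str) -> int:
    """Exception: the error leaves through a separate control-flow path."""
    if not _is_port(text):
        raise ValueError(f"invalid port: {text!r}")
    return int(text)


@dataclass(frozen=True)
class Ok:
    value: int


@dataclass(frozen=True)
class Err:
    reason: str


def port_result(text: str) -> Union[Ok, Err]:
    """Sum type: the caller must inspect which variant came back."""
    return Ok(int(text)) if _is_port(text) else Err("invalid port")
```

All four encode the same partiality; they differ in how visible the error branch is to the caller and in how easy it is to ignore.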

---

## Appendix: Dependent Types

As we established previously, it's not possible for a type-system to be strong enough to eliminate the possibility of Pure Partiality errors, but it's nonetheless interesting to explore more powerful type systems as they *can* create tighter bounds on the domains of your functions, which can *reduce* the number of Pure Partiality error cases you must handle.

```idris
import Data.So
-- The caller must supply (or let the compiler infer) a proof that y is nonzero.
safe_div : (x : Int) -> (y : Int) -> {auto p : So (y /= 0)} -> Int
safe_div x y = div x y
```

---

## Wrap-up

- Today we've equipped ourselves with a mental model for what an error is:

> For every function you write, any input for which no output value is defined, and any IO failure that prevents normal termination, is a cause of partiality and should be modeled as an error.

- Work through the exercises at (https://github.com/justin-yan/errorsandpartialfunctions) in order to develop a better feel for the monadic interface and pattern matching.

---

# Day 2: Error Handling

---

## Good Error Code

---

- Authors of code should be encouraged to write *total* functions by explicitly modeling all errors.
- Callers of code should be *forced* to handle all branches of behavior explicitly.
- These branches should ideally be handled with the same branching mechanisms all other code in the language is written with.

---

### Total Your Functions

The essence of robust software is not to *eliminate errors*, since errors are simply *branches of other behavior* that arise from how we have chosen to model our functions.

Rather, it is to *exhaustively handle those branches in an intentional fashion*.

> Error handling mechanisms exist in order to help you convert Partial functions into "Total" functions.

Let's consider:

```java
public static Rational divide(Rational numerator, Rational denominator) throws ArithmeticException
```

This signature is essentially

```
/: (Q,Q) -> Q || ArithmeticException
```

which is Total!
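The same widening of the Codomain can be written with an ordinary return value instead of an exception.  This is a Python sketch in which `DivideByZero` and `describe` are hypothetical names and `Fraction` plays the role of `Rational`.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Union


@dataclass(frozen=True)
class DivideByZero:
    numerator: Fraction


def divide(numerator: Fraction, denominator: Fraction) -> Union[Fraction, DivideByZero]:
    # Total: every pair of rationals maps to exactly one value in the codomain.
    if denominator == 0:
        return DivideByZero(numerator)
    return numerator / denominator


def describe(result: Union[Fraction, DivideByZero]) -> str:
    # The caller handles the error with the same branching as everything else.
    if isinstance(result, DivideByZero):
        return f"cannot divide {result.numerator} by zero"
    return f"quotient is {result}"
```

Either encoding makes the function total in its signature; what changes is whether the error branch travels through the return value or through the exception mechanism.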

---

### Sentinel Values

> I call it my billion-dollar mistake. It was the invention of the null reference in 1965.
>
> Tony Hoare

This is the act of returning a value that acts as the "sentinel" and represents an error (such as `null` or `-1`).

- Sentinel Value errors are implicitly contextualized (`-1` means different things depending on where the error occurred), which means that your handling must be maximally localized - invoke the function and handle the error immediately before passing the return value anywhere else (ideally before even binding it to a variable).
- This also implies that you should impeccably document your return values, since documentation and reading the source code are the only ways your callers will know those errors exist.
- Try to use things like a coalescing operator to keep this compact.

```php
$username = $_GET['username'] ?? 'not passed';
```
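The same discipline in Python, as a sketch: `str.rfind` returns the sentinel `-1` when nothing is found, and `dict.get` returns `None` for a missing key.  The helper names `extension_of` and `username_from` are hypothetical.

```python
def extension_of(filename: str) -> str:
    dot = filename.rfind(".")
    # Handle the sentinel immediately. Skipping this check would slice from
    # index 0 and silently return the whole filename as its "extension".
    if dot == -1:
        return ""
    return filename[dot + 1:]


def username_from(params: dict) -> str:
    # Rough analogue of the PHP `??` line: only a missing or None value
    # falls back to the default; an empty string is kept as-is.
    value = params.get("username")
    return "not passed" if value is None else value
```

In both helpers the sentinel never escapes the function that produced it, so no caller needs to know what `-1` or `None` meant in that context.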

---

### Exceptions

The second mechanism we'll consider is the Exception.  As separate values with explicit control flow implications, exceptions were considered an improvement over sentinel values, since they make it very difficult to silently and accidentally ignore an error.

- Bias towards using checked exceptions.  Representing your true return values in your type signature will recruit the compiler to help force your callers to robustly handle all cases as well as minimize documentation.
- Localize your try/catch clauses as much as possible.  Handle the exception and then translate to a default or other value as soon as possible so you can go back to using normal branching mechanisms (if/else, switch, polymorphism, etc.).
- *Wrap* exceptions instead of blindly propagating them.  If your system calls a database under the hood, don't catch-log-rethrow - instead, catch and throw a more domain-specific exception (e.g. `MySystemException()`).
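A sketch of the last two bullets in Python.  Python has no checked exceptions, so the first bullet does not carry over; `ConfigError` and both function names are hypothetical.

```python
class ConfigError(Exception):
    """Hypothetical domain-specific exception for this module."""


def parse_retry_count(raw: dict) -> int:
    try:
        return int(raw["retries"])
    except (KeyError, ValueError) as exc:
        # Wrap: callers see one domain-level error, and the low-level cause
        # is preserved on __cause__ for debugging.
        raise ConfigError("retries must be present and an integer") from exc


def retry_count_or_default(raw: dict, default: int = 3) -> int:
    # Localize: translate the exception to a plain value right at the call
    # site, so the rest of the code uses ordinary branching.
    try:
        return parse_retry_count(raw)
    except ConfigError:
        return default
```

The `try` blocks each cover a single call, which keeps it obvious which operation an exception came from.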

---

### Error Monads

The last model we'll consider is the error monad.  These are the `Option`, `Maybe`, `Try`, and `Result` types of the world.

They are *containers*, which means that you'll have to crack them open (and handle errors) in order to get at the real value, but it also means that the errors are all well-contextualized, making it possible to pass them around as values safely.

- Avoid using them as if they were alternatives for `if/else` statements - methods like `.isPresent()` or `.get()` should generally be avoided.  Methods like `.filter()` or `.getOrElse()` are generally preferable.
- Use `.map()` in order to defer accessing the actual value, and use `.flatMap()` in order to cleanly chain with other partial functions.
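Python's standard library has no such container, so here is a minimal hand-rolled sketch of one.  `Some`, `Nothing`, and the snake_case method names are assumptions, not a real library's API.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Some(Generic[T]):
    value: T

    def map(self, fn):
        return Some(fn(self.value))

    def flat_map(self, fn):
        return fn(self.value)  # fn itself returns Some or Nothing

    def get_or_else(self, default):
        return self.value


@dataclass(frozen=True)
class Nothing:
    def map(self, fn):
        return self

    def flat_map(self, fn):
        return self

    def get_or_else(self, default):
        return default


def parse_int(text: str):
    try:
        return Some(int(text))
    except ValueError:
        return Nothing()


def reciprocal(n: int):
    return Nothing() if n == 0 else Some(1 / n)


def half_reciprocal(text: str) -> float:
    # Two partial functions chained without a single explicit error check;
    # the container is only opened once, at the very end.
    return parse_int(text).flat_map(reciprocal).map(lambda x: x / 2).get_or_else(0.0)
```

`flat_map` is what lets `reciprocal`, itself partial, slot into the chain without nesting containers.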

---

## Well-Modeled Errors

Errors are simply other branches to be dealt with, which means error-handling code is *still code*, which means that *Modularity* matters.

In particular, there are three guiding heuristics that are particularly worth considering for error handling code:

- Authority
- Volatility Risk
- Branch Elimination

---

### Authority

One of the key questions you should be asking yourself when it comes to handling an error is simply: "Who has enough information to actually handle this error properly?"

Let's consider a case:

If you have a platform that abstracts away integrations with many different partners, how would you use the Authority heuristic to decide which errors the platform should encapsulate and which it should bubble up?

---

### Physical Failures

An interesting consequence of the Authority Heuristic is in how to deal with Physical Layer failures.

The reality is that these are inevitable and unhandleable - your code cannot remediate failures in the physical machine it runs on - and as such, they are not worth modeling.  Process Teardown should be how we proceed, and recovery must be delegated to the *meta-system* (such as a daemon or a human).

This then means that *all code* should be written as if it could fail and terminate at *any moment*, and that the state it leaves behind should be such that the meta-system's recovery process results in *correct application state*.
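One common way to get this property for a file update is the write-then-rename pattern, sketched here in Python.  `save_atomically` is a hypothetical name, and the atomicity of `os.replace` is a POSIX guarantee, so treat the sketch as platform-dependent.

```python
import os
import tempfile


def save_atomically(path: str, data: str) -> None:
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory, so that the final
    # rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())
        # A crash before this line leaves the old file untouched; a crash
        # after it leaves the new file fully in place.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

The function does not try to handle a dying process. It only arranges that whatever restarts the process finds either the old contents or the new ones, never a half-written file.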

---

### Volatility Risk

Another major modularity principle is that of Volatility Risk.

Code to handle errors is the same as code to handle the happy path - if you have components that are vulnerable to change, then you'll want to decompose your code along those boundaries, so that changes do not risk contaminating the rest of your code.

A great example of this is database IO.

The vast majority of server-side logic allows database-specific exceptions or errors to leak out of the persistence layer adapter.  This means that *anyone* in *any other* part of the codebase could `catch (SQLException e)` and write logic that is now dependent on the specific database you are using.

If you were to ever try and change the database you were using, you'd now have invisible dependencies all over the codebase that would be very difficult to tease out.
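A sketch of sealing that leak at the adapter boundary, using Python's built-in `sqlite3` module as the concrete database.  `StorageError`, `UserStore`, and the `users` table are hypothetical.

```python
import sqlite3
from typing import Optional


class StorageError(Exception):
    """Hypothetical database-agnostic error exposed by the persistence layer."""


class UserStore:
    def __init__(self, connection: sqlite3.Connection) -> None:
        self._connection = connection

    def find_name(self, user_id: int) -> Optional[str]:
        try:
            row = self._connection.execute(
                "SELECT name FROM users WHERE id = ?", (user_id,)
            ).fetchone()
        except sqlite3.Error as exc:
            # Translate at the boundary so no caller ever needs to import
            # sqlite3 just to handle a failure.
            raise StorageError("user lookup failed") from exc
        return None if row is None else row[0]
```

Swapping the database then means rewriting this adapter only, because the rest of the codebase depends on `StorageError` alone.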

---

### Branch Elimination

Arbitrariness allows us to model our functions however we want, so among the many options we should prefer the models that eliminate branches of behavior, since fewer branches means less complexity.

John Ousterhout has described this idea as "define errors out of existence".

Consider his example, the `unset` function that removes a variable.

What should this function do if the input variable already doesn't exist?
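One possible answer, sketched in Python with a dictionary standing in for the variable table.  Both function names are hypothetical.

```python
def unset_strict(variables: dict, name: str) -> None:
    # Specified as "delete an existing variable": partial, because a
    # missing name raises KeyError and every caller has a branch to handle.
    del variables[name]


def unset(variables: dict, name: str) -> None:
    # Specified as "ensure the variable does not exist afterwards": total,
    # because a missing name already satisfies the postcondition.
    variables.pop(name, None)
```

Nothing about the underlying operation changed; rewording the specification removed an error branch from every call site.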

---

## Wrap-up

I want to leave you with two heuristics to regularly use:

- Make every function I write total in its type signature.
- Ensure I handle every output value from the total functions I call.

If you consistently follow these ideas and continually seek ways to improve the expressiveness and clarity of your code, the rest will follow.