Introducing error reporting in optics
A frequently requested feature is the ability to report why an optic failed. It is particularly crucial when you build a
sophisticated optic. Say you have a large configuration document, and you want to focus on kafka.topics.order-events.partitions.
There may not be a partitions key, or if it exists, it may have an unexpected format, e.g. it is a String instead of an Int.
In Monocle and other optics libraries, we currently cannot provide any details about
the failure. In this blog post, I will discuss my experiments with a new optics encoding that supports detailed error reporting.
In particular, I will present a step-by-step refactoring of one specific type of optic so that you can see the intermediate
attempts as well as the final solution.
This article is written using Dotty 0.21.0-RC1. All code is available in the following GitHub
repository.
Overview of Optional
An Optional (aka Affine Traversal) is an optic used to access a section (To) of a larger object (From).
As the name implies, the area targeted by an Optional may be missing. Here is the interface of an Optional.
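As a minimal sketch (not necessarily the exact definition used in the repository), the interface could look as follows in Scala 3 / Dotty. The method names getOption, replace and >>> come from the text; the `Optional.apply` constructor is a hypothetical helper added for convenience.

```scala
// Sketch of an Optional from a larger object `From` to a section `To`.
trait Optional[From, To] { self =>
  // Returns the targeted section, or None if it is missing.
  def getOption(from: From): Option[To]

  // Replaces the targeted section; returns `from` unchanged if it is missing.
  def replace(to: To, from: From): From

  // Composes two Optionals to focus on nested data.
  def >>>[Next](other: Optional[To, Next]): Optional[From, Next] =
    new Optional[From, Next] {
      def getOption(from: From): Option[Next] =
        self.getOption(from).flatMap(to => other.getOption(to))

      def replace(next: Next, from: From): From =
        self.getOption(from) match {
          case Some(to) => self.replace(other.replace(next, to), from)
          case None     => from
        }
    }
}

object Optional {
  // Hypothetical constructor: builds an Optional from two functions.
  def apply[From, To](get: From => Option[To])(set: (To, From) => From): Optional[From, To] =
    new Optional[From, To] {
      def getOption(from: From): Option[To] = get(from)
      def replace(to: To, from: From): From = set(to, from)
    }
}
```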
Unsurprisingly, we can use an Optional to access an optional field in a class.
We can also build an Optional to target any value in a Map.
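Both cases can be sketched as follows, assuming Scala 3 / Dotty. The `User` case class and the sample values are hypothetical; `email` and `index` are the names used later in the text, but their exact definitions here are assumptions.

```scala
trait Optional[From, To] {
  def getOption(from: From): Option[To]
  def replace(to: To, from: From): From
}

// Hypothetical domain model used for illustration.
final case class User(name: String, email: Option[String])

// Targets the optional email field of a User.
val email: Optional[User, String] = new Optional[User, String] {
  def getOption(user: User): Option[String] = user.email
  // Only overwrite an existing email; never insert one.
  def replace(newEmail: String, user: User): User =
    if (user.email.isDefined) user.copy(email = Some(newEmail)) else user
}

// Targets the value stored under `key` in a Map.
def index[K, V](key: K): Optional[Map[K, V], V] = new Optional[Map[K, V], V] {
  def getOption(map: Map[K, V]): Option[V] = map.get(key)
  // Only overwrite an existing entry; never insert one.
  def replace(value: V, map: Map[K, V]): Map[K, V] =
    if (map.contains(key)) map.updated(key, value) else map
}

@main def fieldAndIndexDemo(): Unit = {
  val john = User("john", Some("john@foo.com"))
  println(email.getOption(john))                            // Some(john@foo.com)
  println(index[String, Int]("a").getOption(Map("a" -> 1))) // Some(1)
  println(index[String, Int]("b").getOption(Map("a" -> 1))) // None
}
```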
The interface follows three rules which ensure that if an Optional successfully accesses a value, then it can only
modify this particular section of the larger object and nothing else. These rules are generally checked using property-based testing:
- if getOption(from) == Some(x), then replace(x, from) == from.
- if getOption(from) == Some(x), then getOption(replace(y, from)) == Some(y).
- if getOption(from) == None, then replace(x, from) == from.
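The three rules can be written as plain Boolean checks. This is only a sketch in Scala 3 / Dotty: the helper names (`getReplace`, `replaceGet`, `noInsert`) are hypothetical, and a property-based testing library would generate the inputs instead of the hand-picked samples used here.

```scala
trait Optional[From, To] {
  def getOption(from: From): Option[To]
  def replace(to: To, from: From): From
}

def index[K, V](key: K): Optional[Map[K, V], V] = new Optional[Map[K, V], V] {
  def getOption(map: Map[K, V]): Option[V] = map.get(key)
  def replace(value: V, map: Map[K, V]): Map[K, V] =
    if (map.contains(key)) map.updated(key, value) else map
}

// Rule 1: replacing a value with itself changes nothing.
def getReplace[From, To](optional: Optional[From, To], from: From): Boolean =
  optional.getOption(from).forall(x => optional.replace(x, from) == from)

// Rule 2: after a successful replace, we read back the new value.
def replaceGet[From, To](optional: Optional[From, To], y: To, from: From): Boolean =
  optional.getOption(from).isEmpty ||
    optional.getOption(optional.replace(y, from)) == Some(y)

// Rule 3: replace is a no-op when the target is missing.
def noInsert[From, To](optional: Optional[From, To], x: To, from: From): Boolean =
  optional.getOption(from).isDefined || optional.replace(x, from) == from

@main def lawsDemo(): Unit = {
  val optional = index[String, Int]("a")
  val samples = List(
    Map.empty[String, Int],
    Map("a" -> 1),
    Map("b" -> 2),
    Map("a" -> 1, "b" -> 2)
  )
  val allHold = samples.forall { from =>
    getReplace(optional, from) &&
    replaceGet(optional, 42, from) &&
    noInsert(optional, 42, from)
  }
  println(allHold) // true
}
```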
Perhaps surprisingly, an Optional does not let you insert data. You can only change values that already exist, as this example shows.
Optional offers a rich API with more than a dozen useful combinators, but most importantly, Optional composes with other
Optionals so that you can easily access nested data structures.
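Here is a sketch of such a composition in Scala 3 / Dotty. The `users` map is hypothetical data chosen to match the text: John has an email, Marie does not, and Bob is not in the map. The definitions of `email` and `index` are assumptions.

```scala
trait Optional[From, To] { self =>
  def getOption(from: From): Option[To]
  def replace(to: To, from: From): From

  // Focus first with `self`, then with `other`.
  def >>>[Next](other: Optional[To, Next]): Optional[From, Next] =
    new Optional[From, Next] {
      def getOption(from: From): Option[Next] =
        self.getOption(from).flatMap(to => other.getOption(to))
      def replace(next: Next, from: From): From =
        self.getOption(from) match {
          case Some(to) => self.replace(other.replace(next, to), from)
          case None     => from
        }
    }
}

final case class User(name: String, email: Option[String])

val email: Optional[User, String] = new Optional[User, String] {
  def getOption(user: User): Option[String] = user.email
  def replace(newEmail: String, user: User): User =
    if (user.email.isDefined) user.copy(email = Some(newEmail)) else user
}

def index[K, V](key: K): Optional[Map[K, V], V] = new Optional[Map[K, V], V] {
  def getOption(map: Map[K, V]): Option[V] = map.get(key)
  def replace(value: V, map: Map[K, V]): Map[K, V] =
    if (map.contains(key)) map.updated(key, value) else map
}

// Hypothetical data: John has an email, Marie does not, Bob is absent.
val users: Map[String, User] = Map(
  "john"  -> User("john", Some("john@foo.com")),
  "marie" -> User("marie", None)
)

@main def compositionDemo(): Unit = {
  println((index[String, User]("john") >>> email).getOption(users))  // Some(john@foo.com)
  println((index[String, User]("marie") >>> email).getOption(users)) // None
  println((index[String, User]("bob") >>> email).getOption(users))   // None
}
```

The last two lines print the same None even though the reasons for the failures differ.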
Here is the core of the problem. Both index("marie") >>> email and index("bob") >>> email fail when applied to users. The former
because Marie doesn't have an email and the latter because Bob is not a valid user. Sadly, we have no way to distinguish
these two cases.
The issue is that Optional returns an Option which does not provide any details when it fails. If we want to report
a precise error message, we need to use something like Either. The question is then, what should be the type of the error
message? Let's start with something simple like String and see how far we can go.
Optional with String error
We only need to change the signature of getOption to return an Either[String, To]. Let's also take this opportunity to rename
this method getOrError. We should probably rename Optional too, but I don't have a better idea at the moment.
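A sketch of this first refactoring, assuming Scala 3 / Dotty. The error messages, the `User` model and the sample data are hypothetical; only the names getOrError, replace, email and index come from the text.

```scala
trait Optional[From, To] { self =>
  // Either the targeted value, or a message explaining why it is missing.
  def getOrError(from: From): Either[String, To]
  def replace(to: To, from: From): From

  def >>>[Next](other: Optional[To, Next]): Optional[From, Next] =
    new Optional[From, Next] {
      // The first failure encountered is the one reported.
      def getOrError(from: From): Either[String, Next] =
        self.getOrError(from).flatMap(to => other.getOrError(to))
      def replace(next: Next, from: From): From =
        self.getOrError(from) match {
          case Right(to) => self.replace(other.replace(next, to), from)
          case Left(_)   => from
        }
    }
}

final case class User(name: String, email: Option[String])

val email: Optional[User, String] = new Optional[User, String] {
  def getOrError(user: User): Either[String, String] =
    user.email.toRight(s"User ${user.name} has no email")
  def replace(newEmail: String, user: User): User =
    if (user.email.isDefined) user.copy(email = Some(newEmail)) else user
}

def index[V](key: String): Optional[Map[String, V], V] =
  new Optional[Map[String, V], V] {
    def getOrError(map: Map[String, V]): Either[String, V] =
      map.get(key).toRight(s"Key $key is missing")
    def replace(value: V, map: Map[String, V]): Map[String, V] =
      if (map.contains(key)) map.updated(key, value) else map
  }

@main def stringErrorDemo(): Unit = {
  val users = Map("marie" -> User("marie", None))
  println((index[User]("marie") >>> email).getOrError(users)) // Left(User marie has no email)
  println((index[User]("bob") >>> email).getOrError(users))   // Left(Key bob is missing)
}
```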
Yeah! That wasn't too difficult. We could stop here, but having the error type hardcoded to String is a bit unsatisfying. What if someone wanted to build a DSL to manipulate JSON or YAML à la JsonPath from Circe? In that case, we may want to return a path from the root element as well as an error message.
Optional with custom error
If we want the error type to be fully customisable by the users, it needs to be a type parameter of Optional, e.g.
Optional[CustomError, User, Email].
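A sketch of the parameterised version, assuming Scala 3 / Dotty. `CustomError` is a hypothetical error type carrying a path and a message, and the `index` definition is an assumption. Note that in this version both sides of >>> must share exactly the same Error type.

```scala
trait Optional[Error, From, To] { self =>
  def getOrError(from: From): Either[Error, To]
  def replace(to: To, from: From): From

  // Both sides must agree on exactly the same Error type.
  def >>>[Next](other: Optional[Error, To, Next]): Optional[Error, From, Next] =
    new Optional[Error, From, Next] {
      def getOrError(from: From): Either[Error, Next] =
        self.getOrError(from).flatMap(to => other.getOrError(to))
      def replace(next: Next, from: From): From =
        self.getOrError(from) match {
          case Right(to) => self.replace(other.replace(next, to), from)
          case Left(_)   => from
        }
    }
}

// Hypothetical error type: a path from the root plus a message.
final case class CustomError(path: List[String], message: String)

def index[V](key: String): Optional[CustomError, Map[String, V], V] =
  new Optional[CustomError, Map[String, V], V] {
    def getOrError(map: Map[String, V]): Either[CustomError, V] =
      map.get(key).toRight(CustomError(List(key), s"Key $key is missing"))
    def replace(value: V, map: Map[String, V]): Map[String, V] =
      if (map.contains(key)) map.updated(key, value) else map
  }

@main def customErrorDemo(): Unit = {
  val settings   = Map("kafka" -> Map("partitions" -> 3))
  val partitions = index[Map[String, Int]]("kafka") >>> index[Int]("partitions")
  println(partitions.getOrError(settings)) // Right(3)
  // A Left carrying the CustomError built by index.
  println(index[Int]("port").getOrError(Map("host" -> 1)))
}
```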
Now, let's define a simple configuration language with three types of values: Int, String and Object.
When we access a Config, we can experience two kinds of failure. Either the data is missing, or it has an unexpected
format, e.g. we want an Int, but it is a String. Let's also use an enumeration to encode errors.
Now, we can define a few Optionals that parse a generic Config into each data type: Int, String, or Object.
These Optionals can only fail because the config type is incorrect, so let's use a specific InvalidFormat error.
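A sketch of the configuration model and the three parsers, assuming Scala 3 / Dotty. To keep the example short and the inferred types precise, `Config` and `ConfigFailure` are modelled here as sealed traits rather than as an enumeration; the constructor names, the fields of the failures and the `parse` helper are all hypothetical.

```scala
trait Optional[Error, From, To] {
  def getOrError(from: From): Either[Error, To]
  def replace(to: To, from: From): From
}

// A tiny configuration language with three kinds of values.
sealed trait Config
final case class IntValue(value: Int) extends Config
final case class StringValue(value: String) extends Config
final case class ObjectValue(fields: Map[String, Config]) extends Config

// The two kinds of failure described in the text.
sealed trait ConfigFailure
final case class MissingKey(key: String) extends ConfigFailure
final case class InvalidFormat(message: String) extends ConfigFailure

// Hypothetical helper shared by int, str and obj.
def parse[A](expected: String)(
    extract: PartialFunction[Config, A]
)(build: A => Config): Optional[InvalidFormat, Config, A] =
  new Optional[InvalidFormat, Config, A] {
    def getOrError(config: Config): Either[InvalidFormat, A] =
      extract.lift(config).toRight(InvalidFormat(s"Expected $expected but got $config"))
    // Only overwrite a value that already has the expected shape.
    def replace(a: A, config: Config): Config =
      if (extract.isDefinedAt(config)) build(a) else config
  }

val int: Optional[InvalidFormat, Config, Int] =
  parse[Int]("Int")({ case IntValue(n) => n })(n => IntValue(n))

val str: Optional[InvalidFormat, Config, String] =
  parse[String]("String")({ case StringValue(s) => s })(s => StringValue(s))

val obj: Optional[InvalidFormat, Config, Map[String, Config]] =
  parse[Map[String, Config]]("Object")({ case ObjectValue(fields) => fields })(fields => ObjectValue(fields))

@main def parsersDemo(): Unit = {
  println(int.getOrError(IntValue(3))) // Right(3)
  println(str.getOrError(IntValue(3))) // Left(InvalidFormat(Expected String but got IntValue(3)))
}
```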
If you are familiar with the optics hierarchy, you may have noticed that int, str and obj could be a Prism (a more
specific optic). However, Optional works fine too, so let's keep things simple.
Next step, we can adapt index to return a MissingKey error.
Finally, let's define an optic called property so that we have a friendly syntax to access nested Config objects.
property checks if a Config is an Object, and then it focuses on a key of the Map. The error type of property
should be a ConfigFailure because it can fail for either reason.
Regrettably, the compiler rejects composing obj and index together because they have different failure types. MissingKey
and InvalidFormat are both ConfigFailure, but >>> is too strict. It requires both sides of >>> to have exactly the same error.
There are two solutions to this problem:
- We update the definition of obj and index to use ConfigFailure instead of a more specialised error.
- We use variance on the error type of Optional, and let the compiler automatically adapt the error type when required.
Historically, in the functional programming side of Scala, variance had a lousy reputation. However, libraries like fs2 or ZIO recently demonstrated that variance offers a great user experience in terms of type inference. The implementation is slightly more complicated, but it is completely acceptable if end-users enjoy a better experience.
Optional with covariant error
Variance is quite tricky to grasp. Fortunately, the compiler is here to help us. If we ever use the wrong variance annotation,
the compiler will let us know and usually give us some suggestions. In our case, we can deduce Optional is covariant (+)
in Error because the two core methods of Optional, getOrError and replace, only ever mention Error in output position.
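A sketch of the covariant encoding, assuming Scala 3 / Dotty. The signature of >>> uses the NewError >: Error constraint discussed in the text; the `Config` and `ConfigFailure` models (sealed traits here) and the bodies of `obj` and `index` are assumptions.

```scala
trait Optional[+Error, From, To] { self =>
  def getOrError(from: From): Either[Error, To]
  def replace(to: To, from: From): From

  // NewError >: Error lets the compiler widen both error types to a common parent.
  def >>>[NewError >: Error, Next](
      other: Optional[NewError, To, Next]
  ): Optional[NewError, From, Next] =
    new Optional[NewError, From, Next] {
      def getOrError(from: From): Either[NewError, Next] =
        self.getOrError(from).flatMap(to => other.getOrError(to))
      def replace(next: Next, from: From): From =
        self.getOrError(from) match {
          case Right(to) => self.replace(other.replace(next, to), from)
          case Left(_)   => from
        }
    }
}

sealed trait Config
final case class IntValue(value: Int) extends Config
final case class ObjectValue(fields: Map[String, Config]) extends Config

sealed trait ConfigFailure
final case class MissingKey(key: String) extends ConfigFailure
final case class InvalidFormat(message: String) extends ConfigFailure

val obj: Optional[InvalidFormat, Config, Map[String, Config]] =
  new Optional[InvalidFormat, Config, Map[String, Config]] {
    def getOrError(config: Config): Either[InvalidFormat, Map[String, Config]] =
      config match {
        case ObjectValue(fields) => Right(fields)
        case other               => Left(InvalidFormat(s"Expected Object but got $other"))
      }
    def replace(fields: Map[String, Config], config: Config): Config =
      config match {
        case ObjectValue(_) => ObjectValue(fields)
        case other          => other
      }
  }

def index(key: String): Optional[MissingKey, Map[String, Config], Config] =
  new Optional[MissingKey, Map[String, Config], Config] {
    def getOrError(map: Map[String, Config]): Either[MissingKey, Config] =
      map.get(key).toRight(MissingKey(key))
    def replace(value: Config, map: Map[String, Config]): Map[String, Config] =
      if (map.contains(key)) map.updated(key, value) else map
  }

// InvalidFormat and MissingKey are both widened to ConfigFailure.
// The annotation is only there to show the resulting type.
val port: Optional[ConfigFailure, Config, Config] = obj >>> index("port")

@main def varianceDemo(): Unit = {
  println(port.getOrError(ObjectValue(Map.empty))) // Left(MissingKey(port))
  println(port.getOrError(IntValue(1)))            // Left(InvalidFormat(Expected Object but got IntValue(1)))
}
```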
The signature of >>> is a bit scary, so let's go through an example.
obj has an error type of InvalidFormat and >>> has the constraint NewError >: Error. It means the
error type of index must be a super type of InvalidFormat. We defined index with a MissingKey error which is not
a super type of InvalidFormat. However, since Optional is covariant in Error, the Scala/Dotty compiler can automatically
upcast the error type of index to satisfy the constraint (see types in green).
ConfigFailure, AnyRef, or Any are all valid options. The compiler has some heuristics to determine which
type should be inferred. In this case, it chose the lower bound ConfigFailure; see this presentation
from Guillaume Martres for more details about type inference.
Now, let's go back to our main use case.
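A sketch of property built on the covariant encoding, assuming Scala 3 / Dotty. The models (sealed traits here), the error messages and the sample configuration are hypothetical; the names obj, index, int and property come from the text.

```scala
trait Optional[+Error, From, To] { self =>
  def getOrError(from: From): Either[Error, To]
  def replace(to: To, from: From): From
  def >>>[NewError >: Error, Next](
      other: Optional[NewError, To, Next]
  ): Optional[NewError, From, Next] =
    new Optional[NewError, From, Next] {
      def getOrError(from: From): Either[NewError, Next] =
        self.getOrError(from).flatMap(to => other.getOrError(to))
      def replace(next: Next, from: From): From =
        self.getOrError(from) match {
          case Right(to) => self.replace(other.replace(next, to), from)
          case Left(_)   => from
        }
    }
}

sealed trait Config
final case class IntValue(value: Int) extends Config
final case class StringValue(value: String) extends Config
final case class ObjectValue(fields: Map[String, Config]) extends Config

sealed trait ConfigFailure
final case class MissingKey(key: String) extends ConfigFailure
final case class InvalidFormat(message: String) extends ConfigFailure

val int: Optional[InvalidFormat, Config, Int] = new Optional[InvalidFormat, Config, Int] {
  def getOrError(config: Config): Either[InvalidFormat, Int] = config match {
    case IntValue(n) => Right(n)
    case other       => Left(InvalidFormat(s"Expected Int but got $other"))
  }
  def replace(n: Int, config: Config): Config = config match {
    case IntValue(_) => IntValue(n)
    case other       => other
  }
}

val obj: Optional[InvalidFormat, Config, Map[String, Config]] =
  new Optional[InvalidFormat, Config, Map[String, Config]] {
    def getOrError(config: Config): Either[InvalidFormat, Map[String, Config]] =
      config match {
        case ObjectValue(fields) => Right(fields)
        case other               => Left(InvalidFormat(s"Expected Object but got $other"))
      }
    def replace(fields: Map[String, Config], config: Config): Config =
      config match {
        case ObjectValue(_) => ObjectValue(fields)
        case other          => other
      }
  }

def index(key: String): Optional[MissingKey, Map[String, Config], Config] =
  new Optional[MissingKey, Map[String, Config], Config] {
    def getOrError(map: Map[String, Config]): Either[MissingKey, Config] =
      map.get(key).toRight(MissingKey(key))
    def replace(value: Config, map: Map[String, Config]): Map[String, Config] =
      if (map.contains(key)) map.updated(key, value) else map
  }

// The error types of obj and index are lifted to their common parent.
def property(key: String): Optional[ConfigFailure, Config, Config] =
  obj >>> index(key)

@main def propertyDemo(): Unit = {
  val config: Config = ObjectValue(Map(
    "kafka" -> ObjectValue(Map("partitions" -> StringValue("three")))
  ))
  val partitions = property("kafka") >>> property("partitions") >>> int
  val replicas   = property("kafka") >>> property("replicas") >>> int
  println(partitions.getOrError(config)) // Left(InvalidFormat(Expected Int but got StringValue(three)))
  println(replicas.getOrError(config))   // Left(MissingKey(replicas))
}
```

The two failures are now distinguishable, which was the original goal.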
Great, >>> automatically lifted the error of obj and index to ConfigFailure, which is precisely what we wanted.
Can we do better? Are there still some corner cases?
Imagine we have a third party library like refined,
offering an Optional to check if an Int is a PortNumber.
We may want to use this validation with our configuration DSL. However, portNumber uses a generic String error message
and not our ConfigFailure enumeration. So, the type inference algorithm will use the first common parent of
ConfigFailure and String, which is a rather useless AnyRef.
One could argue we should transform the String error message of the third-party library into our domain model
using something like mapError.
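A sketch of what mapError could look like, assuming Scala 3 / Dotty. `PortNumber` and `portNumber` are hypothetical stand-ins for the third-party optic (this is not the API of refined), and mapping the message into `InvalidFormat` is just one possible choice.

```scala
trait Optional[+Error, From, To] { self =>
  def getOrError(from: From): Either[Error, To]
  def replace(to: To, from: From): From

  // Translates the error without touching the successful path.
  def mapError[NewError](f: Error => NewError): Optional[NewError, From, To] =
    new Optional[NewError, From, To] {
      def getOrError(from: From): Either[NewError, To] = self.getOrError(from).left.map(f)
      def replace(to: To, from: From): From = self.replace(to, from)
    }

  def >>>[NewError >: Error, Next](
      other: Optional[NewError, To, Next]
  ): Optional[NewError, From, Next] =
    new Optional[NewError, From, Next] {
      def getOrError(from: From): Either[NewError, Next] =
        self.getOrError(from).flatMap(to => other.getOrError(to))
      def replace(next: Next, from: From): From =
        self.getOrError(from) match {
          case Right(to) => self.replace(other.replace(next, to), from)
          case Left(_)   => from
        }
    }
}

sealed trait Config
final case class IntValue(value: Int) extends Config
final case class StringValue(value: String) extends Config

sealed trait ConfigFailure
final case class MissingKey(key: String) extends ConfigFailure
final case class InvalidFormat(message: String) extends ConfigFailure

val int: Optional[InvalidFormat, Config, Int] = new Optional[InvalidFormat, Config, Int] {
  def getOrError(config: Config): Either[InvalidFormat, Int] = config match {
    case IntValue(n) => Right(n)
    case other       => Left(InvalidFormat(s"Expected Int but got $other"))
  }
  def replace(n: Int, config: Config): Config = config match {
    case IntValue(_) => IntValue(n)
    case other       => other
  }
}

// Hypothetical third-party optic with a plain String error.
final case class PortNumber(value: Int)

val portNumber: Optional[String, Int, PortNumber] = new Optional[String, Int, PortNumber] {
  def getOrError(n: Int): Either[String, PortNumber] =
    if (n >= 0 && n <= 65535) Right(PortNumber(n))
    else Left(s"$n is not a valid port number")
  def replace(port: PortNumber, n: Int): Int =
    if (n >= 0 && n <= 65535) port.value else n
}

// The String message is converted into our own failure type before composing.
val port: Optional[ConfigFailure, Config, PortNumber] =
  int >>> portNumber.mapError(message => InvalidFormat(message))

@main def mapErrorDemo(): Unit = {
  println(port.getOrError(IntValue(8080)))  // Right(PortNumber(8080))
  println(port.getOrError(IntValue(70000))) // Left(InvalidFormat(70000 is not a valid port number))
}
```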
However, it would be better if by default combining two unrelated errors offered a more precise type than Any or AnyRef.
In Scala 2, we are out of luck, but Dotty has some helpful features.
Composing error with union types
Union types are a new feature of Dotty. They allow
defining the most precise upper bound between two or more types. In our previous example, String | ConfigFailure would
be the best possible type to return when we compose int >>> portNumber.
We only need to change the signature of >>>; the implementation would stay the same. In my opinion, union types
simplify the signature of >>> since we don't need the constraint NewError >: Error anymore.
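A sketch of the union-type signature, assuming a Scala 3 / Dotty compiler (union types are not available in Scala 2). As before, `PortNumber`, `portNumber`, the sealed-trait models and the error messages are hypothetical.

```scala
trait Optional[+Error, From, To] { self =>
  def getOrError(from: From): Either[Error, To]
  def replace(to: To, from: From): From

  // No bound on NewError: the result simply reports either kind of error.
  def >>>[NewError, Next](
      other: Optional[NewError, To, Next]
  ): Optional[Error | NewError, From, Next] =
    new Optional[Error | NewError, From, Next] {
      def getOrError(from: From): Either[Error | NewError, Next] =
        self.getOrError(from) match {
          case Left(error) => Left(error)
          case Right(to)   => other.getOrError(to)
        }
      def replace(next: Next, from: From): From =
        self.getOrError(from) match {
          case Right(to) => self.replace(other.replace(next, to), from)
          case Left(_)   => from
        }
    }
}

sealed trait Config
final case class IntValue(value: Int) extends Config
final case class StringValue(value: String) extends Config

sealed trait ConfigFailure
final case class MissingKey(key: String) extends ConfigFailure
final case class InvalidFormat(message: String) extends ConfigFailure

val int: Optional[InvalidFormat, Config, Int] = new Optional[InvalidFormat, Config, Int] {
  def getOrError(config: Config): Either[InvalidFormat, Int] = config match {
    case IntValue(n) => Right(n)
    case other       => Left(InvalidFormat(s"Expected Int but got $other"))
  }
  def replace(n: Int, config: Config): Config = config match {
    case IntValue(_) => IntValue(n)
    case other       => other
  }
}

// Hypothetical third-party optic with a plain String error.
final case class PortNumber(value: Int)

val portNumber: Optional[String, Int, PortNumber] = new Optional[String, Int, PortNumber] {
  def getOrError(n: Int): Either[String, PortNumber] =
    if (n >= 0 && n <= 65535) Right(PortNumber(n))
    else Left(s"$n is not a valid port number")
  def replace(port: PortNumber, n: Int): Int =
    if (n >= 0 && n <= 65535) port.value else n
}

// The annotation is only there to show the resulting error type.
val port: Optional[InvalidFormat | String, Config, PortNumber] = int >>> portNumber

@main def unionDemo(): Unit = {
  println(port.getOrError(IntValue(8080)))                // Right(PortNumber(8080))
  println(port.getOrError(IntValue(70000)))               // Left(70000 is not a valid port number)
  println(port.getOrError(StringValue("http")))           // Left(InvalidFormat(Expected Int but got StringValue(http)))
  println(port.replace(PortNumber(9000), IntValue(8080))) // IntValue(9000)
}
```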
Perfect. Type inference works, and we have the most precise error type. Since we are building an Optional, we can just as easily
update an arbitrary Config (assuming we have some macro to convert a 9000 literal to PortNumber).
Conclusion and future work
I am super excited about this new encoding. It is, to my knowledge, a novel approach leveraging Scala-specific features: variance and union types. There were several attempts to add error reporting to Haskell Lens, but the main issue seems to be related to type inference (see discussion on coindexing).
In my next blog post, I will explore the impact of error reporting on the rest of the optics hierarchy, e.g. can we return errors
with other optics like Prism and Traversal? How can we add error reporting without breaking existing code? What does
it mean for Optional to have Nothing as error type? Surprisingly, we will see that variance and inheritance combine
exceptionally well and offer a compelling optics encoding.
Stay tuned. In the meantime, you can follow me on twitter or discuss this article on reddit.