Introducing error reporting in optics
A frequently requested feature is the ability to report why an optic failed. This is particularly crucial when you build a sophisticated optic. Say you have a large configuration document, and you want to focus on kafka.topics.order-events.partitions. There may not be a partitions key, or if it exists, it may have an unexpected format, e.g. it is a String instead of an Int. In Monocle and other optics libraries, we currently cannot provide any details about the failure. In this blog post, I will discuss my experiments with a new optics encoding that supports detailed error reporting. In particular, I will present a step-by-step refactoring of one specific type of optic so that you can see the intermediate attempts as well as the final solution.
This article is written using Dotty 0.21.0-RC1. All code is available in the following GitHub repository.
Overview of Optional
An Optional (aka Affine Traversal) is an optic used to access a section (To) of a larger object (From). As the name implies, the area targeted by an Optional may be missing. Here is the interface of an Optional.
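The listing is not reproduced in this copy of the post, so here is a minimal sketch of what such an interface could look like. The method names getOption and replace and their argument order come from the rules stated in this post; the head instance is a hypothetical example added only to exercise the interface.

```scala
// Sketch of the Optional interface (a reconstruction, not the original listing).
trait Optional[From, To] {
  // Read the targeted section, or None when it is absent.
  def getOption(from: From): Option[To]

  // Overwrite the targeted section; expected to be a no-op when it is absent.
  def replace(to: To, from: From): From
}

// Hypothetical instance: targets the first element of a List.
def head[A]: Optional[List[A], A] = new Optional[List[A], A] {
  def getOption(from: List[A]): Option[A] = from.headOption
  def replace(to: A, from: List[A]): List[A] = from match {
    case _ :: tail => to :: tail
    case Nil       => Nil
  }
}
```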
Unsurprisingly, we can use an Optional to access an optional field in a class.
We can also build an Optional to target any value in a Map.
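Both cases can be sketched as follows. The User and Email classes and their fields are assumptions made for illustration; only the names email and index appear later in this post.

```scala
trait Optional[From, To] {
  def getOption(from: From): Option[To]
  def replace(to: To, from: From): From
}

// Hypothetical domain model: the email address is an optional field.
final case class Email(value: String)
final case class User(name: String, email: Option[Email])

// Targets the email of a User when there is one.
val email: Optional[User, Email] = new Optional[User, Email] {
  def getOption(from: User): Option[Email] = from.email
  def replace(to: Email, from: User): User =
    if (from.email.isDefined) from.copy(email = Some(to)) else from
}

// Targets the value stored under `key` in a Map, when the key exists.
def index[K, V](key: K): Optional[Map[K, V], V] = new Optional[Map[K, V], V] {
  def getOption(from: Map[K, V]): Option[V] = from.get(key)
  def replace(to: V, from: Map[K, V]): Map[K, V] =
    if (from.contains(key)) from.updated(key, to) else from
}
```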
The interface follows three rules which ensure that if an Optional successfully accesses a value, then it can only modify this particular section of the larger object and nothing else. These rules are generally checked using property-based testing:
- if getOption(from) == Some(x), then replace(x, from) == from.
- if getOption(from) == Some(x), then getOption(replace(y, from)) == Some(y).
- if getOption(from) == None, then replace(x, from) == from.
Perhaps surprisingly, an Optional does not let you insert data. You can only change values that already exist, as this example shows.
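A small sketch of this behaviour, using a Map-based index like the one described earlier. The stock map and its keys are made up for the example.

```scala
trait Optional[From, To] {
  def getOption(from: From): Option[To]
  def replace(to: To, from: From): From
}

// Targets the value stored under `key`; replace leaves the Map alone when the key is absent.
def index[K, V](key: K): Optional[Map[K, V], V] = new Optional[Map[K, V], V] {
  def getOption(from: Map[K, V]): Option[V] = from.get(key)
  def replace(to: V, from: Map[K, V]): Map[K, V] =
    if (from.contains(key)) from.updated(key, to) else from
}

val stock: Map[String, Int] = Map("apple" -> 3)

// The key exists, so its value is changed.
val updated: Map[String, Int] = index[String, Int]("apple").replace(10, stock)

// The key is absent, so nothing is inserted.
val unchanged: Map[String, Int] = index[String, Int]("pear").replace(10, stock)
```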
Optional offers a rich API with more than a dozen useful combinators, but most importantly, Optional composes with other Optionals so that you can easily access nested data structures.
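Here is one way composition could be written, as a sketch rather than the library's actual implementation. The >>> operator name is the one used in this post; the users map, User and Email are assumed sample data.

```scala
trait Optional[From, To] { self =>
  def getOption(from: From): Option[To]
  def replace(to: To, from: From): From

  // Compose with an Optional that starts where this one ends.
  def >>>[Next](other: Optional[To, Next]): Optional[From, Next] = new Optional[From, Next] {
    def getOption(from: From): Option[Next] =
      self.getOption(from).flatMap(other.getOption)
    def replace(to: Next, from: From): From =
      self.getOption(from) match {
        case Some(mid) => self.replace(other.replace(to, mid), from)
        case None      => from
      }
  }
}

final case class Email(value: String)
final case class User(name: String, email: Option[Email])

val email: Optional[User, Email] = new Optional[User, Email] {
  def getOption(from: User): Option[Email] = from.email
  def replace(to: Email, from: User): User =
    if (from.email.isDefined) from.copy(email = Some(to)) else from
}

def index[K, V](key: K): Optional[Map[K, V], V] = new Optional[Map[K, V], V] {
  def getOption(from: Map[K, V]): Option[V] = from.get(key)
  def replace(to: V, from: Map[K, V]): Map[K, V] =
    if (from.contains(key)) from.updated(key, to) else from
}

// Assumed sample data: John has an email, Marie has none, Bob is not registered.
val users: Map[String, User] = Map(
  "john"  -> User("John", Some(Email("john@foo.com"))),
  "marie" -> User("Marie", None)
)

val johnEmail  = (index[String, User]("john") >>> email).getOption(users)
val marieEmail = (index[String, User]("marie") >>> email).getOption(users)
val bobEmail   = (index[String, User]("bob") >>> email).getOption(users)
```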
Here is the core of the problem. Both index("marie") >>> email and index("bob") >>> email fail when applied to users. The former because Marie doesn't have an email and the latter because Bob is not a valid user. Sadly, we have no way to distinguish these two cases.
The issue is that Optional returns an Option which does not provide any details when it fails. If we want to report a precise error message, we need to use something like Either. The question is then, what should be the type of the error message? Let's start with something simple like String and see how far we can go.
Optional with String error
We only need to change the signature of getOption to return an Either[String, To]. Let's also use that occasion to rename this method getOrError. We should probably rename Optional too, but I don't have a better idea at the moment.
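A sketch of this first refactoring is shown below. The wording of the error messages, as well as User, Email and users, are assumptions chosen for the example. What it shows is that the two failing cases now produce different messages.

```scala
trait Optional[From, To] { self =>
  def getOrError(from: From): Either[String, To]
  def replace(to: To, from: From): From

  // The first failure encountered is the one reported.
  def >>>[Next](other: Optional[To, Next]): Optional[From, Next] = new Optional[From, Next] {
    def getOrError(from: From): Either[String, Next] =
      self.getOrError(from).flatMap(other.getOrError)
    def replace(to: Next, from: From): From =
      self.getOrError(from) match {
        case Right(mid) => self.replace(other.replace(to, mid), from)
        case Left(_)    => from
      }
  }
}

final case class Email(value: String)
final case class User(name: String, email: Option[Email])

val email: Optional[User, Email] = new Optional[User, Email] {
  def getOrError(from: User): Either[String, Email] =
    from.email.toRight("email is missing") // message text is an assumption
  def replace(to: Email, from: User): User =
    if (from.email.isDefined) from.copy(email = Some(to)) else from
}

def index[K, V](key: K): Optional[Map[K, V], V] = new Optional[Map[K, V], V] {
  def getOrError(from: Map[K, V]): Either[String, V] =
    from.get(key).toRight(s"key $key is missing") // message text is an assumption
  def replace(to: V, from: Map[K, V]): Map[K, V] =
    if (from.contains(key)) from.updated(key, to) else from
}

val users: Map[String, User] = Map(
  "john"  -> User("John", Some(Email("john@foo.com"))),
  "marie" -> User("Marie", None)
)

val marieEmail = (index[String, User]("marie") >>> email).getOrError(users)
val bobEmail   = (index[String, User]("bob") >>> email).getOrError(users)
```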
Yeah! That wasn't too difficult. We could stop here, but having the error type hardcoded to String is a bit unsatisfying. What if someone wanted to build a DSL to manipulate JSON or YAML à la JsonPath from Circe? In that case, we may want to return a path from the root element as well as an error message.
Optional with custom error
If we want the error type to be fully customisable by the users, it needs to be a type parameter of Optional, e.g. Optional[CustomError, User, Email].
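A sketch of the parameterised version could look like this. The MissingEmail error, User and Email are hypothetical names introduced for the example; note that in this version >>> is assumed to require the same Error type on both sides.

```scala
trait Optional[Error, From, To] { self =>
  def getOrError(from: From): Either[Error, To]
  def replace(to: To, from: From): From

  // In this version both sides must share exactly the same Error type.
  def >>>[Next](other: Optional[Error, To, Next]): Optional[Error, From, Next] =
    new Optional[Error, From, Next] {
      def getOrError(from: From): Either[Error, Next] =
        self.getOrError(from).flatMap(other.getOrError)
      def replace(to: Next, from: From): From =
        self.getOrError(from) match {
          case Right(mid) => self.replace(other.replace(to, mid), from)
          case Left(_)    => from
        }
    }
}

final case class Email(value: String)
final case class User(name: String, email: Option[Email])

// Hypothetical custom error carrying structured data instead of a message.
final case class MissingEmail(userName: String)

val email: Optional[MissingEmail, User, Email] = new Optional[MissingEmail, User, Email] {
  def getOrError(from: User): Either[MissingEmail, Email] =
    from.email.toRight(MissingEmail(from.name))
  def replace(to: Email, from: User): User =
    if (from.email.isDefined) from.copy(email = Some(to)) else from
}
```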
Now, let's define a simple configuration language with three types of values: Int, String and Object.
When we access a Config, we can experience two kinds of failure. Either the data is missing, or it is in an unexpected format, e.g. we want an Int, but it is a String. Let's also use an enumeration to encode errors.
Now, we can define a few Optionals that parse a generic Config into each data type: Int, String, or Object. These Optionals can only fail because the config type is incorrect, so let's use a specific InvalidFormat error.
If you are familiar with the optics hierarchy, you may have noticed that int, str and obj could be a Prism (a more specific optic). However, Optional works fine too, so let's keep things simple.
Next step, we can adapt index to return a MissingKey error.
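The following sketch shows one way to write int, str, obj and index with their specialised error types. To keep each error case available as its own type without relying on enum-specific typing rules, the errors are modelled here with a sealed trait and case classes, which is an assumption of this example and not necessarily the encoding used in the accompanying repository. The format and describe helpers are also hypothetical.

```scala
trait Optional[Error, From, To] {
  def getOrError(from: From): Either[Error, To]
  def replace(to: To, from: From): From
}

sealed trait Config
final case class IntConfig(value: Int) extends Config
final case class StringConfig(value: String) extends Config
final case class ObjectConfig(value: Map[String, Config]) extends Config

sealed trait ConfigFailure
final case class MissingKey(key: String) extends ConfigFailure
final case class InvalidFormat(expected: String, actual: String) extends ConfigFailure

// Name of a Config's shape, used in error values.
def describe(config: Config): String = config match {
  case _: IntConfig    => "Int"
  case _: StringConfig => "String"
  case _: ObjectConfig => "Object"
}

// Hypothetical helper: an Optional that succeeds for exactly one shape of Config.
def format[A](expected: String)(
    extract: PartialFunction[Config, A],
    wrap: A => Config
): Optional[InvalidFormat, Config, A] = new Optional[InvalidFormat, Config, A] {
  def getOrError(from: Config): Either[InvalidFormat, A] =
    extract.lift(from).toRight(InvalidFormat(expected, describe(from)))
  def replace(to: A, from: Config): Config =
    if (extract.isDefinedAt(from)) wrap(to) else from
}

val int: Optional[InvalidFormat, Config, Int] =
  format[Int]("Int")({ case IntConfig(value) => value }, value => IntConfig(value))

val str: Optional[InvalidFormat, Config, String] =
  format[String]("String")({ case StringConfig(value) => value }, value => StringConfig(value))

val obj: Optional[InvalidFormat, Config, Map[String, Config]] =
  format[Map[String, Config]]("Object")({ case ObjectConfig(value) => value }, value => ObjectConfig(value))

// index can only fail because the key is absent, hence the MissingKey error.
def index(key: String): Optional[MissingKey, Map[String, Config], Config] =
  new Optional[MissingKey, Map[String, Config], Config] {
    def getOrError(from: Map[String, Config]): Either[MissingKey, Config] =
      from.get(key).toRight(MissingKey(key))
    def replace(to: Config, from: Map[String, Config]): Map[String, Config] =
      if (from.contains(key)) from.updated(key, to) else from
  }
```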
Finally, let's define an optic called property so that we can have a friendly syntax to access nested Config objects.
property checks if a Config is an Object, and then it focuses on a key of the Map. The error type of property should be a ConfigFailure because it can fail for either reason.
Regrettably, the compiler rejects composing obj and index together because they have different failure types. MissingKey and InvalidFormat are both ConfigFailure, but >>> is too strict. It requires both sides of >>> to have exactly the same error.
There are two solutions to this problem:
- We update the definition of obj and index to use ConfigFailure instead of a more specialised error.
- We use variance on the error type of Optional, and let the compiler automatically adapt the error type when required.
Historically, in the functional programming side of Scala, variance had a lousy reputation. However, libraries like fs2 or ZIO recently demonstrated that variance offers a great user experience in terms of type inference. The implementation is slightly more complicated, but it is completely acceptable if end-users enjoy a better experience.
Optional with covariant error
Variance is quite tricky to grasp. Fortunately, the compiler is here to help us. If we ever use the wrong variance annotation, the compiler will let us know and usually give us some suggestions. In our case, we can deduce Optional is covariant (+) in Error because the two core methods of Optional, getOrError and replace, only mention Error in their output.
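A sketch of the covariant encoding follows. The constraint NewError >: Error is the one discussed in this post; the Config and ConfigFailure definitions use a sealed trait with case classes and assumed field names, and the sample document is made up.

```scala
trait Optional[+Error, From, To] { self =>
  def getOrError(from: From): Either[Error, To]
  def replace(to: To, from: From): From

  // NewError must be a supertype of Error; covariance lets `other` be upcast to it.
  def >>>[NewError >: Error, Next](other: Optional[NewError, To, Next]): Optional[NewError, From, Next] =
    new Optional[NewError, From, Next] {
      def getOrError(from: From): Either[NewError, Next] =
        self.getOrError(from).flatMap(other.getOrError)
      def replace(to: Next, from: From): From =
        self.getOrError(from) match {
          case Right(mid) => self.replace(other.replace(to, mid), from)
          case Left(_)    => from
        }
    }
}

sealed trait Config
final case class IntConfig(value: Int) extends Config
final case class StringConfig(value: String) extends Config
final case class ObjectConfig(value: Map[String, Config]) extends Config

sealed trait ConfigFailure
final case class MissingKey(key: String) extends ConfigFailure
final case class InvalidFormat(expected: String, actual: String) extends ConfigFailure

val obj: Optional[InvalidFormat, Config, Map[String, Config]] =
  new Optional[InvalidFormat, Config, Map[String, Config]] {
    def getOrError(from: Config): Either[InvalidFormat, Map[String, Config]] = from match {
      case ObjectConfig(fields) => Right(fields)
      case IntConfig(_)         => Left(InvalidFormat("Object", "Int"))
      case StringConfig(_)      => Left(InvalidFormat("Object", "String"))
    }
    def replace(to: Map[String, Config], from: Config): Config = from match {
      case ObjectConfig(_) => ObjectConfig(to)
      case other           => other
    }
  }

def index(key: String): Optional[MissingKey, Map[String, Config], Config] =
  new Optional[MissingKey, Map[String, Config], Config] {
    def getOrError(from: Map[String, Config]): Either[MissingKey, Config] =
      from.get(key).toRight(MissingKey(key))
    def replace(to: Config, from: Map[String, Config]): Map[String, Config] =
      if (from.contains(key)) from.updated(key, to) else from
  }

// InvalidFormat and MissingKey are both lifted to ConfigFailure.
def property(key: String): Optional[ConfigFailure, Config, Config] =
  obj >>> index(key)

// Hypothetical sample document.
val config: Config =
  ObjectConfig(Map("kafka" -> ObjectConfig(Map("partitions" -> IntConfig(3)))))
```

With these definitions, property("kafka") >>> property("partitions") can be applied to config, and a failure is reported as either a MissingKey or an InvalidFormat depending on which step went wrong.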
The signature of >>> is a bit scary, so let's go through an example.
obj has an error type of InvalidFormat and >>> has the constraint NewError >: Error. It means the error type of index must be a supertype of InvalidFormat. We defined index with a MissingKey error which is not a supertype of InvalidFormat. However, since Optional is covariant in Error, the Scala/Dotty compiler can automatically upcast the error of index to satisfy the constraint. ConfigFailure, AnyRef, or Any are all valid options. The compiler has some heuristics to determine which type should be inferred. In this case, it chose the lower bound, ConfigFailure; see this presentation from Guillaume Martres for more details about type inference.
Now, let's go back to our main use case.
Great, >>> automatically lifted the errors of obj and index to ConfigFailure, which is precisely what we wanted.
Can we do better? Are there still some corner cases?
Imagine we have a third-party library like refined, offering an Optional to check if an Int is a PortNumber.
We may want to use this validation with our configuration DSL. However, portNumber uses a generic String error message and not our ConfigFailure enumeration. So, the type inference algorithm will use the first common parent of ConfigFailure and String which is a rather useless AnyRef.
One could argue we should transform the String error message from the third-party library into our domain model using something like mapError.
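A possible shape for mapError is sketched below. The PortNumber class, the valid range, the error message and the choice of mapping the message into InvalidFormat are all assumptions made for the example; this is not the API of refined or Monocle.

```scala
trait Optional[+Error, From, To] { self =>
  def getOrError(from: From): Either[Error, To]
  def replace(to: To, from: From): From

  // Translate the error without changing which section is targeted.
  def mapError[NewError](f: Error => NewError): Optional[NewError, From, To] =
    new Optional[NewError, From, To] {
      def getOrError(from: From): Either[NewError, To] = self.getOrError(from).left.map(f)
      def replace(to: To, from: From): From = self.replace(to, from)
    }
}

sealed trait ConfigFailure
final case class MissingKey(key: String) extends ConfigFailure
final case class InvalidFormat(expected: String, actual: String) extends ConfigFailure

// Hypothetical stand-in for a validated type provided by a third-party library.
final case class PortNumber(value: Int)

def isValidPort(n: Int): Boolean = n >= 0 && n <= 65535

val portNumber: Optional[String, Int, PortNumber] = new Optional[String, Int, PortNumber] {
  def getOrError(from: Int): Either[String, PortNumber] =
    if (isValidPort(from)) Right(PortNumber(from))
    else Left(s"$from is not a valid port number")
  def replace(to: PortNumber, from: Int): Int =
    if (isValidPort(from)) to.value else from
}

// Convert the library's String error into our own error model.
val configPort: Optional[InvalidFormat, Int, PortNumber] =
  portNumber.mapError(message => InvalidFormat("PortNumber", message))
```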
However, it would be better if by default combining two unrelated errors offered a more precise type than Any or AnyRef. In Scala 2, we are out of luck, but Dotty has some helpful features.
Composing error with union types
Union types are a new feature of Dotty. They allow defining the most precise upper bound between two or more types. In our previous example, String | ConfigFailure would be the best possible type to return when we compose int >>> portNumber.
We only need to change the signature of >>>; the implementation would stay the same. In my opinion, union types simplify the signature of >>> since we don't need the constraint NewError >: Error anymore.
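Here is a sketch of >>> with a union type in its result. The body is written as an explicit pattern match to keep type inference simple in this example. PortNumber, the port range, the error message and the sealed-trait error model are assumptions.

```scala
trait Optional[+Error, From, To] { self =>
  def getOrError(from: From): Either[Error, To]
  def replace(to: To, from: From): From

  // No bound on NewError: the result simply reports either kind of error.
  def >>>[NewError, Next](other: Optional[NewError, To, Next]): Optional[Error | NewError, From, Next] =
    new Optional[Error | NewError, From, Next] {
      def getOrError(from: From): Either[Error | NewError, Next] =
        self.getOrError(from) match {
          case Left(error) => Left(error)
          case Right(mid)  => other.getOrError(mid)
        }
      def replace(to: Next, from: From): From =
        self.getOrError(from) match {
          case Right(mid) => self.replace(other.replace(to, mid), from)
          case Left(_)    => from
        }
    }
}

sealed trait Config
final case class IntConfig(value: Int) extends Config
final case class StringConfig(value: String) extends Config
final case class ObjectConfig(value: Map[String, Config]) extends Config

sealed trait ConfigFailure
final case class MissingKey(key: String) extends ConfigFailure
final case class InvalidFormat(expected: String, actual: String) extends ConfigFailure

val int: Optional[InvalidFormat, Config, Int] = new Optional[InvalidFormat, Config, Int] {
  def getOrError(from: Config): Either[InvalidFormat, Int] = from match {
    case IntConfig(value) => Right(value)
    case StringConfig(_)  => Left(InvalidFormat("Int", "String"))
    case ObjectConfig(_)  => Left(InvalidFormat("Int", "Object"))
  }
  def replace(to: Int, from: Config): Config = from match {
    case IntConfig(_) => IntConfig(to)
    case other        => other
  }
}

// Hypothetical stand-in for a validated type provided by a third-party library.
final case class PortNumber(value: Int)

def isValidPort(n: Int): Boolean = n >= 0 && n <= 65535

val portNumber: Optional[String, Int, PortNumber] = new Optional[String, Int, PortNumber] {
  def getOrError(from: Int): Either[String, PortNumber] =
    if (isValidPort(from)) Right(PortNumber(from))
    else Left(s"$from is not a valid port number")
  def replace(to: PortNumber, from: Int): Int =
    if (isValidPort(from)) to.value else from
}

// The error type is the union of both sides' errors.
val port: Optional[InvalidFormat | String, Config, PortNumber] = int >>> portNumber
```

In this sketch a PortNumber is built directly with its constructor, so port.replace(PortNumber(9000), someConfig) updates an IntConfig holding a valid port and leaves any other Config untouched.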
Perfect. Type inference works, and we have the most precise error type. Since we are building an Optional, we can just as easily update an arbitrary Config (assuming we have some macro to convert a 9000 literal to PortNumber).
Conclusion and future work
I am super excited about this new encoding. It is, to my knowledge, a novel approach leveraging Scala-specific features: variance and union types. There were several attempts to add error reporting to Haskell Lens, but the main issue seems to be related to type inference (see discussion on coindexing).
In my next blog post, I will explore the impact of error reporting on the rest of the optics hierarchy, e.g. can we return errors with other optics like Prism and Traversal? How can we add error reporting without breaking existing code? What does it mean for Optional to have Nothing as error type? Surprisingly, we will see that variance and inheritance combine exceptionally well and offer a compelling optics encoding.
Stay tuned. In the meantime, you can follow me on twitter or discuss this article on reddit.