17 May 2023
C# does not support sum types, which are commonly used to encode failure at the type level in statically typed functional languages. So in C#, how can we encode failure in the type system in order to keep our functions total and our signatures true (rather than throw exceptions and perforate our information pipelines in the process, making them as air tight as a strainer)?
In a previous post, I lamented the prevalent use of exceptions in C# because it break the contract stated a function’s signature, and it forces the caller of the function to know implementation details, thus violating the “program to an interface, not an implementation” mantra1.
As an alternative to restricting the function’s input type to
only valid values, I gave an example, in F#, of using
Option
as the return type. This
Option<'T>
2 type can be simply
defined as the collection of all values of T
, tagged with a
Some
data constructor, plus an additional None
value.3 Sum types like this one are
supported in F# and other (predominantly functional) languages (e.g.,
Haskell, OCaml, SML, Scala, Rust, etc.), and specifying such a type in
those languages is trivial. In F#, for example, the type is defined
as
type Option<'T> = None | Some of 'T
But how might you do this in languages that do not have native
support for sum types, like C#? Church encoding
is one way to do it.4 Church encoding, named after Alonzo
Church, is a means of representing data and operators in the untyped
lambda calculus, i.e., using only functions. To represent our two
options None
and Some of 'T
using Church
encoding, we need two functions, one for each case. What might these
functions look like?
In Haskell syntax:
= \n s -> n
none = \x n s -> s x some
In JavaScript syntax:
const none = n => s => n
const some = x => n => s => s(x)
Each function takes as arguments each of the possible cases:
n
(None
) and s
(Some
). We define none
simply as the function
than returns n
, and some
as the function that
“loads” an x
value (this corresponds to the T
value in Option<'T>
), ignores the n
value, and returns s
applied to x
. These
functions serve as a kind of
pattern
matching mechanism. For example, say we have a match
function that takes an o
function that is either
none
or some
, a default d
in case
o
is none
, and a function f
in
case o
it some
:
Haskell:
= \d f o -> o d f match
JavaScript:
const match = d => f => o => o(d)(f)
Then applying match
to some
would yield
f
applied to the x
in some
,
e.g.,
Haskell:
0 (+1) (some 2)
match == (some 2) 0 (+1)
== (\n s -> s 2) 0 (+1)
== 3
JavaScript:
match(0)(x => x+1)(some(2))
== some(2)(0)(x => x+1)
== (n => s => s(2))(0)(x => x+1)
== 3
Applying match
to none
would yield
d
, e.g.,
Haskell:
0 (+1) none
match == none 0 (+1)
== (\n s -> n) 0 (+1)
== 0
JavaScript:
match(0)(x => x+1)(none)
== none(0)(x => x+1)
== (n => s => n)(0)(x => x+1)
== 0
How do we encode this in C#? Since we want to tie these two option
together as one type, let’s define two classes None
and
Some
that implement a common interface IOption
with a Match
function.
public interface IOption<T>
{
<R>(R none, Func<T, R> some);
R Match}
public class None<T> : IOption<T>
{
public R Match<R>(R none, Func<T, R> some)
{
return none;
}
}
public class Some<T> : IOption<T>
{
private readonly T value;
public Some(T value)
{
this.value = value;
}
public R Match<R>(R none, Func<T, R> some)
{
return some(value);
}
}
The Some
constructor take a value, which corresponds to
the x in the
\x n s -> s x
lambda function above, and serves as a
kind of partial application of it.
The Match
method, as the match
function
previously shown, is used to transform the Option
type
values (None
and Some
) into other values or
even other types. For example, if we have a value
IOption opt
, we can call
.Match("this is None", s => s.ToString()); opt
to perform a transformation on opt
into a
string
. We don’t need to know if opt
is
None
or Some
, but if it is None
,
the expression will return "this is None"
, and if it is
Some
, it will return the string representation of that
value.
We can make the code more powerful by introducing a few more
functions, for example, a LINQ-like Select
and
SelectMany
(using extension methods). These functions take
a function to apply for the case where we have Some
-thing,
and otherwise simply return None
if None
is
the value5:
public static class Option
{
public static IOption<R> Select<T, R>(this IOption<T> source, Func<T, R> selector)
{
return source.Match<IOption<R>>(new None<R>(), x => new Some<R>(selector(x)));
}
public static IOption<R> SelectMany<T, R>(this IOption<T> source, Func<T, IOption<R>> f)
{
return source.Match<IOption<R>>(new None<R>(), f);
}
}
And here’s how you’d use them:
var n = new Some<int>(42);
var nplus1 = n.Select(i => i+1);
var nplus10 = n.SelectMany(i => (new Some<int>(i+10)));
To sum up, we can encode sum types using functions a-la
lambda calculus. In C#, we can use this in combination with classes and
interfaces to have a form of sum type encoding in the type system. Here,
we have only demonstrated the implementation of one particular sum type
(Option<'T>
), but Curch encoding is generic, so any
other sum type can be encoded this way.
Ironically, C# has (and encourages the use of)
interfaces in the syntactical sense (i.e., the interface
construct, e.g., IEnumerable
), presumably to adhere to the
mantra, yet all the exception throwing forces us to peek into the
implementations to know 1) if something could fail, 2) how it could
fail, 3) with what exception(s) a failure is communicated; all this so
we know if and how to handle failures via exceptions for each and every
function call we make. As a bonus, these failure cases (presumably an
important part of a program’s flow) communicated via exceptions are not
compile-time checked.↩︎
Note that this F# Option<'T>
type,
with data constructors None
and Some
, is
exactly the same as Haskell’s Maybe a
type, with data
constructors Nothing
and Just
.↩︎
Readers familiar with C# will know that later versions
of the language now support
nullable
value types as a way to capture some of this functionality. However,
note that, while we demonstrate church encoding for a
“there-may-be-no-value” situations, this encoding is generic and can be
applied to any sum type, e.g., the F# Result
(Haskell
Either
) and many others.↩︎
If you search the web for “church-encoding sum types” you will find many articles explaining and demonstrating this encoding in a variety of languages.↩︎
For those familiar with Haskell, you’ll know that these methods essentially (though not strictly) correspond to the functor and monad type classes.↩︎