What are "ADTs"?
Algebraic Data Types.
Lovely. What the heck does that mean?
It means you can mix product and sum types together.
Okay. I have one more question, and please start answering in English: What are product and sum types?
Product types you'll be deeply familiar with - it's when you make a
new type by combining others. For example, you could define a Person
type as the combination of a String
for their name, and an Int
for their
age. Like this:
type Person
= Person String Int
...or like this:
type Person
= Person
{ name : String
, age : Int
}
In C that'd be a struct. In Java you'd probably call it a POJO. In Clojure they say, "Just use a map."
Why's it called a 'product' type?
It might help to think of how many possible values there are for
Person
now.
It's all the possible values for String
, multiplied by all the
possible values for Int
. It's the multiple, or product, of the two
types.
Okay, product types combine types together. What's a sum type?
Well, product types are a bit like an AND
operation, aren't they?
They AND
types together. If you think of it that way, you're
instantly going to start to wonder what an OR
would look like. For
that, we can pull out another old chestnut from the Big Book of Type
Examples. Chapter two, playing cards:
type Suit
= Heart
| Spade
| Club
| Diamond
A suit is a heart, or a spade, or a club, or a diamond. Any one of those four values, but only one. It's an exclusive-or, as a type.
So a sum type is just an enum?
Well...yeah I suppose it is. But that's not what makes this interesting. An ADT isn't just structs and enums, it's the free mixing of them to make sophisticated data models.
Give me a concrete example.
Glad you asked. I'm going to make a credit card payment. When the answer comes back from my payment provider, there are lots of things that could happen. Hopefully it goes well and I get a receipt ID back as a string. But maybe the card on file has expired, and I get a few details about the card to prompt the user with. Or it just blows up with different kinds of error. I can model all possible responses neatly with an ADT, like so:
type PaymentResponse
= Paid String
| CardExpired Int Date
| HttpError Int String
| AuthError
Here we have Paid
, which is a wrapper around a String
;
CardExpired
and HttpError
which are product types; and AuthError
which is a "traditional" enum value. All mixed together in a sum type
to model all possible payment responses.
Most languages don't have support for this. They might have enums, but not ones that carry data. You can't have an enum of structs or an enum of POJOs1.
Clojure's an interesting case where it does have ADTs, but they're possibly too flexible. You can mix together maps, lists and primitives to make any shape of data you like. But you can suffer with the exclusion problem. While you can say all these things are valid responses:
{:paid "eefa9112"}
{:card-expired [8783 #date "03/19"]}
{:http-error [500 "Server just exploded"]}
:auth-error
...you have a harder time saying what isn't a valid response, like:
{:card-expired "we don't know why"}
That's where Plumatic schema and spec become essential.
So is that it? Is that all an ADT is?
No. You're at the entrance to a delightful rabbit hole. ADTs can have type variables, and can be further generalised, and all sorts of useful stuff. But if you have the core idea - mixing and matching ANDs and ORs to build up data models - you're on your way.
So why is it useful?
Because data-modelling is the most important thing we do, so tools for modelling data are fundamentally interesting. But let me give you a concrete example of something cool to go away with. Once we've defined our rich data type, a suitably-savvy compiler will be able to help us use it properly. Here's some pseudo-code to handle payment responses:
handle paymentResponse =
case paymentResponse of
Paid id ->
insert db "receipts" db id
CardExpired fourDigits date ->
notifyUser ("Your card " ++ fourDigits ++ " expired on " ++ date ++ ".")
HttpError code msg ->
logError code msg
Did you spot the error in that code? You probably did, but the compiler is guaranteed to notice:
-- MISSING PATTERNS ---------------------------------------------- src/Main.elm
This `case` does not have branches for all possibilities.
...
You need to account for the following values:
PaymentResponse.AuthError
Add a branch to cover this pattern!
The compiler knows which values PaymentResponse
can have, and so can check our case
statements for completeness. In this case, it's instantly noticed that
we've ignored the authorization failure.
Sweet. Okay, last question. If there's AND and OR...where's NOT?
You mean a type that can be any type except something?
Yeah.
Interesting question. Yup, that'd still be part of an Algebra of Data Types, so it'd count. I've never seen it though. Let me know if you do...