
Variant Forms of Entities Using Type Families

Like most companies, we use datatypes to represent database entities. But we realized that there can be slightly different forms of an entity that require different types for the fields. Rather than create entirely new datatypes to handle each combination, each with their own fields, we used a parameterized data type, and type families to select the varying types for each field.

Example: CRUD

You might have the following datatype to represent a User. The Maybe fields correspond to nullable columns in the database table. For this example, suppose that the database provides the IDs when you create a record.

    data User = User
      { userId      :: Int
      , userName    :: Text
      , phoneNumber :: Text
      , email       :: Maybe Text
      , deletedAt   :: Maybe UTCTime
      }

Now suppose you want to have a function to create users. What would its type be? Perhaps:

    createUser :: User -> IO User

But there’s a problem here - we don’t know what the ID is, so we can’t specify it on creation. So perhaps the userId field should have the type Maybe Int? Then we could supply a User with userId = Nothing, and get back a User with userId = Just some_id.

But then we’ve lost the guarantee in the type system that we get back an ID. And every function that accepts a User as an argument must now handle the case that the userId might be Nothing.

We could make a separate UserCreate datatype, but we’d have to duplicate most of the field definitions. And we couldn’t write code that could accept either a UserCreate or a User as an argument.

Also, when we’re creating a record, it wouldn’t make sense to set deletedAt.

We can use a data kind and a type family to solve this problem:

    {-# LANGUAGE DataKinds #-}
    {-# LANGUAGE TypeFamilies #-}

    data Mode = CreateMode | ReadMode

    type family UserIdType (mode :: Mode) where
      UserIdType CreateMode = ()
      UserIdType ReadMode   = Int

    type family DeletedAtType (mode :: Mode) where
      DeletedAtType CreateMode = ()
      DeletedAtType ReadMode   = Maybe UTCTime

    data User (mode :: Mode) = User
      { userId      :: UserIdType mode
      , userName    :: Text
      , phoneNumber :: Text
      , email       :: Maybe Text
      , deletedAt   :: DeletedAtType mode
      }

The Mode datatype represents our two possible forms of the User datatype: CreateMode when we are creating a user, and ReadMode when we are reading a user record.
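As a minimal sketch of how these two modes might be used, the block below gives createUser the type User 'CreateMode -> IO (User 'ReadMode). The IORef counter standing in for the database, and the names newUserStore, contactLine, and exampleNewUser, are assumptions made for illustration; a real implementation would run an INSERT and read back the generated ID.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE TypeFamilies #-}

import Data.IORef (IORef, atomicModifyIORef', newIORef)
import Data.Text (Text)
import Data.Time (UTCTime)

data Mode = CreateMode | ReadMode

type family UserIdType (mode :: Mode) where
  UserIdType 'CreateMode = ()
  UserIdType 'ReadMode   = Int

type family DeletedAtType (mode :: Mode) where
  DeletedAtType 'CreateMode = ()
  DeletedAtType 'ReadMode   = Maybe UTCTime

data User (mode :: Mode) = User
  { userId      :: UserIdType mode
  , userName    :: Text
  , phoneNumber :: Text
  , email       :: Maybe Text
  , deletedAt   :: DeletedAtType mode
  }

-- Hypothetical stand-in for the database: holds the last ID handed out.
newUserStore :: IO (IORef Int)
newUserStore = newIORef 0

-- The caller cannot supply an ID or a deletion time (both are ()),
-- and the result is guaranteed to carry a real Int ID.
createUser :: IORef Int -> User 'CreateMode -> IO (User 'ReadMode)
createUser store new = do
  newId <- atomicModifyIORef' store (\n -> (n + 1, n + 1))
  pure User
    { userId      = newId
    , userName    = userName new
    , phoneNumber = phoneNumber new
    , email       = email new
    , deletedAt   = Nothing
    }

-- Code that only touches mode-independent fields works for every mode.
contactLine :: User mode -> Text
contactLine u = userName u <> " <" <> phoneNumber u <> ">"

exampleNewUser :: User 'CreateMode
exampleNewUser = User
  { userId      = ()
  , userName    = "Ada"
  , phoneNumber = "555-0100"
  , email       = Nothing
  , deletedAt   = ()
  }
```

Note that contactLine accepts both the value being created and the value read back, which is what a separate UserCreate datatype would not allow.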

Now suppose we also want to be able to update these records. When we update them, we must specify userId. We can’t specify deletedAt. As for the other fields, we may or may not want to update each field. We can use a Maybe type to indicate whether it should be updated.

    data Mode = ReadMode | CreateMode | UpdateMode

    type family UserIdType (mode :: Mode) where
      UserIdType CreateMode = ()
      UserIdType ReadMode   = Int
      UserIdType UpdateMode = Int

    type family DeletedAtType (mode :: Mode) where
      DeletedAtType CreateMode = ()
      DeletedAtType ReadMode   = Maybe UTCTime
      DeletedAtType UpdateMode = ()

    type family Updatable (mode :: Mode) t where
      Updatable CreateMode t = t
      Updatable ReadMode   t = t
      Updatable UpdateMode t = Maybe t

    data User (mode :: Mode) = User
      { userId      :: UserIdType mode
      , userName    :: Updatable mode Text
      , phoneNumber :: Updatable mode Text
        -- email stays nullable, so an update carries Maybe (Maybe Text)
      , email       :: Updatable mode (Maybe Text)
      , deletedAt   :: DeletedAtType mode
      }
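To illustrate how an update record might be consumed, here is a hedged sketch of a pure merge function. The name applyUpdate and the example values are hypothetical, the record is trimmed of deletedAt for brevity, and the sketch assumes email is declared as Updatable mode (Maybe Text) so that the column stays nullable as in the first example. It also assumes the caller has already looked up the stored record by the update's userId.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE TypeFamilies #-}

import Data.Maybe (fromMaybe)
import Data.Text (Text)

data Mode = ReadMode | CreateMode | UpdateMode

type family UserIdType (mode :: Mode) where
  UserIdType 'CreateMode = ()
  UserIdType 'ReadMode   = Int
  UserIdType 'UpdateMode = Int

type family Updatable (mode :: Mode) t where
  Updatable 'CreateMode t = t
  Updatable 'ReadMode   t = t
  Updatable 'UpdateMode t = Maybe t

-- Trimmed version of the article's User (deletedAt omitted for brevity).
data User (mode :: Mode) = User
  { userId      :: UserIdType mode
  , userName    :: Updatable mode Text
  , phoneNumber :: Updatable mode Text
  , email       :: Updatable mode (Maybe Text)
  }

-- Nothing in an update field means "leave unchanged"; Just v means "set to v".
-- For email, Just Nothing clears the column and Just (Just e) sets it.
applyUpdate :: User 'UpdateMode -> User 'ReadMode -> User 'ReadMode
applyUpdate upd old = User
  { userId      = userId old
  , userName    = fromMaybe (userName old) (userName upd)
  , phoneNumber = fromMaybe (phoneNumber old) (phoneNumber upd)
  , email       = fromMaybe (email old) (email upd)
  }

existingUser :: User 'ReadMode
existingUser = User
  { userId      = 7
  , userName    = "Ada"
  , phoneNumber = "555-0100"
  , email       = Just "ada@example.com"
  }

-- Rename the user and clear the email; the phone number is left alone.
renameAndClearEmail :: User 'UpdateMode
renameAndClearEmail = User
  { userId      = 7
  , userName    = Just "Ada L."
  , phoneNumber = Nothing
  , email       = Just Nothing
  }
```

Evaluating applyUpdate renameAndClearEmail existingUser in GHCi yields a User 'ReadMode whose name has changed, whose email is Nothing, and whose phone number is untouched.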

Example: Denormalization

Another way in which we can use type families is for normalized vs. denormalized versions of a datatype. Sometimes we want a record that corresponds directly to the database record; other times we want to attach a list of records from a child table.

For example, perhaps a user can have any number of addresses. So we have a table of addresses with a foreign key referencing the user table.

    data Mode = CreateMode | ReadMode | UpdateMode

    data ShouldIncludeAddresses = WithoutAddresses | WithAddresses

    type family UserIdType (mode :: Mode) where
      UserIdType ReadMode   = Int
      UserIdType CreateMode = ()
      UserIdType UpdateMode = Int

    type family DeletedAtType (mode :: Mode) where
      DeletedAtType ReadMode   = Maybe UTCTime
      DeletedAtType CreateMode = ()
      DeletedAtType UpdateMode = ()

    type family Updatable (mode :: Mode) t where
      Updatable ReadMode   t = t
      Updatable CreateMode t = t
      Updatable UpdateMode t = Maybe t

    type family AddressesType
        (mode :: Mode)
        (shouldIncludeAddresses :: ShouldIncludeAddresses) where
      AddressesType CreateMode WithoutAddresses = ()
      AddressesType CreateMode WithAddresses    = [Address]
      AddressesType ReadMode   WithoutAddresses = ()
      AddressesType ReadMode   WithAddresses    = [Address]
      AddressesType UpdateMode WithoutAddresses = ()
      AddressesType UpdateMode WithAddresses    = Maybe [Address]

    data User (mode :: Mode) (shouldIncludeAddresses :: ShouldIncludeAddresses) =
      User
        { userId      :: UserIdType mode
        , userName    :: Updatable mode Text
        , phoneNumber :: Updatable mode Text
          -- email stays nullable, so an update carries Maybe (Maybe Text)
        , email       :: Updatable mode (Maybe Text)
        , addresses   :: AddressesType mode shouldIncludeAddresses
        , deletedAt   :: DeletedAtType mode
        }

I haven’t detailed the Address type here, but suppose that it’s imported from another module. It has an addressId and a userId as a foreign key.

Whether or not you allow creating/updating along with addresses is up to you. It can be tricky to handle the logic. If you do not want to allow those modes, then set the type for that particular configuration to ().
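As a rough sketch of moving from the normalized to the denormalized form, the function below attaches child rows to a user that was read without them. The Address fields, the attachAddresses name, and the sample data are hypothetical, and User is trimmed to the fields that matter here. The foreign key is called addressUserId only so that it does not clash with User's userId field inside one module.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE TypeFamilies #-}

import Data.Text (Text)

data Mode = CreateMode | ReadMode | UpdateMode
data ShouldIncludeAddresses = WithoutAddresses | WithAddresses

-- Hypothetical Address record; the article imports it from another module.
data Address = Address
  { addressId     :: Int
  , addressUserId :: Int
  , street        :: Text
  }

type family UserIdType (mode :: Mode) where
  UserIdType 'ReadMode   = Int
  UserIdType 'CreateMode = ()
  UserIdType 'UpdateMode = Int

type family Updatable (mode :: Mode) t where
  Updatable 'ReadMode   t = t
  Updatable 'CreateMode t = t
  Updatable 'UpdateMode t = Maybe t

type family AddressesType (mode :: Mode) (inc :: ShouldIncludeAddresses) where
  AddressesType 'CreateMode 'WithoutAddresses = ()
  AddressesType 'CreateMode 'WithAddresses    = [Address]
  AddressesType 'ReadMode   'WithoutAddresses = ()
  AddressesType 'ReadMode   'WithAddresses    = [Address]
  AddressesType 'UpdateMode 'WithoutAddresses = ()
  AddressesType 'UpdateMode 'WithAddresses    = Maybe [Address]

-- Trimmed version of the article's User.
data User (mode :: Mode) (inc :: ShouldIncludeAddresses) = User
  { userId    :: UserIdType mode
  , userName  :: Updatable mode Text
  , addresses :: AddressesType mode inc
  }

-- Keep only the addresses whose foreign key points at this user.
attachAddresses
  :: [Address]
  -> User 'ReadMode 'WithoutAddresses
  -> User 'ReadMode 'WithAddresses
attachAddresses allAddresses user = User
  { userId    = userId user
  , userName  = userName user
  , addresses = filter (\a -> addressUserId a == userId user) allAddresses
  }

sampleUser :: User 'ReadMode 'WithoutAddresses
sampleUser = User { userId = 1, userName = "Ada", addresses = () }

sampleAddresses :: [Address]
sampleAddresses =
  [ Address 10 1 "1 Main St"
  , Address 11 2 "2 Oak Ave"
  , Address 12 1 "3 Pine Rd"
  ]
```

A function that needs the child rows can then demand User 'ReadMode 'WithAddresses in its signature, so forgetting to load them becomes a type error instead of an empty list.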

Example: Start/end times

Suppose you want to represent a trip for a rider. The rider waits for the vehicle to pick them up, gets picked up and driven to the dropoff point, where they will get dropped off. You might represent it like this:

    data TripState =
        TripNotStarted
      | WaitingForPickup
      | OnBoard
      | TripCompleted

    data LatLong = LatLong { lat :: Double, long :: Double }

    data Trip = Trip
      { tripId       :: Int
      , userId       :: Int
      , pickupPoint  :: LatLong
      , dropoffPoint :: LatLong
      , state        :: TripState
      , pickupTime   :: Maybe UTCTime
      , dropoffTime  :: Maybe UTCTime
      }

It can be useful to track the trip state. The problem here is that we are forced to leave pickupTime and dropoffTime as Maybe types, even though the state should tell us whether each one is set. Sometimes we want to write functions that only handle trips in a certain state. For example, recordDropoff :: Trip -> UTCTime -> IO () would have to check that the state is correct, and even then it can’t assume that dropoffTime is Nothing, so it has to check that, too.

You could build the times into the TripState datatype:

    data TripState =
        TripNotStarted
      | WaitingForPickup
      | OnBoard { pickupTime :: UTCTime }
      | TripCompleted { pickupTime :: UTCTime, dropoffTime :: UTCTime }

However, this makes pickupTime and dropoffTime partial functions: if one is called on a constructor that lacks the field, it will throw an exception. It also doesn’t allow you to constrain a function to take only trips in a particular state, and you give up having a simple enumerated type. Using data kinds and type families, we can write it like this (retaining the original TripState definition):

    type family PickupTimeType (state :: TripState) where
      PickupTimeType TripNotStarted   = ()
      PickupTimeType WaitingForPickup = ()
      PickupTimeType OnBoard          = UTCTime
      PickupTimeType TripCompleted    = UTCTime

    type family DropoffTimeType (state :: TripState) where
      DropoffTimeType TripNotStarted   = ()
      DropoffTimeType WaitingForPickup = ()
      DropoffTimeType OnBoard          = ()
      DropoffTimeType TripCompleted    = UTCTime

    data Trip (state :: TripState) = Trip
      { tripId       :: Int
      , userId       :: Int
      , pickupPoint  :: LatLong
      , dropoffPoint :: LatLong
      , state        :: TripState
      , pickupTime   :: PickupTimeType state
      , dropoffTime  :: DropoffTimeType state
      }

Now we can type recordDropoff as Trip 'OnBoard -> UTCTime -> IO (), so that it can only be called if the trip is known to be in the right state.
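A state transition typed this way might look like the sketch below. It uses a pure variant, here called completeTrip (a hypothetical name), that returns the completed trip rather than writing to a database; recordDropoff in IO could call it before persisting. The helper minutesPastMidnight and the example trip are also made up for illustration.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE TypeFamilies #-}

import Data.Time (UTCTime (..), fromGregorian, secondsToDiffTime)

data TripState = TripNotStarted | WaitingForPickup | OnBoard | TripCompleted
  deriving (Eq, Show)

data LatLong = LatLong { lat :: Double, long :: Double }

type family PickupTimeType (state :: TripState) where
  PickupTimeType 'TripNotStarted   = ()
  PickupTimeType 'WaitingForPickup = ()
  PickupTimeType 'OnBoard          = UTCTime
  PickupTimeType 'TripCompleted    = UTCTime

type family DropoffTimeType (state :: TripState) where
  DropoffTimeType 'TripNotStarted   = ()
  DropoffTimeType 'WaitingForPickup = ()
  DropoffTimeType 'OnBoard          = ()
  DropoffTimeType 'TripCompleted    = UTCTime

data Trip (state :: TripState) = Trip
  { tripId       :: Int
  , userId       :: Int
  , pickupPoint  :: LatLong
  , dropoffPoint :: LatLong
  , state        :: TripState
  , pickupTime   :: PickupTimeType state
  , dropoffTime  :: DropoffTimeType state
  }

-- Only an on-board trip can be completed. The pickup time is known to be
-- present, and the dropoff time is known to be absent, without any checks.
completeTrip :: Trip 'OnBoard -> UTCTime -> Trip 'TripCompleted
completeTrip trip droppedAt = Trip
  { tripId       = tripId trip
  , userId       = userId trip
  , pickupPoint  = pickupPoint trip
  , dropoffPoint = dropoffPoint trip
  , state        = TripCompleted
  , pickupTime   = pickupTime trip
  , dropoffTime  = droppedAt
  }

-- Illustrative helper: a timestamp on an arbitrary fixed day.
minutesPastMidnight :: Integer -> UTCTime
minutesPastMidnight m =
  UTCTime (fromGregorian 2024 1 1) (secondsToDiffTime (m * 60))

exampleTrip :: Trip 'OnBoard
exampleTrip = Trip
  { tripId       = 1
  , userId       = 7
  , pickupPoint  = LatLong 37.77 (-122.42)
  , dropoffPoint = LatLong 37.80 (-122.27)
  , state        = OnBoard
  , pickupTime   = minutesPastMidnight 540
  , dropoffTime  = ()
  }
```

Passing a Trip 'WaitingForPickup to completeTrip is rejected by the type checker, which is the guarantee the Maybe-based record could not give.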

But there’s still too much wiggle room. The value in the state field doesn’t necessarily correspond to the type parameter. We can fix this if we make singletons for our TripState type:

    import Data.Singletons.TH

    $(singletons [d|
      data TripState =
          TripNotStarted
        | WaitingForPickup
        | OnBoard
        | TripCompleted
      |])

    -- PickupTimeType and DropoffTimeType must be declared after the splice,
    -- because TripState is only in scope below it.

    data Trip (state :: TripState) = Trip
      { tripId       :: Int
      , userId       :: Int
      , pickupPoint  :: LatLong
      , dropoffPoint :: LatLong
      , state        :: Sing state  -- singleton indexed by the type parameter
      , pickupTime   :: PickupTimeType state
      , dropoffTime  :: DropoffTimeType state
      }

Now the state field will hold one of STripNotStarted, SWaitingForPickup, SOnBoard, or STripCompleted, each of which fixes the type parameter, and we can use fromSing to convert to the regular datatype values.
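To show what the singleton buys us without depending on the singletons package, the sketch below writes the singleton type by hand as a GADT. This is an assumption-laden stand-in, not the library's generated code: STripState plays the role of the generated singleton, the hypothetical demote plays the role of fromSing, and Trip is trimmed to the relevant fields. Pattern matching on the state field tells the type checker which state the trip is in, so the time fields get their precise types in each branch.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE TypeFamilies #-}

import Data.Time (UTCTime (..), fromGregorian)

data TripState = TripNotStarted | WaitingForPickup | OnBoard | TripCompleted
  deriving (Eq, Show)

-- Hand-written singleton: exactly one value per promoted TripState.
data STripState (s :: TripState) where
  STripNotStarted   :: STripState 'TripNotStarted
  SWaitingForPickup :: STripState 'WaitingForPickup
  SOnBoard          :: STripState 'OnBoard
  STripCompleted    :: STripState 'TripCompleted

-- Plays the role of fromSing: recover the ordinary enumeration value.
demote :: STripState s -> TripState
demote STripNotStarted   = TripNotStarted
demote SWaitingForPickup = WaitingForPickup
demote SOnBoard          = OnBoard
demote STripCompleted    = TripCompleted

type family PickupTimeType (state :: TripState) where
  PickupTimeType 'TripNotStarted   = ()
  PickupTimeType 'WaitingForPickup = ()
  PickupTimeType 'OnBoard          = UTCTime
  PickupTimeType 'TripCompleted    = UTCTime

type family DropoffTimeType (state :: TripState) where
  DropoffTimeType 'TripNotStarted   = ()
  DropoffTimeType 'WaitingForPickup = ()
  DropoffTimeType 'OnBoard          = ()
  DropoffTimeType 'TripCompleted    = UTCTime

-- Trimmed version of the article's Trip.
data Trip (state :: TripState) = Trip
  { tripId      :: Int
  , state       :: STripState state
  , pickupTime  :: PickupTimeType state
  , dropoffTime :: DropoffTimeType state
  }

-- Works for a trip in any state: each branch learns the state from the
-- singleton, so pickupTime and dropoffTime reduce to UTCTime where used.
lastEventTime :: Trip state -> Maybe UTCTime
lastEventTime trip = case state trip of
  STripNotStarted   -> Nothing
  SWaitingForPickup -> Nothing
  SOnBoard          -> Just (pickupTime trip)
  STripCompleted    -> Just (dropoffTime trip)

examplePickup :: UTCTime
examplePickup = UTCTime (fromGregorian 2024 1 1) 0

onBoardTrip :: Trip 'OnBoard
onBoardTrip = Trip
  { tripId      = 1
  , state       = SOnBoard
  , pickupTime  = examplePickup
  , dropoffTime = ()
  }
```

With this definition, writing state = STripCompleted in onBoardTrip no longer compiles, because the field's value must agree with the type parameter.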

We can also make database constraints that mirror these type constraints, based on the state field:

    CHECK (pickup_time IS NULL OR state IN ('OnBoard', 'TripCompleted')),
    CHECK (dropoff_time IS NULL OR state IN ('TripCompleted'))

We haven’t gotten fancy enough to generate those from the Haskell types.

Conclusion

We have seen how to construct varying forms of a datatype by parameterizing it over data kinds, and using type families to define field types. This is useful for CRUD operations, denormalization, and controlling whether fields should be specified based upon a state field. Another way to use this technique is to version a datatype. The possibilities are unlimited! We find great utility in being able to maintain a flat record for a database entity, while also controlling which fields can be specified under what conditions.