blog.ploeh.dk

Easier encapsulation with static types

A metaphor. While I'm still struggling with the notion that dynamically typed languages may have compelling advantages, I keep coming back to the benefits of statically typed languages. One such benefit is how it enables the communication of contracts, as I recently discussed in Encapsulating rod-cutting. As usual, I base my treatment of encapsulation on my reading of Bertrand Meyer's seminal Object-Oriented Software Construction. A major aspect of encapsulation is the explicit establishment of contracts. What is expected of client code before it can invoke an operation (preconditions)? What is guaranteed to be true after the operation completes (postconditions)? And what is always true of a particular data structure (invariants)? Contracts constitute the practical part of encapsulation. A contract can give you a rough sense of how well-encapsulated an API is: The more statements you can utter about the contract, the better encapsulation. You may even be able to take all those assertions about the contract and implement them as property-based tests. In other words, if you can think of many properties to write as tests, the API in question probably has good encapsulation. If, on the other hand, you can't think of a single precondition, postcondition, or invariant, this may indicate that encapsulation is lacking. Contracts are the practical part of encapsulation. The overall notion provides guidance of how to achieve encapsulation. Specific contracts describe what is possible, and how to successfully interact with an API. Clearly, the what and how. They don't, however, explain why encapsulation is valuable. Why encapsulate? # Successful code bases are big. Such a code base rarely fits in your head in its entirety. And the situation is only exacerbated by multiple programmers working concurrently on the code. Even if you knew most of the code base by heart, your team members are changing it, and you aren't going to be aware of all the modifications. Encapsulation offers a solution to this problem. Instead of knowing every detail of the entire code base, encapsulation should enable you to interact with an API (originally, an object) without knowing all the implementation details. This is the raison d'être of contracts. Ideally, knowing the contract and the purpose of an object and its methods should be enough. Imagine that you've designed an API with a strong contract. Is your work done? Not yet. Somehow, you'll have to communicate the contract to all present and future client developers. How do you convey a contract to potential users? I can think of a few ways. Good names are important, but only skin-deep. You can also publish documentation, or use the type system. The following metaphor explores those two alternatives. Doing a puzzle # When I was a boy, I had a puzzle called Das verflixte Hunde-Spiel, which roughly translates to the confounded dog game. I've previously described the game and an algorithm for solving it, but that's not my concern here. Rather, I'd like to discuss how one might abstract the information carried by each tile. As the picture suggests, the game consists of nine square tiles, each with two dog heads and two tails. The objective of the puzzle is to lay all nine tiles in a three-by-three grid such that all the heads fit the opposing tails. The dogs come in four different colour patterns, and each head must fit a tail of the same pattern. It turns out that there are umpteen variations of this kind of puzzle. This one has cartoon dogs, but you can find similar games with frogs, cola bottles, playing card suits, trains, ladybirds, fast food, flowers, baseball players, owls, etc. This suggests that a generalization may exist. Perhaps an abstraction, even. "Abstraction is the elimination of the irrelevant and the amplification of the essential" Robert C. Martin, Designing Object-Oriented C++ Applications Using The Booch Method, ch. 00 How to eliminate the irrelevant and amplify the essential of a tile? To recapitulate, a single tile looks like this: In a sense, we may regard most of the information on such a tile as 'implementation details'. In a code metaphor, imagine looking at a tile like this as being equivalent to looking at the source code of a method or function (i.e. API). That's not the essence we need to correctly assemble the puzzle. Imagine that you have to lay down the tiles according to a known solution. Since you already know the solution, this task only involves locating and placing each of the nine tiles. In this case, there are only nine tiles, each with four possible rotations, so if yo

Feb 24, 2025 - 16:15

A metaphor.

While I'm still struggling with the notion that dynamically typed languages may have compelling advantages, I keep coming back to the benefits of statically typed languages. One such benefit is how it enables the communication of contracts, as I recently discussed in Encapsulating rod-cutting.

As usual, I base my treatment of encapsulation on my reading of Bertrand Meyer's seminal Object-Oriented Software Construction. A major aspect of encapsulation is the explicit establishment of contracts. What is expected of client code before it can invoke an operation (preconditions)? What is guaranteed to be true after the operation completes (postconditions)? And what is always true of a particular data structure (invariants)?

Contracts constitute the practical part of encapsulation. A contract can give you a rough sense of how well-encapsulated an API is: The more statements you can utter about the contract, the better encapsulation. You may even be able to take all those assertions about the contract and implement them as property-based tests. In other words, if you can think of many properties to write as tests, the API in question probably has good encapsulation. If, on the other hand, you can't think of a single precondition, postcondition, or invariant, this may indicate that encapsulation is lacking.

Contracts are the practical part of encapsulation. The overall notion provides guidance of how to achieve encapsulation. Specific contracts describe what is possible, and how to successfully interact with an API. Clearly, the what and how.

They don't, however, explain why encapsulation is valuable.

Why encapsulate? #

Successful code bases are big. Such a code base rarely fits in your head in its entirety. And the situation is only exacerbated by multiple programmers working concurrently on the code. Even if you knew most of the code base by heart, your team members are changing it, and you aren't going to be aware of all the modifications.

Encapsulation offers a solution to this problem. Instead of knowing every detail of the entire code base, encapsulation should enable you to interact with an API (originally, an object) without knowing all the implementation details. This is the raison d'être of contracts. Ideally, knowing the contract and the purpose of an object and its methods should be enough.

Imagine that you've designed an API with a strong contract. Is your work done? Not yet. Somehow, you'll have to communicate the contract to all present and future client developers.

How do you convey a contract to potential users? I can think of a few ways. Good names are important, but only skin-deep. You can also publish documentation, or use the type system. The following metaphor explores those two alternatives.

Doing a puzzle #

When I was a boy, I had a puzzle called Das verflixte Hunde-Spiel, which roughly translates to the confounded dog game. I've previously described the game and an algorithm for solving it, but that's not my concern here. Rather, I'd like to discuss how one might abstract the information carried by each tile.

A picture of the box of the puzzle, together with the tiles spread out in unordered fashion.

As the picture suggests, the game consists of nine square tiles, each with two dog heads and two tails. The objective of the puzzle is to lay all nine tiles in a three-by-three grid such that all the heads fit the opposing tails. The dogs come in four different colour patterns, and each head must fit a tail of the same pattern.

It turns out that there are umpteen variations of this kind of puzzle. This one has cartoon dogs, but you can find similar games with frogs, cola bottles, playing card suits, trains, ladybirds, fast food, flowers, baseball players, owls, etc. This suggests that a generalization may exist. Perhaps an abstraction, even.

"Abstraction is the elimination of the irrelevant and the amplification of the essential"
Robert C. Martin, Designing Object-Oriented C++ Applications Using The Booch Method, ch. 00

How to eliminate the irrelevant and amplify the essential of a tile?

To recapitulate, a single tile looks like this:

One of the tiles of the Hunde-Spiel.

In a sense, we may regard most of the information on such a tile as 'implementation details'. In a code metaphor, imagine looking at a tile like this as being equivalent to looking at the source code of a method or function (i.e. API). That's not the essence we need to correctly assemble the puzzle.

Imagine that you have to lay down the tiles according to a known solution. Since you already know the solution, this task only involves locating and placing each of the nine tiles. In this case, there are only nine tiles, each with four possible rotations, so if you already know what you're looking for, that is, of course, a tractable endeavour.

Now imagine that you'd like to undertake putting together the tiles without having to navigate by the full information content of each tile. In programming, we often need to do this. We have to identify objects that are likely to perform some subtask for us, and we have to figure out how to interact with such an object to achieve our goals. Preferably, we'd like to be able to do this without having to read all the source code of the candidate object. Encapsulation promises that this should be possible.

The backside of the tiles #

If we want to eliminate the irrelevant, we have to hide the information on each tile. As a first step, consider what happens if we flip the tiles around so that we only see their backs.

An empty square, symbolizing the backside of a tile.

Obviously, each backside is entirely devoid of information, which means that we're now flying blind. Even if we know how to solve the puzzle, our only recourse is to blindly pick and rotate each of the nine tiles. As the previous article calculated, when picking at random, the odds of arriving at any valid solution is 1 to 5,945,425,920. Not surprisingly, total absence of information doesn't work.

We already knew that, because, while we want to eliminate the irrelevant, we also must amplify the essential. Thus, we need to figure out what that might be.

Perhaps we could write the essential information on the back of each tile. In the metaphor, this would correspond to writing documentation for an API.

Documentation #

To continue the metaphor, I asked various acquaintances to each 'document' a title. I deliberately only gave them the instruction that they should enable me to assemble the puzzle based on what was on the back of each tile. Some asked for additional directions, as to format, etc., but I refused to give any. People document code in various different ways, and I wanted to capture similar variation. Let's review some of the documentation I received.

The back of a tile, with written documentation and some arrows.

Since I asked around among acquaintances, all respondents were Danes, and some chose to write the documentation in Danish, as is the case with this one.

Unless you have an explicit, enforced policy, you might run into a similar situation in software documentation. I've seen more than one example of software documentation written in Danish, simply because the programmer who wrote it didn't consider anything else than his or her native language. I'm sure most Europeans have similar experiences.

The text on the tile says, from the top and clockwise:

light brown dog/light snout/dark ears
dark brown, white/snout
orange tail/brown spots on/back
orange tail/orange back

Notice the disregard for capitalization rules or punctuation, a tendency among programmers that I've commented upon in Code That Fits in Your Head.

In addition to the text, the back of the above tile also includes six arrows. Four of them ought to be self-evident, but can you figure out what the two larger arrows indicate?

It turns out that the person writing this piece of documentation eventually realized that the description should be mirrored, because it was on the backside of the tile. To be fair to that person, I'd asked everyone to write with a permanent marker or pen, so correcting a mistake involved either a 'hack' like the two arrows, or starting over from scratch.

Let's look at some more 'documentation'. Another tile looks like this:

The back of a tile, with some fairly cryptic symbols in the corners.

At first glance, I thought those symbols were Greek letters, but once you look at it, you start to realize what's going on. In the upper right corner, you see a stylized back and tail. Likewise, the lower left corner has a stylized face in the form of a smiley. The lines then indicate that the sides indicated by a corner has a head or tail.

Additionally, each side is encoded with a letter. I'll leave it as an exercise for the reader to figure out what G and B indicate, but also notice the two examples of a modified R. The one to the right indicates red with spots, and the other one uses the minus symbol to indicate red without spots.

On the one hand, this example does an admirable job of eliminating the irrelevant, but you may also find that it errs on the side of terseness. At the very least, it demands of the reader that he or she is already intimately familiar with the overall problem domain. You have to know the game well enough to be able to figure out that R- probably means red without spots.

Had this been software documentation, we might have been less than happy with this level of information. It may meet formal requirements, but is perhaps too idiosyncratic or esoteric.

Be that as it may, it's also possible to err on the other side.

The back of a tile, this time with an almost one-to-one replica of the picture on the front.

In this example, the person writing the documentation essentially copied and described every detail on the front of the tile. Having no colours available, the person instead chose to describe in words the colour of each dog. Metaphorically speaking, the documentation replicates the implementation. It doesn't eliminate any irrelevant detail, and thereby it also fails to amplify the essential.

Here's another interesting example:

The back of a tile, with text along all four sides.

The text is in Danish. From the top clockwise, it says:

dark brown dog with blue collar
light brown dog with red collar
brown dog with small spots on back
Brown dog with big spots on back

Notice how the person writing this were aware that a tile has no natural up or down. Instead, each side is described with letters facing up when that side is up. You have to rotate the documentation in order to read all four sides. You may find that impractical, but I actually consider that to represent something essential about each tile. To me, this is positive.

Even so, an important piece of information is missing. It doesn't say which sides have heads, and which ones have tails.

Finally, here's one that, in my eyes, amplifies the essential and eliminates the irrelevant:

The back of a tile, with text along all four sides.

Like the previous example, you have to rotate the documentation in order to read all four sides, but the text is much terser. If you ask me, Grey head, Burnt umber tail, Brown tail, and Spotted head amplifies the essential and eliminates everything else.

Notice, however, how inconsistent the documentation is. People chose various different ways in their attempt to communicate what they found important. Some people inadvertently left out essential information. Other people provided too much information. Some people never came through, so in a few cases, documentation was entirely absent. And finally, I've hinted at this already, most people forgot to 'mirror' the information, but a few did remember.

All of this has direct counterparts in software documentation. The level of detail you get from documentation varies greatly, and often, the information that I actually care about is absent: Can I call this method with a negative number? Does the input string need to be formatted in a particular way? Does the method ever return null? Which exceptions may it throw?

I'm not against documentation, but it has limitations. As far as I can tell, though, that's your only option if you're working in a dynamically typed language.

Static types with limited expression #

Can you think of a way to constrain which puzzle pieces fit together with other pieces?

That's how jigsaw puzzles work. As a first attempt, we may try to cut out out the pieces like this:

An anonymous jigsaw puzzle piece

This does help some, because when presented with the subtask of having to locate and find the next piece, at least we can't rotate the next piece in four different positions. Instead, we have only two options. Perhaps we'll choose to lay down the next piece like this:

Two anonymous jigsaw pieces fit together

You may also decide to rotate the right piece ninety degrees clockwise, but those are your only two rotation options.

We may decide to encode the shape of the pieces so that, say, the tabs indicate heads and the indentations indicate tails. This, at least, prevents us from joining head with head, or tail with tail.

This strikes me as an apt metaphor for C, or how many programmers use the type systems of C# or Java. It does prevent some easily preventable mistakes, but the types still don't carry enough information to enable you to identify exactly the pieces you need.

More expressive static types #

Static type systems come in various forms, and some are more expressive than others. To be honest, C#'s type system does come with good expressive powers, although it tends to require much ceremony. As far as I can tell, Java's type system is on par with C#. Let's assume that we either take the time to jump through the hoops that make these type systems expressive, or that we're using a language with a more expressive type system.

In the puzzle metaphor, we may decide to give a tile this shape:

A jigsaw puzzle piece with four distinct tab and indentation shapes.

Such a shape encodes all the information that we need, because each tab or indentation has a unique shape. We may not even have to remember exactly what a square indentation represents. If we're presented with the above tile and asked to lay down a compatible tile, we have to find one with a square tab.

Two jigsaw puzzle pieces with distinct tab and indentation shapes, arranged so that they fit together.

Encoding the essential information into tile shapes enables us to not only prevent mistakes, but identify the correct composition of all the tiles.

Completed puzzle with nine distinctly shaped pieces.

For years, I've thought about static types as shapes of objects or functions. For practical purposes, static types can't express everything an operation may do, but I find it useful to use a good type system to my advantage.

Code examples #

You may find this a nice metaphor, and still fail to see how it translates to actual code. I'm not going to go into details here, but rather point to existing articles that give some introductions.

One place to start is to refactor from primitive obsession to domain models. Just wrapping a string or an integer in a predicative type helps communicate the purpose and constraints of a data type. Consider a constructor like this:

public Reservation(
    Guid id,
    DateTime at,
    Email email,
    Name name,
    NaturalNumber quantity)

While hardly sophisticated, it already communicates much information about preconditions for creating a Reservation object. Some of the constituent types (Guid and DateTime) are built in, so likely well-known to any developer working on a relevant code base. If you're wondering whether you can create a reservation with a negative quantity, the types already answer that.

Languages with native support for sum types let you easily model mutually exclusive, heterogeneous closed type hierarchies, as shown in this example:

type PaymentService = { Name : string; Action : string }
 
type PaymentType =
| Individual of PaymentService
| Parent of PaymentService
| Child of originalTransactionKey : string * paymentService : PaymentService

And if your language doesn't natively support sum types, you can emulate them with the Visitor design pattern.

You can, in fact, do some quite sophisticated tricks even with .NET's type system.

Conclusion #

People often argue about static types with the assumption that their main use is to prevent mistakes. They can help do that, too, but I also find static types an excellent communication medium. The benefits of using a static type system to model contracts is that, when a type system is already part of a language, it's a consistent, formalized framework for communication. Instead of inconsistent and idiosyncratic documentation, you can embed much information about a contract in the types of an API.

And indeed, not only can the types help communicate pre- and postconditions. The type checker also prevents errors.

A sufficiently sophisticated type system carries more information that most people realize. When I write Haskell code, I often need to look up a function that I need. Contrary to other languages, I don't try to search for a function by guessing what name it might have. Rather, the Hoogle search engine enables you to search for a function by type.

Types are shapes, and shapes are like outlines of objects. Used well, they enable you to eliminate the irrelevant, and amplify the essential information about an API.

This blog is totally free, but if you like it, please consider supporting it.