Type-safe identifiers and phantom types

Programs that work with a database often use values of an integer type (for example, long) as entity identifiers. People make mistakes, though, and a programmer may accidentally use the identifier of one kind of entity to look up another. Such a bug can go unnoticed for a long time if the identifier ranges of the entities overlap, which happens quite often. Fortunately, in languages that let us manipulate types, and C++ is one of them, there is a fairly simple solution to this problem.

Formulation of the problem

Suppose our program works with several kinds of entities. As an example, take widgets (the Widget class) and gadgets (the Gadget class):

    class Widget {
    public:
        long id() const;
        // ...
    };

    class Gadget {
    public:
        long id() const;
        // ...
    };

Besides making errors likely, using "raw" types as identifiers significantly reduces the readability of the code. Code full of types like std::vector<long> or std::map<long, long> is not easy to understand. Using type synonyms:

    typedef long WidgetId;
    typedef long GadgetId;

allows the programmer to write more expressive code with types like std::map<WidgetId, GadgetId>. But this approach solves only the readability problem. The compiler still does not know that we consider values of the WidgetId and GadgetId types to be incompatible.
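To make the limitation concrete, here is a minimal sketch (assuming a C++11 compiler for static_assert). The loadGadget and mixUp functions are hypothetical names invented for illustration; they are not part of any real API.

```cpp
#include <cassert>
#include <type_traits>

typedef long WidgetId;
typedef long GadgetId;

// Both aliases name exactly the same type, so the compiler cannot tell them apart.
static_assert(std::is_same<WidgetId, GadgetId>::value,
              "typedef introduces a synonym, not a new type");

// Hypothetical lookup function: it simply echoes the identifier it receives.
inline GadgetId loadGadget(GadgetId id) { return id; }

// Passing a widget identifier where a gadget identifier is expected
// compiles cleanly, which is precisely the bug we want to rule out.
inline GadgetId mixUp() {
    WidgetId widgetId = 42;
    GadgetId loaded = loadGadget(widgetId);
    assert(loaded == 42);
    return loaded;
}
```

The static_assert passes, which confirms that the two names are interchangeable as far as the compiler is concerned.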

Telling the compiler about our intentions

What would a person do if they had to juggle a multitude of abstract identifiers on paper without getting lost in all the numbers? I think a quite reasonable approach is to add a type tag to each identifier: a prefix or a suffix that names the kind of entity being identified. For example, K-12 could denote the twelfth computer (workstation), and P-12 the twelfth registered user.

Fortunately, C++ has a mechanism that lets us attach tags to types: templates. To solve our problem, we just need to implement a class that is parameterized by the entity type and stores the identifier:

    template <typename ModelType, typename ReprType = long>
    class IdOf {
    public:
        typedef ModelType model_type;
        typedef ReprType repr_type;

        IdOf() : value_() {}
        explicit IdOf(repr_type value) : value_(value) {}

        repr_type value() const { return value_; }

        bool operator==(const IdOf &rhs) const { return value() == rhs.value(); }
        bool operator!=(const IdOf &rhs) const { return value() != rhs.value(); }
        bool operator<(const IdOf &rhs) const { return value() < rhs.value(); }
        bool operator>(const IdOf &rhs) const { return value() > rhs.value(); }

    private:
        repr_type value_;
    };

Let's apply the new class to our gadgets and widgets:

    class Gadget;
    class Widget;

    typedef IdOf<Gadget> GadgetId;
    typedef IdOf<Widget> WidgetId;

    class Widget {
    public:
        WidgetId id() const;
        // ...
    };

    class Gadget {
    public:
        GadgetId id() const;
        // ...
    };

Because IdOf<Gadget> and IdOf<Widget> are distinct, unrelated types and the constructor is explicit, the following code, which contains logic errors, will not compile:

    // This won't compile.
    std::vector<GadgetId> gadgetIds;
    gadgetIds.push_back(WidgetId(5));

    // This won't compile either.
    if (someGadget.id() == someWidget.id()) {
        doSomething();
    }

Operations on identifiers of the same type work as expected. Now that the compiler knows more about our intentions, it will not let us load a gadget by a widget's identifier or put an identifier of the wrong type into a vector.

If we still need to compare identifiers of different types, or compare an identifier with a "raw" value, we can always call the value() method explicitly.
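The following sketch illustrates both points under a few assumptions: it needs a C++11 compiler, it repeats a trimmed copy of the IdOf class above so that it is self-contained, and the helper names sameRawValue, gadgetName, and checkIds (as well as the sample gadget names) are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <type_traits>

// Trimmed copy of the IdOf class from the article, kept only for self-containment.
template <typename ModelType, typename ReprType = long>
class IdOf {
public:
    IdOf() : value_() {}
    explicit IdOf(ReprType value) : value_(value) {}
    ReprType value() const { return value_; }
    bool operator==(const IdOf &rhs) const { return value_ == rhs.value_; }
    bool operator<(const IdOf &rhs) const { return value_ < rhs.value_; }
private:
    ReprType value_;
};

class Widget;
class Gadget;
typedef IdOf<Widget> WidgetId;
typedef IdOf<Gadget> GadgetId;

// The two identifier types are unrelated: neither converts to the other.
static_assert(!std::is_convertible<WidgetId, GadgetId>::value,
              "a widget id must not silently become a gadget id");

// Deliberate cross-type comparison: calling value() makes the intent visible.
inline bool sameRawValue(WidgetId w, GadgetId g) {
    return w.value() == g.value();
}

// operator< is all that std::map needs to use IdOf as a key.
inline std::string gadgetName(GadgetId id) {
    std::map<GadgetId, std::string> names;
    names[GadgetId(1)] = "sprocket";
    names[GadgetId(2)] = "flange";
    auto it = names.find(id);
    return it == names.end() ? std::string("unknown") : it->second;
}

// Small usage example.
inline void checkIds() {
    assert(sameRawValue(WidgetId(5), GadgetId(5)));
    assert(gadgetName(GadgetId(2)) == "flange");
}
```

Calling gadgetName(WidgetId(2)) would be rejected at compile time, while the explicit value() call in sameRawValue documents that the cross-type comparison is intentional.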

Phantom types

It turns out that the trick we just pulled off with identifiers has been known in functional programming for quite some time. Parameterized types that do not use their type parameter in the body of the definition are called phantom types.
For example, in Haskell, a similar technique can be implemented as follows:

    newtype IdOf a = IdOf { idValue :: Int }
        deriving (Ord, Eq, Show, Read)

Wow, just a couple of lines of code! Now add the definitions of our models:

    data Widget = Widget { widgetId :: IdOf Widget } deriving (Show, Eq)
    data Gadget = Gadget { gadgetId :: IdOf Gadget } deriving (Show, Eq)

and check the desired behavior by creating instances of different types and trying to compare their identifiers:

    Prelude> let g = Gadget (IdOf 5)
    Prelude> let w = Widget (IdOf 5)
    Prelude> widgetId w == gadgetId g

    <interactive>:1:15:
        Couldn't match type `Gadget' with `Widget'
        Expected type: IdOf Widget
          Actual type: IdOf Gadget
        In the return type of a call of `gadgetId'
        In the second argument of `(==)', namely `gadgetId g'
        In the expression: widgetId w == gadgetId g

As we can see, the compiler (more precisely, the ghci interpreter I used for these experiments) refused to accept a comparison of identifiers of different types. This is exactly what we wanted.

The same technique can be used to attach currency labels, units of measure, and other information to numeric values, which is useful to both the reader of the program and the compiler.
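As a rough illustration of the currency case, here is a sketch in C++. Everything in it (the Usd and Eur tags, the Money class, storing amounts in cents, and the helper functions) is an assumption made up for this example, not an existing library.

```cpp
#include <cassert>

// Empty tag types: they exist only to distinguish one Money<...> from another.
struct Usd {};
struct Eur {};

template <typename CurrencyTag>
class Money {
public:
    explicit Money(long cents) : cents_(cents) {}
    long cents() const { return cents_; }

    // Addition is defined only for two amounts that carry the same tag.
    Money operator+(const Money &rhs) const { return Money(cents_ + rhs.cents_); }
    bool operator==(const Money &rhs) const { return cents_ == rhs.cents_; }

private:
    long cents_;
};

inline Money<Usd> totalUsd() {
    Money<Usd> price(1999);
    Money<Usd> shipping(500);
    // Money<Eur> fee(300);
    // return price + fee;  // would not compile: operands have different tags
    return price + shipping;
}

// Small usage example.
inline void checkMoney() {
    assert(totalUsd() == Money<Usd>(2499));
}
```

The tag never appears in the data members of Money, so it plays the same role as the phantom parameter of IdOf: it lives only in the type.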

Results

A single small class can save us a lot of time that would otherwise be spent hunting for bugs. In addition, this approach does not affect the performance or memory consumption of the program at runtime when it is compiled with optimizations enabled. The Haskell version does not incur any overhead either.
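The memory half of that claim can be checked at compile time. The sketch below assumes a C++11 compiler and a mainstream ABI, and again repeats a trimmed IdOf; the function names are hypothetical. It does not measure speed, it only shows that the wrapper has the same size as the raw long and copies like one.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Trimmed copy of the IdOf class from the article, kept only for self-containment.
template <typename ModelType, typename ReprType = long>
class IdOf {
public:
    IdOf() : value_() {}
    explicit IdOf(ReprType value) : value_(value) {}
    ReprType value() const { return value_; }
private:
    ReprType value_;
};

class Widget;
typedef IdOf<Widget> WidgetId;

// The tag lives only in the type, so it adds no storage to the object.
static_assert(sizeof(WidgetId) == sizeof(long), "the wrapper adds no storage");
// The wrapper can be copied bit by bit, just like the long it holds.
static_assert(std::is_trivially_copyable<WidgetId>::value,
              "the wrapper copies like a plain long");

// Extra bytes per identifier compared with a raw long.
inline std::size_t wrapperOverheadInBytes() {
    return sizeof(WidgetId) - sizeof(long);
}

// Small usage example: wrapping and unwrapping preserves the raw value.
inline void checkRoundTrip() {
    assert(WidgetId(17).value() == 17);
}
```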

The downside is the need to type (and read) a few more characters and, perhaps, to explain the idea to colleagues, but quite often the benefits of stricter logic checking by the compiler outweigh these costs.

Phantom types are popular in applications that require high reliability, where every additional check performed automatically by the compiler reduces the company's losses. In particular, they are used extensively in OCaml code at Jane Street and in Standard Chartered Bank products written in Haskell (as described by Don Stewart at Google Tech Talk 2015).

The powerful Boost.Units library also deserves a mention: it allows type-safe operations on quantities expressed in different units, with the result type deduced automatically.

Source: https://habr.com/ru/post/198568/
