Intrinsic Types in TypeScript

Greg Pabian
Published in Level Up Coding
4 min read · Feb 15, 2021

Even though the notion of intrinsic types might baffle the reader, I will not define them right away, as I want to introduce the concept using a practical example. I would love to start the article with the following question, something I have asked myself in various forms over the years:

In an ideal, statically-typed language, how could one define the string capitalisation type, using the properties of its type system?

I could substitute string capitalisation with string permutation and the issue still remains valid, as long as it touches not the abstraction of types, but their intrinsic properties.

Solving the Assignment

The way I wrote the aforementioned assignment grants a lot of freedom with regard to choosing the type system. However, one must recognise implicit requirements for certain features, like generics, string literals or inheritance. Since ordered lists of characters make up strings, the question “which encoding applies to characters?” requires an immediate answer.

I would name the type CapitalizedString and define it as a parametrised type (or a generic type, depending on the domain language of the chosen type system) with a single parameter T constrained to a string literal type. All characters in strings should adhere to the UTF-8 encoding. One could write it down easily using pseudocode in the following fashion:

type CapitalizedString<T extends string literal> = ...

By definition, if the string literal in question has no characters, the operation obviously has no effect on it. Otherwise, capitalisation alters only the first character of the string in question, leaving the rest unchanged. To uppercase a character, the type system needs to support the operation ad hoc, as defining it explicitly in the type system sounds impractical for UTF-8 characters due to the complexity of such an algorithm (I find it inadvisable even for ASCII characters).

Therefore, the compiler itself must provide an implementation of uppercasing any UTF-8 character, making the uppercase operation an ad hoc, or rather an intrinsic, property of the type system. There exist multiple ways of declaring such a property in type definitions, for example:

intrinsic type Uppercase<T extends character>; 
type Uppercase<T extends character> = intrinsic;

An avid reader might note that the definitions above provide no return type explicitly: since the compiler performs the operation, nobody knows the return type beforehand. Alternatively, the type system might define uppercasing as a unary operator, accepting a single character, as shown in the following snippet:

type UppercaseA = uppercase 'a';

In the end, the CapitalizedString type boils down to the following definition:

type CapitalizedString<T extends string literal> =
  T === '' ? '' : Uppercase<T[0]> + T[1...];

In order for the definition to work, the type system in question requires:

  • an ability for equality comparison of two string literal types,
  • a possibility to access both the n-th character and a range of characters of any string literal type,
  • permission to use conditional statements in type definitions.

Intrinsic Types in TypeScript

As of TypeScript 4.1, there exist four intrinsic types: Lowercase, Uppercase, Capitalize and Uncapitalize, all of them defined using the intrinsic keyword. I have not found any operators connected to case management in the language. As an exercise, I would advise the reader to research these types and think of another way to define Capitalize and Uncapitalize (hint: the infer keyword). Also, what other string operations could one think of?
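
The following TypeScript sketch shows the four intrinsic types in action, together with one possible answer to the exercise, assuming TypeScript 4.1 or later. The names MyCapitalize, MyUncapitalize and capitalize serve purely as illustrations and do not exist in the language or its standard library.

```typescript
// The four intrinsic string types; each assignment compiles only if the
// computed literal type matches the value on the right-hand side.
const lower: Lowercase<'Hello World'> = 'hello world';
const upper: Uppercase<'Hello World'> = 'HELLO WORLD';
const capitalized: Capitalize<'hello world'> = 'Hello world';
const uncapitalized: Uncapitalize<'Hello World'> = 'hello World';

// MyCapitalize (hypothetical name): `infer` splits the literal into its
// first character (Head) and the remainder (Tail); only Head goes through
// the intrinsic Uppercase. A non-matching literal passes through unchanged.
type MyCapitalize<S extends string> =
  S extends `${infer Head}${infer Tail}` ? `${Uppercase<Head>}${Tail}` : S;

// MyUncapitalize (hypothetical name): the same split, lowercasing Head.
type MyUncapitalize<S extends string> =
  S extends `${infer Head}${infer Tail}` ? `${Lowercase<Head>}${Tail}` : S;

// Compile-time checks of the hand-written variants.
const myCapitalized: MyCapitalize<'hello world'> = 'Hello world';
const myEmpty: MyCapitalize<''> = '';
const myUncapitalized: MyUncapitalize<'Hello World'> = 'hello World';

// Hypothetical runtime counterpart, typed with the built-in Capitalize.
function capitalize<S extends string>(s: S): Capitalize<S> {
  return (s.charAt(0).toUpperCase() + s.slice(1)) as Capitalize<S>;
}
```

Template literal inference binds Head to the first character and Tail to the remainder. An empty literal fails to match the pattern and falls through unchanged, which mirrors the empty-string rule from the pseudocode.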

The way I would define CamelCase using the current type system of TS
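
A minimal sketch of such a definition might look as follows, assuming snake_case input with single underscores between words; other input conventions would need extra branches. The toCamelCase function, a hypothetical helper, mirrors the type at runtime.

```typescript
// CamelCase: splits at the first underscore, lowercases the head and
// recursively converts the tail, capitalising it before joining.
// Assumption: snake_case input with single underscores between words.
type CamelCase<S extends string> =
  S extends `${infer Head}_${infer Tail}`
    ? `${Lowercase<Head>}${Capitalize<CamelCase<Tail>>}`
    : Lowercase<S>;

// Compile-time checks: the assignments compile only if the types agree.
const fieldName: CamelCase<'user_first_name'> = 'userFirstName';
const constantName: CamelCase<'USER_ID'> = 'userId';

// Hypothetical runtime counterpart under the same input assumption.
function toCamelCase<S extends string>(s: S): CamelCase<S> {
  return s
    .toLowerCase()
    .replace(/_([a-z0-9])/g, (_match: string, ch: string) => ch.toUpperCase()) as CamelCase<S>;
}
```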

Migrating from a lowercased logic to an uppercased one constitutes one of many use cases that employ the type system to verify the correctness of string literals when operating on a new system. I have personally become quite fond of utilising my toolchain to automatically perform mundane sanity checks. Certain systems, like SQL databases, do not care about casing (to a certain degree), but there exists no such guarantee regarding other programs.
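
As a sketch of such a sanity check, the type below accepts a string literal only when it already equals its uppercased form. The names UppercaseOnly and column serve as hypothetical examples, and the snippet assumes TypeScript 4.1 or later.

```typescript
// Resolves to S when S already equals Uppercase<S>, otherwise to never.
type UppercaseOnly<S extends string> = S extends Uppercase<S> ? S : never;

// Hypothetical helper: takes only identifiers that follow the uppercased
// convention of the new system and returns them unchanged.
function column<S extends string>(name: UppercaseOnly<S>): UppercaseOnly<S> {
  return name;
}

const customerId = column('CUSTOMER_ID'); // compiles, type 'CUSTOMER_ID'

// The call below fails to compile, because the parameter type collapses
// to never for a lowercased literal:
// column('customer_id');
```

The check costs nothing at runtime: the function returns its argument untouched, and the compiler performs the whole verification.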

Another observation shows that some programming languages, frameworks and libraries expect variable names to conform to some opinionated standards like camel case or snake case. For networking purposes, all microservices in the system should adhere to a predefined way of serializing data for reliable communication. Leveraging intrinsic types to generate correct structures at compile time provides another layer of safety checks that could spot problems before shipping a single line of code to a production system.
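
Key remapping in mapped types, available since TypeScript 4.1, combines naturally with the intrinsic types for this purpose. The sketch below assumes a hypothetical service that serializes objects with PascalCase keys; UncapitalizeKeys, UserPayload and uncapitalizeKeys do not come from any library.

```typescript
// Renames every string key of T through the intrinsic Uncapitalize;
// non-string keys pass through unchanged.
type UncapitalizeKeys<T> = {
  [K in keyof T as K extends string ? Uncapitalize<K> : K]: T[K];
};

// Hypothetical payload produced by a service that uses PascalCase keys.
interface UserPayload {
  UserId: number;
  DisplayName: string;
}

// Resolves to { userId: number; displayName: string }.
type User = UncapitalizeKeys<UserPayload>;

// Hypothetical runtime counterpart performing the same shallow renaming.
function uncapitalizeKeys<T extends object>(input: T): UncapitalizeKeys<T> {
  const source = input as Record<string, unknown>;
  const output: Record<string, unknown> = {};
  for (const key of Object.keys(source)) {
    output[key.charAt(0).toLowerCase() + key.slice(1)] = source[key];
  }
  // Double assertion: the compiler cannot follow the runtime renaming.
  return output as unknown as UncapitalizeKeys<T>;
}

const user: User = uncapitalizeKeys<UserPayload>({ UserId: 7, DisplayName: 'Ada' });
```

Any consumer that still reads user.UserId stops compiling as soon as the payload type changes, which moves the detection of a serialization mismatch from production to the build step.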

Summary

The very existence of type systems does not guarantee simplicity of seemingly straightforward operations, like uppercasing or lowercasing. Since I believe that creators of programming languages should design them taking into account contemporary needs of software developers, I do not expect Turing-completeness from type systems on their own. It might happen that expressing simple concepts in the designed system requires a significant amount of work, and thus some people might incline towards moving the concepts directly into the compiler domain.

I would recommend researching some closely related topics, like metaprogramming or the theory of compilers, to fully grasp all the possibilities and limitations of software development in this day and age. All in all, programmers who invest in understanding the languages they code with should gain the skills to build architectures which take full advantage of the underlying type systems.

Post Scriptum

To satisfy my intellectual curiosity, I wrote this article in E-Prime, a subset of the English language without the verb “to be”. After I have created more works in such a fashion, I might publish a piece regarding all my related findings.
