latest news

Distractions and amusements, with a sandwich and coffee.

This love loves love. It's a strange love, strange love.
•
• find a way to love
• more quotes

especially a short one.

To see a World in a grain of sand,

And Heaven in a wild flower,

Hold Infinity in the palm of your hand

And Eternity in an hour.

— William Blake

Below are the mathematical details behind the theory of transfinite numbers. The story is aimed at a math-enthusiast audience and will take you from the number $1$ all the way to $\aleph_1$ and beyond.

Two caveats. First, this may cause you to lose sleep. I hope it does, as many things in life are worth staying awake for. Two, I'm not a set theoretician and therefore appreciate any and all reports of errors and technical shortcomings.

The mathematical details reflect the progression of the scenes in the video. Way down below, I include additional material about ordinal numbers.

The video begins humbly is with the first natural number, 1.

What follows are all the other natural numbers—everyone has their favourite. I like 17.

All the natural numbers make up a set, $\mathbb{N}$. $$\mathbb{N} = \{ 1, 2, 3...\}$$

We never run out of these numbers—to get the next one just take the previous and add one. This process of adding one to get the next number is called the successor function. For now, this is easy but we'll see below an application of this that's a little trickier.

How big is $\mathbb{N}$? In set theory, the size of a set is referred to as its cardinality and denoted by $|\mathbb{N}|$. At this point you might simply want to answer "It's infinitely large!" and be done with it.

Hang around, though. This is just where the story gets interesting.

There is a magical place called the Grand Hotel. It has an infinite number of rooms, each indexed by a natural number.

Despite the fact that this hotel is always completely full, it can always accommodate another guest. To do this, we ask everyone to move to room $n+1$ thus freeing up room $1$. We have solved the problem at the cost of annoying infinitely many people.

But wait. There's infinitely more.

The hotel can always accommodate an additional infinite number of guests. Everyone is asked to move to room $2n$ thus freeing up all odd-numbered rooms. Since there's an infinite number of these rooms, we've just doubled the capacity!

I don't mean to suggest that I loved you the best

I can't keep track of each fallen robin

I remember you well in the Chelsea Hotel

That's all, I don't even think of you that often

— Leonard Cohen, Chelsea Hotel No. 2

The craziness is only beginning. If an infinite number of busses show up, each with an infinite number of guests, guess what? Yup, they can all be accommodated.

First, denote the $i$th guest in the $j$th bus by $s_n = (i,j)$. Since $i \in \mathbb{N}$ and $j \in \mathbb{N}$ then, we can enumerate $s_n$ as $s_1, s_2, s_3, ...$ Then, the formula for assigning guests to rooms is to assign guest $i$ from bus $j$ to room $n$. In this scheme, the hotel itself is treated as a bus indexed by $j = 0$.

This is sometimes called “Hilbert's Paradox” but it's not really a paradox. Rather, it's a demonstration that we should not expect intuition about finite quantities to carry over to the behaviour of infinite quantities.

Case in point, in Hilbert's hotel "there is a guest in every room" does not imply that "no more guests can be accommodated".

Your mind may be in revolt at this moment. “Surely, there are fewer even numbers than natural numbers, since even numbers are a proper subset of natural numbers!”

First, in a hotel with finite number of rooms, you would be correct.

Second, your confusion puts you in good company. Gregor Cantor (1845-1918), the mathematician who was the first to develop a theoretical framework of infinite quantities, suffered recurrent nervous breakdowns.

Remember the concept of the cardinality of a set? Cantor assigned the number $\aleph_0$ (named so after the first letter, aleph, in the Hebrew alphabet) to represent the cardinality of the naturals.

He called the family of numbers, in which $\aleph_0$ is the first, transfinite numbers . Today ‘infinite’ is more commonly used to refer to these numbers.

For two sets to have the same cardinality a special condition has to be met. We say that $|X| = |Y|$ if and only if we can demonstrate a function $f(X) \mapsto Y$ that is one-to-one, meaning that $f(x \in X)$ maps to a unique $y \in Y$, and onto, meaning that every $y \in Y$ is mapped to by some $x \in X$.

In other words, to show that two sets have the same cardinality, we just have to find a bijection. Here is one between the even numbers and natural numbers $$ f(\{2,4,6,8,...\} \mapsto \mathbb{N}) : x \mapsto x/2 $$

To show that $f$ is a bijection we need to show that $f$ is injective and surjective. It is injective (one-to-one) since any two distinct even numbers $x \ne y$ are sent to distinct values $x/2 \ne y/2$. It is surjective (onto) since every natural number $x$ has an even number $2x$ that is mapped to it.

There are lots of sets that have the same cardinality as the naturals: odd numbers, even numbers, prime numbers. Any infinite subset of naturals has the same cardinality as the naturals.

The next chapter in the video brings us to the observation that since the naturals are a subset of the integers, $\mathbb{Z}$ and since both are infinite, both have to have the same cardinality.

The video shows a bijection between the two sets, $f(\mathbb{N}) \mapsto \mathbb{Z}$ defined as follows: $f(0) \mapsto 0$ and $f(2k) \mapsto k$, $f(2k-1) \mapsto -k$ for $k \in \mathbb{N}$. Even naturals are sent to positive integers and odd naturals are sent to negative integers. Each natural has a unique integers (injective) and all integers are covered (surjective).

For example, the positive integer $n$ is mapped from $2n \in \mathbb{N}$ (e.g. $22 \mapsto 11$) and the negative integer $-n$ from $-(2n-1)$ (e.g. $23 \mapsto -11$).

The next part of the video shows many more possible bijections.

The first column on the right of the naturals is the bijection described above and others take the form $$f(x \in \mathbb{N}) = \begin{cases} 0, & \text{if $x = 1$} \\k, & \text{if $x$ is the $k$th number that passes the rule $g(x)$} \\-k, & \text{if $n$ is the $k$th number that fails the rule $g(x)$} \end{cases} $$

For our first bijection, the rule was $g(x) \stackrel{?}{=} 2k$, which checks whether $x$ is even.

The second column on the right of the naturals uses the rule $g(x) \stackrel{?}{=} 4k$, which checks whether $x$ is a multiple of 4. Thus, $4 \mapsto 1$, $8 \mapsto 2$, $12 \mapsto 3$ and so on. All naturals that are not multiples of 4 are sent to negative integers.

Columns on the left of the naturals use an odd multipler in the $g(x)$ rule and additionally flip the sign of the integer to which $f(x)$ maps. This was done so that I could have an equal balance of white positive integers in the columns on the right and white negative integers in the columns on the left.

As we sample more natural numbers, pages of bijections update faster and faster, in waves of numbers that decay into symbols used in set theory.

Because I could not stop for Death —

He kindly stopped for me —

The Carriage held but just Ourselves —

And Immortality.

...

Since then — 'tis Centuries — and yet

Feels shorter than the Day

I first surmised the Horses' Heads

Were toward Eternity —

— Emily Dickinson

We ramp up the complexity in the video by showing a bijection between the natural numbers and rational numbers, $\mathbb{Q}$, which are all fractions of the form $\mathbb{Q} = \{ q = x/y, x,y \in \mathbb{N}_0, y \ne 0 \}$.

First, we create a table in which the cell in row $x$ and column $y$ is assigned the rational number $x/y$.

To create the bijection, we assign each rational to a natural number as we traverse the table in a zig-zag fashion.

We snake our way from the upper-left corner ($1/1$) to the bottom-right ($78/31$), which is assigned 2418.

Given that rational numbers can be considered as two-dimensional naturals, $\mathbb{Q} = \mathbb{N}^2 = { \mathbb{N_0} \times \mathbb{N} }$, the same traversal argument can be used to show that all higher-dimensional spaces of naturals also have the same cardinality as the naturals, $|\mathbb{N}^k| = |\mathbb{N}|$. The bijection construction is the same as in the $k=2$ case above, except that now we're snaking across a higher dimensional space. When $k$ itself is infinite, we have the scenario of infinite guests in infinite busses arriving at Hilbert's Grand Hotel.

In our story so far, we have shown bijections between $\mathbb{N}$ and $\mathbb{Z}$ and $\mathbb{Q}$ we have proven that all these sets have the same cardinality $|\mathbb{N}| = |\mathbb{Z}| = |\mathbb{Q}| = \aleph_0$.

The naturals are considered to be infinite but countable and any set that has a bijection with the naturals is also countable.

Our discussion has brought us to infinity, $\aleph_0$. What lies beyond?

The first hint that there is indeed something beyond lies in the proof that the real numbers, $\mathbb{R}$ are not countable: there is no bijection between the naturals and reals. If we try to pair up naturals and reals we'll always run out of naturals.

Real numbers are continuous quantities that, for example, can measure the distance along a line. They include the naturals, integers, rationals as well as irrationals, which include numbers like $\sqrt{2}$, which cannot be written as a fraction, and numbers like $\pi$, which are transcendentals and not solutions to polynomial equations.

Cantor's demonstration that $|\mathbb{R}| > |\mathbb{N}|$ is the next part of the story. The proof is by contradiction and applies to the unit interval $ [0,1) = \{ x \, | \, 0 \le x \lt 1 \}$. First, suppose that there is a bijection $f(\mathbb{N}) \mapsto [0,1)$. This implies that for each $n \in \mathbb{N}$ there is some associated $r \in [0,1)$. We write down this assignment—for each natural, we pair up a natural with a real from the unit interval. Obviously, this list goes on forever in both the vertical and horizontal direction.

We don't know what the exact assignment is, so the numbers in the story are only representative. Typically, the proof is written out symbolically with each natural $n_i$ assigned to a series of digits $n_{i1}n_{i2}n_{i3} ... $.

Because this assignment is a bijection it is a surjection and every real number from the unit interval appears somewhere in the list. If we could demonstrate that there is a real number from $[0,1)$ that doesn't appear in the list, we would have a contradiction and the assumption that a bijection exists would be invalid.

We do so as follows. We transform the first digit of the first real to $x \mapsto x + 1 \, \text{mod} \, 10$. In other words $0 \mapsto 1$, $1 \mapsto 2$ and $9 \mapsto 0$.

We do the same for the second digit of the second number, the third digit of the third number, and so on.

This creates a new real, shown here without the leading $0.$

By construction, this real is nowhere in our list of reals. It can't be—it's different from each of the reals in at least one digit. It's different from the first number in the first digit, the second number in the second digit, the third number in the third digit and so on. But it's obviously in the unit interval. A contradiction.

If we write the numbers in the unit interval in binary $ [0,1) = \{ 0.b_1b_2b_3,... \, | \, b \in \{0,1\} \}$ we can use the fact that $b_i$ is indexed by $\mathbb{N}$ to realize that there are $2^{|\mathbb{N}|}$ such binary numbers, since at each position we have two choices ($0$ or $1$). And because $|\mathbb{N}| = \aleph_0$ we have $$ |\mathbb{R}| = 2^{|\mathbb{N}|} = 2^{\aleph_0} $$

Given a set $X$, the power set is the set of all subsets of $X$, including the empty set.

The next part of the story builds up the power set of naturals, $\mathbb{P}(\mathbb{N})$.

For example, for $X = \{1\}$ the power set has two elements, the empty set $\{\}$ and the whole set $\{1\}$. We write $ \mathbb{P}(X) = \{\{\},\{1\}\} $.

For $X = \{1,2\}$ the power set has four elements, the empty set $\{\}$, each of the naturals on their own $\{1\}$ and $\{2\}$ and the whole set $\{1,2\}$. We write $ \mathbb{P}(X) = \{\{\},\{1\},\{2\},\{1,2\}\} $.

In general, the power set of $\{1,2,3,...,n\}$ has cardinality $2^n$.

If we look closely at the $ 2^{\aleph_0} $, we can interpret it as the cardinality of the power set of naturals. This is because for each natural, of which there are $ \aleph_0 $ we have two choices: put it in the subset or not.

As the video continues, the power set elements appear faster and faster. The braces form hypnotising patterns.

Because the reals are continuous quantities, the number of reals is (wonderfully) called the cardinality of the continuum. I wouldn't turn down the job of cardinal of the continuum.

Given what we learned about power sets of naturals above, we can write $$ |\mathbb{R}| = | \mathbb{P}(\mathbb{N}) | = 2^{\aleph_0} $$

With Cantor's diagonal proof, we know that $|\mathbb{R}| > \aleph_0$ but we don't know how much larger. Cantor therefore proposed the Continuum Hypothesis which stated that whatever the size of $2^{\aleph_0}$ was, it was a distinct kind of infinite number and, importantly, the next smallest infinite number after $\aleph_0$.

A consequence of this theorem is that there is no set $X$ for which $$ \aleph_0 \lt |X| \lt 2^{\aleph_0} $$

meaning that there is no set that is larger than the naturals but smaller than the reals.

The Continuum Hypothesis also implies that the cardinality of the continuum is the next number ($\aleph_1$) in the hierarchy of transfinite cardinals, $$ |\mathbb{R}| = 2^{\aleph_0} = \aleph_1 $$

From it, we also get that the cardinality of the power set of an infinite set is the next transfinite cardinal. In other words, for sets $\mathbb{N}, \mathbb{P}(\mathbb{N}), \mathbb{P}(\mathbb{P}(\mathbb{N})), ...$ the cardinalities are $\aleph_0, \aleph_1, \aleph_2, ...$ And in general, $$ \aleph_{\alpha+1} = 2^{\aleph_\alpha} $$

The Continuum Hypothesis is thus far unproven.

At this point, we arrive at the third infinity in the video—the cardinality of the power set of reals is $ | \mathbb{P}({\mathbb{R}}) | = \aleph_2 $.

Elements of the power sets of naturals and reals continue to flash. If the Continuum Hypothesis is true, their cardinality is $\aleph_1$ and $\aleph_2$ respectively.

The music grows in intensity and the scene deteriorates into set theory symbols.

The story brings us back to where we started from: the list of naturals. These pick up where we left off and continue counting.

These too decay in a jitter of symbols

with soon nothing but symbols left

Suddently $\aleph$ appears.

And while everything else decays,

We are reminded of where we started, how far we've gone and how many more infinites are left to go.

How many ages hence

Shall this our lofty scene be acted over,

In states unborn and accents yet unknown!

— William Shakespeare

If you've studied transfinite numbers before, you'll no doubt notice that some core ideas have been left out from the video, such as ordinals and Beth numbers.

The math behind these adds a layer of complexity to the story that we felt was unnecessary for a first and dynamic telling of the ideas. They're explained below, for those who are hardy to follow this deep.

We already saw that the cardinality of the naturals defined the first transfinite number $ | \mathbb{N} | = \aleph_0$. If the Continuum Hypothesis holds, the cardinality of the power set of the naturals and the cardinality of the continuum was the next in the $\aleph$ hierarchy, $ | \mathbb{P}(\mathbb{N}) | = | \mathbb{R} | = \aleph_1$.

The story is a little more complicated than this.

The size of power sets is expressed by a family of transfinite cardinals called $ \beth $ (Beth) numbers. I think that the $\beth$ numbers are easier to understand.

The cardinality of the naturals is defined as $|\mathbb{N}| = \beth_0$ and the cardinality of their power set is the next $\beth$, $|\mathbb{P}(\mathbb{N})| = \beth_1$. Taking the power set again, $|\mathbb{P}(\mathbb{P}(\mathbb{N}))| = \beth_2$, $|\mathbb{P}(\mathbb{P}(\mathbb{P}(\mathbb{N})))| = \beth_3$, and so on.

In general, a set of cardinality $\beth_\alpha$ the cardinality of its power set is $\beth_{\alpha+1}$. In other words, $$ \beth_{\alpha+1} = 2^{\beth_\alpha} $$

As we saw, the cardinality of the continnum is the same as the cardinality of the power set of naturals, $ | \mathbb{R} | = | \mathbb{N} | = \beth_1$. Using the definition above for stepping up the $\beth$ hierarchy, the cardinality of the power set of the continuum is $ | \mathbb{P}(\mathbb{R}) | = \beth_2 $, $ | \mathbb{P}(\mathbb{P}(\mathbb{R})) | = \beth_3 $, and so on.

By definition, there are no infinite cardinalities between $\aleph_0$ and $\aleph_1$. And since infinite cardinalities are linearly ordered (we can compare their sizes) it means that $$ \aleph_1 \le \beth_1 $$

and in general $$ \aleph_\alpha \le \beth_\alpha $$

If the Continuum Hypothesis is true, then $$ \aleph_1 = \beth_1 $$

There may be transfinite numbers indexed by $\aleph$ that are not indexed by $\beth$. For example, if the Continuum Hypothesis is false, then it's possible that $\beth_1 > \aleph_1$ and so $\aleph_1$ doesn't appear in the set of $\beth$ numbers.

You may have noticed that while the definition of $\aleph_1$ was relatively brief. We said that it is the cardinality of the continuum if the Continuum Hypothesis holds.

In actual fact, $\aleph_1$ has a specific definition that arises from another family of numbers—the ordinals.

$\aleph_1$ is defined as the cardinality of the countable ordinal numbers. To understand this, we have to look at ordinal numbers and how they differ from cardinal numbers.

Cardinals, which express the size of a set, can be used to classify sets based on their cardinality. A set of three apples and a set of three bananas have the same cardinal classification because they have the same size.

Orderinals express the rank or order of something. For example, the ordinal 42 is intrepreted as 42nd and implies that 41 things came before it (if we started at $1$). The cardinal 42 has no such implication—you can have 42 bananas in a bag and there is no expectation that there ever was a bag with 41 bananas.

The rank of the biggest loser in a race is an ordinal (they came in 42nd) but the total size of the field is cardinal (there were 42 racers in total).

Although most of the numbers we use are cardinals, the distinction between cardinals and ordinals is practically irrelevant for finite quantities. Thus, the examples above (bananas and racres) are almost painfully trivial. The subtle distinctions between cardinals and ordinals in the context of finite quantities belie the deep differences between them in general.

Mathematically, ordinals classify sets based on their *well-order*. In order to unpack this, we have to understand a little bit about how set order is defined.

We first define something called a *total order*, which is a function that can be used to establish an order for elements in a set. For example, $\le$ is one such total order and $\gt$ is another. This function applies to a pair of elements and identifies which one comes first.

A set with a total order is a *totally ordered set*. This is nothing more than some set (e.g. the naturals) and some total order (e.g. $\le$).

Now, such a totally ordered set is said to be *well-ordered* if every non-empty subset of the set has a least element. This is the part that's very important and it's here that the first sign of confusion may arise. In particular, even though it doesn't look like much, the statement
“has a least element”
is very powerful.

Consider the set of naturals, $\mathbb{N}$. This set is well-ordered using the total order $\le$. This is because every subset of the naturals has a least element, as defined by $\le$.

However, the integers are not well-ordered under $\le$ because we can find a subset of them (e.g. negative integers) that don't have a least element. This is because for any member in the set of negative integers, $-n$ there is always a smaller one, $-n-1$.

This doesn't meant that the integers can't be well-ordered—they can, we just have to pick a different total order. For example, one function under which the integers are well-ordered places them in the order $0,1,2,3,...,-1,-2,-3,...$. Under this function, $3$ comes before $-1$. Keep in mind that the use of the word ‘least’ in this context implies ‘first’ not ‘smallest’ because we're talking about order and not size.

The reals are not well ordered under the $\lt$ operation since any open subset doesn't have a least element. For example, the interval $(0,1)$ doesn't have a least element since for any small number $0 < \epsilon$ there is a smaller number $ 0 < \epsilon' < \epsilon$. However, the reals can be put into a well-order, as described by the Well-ordering Theorem that states that every set can be well-ordered.

A lucid definition of ordinals is provided by von Neumann: the von Neumann ordinals.

He defined the first ordinal trivially as the empty set, $\varnothing$. He then defined the next ordinal as the set that includes all the previous ordinals, $\{\varnothing\}$. Don't confuse the difference between $x$ and $\{x\}$. The former is the object $x$ and the latter is the set that contains the object $x$. These two are as different as a banana and a bag that contains a banana.

According to this rule, the third ordinal would be the set of the first two ordinals, $\{ \varnothing, \{\varnothing\} \}$.

Here are the first 5 ordinals in this definition $$ \begin{array}{cll} 0 & \{ \} & \varnothing \\1 & \{ 0 \} & \{\varnothing\} \\2 & \{ 0,1 \} & \{\varnothing,\{\varnothing\}\} \\3 & \{ 0,1,2 \} & \{ \varnothing, \{\varnothing\}, \{ \varnothing, \{ \varnothing \} \} \} \\4 & \{ 0,1,2,3 \} & \{ \varnothing, \{\varnothing\}, \{\varnothing,\{\varnothing\}\}, \{ \varnothing, \{\varnothing\}, \{ \varnothing, \{ \varnothing \} \} \} \} \\\end{array} $$

The labels $0, 1, 2, ...$ in the table above are a visual aid to shorten the notation and better communicate the relationship between the structures in the right-most column. Do not think of these labels as ‘numbers’ and certainly not as cardinal numbers.

When written this way, ordinal numbers are sets. Moreover, every member of every ordinal number is a subset. For example, the ordinal $2 = \{0,1\}$ is a member of $3$ but it's also a subset of $3$ because both $0$ and $1$ are members of $3$. Here I have used the labels in the table above to stand in for the set definitions of each of the ordinals.

The ordering of the ordinals is defined by set membership (containment). The ordinals are thus well-ordered.

Given an ordinal, $\alpha$, we can create the next ordinal by $\{ \alpha, \{ \alpha \} \}$. This rule is called a successor function, $$ S(\alpha) = \{ \alpha, \{ \alpha \} \} $$

and in the table above, we've simply applied the successor function iteratively to $\varnothing$. To see this more clearly, consider the ordinal $\{0,1\}$. Applying the successor function gives us $$ S(2) = S( \{ 0,1 \} ) = \{ 0,1 \} \cup \{ \{ 0,1 \} \} = \{ 0, 1, \{ 0, 1 \} \} = \{ 0,1,2 \} = 3 $$

We've casually demonstrated that the successor as defined by $S(\alpha)$ of an ordinal is also an ordinal. Also, if $\alpha$ is an ordinal than any $\beta \in \alpha$ is also an ordinal.

So far, so good.

Let's now build towards an explanation of how ordinal numbers classify order of sets.

Two sets are said to be order isomorphic if there is a bijection between them that preserves the order of elements. In other words, these sets are basically the same—only their element labels are different. In this scenario, we can create one set from another simply by renaming the elements.

For example, the following two sets are order isomorphic $$ \begin{array}{l} X = \{ 0,1,2,3 \} \\Y = \{ 11,11,12,13 \} \end{array} $$

because under the total order $\lt$ (for both $X$ and $Y$) and the bijection $f(X \mapsto Y ) : x \mapsto 10 + x$, order is preserved.

The sets $$ \begin{array}{l} X = \{ 0,1,2,3 \} \\Y = \{ 3,1,2,0 \} \end{array} $$

are also order isomorphic under total order $X_\lt$ and $Y_\gt$ and the identity bijection.

The power of the ordinals comes from the theorem that any well-ordered set is order isomorphic to some ordinal, $\alpha$. For example, the set $\{1,2,3,4\}$ is order isomophic to $\alpha = 4$.

This is the tricky part and where, in my mind, a lot of confusion arises.

Suppose you have a set of ordinals, $A$. For example, $A$ could be the set of the first 10 ordinals. (Note: $A$ can't be the set of all ordinals because, as we'll see below, this construct doesn't make sense). Define $$ \text{sup} \, A = \cup \, A $$

as the union of all the ordinals in $A$. This can be thought of as the least upper bound of $A$. For finite sets of ordinals, this is equivalent to the successor function. For example, $\text{sup}({0,1,2}) = 3$.

However, when $A$ is infinite, an ordinal constructed using $\text{sup}(A)$ is called a limit ordinal. It is distinct from other ordinals in that it is not a successor of any other ordinal. This concept has no correspondence in cardinal numbers.

Now, for a very important question. What ordinal is the set of naturals order isomophic to? In other words, for the infinite set under the total order $\lt$ $$ \mathbb{N} = {1,2,3,...} $$

what is the corresponding ordinal number that has the same order (under the total order of membership). Remember that we're using different ways (total orders) to determine what it means to be ordered in the set of naturals and in the set of ordinals.

The ordinal we're after is the $\text{sup}()$ of all the finite ordinals that we've seen defined above. It's the first limit ordinal. It's also the first infinite ordinal. To construct this number we start with the set of all finite ordinals $$ X = {0,1,2,3,...} $$

Members in this set are created by the application of the successor function $S(\alpha)$. We then create the following number by applying $\text{sup}()$ $$ \omega = \text{sup} \, X = \cup \, X $$

The order type of the naturals is $\omega$.

Adding $\omega$ to our set of ordinals, we get $$ {0,1,2,3,...,\omega} $$

Two things to notice. First, other than $0$, $\omega$ is the only ordinal in this set that doesn't have a successor. It is a limit ordinal.

Second, this set is still countable.

Let's apply the successor function to $\omega$ to get $S(\omega) = \{ \omega \cup \{ \omega \}\} $ which we can write in short hand as $\omega+1$. Remember, this is an ordinal so the $+ \, 1$ here doesn't mean the same as it does for cardinal numbers. It means we've applied the successor function. $$ {0,1,2,3,...,\omega, \omega+1} $$

If $\omega$ is order isomorphic to $\mathbb{N}$ then what is $\omega+1$ order isomorphic to? $$ \omega + 1 \cong {1,2,3,4,5,...,1} $$

where I've used $\cong$ to mean order isomophism.

We can keep applying the successor function, $$ \{ 0,1,2,3,...,\omega, \omega+1, \omega+2, \omega+3 \} $$

Eventually, we can apply $\text{sup}()$ again and get $\omega + \omega$ $$ \{ 0,1,2,3,...,\omega, \omega+1, \omega+2, \omega+3, ... , \omega + \omega \} $$

This number is written as $\omega + \omega = \omega \cdot 2$ and reflects the fact that we've applied $\text{sup}()$ twice (this, by the way, is the same number of $...$ in our original set).

Note that $\omega \cdot 2$ is not the same as $2 \cdot \omega$. The former means "two omegas". The latter means "omega many twos", which would correspond to a set like $$ \{0,1,\, 10,11,\, 20,21,\, 30,31,\, ...\} $$

But this set is order isomophic to $\omega$, which tells us that multiplication of ordinals is not commutative, since $2 \cdot \omega \ne \omega \cdot 2$. Ordinal arithmetic is a completely different beast than cardinal arithmetic.

Let's keep adding to our ordinals by applying the successor function $S(\alpha)$ infinitely many times, then applying $\text{sup}()$ and repeating this process. $$ \{ 0,1,2,3,...,\omega, \omega+1, ... \omega \cdot 2, ..., \omega \cdot 3, ... , \omega \cdot \omega , ... , \omega ^\omega , ... , \omega ^ {\omega ^ \omega}, \omega ^ {\omega ^ { ... ^ \omega }} \} $$

Everything at and after $\omega$ is an infinite ordinal. However, this set of ordinals is still countable.

That last number in the set above has a special name and is the first number that satisfies $$ \epsilon_0 = \omega ^ { \epsilon_0 } $$

Basically, we're taking $\omega$ to the power of $\omega$ infinitely many times. This is the first of the epsilon numbers.

There are infinitely many such epsilon numbers. The next one is $$ \epsilon_1 = \omega ^ { \epsilon_1 } $$

and so on.

All these are countable.

Waaat? Are you serious?

Yes.

Eventually, we reach the first uncountable ordinal, $\omega_1$. The word "reach" here is a metaphor—there are infinitely many limit ordinals that come before it.

$\omega_1$ is a limit ordinal and the first uncountable limit ordinal.

Note that we cannot have a set of all ordinals, since we could then take the $\text{sup}()$ of this set to create a new ordinal. This is Russel's Paradox.

We finally get to the point of introducting the ordinals. The cardinality of $\omega_1$ is $\aleph_1$. $$ | \{ 0,1,2,...,\omega, \omega+1, ... \omega \cdot 2, ..., \omega \cdot \omega , ... , \omega ^\omega , ... , \omega ^ {\omega ^ \omega}, ..., \epsilon_0, ... \epsilon_1, ... \} | = \aleph_1 $$

In other words $\aleph_1$ is the size of the set of all countable ordinals.

And we can now understand the Continnum Hypthesis as the statement that the number of countable ordinals is the same as the number of reals.

All this is very weird, very fun, and, for now, very over!

Discover Cantor's transfinite numbers through my music video for the Aleph 2 track of Max Cooper's Yearning for the Infinite (album page, event page).

I discuss the math behind the video and the system I built to create the video.

*Everything we see hides another thing, we always want to see what is hidden by what we see.
—Rene Magritte*

A Hidden Markov Model extends a Markov chain to have hidden states. Hidden states are used to model aspects of the system that cannot be directly observed and themselves form a Markov chain and each state may emit one or more observed values.

Hidden states in HMMs do not have to have meaning—they can be used to account for measurement errors, compress multi-modal observational data, or to detect unobservable events.

In this column, we extend the cell growth model from our Markov Chain column to include two hidden states: normal and sedentary.

We show how to calculate forward probabilities that can predict the most likely path through the HMM given an observed sequence.

Grewal, J., Krzywinski, M. & Altman, N. (2019) Points of significance: Hidden Markov Models. *Nature Methods* **16**:795–796.

Altman, N. & Krzywinski, M. (2019) Points of significance: Markov Chains. *Nature Methods* **16**:663–664.

My cover design for Hola Mundo by Hannah Fry. Published by Blackie Books.

Curious how the design was created? Read the full details.

*You can look back there to explain things,
but the explanation disappears.
You'll never find it there.
Things are not explained by the past.
They're explained by what happens now.
—Alan Watts*

A Markov chain is a probabilistic model that is used to model how a system changes over time as a series of transitions between states. Each transition is assigned a probability that defines the chance of the system changing from one state to another.

Together with the states, these transitions probabilities define a stochastic model with the Markov property: transition probabilities only depend on the current state—the future is independent of the past if the present is known.

Once the transition probabilities are defined in matrix form, it is easy to predict the distribution of future states of the system. We cover concepts of aperiodicity, irreducibility, limiting and stationary distributions and absorption.

This column is the first part of a series and pairs particularly well with Alan Watts and Blond:ish.

Grewal, J., Krzywinski, M. & Altman, N. (2019) Points of significance: Markov Chains. *Nature Methods* **16**:663–664.

*Places to go and nobody to see.*

Exquisitely detailed maps of places on the Moon, comets and asteroids in the Solar System and stars, deep-sky objects and exoplanets in the northern and southern sky. All maps are zoomable.

Quantile regression explores the effect of one or more predictors on quantiles of the response. It can answer questions such as "What is the weight of 90% of individuals of a given height?"

Unlike in traditional mean regression methods, no assumptions about the distribution of the response are required, which makes it practical, robust and amenable to skewed distributions.

Quantile regression is also very useful when extremes are interesting or when the response variance varies with the predictors.

Das, K., Krzywinski, M. & Altman, N. (2019) Points of significance: Quantile regression. *Nature Methods* **16**:451–452.

Altman, N. & Krzywinski, M. (2015) Points of significance: Simple linear regression. *Nature Methods* **12**:999–1000.