Typography geek? If you like the geometry and mathematics of these posters, you may enjoy something more lettered. Visions of type: Type Peep Show: The Private Curves of Letters posters.

Watch the video at Numberphile about my art.

numbers.tgz

1,000,000 digits of π, φ, e and ASN.

All the artwork can be purchased from Fine Art America. Most of the pieces were created by me, and some by Cristian Ilies Vasile.

Numerology is bogus, but art based on numbers has a beautiful random quality.

For other examples of numerical art, see my *i*nessiness project. Nixie clock lovers should investigate the accidental similarity number (ASN), which I render in an ASN Nixie poster.

It's fitting to use Circos to visualize the digits of π. After all, what is more round than Circos?

Cristian Ilies Vasile had the idea of representing the digits of π as a path traced by links between successive digits. Each digit is assigned a segment around the circle and a link between segment *i* and *j* corresponds to the appearance of *ij* in π. For example, the "14" in "3.14..." is drawn as a link between segment 1 and segment 4.

The position of the link on a digit's segment is associated with the position of the digit in π. For example, the "14" link associated with the 2nd digit (1) and the 3rd digit (4) is drawn from position 2 on the 1 segment to position 3 on the 4 segment.
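The link rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `digit_links` and the tuple layout are my own assumptions, and the actual images were drawn with Circos, not with this code. Positions are counted from 1, so that the leading 3 is at position 1, matching the "14" example.

```python
def digit_links(digits):
    """Return one (from_segment, from_position, to_segment, to_position)
    tuple for each pair of successive digits."""
    links = []
    for pos in range(len(digits) - 1):
        i, j = int(digits[pos]), int(digits[pos + 1])
        # Positions are 1-based: the first digit sits at position 1
        links.append((i, pos + 1, j, pos + 2))
    return links


PI_DIGITS = "31415926535"  # leading digits of pi, decimal point dropped
links = digit_links(PI_DIGITS)
print(links[1])  # (1, 2, 4, 3): from position 2 on segment 1 to position 3 on segment 4
```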

As more digits are added to the path, the image becomes a weaving mandala.

I added to Cristian's representation by showing the number of transitions between digits in a series of concentric circles placed outside the links. This summary representation counts the number of transition links within a region and addresses the question of what kind of digits appear immediately before or after a given digit in π. The approach is diagrammed below.
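One way to tally the transitions is sketched below in Python. The function name `transition_counts`, the `(digit, bin, next_digit)` key and the default of 10 positions per bin are assumptions made for illustration; they are not necessarily the settings used for the posters.

```python
from collections import Counter


def transition_counts(digits, bin_size=10):
    """Count i -> j transitions, grouped by the source digit i and by the
    bin that the position of i falls into (positions are 0-based here)."""
    counts = Counter()
    for pos in range(len(digits) - 1):
        i, j = int(digits[pos]), int(digits[pos + 1])
        counts[(i, pos // bin_size, j)] += 1
    return counts


counts = transition_counts("3141592653")
print(counts[(1, 0, 4)])  # 1: one "14" transition starts in the first bin
```

Each count would then set the size of one bubble in the concentric circles outside the links.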

The original images were generated using the 10-color Brewer paired qualitative palette, which was later modified as shown below.

The bubbles that count the number of links quickly draw attention to regions where specific digit pairs are frequent. In the image for π below, which shows transitions for the first 1,000 digits, the large bubble on the 9 segment is due to the "999999" sequence at decimal place 762.

This sequence of 6 9's occurs significantly earlier than expected by chance. Because the distribution and sequence of digits of π is, as far as we know, uniformly random, we can calculate how frequently we should expect a series of 6 identical digits.

For a given digit, the chance that the next 5 digits are the same is 0.00001 (0.1 that the next digit is the same × 0.1 that the second-next digit is the same × ...). Therefore, the chance that at a given position the next 5 digits are *not* all the same is 1 - 0.00001 = 0.99999. From this, the chance that *k* consecutive positions don't initiate a 6-digit run is 0.99999^{k}.

If I ask for the *k* at which this value is 0.5, I need to solve 0.99999^{k} = 0.5, which gives *k* = ln 0.5 / ln 0.99999 ≈ 69,314. Thus, chances are 50-50 that in a 69,000 digit random sequence we'll see a run of 6 identical digits. This calculation is an approximation, because it treats each starting position as independent.
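The arithmetic above can be checked with a short Python sketch. The function name `digits_for_even_odds` and its parameters are hypothetical, and the calculation carries the same independence approximation as the text.

```python
import math


def digits_for_even_odds(run_length=6, base=10):
    """Approximate sequence length k at which a run of `run_length`
    identical digits has a 50% chance of having appeared."""
    # Chance that the (run_length - 1) digits after a given digit all repeat it
    p_run = (1 / base) ** (run_length - 1)
    # Solve (1 - p_run)^k = 0.5 for k; log1p keeps precision for tiny p_run
    return math.log(0.5) / math.log1p(-p_run)


print(round(digits_for_even_odds()))  # 69314
```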

It's fun to look for words in π. For example, love appears at the 13,099,586th digit.

The transition probabilities for each 10 digit bin for the first 2,000 digits of π, φ and e are shown in the image below.

The digits of π are, as far as we know, randomly distributed. Art based on its digits therefore has a quality that is influenced by this random distribution. To provide a reference of what such a random pattern looks like, below are 16 random numbers represented in the same way. They're all different, yet strangely the same.

Below are more images by Cristian Ilies Vasile, where dots are used to represent the adjacency between digits. As in the image above, each digit 0-9 is represented by a colored segment. For each digit sequence *ij*, a dot is placed on the segment for *i*, at the position of *i*, and colored by *j*.

For example, for π the dot coordinates for the first 7 digits are (segment:position:label) 3:0:1 → 1:1:4 → 4:2:1 → 1:3:5 → 5:4:9 ...

| segment | position | colored_by |
|---|---|---|
| 3 | 0 | 1 |
| 1 | 1 | 4 |
| 4 | 2 | 1 |
| 1 | 3 | 5 |
| 5 | 4 | 9 |
| 9 | 5 | 2 |
| 2 | 6 | 6 |
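The segment, position and label triples listed above can be generated with a small Python sketch. The function name `dot_coordinates` is hypothetical and positions are 0-based, as in the example; the stacking of the dots is left to Circos and is not modelled here.

```python
def dot_coordinates(digits):
    """Return (segment, position, colored_by) for each successive digit pair.

    The dot for the pair ij sits on segment i, at the 0-based position of i,
    and takes the color assigned to j."""
    pairs = zip(digits, digits[1:])
    return [(int(i), pos, int(j)) for pos, (i, j) in enumerate(pairs)]


for segment, position, colored_by in dot_coordinates("31415926"):
    print(segment, position, colored_by)
```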

Because there is a large number of digits, the dots stack up near their position to avoid overlapping. The layout of the dots is automated by Circos' text track layout.

When the digits of π, e and φ are aligned, positions at which the three numbers have the same digit yield the accidental similarity number (ASN). Below is a dot plot of the transition of the ASN.
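As a rough Python sketch of that definition: line the digit strings up and keep only the positions where all of them agree. The function name `accidental_similarity` is my own, and the three input strings are made-up toy values, not the real digits of π, e and φ.

```python
def accidental_similarity(*digit_strings):
    """Return the digits found at positions where every input string
    has the same digit. Strings are aligned at their first character and
    compared only up to the length of the shortest one."""
    return "".join(
        column[0]
        for column in zip(*digit_strings)
        if len(set(column)) == 1
    )


# Toy inputs for illustration only (not real constants)
print(accidental_similarity("1234", "1934", "1834"))  # 134
```

How the decimal points are aligned in the real ASN is not spelled out here, so the sketch simply aligns the strings at their first character.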

By mapping the digits (0-9) onto a red-yellow-blue Brewer palette and placing them as circles on an Archimedean spiral, a dense and pleasant layout can be obtained.

Why the Archimedean spiral? This spiral is defined as *r* = *a* + *b*θ and has the interesting property that a ray from the origin will intersect the spiral at intervals of 2π*b*. Thus, the spiral can accommodate inscribed circles of radius π*b*.
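The spacing property is easy to check numerically, and it suggests a simple way to lay out the circles. The Python sketch below is an assumption about how such a layout could be done, not the code behind the posters: the function names are hypothetical, the starting angle is arbitrary, and the angular step only approximates one circle diameter of arc length.

```python
import math


def turn_spacing(theta, a=0.0, b=1.0):
    """Radial distance between the spiral at theta and one full turn later."""
    return (a + b * (theta + 2 * math.pi)) - (a + b * theta)


def circle_centres(n, a=0.0, b=1.0):
    """Approximate (x, y) centres for n circles of radius pi*b along the spiral."""
    diameter = 2 * math.pi * b
    theta = 2 * math.pi  # skip the tight innermost turn (arbitrary choice)
    centres = []
    for _ in range(n):
        r = a + b * theta
        centres.append((r * math.cos(theta), r * math.sin(theta)))
        # Advance by roughly one diameter of arc length (ignores the radial term)
        theta += diameter / r
    return centres


print(turn_spacing(1.3, a=0.5, b=2.0) / math.pi)  # close to 4, i.e. 2*pi*b with b = 2
```

Because successive turns are 2π*b* apart, circles of radius π*b* centred on the spiral just touch their neighbours on the adjacent turns.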

Why the Brewer palette? These color schemes have some very useful perceptual properties and are commonly used to encode quantitative and categorical data.

In the April Points of Significance Nature Methods column, we continue our discussion of comparing samples and consider what happens when we run a large number of tests.

Observing statistically rare test outcomes is expected if we run enough tests. These are statistically, not biologically, significant. For example, if we run *N* tests, the smallest *P* value that we have a 50% chance of observing is 1–exp(–ln2/*N*). For *N* = 10^{k} this *P* value is *P*_{k}=10^{–k}ln2 (e.g. for 10^{4}=10,000 tests, *P*_{4}=6.9×10^{–5}).
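The expression above follows from a short argument: if all *N* tests are independent and the null hypothesis holds in each, the *P* values are uniformly distributed, so the chance that the smallest one falls below *p* is 1–(1–*p*)^{N}. Setting this to 0.5 and solving for *p* gives the formula. A minimal Python check, with a hypothetical function name and independence assumed:

```python
import math


def median_smallest_p(n_tests):
    """P value that the smallest of n_tests independent null P values
    falls below with 50% probability: 1 - exp(-ln2 / n_tests)."""
    # expm1 avoids the loss of precision in 1 - exp(x) for tiny x
    return -math.expm1(-math.log(2) / n_tests)


p4 = median_smallest_p(10 ** 4)
print(f"{p4:.1e}")  # 6.9e-05
```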

We discuss common correction schemes such as Bonferroni, Holm, Benjamini & Hochberg and Storey's *q* and show how they impact the false positive rate (FPR), false discovery rate (FDR) and power of a batch of tests.

Krzywinski, M. & Altman, N. (2014) Points of Significance: Comparing Samples — Part II — Multiple Testing *Nature Methods* **11**:355-356.

Krzywinski, M. & Altman, N. (2014) Points of Significance: Comparing Samples — Part I — *t*-tests *Nature Methods* **11**:215-216.

Krzywinski, M. & Altman, N. (2013) Points of Significance: Significance, *P* values and *t*-tests *Nature Methods* **10**:1041-1042.

Celebrate Pi Day (March 14th) with the art of folding numbers. This year I take the number up to the Feynman Point and apply a protein folding algorithm to render it as a path.

For those of you who liked the minimalist and colorful digit grid, I've expanded on the concept to show stacked ring plots of frequency distributions.

And if spirals are your thing...

In the March Points of Significance Nature Methods column, we continue our discussion of *t*-tests from November (Significance, *P* values and *t*-tests).

We look at how the uncertainties of two variables combine and how this increases the uncertainty when two samples are compared, and we highlight the differences between the two-sample and paired *t*-tests.

When performing any statistical test, it's important to understand and satisfy its requirements. The *t*-test is very robust with respect to some of its assumptions, but not others. We explore which.

Krzywinski, M. & Altman, N. (2014) Points of Significance: Comparing Samples — Part I *Nature Methods* **11**:215-216.

Krzywinski, M. & Altman, N. (2013) Points of Significance: Significance, *P* values and *t*-tests *Nature Methods* **10**:1041-1042.

Beautiful Science explores how our understanding of ourselves and our planet has evolved alongside our ability to represent, graph and map the mass data of the time. The exhibit runs from 20 February to 26 May 2014 and is free to the public. There is a good Nature blog writeup about it, a piece in The Guardian, and a great video that explains the exhibit, narrated by Johanna Kieniewicz, the curator.

I am privileged to contribute an information graphic to the exhibit in the Tree of Life section. The piece shows how sequence similarity varies across species as a function of evolutionary distance. The installation is a set of 6 30x30 cm backlit panels. They look terrific.

Quick, name three chart types. Line, bar and scatter come to mind. Perhaps you said pie too—tsk tsk. Nobody ever thinks of the box plot.

Box plots reveal details about data without overloading a figure with a full frequency distribution histogram. They're easy to compare and now easy to make with BoxPlotR (try it). In our fifth Points of Significance column, we take a break from the theory to explain this plot type and, I hope, convince you that it's worth thinking about.

The February issue of Nature Methods kicks the bar chart two more times: Dan Evanko's Kick the Bar Chart Habit editorial and a Points of View: Bar charts and box plots column by Mark Streit and Nils Gehlenborg.

Krzywinski, M. & Altman, N. (2014) Points of Significance: Visualizing samples with box plots *Nature Methods* **11**:119-120.

I recently presented at the Wired Data|Life 2013 conference, sharing my thoughts on The Art and Science of Data Visualization.

For specialists, visualizations should expose detail to allow for exploration and inspiration. For enthusiasts, they should provide context, integrate facts and inform. For the layperson, they should capture the essence of the topic, narrate a story and delight.

Wired's Brandon Keim wrote up a short article about me and some of my work—Circle of Life: The Beautiful New Way to Visualize Biological Data.