Martin Krzywinski / Genome Sciences Center / mkweb.bcgsc.ca
This love loves love. It's a strange love, strange love. (Liz Fraser)

phi: exciting


Visualization Tour, Melbourne, October 9–20, 2014


visualization + design

download

numbers.tgz
1,000,000 digits of π, φ, e and ASN.

buy artwork

All the artwork can be purchased from Fine Art America.

buy Martin Krzywinski's work


accidental similarity number

The accidental similarity number is a kind of overlap between numbers. I came up with this concept after creating typographical art about the 4ness of π.

To construct this number for π, φ and e we first write the numbers on top of each other and then identify positions for which the numbers have the same digit.

3.1415926535897932 … 21170679821 … 10270193852 … 
1.6180339887498948 … 93911374847 … 08659593958 … 
2.7182818284590452 … 51664274274 … 32862794349 … 

These digits are then used to create the accidental similarity number. In this case,

0.979 …

By definition, the decimal is held in place.
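
The construction can be sketched in a few lines of Python. This is a minimal illustration rather than the script used for the poster: the function name `accidental_similarity` is hypothetical, and the inputs are only the first 16 decimals of each number as quoted above.

```python
def accidental_similarity(*fractions):
    """Return (position, digit) pairs where every digit string agrees.

    Each argument is the fractional part of a number as a string of
    digits. Positions are 1-based and counted from the decimal point.
    """
    matches = []
    for position, digits in enumerate(zip(*fractions), start=1):
        if len(set(digits)) == 1:  # all numbers share this digit
            matches.append((position, digits[0]))
    return matches


# Fractional digits quoted in the text (first 16 decimals of each).
PI = "1415926535897932"
PHI = "6180339887498948"
E = "7182818284590452"

matches = accidental_similarity(PI, PHI, E)
asn = "0." + "".join(digit for _, digit in matches)
print(matches)  # [(12, '9')]
print(asn)      # 0.9
```

Within these 16 decimals the only shared digit is the 9 at position 12, which becomes the leading 9 of 0.979 …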

accidental similarity art

The poster shows the accidental similarity number for π, φ and e created from the first 1,000,000 digits of each number. There are 9,997 positions in which these numbers have the same digit, but only 9,996 are shown because the distance between positions is used to color the digit and I was limited by input files with 1M digits.

Martin Krzywinski @MKrzywinski mkweb.bcgsc.ca buy artwork
The accidental similarity number for π, φ and e created from the first 1,000,000 digits of each number. (PNG, BUY ARTWORK)

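The matches occur like events in a Poisson process with a rate of 1 in 100 positions, so the distances between them are approximately exponentially distributed with an average of 100, and a fraction of about 1 - 1/e (roughly 0.63) of the distances are smaller than 100.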
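
These figures can be checked under the assumption (mine, not stated on the page) that the digits behave like independent, uniformly random digits. All three digits then agree at a given position with probability 10 × (1/10)³ = 1/100, which gives a mean distance of 100 between matches. The sketch below computes the expected fraction of short distances and compares it with a small seeded simulation that uses random digits, not the real digits of π, φ and e.

```python
import math
import random

P_MATCH = 10 * (1 / 10) ** 3   # all three digits agree: 0.01
MEAN_GAP = 1 / P_MATCH          # expected distance between matches

# Probability that a gap is at most 100 positions long (geometric law).
frac_short = 1 - (1 - P_MATCH) ** 100


def simulate_gaps(n_positions, seed=0):
    """Gaps between positions where three random digit streams agree."""
    rng = random.Random(seed)
    gaps, last = [], 0
    for pos in range(1, n_positions + 1):
        a, b, c = rng.randrange(10), rng.randrange(10), rng.randrange(10)
        if a == b == c:
            gaps.append(pos - last)
            last = pos
    return gaps


gaps = simulate_gaps(200_000)
sim_mean = sum(gaps) / len(gaps)
sim_short = sum(g <= 100 for g in gaps) / len(gaps)

print(f"analytic: mean gap {MEAN_GAP:.0f}, short fraction {frac_short:.3f}")
print(f"1 - 1/e = {1 - 1 / math.e:.3f}")
print(f"simulated: mean gap {sim_mean:.1f}, short fraction {sim_short:.3f}")
```

The same probability predicts about 10,000 matches in 1,000,000 digits, close to the 9,997 reported above.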

The font is Neutraface Slab Display Medium.

Segments of π, φ and e are connected by thin links if the same digit is shared between two numbers, and thick links if among all three. Shown for the first 10,000 digits. (PNG)
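
The thin and thick links can be derived with a small classifier. This is only a sketch: the name `shared_digit_links` and the `thin`/`thick` labels are hypothetical, and it assumes a link is drawn when two or three numbers have the same digit at the same position. It is applied here to the first 16 decimals rather than the 10,000 used in the figure.

```python
from itertools import combinations


def shared_digit_links(numbers):
    """Classify each position by how many of the numbers share its digit.

    `numbers` maps a label to a string of fractional digits. Returns a
    list of (position, kind, labels), where kind is 'thick' when every
    number has the same digit and 'thin' when only a pair does.
    """
    labels = list(numbers)
    links = []
    for position, digits in enumerate(zip(*numbers.values()), start=1):
        if len(set(digits)) == 1:
            links.append((position, "thick", tuple(labels)))
            continue
        for i, j in combinations(range(len(labels)), 2):
            if digits[i] == digits[j]:
                links.append((position, "thin", (labels[i], labels[j])))
    return links


links = shared_digit_links({
    "pi": "1415926535897932",
    "phi": "6180339887498948",
    "e": "7182818284590452",
})
for link in links:
    print(link)
```

For these digits the sketch reports thin links at positions 2, 3, 9, 14 and 16 and a single thick link at position 12.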

properties of the accidental similarity number

Any properties are accidental, but curiously ASN(π, φ, e) ≈ 1.

If you find other curiously accidental properties, let me know.

data files

Download the first 9,997 digits of the accidental similarity number. This file provides the ASN digit index, the digit and the position from which it is sampled.

other number art

I came up with the accidental similarity number immediately after creating this poster of the overlap between π, φ and e.

The overlap between the three interesting numbers π, φ and e (nixie theme). (PNG, BUY ARTWORK)

This thought stream started with the 4ness of π.

The 4ness of π — a measure of how similar each 4 is to its neighbours. (read more, BUY ARTWORK)

news + thoughts

Replication—Quality over Quantity

Tue 02-09-2014

It's fitting that the column published just before Labor Day weekend is all about how to best allocate labor.

Replication is used to decrease the impact of variability from parts of the experiment that contribute noise. For example, we might measure data from more than one mouse to attempt to generalize over all mice.

Nature Methods Points of Significance column: Replication. (read)

It's important to distinguish technical replicates, which attempt to capture the noise in our measuring apparatus, from biological replicates, which capture biological variation. The former give us no information about biological variation and cannot be used to directly make biological inferences. To do so is to commit pseudoreplication. Technical replicates are useful to reduce the noise so that we have a better chance to detect a biologically meaningful signal.
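
A small calculation shows why technical replicates alone cannot stand in for biological ones. It is a sketch under a simple nested model that I am assuming here: biological and technical noise are independent, and each of `n_bio` biological replicates is measured `n_tech` times. The variance of the overall mean is then var_bio/n_bio + var_tech/(n_bio × n_tech). The function name and the variance values are illustrative and not taken from the column.

```python
def variance_of_mean(var_bio, var_tech, n_bio, n_tech):
    """Variance of the overall mean in a simple nested design.

    Assumes n_bio independent biological replicates, each measured
    n_tech times, with independent biological and technical noise.
    """
    return var_bio / n_bio + var_tech / (n_bio * n_tech)


VAR_BIO, VAR_TECH = 1.0, 0.5   # illustrative values, not from the column

# The same 12 measurements, allocated three different ways.
for n_bio, n_tech in [(1, 12), (4, 3), (12, 1)]:
    v = variance_of_mean(VAR_BIO, VAR_TECH, n_bio, n_tech)
    print(f"{n_bio:2d} biological x {n_tech:2d} technical: variance {v:.3f}")
```

With the total number of measurements fixed, the technical term stays the same and only additional biological replicates shrink the biological term. No number of technical replicates can push the variance below var_bio/n_bio.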

Blainey, P., Krzywinski, M. & Altman, N. (2014) Points of Significance: Replication. Nature Methods 11:879-880.

Background reading

Krzywinski, M. & Altman, N. (2014) Points of Significance: Analysis of variance (ANOVA) and blocking. Nature Methods 11:699-700.

Krzywinski, M. & Altman, N. (2014) Points of Significance: Designing Comparative Experiments. Nature Methods 11:597-598.

...more about the Points of Significance column

Monkeys on a Hilbert Curve—Scientific American Graphic

Tue 19-08-2014

I was commissioned by Scientific American to create an information graphic that showed how our genomes are more similar to those of the chimp and bonobo than to that of the gorilla.

I had about 5 x 5 inches of print space to work with. For 4 genomes? No problem. Bring out the Hilbert curve!

Our genomes are much more similar to the chimp and bonobo than to the gorilla. And, we're practically still Denisovans. (details)

To accompany the piece, I will be posting to the Scientific American blog about the process of creating the figure. And to emphasize that the genome is not a blueprint!

As part of this project, I created some Hilbert curve art pieces. And while exploring, found thousands of Hilbertonians!

Happy Pi Approximation Day— π, roughly speaking 10,000 times

Wed 13-08-2014

Celebrate Pi Approximation Day (July 22nd) with the art of arm waving. This year I take the first 10,000 most accurate approximations (m/n, m=1..10,000) and look at their accuracy.
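
One way to generate such approximations is sketched below. It assumes (my reading of "m/n, m=1..10,000") that for each numerator m the denominator n is the integer that brings m/n closest to π. The function name `best_approximation` is hypothetical.

```python
import math


def best_approximation(m):
    """Best fraction m/n to pi for a fixed numerator m.

    Returns (n, error), where n >= 1 minimizes |m/n - pi|. Only the two
    integers around m/pi can be optimal, so only those are tested.
    """
    lower = max(1, math.floor(m / math.pi))
    candidates = (lower, lower + 1)
    n = min(candidates, key=lambda d: abs(m / d - math.pi))
    return n, abs(m / n - math.pi)


for m in (3, 22, 355):
    n, err = best_approximation(m)
    print(f"{m}/{n} = {m / n:.7f}, error {err:.2e}")

# Errors for every numerator up to 10,000.
errors = [best_approximation(m)[1] for m in range(1, 10_001)]
```

The familiar 22/7 and 355/113 appear as the entries for m = 22 and m = 355.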

Accuracy of the first 10,000 m/n approximations of Pi. (details)

I turned to the spiral again after applying it to stacked ring plots of frequency distributions in Pi for the 2014 Pi Day.

Frequency distribution of digits of Pi in groups of 4 up to digit 4,988. (details)

Analysis of Variance (ANOVA) and Blocking—Accounting for Variability in Multi-factor Experiments

Mon 07-07-2014

Our 10th Points of Significance column! Continuing with our previous discussion about comparative experiments, we introduce ANOVA and blocking. Although this column appears to introduce two new concepts (ANOVA and blocking), you've seen both before, though under a different guise.

Nature Methods Points of Significance column: Analysis of variance (ANOVA) and blocking. (read)

If you know the t-test you've already applied analysis of variance (ANOVA), though you probably didn't realize it. In ANOVA we ask whether the variation within our samples is compatible with the variation between our samples (sample means). If the samples don't all have the same mean then we expect the latter to be larger. The ANOVA test statistic (F) assigns significance to the ratio of these two quantities. When we have only two samples and apply the t-test, t² = F.
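
The identity between the t-test and ANOVA for two samples can be checked numerically. The sketch below uses only the Python standard library and the equal-variance (pooled) form of the t-test, for which the identity holds exactly. The function names and the measurements are made up for illustration.

```python
from statistics import mean, variance


def pooled_t(a, b):
    """Two-sample t statistic with pooled (equal) variance."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / (sp2 * (1 / na + 1 / nb)) ** 0.5


def anova_f(*groups):
    """One-way ANOVA F: between-group over within-group mean square."""
    values = [x for g in groups for x in g]
    grand = mean(values)
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Made-up measurements for two samples.
a = [10.1, 9.8, 10.4, 10.0, 9.7]
b = [10.9, 11.2, 10.6, 11.0, 10.8]

t = pooled_t(a, b)
f = anova_f(a, b)
print(f"t^2 = {t ** 2:.6f}, F = {f:.6f}")
```

The two printed values agree up to floating-point rounding. With three or more groups only `anova_f` applies, which is where ANOVA extends the t-test.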

ANOVA naturally incorporates and partitions sources of variation—the effects of variables on the system are determined based on the amount of variation they contribute to the total variation in the data. If this contribution is large, we say that the variation can be "explained" by the variable and infer an effect.

We discuss how data collection can be organized using a randomized complete block design to account for sources of uncertainty in the experiment. This process is called blocking because we are blocking the variation from a known source of uncertainty from interfering with our measurements. You've already seen blocking in the paired t-test example, in which the subject (or experimental unit) was the block.

We've worked hard to bring you 20 pages of statistics primers (though it feels more like 200!). The column is taking a month off in August, as we shrink our error bars.

Krzywinski, M. & Altman, N. (2014) Points of Significance: Analysis of variance (ANOVA) and blocking. Nature Methods 11:699-700.

Background reading

Krzywinski, M. & Altman, N. (2014) Points of Significance: Designing Comparative Experiments. Nature Methods 11:597-598.

Krzywinski, M. & Altman, N. (2014) Points of Significance: Comparing Samples — Part I — t-tests Nature Methods 11:215-216.

Krzywinski, M. & Altman, N. (2013) Points of Significance: Significance, P values and t-tests. Nature Methods 10:1041-1042.

...more about the Points of Significance column