Martin Krzywinski / Genome Sciences Center

Functional annotation of gene sequences—a visualization workshop. Poznan, Poland. Dec 12, 2015

visualization + design

Typography geek? If you like the geometry and mathematics of these posters, you may enjoy something more lettered. Visions of type: Type Peep Show: The Private Curves of Letters posters.

The art of Pi (`pi`), Phi (`phi`) and `e`

Martin Krzywinski @MKrzywinski
Support Ellie Balk's Kickstarter community math mural project in which Brooklyn students learn math and art to visualize `pi`.

Pi Art Posters: 2013 `pi` day, 2014 `pi` day, 2015 `pi` day, 2014 `pi` approx day, circular `pi` art.

Numbers are a lot of fun. They can start conversations—the interesting number paradox is a party favourite: every number must be interesting because the first number that wasn't would be very interesting! Of course, in the wrong company they can just as easily end conversations.

The art here is my attempt to transform famous numbers in mathematics into pretty visual forms, start some of these conversations and awaken emotions for mathematics other than dislike and confusion.

Like music with numbers? Try Angels at My Door (Una), Pt vs Ys (Yoshinori Sunahara), 2wicky (Hooverphonic), One (Aimee Mann), Straight to Number One (Touch and Go), 99 Luftballons (Nena).

Numerology is bogus, but art based on numbers can be beautiful. Proclus got it right when he said (as quoted by M. Kline in Mathematical Thought from Ancient to Modern Times):

Wherever there is number, there is beauty.
Proclus Diadochus

2,258 digits of `phi`, 3,855 digits of `e` and 3,628 digits of `pi` in 6-level treemaps. Uniform line thickness. Bauhaus prime colors in Piet Mondrian style. (2015 `pi` day posters)
All art posters are available for purchase.
I take custom requests.

the numbers π, φ and e

The consequence of the interesting number paradox is that all numbers are interesting. But some are more interesting than others—how Orwellian!

All animals are equal, but some animals are more equal than others.
—George Orwell (Animal Farm)

Numbers such as `pi` (or `tau` if you're a revolutionary), `phi`, `e`, `i = \sqrt{-1}`, and `0` have captivated imagination. Chances are at least one of them appears in the next physics equation you come across.

`pi` = 3.14159 26535 89793 23846 26433 83279 50288 41971 69399 ...
`phi` = 1.61803 39887 49894 84820 45868 34365 63811 77203 09179 ...
`e` = 2.71828 18284 59045 23536 02874 71352 66249 77572 47093 ...

Of these three irrational numbers (`\pi` and `e` are also transcendental; `\phi` is algebraic), `\pi` (3.14159265...) is the most well known. It is the ratio of a circle's circumference to its diameter (`c = \pi d`) and appears in the formula for the area of the circle (`a = \pi r^2`).

The Golden Ratio (`\phi`, 1.61803398...) is the attractive proportion of values `a > b` that satisfy `{a+b}/a = a/b`, which solves to `a/b = {1 + \sqrt{5}}/2`.
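A quick numerical check of the Golden Ratio, as a sketch. It assumes the usual defining proportion `(a+b)/a = a/b` and sets `b = 1`, so that `a` is `\phi` itself; the variable names are only for illustration.

```python
import math

# Golden ratio from the closed form (1 + sqrt(5)) / 2.
phi = (1 + math.sqrt(5)) / 2

# Defining proportion (a + b) / a = a / b, with b = 1 and a = phi.
a, b = phi, 1.0
lhs = (a + b) / a
rhs = a / b

# Dividing the proportion through by b gives phi^2 - phi - 1 = 0,
# so this residual should vanish up to rounding.
residual = phi**2 - phi - 1

print(f"phi = {phi:.8f}")
print(f"(a+b)/a = {lhs:.8f}, a/b = {rhs:.8f}")
```

Both sides of the proportion agree to floating-point precision, which is all this check is meant to show.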

The last of the three numbers, `e` (2.71828182...) is Euler's number and also known as the base of the natural logarithm. It, too, can be defined geometrically—it is the unique real number, `e`, for which the function `f(x) = e^x` has a tangent of slope 1 at `x=0`. Like `\pi`, `e` appears throughout mathematics. For example, `e` is central in the expression for the normal distribution as well as the definition of entropy. And if you've ever heard of someone talking about log plots ... well, there's `e` again!
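The slope-1 property can be checked numerically. The sketch below estimates the slope of `b^x` at `x=0` with a central difference for a few bases; the helper `slope_at_zero` and the step size are my own choices for illustration.

```python
import math

def slope_at_zero(base, h=1e-6):
    """Central-difference estimate of the slope of base**x at x = 0."""
    return (base**h - base**(-h)) / (2 * h)

# The slope equals ln(base), so it is exactly 1 only for base = e.
for base in (2.0, math.e, 3.0):
    print(f"base {base:.5f}: slope at 0 is about {slope_at_zero(base):.6f}")
```

Bases below `e` give a slope under 1 and bases above `e` give a slope over 1.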

Two of these numbers can be seen together in mathematics' most beautiful equation, the Euler identity: `e^{i pi} = -1`. The tau-oists would argue that this is even prettier: `e^{i tau} = 1`.
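Both forms of the identity are easy to verify in floating point. A minimal sketch using Python's standard `cmath` module; the tiny imaginary parts that appear are rounding error, not a flaw in the identity.

```python
import cmath
import math

tau = 2 * math.pi

z_pi = cmath.exp(1j * math.pi)   # should be -1
z_tau = cmath.exp(1j * tau)      # should be +1

print(z_pi, z_tau)
```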

accidentally similar

Did you notice how the 13th digit of all three numbers (counting the digit before the decimal point) is the same (9)? This accidental similarity generates its own number, the Accidental Similarity Number (ASN).
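Here is a small sketch that looks for such coincidences in the 46 digits of each number printed above. It only finds positions where all three digits agree; it does not compute the ASN itself, whose definition is not given here. Positions are counted from 1 and include the digit before the decimal point.

```python
# Digits exactly as printed above.
PI = "3.14159 26535 89793 23846 26433 83279 50288 41971 69399"
PHI = "1.61803 39887 49894 84820 45868 34365 63811 77203 09179"
E = "2.71828 18284 59045 23536 02874 71352 66249 77572 47093"

def digits(text):
    """Keep only the digit characters (drops spaces and the decimal point)."""
    return [ch for ch in text if ch.isdigit()]

def shared_positions(*numbers):
    """1-based positions at which every number has the same digit."""
    columns = zip(*(digits(n) for n in numbers))
    return [i for i, col in enumerate(columns, start=1) if len(set(col)) == 1]

matches = shared_positions(PI, PHI, E)
print(matches)
```

Within these first 46 digits the only shared position is the 13th.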


news + thoughts

Play the Bacteria Game

Thu 19-11-2015

Choose your own dust adventure!

Nobody likes dusting but everyone should find dust interesting.

Working with Jeannie Hunnicutt and with Jen Christiansen's art direction, I created this month's Scientific American Graphic Science visualization, based on the recent paper "The ecology of microscopic life in household dust."

An analysis of dust reveals how the presence of men, women, dogs and cats affects the variety of bacteria in a household. Appears on the Graphic Science page in the December 2015 issue of Scientific American.

This was my third information graphic for the Graphic Science page. Unlike the previous ones, it's visually simple and ... interactive. Or, at least, as interactive as a printed page can be.

More of my Scientific American Graphic Science designs

Barberan A et al. (2015) The ecology of microscopic life in household dust. Proc. R. Soc. B 282: 20151139.

Names for 5,092 colors

Tue 03-11-2015

A very large list of named colors generated by combining some of the many lists that already exist (X11, Crayola, Raveling, Resene, Wikipedia, xkcd, etc).

Confused? So am I. That's why I made a list.

For each color, coordinates in RGB, HSV, XYZ, Lab and LCH space are given along with the 5 nearest named neighbours, as measured by ΔE.
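To give a feel for what a nearest-neighbour lookup involves, here is a minimal sketch. It assumes sRGB input with a D65 white point and uses the simplest ΔE variant (CIE76, plain Euclidean distance in Lab); the text does not say which ΔE formula the list uses, and the six-color palette and the function names are hypothetical stand-ins.

```python
import math

# Hypothetical mini-palette; the real list has thousands of entries.
NAMED = {
    "red": (255, 0, 0),
    "green": (0, 128, 0),
    "blue": (0, 0, 255),
    "white": (255, 255, 255),
    "black": (0, 0, 0),
    "orange": (255, 165, 0),
}

def srgb_to_lab(rgb):
    """Convert 8-bit sRGB to CIE L*a*b* (D65 white point)."""
    # Undo the sRGB gamma to get linear light.
    lin = []
    for c in rgb:
        c = c / 255
        lin.append(c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4)
    r, g, b = lin
    # Linear RGB to XYZ (sRGB primaries, D65).
    x = 0.4124 * r + 0.3576 * g + 0.1805 * b
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b
    z = 0.0193 * r + 0.1192 * g + 0.9505 * b

    def f(t):
        # Lab transfer function: cube root, with a linear toe near zero.
        d = 6 / 29
        return t ** (1 / 3) if t > d**3 else t / (3 * d**2) + 4 / 29

    # Normalize by the D65 reference white before applying f.
    fx, fy, fz = f(x / 0.95047), f(y / 1.0), f(z / 1.08883)
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))

def delta_e76(lab1, lab2):
    """CIE76 color difference: Euclidean distance in Lab."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(lab1, lab2)))

def nearest_named(rgb, k=5):
    """Names of the k palette colors closest to rgb, nearest first."""
    lab = srgb_to_lab(rgb)
    return sorted(NAMED, key=lambda n: delta_e76(lab, srgb_to_lab(NAMED[n])))[:k]

print(nearest_named((250, 100, 10), k=3))
```

Later ΔE formulas (CIE94, CIEDE2000) weight lightness, chroma and hue differently and can change the ranking, so treat the ordering here as approximate.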

I also provide a web service. Simply call this URL with an RGB string.

Simple Linear Regression

Sat 07-11-2015

It is possible to predict the values of unsampled data by using linear regression on correlated sample data.

This month, we begin our column with a quote, shown here in its full context from Box's paper Science and Statistics.

In applying mathematics to subjects such as physics or statistics we make tentative assumptions about the real world which we know are false but which we believe may be useful nonetheless. The physicist knows that particles have mass and yet certain results, approximating what really happens, may be derived from the assumption that they do not. Equally, the statistician knows, for example, that in nature there never was a normal distribution, there never was a straight line, yet with normal and linear assumptions, known to be false, he can often derive results which match, to a useful approximation, those found in the real world.
Box, G. J. Am. Stat. Assoc. 71, 791–799 (1976).

Nature Methods Points of Significance column: Simple Linear Regression. (read)

This column is our first in the series about regression. We show that regression and correlation are related concepts—they both quantify trends—and that the calculations for simple linear regression are essentially the same as for one-way ANOVA.

While correlation provides a measure of a specific kind of association between variables, regression allows us to fit correlated sample data to a model, which can be used to predict the values of unsampled data.
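As a rough illustration of fitting and predicting, here is a least-squares sketch on made-up data (the numbers and the helper `fit_line` are not from the column). It also checks the link between the two concepts: the fitted slope equals the correlation coefficient times the ratio of the standard deviations.

```python
import math
import statistics

def fit_line(xs, ys):
    """Least-squares slope and intercept, plus the Pearson correlation r."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

# Made-up sample: y rises by roughly 2 for each unit of x.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept, r = fit_line(xs, ys)

# Predict the response at an unsampled x.
x_new = 6
y_pred = intercept + slope * x_new

# The slope is r scaled by the spread of y relative to x.
slope_from_r = r * statistics.stdev(ys) / statistics.stdev(xs)

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, r = {r:.4f}")
print(f"prediction at x = {x_new}: {y_pred:.2f}")
```

Predictions far outside the sampled range of `x` lean entirely on the linearity assumption, so they deserve more caution than interpolated ones.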

Altman, N. & Krzywinski, M. (2015) Points of Significance: Simple Linear Regression Nature Methods 12:999-1000.

Background reading

Altman, N. & Krzywinski, M. (2015) Points of significance: Association, correlation and causation Nature Methods 12:899-900.

Krzywinski, M. & Altman, N. (2014) Points of significance: Analysis of variance (ANOVA) and blocking. Nature Methods 11:699-700.

...more about the Points of Significance column

Association, correlation and causation

Sat 07-11-2015

Correlation implies association, but not causation. Conversely, causation implies association, but not correlation.

This month, we distinguish between association, correlation and causation.

Association, also called dependence, is a very general relationship: one variable provides information about the other. Correlation, on the other hand, is a specific kind of association: an increasing or decreasing trend. Not all associations are correlations. Moreover, causality can be connected only to association.
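A tiny example of an association that is not a correlation, as a sketch with made-up data: `y = x^2` on values of `x` symmetric about zero. Here `y` is completely determined by `x`, yet the Pearson coefficient is zero because there is no overall increasing or decreasing trend.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [-3, -2, -1, 0, 1, 2, 3]

parabola = [x * x for x in xs]     # associated with x, but no trend
line = [2 * x + 1 for x in xs]     # associated with x, with a trend

r_parabola = pearson(xs, parabola)
r_line = pearson(xs, line)

print(f"y = x^2:    r = {r_parabola:.3f}")
print(f"y = 2x + 1: r = {r_line:.3f}")
```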

Nature Methods Points of Significance column: Association, correlation and causation. (read)

We discuss how correlation can be quantified using correlation coefficients (Pearson, Spearman) and show how spurious correlations can arise in random data as well as very large independent data sets. For example, per capita cheese consumption is correlated with the number of people who died by becoming tangled in bedsheets.
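The sketch below illustrates both points with made-up data. Spearman's coefficient is computed as Pearson's coefficient on ranks (this simple `ranks` helper assumes no ties), and a small simulation shows how large a correlation can look between pairs of independent random samples when enough pairs are tried. The sample size, number of trials and seed are arbitrary choices.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Ranks 1..n of the values; assumes no ties, to keep the sketch short."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        out[i] = rank
    return out

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

# A monotonic but strongly non-linear trend.
xs = list(range(1, 11))
ys = [math.exp(x) for x in xs]
r_pearson = pearson(xs, ys)
r_spearman = spearman(xs, ys)

# Spurious correlation: many pairs of independent samples of size 10.
random.seed(1)

def noise(n):
    return [random.gauss(0, 1) for _ in range(n)]

largest = max(abs(pearson(noise(10), noise(10))) for _ in range(200))

print(f"Pearson = {r_pearson:.3f}, Spearman = {r_spearman:.3f}")
print(f"largest |r| among 200 independent pairs: {largest:.3f}")
```

Spearman reports a perfect monotonic trend where Pearson does not, and the simulation turns up a sizeable `|r|` even though every pair is independent by construction.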

Altman, N. & Krzywinski, M. (2015) Points of Significance: Association, correlation and causation Nature Methods 12:899-900.


Bayesian networks

Thu 01-10-2015

For making probabilistic inferences, a graph is worth a thousand words.

This month we continue with the theme of Bayesian statistics and look at Bayesian networks, which combine network analysis with Bayesian statistics.

In a Bayesian network, nodes represent entities, such as genes, and the influence that one gene has over another is represented by an edge and a probability table (or function). Bayes' Theorem is used to calculate the probability of a state for any entity.
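For the smallest possible case, consider a two-node network in which gene A influences gene B. The sketch below uses invented probabilities (they are not from the column) to show how the probability table on the edge and Bayes' Theorem give the probability that A is active once B is observed to be active.

```python
# Hypothetical two-node network A -> B; all probabilities are made up.
p_a = 0.3                                # prior: P(A active)
p_b_given_a = {True: 0.9, False: 0.2}    # table on the edge: P(B active | A)

# Marginal probability that B is active, summing over the states of A.
p_b = p_b_given_a[True] * p_a + p_b_given_a[False] * (1 - p_a)

# Bayes' Theorem: P(A | B) = P(B | A) P(A) / P(B).
p_a_given_b = p_b_given_a[True] * p_a / p_b

print(f"P(B) = {p_b:.3f}")
print(f"P(A | B) = {p_a_given_b:.3f}")
```

Observing B active raises the belief that A is active from the prior of 0.3 to roughly two in three.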

Nature Methods Points of Significance column: Bayesian networks. (read)

In our previous columns about Bayesian statistics, we saw how new information (likelihood) can be incorporated into the probability model (prior) to update our belief of the state of the system (posterior). In the context of a Bayesian network, relationships called conditional dependencies can arise between nodes when information is added to the network. Using a small gene regulation network we show how these dependencies may connect nodes along different paths.
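One way such a conditional dependency can arise is sketched below with a hypothetical three-gene network in which A and B both regulate C (A -> C <- B). The probabilities are invented and this is not the network from the column. A and B are independent until C is observed; once C is known, learning B changes the belief about A.

```python
from itertools import product

# Hypothetical network A -> C <- B; all probabilities are made up.
P_A = 0.3
P_B = 0.3
P_C = {  # P(C active | A, B)
    (True, True): 0.95,
    (True, False): 0.80,
    (False, True): 0.80,
    (False, False): 0.05,
}

def joint(a, b, c):
    """Joint probability of one full assignment of the three nodes."""
    pa = P_A if a else 1 - P_A
    pb = P_B if b else 1 - P_B
    pc = P_C[(a, b)] if c else 1 - P_C[(a, b)]
    return pa * pb * pc

def prob_a(**evidence):
    """P(A active | evidence), summing the joint over unobserved nodes."""
    num = den = 0.0
    for a, b, c in product([True, False], repeat=3):
        state = {"a": a, "b": b, "c": c}
        if any(state[k] != v for k, v in evidence.items()):
            continue
        p = joint(a, b, c)
        den += p
        if a:
            num += p
    return num / den

print(f"P(A)          = {prob_a():.3f}")
print(f"P(A | B)      = {prob_a(b=True):.3f}")
print(f"P(A | C)      = {prob_a(c=True):.3f}")
print(f"P(A | C, B)   = {prob_a(c=True, b=True):.3f}")
```

Without information about C, knowing B tells us nothing about A. With C observed, B "explains away" some of the evidence for A, so the two nodes become dependent.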

Background reading

Puga, J.L., Krzywinski, M. & Altman, N. (2015) Points of Significance: Bayesian Statistics Nature Methods 12:377-378.

Puga, J.L., Krzywinski, M. & Altman, N. (2015) Points of Significance: Bayes' Theorem Nature Methods 12:277-278.


Unentangling complex plots

Fri 10-07-2015

The Points of Significance column is on vacation this month.

Meanwhile, we're showing you how to manage small multiple plots in the Points of View column Unentangling Complex Plots.

Nature Methods Points of View column: Unentangling complex plots. (download, more about Points of View)

Data in small multiples can vary in range, noise level and trend. Gregor McInerny and I show how to deal with this by cropping and scaling the multiples to a different range, which emphasizes relative changes while preserving the context of the full data range to show absolute changes.
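One way to think about the bookkeeping behind this, as a sketch only and not the method from the column: for each panel, compute a cropped axis range around its own data, and record what share of the full data range that crop represents so the panel can still carry a cue about absolute scale. The function name, padding and data below are all hypothetical.

```python
def panel_ranges(panels, pad=0.05):
    """Per-panel cropped y-range plus its share of the full data range.

    panels maps a panel name to its list of y values. A flat series
    would get a zero-height crop; a real plot would need a fallback.
    """
    full_lo = min(min(v) for v in panels.values())
    full_hi = max(max(v) for v in panels.values())
    out = {}
    for name, values in panels.items():
        lo, hi = min(values), max(values)
        margin = (hi - lo) * pad
        out[name] = {
            "crop": (lo - margin, hi + margin),                 # axis limits for this panel
            "share_of_full": (hi - lo) / (full_hi - full_lo),   # context cue
        }
    return (full_lo, full_hi), out

# Made-up series with very different ranges.
panels = {
    "a": [10, 12, 11, 13],
    "b": [100, 180, 140, 200],
    "c": [50.0, 50.5, 50.2, 50.4],
}

full, ranges = panel_ranges(panels)
for name, info in ranges.items():
    lo, hi = info["crop"]
    print(f"{name}: crop {lo:.2f} to {hi:.2f}, {info['share_of_full']:.1%} of full range")
```

The crop makes the relative change in each panel visible, while the share of the full range is the kind of number a scale bar or inset could encode to keep absolute changes in view.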

McInerny, G. & Krzywinski, M. (2015) Points of View: Unentangling complex plots. Nature Methods 12:591.

...more about the Points of View column