Martin Krzywinski / Genome Sciences Center / mkweb.bcgsc.ca

In Silico Flurries: Computing a world of snow. Scientific American. 23 December 2017


visualization + design

Some of the images in this writeup are part of Ana Swanson's Wonkblog post How a dog sees a rainbow, and 12 other images that explain how we see color at the Washington Post.

Color Palettes for Color Blindness

In an audience of 8 men and 8 women, chances are 50% that at least one has some degree of color blindness1,2. When encoding information or designing content, use colors that are color-blind safe.

1About 8% of males and 0.5% of females are affected by some kind of color blindness in populations of European descent (wikipedia, Worldwide prevalence of red-green color deficiency, JOSAA). The rate in other populations, such as Asians and Africans, is lower (Caucasian Boys Show Highest Prevalence of Color Blindness Among Preschoolers, AAO).

2The probability that among `N=8` men and `N=8` women at least one person is affected by color blindness is `P(men,women) = P(8,8) = 1 - (1-0.08)^8 * (1-0.005)^8 = 0.51`. For `N=34` (i.e., 68 people in total), this probability is `P(34,34)=0.95`. Because the rate of color blindness in women is so low, for most groups of mixed gender we can approximate the probability by only counting the men. For example, in a group of 17 women the probability that at least one of them is color blind is `P(0,17) = 0.082`, which is about the same as the probability for 1 man, `P(1,0) = 0.08`.
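The arithmetic in this footnote can be checked with a few lines of Python. This is a minimal sketch: the function name `p_at_least_one` is made up for illustration, and the rates (8% for men, 0.5% for women) are the ones quoted in the footnotes.

```python
def p_at_least_one(n_men, n_women, rate_men=0.08, rate_women=0.005):
    """Probability that at least one person in the group is color blind.

    Assumes each person is affected independently at the given rates.
    """
    # Probability that nobody in the group is affected.
    p_none = (1 - rate_men) ** n_men * (1 - rate_women) ** n_women
    return 1 - p_none


# The groups discussed in the footnote: (men, women).
for men, women in [(8, 8), (34, 34), (0, 17), (1, 0)]:
    print(f"P({men},{women}) = {p_at_least_one(men, women):.3f}")
```

Setting `n_women=0` gives the men-only approximation mentioned in the footnote.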

Color Oracle is a good and free color blindness simulator for Windows, Mac and Linux.

color blindness RGB transformation tables

You can download the RGB transformation table for deuteranopia, protanopia and tritanopia. It is available for all (r,g,b) colors in steps of 5 in each of the channels. The mapping for all other RGB colors can be interpolated.

Transformations for all 16.8 million RGB colors (interpolated from the table above) are also available independently for each type of color blindness: deuteranopia, protanopia, and tritanopia.
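One way to interpolate a color that falls between the table's grid points is trilinear interpolation over the 8 surrounding grid colors. The sketch below is only an illustration: the page does not say which interpolation method was used, the table's file format is not described here, and `lookup` is a hypothetical callable standing in for however you load the table.

```python
from itertools import product

STEP = 5  # grid spacing of the table in each channel, per the text


def interpolate(lookup, r, g, b, step=STEP):
    """Trilinearly interpolate a transformed color from a coarse RGB grid.

    `lookup` is a hypothetical callable that maps a grid color (r, g, b),
    with every channel a multiple of `step`, to its transformed (r, g, b).
    Inputs are assumed to be integers in 0..255.
    """
    lows, fracs = [], []
    for v in (r, g, b):
        lo = min((v // step) * step, 255 - step)  # keep the upper corner <= 255
        lows.append(lo)
        fracs.append((v - lo) / step)

    out = [0.0, 0.0, 0.0]
    # Visit the 8 corners of the grid cell that contains (r, g, b).
    for corner in product((0, 1), repeat=3):
        weight = 1.0
        for bit, f in zip(corner, fracs):
            weight *= f if bit else 1 - f
        if weight == 0:
            continue
        mapped = lookup(*(lo + bit * step for lo, bit in zip(lows, corner)))
        for i in range(3):
            out[i] += weight * mapped[i]
    return tuple(round(c) for c in out)


# Stand-in table for demonstration only: maps every grid color to itself.
identity = lambda r, g, b: (r, g, b)
print(interpolate(identity, 7, 128, 255))
```

With a real table, `lookup` could be a dictionary's `__getitem__` wrapped to accept three arguments.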

color receptors are reduced or absent in color blindness

The normal human eye is a 3-channel color detector3. There are three types of photoreceptors, each sensitive to a different part of the spectrum. Their combined response to a given wavelength is unique and is the basis of the perception of color.

3Compared to hearing, color vision is a primitive detector. While we can hear thousands of distinct frequencies and process them simultaneously, we have only three independent color inputs. While the ear can distinguish pure tones from complex sounds that have multiple frequencies, the eye is relatively unsophisticated in separating a color sensation into its three constituent primary stimuli.

People with color blindness have one of the photoreceptor groups either reduced in number or entirely missing. With only two groups of photoreceptors, the perception of hue is drastically altered.

For example, in deuteranopia, the most common type of color blindness, the medium (M) wavelength photoreceptors are reduced in number or missing. This results in the loss of perceived difference between reds and greens because only one group of photoreceptors (L) is sensitive to the wavelengths of these colors. The spectrum appears to be split into two hues along the blue-green boundary (see figure below).

Color photoreceptor profile for color blindness and the appearance of art and objects. / Martin Krzywinski @MKrzywinski mkweb.bcgsc.ca
Each of the three kinds of color blindness is associated with a reduced number of one of the three kinds of photoreceptors. In extreme cases, a given type of photoreceptor may be missing. To people with color blindness, objects appear very different. Artwork is (left) Edvard Munch, Scream (Skrik), 1893, National Gallery, Oslo, Norway (right) Claude Monet, Coquelicots, La promenade (Poppies), 1873, Musée d'Orsay, Paris.

Visible light is in the range of 390-700 nm. The exact definition of the upper limit varies, with some sources giving as high as 760 nm. Shorter wavelengths are absorbed by the cornea (<295nm) and lens (315-390nm). Some near infrared light also reaches the retina (760-1400nm).

super color vision

The opposite condition to color blindness exists too—tetrachromacy. In this case, an individual has an extra type of color receptor which improves discrimination in the red part of the spectrum. While the anatomy of their retina can be described, how true tetrachromats subjectively perceive color is unknown. And, perhaps, even unknowable.

Tetrachromacy is common in other animals, such as fish (e.g. goldfish, zebrafish) and birds (e.g. finch, starling). The dimensionality of the perceived color space isn't necessarily proportional to the number of different receptors. If the signals from 3 color receptors are combined by the brain and each receptor has a weighted response to a broad range of wavelengths, then a color can be modeled by a point in 3-dimensional space, in which the receptors are the axes. This system can perceive a large number of colors.

In the extreme case where each receptor responds to a very narrow range, none of which overlaps with the others, a color is one of three points in a 1-dimensional space. This system can perceive only 3 colors.
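The contrast between broadly and narrowly tuned receptors can be illustrated with a toy model. Everything numeric here is an assumption chosen for illustration, not measured physiology: the peak wavelengths in `PEAKS`, the Gaussian shape of the broad receptor, the box shape of the narrow one, and both widths.

```python
import math

PEAKS = (440.0, 540.0, 570.0)  # illustrative peak wavelengths in nm (assumed)


def broad(wavelength, peak, width=50.0):
    """Broadly tuned receptor: graded Gaussian response in (0, 1]."""
    return math.exp(-(((wavelength - peak) / width) ** 2))


def narrow(wavelength, peak, width=5.0):
    """Narrowly tuned receptor: responds only within `width` nm of its peak."""
    return 1.0 if abs(wavelength - peak) <= width else 0.0


def distinct_signals(receptor, wavelengths):
    """Count distinct non-zero response triples, rounded to 2 decimals."""
    signals = set()
    for w in wavelengths:
        triple = tuple(round(receptor(w, p), 2) for p in PEAKS)
        if any(triple):  # skip wavelengths that no receptor responds to
            signals.add(triple)
    return len(signals)


wavelengths = range(400, 701, 5)
print("broad receptors: ", distinct_signals(broad, wavelengths))
print("narrow receptors:", distinct_signals(narrow, wavelengths))
```

In this sketch the narrow, non-overlapping receptors produce only three distinct signals, one per receptor, while the broad, overlapping receptors produce many different response triples across the same wavelengths.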

For example, although the mantis shrimp has 12 different color receptors, the receptors work independently and its color discrimination is poorer than ours.

it's all the same to me

Color palette for color blindness.
Sets of representative hues and tones that are indistinguishable to individuals with different kinds of color blindness. The rectangle below each color pair shows how the colors appear to someone with color blindness.

If you use Color Oracle to transform your screen colors to simulate color blindness, you can see that none of the equivalent swatches in one kind of color blindness are equivalent in another. This is particularly interesting when applied to a duotone image which is drawn using equivalent colors. In the figure below4, a row of Mr. Spocks disappears (or is difficult to see) to people with color blindness.

Spock, indistinguishable to people with color blindness.
The likeness of Mr. Spock drawn using equivalent colors (see image above) for each of the three kinds of color blindness. Image from imagebuddy.

4In tribute to Leonard Nimoy, 1931–2015

conservative 7-color palettes for color blindness

To people with color blindness, some colors appear the same. This equivalence can be used to identify colors that remain distinct to people with both normal and color blind vision.

The seven colors (and black) in the figure below are perceived as reasonably distinct by both normal and color blind individuals. The table on the left is reproduced from Nature Methods' Points of View: Color blindness by Bang Wong.

Color palette for color blindness.
(left) 7-color palette for color blindness, from Wong, B. (2011) Nature Methods 8:441. (right) The same palette ordered by hue and tone for a deuteranope.

For more tips about designing with color blindness in mind, see Color Universal Design (CUD) — How to make figures and presentations that are friendly to Colorblind people.

12-color palettes for color blindness

The figure below shows the mapping of different colors to six different grades of each of the two hues seen by deuteranopes. It offers more distinct options than the 7-color palette above.

Color palette for color blindness.
(left) Colors grouped by equivalence of perception in deuteranopes. Each of the two hues is represented in six different brightness and chroma combinations. (right) One of the subsets of colors on the left that are reasonably distinct in both deuteranopia and protanopia. To tritanopes, three of the pairs are difficult to distinguish.

15-color palettes for color blindness

Even more color choices for color blindness, including colors that map onto greys.

If you're looking to encode quantitative information, I suggest using the subset of Brewer palettes that are safe for color blindness (e.g. pink-yellow-green, brown-blue-green).

15-color palettes for color blindness
15-color palettes designed for each of the three types of color blindness: deuteranopia, protanopia and tritanopia. Palettes are shown as they appear to someone with normal vision as well as to someone affected with each of the three types of color blindness. Each palette contains three groups of swatches, matching to two of the color channels and greys. Within each group colors in the same row map onto the same color.

news + thoughts

Curse(s) of dimensionality

Tue 05-06-2018
There is such a thing as too much of a good thing.

We discuss the many ways in which analysis can be confounded when data has a large number of dimensions (variables). Collectively, these are called the "curses of dimensionality".

Nature Methods Points of Significance column: Curse(s) of dimensionality.

Some of these are unintuitive, such as the fact that the volume of the hypersphere increases and then shrinks beyond about 7 dimensions, while the volume of the hypercube always increases. This means that high-dimensional space is "mostly corners" and the distance between points increases greatly with dimension. This has consequences on correlation and classification.

Altman, N. & Krzywinski, M. (2018) Points of significance: Curse(s) of dimensionality Nature Methods 15:399–400.

Statistics vs Machine Learning

Tue 03-04-2018
We conclude our series on Machine Learning with a comparison of two approaches: classical statistical inference and machine learning. The boundary between them is subject to debate, but important generalizations can be made.

Inference creates a mathematical model of the data-generation process to formalize understanding or test a hypothesis about how the system behaves. Prediction aims at forecasting unobserved outcomes or future behavior. Typically we want to do both: know how biological processes work and what will happen next. Inference and ML are complementary in pointing us to biologically meaningful conclusions.

Nature Methods Points of Significance column: Statistics vs machine learning.

Statistics asks us to choose a model that incorporates our knowledge of the system, and ML requires us to choose a predictive algorithm by relying on its empirical capabilities. Justification for an inference model typically rests on whether we feel it adequately captures the essence of the system. The choice of pattern-learning algorithms often depends on measures of past performance in similar scenarios.

Bzdok, D., Krzywinski, M. & Altman, N. (2018) Points of Significance: Statistics vs machine learning. Nature Methods 15:233–234.

Background reading

Bzdok, D., Krzywinski, M. & Altman, N. (2017) Points of Significance: Machine learning: a primer. Nature Methods 14:1119–1120.

Bzdok, D., Krzywinski, M. & Altman, N. (2018) Points of Significance: Machine learning: supervised methods. Nature Methods 15:5–6.


Happy 2018 `\pi` Day—Boonies, burbs and boutiques of `\pi`

Wed 14-03-2018

Celebrate `\pi` Day (March 14th) and go to brand new places. Together with Jake Lever, this year we shrink the world and play with road maps.

Streets from across the world are seamlessly joined. Finally, a halva shop on the same block!

A great 10 km run loop between Istanbul, Copenhagen, San Francisco and Dublin. Stop off for halva, smørrebrød, espresso and a Guinness on the way.

Intriguing and personal patterns of urban development for each city appear in the Boonies, Burbs and Boutiques series.

In the Boonies, Burbs and Boutiques of `\pi` we draw progressively denser patches using the digit sequence 159 to inform density.

No color—just lines. Lines from Marrakesh, Prague, Istanbul, Nice and other destinations for the mind and the heart.

Roads from cities rearranged according to the digits of `\pi`.

The art is featured in Pi City on the Scientific American SA Visual blog.

Check out art from previous years: 2013 `\pi` Day, 2014 `\pi` Day, 2015 `\pi` Day, 2016 `\pi` Day and 2017 `\pi` Day.

Machine learning: supervised methods (SVM & kNN)

Thu 18-01-2018
Supervised learning algorithms extract general principles from observed examples guided by a specific prediction objective.

We examine two very common supervised machine learning methods: linear support vector machines (SVM) and k-nearest neighbors (kNN).

SVM is often less computationally demanding than kNN and is easier to interpret, but it can identify only a limited set of patterns. On the other hand, kNN can find very complex patterns, but its output is more challenging to interpret.

Nature Methods Points of Significance column: Machine learning: supervised methods (SVM & kNN).

We illustrate SVM using a data set in which points fall into two categories, which are separated in SVM by a straight line "margin". SVM can be tuned using a parameter that influences the width and location of the margin, permitting points to fall within the margin or on the wrong side of the margin. We then show how kNN relaxes explicit boundary definitions, such as the straight line in SVM, and how kNN too can be tuned to create more robust classification.
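To make the kNN idea concrete, here is a minimal sketch of a majority-vote classifier in plain Python. It is not the column's code: the function name `knn_predict`, the toy data, and the query point are all made up for illustration. The parameter `k` is the tuning knob, and larger values make the vote less sensitive to a single stray point.

```python
from collections import Counter
import math


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of ((x, y), label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data: two clusters, plus one "B" point sitting among the "A" points.
train = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.0), "A"),
    ((5.0, 5.0), "B"), ((6.0, 5.5), "B"), ((5.5, 6.5), "B"),
    ((2.2, 1.8), "B"),
]

query = (2.0, 1.7)
print(knn_predict(train, query, k=1))  # decided by the single closest point
print(knn_predict(train, query, k=3))  # decided by a vote of three neighbors
```

With `k=1` the query takes the label of the stray "B" point next to it. With `k=3` the two nearby "A" points outvote it. No explicit boundary is ever computed, which is the sense in which kNN relaxes the straight-line boundary of a linear SVM.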

Bzdok, D., Krzywinski, M. & Altman, N. (2018) Points of Significance: Machine learning: supervised methods. Nature Methods 15:5–6.

Background reading

Bzdok, D., Krzywinski, M. & Altman, N. (2017) Points of Significance: Machine learning: a primer. Nature Methods 14:1119–1120.
