Martin Krzywinski / Genome Sciences Center

science: exciting

Workshop at Brain and Mind Symposium, Långvik Congress Center, Kirkkonummi, Sep 17–18 2015.

communication + science

Nature Methods: Points of View

Martin Krzywinski @MKrzywinski
The full collection of 35 Points of View columns is now available. (3 years of Points of View)

Practical Tips for Effective Figures

Points of View — History

In its 2.5-year history, the PoV column has established a significant legacy: it is one of the most frequently accessed parts of Nature Methods. The reason, I think, is clear: the community sees the value in clear and effective visual communication and acknowledges the need for a forum in which best practices in the field are presented practically and accessibly.

Bang Wong, in collaboration with visiting authors (Noam Shoresh, Nils Gehlenborg, Cydney Nielsen and Rikke Schmidt Kjærgaard), has penned 29 columns in the period of August 2010 to December 2012, covering broad topics such as salience, Gestalt principles, color, typography, negative space, layout, and data integration.

When it was A.C. Grayling's turn to speak at a debate in which Christopher Hitchens and Richard Dawkins had already made their points, Grayling said:

When one gets up to speak this late in a debate, one is a bit tempted to quote that Hungarian M.P. who after a long, long, long discussion in the parliament in Budapest stood up and said, "Everything has been said but not everybody said it yet." (watch on YouTube)

Indeed, this is quite how I feel after being invited to be the new author of the Nature Methods Points of View column. Both Bang and Hitchens provide significant inspiration for me, so Grayling's words are particularly fitting.

To improve on the column is impossible. My challenge is to identify useful topics that have not yet been covered. I will be working closely with Nature Methods and Bang to ensure that the columns strike the right balance of topic, tone and timbre.

Don't hesitate to let me know whether PoV continues to hold your interest.

Nature Editors Announce Return of Points of View

The announcement of the return of the column, together with its history and a description of me, the new author, is available at the Nature Methods methagora blog.

Humor is kept by repeated reference to my now-dead-but-once-famous pet rat.

Points of View Collection now Open Access

For the month of August 2013, the entire set of 35 columns is available for free.

Common Challenges in Figure Design

Andreas Dahlin runs a figure-making course at Uppsala University. He was kind enough to share with me the common questions and concerns that his students have when creating figures (emphasis is mine).

I face problems for using the tools in power point to make nice illustration figures, and in addition how one can enhance the resolution of the figures to print it in a high quality mode.

In my opinion, the most difficult thing is how to draw the good-looking pictures and design the structure of slide to make it simple and substantial in content.

I find it difficult to find the right software to draw pictures.

The most difficult thing for me, when I make a figure, is to arrange the parts of the figure in a way they look nice and understandable.

I think the most difficult part is creating the concept, how to make a figure easy and fast to understand but not lacking all essential parts.

Stepping outside of my own knowledge of what the picture presents and viewing it as someone who sees it for the first time. It's easy to assume that some things are self evident and not making them clear enough in the pictures.

Figures that not are plots can also be tricky to get to look nice.

Anytime you have to draw something in paint, gimp, or other image program it requires a lot of work to make it look even slightly better than crap.

The most difficult thing (in general) is to include as much information as possible and display it in a way that is easy to understand. Figures should be intuitive for the reader, which is sometimes difficult to achieve. There might also be technical difficulties in achieving what you've visualized.

I think the most difficult part for me is to highlight the main idea I would like to express.

For me the most difficult part is making 3-D figures. Also while making figures its hard to decide on the good colors to choose for the figure.

In my opinion, the most difficult part when making a figure is don't know which software we can use and how to use.

The most difficult part for me is to start it! Because I am so meticulous and I am a painter, then it is not so easy to make decision about my figures and which one is better and so on, then finally I give up and put just one figure which of course I don't like...

I think it is difficult to put together my ideas to something that is connected and makes it easier for the viewer to understand.

It is so easy to just get an image from internet. I don’t know what is ok to do. There seems to be different rules in different communities.

To come up with a figure that does not simplify the concept too much at the same time as it does not overwhelm the viewer. To get some ideas for this is the reason why I take the course. ;-)

To me, how to make it easy to understand is the difficult part.

I think it is to save it in the correct format: Raster or vector, png or jpg or pdf... especially if I want to make some changes in the future to the figure.

I think is to choose the most appropriate figure that really help to transmit the information we want. Then, how many words can be good enough for been part of the message. At the beginning I used to use too many.

Apart from the difficulty of making the figure clear and easy to understand, the biggest problem I'm having is the captions. How long and detailed description is appropriate, so it neither steals attention from the figure nor leaves out too much important information.

I think the most difficult part is to have high resolution image once we want to save it. My experience is when finish with drawing, the file size sometimes to large for high quality image and if we downgrade it, the image becomes bad.

The most difficult part when i making a figure is the software using part, I'm not good at computer so that part is annoying for me all the time.

I think the most difficult is to find out how to condensate many ideas in one picture without making it difficult to understand.

The most difficult part is the get the image to not look too amateurish that people focus on that instead of the message.

The most difficult part when doing a figure is to let it speak for itself, i.e. to not have long caption text.

To be able to depict all the desirable results on a single figure is sometimes not that easy. It becomes more critical when a figure is to be fitted within a certain size frame. An exact placing of a figure in some text editors often comes along with difficulties.

The most difficult part when making a figure is to make it simple and still be informative.

Depends a lot on the kind of figure, but generally it is to get clarity in the design, such that the idea is conceived easily. This requires some good outline (usually an iterative process).

The most difficult part to make a figure is the need to express abstract concepts into drawings.

The compromise between include detailed information and at the same time be readable (figures in articles)

To compress all information and ideas you have in your head into short and clear message.

I feel the difficulty in choosing a right resolution of the picture and the angle that could visualize all the details. And also choosing right test/label colour, size, font. Another difficulty for me is continuation from one slide to another.

I believe that my biggest problem would be making nice flux charts. Generally the ones I draw look too crude, it does not look beautiful. I have no concern about making an image that can represent an idea, but making a beautiful image makes it more pleasing to the eyes of the people who will read my work.

It is very difficult to make the figure delicate. I am still not get used to put all the small components together to integrate the figure by the vector software, instead of drawing it out directly.

I think the most difficult part is to make the image simple but yet informative.

I find it very difficult to make an original clarity picture in a particular format after dimensioning it according to the requirement.

Some times it is difficult to limit the size (Bytes) of the picture when going for high clarity remake.

Making the figure as informative as you want while keeping it simple enough to grasp quickly.

For me, the more difficult part is to create a figure that contains or tells all the information that I want to transmit, but keeping the figure simple, clean and not overloaded.

The most difficult for me is make it easily to be understood meanwhile containing the essential information.

The most difficult thing when developing a figure is ... to remove the bloat but keep the message. (Besides the very most difficult: finding out what I want to tell.)

For me the most difficult part is to choose colors with right contrast and to make it more attractive and catchy.

points of view — bibliography

1. Streit M, Gehlenborg N 2014 Bar charts and box plots Nat Methods 11:117.
2. Krzywinski M, Cairo A 2013 Storytelling Nat Methods 10:687-687.
3. Krzywinski M, Savig E 2013 Multidimensional Data Nat Methods 10:595-595.
4. Krzywinski M, Wong B 2013 Plotting symbols Nat Methods 10:451.
5. Krzywinski M 2013 Elements of visual style Nat Methods 10:371.
6. Krzywinski M 2013 Labels and callouts Nat Methods 10:275.
7. Krzywinski M 2013 Axes, ticks and grids Nat Methods 10:183.
8. Wong B 2012 Visualizing biological data Nat Methods 9:1131.
9. Wong B, Kjærgaard RS 2012 Pencil and paper Nat Methods 9:1037.
10. Gehlenborg N, Wong B 2012 Power of the plane Nat Methods 9:935.
11. Gehlenborg N, Wong B 2012 Into the third dimension Nat Methods 9:851.
12. Gehlenborg N, Wong B 2012 Mapping quantitative data to color Nat Methods 9:769.
13. Nielsen C, Wong B 2012 Representing genomic structural variation Nat Methods 9:631.
14. Nielsen C, Wong B 2012 Managing deep data in genome browsers Nat Methods 9:521.
15. Nielsen C, Wong B 2012 Representing the genome Nat Methods 9:423.
16. Gehlenborg N, Wong B 2012 Integrating data Nat Methods 9:315.
17. Gehlenborg N, Wong B 2012 Heat maps Nat Methods 9:213.
18. Gehlenborg N, Wong B 2012 Networks Nat Methods 9:115.
19. Shoresh N, Wong B 2012 Data exploration Nat Methods 9:5.
20. Wong B 2011 The design process Nat Methods 8:987.
21. Wong B 2011 Salience to relevance Nat Methods 8:889.
22. Wong B 2011 Layout Nat Methods 8:783.
23. Wong B 2011 Arrows Nat Methods 8:701.
24. Wong B 2011 Simplify to clarify Nat Methods 8:611.
25. Wong B 2011 Avoiding color Nat Methods 8:525.
26. Wong B 2011 Color blindness Nat Methods 8:441.
27. Wong B 2011 The overview figure Nat Methods 8:365.
28. Wong B 2011 Typography Nat Methods 8:277.
29. Wong B 2011 Points of review (part 2) Nat Methods 8:189.
30. Wong B 2011 Points of review (part 1) Nat Methods 8:101.
31. Wong B 2011 Negative space Nat Methods 8:5.
32. Wong B 2010 Gestalt principles (part 2) Nat Methods 7:941.
33. Wong B 2010 Gestalt principles (part 1) Nat Methods 7:863.
34. Wong B 2010 Salience Nat Methods 7:773.
35. Wong B 2010 Design of data figures Nat Methods 7:665.
36. Wong B 2010 Color coding Nat Methods 7:573.

Visualization + Design Resources


Circos — circular whole-genome information graphics

Circos table viewer — display of tabular data in circular form

Hive plots — rational, quantitative and reproducible network visualization

High dynamic time range photography (HDTR) — imaging the flow of time

Instruction, Tutorials and Talks

Visual Design Principles keynote at VIZBI 2013.

Science design talk at Bloomberg Design Conference 2013 (Bloomberg TV video, and video conversation with Alberto Cairo, moderated by Sam Grobart.)

Needles in stacks of needles keynote at IEEE International Conference on Data Mining 2012

20 imperatives of information design poster at Biovis 2012

Data Visualization: Communicating Clearly talk at Schloss Dagstuhl 2012 Data Visualization in Biology workshop

Brewer palettes — benefit of using perceptual color spaces

Color palettes matter (talk) — learn about color, color spaces and why they matter

Visualization principles tutorial at VIZBI 2012 — learn how we visually interpret and organize information and how to apply these principles to creating figures and software interfaces

Effect of resolution on sequence visualization (handout) — understand how output resolution affects display of highly textured genomic annotations

PSA Genomics Workshop 2011: Designing Effective Visualizations in Biology and Circos and Hive Plots: Challenging visualization paradigms in genomics and network analysis.

news + thoughts

Bayesian statistics

Thu 30-04-2015

Building on last month's column about Bayes' Theorem, we introduce Bayesian inference and contrast it to frequentist inference.

Given a hypothesis and a model, the frequentist calculates the probability of different data generated by the model, P(data|model). When the probability of obtaining the observed data from the model is small (e.g. below `alpha` = 0.05), the frequentist rejects the hypothesis.

Martin Krzywinski @MKrzywinski
Nature Methods Points of Significance column: Bayesian Statistics. (read)

In contrast, the Bayesian makes direct probability statements about the model by calculating P(model|data). In other words, the Bayesian calculates the probability that the model is correct, given the observed data. With this approach it is possible to compare the probabilities of different models and identify the one that is most compatible with the data.

The Bayesian approach is actually more intuitive. From the frequentist point of view, the probability used to assess the veracity of a hypothesis, P(data|model), commonly referred to as the P value, does not help us determine the probability that the model is correct. In fact, the P value is commonly misinterpreted as the probability that the hypothesis is right. This is the so-called "prosecutor's fallacy", which confuses the conditional probability P(data|model) with P(model|data). It is the latter quantity that is more directly useful, and it is the one calculated by the Bayesian.
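The difference between the two conditional probabilities can be made concrete with a small numerical sketch. The two coin models, the observed data (8 heads in 10 flips) and the equal prior probabilities are illustrative assumptions and are not taken from the column.

```python
from math import comb

def binomial_likelihood(heads, flips, p_heads):
    """P(data|model): probability of `heads` in `flips` for a coin with P(heads) = p_heads."""
    return comb(flips, heads) * p_heads**heads * (1 - p_heads)**(flips - heads)

# Hypothetical models and data, chosen only for illustration.
models = {"fair": 0.5, "biased": 0.75}
prior = {"fair": 0.5, "biased": 0.5}   # assumed equal prior belief in each model
heads, flips = 8, 10

# P(data|model) for each model
likelihood = {m: binomial_likelihood(heads, flips, p) for m, p in models.items()}

# P(model|data) via Bayes' theorem: likelihood times prior, normalized by P(data)
p_data = sum(likelihood[m] * prior[m] for m in models)
posterior = {m: likelihood[m] * prior[m] / p_data for m in models}

for m in models:
    print(f"{m}: P(data|model) = {likelihood[m]:.3f}, P(model|data) = {posterior[m]:.3f}")
```

Under these assumptions the fair coin assigns the observed data a probability of about 0.044, yet the probability that the coin is fair given the data is about 0.135. Reading the first number as if it were the second is the confusion described above.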

Puga, J.L, Krzywinski, M. & Altman, N. (2015) Points of Significance: Bayes' Theorem Nature Methods 12:277-278.

Background reading

Puga, J.L, Krzywinski, M. & Altman, N. (2015) Points of Significance: Bayes' Theorem Nature Methods 12:277-278.

...more about the Points of Significance column

Bayes' Theorem

Wed 22-04-2015

In our first column on Bayesian statistics, we introduce conditional probabilities and Bayes' theorem:

P(B|A) = P(A|B) × P(B) / P(A)

This relationship between conditional probabilities P(B|A) and P(A|B) is central in Bayesian statistics. We illustrate how Bayes' theorem can be used to quickly calculate useful probabilities that are more difficult to conceptualize within a frequentist framework.

Martin Krzywinski @MKrzywinski
Nature Methods Points of Significance column: Bayes' Theorem. (read)

Using Bayes' theorem, we can incorporate our beliefs and prior experience about a system and update them when data are collected.
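As a rough illustration of this updating, the sketch below applies Bayes' theorem to a diagnostic test, with B as "has the condition" and A as "tests positive". The prevalence, sensitivity and false positive rate are made-up numbers, the function name is hypothetical, and the two tests are assumed to be independent.

```python
def bayes_update(prior, p_pos_given_true, p_pos_given_false):
    """Return P(condition | positive test) from P(condition) using Bayes' theorem."""
    # P(positive), by the law of total probability
    p_pos = p_pos_given_true * prior + p_pos_given_false * (1 - prior)
    return p_pos_given_true * prior / p_pos

# Hypothetical numbers, chosen only for illustration.
prevalence = 0.01       # prior belief: P(condition)
sensitivity = 0.95      # P(positive | condition)
false_positive = 0.05   # P(positive | no condition)

after_one = bayes_update(prevalence, sensitivity, false_positive)
# The posterior after the first test becomes the prior for the second.
after_two = bayes_update(after_one, sensitivity, false_positive)

print(f"after one positive test:  {after_one:.3f}")
print(f"after two positive tests: {after_two:.3f}")
```

With these numbers a single positive result raises the probability of the condition from 0.01 to about 0.161, and a second positive result raises it to about 0.785. Each posterior serves as the prior for the next piece of data.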

Puga, J.L, Krzywinski, M. & Altman, N. (2015) Points of Significance: Bayes' Theorem Nature Methods 12:277-278.

Background reading

Oldford, R.W. & Cherry, W.H. Picturing probability: the poverty of Venn diagrams, the richness of eikosograms. (University of Waterloo, 2006)

...more about the Points of Significance column

Happy 2015 Pi Day—can you see `pi` through the treemap?

Sat 14-03-2015

Celebrate `pi` Day (March 14th) by splitting its digits endlessly. This year I use a treemap approach to encode the digits in the style of Piet Mondrian.

Martin Krzywinski @MKrzywinski
Digits of `pi`, `phi` and `e`. (details)

The art has been featured in Ana Swanson's Wonkblog article at the Washington Post—10 Stunning Images Show The Beauty Hidden in `pi`.

I also have art from 2013 `pi` Day and 2014 `pi` Day.

Split Plot Design

Tue 03-03-2015

The split plot design originated in agriculture, where some factors are more difficult than others to apply on a small scale. For example, it's harder to cost-effectively irrigate a small piece of land than a large one. These differences are also present in biological experiments. For example, temperature and housing conditions are easier to vary for groups of animals than for individuals.

Martin Krzywinski @MKrzywinski
Nature Methods Points of Significance column: Split plot design. (read)

The split plot design is an expansion on the concept of blocking—all split plot designs include at least one randomized complete block design. The split plot design is also useful for cases where one wants to increase the sensitivity in one factor (sub-plot) more than another (whole plot).
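One way to picture the two levels of randomization is the minimal sketch below. It assumes a hypothetical experiment with irrigation as the hard-to-change whole plot factor and three made-up fertilizer labels as the sub-plot factor. The function name and all factor levels are illustrative assumptions, not the layout used in the column.

```python
import random

def split_plot_layout(n_blocks, whole_levels, sub_levels, seed=0):
    """Return rows of (block, whole_plot, whole_level, sub_plot, sub_level)."""
    rng = random.Random(seed)  # fixed seed so the layout is reproducible
    rows = []
    for block in range(1, n_blocks + 1):
        # Within each block, randomize the hard-to-change factor over whole plots
        # (a randomized complete block design at the whole plot level).
        whole_order = rng.sample(whole_levels, k=len(whole_levels))
        for wp, whole in enumerate(whole_order, start=1):
            # Within each whole plot, randomize the easy-to-change factor over sub-plots.
            sub_order = rng.sample(sub_levels, k=len(sub_levels))
            for sp, sub in enumerate(sub_order, start=1):
                rows.append((block, wp, whole, sp, sub))
    return rows

# Hypothetical factors: irrigation (whole plot) and fertilizer A/B/C (sub-plot).
layout = split_plot_layout(2, ["irrigated", "dry"], ["A", "B", "C"])
for row in layout:
    print(row)
```

Every block contains each irrigation level once, and every whole plot contains each fertilizer once, so the sub-plot factor is compared within whole plots, where variation is smaller.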

Altman, N. & Krzywinski, M. (2015) Points of Significance: Split Plot Design Nature Methods 12:165-166.

Background reading

1. Krzywinski, M. & Altman, N. (2014) Points of Significance: Designing Comparative Experiments Nature Methods 11:597-598.

2. Krzywinski, M. & Altman, N. (2014) Points of Significance: Analysis of variance (ANOVA) and blocking Nature Methods 11:699-700.

3. Blainey, P., Krzywinski, M. & Altman, N. (2014) Points of Significance: Replication Nature Methods 11:879-880.

...more about the Points of Significance column

Color palettes for color blindness

Tue 03-03-2015

In an audience of 8 men and 8 women, chances are about 50% that at least one has some degree of color blindness [1]. When encoding information or designing content, use colors that are color-blind safe.
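The 50% figure can be checked with a short calculation. The sketch below assumes prevalences of about 8% in men and 0.5% in women, which are commonly quoted values and are not stated in the text above, and it assumes that audience members are independent of one another.

```python
def p_at_least_one(p_men, p_women, n_men, n_women):
    """Probability that at least one person in the audience is color blind."""
    # Probability that nobody is color blind, assuming independence
    p_none = (1 - p_men) ** n_men * (1 - p_women) ** n_women
    return 1 - p_none

# Assumed prevalences: about 8% of men and 0.5% of women.
p = p_at_least_one(p_men=0.08, p_women=0.005, n_men=8, n_women=8)
print(f"{p:.2f}")  # prints 0.51
```

Almost all of the risk comes from the men in the audience: with these prevalences, 8 men alone already give a probability of about 0.49.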

Martin Krzywinski @MKrzywinski
A 12-color palette safe for color blindness