Martin Krzywinski / Genome Sciences Center / mkweb.bcgsc.ca
Where am I supposed to go? Where was I supposed to know? (Violet Indiana)


Circos at British Library Beautiful Science exhibit—Feb 20–May 26


communication + science

For the month of August 2013, the entire set of 35 columns is available for free.

Nature Methods: Points of View

Martin Krzywinski @MKrzywinski mkweb.bcgsc.ca
The full collection of 35 Points of View columns is now available. (3 years of Points of View)

Practical Tips for Effective Figures

Points of View — History

In its 2.5-year history, the PoV column has established a significant legacy: it is one of the most frequently accessed parts of Nature Methods. The reason, I think, is clear: the community sees the value in clear and effective visual communication and acknowledges the need for a forum in which best practices in the field are presented practically and accessibly.

Bang Wong, in collaboration with visiting authors (Noam Shoresh, Nils Gehlenborg, Cydney Nielsen and Rikke Schmidt Kjærgaard), has penned 29 columns in the period of August 2010 to December 2012, covering broad topics such as salience, Gestalt principles, color, typography, negative space, layout, and data integration.

When it was A.C. Grayling's turn to speak at a debate in which Christopher Hitchens and Richard Dawkins had already made their points, Grayling said:

When one gets up to speak this late in a debate, one is a bit tempted to quote that Hungarian M.P. who after a long, long, long discussion in the parliament in Budapest stood up and said, "Everything has been said but not everybody said it yet." (watch on YouTube)

Indeed, this is quite how I feel after being invited to become the new author of the Nature Methods Points of View column. Both Bang and Hitchens provide significant inspiration for me, so Grayling's words are particularly fitting.

To improve on the column is impossible. My challenge is to identify useful topics that have not yet been covered. I will be working closely with Nature Methods and Bang to ensure that the columns strike the right balance of topic, tone and timbre.

Don't hesitate to let me know whether PoV continues to hold your interest.

Nature Editors Announce Return of Points of View

The announcement of the return of the column, together with its history and a description of me, the new author, is available at the Nature Methods methagora blog.

Humor is kept by repeated reference to my now-dead-but-once-famous pet rat.

Points of View Collection now Open Access

For the month of August 2013, the entire set of 35 columns is available for free.

Common Challenges in Figure Design

Andreas Dahlin runs a figure-making course at Uppsala University. He was kind enough to share with me common questions and concerns that his students have when creating figures (emphasis is mine).

I face problems for using the tools in power point to make nice illustration figures, and in addition how one can enhance the resolution of the figures to print it in a high quality mode.

In my opinion, the most difficult thing is how to draw the good-looking pictures and design the structure of slide to make it simple and substantial in content.

I find it difficult to find the right software to draw pictures.

The most difficult thing for me, when I make a figure, is to arrange the parts of the figure in a way they look nice and understandable.

I think the most difficult part is creating the concept, how to make a figure easy and fast to understand but not lacking all essential parts.

Stepping outside of my own knowledge of what the picture presents and viewing it as someone who sees it for the first time. It's easy to assume that some things are self evident and not making them clear enough in the pictures.

Figures that not are plots can also be tricky to get to look nice.

Anytime you have to draw something in paint, gimp, or other image program it requires a lot of work to make it look even slightly better than crap.

The most difficult thing (in general) is to include as much information as possible and display it in a way that is easy to understand. Figures should be intuitive for the reader, which is sometimes difficult to achieve. There might also be technical difficulties in achieving what you've visualized.

I think the most difficult part for me is to highlight the main idea I would like to express.

For me the most difficult part is making 3-D figures. Also while making figures its hard to decide on the good colors to choose for the figure.

In my opinion, the most difficult part when making a figure is don't know which software we can use and how to use.

The most difficult part for me is to start it! Because I am so meticulous and I am a painter, then it is not so easy to make decision about my figures and which one is better and so on, then finally I give up and put just one figure which of course I don't like...

I think it is difficult to put together my ideas to something that is connected and makes it easier for the viewer to understand.

It is so easy to just get an image from internet. I don’t know what is ok to do. There seems to be different rules in different communities.

To come up with a figure that does not simplify the concept too much at the same time as it does not overwhelm the viewer. To get some ideas for this is the reason why I take the course. ;-)

To me, how to make it easy to understand is the difficult part.

I think it is to save it in the correct format: Raster or vector, png or jpg or pdf... especially if I want to make some changes in the future to the figure.

I think is to choose the most appropriate figure that really help to transmit the information we want. Then, how many words can be good enough for been part of the message. At the beginning I used to use too many.

Apart from the difficulty of making the figure clear and easy to understand, the biggest problem I'm having is the captions. How long and detailed description is appropriate, so it neither steals attention from the figure nor leaves out too much important information.

I think the most difficult part is to have high resolution image once we want to save it. My experience is when finish with drawing, the file size sometimes to large for high quality image and if we downgrade it, the image becomes bad.

The most difficult part when i making a figure is the software using part, I'm not good at computer so that part is annoying for me all the time.

I think the most difficult is to find out how to condensate many ideas in one picture without making it difficult to understand.

The most difficult part is the get the image to not look too amateurish that people focus on that instead of the message.

The most difficult part when doing a figure is to let it speak for itself, i.e. to not have long caption text.

To be able to depict all the desirable results on a single figure is sometimes not that easy. It becomes more critical when a figure is to be fitted within a certain size frame. An exact placing of a figure in some text editors often comes along with difficulties.

The most difficult part when making a figure is to make it simple and still be informative.

Depends a lot on the kind of figure, but generally it is to get clarity in the design, such that the idea is conceived easily. This requires some good outline (usually an iterative process).

The most difficult part to make a figure is the need to express abstract concepts into drawings.

The compromise between include detailed information and at the same time be readable (figures in articles)

To compress all information and ideas you have in your head into short and clear message.

I feel the difficulty in choosing a right resolution of the picture and the angle that could visualize all the details. And also choosing right test/label colour, size, font. Another difficulty for me is continuation from one slide to another.

I believe that my biggest problem would be making nice flux charts. Generally the ones I draw look too crude, it does not look beautiful. I have no concern about making an image that can represent an idea, but making a beautiful image makes it more pleasing to the eyes of the people who will read my work.

It is very difficult to make the figure delicate. I am still not get used to put all the small components together to integrate the figure by the vector software, instead of drawing it out directly.

I think the most difficult part is to make the image simple but yet informative.

I find it very difficult to make an original clarity picture in a particular format after dimensioning it according to the requirement.

Some times it is difficult to limit the size (Bytes) of the picture when going for high clarity remake.

Making the figure as informative as you want while keeping it simple enough to grasp quickly.

For me, the more difficult part is to create a figure that contains or tells all the information that I want to transmit, but keeping the figure simple, clean and not overloaded.

The most difficult for me is make it easily to be understood meanwhile containing the essential information.

The most difficult thing when developing a figure is ... to remove the bloat but keep the message. (Besides the very most difficult: finding out what I want to tell.)

For me the most difficult part is to choose colors with right contrast and to make it more attractive and catchy.

points of view — bibliography

1. Streit M, Gehlenborg N 2014 Bar charts and box plots Nat Methods 11:117.
2. Krzywinski M, Cairo A 2013 Storytelling Nat Methods 10:687.
3. Krzywinski M, Savig E 2013 Multidimensional data Nat Methods 10:595.
4. Krzywinski M, Wong B 2013 Plotting symbols Nat Methods 10:451.
5. Krzywinski M 2013 Elements of visual style Nat Methods 10:371.
6. Krzywinski M 2013 Labels and callouts Nat Methods 10:275.
7. Krzywinski M 2013 Axes, ticks and grids Nat Methods 10:183.
8. Wong B 2012 Visualizing biological data Nat Methods 9:1131.
9. Wong B, Kjærgaard RS 2012 Pencil and paper Nat Methods 9:1037.
10. Gehlenborg N, Wong B 2012 Power of the plane Nat Methods 9:935.
11. Gehlenborg N, Wong B 2012 Into the third dimension Nat Methods 9:851.
12. Gehlenborg N, Wong B 2012 Mapping quantitative data to color Nat Methods 9:769.
13. Nielsen C, Wong B 2012 Representing genomic structural variation Nat Methods 9:631.
14. Nielsen C, Wong B 2012 Managing deep data in genome browsers Nat Methods 9:521.
15. Nielsen C, Wong B 2012 Representing the genome Nat Methods 9:423.
16. Gehlenborg N, Wong B 2012 Integrating data Nat Methods 9:315.
17. Gehlenborg N, Wong B 2012 Heat maps Nat Methods 9:213.
18. Gehlenborg N, Wong B 2012 Networks Nat Methods 9:115.
19. Shoresh N, Wong B 2012 Data exploration Nat Methods 9:5.
20. Wong B 2011 The design process Nat Methods 8:987.
21. Wong B 2011 Salience to relevance Nat Methods 8:889.
22. Wong B 2011 Layout Nat Methods 8:783.
23. Wong B 2011 Arrows Nat Methods 8:701.
24. Wong B 2011 Simplify to clarify Nat Methods 8:611.
25. Wong B 2011 Avoiding color Nat Methods 8:525.
26. Wong B 2011 Color blindness Nat Methods 8:441.
27. Wong B 2011 The overview figure Nat Methods 8:365.
28. Wong B 2011 Typography Nat Methods 8:277.
29. Wong B 2011 Points of review (part 2) Nat Methods 8:189.
30. Wong B 2011 Points of review (part 1) Nat Methods 8:101.
31. Wong B 2011 Negative space Nat Methods 8:5.
32. Wong B 2010 Gestalt principles (part 2) Nat Methods 7:941.
33. Wong B 2010 Gestalt principles (part 1) Nat Methods 7:863.
34. Wong B 2010 Salience Nat Methods 7:773.
35. Wong B 2010 Design of data figures Nat Methods 7:665.
36. Wong B 2010 Color coding Nat Methods 7:573.

Visualization + Design Resources

Projects

Circos — circular whole-genome information graphics

Circos table viewer — display of tabular data in circular form

Hive plots — rational, quantitative and reproducible network visualization

High dynamic time range photography (HDTR) — imaging the flow of time

Instruction, Tutorials and Talks

Visual Design Principles keynote at VIZBI 2013.

Science design talk at Bloomberg Design Conference 2013 (Bloomberg TV video, and video conversation with Alberto Cairo, moderated by Sam Grobart.)

Needles in stacks of needles keynote at IEEE International Conference on Data Mining 2012

20 imperatives of information design poster at Biovis 2012

Data Visualization: Communicating Clearly talk at Schloss Dagstuhl 2012 Data Visualization in Biology workshop

Brewer palettes — benefit of using perceptual color spaces

Color palettes matter (talk) — learn about color, color spaces and why they matter

Visualization principles tutorial at VIZBI 2012 — learn how we visually interpret and organize information and how to apply these principles to creating figures and software interfaces

Effect of resolution on sequence visualization (handout) — understand how output resolution affects display of highly textured genomic annotations

PSA Genomics Workshop 2011: Designing Effective Visualizations in Biology and Circos and Hive Plots: Challenging visualization paradigms in genomics and network analysis.

news + thoughts

Mind your p's and q's

Sat 29-03-2014

In the April Points of Significance Nature Methods column, we continue our discussion of comparing samples and consider what happens when we run a large number of tests.

Nature Methods Points of Significance column: Comparing Samples — Part II — Multiple Testing. (read)

Observing statistically rare test outcomes is expected if we run enough tests. These are statistically, not biologically, significant. For example, if we run $N$ tests, the smallest P value that we have a 50% chance of observing is $1 - \exp(-\ln 2 / N)$. For $N = 10^k$ this P value is $P_k \approx 10^{-k} \ln 2$ (e.g. for $10^4 = 10{,}000$ tests, $P_4 = 6.9 \times 10^{-5}$).
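
The 50% threshold quoted above can be checked numerically. The sketch below assumes the tests are independent and that every null hypothesis is true, so each P value is uniform on [0, 1]; the function name `median_smallest_p` is made up for this illustration.

```python
import math

def median_smallest_p(n_tests: int) -> float:
    """P value that the smallest of n_tests null P values falls below half the time."""
    # For independent uniform P values, Pr(min P <= x) = 1 - (1 - x)**n_tests.
    # Setting this probability to 0.5 and solving gives x = 1 - exp(-ln 2 / n_tests).
    # expm1 keeps precision when ln 2 / n_tests is tiny.
    return -math.expm1(-math.log(2) / n_tests)

# Compare the exact value with the large-N approximation ln 2 / N.
for k in range(1, 7):
    n = 10 ** k
    print(f"N = 10^{k}: exact {median_smallest_p(n):.3g}, approx {math.log(2) / n:.3g}")
```

For a single test the function returns 0.5, as it should, and for large N the exact and approximate values agree to several digits.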

We discuss common correction schemes such as Bonferroni, Holm, Benjamini & Hochberg and Storey's q and show how they impact the false positive rate (FPR), false discovery rate (FDR) and power of a batch of tests.
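
To make three of these schemes concrete, here is a minimal sketch of Bonferroni, Holm and Benjamini & Hochberg adjusted P values in plain Python. It is not the code behind the column: the function names and the example P values are invented for illustration, and Storey's q is left out because it additionally needs an estimate of the proportion of true nulls.

```python
def bonferroni(pvals):
    """Multiply every P value by the number of tests (capped at 1)."""
    n = len(pvals)
    return [min(1.0, p * n) for p in pvals]

def holm(pvals):
    """Step-down Holm adjustment; controls the family-wise error rate."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])  # indices, smallest P first
    adjusted = [0.0] * n
    running_max = 0.0
    for rank, i in enumerate(order):
        # The k-th smallest P value (k = rank + 1) is multiplied by n - k + 1;
        # the running maximum keeps adjusted values monotone.
        running_max = max(running_max, (n - rank) * pvals[i])
        adjusted[i] = min(1.0, running_max)
    return adjusted

def benjamini_hochberg(pvals):
    """Step-up Benjamini & Hochberg adjustment; controls the false discovery rate."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    adjusted = [0.0] * n
    running_min = 1.0
    for rank in range(n - 1, -1, -1):  # walk from the largest P value down
        i = order[rank]
        running_min = min(running_min, pvals[i] * n / (rank + 1))
        adjusted[i] = running_min
    return adjusted

# Hypothetical P values from four tests.
pvals = [0.01, 0.04, 0.03, 0.005]
print("Bonferroni:", bonferroni(pvals))
print("Holm:      ", holm(pvals))
print("BH:        ", benjamini_hochberg(pvals))
```

An adjusted value is compared directly with the chosen threshold (e.g. 0.05). On the same input, Bonferroni is the most conservative, Holm is never more conservative than Bonferroni, and Benjamini & Hochberg trades family-wise control for power by controlling the FDR instead.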

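Krzywinski, M. & Altman, N. (2014) Points of Significance: Comparing Samples — Part II — Multiple Testing Nature Methods 11:355-356.

Krzywinski, M. & Altman, N. (2014) Points of Significance: Comparing Samples — Part I — t-tests Nature Methods 11:215-216.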


Krzywinski, M. & Altman, N. (2013) Points of Significance: Significance, P values and t-tests Nature Methods 10:1041-1042.

Happy Pi Day— go to planet π

Fri 21-03-2014

Celebrate Pi Day (March 14th) with the art of folding numbers. This year I take the number up to the Feynman Point and apply a protein folding algorithm to render it as a path.

Digits of Pi form landmass and shoreline. (details)

For those of you who liked the minimalist and colorful digit grid, I've expanded on the concept to show stacked ring plots of frequency distributions.

Frequency distribution of digits of Pi in groups of 6 up to the Feynman Point. (details)

And if spirals are your thing...

Frequency distribution of digits of Pi in groups of 4 up to digit 4,988. (details)

Have data, will compare

Fri 07-03-2014

In the March Points of Significance Nature Methods column, we continue our discussion of t-tests from November (Significance, P values and t-tests).

We look at how the uncertainties of two variables combine, how this increases the uncertainty when two samples are compared, and highlight the differences between the two-sample and paired t-tests.
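
The difference between the two tests shows up in how the standard error is computed. The sketch below is an illustration under stated assumptions, not the column's own code: the two-sample version uses a pooled variance (so it assumes equal variances in both groups), the function names are invented, and the measurements are made up.

```python
import math
from statistics import mean, variance

def two_sample_t(x, y):
    """Two-sample t statistic with pooled variance (assumes equal variances)."""
    nx, ny = len(x), len(y)
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    # Uncertainties of the two sample means add: Var(mean_x - mean_y) = s^2/nx + s^2/ny.
    se = math.sqrt(pooled * (1 / nx + 1 / ny))
    return (mean(x) - mean(y)) / se

def paired_t(x, y):
    """Paired t statistic: a one-sample test on the per-pair differences."""
    diffs = [a - b for a, b in zip(x, y)]
    # Between-subject variation cancels in the differences, so the
    # standard error depends only on how much the differences vary.
    se = math.sqrt(variance(diffs) / len(diffs))
    return mean(diffs) / se

# Hypothetical measurements on the same five subjects before and after treatment.
before = [10.0, 12.0, 14.0, 16.0, 18.0]
after = [11.0, 12.5, 15.5, 16.5, 19.5]
print(f"two-sample t = {two_sample_t(before, after):.3f}")
print(f"paired t     = {paired_t(before, after):.3f}")
```

With these numbers the subjects differ a lot from one another but respond similarly, so the paired statistic is much larger in magnitude than the two-sample one. Turning either statistic into a P value requires the t distribution with the appropriate degrees of freedom (nx + ny - 2 and n - 1, respectively).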

Nature Methods Points of Significance column: Comparing Samples — Part I. (read)

When performing any statistical test, it's important to understand and satisfy its requirements. The t-test is very robust with respect to some of its assumptions, but not others. We explore which.

Krzywinski, M. & Altman, N. (2014) Points of Significance: Comparing Samples — Part I Nature Methods 11:215-216.

Krzywinski, M. & Altman, N. (2013) Points of Significance: Significance, P values and t-tests Nature Methods 10:1041-1042.

Circos at British Library Beautiful Science Exhibit

Thu 06-03-2014

Beautiful Science explores how our understanding of ourselves and our planet has evolved alongside our ability to represent, graph and map the mass data of the time. The exhibit runs 20 February to 26 May 2014 and is free to the public. There is a good Nature blog writeup about it, a piece in The Guardian, and a great video that explains the exhibit, narrated by Johanna Kieniewicz, the curator.

Circos at the British Library Beautiful Science exhibit. (about exhibit)
Mailed invitation to the exhibit features my science art. (zoom)

I am privileged to contribute an information graphic to the exhibit in the Tree of Life section. The piece shows how sequence similarity varies across species as a function of evolutionary distance. The installation is a set of six 30 × 30 cm backlit panels. They look terrific.

Circos Circles of Life installation at Beautiful Science exhibit at the British Library. (zoom)

Think outside the bar—box plots

Fri 31-01-2014

Quick, name three chart types. Line, bar and scatter come to mind. Perhaps you said pie too—tsk tsk. Nobody ever thinks of the box plot.

Box plots reveal details about data without overloading a figure with a full frequency distribution histogram. They're easy to compare and now easy to make with BoxPlotR (try it). In our fifth Points of Significance column, we take a break from the theory to explain this plot type and, I hope, convince you that it's worth thinking about.
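
As a rough illustration of what a box plot encodes, the sketch below computes the quartiles, whisker ends and outliers for a small sample. It assumes Tukey-style whiskers (the most extreme points within 1.5 × IQR of the box) and the "inclusive" quartile method of Python's `statistics.quantiles`; other tools, including BoxPlotR, may use different conventions. The function name and the data are invented.

```python
from statistics import quantiles

def box_plot_stats(values):
    """Summary statistics drawn by a Tukey-style box plot."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lo_fence, hi_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # Whiskers extend to the most extreme data points inside the fences.
    inside = [v for v in values if lo_fence <= v <= hi_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        # Points beyond the fences are plotted individually.
        "outliers": sorted(v for v in values if v < lo_fence or v > hi_fence),
    }

# Hypothetical sample with one extreme value.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]
for name, value in box_plot_stats(sample).items():
    print(f"{name}: {value}")
```

The extreme value is reported as an outlier instead of stretching the whisker, which is one reason the plot stays readable when several samples are compared side by side.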

Nature Methods Points of Significance column: Visualizing samples with box plots. (read)

The February issue of Nature Methods kicks the bar chart two more times: Dan Evanko's Kick the Bar Chart Habit editorial and a Points of View: Bar charts and box plots column by Marc Streit and Nils Gehlenborg.

Krzywinski, M. & Altman, N. (2014) Points of Significance: Visualizing samples with box plots Nature Methods 11:119-120.