Martin Krzywinski / Genome Sciences Center

visualization: beautiful

Functional annotation of gene sequences—a visualization workshop. Poznan, Poland. Dec 12, 2015

communication + science

Nature Methods: Points of View

Martin Krzywinski @MKrzywinski
Points of View column in Nature Methods. (Points of View)

Guidelines for Effective Figures

Practical and concise advice on the visual presentation of data for researchers. One topic and one page at a time.

Common Challenges in Figure Design

Andreas Dahlin runs a figure-making course at Uppsala University. He was kind enough to share with me the common questions and concerns that his students have when creating figures (emphasis is mine).

I face problems for using the tools in power point to make nice illustration figures, and in addition how one can enhance the resolution of the figures to print it in a high quality mode.

In my opinion, the most difficult thing is how to draw the good-looking pictures and design the structure of slide to make it simple and substantial in content.

I find it difficult to find the right software to draw pictures.

The most difficult thing for me, when I make a figure, is to arrange the parts of the figure in a way they look nice and understandable.

I think the most difficult part is creating the concept, how to make a figure easy and fast to understand but not lacking all essential parts.

Stepping outside of my own knowledge of what the picture presents and viewing it as someone who sees it for the first time. It's easy to assume that some things are self evident and not making them clear enough in the pictures.

Figures that not are plots can also be tricky to get to look nice.

Anytime you have to draw something in paint, gimp, or other image program it requires a lot of work to make it look even slightly better than crap.

The most difficult thing (in general) is to include as much information as possible and display it in a way that is easy to understand. Figures should be intuitive for the reader, which is sometimes difficult to achieve. There might also be technical difficulties in achieving what you've visualized.

I think the most difficult part for me is to highlight the main idea I would like to express.

For me the most difficult part is making 3-D figures. Also while making figures its hard to decide on the good colors to choose for the figure.

In my opinion, the most difficult part when making a figure is don't know which software we can use and how to use.

The most difficult part for me is to start it! Because I am so meticulous and I am a painter, then it is not so easy to make decision about my figures and which one is better and so on, then finally I give up and put just one figure which of course I don't like...

I think it is difficult to put together my ideas to something that is connected and makes it easier for the viewer to understand.

It is so easy to just get an image from internet. I don’t know what is ok to do. There seems to be different rules in different communities.

To come up with a figure that does not simplify the concept too much at the same time as it does not overwhelm the viewer. To get some ideas for this is the reason why I take the course. ;-)

To me, how to make it easy to understand is the difficult part.

I think it is to save it in the correct format: Raster or vector, png or jpg or pdf... especially if I want to make some changes in the future to the figure.

I think is to choose the most appropriate figure that really help to transmit the information we want. Then, how many words can be good enough for been part of the message. At the beginning I used to use too many.

Apart from the difficulty of making the figure clear and easy to understand, the biggest problem I'm having is the captions. How long and detailed description is appropriate, so it neither steals attention from the figure nor leaves out too much important information.

I think the most difficult part is to have high resolution image once we want to save it. My experience is when finish with drawing, the file size sometimes to large for high quality image and if we downgrade it, the image becomes bad.

The most difficult part when i making a figure is the software using part, I'm not good at computer so that part is annoying for me all the time.

I think the most difficult is to find out how to condensate many ideas in one picture without making it difficult to understand.

The most difficult part is the get the image to not look too amateurish that people focus on that instead of the message.

The most difficult part when doing a figure is to let it speak for itself, i.e. to not have long caption text.

To be able to depict all the desirable results on a single figure is sometimes not that easy. It becomes more critical when a figure is to be fitted within a certain size frame. An exact placing of a figure in some text editors often comes along with difficulties.

The most difficult part when making a figure is to make it simple and still be informative.

Depends a lot on the kind of figure, but generally it is to get clarity in the design, such that the idea is conceived easily. This requires some good outline (usually an iterative process).

The most difficult part to make a figure is the need to express abstract concepts into drawings.

The compromise between include detailed information and at the same time be readable (figures in articles)

To compress all information and ideas you have in your head into short and clear message.

I feel the difficulty in choosing a right resolution of the picture and the angle that could visualize all the details. And also choosing right test/label colour, size, font. Another difficulty for me is continuation from one slide to another.

I believe that my biggest problem would be making nice flux charts. Generally the ones I draw look too crude, it does not look beautiful. I have no concern about making an image that can represent an idea, but making a beautiful image makes it more pleasing to the eyes of the people who will read my work.

It is very difficult to make the figure delicate. I am still not get used to put all the small components together to integrate the figure by the vector software, instead of drawing it out directly.

I think the most difficult part is to make the image simple but yet informative.

I find it very difficult to make an original clarity picture in a particular format after dimensioning it according to the requirement.

Some times it is difficult to limit the size (Bytes) of the picture when going for high clarity remake.

Making the figure as informative as you want while keeping it simple enough to grasp quickly.

For me, the more difficult part is to create a figure that contains or tells all the information that I want to transmit, but keeping the figure simple, clean and not overloaded.

The most difficult for me is make it easily to be understood meanwhile containing the essential information.

The most difficult thing when developing a figure is ... to remove the bloat but keep the message. (Besides the very most difficult: finding out what I want to tell.)

For me the most difficult part is to choose colors with right contrast and to make it more attractive and catchy.


news + thoughts

Play the Bacteria Game

Thu 19-11-2015

Choose your own dust adventure!

Nobody likes dusting but everyone should find dust interesting.

Working with Jeannie Hunnicutt and with Jen Christiansen's art direction, I created this month's Scientific American Graphic Science visualization based on a recent paper, The ecology of microscopic life in household dust.

An analysis of dust reveals how the presence of men, women, dogs and cats affects the variety of bacteria in a household. Appears on Graphic Science page in December 2015 issue of Scientific American.

This was my third information graphic for the Graphic Science page. Unlike the previous ones, it's visually simple and ... interactive. Or, at least, as interactive as a printed page can be.

More of my Scientific American Graphic Science designs

Barberan A et al. (2015) The ecology of microscopic life in household dust. Proc. R. Soc. B 282: 20151139.

Names for 5,092 colors

Tue 03-11-2015

A very large list of named colors, generated by combining some of the many lists that already exist (X11, Crayola, Raveling, Resene, Wikipedia, xkcd, etc.).

Confused? So am I. That's why I made a list.

For each color, coordinates in RGB, HSV, XYZ, Lab and LCH space are given, along with its 5 nearest named neighbours, as measured by ΔE.
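As a rough sketch of how a nearest-neighbour lookup of this kind could work: convert each sRGB color to Lab and rank the named colors by ΔE. The palette below is a tiny hypothetical subset, not the real list, and the code assumes the CIE76 form of ΔE (plain Euclidean distance in Lab) with a D65 white point. The text above does not say which ΔE formula is used, so treat both choices as assumptions.

```python
import math

# Hypothetical mini-palette for illustration; the real list has 5,092 names.
NAMED = {
    "black": (0, 0, 0),
    "white": (255, 255, 255),
    "red": (255, 0, 0),
    "lime": (0, 255, 0),
    "blue": (0, 0, 255),
    "orange": (255, 165, 0),
    "gold": (255, 215, 0),
}

def srgb_to_lab(rgb):
    """Convert an 8-bit sRGB triple to CIE L*a*b* (D65 white point)."""
    def linearize(c):
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearize(c) for c in rgb)
    # Linear RGB -> XYZ using the standard sRGB (D65) matrix.
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b

    def f(t):
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    fx, fy, fz = f(x / 0.95047), f(y / 1.0), f(z / 1.08883)
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))

def delta_e76(lab1, lab2):
    """CIE76 color difference: Euclidean distance in Lab (an assumed choice)."""
    return math.dist(lab1, lab2)

def nearest_named(rgb, k=5):
    """Return the k named colors closest to rgb, nearest first."""
    target = srgb_to_lab(rgb)
    ranked = sorted(NAMED, key=lambda name: delta_e76(target, srgb_to_lab(NAMED[name])))
    return ranked[:k]

if __name__ == "__main__":
    print(nearest_named((250, 160, 10), k=3))
```

A brute-force scan like this is fine for a few thousand names; newer ΔE formulas (such as CIEDE2000) would change the ranking for some colors.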

I also provide a web service. Simply call this URL with an RGB string.

Simple Linear Regression

Sat 07-11-2015

It is possible to predict the values of unsampled data by using linear regression on correlated sample data.

This month, we begin our column with a quote, shown here in its full context from Box's paper Science and Statistics.

In applying mathematics to subjects such as physics or statistics we make tentative assumptions about the real world which we know are false but which we believe may be useful nonetheless. The physicist knows that particles have mass and yet certain results, approximating what really happens, may be derived from the assumption that they do not. Equally, the statistician knows, for example, that in nature there never was a normal distribution, there never was a straight line, yet with normal and linear assumptions, known to be false, he can often derive results which match, to a useful approximation, those found in the real world.
Box, G. J. Am. Stat. Assoc. 71, 791–799 (1976).

Nature Methods Points of Significance column: Simple Linear Regression. (read)

This column is our first in the series about regression. We show that regression and correlation are related concepts—they both quantify trends—and that the calculations for simple linear regression are essentially the same as for one-way ANOVA.

While correlation provides a measure of a specific kind of association between variables, regression allows us to fit correlated sample data to a model, which can be used to predict the values of unsampled data.
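To make the fit-then-predict idea concrete, here is a minimal least-squares sketch in plain Python. The height and weight values are made up for illustration and do not come from the column. It also checks the standard identity linking the two concepts: the fitted slope equals the Pearson correlation times the ratio of the sample standard deviations.

```python
import math

def fit_line(x, y):
    """Least-squares slope and intercept for y ~ intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def pearson_r(x, y):
    """Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)

def sample_sd(v):
    """Sample standard deviation (n - 1 in the denominator)."""
    m = sum(v) / len(v)
    return math.sqrt(sum((vi - m) ** 2 for vi in v) / (len(v) - 1))

# Made-up sample: height (cm) and weight (kg).
height_cm = [160, 165, 170, 175, 180]
weight_kg = [55.0, 59.5, 63.0, 68.5, 72.0]

slope, intercept = fit_line(height_cm, weight_kg)

# Predict the weight for a height that was not in the sample.
predicted = intercept + slope * 172

# Regression and correlation quantify the same trend:
# slope = r * sd(y) / sd(x).
r = pearson_r(height_cm, weight_kg)
slope_from_r = r * sample_sd(weight_kg) / sample_sd(height_cm)

if __name__ == "__main__":
    print(f"slope={slope:.3f} intercept={intercept:.3f} predicted={predicted:.2f}")
    print(f"r={r:.4f} slope_from_r={slope_from_r:.3f}")
```

The prediction is only as trustworthy as the linear assumption, and it is safest inside the range of the sampled heights.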

Altman, N. & Krzywinski, M. (2015) Points of Significance: Simple Linear Regression Nature Methods 12:999-1000.

Background reading

Altman, N. & Krzywinski, M. (2015) Points of significance: Association, correlation and causation Nature Methods 12:899-900.

Krzywinski, M. & Altman, N. (2014) Points of significance: Analysis of variance (ANOVA) and blocking. Nature Methods 11:699-700.

...more about the Points of Significance column

Association, correlation and causation

Sat 07-11-2015

Correlation implies association, but not causation. Conversely, causation implies association, but not correlation.

This month, we distinguish between association, correlation and causation.

Association, also called dependence, is a very general relationship: one variable provides information about the other. Correlation, on the other hand, is a specific kind of association: an increasing or decreasing trend. Not all associations are correlations. Moreover, causality can be connected only to association.
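A small numerical illustration of an association that is not a correlation, using made-up values: y is completely determined by x, yet the covariance (and therefore the Pearson correlation, which is the covariance scaled by the standard deviations) is zero because the relationship has no overall increasing or decreasing trend.

```python
# y depends entirely on x, but symmetrically, so there is no linear trend.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [v * v for v in x]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sample covariance; zero covariance means zero Pearson correlation.
covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

# The dependence is still obvious: knowing |x| changes what we expect for y.
mean_y_far = sum(b for a, b in zip(x, y) if abs(a) >= 2) / sum(1 for a in x if abs(a) >= 2)
mean_y_near = sum(b for a, b in zip(x, y) if abs(a) < 2) / sum(1 for a in x if abs(a) < 2)

if __name__ == "__main__":
    print(covariance, mean_y_far, mean_y_near)
```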

Nature Methods Points of Significance column: Association, correlation and causation. (read)

We discuss how correlation can be quantified using correlation coefficients (Pearson, Spearman) and show how spurious correlations can arise in random data as well as in very large independent data sets. For example, per capita cheese consumption is correlated with the number of people who died by becoming tangled in bedsheets.
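The sketch below, in plain Python with made-up data, shows both points. Pearson measures linear trend while Spearman (Pearson applied to ranks) measures any monotonic trend, so the two disagree on an exponential curve. Then, among many short series of independent random numbers, some pair ends up looking strongly correlated purely by chance. The series count, length and seed are arbitrary choices for illustration.

```python
import math
import random
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient (linear trend)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(v):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    result = [0.0] * len(v)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson computed on the ranks (monotonic trend)."""
    return pearson(ranks(x), ranks(y))

# Monotonic but strongly nonlinear relationship.
x = [1, 2, 3, 4, 5, 6]
y = [math.exp(v) for v in x]
r_pearson = pearson(x, y)
r_spearman = spearman(x, y)

# Spurious correlation: 50 independent random series of 10 points each.
rng = random.Random(0)
series = [[rng.gauss(0, 1) for _ in range(10)] for _ in range(50)]
strongest = max(abs(pearson(a, b)) for a, b in combinations(series, 2))

if __name__ == "__main__":
    print(f"pearson={r_pearson:.3f} spearman={r_spearman:.3f}")
    print(f"strongest |r| among {50 * 49 // 2} random pairs: {strongest:.3f}")
```

With 1,225 pairs to choose from, the largest |r| is large even though no series has anything to do with any other, which is the same mechanism behind the cheese and bedsheets example.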

Altman, N. & Krzywinski, M. (2015) Points of Significance: Association, correlation and causation Nature Methods 12:899-900.


Bayesian networks

Thu 01-10-2015

For making probabilistic inferences, a graph is worth a thousand words.

This month we continue with the theme of Bayesian statistics and look at Bayesian networks, which combine network analysis with Bayesian statistics.

In a Bayesian network, nodes represent entities, such as genes, and the influence that one gene has over another is represented by an edge and a probability table (or function). Bayes' Theorem is used to calculate the probability of a state for any entity.

Nature Methods Points of Significance column: Bayesian networks. (read)

In our previous columns about Bayesian statistics, we saw how new information (likelihood) can be incorporated into the probability model (prior) to update our belief of the state of the system (posterior). In the context of a Bayesian network, relationships called conditional dependencies can arise between nodes when information is added to the network. Using a small gene regulation network we show how these dependencies may connect nodes along different paths.
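A minimal sketch of such a conditional dependency, using a hypothetical three-gene network rather than the one in the column: regulators A and B each independently activate a target gene C. All probabilities are invented for illustration. Before C is observed, learning about B says nothing about A. Once C is known to be active, A and B become dependent: finding that B is active "explains away" C and lowers the belief that A is active.

```python
from itertools import product

# Hypothetical priors and conditional probability table (made-up numbers).
P_A = {True: 0.3, False: 0.7}   # P(A active)
P_B = {True: 0.2, False: 0.8}   # P(B active)
P_C_GIVEN = {                   # P(C active | A, B)
    (True, True): 0.95,
    (True, False): 0.80,
    (False, True): 0.70,
    (False, False): 0.05,
}

def joint(a, b, c):
    """Joint probability from the network factorization P(A) P(B) P(C | A, B)."""
    pc = P_C_GIVEN[(a, b)]
    return P_A[a] * P_B[b] * (pc if c else 1 - pc)

def prob(query, evidence=None):
    """P(query | evidence) by enumerating all states; args are dicts like {"A": True}."""
    evidence = evidence or {}
    numerator = denominator = 0.0
    for a, b, c in product([True, False], repeat=3):
        state = {"A": a, "B": b, "C": c}
        if any(state[name] != value for name, value in evidence.items()):
            continue
        p = joint(a, b, c)
        denominator += p
        if all(state[name] == value for name, value in query.items()):
            numerator += p
    return numerator / denominator

prior_a = prob({"A": True})
a_given_b = prob({"A": True}, {"B": True})               # unchanged: A and B independent
a_given_c = prob({"A": True}, {"C": True})               # belief in A rises
a_given_c_b = prob({"A": True}, {"C": True, "B": True})  # and falls again

if __name__ == "__main__":
    print(f"P(A)={prior_a:.3f}  P(A|B)={a_given_b:.3f}")
    print(f"P(A|C)={a_given_c:.3f}  P(A|C,B)={a_given_c_b:.3f}")
```

Full enumeration is only practical for toy networks like this one; it is used here because it makes each application of Bayes' Theorem explicit.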

Background reading

Puga, J.L., Krzywinski, M. & Altman, N. (2015) Points of Significance: Bayesian Statistics Nature Methods 12:377-378.

Puga, J.L., Krzywinski, M. & Altman, N. (2015) Points of Significance: Bayes' Theorem Nature Methods 12:277-278.
