Martin Krzywinski / Genome Sciences Center / mkweb.bcgsc.ca


UCD Computational and Molecular Biology Symposium, Dublin, Ireland. 1-2 Dec 2016.


communication + science

Nature Methods: Points of View

Martin Krzywinski @MKrzywinski mkweb.bcgsc.ca
Points of View column in Nature Methods. (Points of View)

Guidelines for Effective Figures

Practical and concise advice on the visual presentation of data for researchers. One topic and one page at a time.

Common Challenges in Figure Design

Andreas Dahlin runs a figure-making course at Uppsala University. He was kind enough to share with me common questions and concerns that his students have when creating figures (emphasis is mine).

I face problems for using the tools in power point to make nice illustration figures, and in addition how one can enhance the resolution of the figures to print it in a high quality mode.

In my opinion, the most difficult thing is how to draw the good-looking pictures and design the structure of slide to make it simple and substantial in content.

I find it difficult to find the right software to draw pictures.

The most difficult thing for me, when I make a figure, is to arrange the parts of the figure in a way they look nice and understandable.

I think the most difficult part is creating the concept, how to make a figure easy and fast to understand but not lacking all essential parts.

Stepping outside of my own knowledge of what the picture presents and viewing it as someone who sees it for the first time. It's easy to assume that some things are self evident and not making them clear enough in the pictures.

Figures that not are plots can also be tricky to get to look nice.

Anytime you have to draw something in paint, gimp, or other image program it requires a lot of work to make it look even slightly better than crap.

The most difficult thing (in general) is to include as much information as possible and display it in a way that is easy to understand. Figures should be intuitive for the reader, which is sometimes difficult to achieve. There might also be technical difficulties in achieving what you've visualized.

I think the most difficult part for me is to highlight the main idea I would like to express.

For me the most difficult part is making 3-D figures. Also while making figures its hard to decide on the good colors to choose for the figure.

In my opinion, the most difficult part when making a figure is don't know which software we can use and how to use.

The most difficult part for me is to start it! Because I am so meticulous and I am a painter, then it is not so easy to make decision about my figures and which one is better and so on, then finally I give up and put just one figure which of course I don't like...

I think it is difficult to put together my ideas to something that is connected and makes it easier for the viewer to understand.

It is so easy to just get an image from internet. I don’t know what is ok to do. There seems to be different rules in different communities.

To come up with a figure that does not simplify the concept too much at the same time as it does not overwhelm the viewer. To get some ideas for this is the reason why I take the course. ;-)

To me, how to make it easy to understand is the difficult part.

I think it is to save it in the correct format: Raster or vector, png or jpg or pdf... especially if I want to make some changes in the future to the figure.

I think is to choose the most appropriate figure that really help to transmit the information we want. Then, how many words can be good enough for been part of the message. At the beginning I used to use too many.

Apart from the difficulty of making the figure clear and easy to understand, the biggest problem I'm having is the captions. How long and detailed description is appropriate, so it neither steals attention from the figure nor leaves out too much important information.

I think the most difficult part is to have high resolution image once we want to save it. My experience is when finish with drawing, the file size sometimes to large for high quality image and if we downgrade it, the image becomes bad.

The most difficult part when i making a figure is the software using part, I'm not good at computer so that part is annoying for me all the time.

I think the most difficult is to find out how to condensate many ideas in one picture without making it difficult to understand.

The most difficult part is the get the image to not look too amateurish that people focus on that instead of the message.

The most difficult part when doing a figure is to let it speak for itself, i.e. to not have long caption text.

To be able to depict all the desirable results on a single figure is sometimes not that easy. It becomes more critical when a figure is to be fitted within a certain size frame. An exact placing of a figure in some text editors often comes along with difficulties.

The most difficult part when making a figure is to make it simple and still be informative.

Depends a lot on the kind of figure, but generally it is to get clarity in the design, such that the idea is conceived easily. This requires some good outline (usually an iterative process).

The most difficult part to make a figure is the need to express abstract concepts into drawings.

The compromise between include detailed information and at the same time be readable (figures in articles)

To compress all information and ideas you have in your head into short and clear message.

I feel the difficulty in choosing a right resolution of the picture and the angle that could visualize all the details. And also choosing right test/label colour, size, font. Another difficulty for me is continuation from one slide to another.

I believe that my biggest problem would be making nice flux charts. Generally the ones I draw look too crude, it does not look beautiful. I have no concern about making an image that can represent an idea, but making a beautiful image makes it more pleasing to the eyes of the people who will read my work.

It is very difficult to make the figure delicate. I am still not get used to put all the small components together to integrate the figure by the vector software, instead of drawing it out directly.

I think the most difficult part is to make the image simple but yet informative.

I find it very difficult to make an original clarity picture in a particular format after dimensioning it according to the requirement.

Some times it is difficult to limit the size (Bytes) of the picture when going for high clarity remake.

Making the figure as informative as you want while keeping it simple enough to grasp quickly.

For me, the more difficult part is to create a figure that contains or tells all the information that I want to transmit, but keeping the figure simple, clean and not overloaded.

The most difficult for me is make it easily to be understood meanwhile containing the essential information.

The most difficult thing when developing a figure is ... to remove the bloat but keep the message. (Besides the very most difficult: finding out what I want to tell.)

For me the most difficult part is to choose colors with right contrast and to make it more attractive and catchy.


news + thoughts

Intuitive Design

Thu 03-11-2016

Appeal to intuition when designing with value judgments in mind.

Figure clarity and concision are improved when the selection of shapes and colors is grounded in the Gestalt principles, which describe how we visually perceive and organize information.

One of the Gestalt principles tells us that the magenta and green shapes will be perceived as two groups, overriding the fact that the shapes within each group might be different. What the principle does not tell us is how the reader is likely to value each group. (read)

The Gestalt principles are value free. For example, they tell us how we group objects but do not speak to any meaning that we might intuitively infer from visual characteristics.

Nature Methods Points of View column: Intuitive Design. (read)

This month, we discuss how appealing to such intuitions (related to shapes, colors and spatial orientation) can help us add information to a figure as well as anticipate and encourage useful interpretations.

Krzywinski, M. (2016) Points of View: Intuitive Design. Nature Methods 13:895.


Regularization

Fri 04-11-2016

Constraining the magnitude of parameters of a model can control its complexity.

This month we continue our discussion about model selection and evaluation and address how to choose a model that avoids both overfitting and underfitting.

Ideally, we want to avoid having either an underfitted model, which is usually a poor fit to the training data, or an overfitted model, which is a good fit to the training data but not to new data.

Nature Methods Points of Significance column: Regularization (read)

Regularization is a process that penalizes the magnitude of model parameters. Instead of minimizing only the SSE, `\mathrm{SSE} = \sum_i (y_i - \hat{y}_i)^2 `, as is done in an ordinary fit, we minimize the SSE plus a penalty proportional to the sum of the model's squared parameters, `\mathrm{SSE} + \lambda \sum_i \hat{\beta}^2_i`.
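
As a minimal sketch of the penalized fit, assume the simplest possible model: a single slope `\beta` with no intercept, `\hat{y}_i = \beta x_i`. Setting the derivative of `\mathrm{SSE} + \lambda \beta^2` to zero gives `\beta = \sum_i x_i y_i / (\sum_i x_i^2 + \lambda)`. The data, the noise level and the function names below are made up for illustration and are not from the column.

```python
import random

def ridge_slope(xs, ys, lam):
    """Slope beta that minimizes sum((y - beta*x)^2) + lam * beta^2 (no intercept)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

def penalized_sse(xs, ys, beta, lam):
    """The quantity being minimized: SSE plus the penalty on the squared parameter."""
    sse = sum((y - beta * x) ** 2 for x, y in zip(xs, ys))
    return sse + lam * beta ** 2

# Hypothetical data: a linear trend with slope 2 plus Gaussian noise.
rng = random.Random(0)
xs = [i / 10 for i in range(1, 21)]
ys = [2.0 * x + rng.gauss(0, 0.5) for x in xs]

for lam in (0.0, 1.0, 10.0, 100.0):
    beta = ridge_slope(xs, ys, lam)
    print(f"lambda={lam:6.1f}  beta={beta:.3f}  objective={penalized_sse(xs, ys, beta, lam):.3f}")
```

With `\lambda = 0` this is the ordinary least-squares slope. As `\lambda` grows, the fitted slope shrinks toward zero, which is how the penalty limits model complexity.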

Lever, J., Krzywinski, M. & Altman, N. (2016) Points of Significance: Regularization. Nature Methods 13:803-804.

Background reading

Lever, J., Krzywinski, M. & Altman, N. (2016) Points of Significance: Model Selection and Overfitting. Nature Methods 13:703-704.

Lever, J., Krzywinski, M. & Altman, N. (2016) Points of Significance: Classifier evaluation. Nature Methods 13:603-604.

Lever, J., Krzywinski, M. & Altman, N. (2016) Points of Significance: Logistic regression. Nature Methods 13:541-542.


Model Selection and Overfitting

Fri 04-11-2016

With four parameters I can fit an elephant and with five I can make him wiggle his trunk. —John von Neumann.

By increasing the complexity of a model, it is easy to make it fit to data perfectly. Does this mean that the model is perfectly suitable? No.

When a model has a relatively large number of parameters, it is likely to be influenced by the noise in the data, which varies across observations, as much as any underlying trend, which remains the same. Such a model is overfitted—it matches training data well but does not generalize to new observations.
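
A small sketch can make this concrete. It assumes a hypothetical linear trend with Gaussian noise and compares a two-parameter straight line with a polynomial that has one parameter per training point and therefore passes through every one of them. The trend, noise level and function names are illustrative choices, not taken from the column.

```python
import random

def fit_line(xs, ys):
    """Least-squares straight line (two parameters); returns a predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return lambda x: my + slope * (x - mx)

def fit_interpolant(xs, ys):
    """Lagrange polynomial through every training point (one parameter per point)."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            basis = 1.0
            for j, xj in enumerate(xs):
                if j != i:
                    basis *= (x - xj) / (xi - xj)
            total += yi * basis
        return total
    return predict

def sse(model, xs, ys):
    """Sum of squared errors of a predictor on a data set."""
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys))

def trend(x):
    # Hypothetical underlying trend, chosen only for this sketch.
    return 1.0 + 2.0 * x

rng = random.Random(1)
train_x = [i / 7 for i in range(8)]
test_x = [(i + 0.5) / 7 for i in range(7)]  # new observations between training points
train_y = [trend(x) + rng.gauss(0, 0.3) for x in train_x]
test_y = [trend(x) + rng.gauss(0, 0.3) for x in test_x]

for name, model in (("line", fit_line(train_x, train_y)),
                    ("interpolant", fit_interpolant(train_x, train_y))):
    print(f"{name:12s} train SSE={sse(model, train_x, train_y):.4f}  "
          f"test SSE={sse(model, test_x, test_y):.4f}")
```

The interpolating polynomial reproduces the training data exactly, noise included, so its training error is zero. Its error on the new observations is typically much larger, because it has fitted the noise as well as the trend.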

Nature Methods Points of Significance column: Model Selection and Overfitting (read)

We discuss the use of training, validation and testing data sets and how they can be used, with methods such as cross-validation, to avoid overfitting.

Lever, J., Krzywinski, M. & Altman, N. (2016) Points of Significance: Model Selection and Overfitting. Nature Methods 13:703-704.

Background reading

Lever, J., Krzywinski, M. & Altman, N. (2016) Points of Significance: Classifier evaluation. Nature Methods 13:603-604.

Lever, J., Krzywinski, M. & Altman, N. (2016) Points of Significance: Logistic regression. Nature Methods 13:541-542.


Classifier Evaluation

Tue 13-09-2016

It is important to understand both what a classification metric expresses and what it hides.

We examine various metrics used to assess the performance of a classifier. We show that a single metric is insufficient to capture performance: for any metric, a variety of scenarios yield the same value.
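
As a quick illustration, the sketch below computes three common metrics for two made-up confusion matrices. The counts are hypothetical and chosen so that both scenarios have the same accuracy while precision and recall differ.

```python
def metrics(tp, fn, fp, tn):
    """Accuracy, precision and recall from confusion-matrix counts."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }

# Hypothetical scenario A: balanced classes (50 positives, 50 negatives).
balanced = metrics(tp=40, fn=10, fp=10, tn=40)
# Hypothetical scenario B: imbalanced classes (20 positives, 80 negatives).
imbalanced = metrics(tp=10, fn=10, fp=10, tn=70)

for name, m in (("balanced", balanced), ("imbalanced", imbalanced)):
    print(name, {k: round(v, 2) for k, v in m.items()})
```

Both classifiers are 80% accurate, but the second finds only half of the positives and only half of its positive calls are correct. Accuracy alone hides that difference.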

Nature Methods Points of Significance column: Classifier Evaluation (read)

We also discuss ROC curves and the area under the curve (AUC), and how their interpretation changes based on class balance.

Lever, J., Krzywinski, M. & Altman, N. (2016) Points of Significance: Classifier evaluation. Nature Methods 13:603-604.

Background reading

Lever, J., Krzywinski, M. & Altman, N. (2016) Points of Significance: Logistic regression. Nature Methods 13:541-542.


Happy 2016 `\pi` Approximation, roughly speaking

Sun 24-07-2016

Today is the day and it's hardly an approximation. In fact, `22/7` is a 20% more accurate representation of `\pi` than `3.14`!
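
The 20% figure can be checked with a few lines of arithmetic, taking "more accurate" to mean a smaller absolute error (an assumption about the intended measure):

```python
import math

err_22_7 = abs(22 / 7 - math.pi)   # absolute error of the fraction
err_3_14 = abs(3.14 - math.pi)     # absolute error of the two-decimal value
reduction = 1 - err_22_7 / err_3_14

print(f"|22/7 - pi| = {err_22_7:.6f}")
print(f"|3.14 - pi| = {err_3_14:.6f}")
print(f"error reduction = {reduction:.1%}")
```

The error of `22/7` is about one fifth smaller than the error of `3.14`.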

Time to celebrate, graphically. This year I do so with perfect packing of circles that embody the approximation.

By warping the circle by 8% along one axis, we can create a shape whose ratio of circumference to diameter, taken as twice the average radius, is 22/7.
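
One way to check this numerically is sketched below. It assumes that the warped circle is an ellipse with semi-axes 1 and 1.08 and that the average radius is the mean of the two semi-axes; neither detail is spelled out above, so treat both as an interpretation. The perimeter is computed by numerical integration, and the function names are illustrative.

```python
import math

def ellipse_perimeter(a, b, steps=10_000):
    """Perimeter of an ellipse with semi-axes a and b, by the midpoint rule."""
    dt = 2 * math.pi / steps
    # Speed of the parametrization (a*cos t, b*sin t) is hypot(a*sin t, b*cos t).
    return sum(
        math.hypot(a * math.sin((k + 0.5) * dt), b * math.cos((k + 0.5) * dt)) * dt
        for k in range(steps)
    )

def circumference_to_diameter(stretch):
    """Ratio of perimeter to twice the mean semi-axis for a circle stretched along one axis."""
    a, b = 1.0 + stretch, 1.0
    mean_radius = (a + b) / 2   # assumed meaning of "average radius"
    return ellipse_perimeter(a, b) / (2 * mean_radius)

print(f"stretch 0%: {circumference_to_diameter(0.00):.5f}")
print(f"stretch 8%: {circumference_to_diameter(0.08):.5f}")
print(f"22/7      : {22 / 7:.5f}")
```

Under these assumptions, an 8% stretch moves the ratio from `\pi` to within about `10^{-4}` of `22/7`.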

If you prefer something more accurate, check out art from previous `\pi` days: 2013 `\pi` Day, 2014 `\pi` Day, 2015 `\pi` Day, and 2016 `\pi` Day.