Martin Krzywinski / Genome Sciences Center / mkweb.bcgsc.ca

EMBO Practical Course: Bioinformatics and Genome Analysis, 5–17 June 2017.

design + visualization

VIZBI 2012

Visualization Principles Tutorial


This tutorial took place on Monday, March 5, 2012, at VIZBI 2012 in Heidelberg, Germany.

Introduction

Jessie Kennedy · We will present fundamental principles of graphic design and visual communication that will help you create more effective interactive and print visualizations. You will learn how the purposeful use of salience, color, consistency and layout can help communicate large data sets and complex ideas with greater immediacy and clarity.

Cydney Nielsen · We will illustrate how these principles were implemented in ABySS-Explorer to visualize genome assemblies, an example to show you ways to apply design ideas to your own project.

Martin Krzywinski · At the end of the tutorial, you will apply what you have learned in an interactive group session in which you will design a figure illustrating a biological process.

Agenda

Download agenda + participant list

9:30 – 10:15 · 45 min · Jessie Kennedy · Principles
10:15 – 10:25 · 10 min · break
10:25 – 11:10 · 45 min · Cydney Nielsen · Design Process
11:10 – 11:20 · 10 min · form teams + select figure to critique
11:20 – 11:30 · 10 min · break
11:30 – 12:00 · 30 min · Martin Krzywinski · Practical: Breakout session (download papers)
12:00 – 13:00 · 60 min · team presentations · Interactive (suggested solutions)

You do not need to read the paper from which your figure was selected. I have included the papers only in case you are interested in the figure's context.

Visualization and Design Resources

Effect of resolution on sequence visualization

Principles of effective color selection

Designing effective visualizations in the biological sciences (PSA Genomics Workshop, Seattle, 12 July 2011)

Circos and Hive Plots: Challenging visualization paradigms in genomics and network analysis (PSA Genomics Workshop, Seattle, 12 July 2011)

Designing effective visualizations in the biological sciences (Genome Sciences Center bioinformatics seminar, 26 August 2011)

Drawing Data: Creating information-rich, informative and appealing figures for publication and presentation (BCCA workshop, 8 Jun 2011)

Behind a great figure is a design principle (BCB Spring Seminar, Iowa State, 27 Feb 2012)

Visualizing Quantitative Information (Genome Sciences Center bioinformatics seminar)

Blast from the past

Linux and Genomics: Two Revolutions (USENIX 2004)

Visualization Principles VIZBI Book Chapter

Look for my chapter on visualization principles in the upcoming Visualizing Biological Data — a Practical Guide. This book is being written by VIZBI 2011 participants and edited by Sean O'Donoghue and Jim Procter.


news + thoughts

Machine learning: a primer

Tue 05-12-2017
Machine learning extracts patterns from data without explicit instructions.

In this primer, we focus on essential ML principles: a modeling strategy that lets the data speak for themselves, to the extent possible.

The benefits of ML arise from its use of a large number of tuning parameters or weights, which control the algorithm’s complexity and are estimated from the data using numerical optimization. Often ML algorithms are motivated by heuristics such as models of interacting neurons or natural evolution—even if the underlying mechanism of the biological system being studied is substantially different. The utility of ML algorithms is typically assessed empirically by how well extracted patterns generalize to new observations.

Martin Krzywinski @MKrzywinski mkweb.bcgsc.ca
Nature Methods Points of Significance column: Machine learning: a primer. (read)

We present a data scenario in which we use polynomials to fit a model with 5 predictors, and show what to expect from ML when noise and sample size vary. We also demonstrate the consequences of excluding an important predictor or including a spurious one.
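To make the idea concrete, here is a minimal sketch of that kind of experiment. It is not the scenario from the column: the data-generating function, the three predictors (x1, x2, x3), the three candidate designs and the helper names (solve, fit, simulate, holdout_error) are all assumptions chosen for illustration. The sketch fits polynomial terms by ordinary least squares and scores each design on fresh observations, which is the empirical notion of generalization described above.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit(X, y):
    """Ordinary least squares via the normal equations (X'X) w = X'y."""
    p = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * t for r, t in zip(X, y)) for i in range(p)]
    return solve(XtX, Xty)

def mse(w, X, y):
    """Mean squared error of the linear predictor w on (X, y)."""
    return sum((sum(wi * xi for wi, xi in zip(w, r)) - t) ** 2
               for r, t in zip(X, y)) / len(y)

def simulate(n, noise_sd, rng):
    """Hypothetical truth: y depends on x1 and x2**2; x3 is pure distraction."""
    rows, y = [], []
    for _ in range(n):
        x1, x2, x3 = (rng.uniform(-1, 1) for _ in range(3))
        rows.append((x1, x2, x3))
        y.append(1.0 + 2.0 * x1 - 1.5 * x2 ** 2 + rng.gauss(0, noise_sd))
    return rows, y

# Candidate designs: each maps a row of predictors to polynomial features.
DESIGNS = {
    "correct": lambda r: [1.0, r[0], r[1], r[1] ** 2],
    "missing x2": lambda r: [1.0, r[0]],
    "spurious x3": lambda r: [1.0, r[0], r[1], r[1] ** 2, r[2], r[2] ** 2],
}

def holdout_error(design, n, noise_sd, seed=0):
    """Fit on n training points, then score on 2000 fresh observations."""
    rng = random.Random(seed)
    rows, y = simulate(n, noise_sd, rng)
    w = fit([design(r) for r in rows], y)
    new_rows, new_y = simulate(2000, noise_sd, rng)
    return mse(w, [design(r) for r in new_rows], new_y)

for noise_sd in (0.1, 1.0):
    for n in (20, 200):
        errs = ", ".join(f"{name}: {holdout_error(d, n, noise_sd):.3f}"
                         for name, d in DESIGNS.items())
        print(f"noise={noise_sd} n={n} -> {errs}")
```

In this toy setup, dropping x2 leaves an error that more data cannot remove, whereas the cost of the spurious x3 terms should shrink as the sample grows.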

Bzdok, D., Krzywinski, M. & Altman, N. (2017) Points of Significance: Machine learning: a primer. Nature Methods 14:1119–1120.

...more about the Points of Significance column

Snowflake simulation

Tue 14-11-2017
Symmetric, beautiful and unique.

Just in time for the season, I've simulated a snow-pile of snowflakes based on the Gravner-Griffeath model.

A few of the beautiful snowflakes generated by the Gravner-Griffeath model. (explore)

Gravner, J. & Griffeath, D. (2007) Modeling Snow Crystal Growth II: A mesoscopic lattice map with plausible dynamics.

Genes that make us sick

Thu 02-11-2017
Where disease hides in the genome.

My illustration of the location of genes in the human genome that are implicated in disease appears in The Objects that Power the Global Economy, a book by Quartz.

The location of genes implicated in disease in the human genome, shown here as a spiral. (more...)

Ensemble methods: Bagging and random forests

Mon 16-10-2017
Many heads are better than one.

We introduce two common ensemble methods: bagging and random forests. Both methods repeat a statistical analysis on many bootstrap samples and combine the results to improve the accuracy of the predictor. Our column shows these methods as applied to Classification and Regression Trees.

Nature Methods Points of Significance column: Ensemble methods: Bagging and random forests. (read)

For example, we can sample the space of values more finely when using bagging with regression trees because each sample has potentially different boundaries at which the tree splits.

Random forests generate a large number of trees by not only generating bootstrap samples but also randomly choosing which predictor variables are considered at each split in the tree.
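The two ideas can be sketched in a few lines of plain Python. This is an illustrative toy, not the code behind the column: the data, the tree depth, the number of trees and the helper names (grow_tree, fit_ensemble, predict_ensemble) are assumptions. Each tree is grown on a bootstrap sample. Setting max_features to the number of predictors gives bagging, and a smaller value gives the random-forest behaviour of trying only a random subset of predictors at each split.

```python
import math
import random

def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def grow_tree(X, y, depth, max_features, rng):
    """Regression tree; each split tries only max_features random predictors."""
    mean = sum(y) / len(y)
    if depth == 0 or len(y) < 4:
        return mean                                   # leaf: mean response
    best = None
    for j in rng.sample(range(len(X[0])), max_features):
        cuts = sorted(set(row[j] for row in X))
        for lo, hi in zip(cuts, cuts[1:]):
            t = (lo + hi) / 2                         # candidate boundary
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            if not left or not right:
                continue
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, j, t)
    if best is None:
        return mean
    _, j, t = best
    lo_part = [(row, yi) for row, yi in zip(X, y) if row[j] <= t]
    hi_part = [(row, yi) for row, yi in zip(X, y) if row[j] > t]
    return (j, t,
            grow_tree([r for r, _ in lo_part], [v for _, v in lo_part],
                      depth - 1, max_features, rng),
            grow_tree([r for r, _ in hi_part], [v for _, v in hi_part],
                      depth - 1, max_features, rng))

def predict_tree(tree, row):
    """Walk from the root to a leaf and return the leaf mean."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree

def fit_ensemble(X, y, n_trees, max_features, depth=2, seed=0):
    """Grow one tree per bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    n = len(y)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        trees.append(grow_tree([X[i] for i in idx], [y[i] for i in idx],
                               depth, max_features, rng))
    return trees

def predict_ensemble(trees, row):
    """Average the predictions of all trees."""
    return sum(predict_tree(t, row) for t in trees) / len(trees)

# Hypothetical data: the response depends on x1 only; x2 is irrelevant.
data_rng = random.Random(42)
X = [(data_rng.uniform(0, 1), data_rng.uniform(0, 1)) for _ in range(120)]
y = [math.sin(6 * x1) + data_rng.gauss(0, 0.2) for x1, _ in X]

single = grow_tree(X, y, 2, 2, random.Random(0))           # one tree, all data
bagged = fit_ensemble(X, y, n_trees=25, max_features=2)    # bagging
forest = fit_ensemble(X, y, n_trees=25, max_features=1)    # random forest

grid = [(i / 200, 0.5) for i in range(201)]
print("distinct predictions, single tree:",
      len({round(predict_tree(single, g), 9) for g in grid}))
print("distinct predictions, bagged trees:",
      len({round(predict_ensemble(bagged, g), 9) for g in grid}))
```

A single depth-2 tree can return at most four distinct values. The bagged average typically returns many more, because each bootstrap sample places its split boundaries slightly differently.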

Krzywinski, M. & Altman, N. (2017) Points of Significance: Ensemble methods: bagging and random forests. Nature Methods 14:933–934.

Background reading

Krzywinski, M. & Altman, N. (2017) Points of Significance: Classification and regression trees. Nature Methods 14:757–758.

...more about the Points of Significance column