Martin Krzywinski / Genome Sciences Center / mkweb.bcgsc.ca

art: beautiful


EMBO Practical Course: Bioinformatics and Genome Analysis, 5–17 June 2017.


breaking + ideas

Recent & Upcoming

upcoming

19–24 November 2017
visualization
workshop

Boehringer Ingelheim Fonds Communicating Science Workshop, Mainz
26–29 November 2017
visualization
workshop

4th Canadian Conference on Epigenetics
Mechanisms of disease
4–7 February 2018
conference
visualization
workshop

The Australian Poultry Science Symposium (APSS).
Sydney, Australia. 4–7 February 2018.
15–20 April 2018
conference
visualization
workshop

Visualization of Biological Data - Crossroads
Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH
11 October 2017
hive plot
network
visualization

Differential Hive Plots: Seeing Networks Change
Martin Krzywinski, Ka Ming Nip, Inanc Birol, Marco Marra (2017) Leonardo, 50:5, p.504
6–8 October 2017
visualization
workshop

iNANO Autumn School
DOWNLOAD SLIDES
Himmerland Golf & Spa Resort, Denmark
24 September–1 October 2017
visualization
workshop

Spetses Summer School
DOWNLOAD SLIDES
Spetses, Greece
13–18 August 2017
conference
visualization

Scalable Set Visualizations
Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH
6–11 August 2017
conference
visualization

Gordon Research Conference, Bates College, Maine
Visualization in Science and Education
3 Aug 2017
art
typography

Intertypes of Helvetica Neue
Intertypes are the spaces between letters
1 Aug 2017
art
typography

Last lines of the 37 plays of Shakespeare.
26 July 2017
art
pog
visualization

Art for the 5 year anniversary of the Personal Oncogenomics Program (POG)
The Personal Oncogenomics Program (POG) is a collaborative research study including many BC Cancer Agency oncologists, pathologists and other clinicians along with Canada's Michael Smith Genome Sciences Centre with support from BC Cancer Foundation.
29 June 2017
book
chapter
visualization

Krzywinski, M. Scientific data visualization: Aesthetic for diagrammatic clarity.
DOWNLOAD EXCERPT
Chapter 3 in Scientific Data Visualization: Aesthetic for Diagrammatic Clarity. More Than Pretty Pictures (2017) Edited by Rikke Schmidt Kjærgaard & Lotte Philipsen. Routledge, NY.
13 June 2017
google
maps

Google maps challenge — longest routes
Added 3-point route challenge.
7 June 2017
benchmark
crossfit
fitness

`k` index fitness benchmark
Perform `k` reps at `k`% 1RM. Expected value is `k=26` and higher values are progressively more difficult to achieve.
5–17 June 2017
visualization
workshop

EMBO Practical Course - Bioinformatics and Genome Analysis
DOWNLOAD SLIDES
Centre for Research & Technology – Hellas. Thessaloniki, Greece
3 June 2017
statistics
words

Dark matter of the English language—the unwords
Brand new words for the language generated from a neural network. New places, drugs, names and diseases.
31 May 2017
genome
statistics

Dark matter of the genome—the nullomers
Nullomers are the unwords of the genome, sequences that are not there.
25 May 2017
case study
visualization

What's wrong with pie charts?
21 March 2017
art
color
visualization

Proportions of colors in country flags
Country flags rendered as concentric rings in proportion to their color composition and sorted by similarity.
16 March 2017
visualization
workshop

UBC & CIHR Skin Research Day
DOWNLOAD SLIDES
Practical Data Visualization: Thinking about drawing data and communicating science.
14 March 2017
art
math
pi
piday
visualization

2017 Pi Day art
This year, I imagine the digits of `\pi` as a star catalogue and populate the sky with constellations based on extinct species.
8 March 2017
art
visualization

BD Genomics stereoscopic art exhibit at AGBT
An article about the visualization and design process behind the art.
21 February 2017
art
chemistry
math
physics
typography

Snellen Optotype Font
The font gets its dedicated page.
20 February 2017
clock
time

Ptolemaic Clock
Updated and named the clock, with contributions from Rodrigo Goya.
18 February 2017
art
chemistry
math
physics
typography

Snellen Eye chart typographical posters
Insist that you look at physics, math and chemistry
8 February 2017
visualization
workshop

Improving Your Visual Science Communication: Plots & Figures
DOWNLOAD SLIDES
A Westgrid workshop
7 February 2017
google
questions

Updated the World's Most Popular Questions site and added a section all about Trump.
Is Trump a Manchurian candidate?
24 January 2017
lecture
visualization

Essentials of Data Visualization: Thinking about drawing data and communicating science.
MBB462, Simon Fraser University
16 January 2017
resource
visualization

Essentials of Data Visualization: Thinking about drawing data and communicating science.
8-part video series.
17 November 2016
epigenetics
resource
visualization

Hirst, M. & Krzywinski, M. (2016) Snapshot: Epigenomic Assays. Cell 167:1430.e1
10 November 2016
talk
visualization

Fitting Big Science on a Small Page, VizUM2016, University of Miami, Florida
DOWNLOAD SLIDES
Remember, the genome is not a blueprint.
7 November 2016
media
science

On the origins of scientists.
Interview with The Ubyssey by Nivretta Thatra. Photos by Joshua Medicoff.
4 November 2016
color
resource
visualization

Compilation of my color resources.
3 November 2016
nature methods
points of view
visualization

Krzywinski, M. (2016) Points of View: Intuitive Design. Nature Methods 13:895.
25 October 2016
language
visualization

Tripsum—Donald Trump Lorem Ipsum
Neverending speeches, haikus and promises by Trump and Clinton.
4 October 2016
talk
visualization

How to communicate information clearly and beautifully
Central European Institute of Technology, Brno, Czech Republic
26 September 2016
language
visualization

Word analysis of Clinton-Trump 2016 presidential debates
Who said and didn't say what.
22 September 2016
communication
education
science

SCIE 113 first-year seminar in science
DOWNLOAD SLIDES
A talk about thinking: scientifically and artistically
19 September 2016
interview
visualization

The Art of Big Data
Interview with Biotechniques' Sarah Webb about the processes, challenges, and skills involved in visualizing large biological data sets. Including Nils Gehlenborg, Jan Aerts and Lennart Martens. Biotechniques 61:107-112 (2016).
30 August 2016
talk
visualization

The Quality of Quantity
University of Sydney
26 August 2016
talk
visualization

Communicating Your Science
University of Sydney
12 August 2016
talk
visualization

A casual talk on data visualisation & design for science communication
Biotechnology Institute, Ankara University, Turkey
22 July 2016
art
math
pi
piday
visualization

2016 Pi Approximation Day art
Perfect packing of warped circles that embody the approximation.
22 June 2016
talk
visualization

Seeing Networks Change
DOWNLOAD SLIDES
CANHEIT-HPCS2016 Conference: Shaping the Digital Landscape, Edmonton
21 June 2016
visualization
workshop

Sense and Sensibility—Visual Design Principles for Scientific Data
DOWNLOAD SLIDES
CANHEIT-HPCS2016 Conference: Shaping the Digital Landscape, Edmonton
4 May 2016
visualization
workshop

EMBO Practical Course: Bioinformatics and genome analysis
Izmir Biomedicine and Genome Center, Dokuz Eylul University, Izmir, Turkey
21 April 2016
design
science
visualization

Design + Science mixer
10–noon, Art Building, Room 247, University of Washington, Seattle
20 April 2016
talk
visualization

UW Free Public Lecture
7pm, Kane Hall, Room 110, University of Washington, Seattle
20 April 2016
talk
visualization

Data Science Seminar
University of Washington, Seattle
7 April 2016
visualization
workshop

Creating better scientific figures—a workshop
9–11am, AW210, Western Washington University, Bellingham, WA
6 April 2016
art
talk
visualization

Fraser lecture series: The Big Data Revolution in Human and Environmental Health
Mt Baker Theater, Western Washington University, Bellingham
22 March 2016
art
birds
song
typography

Typographical posters of bird songs
Against a background of the bird's plumage color, its song.
14 March 2016
art
math
pi
piday
visualization

2016 Pi Day art
This year, I imagine the digits of Pi as physical masses and collapse them under gravity. The work is featured in Scientific American.
11 March 2016
seminar
ubc
visualization

B.I.G. Retreat 2016: Art and Science of Data Visualization Workshop
DOWNLOAD SLIDES
1:15pm AMS Student NEST, UBC
7 March 2016
seminar
ubc
visualization

Ecoscope seminar: Visual design principles for scientific data.
DOWNLOAD SLIDES
12:30-1:30pm, LSC 3, Life Science Institute, 2350 Health Sciences Mall, UBC
1 February 2016
party
petabase
sequencing

Petabase party
When you sequence a petabase (`10^{15}` bases), you have a party. And it looks like this.
21 January 2016
sanctuary
space

Sanctuary
Our Sanctuary project site goes beta.
21 January 2016
communication
education
science

SCIE 113 first-year seminar in science
DOWNLOAD SLIDES
A talk about thinking: scientifically and artistically
4 January 2016
nature methods
points of view
visualization

Hunnicutt, B.J. & Krzywinski, M. (2016) Points of View: Pathways. Nature Methods 13:5.
15 December 2015
hitchmas

Happy Hitchmas
Official Hitchmas party at Copenhagen's Library Bar. Today at 8pm.
7 December 2015
graphic science
making of
scientific american
visualization

Making of our Scientific American December 2015 Graphic Science visualization.
Dust, bacteria and genomes.
3 December 2015
benchmarking
cluster

I've resurrected the web page of an ancient project—Clusterpunch.
It was published in Jul 2003 Sys Admin Journal.
19 November 2015
graphic science
scientific american
visualization

Collection of my Scientific American Graphic Science visualizations.
Dust, bacteria and genomes.
18 November 2015
bacteria
dust
graphic science
scientific american
visualization

Information graphic in this month's Scientific American Graphic Science.
Men and Women Alter a Home's Bacteria Differently—an analysis of dust reveals how the presence of men, women, dogs and cats affects the variety of bacteria in a household.
11 November 2015
center for inquiry
hitchmas
party

Center for Inquiry Canada promotes my Hitchmas 2015 party in Copenhagen.
11 November 2015
music
numbers
pi

More songs with numbers in their lyrics added.
9 November 2015
art
color
resource

Updated named color list (v0.3) to 8,292 colors.
7 November 2015
art
curios
math

Debunking bad math about `\pi`.
3 November 2015
art
color
resource

Ultimate list of 5,000+ named colors.
1 November 2015
hitchmas
party

Hitchmas 2015 is in Copenhagen.
Sit down. Object the objectionable

news + thoughts

Ensemble methods: Bagging and random forests

Mon 16-10-2017
Many heads are better than one.

We introduce two common ensemble methods: bagging and random forests. Both methods repeat a statistical analysis on many bootstrap samples and combine the results to improve the accuracy of the predictor. Our column shows these methods as applied to classification and regression trees.

Martin Krzywinski @MKrzywinski mkweb.bcgsc.ca
Nature Methods Points of Significance column: Ensemble methods: Bagging and random forests.

For example, bagging with regression trees lets us sample the space of values more finely, because each bootstrap sample can produce a tree that splits at different boundaries.

Random forests generate a large number of trees by not only generating bootstrap samples but also randomly choosing which predictor variables are considered at each split in the tree.
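The two ideas above can be illustrated with a minimal sketch. It is not the method from the column: to stay short it assumes one-split regression trees (stumps), and the names `fit_stump`, `fit_ensemble` and `predict_ensemble` are hypothetical. With `n_features=None` every tree sees all predictors (bagging). With a smaller `n_features` each split considers only a random subset of predictors, as in a random forest. Because a stump has a single split, choosing the subset per tree is the same as choosing it per split here.

```python
import random

def fit_stump(rows, targets, features):
    """Fit a one-split regression tree using only the given feature indices."""
    best = None
    for f in features:
        # Candidate thresholds: every observed value except the largest,
        # so both sides of the split are non-empty.
        for t in sorted({r[f] for r in rows})[:-1]:
            left = [y for r, y in zip(rows, targets) if r[f] <= t]
            right = [y for r, y in zip(rows, targets) if r[f] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            cost = (sum((y - lm) ** 2 for y in left)
                    + sum((y - rm) ** 2 for y in right))
            if best is None or cost < best[0]:
                best = (cost, f, t, lm, rm)
    if best is None:
        # No usable split (all values identical): predict the sample mean.
        m = sum(targets) / len(targets)
        return (features[0], float("inf"), m, m)
    return best[1:]

def predict_stump(stump, row):
    f, t, left_mean, right_mean = stump
    return left_mean if row[f] <= t else right_mean

def fit_ensemble(rows, targets, n_trees=50, n_features=None, seed=0):
    """Bagging when n_features is None; random-forest style otherwise."""
    rng = random.Random(seed)
    n, p = len(rows), len(rows[0])
    k = p if n_features is None else n_features
    ensemble = []
    for _ in range(n_trees):
        # Bootstrap sample: draw n rows with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        # Random subset of predictors considered at the split.
        feats = rng.sample(range(p), k)
        ensemble.append(fit_stump([rows[i] for i in idx],
                                  [targets[i] for i in idx], feats))
    return ensemble

def predict_ensemble(ensemble, row):
    """Average the predictions of all trees."""
    return sum(predict_stump(s, row) for s in ensemble) / len(ensemble)
```

For example, `fit_ensemble(rows, targets)` returns a bagged ensemble and `fit_ensemble(rows, targets, n_features=1)` a forest-style one. In both cases the prediction is the average over all trees, which is what smooths out the hard boundaries of any single tree.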

Krzywinski, M. & Altman, N. (2017) Points of Significance: Ensemble methods: bagging and random forests. Nature Methods 14:933–934.

Background reading

Krzywinski, M. & Altman, N. (2017) Points of Significance: Classification and regression trees. Nature Methods 14:757–758.

...more about the Points of Significance column

Classification and regression trees

Mon 16-10-2017
Decision trees are a powerful but simple prediction method.

Decision trees classify data by splitting it along the predictor axes into partitions with homogeneous values of the dependent variable. Unlike logistic or linear regression, CART does not develop a prediction equation. Instead, data are predicted by a series of binary decisions based on the boundaries of the splits. Decision trees are very effective and the resulting rules are readily interpreted.

Trees can be built using different metrics that measure how well the splits divide up the data classes: Gini index, entropy or misclassification error.
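As a rough sketch of the three metrics, the functions below compute each one from a list of class labels using the standard textbook definitions. The function names are hypothetical, and `split_impurity` assumes the usual size-weighted average over the two sides of a split.

```python
import math
from collections import Counter

def class_proportions(labels):
    """Fraction of observations in each class present in labels."""
    n = len(labels)
    return [count / n for count in Counter(labels).values()]

def gini(labels):
    """Gini index: 1 - sum of squared class proportions."""
    return 1.0 - sum(p * p for p in class_proportions(labels))

def entropy(labels):
    """Shannon entropy in bits: -sum p * log2(p)."""
    return -sum(p * math.log2(p) for p in class_proportions(labels))

def misclassification(labels):
    """Error rate when predicting the majority class."""
    return 1.0 - max(class_proportions(labels))

def split_impurity(left, right, metric):
    """Size-weighted impurity of a candidate split into left and right."""
    n = len(left) + len(right)
    return (len(left) / n) * metric(left) + (len(right) / n) * metric(right)
```

All three are zero for a partition containing a single class and largest when the classes are evenly mixed, so a good split is one that lowers `split_impurity` relative to the impurity of the unsplit data.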

Nature Methods Points of Significance column: Classification and regression trees.

When the response variable is quantitative rather than categorical, regression trees are used. Here, the data are still split, but now the response is estimated by the average of the observations within the split boundaries. Tree growth can be controlled using the complexity parameter, a measure of the relative improvement of each new split.
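A single regression split can be sketched as follows. This is a simplified illustration for one predictor, assuming a least-squares criterion (the split that minimizes the summed squared error around each side's mean). The names `best_split` and `predict` are hypothetical, and the complexity parameter is not modeled.

```python
def mean(values):
    return sum(values) / len(values)

def sse(values):
    """Sum of squared deviations from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def best_split(x, y):
    """Return (threshold, left_mean, right_mean) for the least-squares split."""
    pairs = sorted(zip(x, y))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical predictor values
        left = [p[1] for p in pairs[:i]]
        right = [p[1] for p in pairs[i:]]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            # Place the boundary midway between neighbouring values.
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (cost, threshold, mean(left), mean(right))
    if best is None:
        raise ValueError("need at least two distinct predictor values")
    return best[1:]

def predict(split, x_new):
    """Predict with the average of the side that x_new falls on."""
    threshold, left_mean, right_mean = split
    return left_mean if x_new <= threshold else right_mean
```

A full tree would apply `best_split` recursively to each side, stopping when a new split no longer improves the fit enough.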

Individual trees can be very sensitive to minor changes in the data and even better prediction can be achieved by exploiting this variability. Using ensemble methods, we can grow multiple trees from the same data.

Krzywinski, M. & Altman, N. (2017) Points of Significance: Classification and regression trees. Nature Methods 14:757–758.

Background reading

Lever, J., Krzywinski, M. & Altman, N. (2016) Points of Significance: Logistic regression. Nature Methods 13:541-542.

Altman, N. & Krzywinski, M. (2015) Points of Significance: Multiple Linear Regression. Nature Methods 12:1103-1104.

Lever, J., Krzywinski, M. & Altman, N. (2016) Points of Significance: Classifier evaluation. Nature Methods 13:603-604.

Lever, J., Krzywinski, M. & Altman, N. (2016) Points of Significance: Model Selection and Overfitting. Nature Methods 13:703-704.

Lever, J., Krzywinski, M. & Altman, N. (2016) Points of Significance: Regularization. Nature Methods 13:803-804.

...more about the Points of Significance column

Personal Oncogenomics Program 5 Year Anniversary Art

Wed 26-07-2017

The artwork was created in collaboration with my colleagues at the Genome Sciences Center to celebrate the 5 year anniversary of the Personalized Oncogenomics Program (POG).

5 Years of Personalized Oncogenomics Program at Canada's Michael Smith Genome Sciences Centre. The poster shows 545 cancer cases. (left) Cases ordered chronologically by case number. (right) Cases grouped by diagnosis (tissue type) and then by similarity within group.

The Personal Oncogenomics Program (POG) is a collaborative research study including many BC Cancer Agency oncologists, pathologists and other clinicians along with Canada's Michael Smith Genome Sciences Centre with support from BC Cancer Foundation.

The aim of the program is to sequence, analyze and compare the genome of each patient's cancer (the entire DNA and RNA inside tumor cells) in order to understand what is enabling it and to identify less toxic and more effective treatment options.

Principal component analysis

Thu 06-07-2017
PCA helps you interpret your data, but it will not always find the important patterns.

Principal component analysis (PCA) simplifies the complexity in high-dimensional data by reducing its number of dimensions.

Nature Methods Points of Significance column: Principal component analysis.

To retain trends and patterns in the reduced representation, PCA finds linear combinations of canonical dimensions that maximize the variance of the projection of the data.

PCA is helpful in visualizing high-dimensional data and scatter plots based on 2-dimensional PCA can reveal clusters.
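To make the variance-maximizing projection concrete, here is a minimal sketch that finds the first principal component with power iteration on the sample covariance matrix. Power iteration is a choice made for brevity, not something the column prescribes. The function name is hypothetical, and the sketch assumes the data have non-zero variance and that the starting vector is not orthogonal to the leading component.

```python
import math

def first_principal_component(data, n_iter=200):
    """Return (direction, variance, scores) for the first principal component."""
    n, d = len(data), len(data[0])
    # Center each dimension on its mean.
    means = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in data]
    # Sample covariance matrix (d x d).
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeatedly apply cov and renormalize.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Variance of the data projected onto v (the leading eigenvalue).
    variance = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
    # One-dimensional representation of each observation.
    scores = [sum(r[j] * v[j] for j in range(d)) for r in centered]
    return v, variance, scores
```

The returned `direction` is the linear combination of the original dimensions, and `scores` are the coordinates one would plot. A 2-dimensional PCA scatter plot needs a second component as well, for instance by repeating the procedure after removing the first component from the data.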

Altman, N. & Krzywinski, M. (2017) Points of Significance: Principal component analysis. Nature Methods 14:641–642.

Background reading

Altman, N. & Krzywinski, M. (2017) Points of Significance: Clustering. Nature Methods 14:545–546.

...more about the Points of Significance column