visualization + design

Genome Informatics 2010 cover

Genome Informatics, September 15-19, 2010 / Hinxton, UK

1 · The conference program cover

The program cover shows sequences of some of the genes and viruses that appear in the 2010 Genome Informatics conference's abstracts.

Martin Krzywinski @MKrzywinski mkweb.bcgsc.ca
GENOME INFORMATICS 2010 FRONT COVER | The conference program cover shows sequences of some of the proteins and genes reported in the abstracts drawn as paths
GENOME INFORMATICS 2010 BACK COVER | The conference program cover shows sequences of some of the proteins and genes reported in the abstracts drawn as paths

The booklet was published with a black cover background. Below is an inverted and pinkish take on the cover.

GENOME INFORMATICS 2010 FRONT AND BACK COVER | The conference program cover shows sequences of some of the proteins and genes reported in the abstracts drawn as paths

2 · Design of the cover

2.1 · Sequence as a path

Each sequence is represented by a continuous path. The length of the path is proportional to the length of the sequence.

2.2 · Path color — GC Content

At each point on the path, color is used to show the GC content computed over a window of 20 bases at that position.

GC CONTENT ENCODING | GC content is encoded by color

Because the GC content doesn't vary greatly, values in the range 0.2–0.6 are mapped onto hues 0–300, with GC values outside that range assigned to the start and end hues. To smooth the color mapping, a running average is calculated across 10 adjacent samples.
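The color encoding described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the cover: the function names are hypothetical, the window is assumed to slide one base at a time, and the running average is assumed to be a trailing average applied to the GC values before they are mapped to hue.

```python
def gc_windows(seq, window=20):
    """GC fraction of each window of `window` bases (assumed step: one base)."""
    seq = seq.upper()
    return [
        sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def running_mean(values, span=10):
    """Trailing average over up to `span` adjacent samples (assumed smoothing)."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - span + 1):i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def gc_to_hue(gc, lo=0.2, hi=0.6, hue_max=300.0):
    """Map GC in [lo, hi] linearly onto hue 0..hue_max, clamping values outside."""
    clamped = min(max(gc, lo), hi)
    return (clamped - lo) / (hi - lo) * hue_max


def hues_for_sequence(seq):
    """Windowed GC content, smoothed, then converted to hue in degrees."""
    return [gc_to_hue(g) for g in running_mean(gc_windows(seq))]
```

With this mapping a GC content of 0.4 lands in the middle of the hue range (about 150), while anything at or below 0.2 gets hue 0 and anything at or above 0.6 gets hue 300.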

2.3 · Path direction — relative GC content

Direction of the curvature of the path is determined by the GC content relative to the average GC content of the human genome.

2.4 · Path curvature — Repeat content

The magnitude of path curvature is informed by the repeat content near that location, which is calculated by determining the average frequency of 10-mers sampled within a window of 200 bases relative to their frequency in the human exon sequence.

This quantity is expressed relative to the chance of observing these 10-mers randomly and used to inform the angle of the path. Regions that are composed of 10-mers that are relatively rare are straighter than those which contain repetitive regions.
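One possible reading of the repeat score is sketched below. It is an assumption-laden illustration rather than the original method: the function names are hypothetical, the reference is any string standing in for the human exon sequence, and "chance" is taken to mean the uniform expectation of 4^-k for a k-mer.

```python
from collections import Counter


def kmer_frequencies(reference, k=10):
    """Relative frequency of every k-mer in a reference sequence."""
    counts = Counter(reference[i:i + k] for i in range(len(reference) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def repeat_score(seq, start, ref_freq, k=10, window=200):
    """Mean reference frequency of the k-mers in one window, relative to chance.

    Chance is assumed to be the uniform expectation 4**-k, so a score near 1
    means the window's k-mers are about as common as random sequence predicts.
    """
    chunk = seq[start:start + window]
    kmers = [chunk[i:i + k] for i in range(len(chunk) - k + 1)]
    if not kmers:
        return 0.0
    mean_freq = sum(ref_freq.get(kmer, 0.0) for kmer in kmers) / len(kmers)
    return mean_freq * 4 ** k
```

Windows made of 10-mers that are rare in the reference score low and would be drawn straighter; windows full of common 10-mers score high and would curl.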

CURVATURE SHOWS REPEATS | The degree to which the path turns is informed by how much of the sequence at that position is repeated.

The path is confined within a circular area to keep it compact, at the cost of losing translational and rotational invariance of the representation. This limitation is due to the fact that the segments of the path depend on the angle and position at which the path approaches the circular boundary.
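Sections 2.3 and 2.4 together amount to a turtle-style walk: each sample turns the heading, then advances one step. The sketch below is a hypothetical illustration. It assumes a repeat score already scaled to 0–1, a placeholder genome-wide GC average of 0.41, an arbitrary maximum turn of 30 degrees, and that above-average GC turns counterclockwise. The circular confinement is left out because the boundary rule is not described here.

```python
import math


def trace_path(gc, repeat, gc_mean=0.41, step=1.0, max_turn_deg=30.0):
    """Return the (x, y) points of a path that turns once per sample.

    gc      -- local GC content per sample; sets the turn direction
    repeat  -- repeat score per sample, assumed scaled to 0..1; sets the turn size
    gc_mean -- placeholder for the genome-wide average GC content
    """
    x, y, heading = 0.0, 0.0, 0.0
    points = [(x, y)]
    for g, r in zip(gc, repeat):
        # Assumed convention: above-average GC turns counterclockwise.
        sign = 1.0 if g >= gc_mean else -1.0
        turn = math.radians(max_turn_deg) * min(max(r, 0.0), 1.0)
        heading += sign * turn
        x += step * math.cos(heading)
        y += step * math.sin(heading)
        points.append((x, y))
    return points
```

A sequence with no repeats gives a straight line, and two sequences with the same repeat profile but GC content on opposite sides of the average curl in mirror-image directions.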

2.5 · Interpreting structure

For genes, the transcribed sequence is shown, which includes both introns and exons.

GENES ARE HIGH-INFORMATION AREAS | Areas of high information are straighter (fewer repeats), whereas sequences outside genes and in repeats tend to curl up on themselves.

The overall effect of the path encoding is a qualitative, artistic interpretation of local sequence structure. Two paths can be directly compared to interrogate differences in their corresponding sequence.

3 · Deadly genome series

The Deadly Genomes poster demonstrates how entire genomes appear when encoded as paths. The poster compares the incidence rates and mortality of harmful viruses and bacteria, such as malaria, syphilis, AIDS and SARS.

Discover all the things that are not trying to make you stronger.
The cover design uses the same approach to depicting genomes as the Deadly Genomes poster.

As on the conference covers, on the poster each genome is drawn as a path. The length of the path is proportional to the size of the genome. Every fifth base is drawn as a circle whose color is based on the GC content (fraction of guanines and cytosines). The path curvature is proportional to the repeat content and the direction of curvature is determined by whether the GC content is lower or higher than average. Genomes are labeled by disease, organism, size (in bases) and GC content. Updated with the genome of SARS-CoV-2 (Wuhan-Hu-1 isolate) and COVID-19 case statistics as of 3 March 2020.

DEADLY GENOMES | Genomes of harmful bacteria and viruses.

The poster was a finalist in the 2009 National Science Foundation Visualization Challenge.

news + thoughts

Regression modeling of time-to-event data with censoring

Mon 21-11-2022

If you sit on the sofa for your entire life, you’re running a higher risk of getting heart disease and cancer. —Alex Honnold, American rock climber

In a follow-up to our Survival analysis — time-to-event data and censoring article, we look at how regression can be used to account for additional risk factors in survival analysis.

We explore accelerated failure time regression (AFTR) and the Cox Proportional Hazards model (Cox PH).

Nature Methods Points of Significance column: Regression modeling of time-to-event data with censoring. (read)

Dey, T., Lipsitz, S.R., Cooper, Z., Trinh, Q., Krzywinski, M. & Altman, N. (2022) Points of significance: Regression modeling of time-to-event data with censoring. Nature Methods 19.

Music video for Max Cooper's Ascent

Tue 25-10-2022

My 5-dimensional animation sets the visual stage for Max Cooper's Ascent from the album Unspoken Words. I have previously collaborated with Max on telling a story about infinity for his Yearning for the Infinite album.

I provide a walkthrough of the video, describe the animation system I created to generate the frames, and show you all the keyframes.

Frame 4897 from the music video of Max Cooper's Ascent.

The video recently premiered on YouTube.

Renders of the full scene are available as NFTs.

Gene Cultures exhibit — art at the MIT Museum

Tue 25-10-2022

I am more than my genome and my genome is more than me.

The MIT Museum reopened at its new location on 2nd October 2022. The new Gene Cultures exhibit featured my visualization of the human genome, which walks through the size and organization of the genome and some of the important structures.

My art at the MIT Museum Gene Cultures exhibit shows the scale and structure of the human genome. Pay no attention to the pink chicken.

Annals of Oncology cover

Wed 14-09-2022

My cover design on the 1 September 2022 Annals of Oncology issue shows 570 individual cases of difficult-to-treat cancers. Each case shows the number and type of actionable genomic alterations that were detected and the length of therapies that resulted from the analysis.

An organic arrangement of 570 individual cases of difficult-to-treat cancers showing genomic changes and therapies. Appears on the Annals of Oncology cover (volume 33, issue 9, 1 September 2022).

Pleasance E et al. Whole-genome and transcriptome analysis enhances precision cancer treatment options (2022) Annals of Oncology 33:939–949.

My Annals of Oncology 570 cancer cohort cover (volume 33, issue 9, 1 September 2022). (more)

Browse my gallery of cover designs.

A catalogue of my journal and magazine cover designs. (more)

Survival analysis—time-to-event data and censoring

Fri 05-08-2022

Love's the only engine of survival. —L. Cohen

We begin a series on survival analysis in the context of its two key complications: skew (which calls for the use of probability distributions, such as the Weibull, that can accommodate skew) and censoring (required because we almost always fail to observe the event in question for all subjects).

We discuss right, left and interval censoring and how mishandling censoring can lead to bias and loss of sensitivity in tests that probe for differences in survival times.

Nature Methods Points of Significance column: Survival analysis—time-to-event data and censoring. (read)

Dey, T., Lipsitz, S.R., Cooper, Z., Trinh, Q., Krzywinski, M. & Altman, N. (2022) Points of significance: Survival analysis—time-to-event data and censoring. Nature Methods 19:906–908.

3,117,275,501 Bases, 0 Gaps

Sun 21-08-2022

See How Scientists Put Together the Complete Human Genome.

My graphic in Scientific American's Graphic Science section in the August 2022 issue shows the full history of the human genome assembly — from its humble shotgun beginnings to the gapless telomere-to-telomere assembly.

Read about the process and methods behind the creation of the graphic.

3,117,275,501 Bases, 0 Gaps. Text by Clara Moskowitz (Senior Editor), art direction by Jen Christiansen (Senior Graphics Editor), source: UCSC Genome Browser.

See all my Scientific American Graphic Science visualizations.


© 1999–2022 Martin Krzywinski | Canada's Michael Smith Genome Sciences Centre | BC Cancer Research Center | BC Cancer | PHSA