
art: revealing


data visualization + art

The BC Cancer Agency’s Personalized Oncogenomics Program (POG) is a clinical research initiative applying genomic sequencing to the diagnosis and treatment of patients with incurable cancers.

Art of the Personalized Oncogenomics Program

Nature uses only the longest threads to weave her patterns, so that each small piece of her fabric reveals the organization of the entire tapestry.
— Richard Feynman

Art is Science in Love
— E.F. Weisslitz

what do the circles mean?

The legend can be printed at 4" × 6". The bitmap resolution is 600 dpi.

Quick legend. 5 Years of Personalized Oncogenomics Project at Canada's Michael Smith Genome Sciences Centre. The poster shows 545 cancer cases.

a case for a visual case summary

For every case, we sequence the DNA to study the genome structure and the RNA to discover which genes are expressed and to what extent. The analysis is quite complex and brings together many steps: sequence alignment, structural variation detection, expression profiling, pathway analysis and so on. Every case is "summarized" by a lengthy report, such as the one below, which can run to over 40 pages.

A report for a typical POG case is about 40–50 pages.

One of the goals of the 5-year anniversary art was to represent the cases in a way that clearly shows their number, classification and diversity. Many metrics could be used for this, and I chose the case's correlation to other cancer types.

correlation to TCGA cancer database

For every POG case, the gene expression of 1,744 key genes is compared to that of thousands of cases in the TCGA database of cancer samples. For a given cancer type in the TCGA database (e.g. BRCA), we visualize the correlations using box plots. The box plot is ideal for showing the distribution of values in a sample.

Every case is compared to a database of thousands of cases. Shown here are box plots for the Spearman correlation coefficient between the gene expression of the POG case and cancers of a specific type (e.g. BRCA, LUAD, etc).

The 10 largest Spearman correlation coefficients for the case shown above are

$case     corr    type       tissue$
$-----------------------------------------------$
$POG661   0.436   BRCA       Breast$
$POG661   0.371   PRAD       Urologic$
$POG661   0.295   OV         Gynecologic$
$POG661   0.257   UCEC       Gynecologic$
$POG661   0.244   LUAD       Thoracic$
$POG661   0.235   CESC_CAD   Gynecologic$
$POG661   0.225   MB_Adult   Central Nervous System$
$POG661   0.222   KICH       Urologic$
$POG661   0.219   THCA       Endocrine$
$POG661   0.208   UCS        Gynecologic$
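A ranking like the one above can be sketched in a few lines of Python. This is only an illustration, not the POG pipeline: it assumes that the value reported for each cancer type is the median of the per-sample Spearman correlations, and the names `ranks`, `spearman`, `top_types` and the toy `reference` data are all hypothetical.

```python
from statistics import mean, median


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the ranks.

    Assumes neither vector is constant (otherwise the variance is zero).
    """
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def top_types(case_expr, reference, n=10):
    """Rank cancer types by median Spearman correlation with one case.

    reference maps a cancer type to a list of expression vectors, one per
    reference sample, in the same gene order as case_expr.
    """
    medians = {
        cancer_type: median(spearman(case_expr, sample) for sample in samples)
        for cancer_type, samples in reference.items()
    }
    return sorted(medians.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Toy data with 5 "genes"; the real comparison uses 1,744 genes.
case = [5.0, 1.0, 3.0, 2.0, 4.0]
reference = {
    "TYPE_A": [[50.0, 10.0, 30.0, 20.0, 40.0], [9.0, 1.0, 4.0, 5.0, 8.0]],
    "TYPE_B": [[1.0, 5.0, 3.0, 4.0, 2.0]],
}
ranking = top_types(case, reference)
```

In practice a library routine such as `scipy.stats.spearmanr` would replace the hand-written `spearman`. The pure-Python version is used here only to keep the sketch self-contained.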

In the figure below I show how the final encoding of the correlations is done. First, the top three correlations are taken—using more generates a busy look and diminishes visual impact. The correlations are encoded as concentric rings.

Because the differences among the top 3 correlations are relatively small in most cases, they are emphasized by scaling the encoding non-linearly: each correlation $r$ is first transformed to $r^3$.

Case POG661. Median gene expression correlations with different cancer types from TCGA database. (A) Top 10 correlations shown as a bar plot. Color coding is by source tissue associated with the cancer type. (B) Top 10 correlations encoded as concentric rings. The width of the ring is proportional to the correlation. (C) Top 3 correlations. (D) Top 3 correlations scaled with a power to emphasize differences.
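To get a feel for what the power scaling does to the rings, here is a minimal sketch. It assumes that ring widths are normalized to fill a fixed total width and that rings are drawn innermost first; neither detail is stated above, and the function name `ring_radii` and its parameters are hypothetical.

```python
def ring_radii(correlations, power=3, inner_radius=0.0, total_width=1.0):
    """Return (inner, outer) radii for concentric rings, innermost first.

    Each ring's width is proportional to correlation ** power, and the
    widths are normalized so that together they span total_width.
    Assumes all correlations are positive.
    """
    weights = [r ** power for r in correlations]
    scale = total_width / sum(weights)
    rings = []
    start = inner_radius
    for weight in weights:
        end = start + weight * scale
        rings.append((start, end))
        start = end
    return rings


top3 = [0.436, 0.371, 0.295]  # BRCA, PRAD, OV for case POG661
linear = ring_radii(top3, power=1)  # width proportional to r
cubed = ring_radii(top3, power=3)   # width proportional to r^3
```

With linear scaling the widest ring is about 1.5 times the width of the narrowest (0.436 / 0.295). After cubing, that ratio grows to about 3.2, which is what makes the differences readable at poster scale.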

The typeface is Proxima Nova. The colors for each tissue source are

$tissue                        R,G,B$
$Gastrointestinal         ●   234,62,144$
$Breast                   ●   237,75,51$
$Thoracic                 ●   242,130,56$
$Gynecologic              ●   253,188,61$
$Soft tissue              ●   244,217,59$
$Skin                     ●   193,216,51$
$Urologic                 ●   114,197,49$
$Hematologic              ●   29,166,68$
$Head and neck            ●   43,168,224$
$Endocrine                ●   71,82,178$
$Central nervous system   ●   127,65,146$
$Other                    ●   150,150,150$

Virus Mutations Reveal How COVID-19 Really Spread

Mon 04-05-2020

Genetic sequences of the coronavirus tell the story of when the virus arrived in each country and where it came from.

Our graphic in Scientific American's Graphic Science section in the June 2020 issue shows a phylogenetic tree based on a snapshot of the data model from Nextstrain as of 31 March 2020.

Virus Mutations Reveal How COVID-19 Really Spread. Text by Mark Fischetti (Senior Editor), art direction by Jen Christiansen (Senior Graphics Editor), source: Nextstrain (enabled by data from GISAID).

Cover of Nature Cancer April 2020

Mon 27-04-2020

Our design on the cover of Nature Cancer's April 2020 issue shows mutation spectra of patients from the POG570 cohort of 570 individuals with advanced metastatic cancer.

Each ellipse system represents the mutation spectrum of an individual patient. Individual ellipses in the system correspond to the number of base changes in a given class and are layered by mutation count. Ellipse angle is controlled by the proportion of mutations in a class within the sample and its size is determined by a sigmoid mapping of mutation count scaled within the layer. The opacity of each system represents the duration since the diagnosis of advanced disease.
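The angle and size mappings can be illustrated with a rough sketch. Almost everything here is an assumption made for illustration: the logistic form and its `steepness` and `midpoint`, the `max_angle`, the min-max scaling, and the reading of "layer" as a range of counts given by the hypothetical `layer_min` and `layer_max`. It is not the code used for the cover.

```python
import math


def sigmoid(x, steepness=8.0, midpoint=0.5):
    """Logistic curve mapping x in [0, 1] to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-steepness * (x - midpoint)))


def ellipse_system(counts, layer_min, layer_max, max_angle=90.0):
    """Map per-class mutation counts to {class: (angle_degrees, size)}.

    The angle grows with the class's share of the sample's mutations.
    The size is a sigmoid of the count after min-max scaling to the layer.
    """
    total = sum(counts.values())
    span = layer_max - layer_min
    system = {}
    for mutation_class, count in counts.items():
        proportion = count / total if total else 0.0
        scaled = (count - layer_min) / span if span else 0.0
        system[mutation_class] = (proportion * max_angle, sigmoid(scaled))
    return system


# Made-up counts for one hypothetical patient, not data from the cohort.
example = ellipse_system(
    {"C>A": 120, "C>T": 300, "T>C": 80}, layer_min=0, layer_max=400
)
```

A sigmoid is a natural choice when counts span a wide range: it compresses the extremes so that neither very small nor very large counts dominate the design.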

The cover design accompanies our report in the issue: Pleasance, E., Titmuss, E., Williamson, L. et al. (2020) Pan-cancer analysis of advanced patient tumors reveals interactions between therapy and genomic landscapes. Nat Cancer 1:452–468.

Modeling infectious epidemics

Wed 06-05-2020

Every day sadder and sadder news of its increase. In the City died this week 7496; and of them, 6102 of the plague. But it is feared that the true number of the dead this week is near 10,000 ....
—Samuel Pepys, 1665

This month, we begin a series of columns on epidemiological models. We start with the basic SIR model, which models the spread of an infection between three groups in a population: susceptible, infected and recovered.

Nature Methods Points of Significance column: Modeling infectious epidemics.

We discuss conditions under which an outbreak occurs, estimates of spread characteristics and the effects that mitigation can have on disease trajectories. We show the trends that arise when "flattening the curve" by decreasing $R_0$.
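The effect of decreasing $R_0$ is easy to reproduce with a small simulation. The sketch below integrates the basic SIR equations with forward Euler steps. The parameter values (a 7-day infectious period, an initial infected fraction of 0.0001, one year of simulated time) and the function names are assumptions chosen for illustration, not values from the column.

```python
def simulate_sir(r0, infectious_period=7.0, i0=1e-4, days=365, dt=0.1):
    """Integrate the basic SIR model with forward Euler steps.

    S, I and R are population fractions. gamma is the recovery rate
    (1 / infectious period) and beta = r0 * gamma is the transmission rate.
    Returns a list of (t, S, I, R) tuples.
    """
    gamma = 1.0 / infectious_period
    beta = r0 * gamma
    s, i, r = 1.0 - i0, i0, 0.0
    trajectory = [(0.0, s, i, r)]
    steps = int(days / dt)
    for step in range(1, steps + 1):
        new_infections = beta * s * i * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        trajectory.append((step * dt, s, i, r))
    return trajectory


def peak_infected(trajectory):
    """Return (time, fraction) at which the infected fraction is largest."""
    t, _, i, _ = max(trajectory, key=lambda row: row[2])
    return t, i


for r0 in (3.0, 2.0, 1.5, 0.9):
    t_peak, i_peak = peak_infected(simulate_sir(r0))
    print(f"R0={r0}: peak infected fraction {i_peak:.3f} at day {t_peak:.0f}")
```

Lower values of $R_0$ give a lower and later peak, and for $R_0 < 1$ the infected fraction only declines from its starting value, so no outbreak occurs. Forward Euler is used for brevity; an adaptive ODE solver would be the more careful choice.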


This column has an interactive supplemental component that allows you to explore how the model curves change with parameters such as infectious period, basic reproduction number and vaccination level.

Nature Methods Points of Significance column: Modeling infectious epidemics. (Interactive supplemental materials)

Bjørnstad, O.N., Shea, K., Krzywinski, M. & Altman, N. (2020) Points of significance: Modeling infectious epidemics. Nature Methods 17:455–456.

The Outbreak Poems

Sat 04-04-2020

I'm writing poetry daily to put my feelings into words more often during the COVID-19 outbreak.

$That moment when you know a moment.$
$Branch to branch, flit, look everywhere, chirp.$
$Memory, scent of thought fleeting.$
$Distant pasts all ways in plural form.$

Deadly Genomes: Genome Structure and Size of Harmful Bacteria and Viruses

Tue 17-03-2020

A poster full of epidemiological worry and statistics. Now updated with the genome of SARS-CoV-2 and COVID-19 case statistics as of 3 March 2020.

Deadly Genomes: Genome Structure and Size of Harmful Bacteria and Viruses

Bacterial and viral genomes of various diseases are drawn as paths with color encoding local GC content and curvature encoding local repeat content. Position of the genome encodes prevalence and mortality rate.
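Local GC content is usually computed over windows along the sequence. The sketch below shows one way to do it. The window and step sizes are arbitrary choices for illustration, and the function name `gc_content` is hypothetical rather than taken from the poster's code.

```python
def gc_content(sequence, window=100, step=100):
    """Return (start, gc_fraction) for each full window along the sequence.

    Any base other than G or C, including ambiguity codes such as N,
    counts as non-GC. A trailing partial window is ignored.
    """
    sequence = sequence.upper()
    profile = []
    for start in range(0, len(sequence) - window + 1, step):
        chunk = sequence[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        profile.append((start, gc / window))
    return profile


# Three non-overlapping windows of 4 bases each.
profile = gc_content("GGCCAATTGCAT", window=4, step=4)
```

Each fraction could then be mapped to a color along the genome path. Using a step smaller than the window gives overlapping windows and a smoother profile.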

The deadly genomes collection has been updated with posters of the genomes of SARS-CoV-2, the novel coronavirus that causes COVID-19.

Genomes of 56 SARS-CoV-2 coronaviruses that cause COVID-19.
Ball of 56 SARS-CoV-2 coronaviruses that cause COVID-19.
The first SARS-CoV-2 genome (MT019529) to be sequenced appears first on the poster.

Using Circos in Galaxy Australia Workshop

Wed 04-03-2020

A workshop on using the Circos Galaxy wrapper by Hiltemann and Rasche. Event organized by Australian Biocommons.

Using Circos in Galaxy Australia workshop.

Galaxy wrapper training materials: Hiltemann, S. & Rasche, H. (2020) Visualisation with Circos (Galaxy Training Materials).