# numbers: beautiful

EMBO Practical Course: Bioinformatics and Genome Analysis, 5–17 June 2017.

# visualization + design

The 2017 Pi Day art imagines the digits of Pi as a star catalogue with constellations of extinct animals and plants. The work is featured in the article Pi in the Sky at the Scientific American SA Visual blog.

# $\pi$ Day 2017 Art Posters - Star charts and extinct animals and plants

2017 $\pi$ day
2016 $\pi$ approximation day
2016 $\pi$ day
2015 $\pi$ day
2014 $\pi$ approx day
2014 $\pi$ day
2013 $\pi$ day
Circular $\pi$ art

On March 14th celebrate $\pi$ Day. Hug $\pi$—find a way to do it.

Those who favour $\tau=2\pi$ will have to postpone celebrations until June 28th. That's what you get for thinking that $\pi$ is wrong.

If you're not into details, you may opt to party on July 22nd, which is $\pi$ approximation day ($\pi$ ≈ 22/7). It's 20% more accurate than the official $\pi$ day!
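The 20% figure can be checked with a little arithmetic. This is only a sketch of one reading of the claim: it assumes that the official $\pi$ day stands for the approximation 3.14 and that "accuracy" means absolute error, neither of which the text spells out.

```python
import math

# Assumption: pi day (March 14th) stands for 3.14,
# pi approximation day (July 22nd) stands for 22/7.
error_pi_day = abs(3.14 - math.pi)
error_approx_day = abs(22 / 7 - math.pi)

# Relative reduction in absolute error when using 22/7 instead of 3.14.
improvement = 1 - error_approx_day / error_pi_day

print(f"|3.14 - pi| = {error_pi_day:.6f}")
print(f"|22/7 - pi| = {error_approx_day:.6f}")
print(f"error reduced by {improvement:.1%}")
```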

Finally, if you believe that $\pi = 3$, you should read why $\pi$ is not equal to 3.

All art posters are available for purchase.
I take custom requests.

Caelum non animum mutant qui trans mare currunt.
—Horace

This year: creatures that don't exist, but once did, in the skies.

And a poem Of Black Body.

This year's $\pi$ day song is Exploration by Karminsky Experience Inc. Why? Because "you never know what you'll find on an exploration".

## create myths and contribute!

Want to contribute to the mythology behind the constellations in the $\pi$ in the sky? Many already have a story, but others still need one. Please submit your stories!

This year's $\pi$ day art goes to space and there finds creatures that once called Earth home.

Well, I don't know if they called it home—at least a few thought it was terribly inhospitable. All thought it was unforgivingly Darwinian.

Star chart of the first 12,000,000 digits of $\pi$. The 80 constellations honor extinct animals and plants. Azimuthal equidistant projection.

## the digits of $\pi$ as a star catalogue

I asked the question: what happens if you interpret the digits of $\pi$ as a catalogue of stars? What would the patterns in the sky look like? And, would they have stories? A couple of weeks later—and a few adventures down the rabbit holes of topographical projections—I found the answer.

The digits of $\pi$ are interpreted as a star catalogue. Starting from the beginning of $\pi$, each block of 12 digits is taken to be the $(x, y, z)$ coordinates of a star and its absolute magnitude, which defines its brightness at a fixed distance. Sampling 12 million digits yields 1,000,000 stars. The stars are then projected onto the celestial sphere to generate longitude and latitude coordinates from the perspective of an observer placed at the center of the stars. The distance to each star is used to calculate its apparent magnitude—how bright it will appear on the sky chart.
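The recipe above can be sketched in a few lines. This is a minimal illustration, not the code used for the posters. Several details are assumptions that the text does not specify: that each 12-digit block splits into four 3-digit fields (x, y, z, magnitude), that the leading 3 is included, that the observer sits at 499.5 on each axis, that the magnitude field is divided by a hypothetical `mag_scale`, and that distances can be treated as parsecs in the standard distance modulus.

```python
import math

# First 48 digits of pi with the leading 3 (assumption: the catalogue starts there).
PI_DIGITS = "3" + "1415926535" + "8979323846" + "2643383279" + "5028841971" + "6939937"

def parse_stars(digits, block=12):
    """Split the digit string into (x, y, z, magnitude) tuples of 3 digits each."""
    stars = []
    for start in range(0, len(digits) - block + 1, block):
        chunk = digits[start:start + block]
        x, y, z, mag = (int(chunk[i:i + 3]) for i in range(0, block, 3))
        stars.append((x, y, z, mag))
    return stars

def observe(star, centre=499.5, mag_scale=100.0):
    """Return (longitude, latitude, apparent magnitude) seen from the centre.

    centre and mag_scale are hypothetical choices made for this sketch.
    """
    x, y, z, mag = star
    dx, dy, dz = x - centre, y - centre, z - centre
    distance = math.sqrt(dx * dx + dy * dy + dz * dz)  # never zero: centre is not an integer
    longitude = math.degrees(math.atan2(dy, dx))
    latitude = math.degrees(math.asin(dz / distance))
    absolute_mag = mag / mag_scale
    # Distance modulus, treating the distance as parsecs (assumption).
    apparent_mag = absolute_mag + 5 * math.log10(distance / 10)
    return longitude, latitude, apparent_mag

stars = parse_stars(PI_DIGITS)
sky = [observe(s) for s in stars]
for star, (lon, lat, m) in zip(stars, sky):
    print(star, f"lon={lon:.1f} lat={lat:.1f} m={m:.2f}")
```

Only the mapping from digits to positions is meant to carry over; the scaling of the magnitude field in particular is a placeholder.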

Star chart of the first 12,000,000 digits of $\pi$. The 80 constellations honor extinct animals and plants. Plate carrée projection.
Star chart of the first 12,000,000 digits of $\pi$. The 80 constellations honor extinct animals and plants. Hammer/Aitoff elliptical projection.
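For readers curious how longitude and latitude end up on a flat chart, here is a sketch of the three projections named in the captions, using their textbook formulas with angles in radians. It is an illustration under assumptions, not the code behind the posters: the azimuthal equidistant version is the polar aspect centred on the north pole, and the elliptical one is the Hammer projection only (Aitoff is a related but different projection).

```python
import math

def plate_carree(lon, lat):
    """Equirectangular: longitude and latitude are used directly as x and y."""
    return lon, lat

def hammer(lon, lat):
    """Hammer equal-area projection; the whole sky maps into a 2:1 ellipse."""
    denom = math.sqrt(1 + math.cos(lat) * math.cos(lon / 2))
    x = 2 * math.sqrt(2) * math.cos(lat) * math.sin(lon / 2) / denom
    y = math.sqrt(2) * math.sin(lat) / denom
    return x, y

def azimuthal_equidistant_polar(lon, lat):
    """Polar aspect: distance from the chart centre equals angular distance from the pole."""
    rho = math.pi / 2 - lat
    return rho * math.sin(lon), -rho * math.cos(lon)

point = (math.radians(60), math.radians(30))
for project in (plate_carree, hammer, azimuthal_equidistant_polar):
    x, y = project(*point)
    print(f"{project.__name__}: x={x:.3f} y={y:.3f}")
```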

Of course, a star chart would not be complete without constellations. While the digits of $\pi$ are taken to be a universal catalogue of stars, it is up to the observer to subjectively interpret the patterns and find stories in the sky. Among the 40,000 stars drawn in the chart, 80 extinct species are honored as constellations—creatures range across time periods and geographical locations and play together in the sky. And much like the constellations that you might have seen in the real sky (the hunter Orion and winged horse Pegasus), the $\pi$ in the sky patterns represent creatures that do not exist.

The twist? They all once did!

## chart variations

Charts are available in blue, black and white and sepia.

Star chart of the first 12,000,000 digits of $\pi$. The 80 constellations honor extinct animals and plants. Azimuthal equidistant projection.

## poetry in the sky

This year's art is accompanied by a poem, Of Black Body, by Paolo Marcazzan. The poem is named after the black body, an idealized material that absorbs all incident radiation.

There's plenty of nothing to see in space, Paolo insists: "there is nothing to see and you are seeing it. A truth that likes it here" and he hopes that you'll find yourself "drawn into our constellation, and cooling."

## stories in the sky

There are indeed stories in the sky and many of them haven't yet been told. Below are just some of them.

More stories are available in the details about each constellation. There, for each animal you can find the common name, Latin name, when it went extinct and a link to Wikipedia to learn more about the animal. Many have stories which they would love to share with you.

### raphus — egg guardian extraordinaire

Raphus and Pecatonica

The Dodo bird (Raphus cucullatus) is vigilantly guarding his eggs—the clusters of stars just north of α Raphus (the brightest star in the constellation) and south of β Raphus (the second brightest star). He hardly seems to care about the pestering River Mayfly (Acanthametropus pecatonica), who is trying to draw his attention.

### desmodus — nightly escape attempt

Desmodus

Desmodus, the giant vampire bat (Desmodus draculae), is said to be trying to escape the dome of the sky each night. You might see his shape fluttering up into the top of the northern hemisphere.

### basilosaurus — chasing the light in the deep

Basilosaurus

The king lizard Basilosaurus (Basilosaurus cetoides) dives deeper and deeper at the bottom of the southern hemisphere, as he chases the light of the bright magnitude 1.8 star α Basilosaurus, which sits at the very bottom of the chart.

### quagga and aurochs — figuring out the stripes

Quagga and Aurochs

There is plenty of land mass in the north, where huge creatures like the mammoth (Mammuthus primigenius), sturdy aurochs (Bos primigenius) and the comical Quagga (Equus quagga quagga) run again. The Quagga cannot figure out his stripes and is seen frequently talking to the Aurochs, seeking his advice about the predicament of his patterns.

### megalodon, rodhocetus and tecopa — the terror and the terrorized

Megalodon, Rodhocetus and Tecopa

At the tip of the southern hemisphere, drama unfolds as the monster shark Megalodon (Carcharodon megalodon) gives chase to the Tecopa pupfish (Cyprinodon nevadensis calidae) and Rodhocetus (Rodhocetus kasrani). Rodhocetus was an early whale that possessed land mammal characteristics, and the story goes that he escaped Megalodon and lived out his life on the land, never returning to the sea.

### camptor, mariana, alaotra and tadorna — I'm not a duck!

Camptor, Mariana, Alaotra and Tadorna

The southern hemisphere also has plenty of calmer waters, where all manner of floating birds come for a swim and chat. These are represented by the triangular constellations of Camptor (Camptorhynchus labradorius, the Labrador duck), Alaotra grebe (Tachybaptus rufolavatus), Mariana mallard (Anas oustaleti) and the Korean crested shelduck (Tadorna cristata). Rumor has it Tadorna may have snuck into the sky without permission—while not seen since the 1960s, some say the duck isn't actually extinct! Alaotra is frustrated that Tadorna seems to get all the attention. Often confused for a duck, Alaotra would love you to know that she's in fact a grebe. She's very proud of this fact, despite being prone to falls due to some biomechanical issues having to do with foot placement.

### yersinia — don't come close

Yersinia

Don't let Yersinia's small size fool you. The Black Death (Yersinia pestis) may be the smallest creature in the sky, but she'll liquefy your insides before you can memorize the 80 constellations. Perhaps out of all the creatures in the sky, this is the one we're happy to see go. But, because it's small, you can never be quite sure that Yersinia is extinct and not merely hiding. Or waiting.

### pipilo — flocking across

Pipilo

Pipilo is a rare flocking constellation and breaks the rule that a constellation should be a single connected component. Its stars are connected in a loose pattern of pairs and show a flock of Bermuda towhees (Pipilo naufragus) crossing from the northern to the southern hemisphere.

### o'ahu 'akepa — two is such a lonely number

O'ahu 'akepa

Besides Pipilo, the O'ahu 'akepa (Loxops wolstenholmei) is the only other multi-part constellation. Here, a pair of 'akepas are chatting and spreading rumors that Tadorna isn't actually extinct.

### bron and compsognathus — big brothers and little friends

Bron and Compsognathus

It's hard to be bigger than Bron (Brontosaurus excelsus), who must pay great attention not to step on his frolicking friend Compsognathus (Compsognathus longipes), who seeks to find protection in Bron's shadow. Some believe that if Bron stretches his neck, he can look above the sky!

### pinta and cylindraspis — a friend for the end of the earth

Pinta and Cylindraspis

Ever since Cylindraspis (Cylindraspis indica) heard that Pinta (Chelonoidis abingdonii) was the last of his kind, he decided to keep him company. They can be seen going for a very slow walk at the bottom of the sky, kept there by their heavy shells. The last Pinta—a tortoise named Lonesome George—died in 2012.

### xerces and palaeoaldrovanda — flying flowers and hungry flowers

Xerces and Palaeoaldrovanda

Xerces (Glaucopsyche xerces) is the only thing that is more brilliant than the blue sky itself. Some say that butterflies are flying flowers and Xerces is never seen far from Palaeoaldrovanda. He must be careful though. Rumor has it Palaeoaldrovanda was related to the carnivorous plant genus Aldrovanda! Nobody wants to take that chance.

### araucaria — too large for the sky

Araucaria

Araucaria (Araucaria mirabilis) is truly a marvel and it is too big to fit in the sky! The constellation only shows the canopy and does not include the tree trunk—which was known to reach a height of 100 m. Araucaria offers plenty of protection and has many flying friends all around, including Urania, Moho and Whēkau. Just a little further are the ducks (and a grebe), Camptor, Mariana, Tadorna and Alaotra. They would love to visit Araucaria but worry that they are too heavy to perch on her branches.

### ardea and aepyornis — who can see beyond the sky first?

Ardea and Aepyornis

The story goes that Camelops (Camelops kansansus) played a joke on Ardea (Ardea bennuides) and Aepyornis (Aepyornis maximus). He said "You both have long necks. But who has the longest? The first who can stretch far enough and tell me what is beyond the sky wins." To this day, both Ardea and Aepyornis are seen straining their necks trying to win the bet.

# Snowflake simulation

Tue 14-11-2017
Symmetric, beautiful and unique.

Just in time for the season, I've simulated a snow-pile of snowflakes based on the Gravner-Griffeath model.

A few of the beautiful snowflakes generated by the Gravner-Griffeath model.

Gravner, J. & Griffeath, D. (2007) Modeling Snow Crystal Growth II: A mesoscopic lattice map with plausible dynamics.

# Genes that make us sick

Thu 02-11-2017
Where disease hides in the genome.

My illustration of the location of genes in the human genome that are implicated in disease appears in The Objects that Power the Global Economy, a book by Quartz.

The location of genes implicated in disease in the human genome, shown here as a spiral.

# Ensemble methods: Bagging and random forests

Mon 16-10-2017
Many heads are better than one.

We introduce two common ensemble methods: bagging and random forests. Both of these methods repeat a statistical analysis on many bootstrap samples and combine the results to improve the accuracy of the predictor. Our column shows these methods as applied to Classification and Regression Trees.

Nature Methods Points of Significance column: Ensemble methods: Bagging and random forests.

For example, we can sample the space of values more finely when using bagging with regression trees because each sample has potentially different boundaries at which the tree splits.

Random forests generate a large number of trees by not only generating bootstrap samples but also randomly choosing which predictor variables are considered at each split in the tree.
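The two ideas can be sketched with very small trees. The code below is a toy illustration under stated assumptions, not the analysis from the column: each "tree" is a one-split regression stump, the data are made up (the response depends only on the first predictor, the second is scrambled), and `fit_stump`, `grow_ensemble` and `max_features` are hypothetical names. Because a stump has a single split, choosing random predictors "at each split" reduces here to choosing them once per tree.

```python
import random
from statistics import mean

def fit_stump(rows, ys, features):
    """Fit a one-split regression tree, searching only the given predictors."""
    best = None  # (squared error, predictor, threshold, left mean, right mean)
    for f in features:
        order = sorted(range(len(rows)), key=lambda i: rows[i][f])
        for cut in range(1, len(order)):
            lo, hi = rows[order[cut - 1]][f], rows[order[cut]][f]
            if lo == hi:
                continue  # equal values cannot be separated by a boundary
            left = [ys[i] for i in order[:cut]]
            right = [ys[i] for i in order[cut:]]
            ml, mr = mean(left), mean(right)
            sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
            if best is None or sse < best[0]:
                best = (sse, f, (lo + hi) / 2, ml, mr)
    if best is None:  # degenerate bootstrap sample with no possible split
        m = mean(ys)
        return (features[0], float("inf"), m, m)
    return best[1:]

def predict_stump(stump, row):
    f, threshold, left_mean, right_mean = stump
    return left_mean if row[f] < threshold else right_mean

def grow_ensemble(rows, ys, n_trees=50, max_features=None, seed=0):
    """Bagging when max_features is None; a random-forest-style ensemble otherwise."""
    rng = random.Random(seed)
    n, p = len(rows), len(rows[0])
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap: sample rows with replacement
        if max_features is None:
            feats = list(range(p))
        else:
            feats = rng.sample(range(p), max_features)  # random subset of predictors
        trees.append(fit_stump([rows[i] for i in idx], [ys[i] for i in idx], feats))
    return trees

def predict_ensemble(trees, row):
    """Average the predictions of all trees."""
    return mean(predict_stump(t, row) for t in trees)

# Toy data: y depends on predictor 0 only; predictor 1 is a scrambled copy.
rows = [(float(x), float((7 * x) % 20)) for x in range(20)]
ys = [float(x) for x in range(20)]

bagged = grow_ensemble(rows, ys)
forest = grow_ensemble(rows, ys, max_features=1)
print("distinct split boundaries in the bagged stumps:", len({t[1] for t in bagged}))
print("bagged prediction at x=0 and x=19:",
      predict_ensemble(bagged, rows[0]), predict_ensemble(bagged, rows[19]))
```

A single stump can only predict two values. Averaging stumps whose boundaries differ from one bootstrap sample to the next gives a finer staircase, which is the effect described above for bagged regression trees.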

Krzywinski, M. & Altman, N. (2017) Points of Significance: Ensemble methods: bagging and random forests. Nature Methods 14:933–934.

### Background reading

Krzywinski, M. & Altman, N. (2017) Points of Significance: Classification and regression trees. Nature Methods 14:757–758.

# Classification and regression trees

Mon 16-10-2017
Decision trees are a powerful but simple prediction method.

Decision trees classify data by splitting it along the predictor axes into partitions with homogeneous values of the dependent variable. Unlike logistic or linear regression, classification and regression trees (CART) do not develop a prediction equation. Instead, data are predicted by a series of binary decisions based on the boundaries of the splits. Decision trees are very effective and the resulting rules are readily interpreted.

Trees can be built using different metrics that measure how well the splits divide up the data classes: Gini index, entropy or misclassification error.
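The three metrics are easy to compute from the class proportions in a partition. The sketch below uses the standard textbook definitions; the function names and the toy labels are made up for illustration, and weighting each side of a split by its size is the usual convention rather than something stated above.

```python
import math
from collections import Counter

def class_proportions(labels):
    """Fraction of the partition that belongs to each class."""
    counts = Counter(labels)
    return [c / len(labels) for c in counts.values()]

def gini(labels):
    return 1 - sum(p * p for p in class_proportions(labels))

def entropy(labels):
    return sum(-p * math.log2(p) for p in class_proportions(labels))

def misclassification(labels):
    return 1 - max(class_proportions(labels))

def split_impurity(left, right, metric):
    """Impurity of a split: size-weighted average of the two sides."""
    n = len(left) + len(right)
    return len(left) / n * metric(left) + len(right) / n * metric(right)

parent = ["a"] * 5 + ["b"] * 5
left = ["a"] * 4 + ["b"]
right = ["a"] + ["b"] * 4

for metric in (gini, entropy, misclassification):
    before = metric(parent)
    after = split_impurity(left, right, metric)
    print(f"{metric.__name__}: before={before:.3f} after={after:.3f}")
```

All three are zero for a pure partition and largest when the classes are evenly mixed; a candidate split is better the more it lowers the weighted value.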

Nature Methods Points of Significance column: Classification and regression trees.

When the response variable is quantitative and not categorical, regression trees are used. Here, the data are still split but now the response is estimated by the average within the split boundaries. Tree growth can be controlled using the complexity parameter, a measure of the relative improvement of each new split.

Individual trees can be very sensitive to minor changes in the data and even better prediction can be achieved by exploiting this variability. Using ensemble methods, we can grow multiple trees from the same data.

Krzywinski, M. & Altman, N. (2017) Points of Significance: Classification and regression trees. Nature Methods 14:757–758.

### Background reading

Lever, J., Krzywinski, M. & Altman, N. (2016) Points of Significance: Logistic regression. Nature Methods 13:541–542.

Altman, N. & Krzywinski, M. (2015) Points of Significance: Multiple linear regression. Nature Methods 12:1103–1104.

Lever, J., Krzywinski, M. & Altman, N. (2016) Points of Significance: Classifier evaluation. Nature Methods 13:603–604.

Lever, J., Krzywinski, M. & Altman, N. (2016) Points of Significance: Model selection and overfitting. Nature Methods 13:703–704.

Lever, J., Krzywinski, M. & Altman, N. (2016) Points of Significance: Regularization. Nature Methods 13:803–804.

# Personalized Oncogenomics Program 5 Year Anniversary Art

Wed 26-07-2017

The artwork was created in collaboration with my colleagues at the Genome Sciences Centre to celebrate the 5 year anniversary of the Personalized Oncogenomics Program (POG).

5 Years of Personalized Oncogenomics Program at Canada's Michael Smith Genome Sciences Centre. The poster shows 545 cancer cases. (left) Cases ordered chronologically by case number. (right) Cases grouped by diagnosis (tissue type) and then by similarity within group.

The Personalized Oncogenomics Program (POG) is a collaborative research study including many BC Cancer Agency oncologists, pathologists and other clinicians along with Canada's Michael Smith Genome Sciences Centre with support from BC Cancer Foundation.

The aim of the program is to sequence, analyze and compare the genome of each patient's cancer—the entire DNA and RNA inside tumor cells—in order to understand what is enabling it to grow and to identify less toxic and more effective treatment options.

# Principal component analysis

Thu 06-07-2017
PCA helps you interpret your data, but it will not always find the important patterns.

Principal component analysis (PCA) simplifies the complexity in high-dimensional data by reducing its number of dimensions.

Nature Methods Points of Significance column: Principal component analysis.

To retain trends and patterns in the reduced representation, PCA finds linear combinations of canonical dimensions that maximize the variance of the projection of the data.

PCA is helpful in visualizing high-dimensional data and scatter plots based on 2-dimensional PCA can reveal clusters.
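As a rough illustration of "maximizing the variance of the projection", the sketch below finds the first principal component of a small made-up 2-dimensional data set. It is not the method from the column: it uses power iteration on the covariance matrix (a simple stand-in for a full eigendecomposition), the data and function names are invented, and the starting vector is assumed not to be orthogonal to the leading component.

```python
import math

def covariance(rows):
    """Return the sample covariance matrix and the mean-centered data."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    return cov, centered

def first_component(cov, iterations=200):
    """Power iteration: repeatedly apply the covariance matrix and renormalize."""
    p = len(cov)
    v = [1 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Toy data: y is roughly 2x with a small alternating offset.
rows = [(float(x), 2.0 * x + (0.5 if x % 2 else -0.5)) for x in range(10)]

cov, centered = covariance(rows)
pc1 = first_component(cov)

# Project the centered data onto the component and compare variances.
scores = [sum(c[j] * pc1[j] for j in range(len(pc1))) for c in centered]
projected_variance = sum(s * s for s in scores) / (len(rows) - 1)
total_variance = sum(cov[j][j] for j in range(len(cov)))
explained = projected_variance / total_variance

print("first component:", [round(x, 3) for x in pc1])
print(f"fraction of variance retained: {explained:.4f}")
```

Because the toy points lie close to a line, one component retains nearly all of the variance; data without a dominant linear trend would not compress this well.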

Altman, N. & Krzywinski, M. (2017) Points of Significance: Principal component analysis. Nature Methods 14:641–642.

### Background reading

Altman, N. & Krzywinski, M. (2017) Points of Significance: Clustering. Nature Methods 14:545–546.