
# a rat: exciting

Bioinformatics and Genome Analysis Course. Izmir International Biomedicine and Genome Institute, Izmir, Turkey. May 2–14, 2016

# Alex — Internet's Most Popular Rat

## Poster Rat for Rat Genome Sequencing

The rat genome sequencing project at the Baylor College of Medicine Human Genome Sequencing Centre is complete. The genome has been analyzed and published.

I'd like to introduce you to one of the faces of the project: Alex, the genomics rat idol.

Arguably, Alex is the most popular rat on the internet. For the justification of this strong statement, read on.

Alex, the rat. Rattus norvegicus on an ABI 3700 genome sequencer.

## Alex's Biography

Alex was born in May 2000. It's well known that a rat's cuteness reaches its maximum at about 3-4 weeks. After this critical time, a pet store rat is less likely to be purchased and may be asked to act as snake food. In Alex's case, she was perilously close to her deadline. Luckily for her, we paid a ransom of \$6.99 to the Noah's Ark pet shop in Vancouver. She was on her last cute leg.

Portrait of Alex, the genome rat (Rattus norvegicus). Here, she is seen in a forced portrait position.

From May 2000 Alex spent most of her time hoarding food pellets and riding on shoulders.

Portrait of Alex, the genome rat (Rattus norvegicus). Riding on shoulder.

Alex liked to bite. And rats only bite hard; they don't nibble. Her excuse for this unattractive behaviour was the uncanny similarity between a finger and a pellet of food.

Other than unpredictable bouts of biting (by far the most exciting aspect of her personality), Alex lacked distinguishing characteristics.

Alex died of a seizure in late 2002. She was buried outside of the Museum of Anthropology. A ratty pair of underwear served as a burial shroud.

And I hope you got that last pun.

Photos are for public use. Use, modification and distribution of these photos are unrestricted.

## Alex's Popularity

Despite my best efforts at meaningful work, this web page continues to be the most popular of all my online offerings, making for a somewhat embarrassing achievement.

Alex's images consistently show up first in Google's web search for 'rat', 'rat image' and image search for 'rat'.

Alex's image is the first result for Google's 'rat' search query (retrieved 16 Mar 2013).
Alex's image is the first result for Google's 'rat image' search query (retrieved 16 Mar 2013).

Finally, Alex appears as the first entry in Google images for 'rat'.


## Alex's Public Appearances

Alex is neither without modesty nor public fame. Her first cover-ratgirl appearance was on the April 2004 issue of Genome Research.

Alex the rat appeared on the cover of Genome Research (April 2004).

More recently, she's appeared on the cover of Ethnologie Francaise (Jan-Mar 2009 issue).

Alex the rat appeared on the cover of Ethnologie Francaise (1/2009).

The topic of the issue was the relationship between animals and humans. It is fitting therefore to recount here the relationship I shared with Alex during her sojourn with us.

# Gene Volume Control

Thu 11-06-2015

I was commissioned by Scientific American to create an information graphic based on Figure 9 in the landmark Nature paper "Integrative analysis of 111 reference human epigenomes".

The original figure details the relationships between more than 100 sequenced epigenomes and genetic traits, including diseases like Crohn's and Alzheimer's. These relationships were shown as a heatmap in which each epigenome-trait cell depicted the P value associated with tissue-specific H3K4me1 epigenetic modification in regions of the genome associated with the trait.

Figure 9 from Integrative analysis of 111 reference human epigenomes (Nature (2015) 518:317–330).

As much as I distrust network diagrams, in this case it was the right way to show the data. The network was meticulously laid out by hand to draw attention to the layered groups of diseases and traits.

Network diagram redesign of the heatmap for a select set of traits. Only relationships with –log P > 3.9 are displayed. Appears on the Graphic Science page in the June 2015 issue of Scientific American.

This was my second information graphic for the Graphic Science page. Last year, I illustrated the extent of differences in the gene sequence of humans, Denisovans, chimps and gorillas.

# Sampling distributions and the bootstrap

Thu 11-06-2015

The bootstrap is a computational method that simulates new samples from observed data. These simulated samples can be used to determine how estimates from replicate experiments might be distributed and to answer questions about precision and bias.

Nature Methods Points of Significance column: Sampling distributions and the bootstrap.

We discuss both parametric and non-parametric bootstrap. In the former, observed data are fit to a model and then new samples are drawn using the model. In the latter, no model assumption is made and simulated samples are drawn with replacement from the observed data.
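The two flavours can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the column: the data values are made up, the function names are hypothetical, and the parametric version assumes a normal model, which is just one possible choice.

```python
import random
import statistics


def nonparametric_bootstrap(data, stat, n_boot=2000, seed=0):
    """Resample the observed data with replacement and recompute `stat` each time."""
    rng = random.Random(seed)
    n = len(data)
    return [stat(rng.choices(data, k=n)) for _ in range(n_boot)]


def parametric_bootstrap(data, stat, n_boot=2000, seed=0):
    """Fit a normal model (an assumption) to the data, then sample from the fitted model."""
    rng = random.Random(seed)
    mu = statistics.mean(data)
    sigma = statistics.stdev(data)
    n = len(data)
    return [stat([rng.gauss(mu, sigma) for _ in range(n)]) for _ in range(n_boot)]


# Made-up observations, for illustration only.
observed = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.9, 4.4, 5.5, 4.9]

np_means = nonparametric_bootstrap(observed, statistics.mean)
p_means = parametric_bootstrap(observed, statistics.mean)

# The spread of the bootstrap estimates approximates the standard error of the mean.
print("non-parametric SE:", statistics.stdev(np_means))
print("parametric SE:", statistics.stdev(p_means))
```

Any other statistic (median, variance) can be passed as `stat` in place of `statistics.mean`; the resampling step stays the same.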

Kulesa, A., Krzywinski, M., Blainey, P. & Altman, N. (2015) Points of Significance: Sampling distributions and the bootstrap. Nature Methods 12:477-478.

Krzywinski, M. & Altman, N. (2013) Points of Significance: Importance of being uncertain. Nature Methods 10:809-810.

# Bayesian statistics

Thu 30-04-2015

Building on last month's column about Bayes' Theorem, we introduce Bayesian inference and contrast it to frequentist inference.

Given a hypothesis and a model, the frequentist calculates the probability of different data generated by the model, P(data|model). When the probability of obtaining the observed data from the model is small (e.g. below alpha = 0.05), the frequentist rejects the hypothesis.
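As a rough illustration of the frequentist calculation, consider a made-up coin example (not taken from the column): the model is a fair coin, the data are 16 heads in 20 flips, and the probability of data at least this extreme under the model is compared to alpha = 0.05. The function name and the numbers are assumptions chosen for the sketch, and the one-sided tail is just one common convention.

```python
from math import comb


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p): probability of k or more successes."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_flips, n_heads = 20, 16   # hypothetical observed data
alpha = 0.05                # significance threshold

# Probability, under the fair-coin model, of data at least as extreme as observed.
p_value = binomial_tail(n_flips, n_heads, 0.5)
reject = p_value < alpha

print(f"P value = {p_value:.4f}, reject fair-coin hypothesis: {reject}")
```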

Nature Methods Points of Significance column: Bayesian Statistics.

In contrast, the Bayesian makes direct probability statements about the model by calculating P(model|data), the probability that the model is correct given the observed data. With this approach it is possible to compare the probabilities of different models and identify the one that is most compatible with the data.
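A minimal sketch of this model comparison, again with a made-up coin example: two candidate models (a fair coin and a coin with an assumed heads probability of 0.8), equal prior belief in each, and 16 heads observed in 20 flips. The model names, priors and data are all hypothetical; only the use of Bayes' theorem to turn P(data|model) into P(model|data) comes from the text.

```python
from math import comb


def binomial_likelihood(n, k, p):
    """P(data | model): probability of exactly k heads in n flips with heads probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def posterior(priors, likelihoods):
    """Bayes' theorem over a finite set of models: P(model | data) for each model."""
    joint = {m: priors[m] * likelihoods[m] for m in priors}
    evidence = sum(joint.values())  # P(data), summed over all candidate models
    return {m: joint[m] / evidence for m in joint}


n_flips, n_heads = 20, 16                # hypothetical observed data
models = {"fair": 0.5, "biased": 0.8}    # hypothetical candidate models
priors = {"fair": 0.5, "biased": 0.5}    # assumed equal prior belief

likelihoods = {m: binomial_likelihood(n_flips, n_heads, p) for m, p in models.items()}
post = posterior(priors, likelihoods)

for name, prob in post.items():
    print(f"P({name} | data) = {prob:.3f}")
```

Changing the priors shifts the posterior accordingly, which is the sense in which the Bayesian result depends on prior belief as well as on the data.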

The Bayesian approach is actually more intuitive. From the frequentist point of view, the probability used to assess the veracity of a hypothesis, P(data|model), commonly referred to as the P value, does not help us determine the probability that the model is correct. In fact, the P value is commonly misinterpreted as the probability that the hypothesis is right. This is the so-called "prosecutor's fallacy", which confuses the two conditional probabilities P(data|model) and P(model|data). It is the latter quantity that is more directly useful, and it is the one calculated by the Bayesian.

Puga, J.L., Krzywinski, M. & Altman, N. (2015) Points of Significance: Bayes' Theorem. Nature Methods 12:277-278.