
# design: beautiful

EMBO Practical Course: Bioinformatics and Genome Analysis, 5–17 June 2017.

# Visual Design Principles—Communicating Effectively

This talk happened on Thursday, Mar 21st 2013 at VIZBI 2013 at the Broad Institute in Boston.

How often people speak of art and science as though they were two entirely different things, with no interconnection. An artist is emotional, they think, and uses only his intuition; he sees all at once and has no need of reason. A scientist is cold, they think, and uses only his reason; he argues carefully step by step, and needs no imagination. That is all wrong. The true artist is quite rational as well as imaginative and knows what he is doing; if he does not, his art suffers. The true scientist is quite imaginative as well as rational, and sometimes leaps to solutions where reason can follow only slowly; if he does not, his science suffers. —Isaac Asimov (The Roving Mind)

For more visualization and design resources, see my VIZBI 2012 tutorials, Nature Methods Points of View column, and rant about colors.

Do not allow encoding or other design choices to hijack your message. Each element on the page must meaningfully contribute to your figure.

## presentation video

The video will be posted at vizbi.org.

## presentation slides

Slides are available as PDF and Keynote (zipped).


A poet is, after all, a sort of scientist, but engaged in a qualitative science in which nothing is measurable. He lives with data that cannot be numbered, and his experiments can be done only once. The information in a poem is, by definition, not reproducible. He becomes an equivalent of scientist, in the act of examining and sorting the things popping in [to his head], finding the marks of remote similarity, points of distant relationship, tiny irregularities that indicate that this one is really the same as that one over there only more important. Gauging the fit, he can meticulously place pieces of the universe together, in geometric configurations that are as beautiful and balanced as crystals. —Lewis Thomas (The Medusa and the Snail: More Notes of a Biology Watcher)

## breakout session—making good figures

Sketch notes by the inimitable Francis Rowland from our breakout group. The question was: what do you need to make good figures? (PDF)

## small, medium and big data visualization

If you're asking how to visualize big data, first make sure you're doing a good job on small and medium data. Each scale requires good design.

Do not expect to use one way to tell many stories

Also consider that there is a very large number of combinations of data sets, hypotheses and possible patterns. Because of this, you cannot expect to use one way to tell many stories. There is no Holy Grail of big data visualization. But there are many good questions to ask and practices to follow that make up a process which can help you get there.

Medium data visualization. This is what happens when you show the data (a strategy that works for small data), instead of explaining it. Yup, we need to work on this too. (A) Qi X et al. J Biotech 144:43 (2012) (Saturation-Mutagenesis in Two Positions Distant from Active Site of a Klebsiella pneumoniae Glycerol Dehydratase Identifies Some Highly Active Mutants) (B) Alekseyev, M.A. et al. Genome Res 19:943 (2009) (Breakpoint graphs and ancestral genome reconstructions)
Big data visualization. Yes, data sets are growing but our visual and cognitive abilities are not. There are many data sets, each requiring its own approach. You cannot use one way to tell many stories. Lewis SN et al. PLoS ONE 6:e27175 (2011) (Prediction of Disease and Phenotype Associations from Genome-Wide Association Studies)

# $k$ index: a weightlifting and CrossFit performance measure

Wed 07-06-2017

Similar to the $h$ index in publishing, the $k$ index is a measure of fitness performance.

To achieve a $k$ index for a movement you must perform $k$ unbroken reps at $k$% 1RM.

The expected value for the $k$ index is probably somewhere in the range of $k = 26$ to $k=35$, with higher values progressively more difficult to achieve.
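The definition above can be sketched in a few lines of Python. This is only an illustration: it assumes, by analogy with the $h$ index, that the $k$ index is the largest $k$ achieved so far, and the training-log format (a dictionary mapping an integer percentage of 1RM to the most unbroken reps done at that load) and the function names `k_index` and `load_for_k` are hypothetical.

```python
def k_index(max_unbroken_reps):
    """Return the largest k for which k unbroken reps were logged at k% of 1RM.

    max_unbroken_reps: dict mapping an integer percentage of 1RM to the most
    unbroken reps performed at that load (hypothetical log format).
    Returns 0 if no logged set qualifies.
    """
    best = 0
    for percent, reps in max_unbroken_reps.items():
        # A set at k% counts toward the index only if at least k reps were done.
        if reps >= percent:
            best = max(best, percent)
    return best


def load_for_k(one_rep_max, k):
    """Return the load (same units as one_rep_max) to lift for an attempt at k."""
    return one_rep_max * k / 100


# Hypothetical log: 40 reps at 25%, 31 reps at 30%, 20 reps at 35% of 1RM.
log = {25: 40, 30: 31, 35: 20}
print(k_index(log))          # sets at 25% and 30% qualify, the one at 35% does not
print(load_for_k(100.0, 30)) # load for a k = 30 attempt with a 100 kg 1RM
```

With a 1RM of 100 kg, for example, an attempt at $k = 30$ means 30 unbroken reps at 30 kg.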

In my $k$ index introduction article I provide a detailed explanation, a rep scheme table and a WOD example.

# Dark Matter of the English Language—the unwords

Wed 07-06-2017

I've applied the char-rnn recurrent neural network to generate new words, names of drugs and countries.

The effect is intriguing and facetious—yes, those are real words.

But these are not: necronology, abobionalism, gabdologist, and nonerify.

These places only exist in the mind: Conchar and Pobacia, Hzuuland, New Kain, Rabibus and Megee Islands, Sentip and Sitina, Sinistan and Urzenia.

And these are the imaginary afflictions of the imagination: ictophobia, myconomascophobia, and talmatomania.

And these, of the body: ophalosis, icabulosis, mediatopathy and bellotalgia.

Want to name your baby? Or someone else's baby? Try Ginavietta Xilly Anganelel or Ferandulde Hommanloco Kictortick.

When taking new therapeutics, never mix salivac and labromine. And don't forget that abadarone is best taken on an empty stomach.

And nothing increases the chance of getting that grant funded more than proposing the study of a new –ome! We really need someone to look into the femome and manome.

# Dark Matter of the Genome—the nullomers

Wed 31-05-2017

An exploration of things that are missing in the human genome. The nullomers.

Julia Herold, Stefan Kurtz and Robert Giegerich. Efficient computation of absent words in genomic sequences. BMC Bioinformatics (2008) 9:167
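To make the idea of a missing word concrete, here is a minimal brute-force sketch that lists every word of length $k$ that never occurs in a sequence. It is not the algorithm from the paper cited above, which is designed to be efficient on whole genomes. The function name `absent_words` and the toy sequence are made up for illustration, and the sketch ignores the reverse-complement strand.

```python
from itertools import product


def absent_words(sequence, k, alphabet="ACGT"):
    """Return all length-k words over `alphabet` that never occur in `sequence`.

    Brute force: collect every k-mer that is present, then enumerate all
    len(alphabet)**k possible words and keep the ones that were not seen.
    Only practical for small k.
    """
    sequence = sequence.upper()
    # Every window of length k that actually appears in the sequence.
    present = {sequence[i:i + k] for i in range(len(sequence) - k + 1)}
    # All possible words, in the order induced by `alphabet`.
    all_words = ("".join(letters) for letters in product(alphabet, repeat=k))
    return [word for word in all_words if word not in present]


# Toy example: 7 of the 16 possible 2-mers occur, so 9 are absent.
print(absent_words("AACCGGTT", 2))
```

The number of candidate words grows as $4^k$, so this enumeration quickly becomes impractical as $k$ grows.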

# Clustering

Wed 31-05-2017

Clustering finds patterns in data—whether they are there or not.

We've already seen how data can be grouped into classes in our series on classifiers. In this column, we look at how data can be grouped by similarity in an unsupervised way.

Nature Methods Points of Significance column: Clustering. (read)

We look at two common clustering approaches: $k$-means and hierarchical clustering. All clustering methods share the same approach: they first calculate similarity and then use it to group objects into clusters. The details of the methods, and outputs, vary widely.
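The two steps, calculating similarity and then grouping, can be seen in a minimal $k$-means sketch. This is an illustration only, not the code behind the column: it assumes Euclidean distance as the (dis)similarity measure, takes the starting centroids as an argument so the result is deterministic, and the function name `kmeans` and the toy points are hypothetical.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Cluster `points` (tuples of numbers) around the given starting centroids.

    Returns (labels, centroids), where labels[i] is the index of the centroid
    nearest to points[i]. Stops early once the centroids no longer move.
    """
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Step 1: similarity. Assign each point to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Step 2: grouping. Move each centroid to the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(old)  # keep a centroid that has no members
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


points = [(0, 0), (0, 1), (10, 10), (10, 11)]
labels, centers = kmeans(points, centroids=[(0, 0), (10, 10)])
print(labels, centers)
```

Note that the procedure always returns as many groups as there are starting centroids, whether or not the data contain that many distinct clusters.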

Altman, N. & Krzywinski, M. (2017) Points of Significance: Clustering. Nature Methods 14:545–546.