Art is science in love.
— E.F. Weisslitz
Science cannot move forward without storytelling. While we learn about the world and its patterns through science, it is through stories that we can organize and sort through the observations and conclusions that drive the generation of scientific hypotheses.
With Alberto Cairo, I've written about the importance of storytelling as a tool to explain and narrate in Storytelling (2013) Nat. Methods 10:687. There we suggest that instead of "explain, not merely show," you should seek to "narrate, not merely explain."
Our account received support (Should scientists tell stories. (2013) Nat. Methods 10:1037) but not from all (Against storytelling of scientific results. (2013) Nat. Methods 10:1045).
A good science story must present facts and conclusions within a hierarchy—a bag of unsorted observations isn't likely to engage your readers. But while a story must always inform, it should also delight (as much as possible), and inspire. It should make the complexity of the problem accessible—or, at least, approachable—without simplifications that preclude insight into how concepts connect (they always do).
Just like science, explaining science is a process—one that can be more vexing than the science itself!
In science one tries to tell people, in such a way as to be understood by everyone, something that no one ever knew before. But in poetry, it’s the exact opposite.
—Paul Dirac, Mathematical Circles Adieu by H. Eves [quoted]
I have previously written about the process of taking a scientific statement (Creating Scientific American Graphic Science graphics) and turning it into a data visualization or, more broadly, visual story.
The creation of one of these visual stories is itself a story: about how the genome is not a blueprint; about the discovery of Hilbertonians, which are creatures that live on the Hilbert curve; about how algorithms for protein folding can be used to generate art based on the digits of `\pi`; or about how we can make human genome art by humans with genomes. I've also written about my design process in creating the cover for Genome Research and the cover of PNAS. As always, not everything works out all the time, so read about the EMBO Journal covers that never made it.
Here, I'd like to walk you through the process and sketches of creating a story based on the idea of differences in data and how the story can be used to understand the function of cells and disease.
The visual story is a creative collaboration with Becton Dickinson and The Linus Group, and its creation began with the concept of differences. The art was on display at the AGBT 2017 conference and accompanies BD's launch of the Resolve platform and "Difference of One in Genomics".
Starting with the idea of the "difference of one", our goal was to create artistic representations of data sets generated using the BD Resolve platform, which generates single-cell transcriptomes, that captured a variety of differences that are relevant in genomics research.
The data art pieces were installed in a gallery style, with data visualization and artistic expression in equal parts.
The art itself is an old-school take on virtual reality. Unlike modern VR, which isolates the participants from one another, we chose a low-tech route that brings the audience closer not only to the data but also to each other.
The data were generated using the BD Resolve single-cell transcriptomics platform. For each of the three art pieces, we identified a data set that captured a variety of differences.
The real surprise and insight are in the differences that ultimately advance our thinking (Data visualization: ambiguity as a fellow traveller. (2013) Nat. Methods 10:613-615).
Figuring out which differences are of this kind requires that instead of "What's new?" we ask "What's different?"
I've previously taken a more fine-art approach to cover design, such as for those of Nature, Genome Research and Trends in Genetics. I've used microscopy images to create a cover for PNAS (the one that made biology look like astrophysics) and thought that this is the kind of material I'd start with for the MCS cover.
A map of the nearby superclusters and voids in the Universe.
By "nearby" I mean within 6,000 million light-years.
It was now time to design my first ... pair of socks.
In collaboration with Flux Socks, the design features the colors and relative thicknesses of Rogue Olympic weightlifting plates. The first four plates in the stack are the 55, 45, 35 and 25 lb competition plates. The top four plates are the 10, 5, 2.5 and 1.25 lb change plates.
The perceived weight of each sock is 178.75 lb and 357.5 lb for the pair.
The actual weight is much less.
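The perceived weight is simply the sum of the eight plates listed above. A minimal check of that arithmetic, with variable names chosen here for illustration:

```python
# Plate weights in pounds, as listed in the sock design description.
competition_plates_lb = [55, 45, 35, 25]
change_plates_lb = [10, 5, 2.5, 1.25]

# One sock carries one full stack; a pair carries two.
per_sock_lb = sum(competition_plates_lb) + sum(change_plates_lb)
pair_lb = 2 * per_sock_lb

print(f"per sock: {per_sock_lb} lb, pair: {pair_lb} lb")
```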
Find patterns behind gene expression and disease.
Expression, correlation and network module membership of 11,000+ genes and 5 psychiatric disorders in about 6" x 7" on a single page.
Design tip: Stay calm.
Gandal M.J. et al. Shared Molecular Neuropathology Across Major Psychiatric Disorders Parallels Polygenic Overlap. (2018) Science 359:693–697
We discuss the many ways in which analysis can be confounded when data has a large number of dimensions (variables). Collectively, these are called the "curses of dimensionality".
Some of these are unintuitive, such as the fact that the volume of the unit-radius hypersphere increases up to 5 dimensions and then shrinks toward zero, while the volume of the hypercube that encloses it always increases. This means that high-dimensional space is "mostly corners" and the distance between points increases greatly with dimension. This has consequences for correlation and classification.
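The hypersphere behavior can be checked numerically. The sketch below assumes a hypersphere of radius 1 and a hypercube of side 2 that just encloses it; these sizes and the function names are assumptions made for illustration, not taken from the column. It uses the standard closed form for the volume of a `d`-dimensional ball, `\pi^{d/2} / \Gamma(d/2 + 1)`. Under these assumptions the volume is largest at 5 dimensions (the surface area, by comparison, is largest at 7), and the fraction of the cube occupied by the ball collapses quickly, which is one way to read "mostly corners".

```python
import math


def unit_ball_volume(d: int) -> float:
    """Volume of the d-dimensional ball of radius 1: pi^(d/2) / Gamma(d/2 + 1)."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)


def enclosing_cube_volume(d: int) -> float:
    """Volume of the side-2 cube that just contains the unit ball."""
    return 2.0 ** d


def peak_dimension(max_d: int = 20) -> int:
    """Dimension (1..max_d) at which the unit ball's volume is largest."""
    return max(range(1, max_d + 1), key=unit_ball_volume)


if __name__ == "__main__":
    for d in (1, 2, 3, 5, 7, 10, 20):
        ball = unit_ball_volume(d)
        cube = enclosing_cube_volume(d)
        # The ratio is the fraction of the cube that lies inside the ball.
        print(f"d={d:2d}  ball={ball:8.4f}  cube={cube:9.0f}  ball/cube={ball / cube:.2e}")
    print("volume peaks at d =", peak_dimension())
```

Changing the radius moves the peak, so the exact dimension at which the volume starts to shrink depends on the unit-radius assumption.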