How often people speak of art and science as though they were two entirely different things, with no interconnection. An artist is emotional, they think, and uses only his intuition; he sees all at once and has no need of reason. A scientist is cold, they think, and uses only his reason; he argues carefully step by step, and needs no imagination. That is all wrong. The true artist is quite rational as well as imaginative and knows what he is doing; if he does not, his art suffers. The true scientist is quite imaginative as well as rational, and sometimes leaps to solutions where reason can follow only slowly; if he does not, his science suffers. —Isaac Asimov (The Roving Mind)
The video will be posted at vizbi.org.
A poet is, after all, a sort of scientist, but engaged in a qualitative science in which nothing is measurable. He lives with data that cannot be numbered, and his experiments can be done only once. The information in a poem is, by definition, not reproducible. He becomes an equivalent of scientist, in the act of examining and sorting the things popping in [to his head], finding the marks of remote similarity, points of distant relationship, tiny irregularities that indicate that this one is really the same as that one over there only more important. Gauging the fit, he can meticulously place pieces of the universe together, in geometric configurations that are as beautiful and balanced as crystals. —Lewis Thomas (The Medusa and the Snail: More Notes of a Biology Watcher)
If you're asking how to visualize big data, first make sure you're doing a good job on small and medium data. Each scale requires good design.
Also consider that there is a very large number of combinations of data sets, hypotheses and possible patterns. Because of this, you cannot expect to use one way to tell many stories. There is no Holy Grail of big data visualization. But there are many good questions to ask and practices to follow that make up a process which can help you get there.
One of my color tools, the
colorsnap application snaps colors in an image to a set of reference colors and reports their proportion.
Below is Times Square rendered using the colors of the MTA subway lines.
Drugs could be more effective if taken when the genetic proteins they target are most active.
Design tip: rediscover CMYK primaries.
Ruben et al. A database of tissue-specific rhythmically expressed human genes has potential applications in circadian medicine Science Translational Medicine 10 Issue 458, eaat8806.
We focus on the important distinction between confidence intervals, typically used to express uncertainty of a sampling statistic such as the mean and, prediction and tolerance intervals, used to make statements about the next value to be drawn from the population.
Confidence intervals provide coverage of a single point—the population mean—with the assurance that the probability of non-coverage is some acceptable value (e.g. 0.05). On the other hand, prediction and tolerance intervals both give information about typical values from the population and the percentage of the population expected to be in the interval. For example, a tolerance interval can be configured to tell us what fraction of sampled values (e.g. 95%) will fall into an interval some fraction of the time (e.g. 95%).
Altman, N. & Krzywinski, M. (2018) Points of significance: Predicting with confidence and tolerance Nature Methods 15:843–844.
Krzywinski, M. & Altman, N. (2013) Points of significance: Importance of being uncertain. Nature Methods 10:809–810.