Martin Krzywinski / Canada's Michael Smith Genome Sciences Centre / Martin Krzywinski / Canada's Michael Smith Genome Sciences Centre / - contact me Martin Krzywinski / Canada's Michael Smith Genome Sciences Centre / on Twitter Martin Krzywinski / Canada's Michael Smith Genome Sciences Centre / - Lumondo Photography Martin Krzywinski / Canada's Michael Smith Genome Sciences Centre / - Pi Art Martin Krzywinski / Canada's Michael Smith Genome Sciences Centre / - Hilbertonians - Creatures on the Hilbert CurveMartin Krzywinski / Canada's Michael Smith Genome Sciences Centre / - Pi Day 2020 - Piku
Trance opera—Spente le Stellebe dramaticmore quotes

aleph: large

Scientific graphical abstracts — design guidelines

data visualization + art

Aleph 2 Music video from Max Cooper's Yearning for the Infinite album. Music by Max Cooper. Animation by Martin Krzywinski.
Yearning for the Infinite - Max Cooper and Martin Krzywinski - Visualization of Infinity and Pi
Telling a story about infinity is tricky.
Especially a short one.

Infinity in Just Over Six Minutes

To see a World in a grain of sand,
And Heaven in a wild flower,
Hold Infinity in the palm of your hand
And Eternity in an hour.
—William Blake

Yearning for the Infinite - Max Cooper and Martin Krzywinski - Visualization of Infinity and Pi / Martin Krzywinski @MKrzywinski
Max Cooper at the London Barbican Hall performing Yearning for the Infinite. (Michal Augustini) (Watch Barbican Hall performance)

making of the video

The video was created with a custom-made kinetic typography system that emulates a low-fi terminal display. The system is controlled by a plain-text configuration file that defines scenes and timings—there is no GUI. There is also no post-processing of any kind, such as After Effects. Everything that you see in the final video was generated programatically. The code was written over a period of about a month.

This page describes this system's design in detail. Fair warning: it gets into the weeds quite a bit.

video length and format

The original music score for Aleph 2 is 6 minutes and 34 seconds in length.

The score tempo is 118 bpm (1.967 beats per second, 0.082 beats per frame). There are 194.1 measures in the video (29.5 measures per minute, 0.492 measures per second, 0.020 measures per frame).

The video format is 24 fps, so the track comprises 9,473 frames (1440 frames per minute, 48.814 frames per measure, 12.203 frames per beat, 3.051 frames per 16th note).

The typography uses the Classic Console font, expanded by me to include set theory characters such as `\aleph`, `\mathbb{N}`, `\mathbb{R}`, `\notin`, `\varnothing` and so on.

The video is 16:9 and each frame is rendered at 1,920 × 1,080. Text is set on a grid of 192 columns and 83 rows (maximum of 15,936 characters per frame).


The entire video is first initialized as an `(x,y,z)` matrix of size 192 × 83 × 9,473 (150,961,728 elements). The `z` dimension is the time dimension and each matrix slice, for example `(x,y,1)`, corresponds to the first frame.

The video is then built up from a series of scenes. Briefly, this process is single-threaded and takes about 30 minutes and during this time the matrix is populated with text characters. As the scenes are built up, each element in the matrix stays blank or has a colored character assigned to it. Periodically, effects are added such as random glitches. All elements are synchronized to the tempo of the score and transitions can be triggered from drum score midi files. For example, in parts, the background for each frame flashes to the kick and snare. I get into the detail of this below.

Once the matrix has been fully populated, each frame is output to a PNG file (e.g. 000000.png ... 009472.png). Below you can see 10 frames in the range 2490–2499 from the Bijection 2 section of the video at 1:42.

Yearning for the Infinite - Max Cooper and Martin Krzywinski - Visualization of Infinity and Pi / Martin Krzywinski @MKrzywinski
Frame 2490 of Aleph 2. (zoom)
Yearning for the Infinite - Max Cooper and Martin Krzywinski - Visualization of Infinity and Pi / Martin Krzywinski @MKrzywinski
Frame 2491 of Aleph 2. (zoom)
Yearning for the Infinite - Max Cooper and Martin Krzywinski - Visualization of Infinity and Pi / Martin Krzywinski @MKrzywinski
Frame 2492 of Aleph 2. (zoom)
Yearning for the Infinite - Max Cooper and Martin Krzywinski - Visualization of Infinity and Pi / Martin Krzywinski @MKrzywinski
Frame 2493 of Aleph 2. (zoom)
Yearning for the Infinite - Max Cooper and Martin Krzywinski - Visualization of Infinity and Pi / Martin Krzywinski @MKrzywinski
Frame 2494 of Aleph 2. (zoom)
Yearning for the Infinite - Max Cooper and Martin Krzywinski - Visualization of Infinity and Pi / Martin Krzywinski @MKrzywinski
Frame 2495 of Aleph 2. (zoom)
Yearning for the Infinite - Max Cooper and Martin Krzywinski - Visualization of Infinity and Pi / Martin Krzywinski @MKrzywinski
Frame 2496 of Aleph 2. (zoom)
Yearning for the Infinite - Max Cooper and Martin Krzywinski - Visualization of Infinity and Pi / Martin Krzywinski @MKrzywinski
Frame 2497 of Aleph 2. (zoom)
Yearning for the Infinite - Max Cooper and Martin Krzywinski - Visualization of Infinity and Pi / Martin Krzywinski @MKrzywinski
Frame 2498 of Aleph 2. (zoom)
Yearning for the Infinite - Max Cooper and Martin Krzywinski - Visualization of Infinity and Pi / Martin Krzywinski @MKrzywinski
Frame 2499 of Aleph 2. (zoom)

The frames were then stitched into a PNG movie in 24-bit RGB color space using ffmpeg.

ffmpeg -thread_queue_size 32 -r 24 -y -i "frames/aleph/%06d.png" $offset $seek -i wav/aleph.wav -pix_fmt rgb24 -avoid_negative_ts make_zero -shortest -r 24 -c:v png -f mov

The .mov file is then converted into DXV format used by the Resolume VJ software that Max Cooper uses in his shows to control the displays.

If you watch the video on YouTube, please know that the YouTube temporal and chroma compression greatly reduces the quality of the original 24-bit RGB master, which was used for the Barbican Hall performance. The YouTube compression bleaches out the vibrant red, which really pops out in the master version, and blurs frames during fast strobing, which neuters parts of the video that are designed to be overwhelming, such as the drop transition to 24 fps strobing during the natural number power set.

why a custom system?

Here is Max's original direction for the video's narrative and art style.

I want to show ever growing lists of numbers, then split the lists somehow (left/right of screen) and show how you can pair subsets of Aleph 0 / integers with themselves, then show how Cantor's diagonal argument can be used to pair the fractions with the natural numbers, then show his other diagonal argument for proving the reals are greater than Aleph 0, then show the process (very roughly and with maximum artistic licence if necessary) of taking the power set of infinite sets to create larger infinities to get up to Aleph 1, Aleph 2, and maybe further if the system allows. In the end it should just be complete text/number chaos on screen along with the intense chaos of the audio.

We'd have to be very careful to avoid any Matrix reference visually! ... but that low-fi / command line style would be suitable I think.

In terms of animation, the technical requirements were relatively simple. Everything would be rendered with a fixed-width font with no anti-aliasing and the color palette would be very limited (e.g. black, grey, white and a red for emphasis). Nothing other than characters would be drawn (no lines, circles or other geometrical shapes) and there would be no gradients. Basically, very lo-fi and 8-bit.

Despite the fact that I've never made any kind of animation before, I thought that these requirements could be relatively easy to achieve. After all, if worse came to worst, I told myself that I could always generate the video frame-by-frame.

However, it quickly became obvious that traditional keyframe systems could not be easily used to tell our story. Tools like Adobe After effects rely on interpolation between keyframes. But in our video every frame is essentially a keyframe. This meant that any kind of interpolation between scene points would have to be programmed—while After Effects makes it easy to move things around on the screen, it requires scripting to generate content based on lists of numbers, set elements, and so on. And since I have no experience in After Effects, I thought it would be faster to code my own system than to learn After Effects' expression language to (possibly) later discover that what I wanted to do was either hard or practically impossible for me to achieve within our time frame (a month).

I knew that I was reasonably good at prototyping and generating custom visualizations, so it felt safer to create something from scratch.

The final version of the system, which is very much a prototype, is about 6,000 lines of Perl. The Aleph 2 video is built out from about 2,000 lines of plain-text configuration that defines scenes, timings and effects.

system architecture

It took about a week to figure out how to design the system. As we were building out the video, I iterated between creating the story and creating the system to tell the story. This was iterative and felt very much like trying to simultaneously building and flying a plane.

At times, the entire process would crash down on me because some tiny tweak fundamentally changed how everything worked.

pesky timing notation

For example, one extremely nagging aspect of the code that I patched only half-way through the entire process had to do with how time was specified. From the start, I made use of measure:beat:16note notation, such as for scene starts and ends (e.g. 2:2 to 4:1 meant a scene started on measure 2 beat 2 and stopped at measure 4 beat 1).

This notation used 1-indexing (e.g. 1:1:1 is the first 16th note of the score). 0-indexing would have been unintuitive because musically one counts beats as 1, 2, 3, 4 and not 0, 1, 2, 3. Importantly, I wanted the way we referred to timings in conversation (e.g. on the "and of 2" of the 4th measure) to be directly reflected in the code.

All this made sense until I needed a notation to express duration. When I started using 1:1 to indicate a duration of 1 measure and 1 beat, I had to reconcile the difference between 1:1 as a point in time and as a duration—the former specified the beginning of the interval (e.g. frame=0) and the latter the end (frame=61). It also took me forever to decide whether the duration of 5 beats should be expressed as 1:2 (e.g. end is start of beat 2) or 1:1 (e.g. end is end of beat 1).

The fact that the video frame rate is 24 fps made things even more complicated. At this frame rate, there are 3.051 frames per 16th note. This meant that any duration of 1 frame (which happens during fast strobing) couldn't be expressed by the integer notation of measure:note:16note. I didn't want to have to write 0:0:0.3278, which seemed a tedious way of saying "1 frame". Furthermore, because frames didn't neatly match up to 16th note boundaries, quite a lot of time is spent checking that timing definitions don't suffer from rounding issues.

In the final video, you'll see the measure timer in the upper right corner. The : between the measure and beat flashes as a red + on the beat (118 bpm). Next to the measure timestamp you'll see a min:sec:frame timestamp and hexadecimal readout of the frame number.

definition of a scene

The video is composed of a series of scenes, each with a specific start and end time, as well as other parameters. The start and end of scenes was typically informed by changes in the music, such as breaks.

A scene is a short clip of the video in which similar content is being shown. For example, at the start where we count the natural numbers, each filling scene is a scene. Below are the timing definitions for the first three screens of natural counts.

name       = nc1
timing     = count1 
ease       = 0.6    
ease_step  = 0      
start      = 4m     # starts at measure 4
len        = 13m    # lasts for 13 measures
k_start    = 1      # start counting at 1

name       = nc2
timing     = nc_timing
ease       = 0.8
start      = 17m    # starts at measure 17
len        = 6m     # lasts for 6 measures

name       = nc3
timing     = nc_timing
ease       = 0.7
start      = 23m
len        = 2m

For each scene, I generate the numbers that will appear in the scene and then distribute their time of appearance across the duration of the scene, subject to some easing.

Some scenes are more complicated and consist of several screens worth of content. Once the screen is filled, it's cleared and then more content is shown. For example, in the bijection section the natural numbers down the center take 2 measures to fill at first but next time the fill takes less time at 2 * 0.8 measures. We keep multiplying the fill time by 0.8 until we reach a minimum of 6 frames.

<timing bijection-n>
time_to_fill_start = 2m
time_to_fill_mult  = 0.8
time_to_fill_end   = 6
time_pause_start   = 2
time_blank_start   = 2


A lot of the effects appear as set theory characters that display on the screen briefly in sync with the music.

text decay

Through the video, there are a lot of transitions and fades that look like text is decaying.

For example, a 9 might decay into an 8 then a 7 and all the way down to 0 and then be wiped from the screen. At other times, a 9 might decay by randomly selecting the next character from a set of lists, such as


So a 9 might decay as 9%]+ across some number of frames, depending on whether the decay is fast or slow.

The timing and speed of this decay process took a long time to tweak. There are many parts in the video where the decay speeds up, or slows down and many times the decay is triggered by percussion. For example, the snare can be made to initiate some characters on the screen to rapidly decay.

The decay is processed as an effect after the basic scene is laid out. During this process, we can look ahead in time to know what is coming at a given screen position and terminate the decay if, for example, a new character appears. Alternatively, the decay can bleed into the next scene, if the rate at which it happens is slow and crosses scene boundaries.

glitches and blips

Percussion, particularly the snare, are used to trigger characters to flash and decay.

For example, between measures 52 and 65, we flash one of the characters from the set braces = {, }. The braces appear only where characters are already on the screen (writeonchar = 1) with a probability of flash_rate = 0.1 at any given character position.

Immediately upon being drawn, these braces decay (see below) into the list of characters defined by decay_set = setrapid.

name        = snare
action      = midi_trigger
position    = full_screen
midi_sets   = snare
midi_action = cascade_red_star_subtle
start       = 52m
end         = 65m

<midi_action cascade_red_braces_subtle>
action = char_flash(flash_char_set=>"braces",writeonchar=>1,

syncing to the music

Because the drum component of the video was a live studio recording, I generated a midi file for the percussions by demarcating the wave form of their sound. This transcription process took a while and care had to be taken to identify the instruments correctly (e.g. snare vs rim shot) to have a list of events that matched the volume.

Once I had a list of all the percussion events, I could create effects that would appear to coincide with these times. For example, every time a snare was hit an effect could trigger 50% of the time.

By far, the bulk of the time in making this video was spent on tweaking the density, duration and decay of effects. Towards the end, between measures 160 and 167, when powersets of naturals and reals are being shown many effects are stacked. For example, the kick drum is used to decay and fade the natural powersets on the left of the screen while the snare does the same for the real powersets on the right.

In modern animation systems you would have an interface that could help in defining timings and effect curves, because my system is controlled by plain text inputs, of the kind that you see above, each set of effects was lovingly tuned by hand by adjusting measure starts and ends and rates in the configuration file.

These configuration files got to be quite long. For example, here are the scene definitions (storyboard.v3.conf), effect scenes (effects.conf) and the midi and decay timings (aleph.conf).

I don't expect that these will make total sense to anyone and post them here to give you more of a flavour of what is happening under the hood.


news + thoughts

Music for the Moon: Flunk's 'Down Here / Moon Above'

Sat 29-05-2021

The Sanctuary Project is a Lunar vault of science and art. It includes two fully sequenced human genomes, sequenced and assembled by us at Canada's Michael Smith Genome Sciences Centre.

The first disc includes a song composed by Flunk for the (eventual) trip to the Moon.

But how do you send sound to space? I describe the inspiration, process and art behind the work.

Martin Krzywinski @MKrzywinski
The song 'Down Here / Moon Above' from Flunk's new album History of Everything Ever is our song for space. It appears on the Sanctuary genome discs, which aim to send two fully sequenced human genomes to the Moon. (more)

Browse the genome discs.

Happy 2021 `\pi` Day—
A forest of digits

Sun 14-03-2021

Celebrate `\pi` Day (March 14th) and finally see the digits through the forest.

Martin Krzywinski @MKrzywinski
The 26th tree in the digit forest of `\pi`. Why is there a flower on the ground?. (details)

This year is full of botanical whimsy. A Lindenmayer system forest – deterministic but always changing. Feel free to stop and pick the flowers from the ground.

Martin Krzywinski @MKrzywinski
The first 46 digits of `\pi` in 8 trees. There are so many more. (details)

And things can get crazy in the forest.

Martin Krzywinski @MKrzywinski
A forest of the digits of '\pi`, by ecosystem. (details)

Check out art from previous years: 2013 `\pi` Day and 2014 `\pi` Day, 2015 `\pi` Day, 2016 `\pi` Day, 2017 `\pi` Day, 2018 `\pi` Day and 2019 `\pi` Day.

Testing for rare conditions

Sun 30-05-2021

All that glitters is not gold. —W. Shakespeare

The sensitivity and specificity of a test do not necessarily correspond to its error rate. This becomes critically important when testing for a rare condition — a test with 99% sensitivity and specificity has an even chance of being wrong when the condition prevalence is 1%.

We discuss the positive predictive value (PPV) and how practices such as screen can increase it.

Martin Krzywinski @MKrzywinski
Nature Methods Points of Significance column: Testing for rare conditions. (read)

Altman, N. & Krzywinski, M. (2021) Points of significance: Testing for rare conditions. Nature Methods 18:224–225.

Standardization fallacy

Tue 09-02-2021

We demand rigidly defined areas of doubt and uncertainty! —D. Adams

A popular notion about experiments is that it's good to keep variability in subjects low to limit the influence of confounding factors. This is called standardization.

Unfortunately, although standardization increases power, it can induce unrealistically low variability and lead to results that do not generalize to the population of interest. And, in fact, may be irreproducible.

Martin Krzywinski @MKrzywinski
Nature Methods Points of Significance column: Standardization fallacy. (read)

Not paying attention to these details and thinking (or hoping) that standardization is always good is the "standardization fallacy". In this column, we look at how standardization can be balanced with heterogenization to avoid this thorny issue.

Voelkl, B., Würbel, H., Krzywinski, M. & Altman, N. (2021) Points of significance: Standardization fallacy. Nature Methods 18:5–6.

Graphical Abstract Design Guidelines

Fri 13-11-2020

Clear, concise, legible and compelling.

Making a scientific graphical abstract? Refer to my practical design guidelines and redesign examples to improve organization, design and clarity of your graphical abstracts.

Martin Krzywinski @MKrzywinski
Graphical Abstract Design Guidelines — Clear, concise, legible and compelling.

"This data might give you a migrane"

Tue 06-10-2020

An in-depth look at my process of reacting to a bad figure — how I design a poster and tell data stories.

Martin Krzywinski @MKrzywinski
A poster of high BMI and obesity prevalence for 185 countries.