Martin Krzywinski / Genome Sciences Center / mkweb.bcgsc.ca

science: fun



More than Pretty Pictures—Aesthetics of Data Representation, Denmark, April 13–16, 2015


The distinctive Perl camel is (c) O'Reilly
Perl Workshop Home Page
Home of the Bioinformatics Perl Workshop

course 1.0.1.8

Level: beginner
1.0.1.8 | beginner | 8 sessions
This eight-session course will introduce you to basic Perl. Topics will include scalars, arrays, hashes, context, control structures, functions, reading and writing files, and Perl idioms. We'll also touch upon modules and the Comprehensive Perl Archive Network (CPAN).

legend

course code

cat.course.level.sessions.session

e.g. 1.0.1.8

categories

0 | introduction and orientation

1 | perl fundamentals

2 | shell and prompt tools

3 | web development

4 | CPAN Modules

5 | Ruby

levels

0 | all

1 | beginner

2 | intermediate

3 | advanced

[ use while/until and if/unless to draw attention to positive/negative conditions ]
1.0.1.8 Introduction to Perl

course home

As a course designed for absolute beginners, this will be your first formal introduction to Perl. We'll ease you into Perl by covering fundamentals like variables (scalars, arrays, hashes), control structures (for, while, do), and functions. You'll see how to read and write data to and from files and create scripts to help you automate your work. There will be readings from the "Learning Perl" book for each session and assignments which will help you practise and gain experience.


The camel is the distinctive Perl mascot.

other in this category

1.1.2.8 | Intermediate Perl

1.2.2.1 | Effective use of map, sort and grep in Perl

other by same level

3.0.1.3 | Introduction to CGI

3.1.1.1 | Introduction to mod_perl

5.1.1.5 | Introduction to Ruby

other by same instructor

Other courses by Martin Krzywinski.

0.0.0.1 | Orientation Session

0.1.0.1 | Two Problems

2.2.2.2 | Prompt Tools

1.1.2.8 | Intermediate Perl

2.1.2.4 | Data Mining and Analysis at the Command Line

4.0.2.1 | Spans and Sets

1.2.2.1 | Effective use of map, sort and grep in Perl

4.1.2.2 | Random Numbers and Distributions

news + thoughts

Two Factor Designs

Tue 09-12-2014

We've previously written about how to analyze the impact of one variable in our ANOVA column. Complex biological systems are rarely so obliging: multiple experimental factors interact to produce effects.

ANOVA is a natural way to analyze multiple factors. It can incorporate the possibility that the factors interact—the effect of one factor depends on the level of another factor. For example, the potency of a drug may depend on the subject's diet.

Martin Krzywinski @MKrzywinski mkweb.bcgsc.ca
Nature Methods Points of Significance column: Two Factor Designs. (read)

We can increase the power of the analysis by allowing for interaction, as well as by blocking.

Krzywinski, M. & Altman, N. (2014) Points of Significance: Two Factor Designs Nature Methods 11:1187-1188.

Background reading

Blainey, P., Krzywinski, M. & Altman, N. (2014) Points of Significance: Replication Nature Methods 11:879-880.

Krzywinski, M. & Altman, N. (2014) Points of Significance: Analysis of variance (ANOVA) and blocking Nature Methods 11:699-700.

Krzywinski, M. & Altman, N. (2014) Points of Significance: Designing Comparative Experiments Nature Methods 11:597-598.

...more about the Points of Significance column

Nested Designs—Assessing Sources of Noise

Mon 29-09-2014

Sources of noise in experiments can be mitigated and assessed by nested designs. This kind of experimental design naturally models replication, which was the topic of last month's column.

Nature Methods Points of Significance column: Nested designs. (read)

Nested designs are appropriate when we want to use the data derived from experimental subjects to make general statements about populations. In this case, the subjects are random factors in the experiment, in contrast to fixed factors, such as we've seen previously.

In ANOVA analysis, random factors provide information about the amount of noise contributed by each factor. This is different from inferences made about fixed factors, which typically deal with a change in mean. Using the F-test, we can determine whether each layer of replication (e.g. animal, tissue, cell) contributes additional variation to the overall measurement.

Krzywinski, M., Altman, N. & Blainey, P. (2014) Points of Significance: Nested designs Nature Methods 11:977-978.

Background reading

Blainey, P., Krzywinski, M. & Altman, N. (2014) Points of Significance: Replication Nature Methods 11:879-880.

Krzywinski, M. & Altman, N. (2014) Points of Significance: Analysis of variance (ANOVA) and blocking Nature Methods 11:699-700.

Krzywinski, M. & Altman, N. (2014) Points of Significance: Designing Comparative Experiments Nature Methods 11:597-598.

...more about the Points of Significance column

Replication—Quality over Quantity

Tue 02-09-2014

It's fitting that the column published just before the Labor Day weekend is all about how best to allocate labor.

Replication is used to decrease the impact of variability from parts of the experiment that contribute noise. For example, we might measure data from more than one mouse to attempt to generalize over all mice.

Nature Methods Points of Significance column: Replication. (read)

It's important to distinguish technical replicates, which attempt to capture the noise in our measuring apparatus, from biological replicates, which capture biological variation. The former give us no information about biological variation and cannot be used to directly make biological inferences. To do so is to commit pseudoreplication. Technical replicates are useful to reduce the noise so that we have a better chance to detect a biologically meaningful signal.

Blainey, P., Krzywinski, M. & Altman, N. (2014) Points of Significance: Replication Nature Methods 11:879-880.

Background reading

Krzywinski, M. & Altman, N. (2014) Points of Significance: Analysis of variance (ANOVA) and blocking Nature Methods 11:699-700.

Krzywinski, M. & Altman, N. (2014) Points of Significance: Designing Comparative Experiments Nature Methods 11:597-598.

...more about the Points of Significance column

Monkeys on a Hilbert Curve—Scientific American Graphic

Tue 19-08-2014

I was commissioned by Scientific American to create an information graphic that showed how our genomes are more similar to those of the chimp and bonobo than to the gorilla.

I had about 5 x 5 inches of print space to work with. For 4 genomes? No problem. Bring out the Hilbert curve!

Our genomes are much more similar to the chimp and bonobo than to the gorilla. And, we're practically still Denisovans. (details)

To accompany the piece, I will be posting to the Scientific American blog about the process of creating the figure. And to emphasize that the genome is not a blueprint!

As part of this project, I created some Hilbert curve art pieces. And while exploring, found thousands of Hilbertonians!

Happy Pi Approximation Day— π, roughly speaking 10,000 times

Wed 13-08-2014

Celebrate Pi Approximation Day (July 22nd) with the art of arm waving. This year I take the first 10,000 most accurate approximations (m/n, m=1..10,000) and look at their accuracy.

Accuracy of the first 10,000 m/n approximations of Pi. (details)
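
As a rough illustration of what "accuracy of an m/n approximation" can mean, here is a minimal Python sketch. It assumes that for each numerator m the denominator n is chosen to bring m/n as close to π as possible; that reading of "m/n, m=1..10,000" is an assumption, and the function name `best_approximation` is hypothetical rather than taken from the original analysis.

```python
import math

def best_approximation(m):
    """Return (n, error) where n is the positive integer minimizing |m/n - pi|.

    Assumption: one approximation per numerator m, using the best denominator.
    """
    ratio = m / math.pi
    # The optimal n is one of the two integers bracketing m/pi (n must be >= 1).
    candidates = {max(1, math.floor(ratio)), max(1, math.ceil(ratio))}
    n = min(candidates, key=lambda d: abs(m / d - math.pi))
    return n, abs(m / n - math.pi)

# Print a few numerators, including the familiar 22/7 and 355/113.
for m in (3, 22, 333, 355):
    n, err = best_approximation(m)
    print(f"{m}/{n} differs from pi by {err:.3g}")
```

Looping m from 1 to 10,000 and plotting the error against m would give the raw material for a figure of this kind.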

I turned to the spiral again after applying it to stacked ring plots of frequency distributions in Pi for the 2014 Pi Day.

Frequency distribution of digits of Pi in groups of 4 up to digit 4,988. (details)

Analysis of Variance (ANOVA) and Blocking—Accounting for Variability in Multi-factor Experiments

Mon 07-07-2014

Our 10th Points of Significance column! Continuing with our previous discussion about comparative experiments, we introduce ANOVA and blocking. Although this column appears to introduce two new concepts (ANOVA and blocking), you've seen both before, though under a different guise.

Nature Methods Points of Significance column: Analysis of variance (ANOVA) and blocking. (read)

If you know the t-test you've already applied analysis of variance (ANOVA), though you probably didn't realize it. In ANOVA we ask whether the variation within our samples is compatible with the variation between our samples (sample means). If the samples don't all have the same mean then we expect the latter to be larger. The ANOVA test statistic (F) assigns significance to the ratio of these two quantities. When we have only two samples and apply the t-test, t² = F.
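
The relationship between the two-sample t statistic and the two-group F statistic can be checked numerically. The sketch below uses only the Python standard library and assumes the equal-variance (pooled) form of the t-test; the sample values and function names are made up for illustration.

```python
import math
from statistics import mean

def two_sample_t(a, b):
    """Equal-variance (pooled) two-sample t statistic."""
    na, nb = len(a), len(b)
    ma, mb = mean(a), mean(b)
    ss_a = sum((x - ma) ** 2 for x in a)
    ss_b = sum((x - mb) ** 2 for x in b)
    pooled_var = (ss_a + ss_b) / (na + nb - 2)
    return (ma - mb) / math.sqrt(pooled_var * (1 / na + 1 / nb))

def one_way_f(*groups):
    """One-way ANOVA F: between-group mean square over within-group mean square."""
    values = [x for g in groups for x in g]
    grand_mean = mean(values)
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical measurements for two samples.
a = [5.1, 4.9, 6.2, 5.8, 5.5]
b = [6.4, 6.9, 5.9, 7.1, 6.6]
print(two_sample_t(a, b) ** 2, one_way_f(a, b))
```

The two printed numbers agree up to floating-point rounding, which is the two-group case of the between/within variation ratio described above.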

ANOVA naturally incorporates and partitions sources of variation—the effects of variables on the system are determined based on the amount of variation they contribute to the total variation in the data. If this contribution is large, we say that the variation can be "explained" by the variable and infer an effect.

We discuss how data collection can be organized using a randomized complete block design to account for sources of uncertainty in the experiment. This process is called blocking because we are blocking the variation from a known source of uncertainty from interfering with our measurements. You've already seen blocking in the paired t-test example, in which the subject (or experimental unit) was the block.

We've worked hard to bring you 20 pages of statistics primers (though it feels more like 200!). The column is taking a month off in August, as we shrink our error bars.

Krzywinski, M. & Altman, N. (2014) Points of Significance: Analysis of Variance (ANOVA) and Blocking Nature Methods 11:699-700.

Background reading

Krzywinski, M. & Altman, N. (2014) Points of Significance: Designing Comparative Experiments Nature Methods 11:597-598.

Krzywinski, M. & Altman, N. (2014) Points of Significance: Comparing Samples — Part I — t-tests Nature Methods 11:215-216.

Krzywinski, M. & Altman, N. (2013) Points of Significance: Significance, P values and t-tests Nature Methods 10:1041-1042.

...more about the Points of Significance column

Designing Experiments—Coping with Biological and Experimental Variation

Thu 29-05-2014

This month, Points of Significance begins a series of articles about experimental design. We start by returning to the two-sample and paired t-tests for a discussion of biological and experimental variability.

Nature Methods Points of Significance column: Designing Comparative Experiments. (read)

We introduce the concept of blocking using the paired t-test as an example and show how biological and experimental variability can be related using the correlation coefficient, ρ, and how its value impacts the relative performance of the paired and two-sample t-tests.
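
One standard way to see why ρ matters is the variance of a difference: Var(X − Y) = σ_X² + σ_Y² − 2ρσ_Xσ_Y. The sketch below is an illustration built on that textbook identity, not the column's own derivation. It compares the standard error of the mean difference with and without pairing; the function names and the numbers are hypothetical.

```python
import math

def se_two_sample(s1, s2, n):
    """Standard error of the difference in means for two independent samples of size n."""
    return math.sqrt((s1 ** 2 + s2 ** 2) / n)

def se_paired(s1, s2, rho, n):
    """Standard error of the mean of n paired differences with within-pair correlation rho."""
    var_diff = s1 ** 2 + s2 ** 2 - 2 * rho * s1 * s2
    return math.sqrt(max(0.0, var_diff) / n)  # guard against tiny negative rounding

# With equal standard deviations, a larger positive rho shrinks the paired standard error.
for rho in (0.0, 0.5, 0.9):
    print(rho, se_two_sample(1.0, 1.0, 10), se_paired(1.0, 1.0, rho, 10))
```

When ρ is near zero, pairing gives no reduction in the standard error, and the paired test also has fewer degrees of freedom, so the benefit of blocking depends on how strongly the paired measurements are correlated.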

We also emphasize that when reporting data analyzed with the paired t-test, differences in sample means (and their associated 95% CI error bars) should be shown—not the original samples—because the correlation in the samples (and its benefits) cannot be gleaned directly from the sample data.

Krzywinski, M. & Altman, N. (2014) Points of Significance: Designing Comparative Experiments Nature Methods 11:597-598.

Background reading

Krzywinski, M. & Altman, N. (2014) Points of Significance: Comparing Samples — Part I — t-tests Nature Methods 11:215-216.

Krzywinski, M. & Altman, N. (2013) Points of Significance: Significance, P values and t-tests Nature Methods 10:1041-1042.

Have skew, will test

Wed 28-05-2014

Our May Points of Significance Nature Methods column jumps straight into dealing with skewed data with Non Parametric Tests.

Nature Methods Points of Significance column: Non Parametric Testing. (read)

We introduce non-parametric tests and simulate data scenarios to compare their performance to the t-test. You might be surprised—the t-test is extraordinarily robust to distribution shape, as we've discussed before. When data is highly skewed, non-parametric tests perform better and with higher power. However, if sample sizes are small they are limited to a small number of possible P values, of which none may be less than 0.05!

Krzywinski, M. & Altman, N. (2014) Points of Significance: Non Parametric Testing Nature Methods 11:467-468.

Background reading

Krzywinski, M. & Altman, N. (2014) Points of Significance: Comparing Samples — Part I — t-tests Nature Methods 11:215-216.

Krzywinski, M. & Altman, N. (2013) Points of Significance: Significance, P values and t-tests Nature Methods 10:1041-1042.

Mind your p's and q's

Sat 29-03-2014

In the April Points of Significance Nature Methods column, we continue our discussion of comparing samples and consider what happens when we run a large number of tests.

Nature Methods Points of Significance column: Comparing Samples — Part II — Multiple Testing. (read)

Observing statistically rare test outcomes is expected if we run enough tests. These are statistically, not biologically, significant. For example, if we run N tests, the smallest P value that we have a 50% chance of observing is 1 − exp(−ln 2/N). For N = 10ᵏ this P value is Pₖ = 10⁻ᵏ ln 2 (e.g. for 10⁴ = 10,000 tests, P₄ = 6.9×10⁻⁵).
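
The 50% figure follows from asking when the smallest of N independent null P values falls below p: 1 − (1 − p)^N = 0.5, which gives p = 1 − exp(−ln 2/N). A minimal Python sketch of that arithmetic, assuming independent tests with uniformly distributed null P values (the function name is hypothetical):

```python
import math

def smallest_p_at_even_odds(n_tests):
    """P value that the minimum of n_tests independent null P values undercuts half the time.

    Solves 1 - (1 - p)**n_tests = 0.5, i.e. p = 1 - exp(-ln 2 / n_tests).
    """
    return -math.expm1(-math.log(2) / n_tests)

# Compare the exact value with the small-p approximation ln(2) / N.
for k in range(1, 5):
    n = 10 ** k
    print(n, smallest_p_at_even_odds(n), math.log(2) / n)
```

For large N the exact value and the approximation ln 2/N are nearly identical, which is where the 10⁻ᵏ ln 2 shorthand comes from.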

We discuss common correction schemes such as Bonferroni, Holm, Benjamini & Hochberg and Storey's q and show how they impact the false positive rate (FPR), false discovery rate (FDR) and power of a batch of tests.
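
To make three of these schemes concrete, here is a short Python sketch of adjusted P values for Bonferroni, Holm and Benjamini & Hochberg, written from their textbook definitions. Storey's q is left out because it also needs an estimate of the proportion of true nulls. The function names and the example P values are made up for illustration.

```python
def bonferroni(pvalues):
    """Multiply each P value by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]

def holm(pvalues):
    """Step-down Holm adjustment: (m - rank) * p, made monotone with a running maximum."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):  # rank 0 is the smallest P value
        running_max = max(running_max, min(1.0, (m - rank) * pvalues[i]))
        adjusted[i] = running_max
    return adjusted

def benjamini_hochberg(pvalues):
    """Step-up BH adjustment: p * m / rank, made monotone with a running minimum."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m - 1, -1, -1):  # start from the largest P value
        i = order[rank]
        running_min = min(running_min, pvalues[i] * m / (rank + 1))
        adjusted[i] = running_min
    return adjusted

pvals = [0.01, 0.04, 0.03, 0.005]  # hypothetical raw P values
print(bonferroni(pvals))
print(holm(pvals))
print(benjamini_hochberg(pvals))
```

Bonferroni and Holm control the chance of any false positive across the batch, while Benjamini & Hochberg controls the false discovery rate, so its adjusted values are never larger than Holm's.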

Krzywinski, M. & Altman, N. (2014) Points of Significance: Comparing Samples — Part II — Multiple Testing Nature Methods 11:355-356.

Krzywinski, M. & Altman, N. (2014) Points of Significance: Comparing Samples — Part I — t-tests Nature Methods 11:215-216.

Krzywinski, M. & Altman, N. (2013) Points of Significance: Significance, P values and t-tests Nature Methods 10:1041-1042.

Happy Pi Day— go to planet π

Fri 21-03-2014

Celebrate Pi Day (March 14th) with the art of folding numbers. This year I take the number up to the Feynman Point and apply a protein folding algorithm to render it as a path.

Digits of Pi form landmass and shoreline. (details)

For those of you who liked the minimalist and colorful digit grid, I've expanded on the concept to show stacked ring plots of frequency distributions.

Frequency distribution of digits of Pi in groups of 6 up to the Feynman Point. (details)

And if spirals are your thing...

Frequency distribution of digits of Pi in groups of 4 up to digit 4,988. (details)

Have data, will compare

Fri 07-03-2014

In the March Points of Significance Nature Methods column, we continue our discussion of t-tests from November (Significance, P values and t-tests).

We look at how the uncertainty of two variables combines, how this increases the uncertainty when two samples are compared, and highlight the differences between the two-sample and paired t-tests.

Nature Methods Points of Significance column: Comparing Samples — Part I. (read)

When performing any statistical test, it's important to understand and satisfy its requirements. The t-test is very robust with respect to some of its assumptions, but not others. We explore which.

Krzywinski, M. & Altman, N. (2014) Points of Significance: Comparing Samples — Part I Nature Methods 11:215-216.

Krzywinski, M. & Altman, N. (2013) Points of Significance: Significance, P values and t-tests Nature Methods 10:1041-1042.

Circos at British Library Beautiful Science Exhibit

Thu 06-03-2014

Beautiful Science explores how our understanding of ourselves and our planet has evolved alongside our ability to represent, graph and map the mass data of the time. The exhibit runs 20 February to 26 May 2014 and is free to the public. There is a good Nature blog writeup about it, a piece in The Guardian, and a great video that explains the exhibit narrated by Johanna Kieniewicz, the curator.

Circos at the British Library Beautiful Science exhibit. (about exhibit)
Mailed invitation to the exhibit features my science art. (zoom)

I am privileged to contribute an information graphic to the exhibit in the Tree of Life section. The piece shows how sequence similarity varies across species as a function of evolutionary distance. The installation is a set of six 30 × 30 cm backlit panels. They look terrific.

Circos Circles of Life installation at Beautiful Science exhibit at the British Library. (zoom)

Think outside the bar—box plots

Fri 31-01-2014

Quick, name three chart types. Line, bar and scatter come to mind. Perhaps you said pie too—tsk tsk. Nobody ever thinks of the box plot.

Box plots reveal details about data without overloading a figure with a full frequency distribution histogram. They're easy to compare and now easy to make with BoxPlotR (try it). In our fifth Points of Significance column, we take a break from the theory to explain this plot type and—I hope— convince you that they're worth thinking about.

Nature Methods Points of Significance column: Visualizing samples with box plots. (read)

The February issue of Nature Methods kicks the bar chart two more times: Dan Evanko's Kick the Bar Chart Habit editorial and a Points of View: Bar charts and box plots column by Mark Streit and Nils Gehlenborg.

Krzywinski, M. & Altman, N. (2014) Points of Significance: Visualizing samples with box plots Nature Methods 11:119-120.

Wired Data|Life 2013 talk

Thu 05-12-2013

I recently presented at the Wired Data|Life 2013 conference, sharing my thoughts on The Art and Science of Data Visualization.

For specialists, visualizations should expose detail to allow for exploration and inspiration. For enthusiasts, they should provide context, integrate facts and inform. For the layperson, they should capture the essence of the topic, narrate a story and delight.

Wired's Brandon Keim wrote up a short article about me and some of my work—Circle of Life: The Beautiful New Way to Visualize Biological Data.

The Art and Science of Data Visualization (PDF)

Power and Sample Size

Fri 31-01-2014

Experimental designs that lack power cannot reliably detect real effects. Power of statistical tests is largely unappreciated and many underpowered studies continue to be published.

This month, Naomi and I explain what power is, how it relates to Type I and Type II errors and sample size. By understanding the relationship between these quantities you can design a study that has both low false positive rate and high power.

Nature Methods Points of Significance column: Power and Sample Size. (read)

Krzywinski, M. & Altman, N. (2013) Points of Significance: Power and Sample Size Nature Methods 10:1139-1140.

20 imperatives of science—limits of evidence

Fri 22-11-2013

20 Tips for Interpreting Scientific Claims is a wonderful comment in Nature warning us about the limits of evidence.

I've made a poster (download hires PDF, PNG) of this list, grouping them into categories that are my own. Thrust this into everyone's hands, including your own.

20 tips for interpreting scientific claims. From Sutherland et al, Nature 2013. (PDF, PNG, read article)

Sutherland WJ, Spiegelhalter D & Burgman M (2013) Policy: Twenty tips for interpreting scientific claims. Nature 503:335–337.

Significance, P values and t-tests

Fri 31-01-2014

Have you wondered how statistical tests work? Why does everyone want such a small P value?

This month, Naomi and I explain how significance is measured in statistics and remind you that it does not imply biological significance. You'll also learn why the t-distribution is so important and why its shape is similar to that of a normal distribution, but not quite.

Nature Methods Points of Significance column: Significance, P values and t-tests. (read)

Krzywinski, M. & Altman, N. (2013) Points of Significance: Significance, P values and t-tests Nature Methods 10:1041-1042.

Drinks & Science Workshop: Effective Presentations and Slides

Thu 10-10-2013

Your slides are not your presentation. They are a representation of your presentation.

Effective presentations require that you have a clear narrative—control detail and emphasis to deliver your message. Engage the audience early. Don't dump on them.

Effective slides are visual cues. Show only what you can't easily say. Text should act as emphasis. Don't read.

Drinks & Science Workshop: Effective Presentations and Slides. Science Online Vancouver. (workshop slides)

A workshop I gave on Oct 8th at Science Online Vancouver at Science World.

Error Bars

Mon 30-09-2013

Error bar overlap does not imply lack of significance. Error bar gap does not imply significance. Chances are you find these statements surprising.

You've seen and used error bars. But do you understand how to interpret them in the context of statistical significance? This month we address the most common (and commonly misunderstood) method of visualizing uncertainty.

Nature Methods Points of Significance column: Error Bars. (read)

We discuss error bars based on standard deviation, standard error of the mean and confidence intervals. It turns out that none of these behave as our intuition would wish.

Krzywinski, M. & Altman, N. (2013) Points of Significance: Error Bars Nature Methods 10:921-922.

Launch of Nature Methods Statistics Column

Mon 30-09-2013

This month, Nature Methods is launching Points of Significance, a new column to educate, enlighten and, if possible, entertain bench scientists about statistics.

I will be working closely with Naomi Altman from The Pennsylvania State University and Dan Evanko, the Chief Editor at Nature Methods, to make the column engaging and useful.

Nature Methods Points of Significance column: Importance of Being Uncertain. (read)

Our first publication — The Importance of Being Uncertain — acknowledges not only the imperative of being right about how we're wrong, but also our appreciation for Oscar Wilde.

Krzywinski, M. & Altman, N. (2013) Points of Significance: Importance of Being Uncertain Nature Methods 10:809-810.

Points of View — The Collection

Tue 30-07-2013

Interested in data visualization? The Points of View columns are an excellent way to learn practical tips and design principles that help you communicate clearly. All the columns are now available as a collection, and open access during August 2013.

The full collection of Nature Methods Points of View columns is now available for free for the month of August. (collection, more about Points of View)

The columns were written by Bang Wong, Martin Krzywinski, Nils Gehlenborg, Cydney Nielsen, Noam Shoresh, Rikke Schmidt Kjærgaard, Erica Savig and Alberto Cairo.

Storytelling with Graphics

Tue 30-07-2013

This month, Alberto Cairo and I examine the importance of storytelling in presenting data. A strong narrative captures the reader's attention, informs and inspires.

Nature Methods Points of View column: Storytelling. (download, more about Points of View)

Instead of "explain, not merely show," seek to "narrate, not merely explain."

Analyze as a specialist, present as a communicator

Thu 25-07-2013

The distinction between the specialist and the communicator was made by Alberto Cairo at the 2013 Bloomberg Design Conference. I have used this principle to structure my talk to the UBC Tableau Users Group.

Design is algorithmics for the page. Use its principles to inform how to choose from among the options offered by your software. Recognize the limitations of your tool, as well as those features that are ineffective.

Don't practise visual intuitics—use shapes whose size and proportion can be well judged.

What we see isn't always what it is. The luminance effect powerfully affects our interpretation of tone and color. (download talk)

Real Human Genome Art

Tue 16-07-2013

A collaboration of science and art with Joanna Rudnick and Aaron De La Cruz.

The science of cancer genomics will be interpreted by individuals whose lives are affected by genomic mutations using the art style of Aaron De La Cruz.

Beautiful, meaningful and personal.

A day of collaboration between science, art and people affected with cancer-causing mutations. (...more)

Multidimensional data

Thu 27-06-2013

This month, Erica Savig and I look at the design process for a figure from her paper Multiplexed mass cytometry profiling of cellular states perturbed by small-molecule regulators. The underlying data set has 1.2 billion individual observations, categorized by drug, cell line, protein and stimulation condition.

Nature Methods Points of View column: Multidimensional data. (download, more about Points of View)

Bodenmiller B, Zunder ER, Finck R et al. 2012 Multiplexed mass cytometry profiling of cellular states perturbed by small-molecule regulators Nature Biotechnology 30:858-867.

Although spatial encoding is the most perceptually accurate, in this case it's not the best channel to display quantitative information. Instead, the x/y position on the page is used to organize small multiples of the network of affected proteins.

Data meets pointillism. The full data set was used to create the cover of the September 2012 issue of Nature Biotechnology. (about the cover)

Choosing Plotting Symbols

Thu 30-05-2013

In this month's column, Bang and I consider how to choose effective plotting symbols in the Points of View column Plotting Symbols.

Nature Methods Points of View column: Plotting Symbols. (download, more about Points of View)

Choose symbols that overlap without ambiguity and communicate relationships in data.

Figure Design and Writing — Two Goals, One Process

Mon 29-04-2013

This month I look at how creating effective figures is similar to the process of writing well in the Points of View column Elements of Visual Style.

Nature Methods Points of View column: Elements of Visual Style. (download, more about Points of View)

Using Strunk's Elements of Style as an example of writing guidelines, I look at how these can be translated to creating figures.

VIZBI 2013 Keynote—Visual Design Principles

Wed 27-03-2013

When we create figures, we must communicate and design. In my talk I discuss some of the rules that turn graphical improvisation into a structured and reproducible process.

Try to focus on a spot in these posters that celebrate Pi day. (download talk)

The fractal tree was created with OneZoom, which received the best poster award at the conference.

Happy Pi Day— 3.14

Thu 14-03-2013

Celebrate Pi Day (March 14th) with funky modern posters. Transcend, don't repeat, yourself and watch the dots shimmer.

Try to focus on a spot in these posters that celebrate Pi day. (download posters)

The posters were inspired by the beautiful AIDS posters by Elena Miska.

For the Love of Type

Thu 07-03-2013

I am always drawn to type and periodically I must do something about it.

If you were a type, what type would you be? Me, Gill Sans on weekdays and Perpetua on the weekend.

Finding intrigue and consensus within and among letters. (typography posters)

Return of Nature Methods Points of View

Tue 26-02-2013

I take over from Bang Wong as primary contributor to the Points of View column, a monthly advice and opinion piece about data visualization and information and figure design in molecular biology.

Nature Methods Points of View column returns. (read more, Nature Methods blog)

Nature Encode Explorer

Tue 26-02-2013
Nature uses Circos motif on cover and interactive ENCODE data explorer. (read more)

Nature's special issue dedicated to the Encode Project uses the Circos motif on its cover as well as the interactive Encode Explorer, which is available as an app at iTunes.

Bloomberg Businessweek Design Conference

Wed 23-01-2013

Together with Alberto Cairo, and then in conversation with Sam Grobart, I presented about science and design at Bloomberg's Businessweek Design Conference in San Francisco.

Science loves design, but doesn't always know it. (download talk)

ICDM2012 Keynote — Needles in Stacks of Needles

Thu 13-12-2012

My ICDM2012 keynote on genomics and data mining: Needles in Stacks of Needles.

Computers compute but humans are ultimately responsible for identifying what is relevant and useful. (abstract, download talk, ICDM2012)

Genome Research cover

Wed 14-11-2012

Creating strings of genome jewellery. Read about how it was done.

Cover image accompanying Spark: A navigational paradigm for genomic data exploration. Genome Research 22 (11). (details, Genome Research)

The design accompanies Cydney Nielsen's Spark manuscript, which appeared in Genome Research.

Biovis 2012 — Getting into Visualization of Large Biological Data Sets

Tue 16-10-2012

Guidelines for data encoding and visualization in biology, presented at Biovis 2012 (Visweek 2012).

20 imperatives of information design. (Krzywinski et al biovis2012)

2012 Presidential Debates — a Lexical Analysis

Thu 04-10-2012

Building on the method I used to analyze the 2008 debates, I look at the 2012 Debates between Obama and Romney, lexically speaking. Obama speaks to "folks", while Romney fearmongers with "kill" and "hurt".

Analysis of word usage by parts of speech for Obama and Romney reveals insight into each candidate.

Trends in Genetics cover

Fri 28-09-2012

Making things round, not square. Read about how it was done.

Nature and human life are as various as our several constitutions. Who shall say what prospect life offers to another? —Henry David Thoreau

A Circos-based design for the cover of the human genetics special issue of Trends in Genetics (Trends in Genetics October 2012, 28 (10)).

Science needs words

Thu 03-05-2012

And usually, really long and funny ones.

Scientists love new words, when the old ones aren't long enough.

My neologisms were picked up by James Gorman of the New York Times in an article Ome, the sound of the scientific universe expanding.

PNAS cover

Tue 01-05-2012

Biology or astrophysics? Read about how it was done.

Hint: biology.

The image was published on the cover of PNAS (PNAS 1 May 2012; 109 (18))

the art of numbers

Sat 14-04-2012

Numerology is bogus but art based on numbers has a beautiful random quality. Oh, and none of the metaphysical baggage.

Distribution of the first 3,422, 13,689 and 123,201 digits of π.
Progression and transition probabilities of digits in e, φ and π.

accidental similarity number

Tue 20-03-2012

The quantity formed by the overlap of two or more numbers.

The accidental similarity number of π, φ and e.

the 4ness of pi

Fri 13-04-2012

How much 4ness does π have?

The iness of each digit of π generalizes 4ness. It measures the similarity of the digit to its neighbours.

Compare the iness of π to that of the other famous transcendental number, e, and the mysterious but attractive Golden Ratio, φ.

The iness of e and φ.

ASCII Illustration—Outer Space, Sequence and Typography

Mon 23-01-2012

I have found a way to combine my curiosity about space, fear of large sequence assemblies and love of typography in a single illustration. Inspired by typographical portraits, I wanted to automate representing an image with multiple font weights, while sampling characters from a quote or debate transcripts.

Part of the Pioneer plaque rendered with the sequence of human chromosome 1, using 4 layers of sizes (17pt, 33pt, 59pt and 93pt) and 8 weights of Gotham.

Tangerine Tango—Color of 2012

Tue 17-01-2012

If you made widgets, you could be justified in campaigning for a widget of the year. Business acumen suggests it should be one of your widgets. Pantone has done exactly that, naming their 17-1463 color (Tangerine Tango) as color of the year 2012.

Tangerine Tango - Pantone's color of the year.

I prefer green—green jive.

Green Jive - My own color of the year.

World's Most Expensive Photograph

Thu 10-11-2011

I really like the world's most expensive photograph, Rhein II by Andreas Gursky. Cautious use of the word "expensive" should be practised — in this case, merely meaning that only one person saw the $4.3 million price tag. Others saw lower prices, or no price tag at all.

Rhein II by Andreas Gursky. $4.3 million.

Here's my own attempt at such compositions.

Near Jokulsarlon on the way to Hofn, Iceland.

Adobe Swatches for Brewer Palettes

Fri 28-10-2011

I could not find Illustrator swatch files for this awesome color resource, so I created them myself.

Brewer palettes are ideal for information design. Download Illustrator swatch files (.ase .ai).

If you're interested in color and design and don't know about Brewer palettes, see my presentation.

Global Visualization of Google Searches by Language

Fri 28-10-2011

World-wide Google searches, categorized by one of 21 languages, are visualized with WebGL, available from Chrome Experiments. The data offers some fascinating insights and prompts questions such as (a) In what two places in the US are Google searches in Chinese performed? (b) What are the most remote locations from which Google searches were detected? (c) Why is Istanbul the 3rd top location for searches? Why is Miami in the top 10?

Global visualization of Google searches by language reveals English dominates (42% of searches) with Spanish a distant second (14%) and German and French third (7% each).

Download geotagged data.

PSA Genomics Workshop Slides

Fri 28-10-2011

Designing effective visualizations in the biological sciences.

Neither communication nor design is purely subjective.

Circos and Hive Plots: Challenging visualization paradigms in genomics and network analysis.

Neither communication nor design is purely subjective.

Tor tor & Loa Loa — 546 Organisms with Same Genus and Species

Sat 09-07-2011

In a recent conversation, I was challenged to name as many organisms with the same genus and species as I could. Being neither a biologist nor, especially, a taxonomist, I could offer only organisms with sequenced genomes that I had come across in the literature. Immediately to mind sprang Gallus gallus (chicken) and ... nothing else. Well, that was embarrassing.

I was suddenly taken up by the urge to find all instances of this occurrence. Using resources at the NCBI Taxonomy Browser, I downloaded the NCBI taxonomy table, which contains 1,097,405 entries in the names.dmp file (not all of these are unique genus/species combinations).
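A scan like this could be sketched in Python as follows. This is only an illustration, not the script used for the results here. It assumes the usual names.dmp layout (fields separated by a tab, a pipe and a tab; columns tax ID, name, unique name, name class) and compares only the first two words of each scientific name. The function names and the sample rows, including their IDs, are placeholders.

```python
from collections import defaultdict

def parse_names_row(line):
    """Return (tax_id, name, name_class) from one names.dmp-style row."""
    row = line.rstrip("\n")
    if row.endswith("\t|"):          # rows end with a trailing tab + pipe
        row = row[:-2]
    fields = row.split("\t|\t")      # tax_id, name, unique name, name class
    return fields[0], fields[1], fields[3]

def has_same_genus_and_species(name):
    """True when the first two words of the name match, ignoring case."""
    words = name.split()
    return len(words) >= 2 and words[0].lower() == words[1].lower()

def find_same_name_organisms(lines):
    """Group matching scientific names by genus."""
    by_genus = defaultdict(set)
    for line in lines:
        _tax_id, name, name_class = parse_names_row(line)
        if name_class == "scientific name" and has_same_genus_and_species(name):
            by_genus[name.split()[0]].add(name)
    return dict(by_genus)

# Placeholder rows in the assumed layout (the IDs are not real taxonomy IDs).
sample = [
    "1\t|\tGallus gallus\t|\t\t|\tscientific name\t|\n",
    "1\t|\tchicken\t|\t\t|\tgenbank common name\t|\n",
    "2\t|\tHomo sapiens\t|\t\t|\tscientific name\t|\n",
    "3\t|\tRegulus regulus\t|\t\t|\tscientific name\t|\n",
    "4\t|\tRegulus regulus azoricus\t|\t\t|\tscientific name\t|\n",
]

found = find_same_name_organisms(sample)
for genus in sorted(found):
    print(genus, len(found[genus]), sorted(found[genus]))
```

With the real file, the list of rows would be replaced by an open file handle. Because the check looks only at the first two words, names such as "Troglodytes cf. troglodytes" would need separate handling.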

To my surprise, I discovered that my performance in this challenge was beyond dismal. In fact, there are 380 genera which contain organisms that have the same genus and species name. Most of them (317) include a single organism, but some have many. For example the genus Salamandra has 14 organisms with the species salamandra, including Salamandra salamandra, Salamandra salamandra crespoi and Salamandra salamandra morenica. The genus Regulus has 13 organisms, including Regulus regulus azoricus, Regulus regulus japonensis and Regulus regulus regulus (these are all Goldcrests).

In total, there are 546 unique entries, when organisms with a unique subspecies name are considered distinct. If subspecies is not considered, the number of organisms with the same genus as species (i.e., regardless of subspecies) is 383. Here are organisms whose genus/species name is shorter than 6 letters (82 entries).

Shortest Species/Genus Duplicates (82, 5 letters or fewer)

Agama agama, Alces alces, Alle alle, Alosa alosa, Anser anser, Appia appia, Apus apus, Arita arita, Arius arius, Aroma aroma, Axis axis, Badis badis, Bagre bagre, Bison bison, Boops boops, Brama brama, Bubo bubo, Bufo bufo, Bulla bulla, Buteo buteo, Butis butis, Catla catla, Chaca chaca, Conta conta, Crex crex, Cynea cynea, Dama dama, Dario dario, Diuca diuca, Dives dives, Ensis ensis, Equus equus, Ficus ficus, Gemma gemma, Gesta gesta, Glis glis, Gobio gobio, Grus grus, Guira guira, Gulo gulo, Hara hara, Hucho hucho, Huso huso, Indri indri, Irus irus, Juga juga, Labeo labeo, Lima lima, Loa loa, Lota lota, Lutra lutra, Lynx lynx, Meles meles, Melo melo, Meza meza, Mitu mitu, Mola mola, Molva molva, Mops mops, Myaka myaka, Naja naja, Nasua nasua, Papio papio, Pauxi pauxi, Perna perna, Pica pica, Pipa pipa, Pipra pipra, Plica plica, Rapa rapa, Rita rita, Sarda sarda, Sisko sisko, Solea solea, Sula sula, Suta suta, Tinca tinca, Todus todus, Tor tor, Uncia uncia, Vimba vimba, Volva volva.

Longest Species/Genus Duplicates (5, 14 letters or more)

Coccothraustes coccothraustes

Labiostrongylus labiostrongylus

Macrobilharzia macrobilharzia

Macropostrongylus macropostrongylus

Xanthocephalus xanthocephalus

The nematode worm Macropostrongylus macropostrongylus has the honour of being the longest genus/species duplicate organism. Given this distinction, it is surprising that Pubmed returns only 2 papers that refer to it.

Dataset

Download the full list. The number next to each ENTRY field is the NCBI Taxonomy ID for the organism. In a small number of cases there are ambiguities in parsing the data file (e.g. Troglodytes cf. troglodytes PS-2, Troglodytes sp. troglodytes PS-1). I left these in.

Visual Acuity and Sequence Visualization

Tue 03-04-2012

Visual acuity limits of the human eye restrict the resolution at which we can comfortably visualize data.

In this short guide, I explain why dividing a scale into no more than 500 divisions is a good idea.

Visualizing 1.5 Mb (S. cerevisiae chrIV) in a 183 mm wide figure (size limit in Nature for double column figures) restricts scale division to 2.9 kb to ensure comfortable reading.

2011 EMBO Journal Cover Contest

Tue 14-06-2011

For the EMBO Journal 2011 Cover Contest, I prepared two entries, one for the scientific category and one for the non-scientific category.

The non-scientific entry is an abstract photo of fiber optics. The scientific entry is an information graphic showing a hive panel of genomic annotations in human, mouse and dog genomes. The hive panel is based on the newly introduced hive plot.

The 2011 winners have been announced. My non-scientific entry (photo of fiber optics) received honourable mention and was included in the Favourites of the Jury gallery.

New Circos Domains

Tue 03-05-2011

Until now, Circos did not have its own domain name, having been served from the lengthy and boring http://mkweb.bcgsc.ca/circos.

Circos - circular and genome visualization - now at the new domain circos.ca

Recently, I was surprised to find out that the following domains were available

All these now point to the Circos site.

ee spammings - beautiful language of spam poetry

Mon 02-05-2011

ee spammings are spam edited into a format reminiscent of the poetry of ee cummings. Unwanted solicitations for questionable endeavours and products suddenly turn into heady words of the new literature. Art suddenly freed from the husk of spam.


Literature 2.0 — from unlikely origins.


Here's one example that emphasizes that today is ok.

i got 
to touch you

i like us 
and know the more. 

believe
       recontact 
me

today ok!
but matters

waiting for
           happy

I now have over 20 ee spammings — enjoy them all.

Neologisms - New Words, Much Needed

Tue 26-04-2011

What do inconversible, mystific, postpetizer, prenopsis and suscitate have in common?

They are words that don't exist, but should. Learn new words.

Hive Plot Ads

Sat 16-04-2011

Download large ads: 00 01 02 03 04

World's Most Popular Questions
Today's Zeitgeist

Mon 28-03-2011

What are the world's top questions?

Using Google's autocomplete feature, I have tabulated the world's most popular questions. By combining an interrogative term, such as what, who or why, with a term from a related set, such as do I, can I, and can't I, it is possible to sample the space of questions and obtain the most popular for a given start word combination.
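The combination step can be sketched in a few lines of Python. This only builds the start-word combinations from the example terms above. It does not query any autocomplete service, since how the suggestions were collected is not described here. The function name is a placeholder.

```python
from itertools import product

def question_prefixes(interrogatives, follow_ups):
    """Pair every interrogative with every follow-up term."""
    return [f"{first} {second}" for first, second in product(interrogatives, follow_ups)]

prefixes = question_prefixes(
    ["what", "who", "why"],
    ["do I", "can I", "can't I"],
)

for prefix in prefixes:
    print(prefix)
```

Three interrogatives and three follow-up terms give nine start-word combinations, each of which could then be typed into a search box to see what it completes to.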

I have tabulated the most popular questions by category.

general | limits & desires | love | money | career & education | health | sizes & extremes | religion & faith

Science

What kind of questions about science are people asking? From the Career & Education section,

  • Can biology lead to new theorems?
  • Can physics explain miracles?
  • Can math be fun?
  • Can science and religion coexist?
  • Can history repeat itself?
  • Can psychology be morally neutral?

Curios

What are the strangest questions? I'll let you explore, but these have me wondering:

  • Has the world gone mad or is it me?
  • Why can't I hold all these limes?
  • What happens if I make a formal commitment to Satan?
  • Why can't I sell my kidney?
  • Who is the most powerful Jedi?
  • Can Jesus microwave a burrito?
  • Where is the hardest part of your head?

Circos Table Browser

Thu 24-03-2011

Circos can be used to visualize tabular data, such as spreadsheets.

Circos - Circular Visualization of Tabular Data - Martin Krzywinski

1,000s of tables have already been visualized. Has yours?

648 Ratios

Thu 17-02-2011

Hive plots are excellent at visualizing ratios. They're not just an anti-hairball network visualization agent.

Below are 3 x 8 x 27 = 648 (axes, ribbons, plots) ratios visualized.

Hive Plots - Network Visualization - Ratio Visualization - Martin Krzywinski

The image above compares the relative ratios of region annotations in human, mouse and dog genomes.

Cáceres Creativa - Model and Strategy for Urban Innovation

Fri 11-02-2011

Cáceres is a small city of 100,000 inhabitants in western Spain, where the city government is promoting Cáceres Creativa, a project in which citizens collaboratively build a sustainable future for the city, based on activating the creative capacity of the population.

The project has been published as a book (excerpt), which provides a basis for working with city residents and businesses in this collaborative design.

Circos proved useful in showing the complex relationships that are established in an environment such as a city, which combines flows of energy and resources, physical items and intellectual concepts. The online Circos tableviewer was used to generate the images.

Storage Cluster

Fri 11-02-2011

Taking photos of inanimate objects is rewarding. Your subject doesn't complain or move, and a coffee break fits naturally into the workflow at any time. In this case, the inanimate object is over 3 PB (3,000 TB) of storage composed of a variety of Netapp appliances.

Genome Sciences Center Genesis Compute Cluster

Using three gelled Hensel Integras (500 Ws monoheads; here I'm using only the modelling lights for illumination, along with red, blue and green filters), I spent some time getting to know the components up close (lighting details).

Genome Sciences Center Storage Cluster / Lumondo Photography / Martin Krzywinski

See more photos.

All photos by Martin Krzywinski (Lumondo Photography).

Genesis 1.0

Fri 11-02-2011

Our new compute cluster has been released to the user community.

Genome Sciences Center Genesis Compute Cluster

This cluster consists of 420 compute nodes, each with 12 cores and 48 GB RAM, totaling 5,040 cores and 20 TB RAM. Each node has 160 GB of local /tmp space, and all nodes are tied together over a 40 Gb/s Infiniband network.

Genome Sciences Center Genesis Compute Cluster / Lumondo Photography / Martin Krzywinski

The nodes all have access to a dedicated storage system over the Infiniband network running GPFS, with a total of 700 TB of usable scratch space. The filesystem is served by 8 IBM x3850 servers. All nodes are running CentOS 5.4 and use the open source Grid Engine 6.2u5 as their scheduler.

Genome Sciences Center Genesis Compute Cluster / Lumondo Photography / Martin Krzywinski

Lighting details and more photos.

All photos by Martin Krzywinski (Lumondo Photography).

1 First the server room was expanded
2 It was empty and without racks, and the lights were dim. Sysadmins scurried about and unpacked equipment
3 The circuit was closed and there were electrons
4 IT staff were pleased and accounts were handed out to users
5 Who had work they called "important"
6 But which the IT staff merely called "jobs".

Photo

Thu 17-02-2011

Periodically, I take my camera and point it at things. Here, I'll share a favourite from my creations.

Lumondo Photography - Martin Krzywinski - Diving Horror

This image — I will keep the subject a mystery — gives me the same feeling as some of the Hubble images. For this shot, I didn't need to reach orbit.

Lumondo Photography - Martin Krzywinski

Other images in this series are available on flickr.

I also like geometry and lines. This shot is a tense composition of the Hancock Building at Copley Square in Boston.

Lumondo Photography - Martin Krzywinski - Boston Copley Square

and an assortment of baggage carts at St Pancras station (London) which catches the eye.

Lumondo Photography - Martin Krzywinski - St Pancras Station - London

I like to collect time in a photo, be it uniformly as in this diptych of street and traffic lights from a moving car

Lumondo Photography - Martin Krzywinski - Driving Lights

or blended, as in this skyline of Vancouver showing the flow of time from 5.30pm to 9.30pm.

Lumondo Photography - Martin Krzywinski - High Dynamic Time Range Photography - Vancouver Skyline

WIZARD — Longest English Reverse Complement

Wed 26-01-2011

DNA is composed of two strands, which are complementary. Given a sequence, its reverse complement is created by swapping A/T and G/C and writing the remapped sequence backwards (e.g. ATGC is first remapped to TACG and then reversed to GCAT).
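As a minimal sketch, the two steps (complement, then reverse) can be written in Python like this. The function name is a placeholder and only the four unambiguous bases are handled.

```python
# Complement map for the four bases: A<->T and G<->C.
DNA_COMPLEMENT = str.maketrans("ATGC", "TACG")

def reverse_complement(sequence):
    """Complement each base, then write the result backwards."""
    complemented = sequence.upper().translate(DNA_COMPLEMENT)
    return complemented[::-1]

print(reverse_complement("ATGC"))  # GCAT
```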

Consider the corresponding concept applied to English words (or any language, for that matter). First, construct the complementarity map, which assigns to the nth letter of the alphabet the (N-n+1)th letter, given an alphabet of N letters.

abcdefghijklmnopqrstuvwxyz
||||||||||||||||||||||||||
zyxwvutsrqponmlkjihgfedcba

For example, a becomes z, b becomes y, and so on. To create a reverse complement of a word, apply this mapping and then reverse the new word (e.g. 'dog' is remapped to 'wlt' and then reversed to obtain 'tlw').

So far, that's not very exciting.

But consider the question: What is the longest English word that is a palindrome under this set of rules (reverse complementarity)? In other words, it's the same forward and backward after complementing the letters. Clearly "dog" is not such a palindrome since its reverse complement is "tlw".

The answer? wizard and hovels.

wizard
||||||
draziw -> 'wizard' backwards

It's an amazingly fitting answer, since a wizard is someone with special powers.

A few interesting 4-letter words that are their own reverse complement palindromes are bevy, grit, trig and wold. Common surnames that match are Ghrist, Elizarov and Prawdzik. Female first name Zola and male first name Iver are also reverse complement palindromes, as are trolig (Norwegian for 'likely', as well as an IKEA curtain product) and aviverez (2nd person plural future of 'aviver', French for 'brighten').

I've scanned a very large word list (4,138,000 unique English and foreign words) and identified 108 reverse complement palindromes. If you find a new entry longer than 6 letters, let me know.
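A scan of this kind might look like the following Python sketch. It is an illustration under simple assumptions (lowercase a to z only, a short in-memory word list), not the script used for the full list. The function names are placeholders.

```python
import string

LETTERS = string.ascii_lowercase
# a<->z, b<->y, c<->x, ... as in the map above.
LETTER_COMPLEMENT = str.maketrans(LETTERS, LETTERS[::-1])

def reverse_complement_word(word):
    """Complement each letter, then reverse the word."""
    return word.lower().translate(LETTER_COMPLEMENT)[::-1]

def is_rc_palindrome(word):
    """True when a purely a-z word equals its own reverse complement."""
    w = word.lower()
    return w.isascii() and w.isalpha() and w == reverse_complement_word(w)

def longest_rc_palindromes(words):
    """Return the longest reverse complement palindromes, sorted."""
    hits = [w for w in words if is_rc_palindrome(w)]
    if not hits:
        return []
    longest = max(len(w) for w in hits)
    return sorted(w for w in hits if len(w) == longest)

words = ["dog", "bevy", "grit", "wizard", "hovels", "wold", "perl"]
print(reverse_complement_word("dog"))   # tlw
print(longest_rc_palindromes(words))    # ['hovels', 'wizard']
```

One consequence of the map: no letter is its own complement, so a word with an odd number of letters can never be a reverse complement palindrome, because its middle letter would have to map to itself.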

Typefaces that are worth it

Mon 28-03-2011

Finding just the right font is hard work. There are so many to choose from. Or are there?

If the type face is not on this list, don't use it (except Bodoni; I hate Bodoni, so don't use it). If you need a shorter list, consult the quintessential 15 serif and 15 sans-serif fonts.

You'll notice a rotating image of type faces at the top of this page. Here's the full list.

Comic Sans, Dax, Frutiger, Gill Sans, Gotham, Helvetica, Syntax, The Sans

I love Gotham and have used it in visualization projects. It's more rational than Helvetica and still enjoys a freshness that has evaporated from Helvetica after near-ubiquitous use. Don't get me wrong, there is still not enough Helvetica in the world, but more Gotham would be nice.

Paper

Mon 24-01-2011

Anyone who has met me quickly learns that I have a personal and antagonistic relationship with Comic Sans, the type face that shouldn't have been.

In a recent article in the journal Cognition, Fortune favours the bold (and the italicized): Effects of disfluency on educational outcomes, Diemand-Yauman et al. suggest that rendering educational materials in a hard-to-read font, and thereby recruiting the effects of disfluency ("the subjective experience of difficulty associated with cognitive operations"), improves retention of material.

Regardless of whether the effect is real, there must be better ways to improve education than through bad design.

Kittens

Mon 24-01-2011

Surely you like kittens. So don't hurt your audience.

Edward Tufte says no to Powerpoint.

Fri 21-01-2011

Side Interest Spawns Brazilian Fashion Line

In a cosmically improbable confluence of multidisciplinary pursuits, my work on keyboard layouts, which as one of its fruits has produced the TNWMLC keyboard layout — the most difficult for English typing — has been incorporated into the eponymously named Brazilian fashion line by Julia Valle.

TNWMLC Fashion Line by Julia Valle, using work of Martin Krzywinski and the carpalx project.

Spatter of Network Communities

Mon 24-01-2011

Looking into network data sets for the linear layout project, I found pretty hairballs which make a juicy spatter pattern.



me as a keyword list

aikido | analogies | animals | astronomy | comfortable silence | cosmology | dorothy parker | drumming | espresso | fundamental forces | good kerning | graphic design | humanism | humour | jean michel jarre | kayaking | latin | little fluffy clouds | lord of the rings | mathematics | negative space | nuance | perceptual color palettes | philosophy of science | photography | physical constants | physics | poetry | pon farr | reason | rhythm | richard feynman | science | secularism | swing | symmetry and its breaking | technology | things that make me go hmmm | typography | unix | victoria arduino | wine | words