Together with Alberto Cairo, I presented at the Bloomberg Businessweek Design Conference (highlights) on the topic of design and communication in the sciences.

Alberto, as the journalist, argued that communication should give access to detail through an engaging narrative. He distinguished between the specialist (heavy on detail) and the communicator (focused on narrative) and emphasized that the distinction is artificial, though often played out (watch video).

I, as the scientist, underscored the importance of clear communication *between* scientists. As specialists, they are often very poor communicators. Pick up any science journal and you'll quickly discover that scientists either aren't good at telling stories or are discouraged from doing so by the medium. The consequence is the same: papers read like a wall of text. TL;DR anyone? The quality of visual communication in general ranges from muddled to abysmal (watch video).

We need more leaders in the field, such as Nature Publishing Group, to reward and emphasize good visual communication (e.g. Nature Cancer Review 2013 Figure Calendar).

Our presentations concluded with a 15-minute moderated discussion with Sam Grobart, senior Businessweek writer. Everyone got a little cheeky. Good fun.

Watch: my presentation, my conversation with Alberto Cairo moderated by Sam Grobart (Bloomberg TV), and Alberto Cairo's presentation.

This was a 7-minute lightning talk. Because there was virtually no opportunity for backtracking, I planned what to say more carefully than I usually do, and I include a running narrative below each slide.

My slides are available as PDF, Keynote (zipped) or QuickTime. The format is 16:9 HD.

On 28 Jan 2013, Bloomberg Businessweek Design Issue will capture the ideas from the conference and the personalities that generated them.

During the conference, each talk was captured in a series of sketches by Tom Wujec: my talk sketch and moderated discussion sketch.

*Appeal to intuition when designing with value judgments in mind.*

Figure clarity and concision are improved when the selection of shapes and colors is grounded in the Gestalt principles, which describe how we visually perceive and organize information.

The Gestalt principles are value free. For example, they tell us how we group objects but do not speak to any meaning that we might intuitively infer from visual characteristics.

This month, we discuss how appealing to such intuitions (related to shapes, colors and spatial orientation) can help us add information to a figure as well as anticipate and encourage useful interpretations.

Krzywinski, M. (2016) Points of View: Intuitive Design. *Nature Methods* **13**:895.

*Constraining the magnitude of parameters of a model can control its complexity.*

This month we continue our discussion about model selection and evaluation and address how to choose a model that avoids both overfitting and underfitting.

Ideally, we want to avoid having either an underfitted model, which is usually a poor fit to the training data, or an overfitted model, which is a good fit to the training data but not to new data.

Regularization is a process that penalizes the magnitude of model parameters. Instead of minimizing only the sum of squared errors, `\mathrm{SSE} = \sum_i (y_i - \hat{y}_i)^2`, as is normally done in a fit, we minimize the SSE plus a penalty proportional to the sum of the model's squared parameters, `\mathrm{SSE} + \lambda \sum_i \hat{\beta}^2_i`, where `\lambda` sets the strength of the penalty.
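As a rough illustration of the penalized fit described above, here is a minimal pure-Python sketch. It assumes a linear model without an intercept and penalizes every parameter, exactly as the formula is written. The function names (`solve_linear_system`, `ridge_fit`, `penalized_sse`) and the data are hypothetical, chosen only for this example. The minimizer is obtained from the penalized normal equations `(X^T X + \lambda I)\hat{\beta} = X^T y`.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations update both sides together.
    m = [[float(v) for v in a[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ridge_fit(rows, y, lam):
    """Minimize SSE + lam * sum(beta_i^2) via (X^T X + lam I) beta = X^T y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    for i in range(p):
        xtx[i][i] += lam
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def penalized_sse(rows, y, beta, lam):
    """Evaluate the regularized objective SSE + lam * sum(beta_i^2)."""
    sse = sum(
        (yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(rows, y)
    )
    return sse + lam * sum(b * b for b in beta)


# Hypothetical data: y is roughly 2x, fitted with the terms x, x^2 and x^3.
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ys = [-2.1, -0.9, 0.1, 1.2, 1.9]
rows = [[x, x**2, x**3] for x in xs]

for lam in (0.0, 0.1, 1.0, 10.0):
    beta = ridge_fit(rows, ys, lam)
    size = sum(b * b for b in beta)
    print(f"lambda={lam}: beta={[round(b, 3) for b in beta]}, sum of squares={size:.3f}")
```

Setting `lam` to zero recovers the ordinary least-squares fit. As `lam` grows, the sum of squared parameters shrinks, which is how the penalty limits model complexity.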

Lever, J., Krzywinski, M. & Altman, N. (2016) Points of Significance: Regularization. *Nature Methods* **13**:803-804.

*With four parameters I can fit an elephant and with five I can make him wiggle his trunk. —John von Neumann.*

By increasing the complexity of a model, it is easy to make it fit the data perfectly. Does this mean that the model is perfectly suitable? No.

When a model has a relatively large number of parameters, it is likely to be influenced by the noise in the data, which varies across observations, as much as any underlying trend, which remains the same. Such a model is overfitted—it matches training data well but does not generalize to new observations.

We discuss the use of training, validation and testing data sets and how they can be used, with methods such as cross-validation, to avoid overfitting.
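To make the idea of cross-validation concrete, the sketch below estimates out-of-sample error for two candidate models by k-fold cross-validation. Everything in it is an assumption made for illustration: the data are invented, the two candidates (a constant and a straight line) stand in for models of different complexity, and the helper names are hypothetical. It is not the procedure used in the column.

```python
def k_fold_indices(n, k):
    """Split indices 0..n-1 into k interleaved folds."""
    return [list(range(start, n, k)) for start in range(k)]


def fit_mean(xs, ys):
    """Constant model: always predict the mean of the training responses."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Straight-line model fitted by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def cross_validated_mse(xs, ys, fit, k=5):
    """Mean squared error on held-out folds, each predicted by a model
    trained only on the remaining folds."""
    n = len(xs)
    squared_errors = []
    for fold in k_fold_indices(n, k):
        held_out = set(fold)
        train_x = [xs[i] for i in range(n) if i not in held_out]
        train_y = [ys[i] for i in range(n) if i not in held_out]
        model = fit(train_x, train_y)
        squared_errors.extend((ys[i] - model(xs[i])) ** 2 for i in fold)
    return sum(squared_errors) / n


# Hypothetical data: a linear trend plus small fixed perturbations.
noise = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.4, -0.3, 0.1]
xs = [float(i) for i in range(10)]
ys = [1.0 + 2.0 * x + e for x, e in zip(xs, noise)]

cv_mean = cross_validated_mse(xs, ys, fit_mean)
cv_line = cross_validated_mse(xs, ys, fit_line)
print(f"constant model CV MSE: {cv_mean:.3f}")
print(f"line model CV MSE:     {cv_line:.3f}")
```

Because every observation is predicted by a model that never saw it, the held-out error penalizes both underfitting (the constant model here) and, for more flexible candidates, fitting to noise.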

Lever, J., Krzywinski, M. & Altman, N. (2016) Points of Significance: Model Selection and Overfitting. *Nature Methods* **13**:703-704.

*It is important to understand both what a classification metric expresses and what it hides.*

We examine various metrics used to assess the performance of a classifier. We show that a single metric is insufficient to capture performance: for any metric, a variety of scenarios yield the same value.

We also discuss ROC curves and the area under the curve (AUC), and how their interpretation changes based on class balance.
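The point that one metric can hide very different behavior can be sketched with two confusion matrices. The counts below are invented for illustration, and `classifier_metrics` is a hypothetical helper that assumes all denominators are nonzero.

```python
def classifier_metrics(tp, fn, fp, tn):
    """Return common metrics computed from confusion-matrix counts."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),    # fraction of predicted positives that are correct
        "recall": tp / (tp + fn),       # fraction of actual positives that are found
        "specificity": tn / (tn + fp),  # fraction of actual negatives that are found
    }


# Hypothetical scenario 1: balanced classes (50 positives, 50 negatives).
balanced = classifier_metrics(tp=40, fn=10, fp=10, tn=40)

# Hypothetical scenario 2: imbalanced classes (20 positives, 80 negatives).
imbalanced = classifier_metrics(tp=5, fn=15, fp=5, tn=75)

for name, metrics in (("balanced", balanced), ("imbalanced", imbalanced)):
    print(name, {key: round(value, 3) for key, value in metrics.items()})
```

Both scenarios have the same accuracy, yet the second classifier finds only a quarter of the positive cases, which accuracy alone would never reveal.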

Lever, J., Krzywinski, M. & Altman, N. (2016) Points of Significance: Classifier evaluation. *Nature Methods* **13**:603-604.

Today is the day and it's hardly an approximation. In fact, `22/7` is about 20% more accurate as a representation of `\pi` than `3.14`!
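The accuracy comparison can be checked with a few lines of arithmetic. This sketch assumes that "more accurate" means a smaller absolute error relative to `\pi`. The function name is hypothetical.

```python
import math


def absolute_error(approximation):
    """Distance between an approximation and pi."""
    return abs(approximation - math.pi)


error_fraction = absolute_error(22 / 7)  # 22/7 overshoots pi
error_decimal = absolute_error(3.14)     # 3.14 undershoots pi

# Relative reduction in absolute error when using 22/7 instead of 3.14.
reduction = 1 - error_fraction / error_decimal

print(f"|22/7 - pi| = {error_fraction:.6f}")
print(f"|3.14 - pi| = {error_decimal:.6f}")
print(f"error reduction = {reduction:.1%}")
```

Under that definition the error of `22/7` is roughly one fifth smaller than the error of `3.14`.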

Time to celebrate, graphically. This year I do so with a perfect packing of circles that embody the approximation.

By warping the circle by 8% along one axis, we can create a shape whose ratio of circumference to diameter, taken as twice the average radius, is 22/7.
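The warped-circle claim can be explored numerically. The sketch below assumes the warp is a stretch of one semi-axis by 8%, which turns the unit circle into an ellipse with semi-axes 1 and 1.08, and that the diameter is twice the mean of the two semi-axes. Both are my reading of the text rather than a confirmed construction. The perimeter is approximated by the midpoint rule applied to the arc-length integral.

```python
import math


def ellipse_perimeter(a, b, steps=100_000):
    """Approximate the perimeter of an ellipse with semi-axes a and b.

    Uses the midpoint rule on the arc-length integral of the
    parameterization x = a cos(t), y = b sin(t), for t from 0 to 2 pi.
    """
    dt = 2 * math.pi / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        # Speed along the curve: sqrt((a sin t)^2 + (b cos t)^2).
        total += math.hypot(a * math.sin(t), b * math.cos(t))
    return total * dt


def perimeter_to_diameter(a, b):
    """Ratio of perimeter to diameter, with diameter = 2 * mean radius = a + b."""
    return ellipse_perimeter(a, b) / (a + b)


ratio = perimeter_to_diameter(1.0, 1.08)  # assumed 8% stretch along one axis
print(f"warped circle ratio: {ratio:.6f}")
print(f"22/7:                {22 / 7:.6f}")
print(f"pi:                  {math.pi:.6f}")
```

With an 8% stretch the ratio lands much closer to `22/7` than `\pi` does, which is consistent with 8% being a rounded figure.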

If you prefer something more accurate, check out art from previous `\pi` days: 2013 `\pi` Day, 2014 `\pi` Day, 2015 `\pi` Day and 2016 `\pi` Day.