latest news

Distractions and amusements, with a sandwich and coffee.


Numbers are a lot of fun. They can start conversations—the interesting number paradox is a party favourite: every number must be interesting because the first number that wasn't would be very interesting! Of course, in the wrong company they can just as easily end conversations.

I debunk the proof that `\pi = 3` by proving, once and for all, that `\pi` can be any number you like!

Periodically I receive kooky emails from people who claim to know more. Not more than me—which makes me feel great—but more than everybody—which makes me feel suspicious. A veritable fount of crazy is The Great Design Book, Integration of the Cosmic, Atomic & Darmic (Dark Matter) Systems by R.A. Forde.

Look at the margin of error. Archimedes' value for `\pi` (3.14) is an approximation - not an exact value. Would you accept an approximation or errors for your bank account balance? Then, why do you accept it for `\pi`? What else may be wrong? —R.A. Forde

What else may be wrong? Everything!

Here is a "proof" I recently received that π = 3. The main thrust of the proof is that "God said so." QED? Not quite.

Curiously the proof was sent to me as a bitmap.

Given that it claims to show that π has the exact value of 3, it begins reasonably humbly—that I "may find this information ... interesting." Actually, if this were true, I would find this information *staggering*.

Because mathematics is the language of physical reality, there's only so far you can go with wrong math. If you build something based on wrong math, it will break.

Given that math is axiomatic and not falsifiable, its arguments are a kind of argument from authority—the authority of the axioms. You must accept the axioms for the rest to make sense.

Religion also makes its arguments from authority—a kind of divine authority by proxy—though its "axioms" are nowhere near as compelling nor its conclusions as useful. Normally, the deception in religion's arguments from authority is not obvious. The arguments have been inoculated over time—ambiguity, hedging and the appeal to faith—to be immune to criticism.

When these arguments include demonstrably incorrect math, the curtain falls. The stage, props and other machinery of the scheme become apparent. Here you can see this machinery in action. Or, should I say, inaction.

If you're 5 years old: (1) draw a reasonably good circle, (2) lay a piece of string along the circle and measure the length of the string (circumference), (3) measure the diameter of the circle, (4) divide the circumference by the diameter. You should get a value close to the actual value of `\pi \approx 3.14`. If you're older, read on.
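If you'd rather delegate the arithmetic, the same experiment fits in a few lines of Python. The measurements below are made up for illustration—use your own circle.

```python
# Estimate pi the string-and-ruler way (hypothetical measurements).
circumference_cm = 25.9  # length of string laid once around the circle
diameter_cm = 8.2        # measured straight across the widest point

print(circumference_cm / diameter_cm)  # ~3.16, close to 3.14
```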

The book purports to offer "real" (why the quotes?) life experiments to demonstrate that `\pi` is 3. I'll take a look at one below, since it makes use of a coffee cup and I don't like to see coffee cups besmirched by hucksterish claims.

What appears below is a critique of a wrong proof. It constitutes the right proof of the fact that the original proof is wrong. It is not a proof that `\pi = 3`!

The proof begins with some horrendous notation. But, since notation has never killed anyone (though frustration is a kind of death, of patience), let's go with it. We're asked to consider the following equation, which is used by the proof to show that `\pi = 3`. $$ \sin^{-1} \Delta \theta^c = \frac{\pi}{6} \frac{\theta^{\circ}}{y}\tag{1} $$

where $$ \begin{array}{lll} \Delta \theta^c = \frac{2\pi}{12} & \theta^{\circ} = \frac{360^\circ}{12} & y = \frac{1}{2} \end{array} $$

At this point you might already suspect that we're asked to consider a statement which is an **inequality**. The proof might as well have started by saying "We will use `6 = 2\pi` to show that `\pi = 3`." In fact, this is the exact approach I use below to prove that `\pi` can be any number. But let's continue examining the proof.

Nothing so simple as equation (1) should look so complicated. Let's clean it up a little bit. $$ \sin^{-1} a = \tfrac{\pi}{3} b\tag{2} $$

where $$ \begin{array}{ll} a = \frac{2\pi}{12} & b = \frac{360^\circ}{12} \end{array} $$

The fact that we're being asked to take the inverse sine of a quantity that is explicitly indicated to be an angle should make you suspicious. An angle is a dimensionless quantity, so we can write $$ \sin(\pi \; \text{rad}) = \sin(\pi) = 0 $$

but using an angle as the argument to `\sin^{-1}()`, which takes a ratio and returns an angle, suggests that we don't actually know what the function does.

If we go back to (2) and substitute the values we're being asked to use, $$ \sin^{-1} \tfrac{\pi}{6} = \tfrac{\pi}{3} \times 30 = 10 \pi \tag{3} $$

we get $$ 0.551 = 31.416 \tag{4} $$

That's as good an inequality as you're going to get. An ounce of reason would be enough for us to stop here, backtrack and find our error. Short of that, we press ahead to see how we can manipulate this to our advantage.

In the next step, the proof treats the left-hand side as a quantity in radians—a completely bogus step, but let's go with it—and converts it to degrees to obtain $$ 0.551 \times \tfrac{360}{2 \pi} = 31.574 $$

Yes, we just multiplied only one side of equation (4) by a value that is not one. Sigh.
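Here's a quick numerical check of equations (3) and (4), including the one-sided conversion, in a couple of lines of Python:

```python
import math

# Equation (3): arcsin(pi/6) on the left, 10*pi on the right.
lhs = math.asin(math.pi / 6)   # 0.5511 (radians)
rhs = (math.pi / 3) * 30       # 31.4159

print(f"{lhs:.3f} vs {rhs:.3f}")                # 0.551 vs 31.416 -- equation (4)

# The proof's move: convert ONLY the left-hand side to degrees.
print(f"{math.degrees(lhs):.3f} vs {rhs:.3f}")  # 31.574 vs 31.416 -- still unequal
```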

After committing this crime, the proof attempts to shock you into confusion by stating that $$ 31.574 \neq 31.416 $$

And, given that these numbers aren't the same—they weren't the same in equation (4) either, so the additional bogus multiplication by `\tfrac{360}{2 \pi}` wasn't actually needed—the proof states that this inequality must be due to the fact that we used the wrong value of `\pi` in equation (1).

The proof fails to distinguish between an incorrect identity (e.g. `1 = 2` is not correct) and the concept of a variable (e.g. `1 = 2x` may be correct, depending on the value of `x`). Guided by the dim headlamp of unreason, it suggests that if we right our delusion that `\pi = 3.1415...` and instead use `\pi = 3` in equation (1), we get $$ \sin^{-1} \tfrac{1}{2} = 30 $$

which is true, because `\sin(30^\circ) = \tfrac{1}{2}`. Therefore, `\pi = 3`.
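That one true statement in the proof is easy to check:

```python
import math

# arcsin(1/2) expressed in degrees is 30, up to floating point.
print(math.degrees(math.asin(1 / 2)))  # ~30.0
```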

The entire proof is bogus because it starts with an equality that is not true. In equation (1), the left-hand side is not equal to the right-hand side.

To illustrate explicitly what just happened, here's a proof that `\pi = 4` using the exact same approach.

Consider the equation, $$ 4 = \pi \tag{5} $$

if we substitute the conventionally accepted value of `\pi` we find $$ 4 = 3.1415... $$

which isn't true! But if we use `\pi = 4` then $$ 4 = 4 $$

which is true! Therefore, `\pi = 4`. QED.

This only demonstrates that I'm an idiot, not that `\pi = 4`.

But why stop at 4? Everyone can have their own value of `\pi`. In equation (5) of the above "proof", replace 4 with any number you like and use the same steps to prove that `\pi` is that number.
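In fact, the whole method fits in a three-line Python function. A joke, to be clear, not mathematics:

```python
def prove_pi_equals(x):
    """'Prove' pi = x by the book's method: declare it, then admire the agreement."""
    pi = x           # equation (5), generalized
    return pi == x   # both sides now match -- "QED"

print(prove_pi_equals(3))    # True
print(prove_pi_equals(4))    # True
print(prove_pi_equals(-17))  # True -- everyone gets their own pi
```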

Isn't misunderstanding math fun?

The history of the value of `\pi` is rich. There is good evidence for `\pi = (16/9)^2` in the Egyptian Rhind Papyrus (circa 1650 BC). Archimedes (287–212 BC) estimated `\pi \approx 3.1418` using the inequality `\tfrac{223}{71} \lt \pi \lt \tfrac{22}{7}`.

One thing is certain: the precision to which the number is known is always increasing. At this point, it stands at about 12 trillion digits.

So it might seem that `\pi \approx 3` is ancient history. Not to some.

Approximations are fantastic—they allow us to get the job done early. We use the best knowledge available to us today to solve today's problems. Tomorrow's problems might require tomorrow's knowledge—an improvement in the approximations of today.

`\pi = 3` is an approximation that is about 2,000 years old (not the best of its time, either). It's comical to consider it as today's best knowledge.

One of the "real" life experiments proposed in the book (pp. 65-68) uses a coffee cup. The experiment is a great example in failing to identify your wrong assumptions.

First you take measurements of your coffee cup. The author finds that the inner radius is `r = 4 \; \mathrm{cm}` and the depth is `d = 8.6 \; \mathrm{cm}`. Using the volume of a cylinder, `V = \pi r^2 d`, the author finds that the volume is either `412.8 \; \mathrm{cm}^3 = 14.0 \; \mathrm{fl.oz.}` if `\pi = 3` or `432.3 \; \mathrm{cm}^3 = 14.6 \; \mathrm{fl.oz.}` if `\pi = 3.14...`.
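You can reproduce the author's arithmetic in a few lines of Python (the conversion assumes US fluid ounces, 29.5735 cm³ each):

```python
import math

ML_PER_FLOZ = 29.5735  # one US fluid ounce in cm^3

r, depth = 4.0, 8.6    # cm, as measured by the author
for pi in (3.0, math.pi):
    v = pi * r**2 * depth  # volume of a cylinder
    print(f"pi = {pi:.5f}: {v:.1f} cm^3 = {v / ML_PER_FLOZ:.1f} fl.oz.")
# pi = 3.00000: 412.8 cm^3 = 14.0 fl.oz.
# pi = 3.14159: 432.3 cm^3 = 14.6 fl.oz.
```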

You're next instructed to fill a measuring cup to 14.6 fl.oz. (good luck there, since measuring cups usually come in 1/2 cup (4 fl.oz.) or 1/3 cup (2.7 fl.oz.) increments).

The author supposedly does this and finds that he could fill the cup to the brim using only 13.7 fl.oz, with the remaining 0.9 fl.oz. spilling.

And now, for some reason, he concludes that this is proof that `\pi = 3`, despite the fact that, using this value of `\pi`, the cup's volume was calculated to be 14.0 fl.oz., not 13.7 fl.oz.

Sloppiness aside, the most likely explanation is that the original assumption, that the inside of the coffee cup is a perfect cylinder, is wrong. The inside of the cup is probably smooth and perhaps even slightly tapered. Using the maximum radius and depth will yield a volume larger than the cup's actual capacity. This is why water spilled out.
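To see how little taper it takes, here is a minimal sketch that models the cup as a frustum (truncated cone) instead of a cylinder. The 3.75 cm base radius is hypothetical, chosen only to show that a 2.5 mm taper is enough to reproduce the author's 13.7 fl.oz.:

```python
import math

ML_PER_FLOZ = 29.5735  # one US fluid ounce in cm^3

# Frustum volume: V = pi*h/3 * (R^2 + R*r + r^2).
R, r, h = 4.0, 3.75, 8.6  # rim radius, hypothetical base radius, depth (cm)
v = math.pi * h / 3 * (R**2 + R * r + r**2)
print(f"{v:.1f} cm^3 = {v / ML_PER_FLOZ:.1f} fl.oz.")  # 405.8 cm^3 = 13.7 fl.oz.
```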

We discuss the many ways in which analysis can be confounded when data has a large number of dimensions (variables). Collectively, these are called the "curses of dimensionality".

Some of these are unintuitive, such as the fact that the volume of a hypersphere of fixed radius grows with dimension only up to a point (for a unit radius, it peaks at five dimensions) and then shrinks towards zero, while the volume of the hypercube always increases. This means that high-dimensional space is "mostly corners" and the distance between points increases greatly with dimension. This has consequences for correlation and classification.
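As a quick illustration (not the column's own code), the hypersphere volume follows directly from the Gamma-function formula:

```python
import math

# Volume of an n-dimensional hypersphere of radius R:
#   V_n = pi^(n/2) * R^n / Gamma(n/2 + 1)
def hypersphere_volume(n, R=1.0):
    return math.pi ** (n / 2) * R**n / math.gamma(n / 2 + 1)

for n in range(1, 11):
    print(n, round(hypersphere_volume(n), 3))
# For R = 1 the volume peaks at n = 5, then shrinks towards zero, while a
# hypercube of side s > 1 always grows as s^n.
```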

Altman, N. & Krzywinski, M. (2018) Points of Significance: Curse(s) of dimensionality. Nature Methods 15:399–400.

Inference creates a mathematical model of the data-generation process to formalize understanding or test a hypothesis about how the system behaves. Prediction aims at forecasting unobserved outcomes or future behavior. Typically we want to do both: know how biological processes work and predict what will happen next. Inference and ML are complementary in pointing us to biologically meaningful conclusions.

Statistics asks us to choose a model that incorporates our knowledge of the system, and ML requires us to choose a predictive algorithm by relying on its empirical capabilities. Justification for an inference model typically rests on whether we feel it adequately captures the essence of the system. The choice of pattern-learning algorithms often depends on measures of past performance in similar scenarios.

Bzdok, D., Krzywinski, M. & Altman, N. (2018) Points of Significance: Statistics vs machine learning. Nature Methods 15:233–234.

Bzdok, D., Krzywinski, M. & Altman, N. (2017) Points of Significance: Machine learning: a primer. Nature Methods 14:1119–1120.

Bzdok, D., Krzywinski, M. & Altman, N. (2018) Points of Significance: Machine learning: supervised methods. Nature Methods 15:5–6.

Celebrate `\pi` Day (March 14th) and go to brand new places. Together with Jake Lever, this year we shrink the world and play with road maps.

Streets from across the world are seamlessly joined together. Finally, a halva shop on the same block!

Intriguing and personal patterns of urban development for each city appear in the Boonies, Burbs and Boutiques series.

No color—just lines. Lines from Marrakesh, Prague, Istanbul, Nice and other destinations for the mind and the heart.

The art is featured in the Pi City post on the Scientific American SA Visual blog.

Check out art from previous years: 2013 `\pi` Day, 2014 `\pi` Day, 2015 `\pi` Day, 2016 `\pi` Day and 2017 `\pi` Day.

We examine two very common supervised machine learning methods: linear support vector machines (SVM) and k-nearest neighbors (kNN).

SVM is often less computationally demanding than kNN and is easier to interpret, but it can identify only a limited set of patterns. On the other hand, kNN can find very complex patterns, but its output is more challenging to interpret.

We illustrate SVM using a data set in which points fall into two categories, which SVM separates with a straight line and its surrounding "margin". SVM can be tuned with a parameter that influences the width and location of the margin, permitting points to fall within the margin or on its wrong side. We then show how kNN relaxes explicit boundary definitions, such as the straight line in SVM, and how kNN too can be tuned to create more robust classification.
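Here is a minimal sketch of the two classifiers and their tuning parameters; scikit-learn and the synthetic two-blob data set are my choices for illustration, not the column's own code:

```python
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import LinearSVC

# Toy data: 200 points in two well-separated categories.
X, y = make_blobs(n_samples=200, centers=2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# C controls the margin: smaller C widens it and tolerates misclassified points.
svm = LinearSVC(C=1.0).fit(X_train, y_train)
# k controls smoothness: larger k averages over more neighbours.
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)

print("SVM accuracy:", svm.score(X_test, y_test))
print("kNN accuracy:", knn.score(X_test, y_test))
```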

Bzdok, D., Krzywinski, M. & Altman, N. (2018) Points of Significance: Machine learning: supervised methods. Nature Methods 15:5–6.

Bzdok, D., Krzywinski, M. & Altman, N. (2017) Points of Significance: Machine learning: a primer. Nature Methods 14:1119–1120.

In a Nature graphics blog article, I present my process behind designing the stark black-and-white Nature 10 cover.

Nature 10, 18 December 2017

In this primer, we focus on essential ML principles—a modeling strategy to let the data speak for themselves, to the extent possible.

The benefits of ML arise from its use of a large number of tuning parameters or weights, which control the algorithm’s complexity and are estimated from the data using numerical optimization. Often ML algorithms are motivated by heuristics such as models of interacting neurons or natural evolution—even if the underlying mechanism of the biological system being studied is substantially different. The utility of ML algorithms is typically assessed empirically by how well extracted patterns generalize to new observations.

We present a data scenario in which we fit a model with 5 predictors using polynomials and show what to expect from ML when noise and sample size vary. We also demonstrate the consequences of excluding an important predictor or including a spurious one.
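Here is a simplified, one-predictor sketch of that scenario (the column itself uses 5 predictors; the true model, noise level and degrees below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def truth(x):                      # the "true" data-generating process
    return 1 + 2 * x - 3 * x**2

x = rng.uniform(-1, 1, 60)
y = truth(x) + rng.normal(0, 0.3, x.size)
x_new = rng.uniform(-1, 1, 1000)   # unseen observations
y_new = truth(x_new) + rng.normal(0, 0.3, x_new.size)

for degree in (1, 2, 10):
    w = np.polyfit(x, y, degree)                      # weights estimated from data
    mse = np.mean((np.polyval(w, x_new) - y_new) ** 2)
    print(f"degree {degree}: test MSE = {mse:.3f}")
# Expect degree 1 to underfit and degree 10 to overfit; the moderate model
# generalizes best to new observations.
```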

Bzdok, D., Krzywinski, M. & Altman, N. (2017) Points of Significance: Machine learning: a primer. Nature Methods 14:1119–1120.