Martin Krzywinski / Genome Sciences Center / mkweb.bcgsc.ca
Poetry is just the evidence of life. If your life is burning well, poetry is just the ash. (Leonard Cohen)

road trips: exciting


EMBO Practical Course: Bioinformatics and Genome Analysis, 5–17 June 2017.



The Google Maps Challenge — Longest Google Maps Driving Routes

Routes updated 13 June 2017.

CHALLENGES

1 — Longest land leg

1,597 km
Sakha, Russia
Magadan, Russia
view route

2 — Longest land route

15,280 km
Sagres, Portugal
Talon, Russia
view route

3 — Longest land route with ferries

26,340 km
Quoin Point, South Africa
Talon, Russia
view route

4 — Longest 3-point route

46,176 km
Shchukozero, Russia
Izingolweni, South Africa
Talon, Russia
view route

other visualizations

Take a road trip to clear your mind. Take in a few sights and bring home a spoon or other collector's item.

According to Google Maps, how far could you go?

And the most pressing question: can you reach the sorrow at the world's end? Yes you can.

the challenges

Each of the challenges below involves finding points A and B that yield the longest driving route in Google Maps. Each challenge has its own parameters, but certain rules apply to all of them.

  • the route A→B must be generated by the Google Maps algorithm—it cannot be manually adjusted
  • the shorter of A→B and B→A must be used
  • when multiple routes are available, the shortest must be used
  • the "avoid highways" and "avoid tolls" options must be off
  • the "no ferry" land route may include a ferry to cross a river or any ferry shorter than 10 km
  • the 3-point route cannot double back along the same path at any point on the route
  • the route must be generated by typing in the waypoints, not by dragging them
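The two distance rules above (use the shortest alternative, then the shorter direction) can be sketched in a few lines. This is only an illustration: the function name `challenge_distance` is hypothetical, and the distances are assumed to be copied by hand from what Google Maps reports, not fetched from any API.

```python
def challenge_distance(forward_km, reverse_km):
    """Score a candidate pair of points under the challenge rules.

    forward_km: distances (km) of every alternative route offered for A->B
    reverse_km: distances (km) of every alternative route offered for B->A
    (Both lists are assumed to be typed in by hand from Google Maps.)
    """
    if not forward_km or not reverse_km:
        raise ValueError("both directions must be routable")
    # Rule: when multiple routes are available, the shortest must be used.
    best_forward = min(forward_km)
    best_reverse = min(reverse_km)
    # Rule: the shorter of A->B and B->A must be used.
    return min(best_forward, best_reverse)


# Distances from the challenge 3 history: v7 (33,634 km) and its reverse, v8 (32,433 km).
print(challenge_distance([33634], [32433]))
```

Scoring the pair this way is why a long route in one direction does not count when the reverse direction is shorter.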

Any solution to the challenge will surely have a shorter route (not available to the routing algorithm) as well as many longer ones (duh, it's always easy to pessimize a route).

Updates to Google Maps are frequent and unannounced, making most of what you read here a historical artefact. I try to periodically make sure that the current challenge routes are at least functional.

This topic has been previously discussed on xkcd forums.

continuous updates to Google Maps — old routes unavailable, records reset

As Google Maps updates the routing network, some of the old routes are no longer available, or are significantly shorter. This maps challenge page may therefore be out of date.

Don't be surprised if links to old routes show a significantly different distance, or point to a route that no longer exists. Such links, or out-dated entries, are annotated as historical.

30,000 km limit broken—again

historical Thanks to a route by Dean Sych, the 30,000 km limit has been broken again. And how! By over 3,600 km. Thanks Dean!

Previously, because of new routes through central Africa, route distances dropped by about 6,000 km. Whereas before routing avoided Congo and Algeria, they now go right through them. This made a lot of the 30,000+ km routes much shorter.

historical The longest route used to be 33,540 km from Quoin Point, South Africa to a dirt road in Indonesia. It starts with "Head northeast toward R317." and ends with "Turn left". After Malaysia, it's mostly ferries. Unfortunately, it's gone. Also, routing in Africa is now more optimized, shortening all the trips from South Africa.

historical (Paren' no longer available as a destination) | The 30,000 km limit has been broken by a route from Paren' to Pearly Beach. This was furthered by discovering the bizarrely remote Chimchememel', Russia (from Chimchememel' to Danger Point).

historical (Uelen still not accessible) | The next milestone for a route with ferries is 32,000 km. Unless you have the money to build a road to Uelen, this new limit is a significant challenge. Interested individuals should start digging immediately.

historical Many of the interesting past routes are now gone—the world is getting boring. This fun 191 hour drive from Portugal to Malaysia, shown below, is no longer available. Too bad. Stop off in Turkey to go to the bathroom. Pick up a few aluminum centrifuge tubes from Iran, too. That sounds like fun.

change for the worse

The details of the routing algorithm are the epitome of Rumsfeld's unknown unknowns. It used to be that Portugal to Malaysia was a very doable and relaxing 8-day drive at 15,787 km. And you managed to avoid China.

Google Maps Challenge - Longest Driving Routes / Martin Krzywinski @MKrzywinski mkweb.bcgsc.ca
Lisbon, Portugal to Pahang, Malaysia takes 7 days 23 hours and 15,797 km (view route)

Later, the route included China but excluded Iran. Apparently, we're no longer on driving terms with Iran. If you want to go from Lisbon to Singapore, you no longer need your Iran for Dummies book.

Still, the updated routing was longer and burned more gas. Prius C drivers with advanced road endurance skills need apply. You know who you are.

Lisbon, Portugal to Pahang, Malaysia takes 8 days 16 hours and 17,482 km (view route)

The 20,000 km all-land route limit still stands. Africa's complex routing may provide a solution. Or a local conflict that severs a road or two.

Blind Spots and Humor

If you play with routes in Google Maps you'll quickly notice that some parts of the world do not appear to be connected to the smarts of the routing algorithm. For example, you could not drive from Beijing to New Delhi (as of 31 Aug 2015 this is no longer true: a 5,559 km route exists). These holes in the driving fabric pose a challenge in finding long routes.

historical (no more kayak routes) | Google's subtle humour can be found everywhere (though apparently not anymore, since kayaking has been removed from maps and replaced by decidedly un-funny flight routes), such as in step 9 of this Seattle to Hawaii route, which states "Kayak across the Pacific Ocean — 4,436 km". If you have endurance training, you might wish to continue kayaking to Tokyo, for another 6,243 km. For the purpose of this challenge, kayaking is not allowed.

These examples amply demonstrate that the internet was created by people with humor and is used largely by people without it. I say this because the kayak option has been removed. Doubtlessly because someone tried it, got hurt and the litigation lawyers got involved. It's always been the case that the 1% ruins it for the 99% of us.

Darién Gap

No routes from North to South America exist because of this boggy marsh.

Routing Changes

As new routes become available, long trips become shorter. For example, the introduction of a route across Niger and Algeria cut the original record for the land route with no ferries from 18,260 km to a mere 15,576 km.

Other Google Data Munging and Visualizations

If you are interested in visualization and information, explore my global visualization of Google searches by language and find out where in the US people are searching in Chinese.

For the morbidly curious, of interest might be all the really stupid questions people ask Google.

mobile users

Google Maps routes linked to from this page do not appear to work on iPhone's Safari browser (I have not tested iPad or iPod). A "driving direction not found" error appears. Rest assured, these routes do exist, and can be viewed on a browser on a PC or Mac. Weird.

The routes below are the current answers to the challenge. Do you have a better (longer!) route? Let me know.

starting points

A popular starting point for many routes is Quoin Point in South Africa. (view route)

terminus station

Talon is at the end of R481, which goes from Magadan to Talon. This road forks off the P-504, which runs for about 2,000 km from Manday to Magadan.

All roads lead to Talon near Magadan in Russia. Good luck. (view route)

CHALLENGE 1 — Longest Land Leg

The longest land leg is a route along the R504 between the east dock of the a/d Kolyma M56 ferry and the gulag town Magadan, known as the sorrow at the world's end.

Sakha Republic to Magadan, Russia. Just over 21 hours and 1,597 km. (view route)

Google Maps has changed how it reports the details about the trip. For example, when you cross a border, this appears as a step. At times, the direction details include a "continue onto..." step at a point where no other driving option is available. The route above, for example, is described as

(from) P-504 Keskil, Sakha (Yakutiya) Republits, Russia, 678728

  1. Head east on Р-504 (811 km)
  2. Continue onto Р-504 (786 km)

(to) P-504, 724, Magadanskaya oblast', Russia, 685030

Dark magic within Google Maps routing reports the beginning of a new leg in the middle of nowhere, despite the lack of choice. (view route)

I nevertheless consider it a one-leg trip.

challenge 1 version history

These routes may no longer be available in Google Maps or their routing may have changed.

v5 route Sakha Republic - Magadan 1,601 km (+44 km)

Longest land leg route entry (v5). Sakha Republic in Russia to Magadanskaya Oblast in Russia (1,601 km). (view route)

v4 route Haines Junction - Dawson Creek 1,557 km (+25 km)

Longest land leg route entry (v4). Haines Junction in Canada to Dawson Creek in Canada (1,557 km). (view route)

v3 route Haines Junction - Farmington 1,532 km (+32 km) (submitted by David Jackson)

Longest land leg route entry (v3). Haines Junction in Canada to Farmington in Canada (1,532 km). (view route)

v2 route Haines Junction - South Taylor 1,500 km (+25 km)

Longest land leg route entry (v2). Haines Junction in Canada to South Taylor in Canada (1,500 km). (view route)

v1 route Haines Junction - Charlie Lake 1,475 km

Longest land leg route entry (v1). Haines Junction in Canada to Charlie Lake in Canada (1,475 km). (view route)

CHALLENGE 2 — Longest Land Route

The longest Google Maps route that does not use ferries takes us from Sagres in Portugal to Khasan in Russia. Thanks to Pieter Vandromme for submitting this route.

(13 Jun 2017) I just noticed that there's now a 9.4 km ferry in this route, which crosses the Aldan River. Technically, this disqualifies the route. Instead, I'll change the rules to allow a ferry that crosses a river and is no more than 10 km :)

Outside Sagres in Portugal to Talon, Russia. This trip is 184 hours and 15,280 km. (view route)

challenge 2 version history

v7 route Sagres in Portugal to Khasan, Russia 14,069 km. This route is now significantly shorter (13,322 km).

Longest land route entry (v7). Sagres in Portugal to Khasan, Primorsky Krai in Russia (14,096 km). (view route)

v6 route Sagres in Portugal to Tanjung Pengelih Pengerang Johor Malaysia 16,280 km. Thanks to Jørgen Adam Holen for pointing out a better destination in Malaysia. (Google Map route no longer available)

Longest land route entry (v6). Sagres in Portugal to Tanjung Pengelih Pengerang Johor in Malaysia (16,280 km). (view route)

v5 route Duyker Eiland, South Africa - Sidi Bettache, Morocco 15,594 km (+18 km) (Duyker Eiland submitted by David Jackson on xkcd) (Google map route, 12,523 km on 26 Nov 2014, significantly shorter with routing through Congo)

Longest land route entry (v5). Duyker Eiland in South Africa to Sidi Bettache in Morocco (15,594 km). (view route)

v4 route Pearly Beach, South Africa - Sidi Bettache, Morocco 18,260 km (+84 km) (Google map route 12,673 km on 26 Nov 2014)

Longest land route entry (v4). Pearly Beach in South Africa to Sidi Bettache in Morocco (18,260 km). (view route)

v3 route Pearly Beach, South Africa - Casablanca, Morocco, 18,176 km (+2,180 km) (submitted by ElWanderer via xkcd) (Google map route 12,708 km on 24 Nov 2014)

v2 route Gibraltar - Paren', Russia 15,996 km (+602 km) (submitted by ElWanderer via xkcd) (Google map route no longer exists on 6 Jun 2013)

Longest land route entry (v2). Gibraltar to Paren' in Russia (15,996 km). (view route)

v1 route Gibraltar - Magadan, Russia 15,394 km (Google map route now 15,051 km on 26 Nov 2014 and now includes a 9.4 km ferry (a/d Kolyma/M56))

Longest land route entry (v1). Gibraltar to Magadan in Russia (15,394 km). (view route)

CHALLENGE 3—Longest Land Route with Ferries

The longest Google Maps A–B route that uses ferries. The ferry distance cannot be more than 25% of the entire trip.
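The 25% ferry limit is simple arithmetic over the legs of a route. A minimal sketch follows, assuming each leg is recorded by hand as a distance plus a ferry flag; the function names and the example leg distances are hypothetical and are not taken from any real route.

```python
def ferry_fraction(legs):
    """Return the fraction of the trip distance covered by ferry.

    legs: list of (distance_km, is_ferry) tuples for one route
    (a hypothetical hand-entered format, not a Google Maps export).
    """
    total = sum(distance for distance, _ in legs)
    if total <= 0:
        raise ValueError("route must have a positive total distance")
    ferry = sum(distance for distance, is_ferry in legs if is_ferry)
    return ferry / total


def qualifies_with_ferries(legs, max_fraction=0.25):
    """True if ferries make up no more than max_fraction of the trip."""
    return ferry_fraction(legs) <= max_fraction


# Made-up legs for illustration: two land legs and one ferry leg.
example = [(20000, False), (5000, True), (1320, False)]
print(qualifies_with_ferries(example))
```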

Sadly, it appears you can no longer drive from Quoin Point to Indonesia. So, Talon in the Magadanskaya oblast' is back as everyone's favourite gulag town destination—truly the last stop for sadness. To make matters worse, this route is now only 26,320 km, which is shorter than the original entry for this challenge. Sad!

Longest land route with ferries. Quoin Point in South Africa to Talon in Russia. The trip takes 359 hours and covers 26,320 km. (view route)

challenge 3 version history

v12 route (7 Sep 2015). This route takes you from Quoin Point on Cape Agulhas to the edge of Indonesia, where there are things to do.

Longest land route with ferries entry (v12). Serai, Indonesia to Quoin Point, South Africa. The trip takes 529 hours (22 days) and covers 33,612 km. (view route)

For some reason this excellent route was dropped by Google Maps. Its creator, Christopher Camitta, points out that if you assist the routing algorithm by adding a mid-point in Pakistan, the route can be recovered.

Longest land route with ferries entry (v12-mod). Smitswinkel Bay, Cape Town, South Africa to Konotikero, Sakron Tane, Tor Atas, Sarmi Regency, Papua, Indonesia via Ghawindi Border, Ghawindi, Lahore 54030, Pakistan. The trip takes 655 hours and covers 27,402 km. (view route)

v11 route (7 Sep 2013) Quoin Point, South Africa to Talon, Russia 27,224 km (Google map route). This route takes you from Quoin Point on Cape Agulhas to the edge of nowhere in Russia. During your voyage, you experience the endless visual monotony of sprawling Africa (keeping a wide berth from Angola), only to end up at the sorrow at the world's end: Magadan, Russia's gulag region.

Longest land route with ferries entry (v11). Quoin Point in South Africa to Ola, Magadan Oblast in Russia (27,224 km). (view route)

v10 route (18 Jul 2013) Quoin Point, South Africa to Ola, Russia 24,201 km (Google map route). Thanks to Pieter Vandromme for submitting this route.

Longest land route with ferries entry (v10). Quoin Point in South Africa to Ola, Magadan Oblast in Russia (24,201 km). (view route)

v9 route (18 Jul 2013) Quoin Point, South Africa to Unknown Road, Indonesia 33,540 km (Google map route no longer available). Map geek Earl Higgins found this route after invalidating v7. The 33,000 km barrier is again, and legitimately, broken.

Longest land route with ferries entry (v9). Quoin Point in South Africa to the eerie Unknown Road in Indonesia (33,540 km). (view route)

v8 route (16 Jul 2013) Unknown Road, Indonesia to Groot Paternoster Reserve, South Africa 32,433 km (Google map route no longer available). Map geek Earl Higgins pointed out that the reverse of the v7 route is significantly shorter. This means that the 33,000 km barrier has, in fact, not been broken. Pity.

Longest land route with ferries entry (v8). Unknown Road in Indonesia to Groot Paternoster Reserve in South Africa (32,433 km). (view route)

v7 route (6 Jun 2013) Groot Paternoster Reserve, South Africa to Unknown Road, Indonesia 33,634 km (Google map route no longer available, submitted by Sue DoNym on xkcd). (33,557 km on 16 Jul 2013; the reverse of the route is significantly shorter, see v8)

v6 route Chimchememel' Russia - Duyker Eiland, South Africa 31,766 km (+626 km) (Google map route no longer exists, submitted by David Jackson on xkcd)

Longest land route with ferries entry (v6). Chimchememel' in Russia to Duyker Eiland in South Africa (31,766 km). (view route)

v5 route Chimchememel' Russia - Pearly Beach, South Africa 31,140 km (+15 km) (Google map route no longer exists)

Longest land route with ferries entry (v5). Chimchememel' in Russia to Pearly Beach in South Africa (31,140 km). (view route)

v4 route Chimchememel' Russia - Danger Point, South Africa 31,125 km (+692 km) (Google map route no longer exists, submitted by nerd-7hi+42e via xkcd)

v3 route Pearly Beach - Paren', Russia 30,433 km (+602 km) (Google map route no longer exists, submitted by ElWanderer via xkcd)

Longest land route with ferries entry (v3). Pearly Beach in South Africa to Paren' in Russia (30,433 km). (view route)

v2 route Pearly Beach - Magadan 29,831 km (+66 km) (Google map route, 24,191 km on 26 Nov 2014)

Longest land route with ferries entry (v2). Pearly Beach in South Africa to Magadan in Russia (29,831 km). (view route)

v1 route Bredasdorp - Magadan 29,765 km (Google map route, 24,125 km on 26 Nov 2014)

Longest land route with ferries entry (v1). Bredasdorp in South Africa to Magadan in Russia (29,765 km). (view route)

CHALLENGE 4—Longest 3-Point Land Route

Thanks to Nico Tsaousis for suggesting the format of this challenge. He came up with the first entry, a 33,156 km route from an Unnamed Road in South Africa to Magadan via Norway.

We're now up to 46,176 km. This route requires that you select "avoid ferries". What makes this route unique is that it starts and ends in the same country.

The 3-point route must be constructed by typing in the waypoints, not dragging them.

Longest 3-point route. Shchukozero in Russia via Izingolweni in South Africa to Talon in Russia (46,176 km). (view route)

v2 route Mpahlane in South Africa via Riksveg in Norway to Talon in Russia (31,377 km).

Longest 3-point route (v2). Mpahlane in South Africa via Riksveg in Norway to Talon in Russia (31,377 km). (view route)
Small differences in starting positions can make a large difference in the overall route. (A) Starting at Mpahlane gives a north route through Durban for a total trip of 31,377 km. (B) Starting in Xolobeni, only 63 km south of Mpahlane, the route passes south of Lesotho and shaves off 110 km to create a 31,267 km route. (view route)

v1 route Unnamed road in South Africa to Talon in Russia (Google map route, 33,156 km on 14 Jun 2017). Because this route starts at an "unnamed road" rather than a specific address, I'm retiring it and replacing it by one that starts at Mpahlane in South Africa.

Longest 3-point route (v1) by Nico Tsaousis. Unnamed Road in South Africa via Riksveg in Norway to Talon in Russia (33,156 km). (view route)

news + thoughts

Classification and regression trees

Fri 28-07-2017
Decision trees are a powerful but simple prediction method.

Decision trees classify data by splitting it along the predictor axes into partitions with homogeneous values of the dependent variable. Unlike logistic or linear regression, CART does not develop a prediction equation. Instead, data are predicted by a series of binary decisions based on the boundaries of the splits. Decision trees are very effective and the resulting rules are readily interpreted.

Trees can be built using different metrics that measure how well the splits divide up the data classes: Gini index, entropy or misclassification error.
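The three split metrics named above have short textbook definitions. The sketch below computes them from a list of class labels; the function names are illustrative, the labels are assumed to be a non-empty list, and the size-weighted average used to score a split is a common convention rather than something stated in the column summary.

```python
from collections import Counter
from math import log2


def class_proportions(labels):
    """Proportion of each class present in a non-empty list of labels."""
    counts = Counter(labels)
    n = len(labels)
    return [count / n for count in counts.values()]


def gini(labels):
    """Gini index: 1 minus the sum of squared class proportions."""
    return 1.0 - sum(p * p for p in class_proportions(labels))


def entropy(labels):
    """Entropy in bits: -sum of p * log2(p) over the classes present."""
    return -sum(p * log2(p) for p in class_proportions(labels))


def misclassification(labels):
    """Misclassification error: 1 minus the largest class proportion."""
    return 1.0 - max(class_proportions(labels))


def split_impurity(left, right, metric):
    """Size-weighted impurity of a binary split (lower is better)."""
    n = len(left) + len(right)
    return len(left) / n * metric(left) + len(right) / n * metric(right)
```

All three metrics are zero for a partition that contains a single class and largest when the classes are evenly mixed.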

Nature Methods Points of Significance column: Classification and decision trees. (read)

When the dependent variable is quantitative rather than categorical, regression trees are used. Here, the data are still split, but now the dependent variable is estimated by the average within the split boundaries. Tree growth can be controlled using the complexity parameter, a measure of the relative improvement of each new split.
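To illustrate estimating by the average within a split, here is a sketch of a single regression split on one predictor. It assumes the split is chosen to minimize the total sum of squared errors, which is a common criterion but is not spelled out above; the function names are hypothetical and the complexity parameter is not modelled.

```python
def mean(values):
    return sum(values) / len(values)


def sse(values):
    """Sum of squared deviations from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(xs, ys):
    """Find the one threshold on x that minimizes total SSE of y.

    Returns (threshold, left_mean, right_mean); each side of the split
    predicts the average of the y values that fall in it.
    """
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no boundary can separate equal x values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (cost, threshold, mean(left), mean(right))
    if best is None:
        raise ValueError("need at least two distinct x values")
    return best[1:]
```

A full tree would apply `best_split` recursively to each side until further splits stop improving the fit enough.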

Individual trees can be very sensitive to minor changes in the data and even better prediction can be achieved by exploiting this variability. Using ensemble methods, we can grow multiple trees from the same data.

Krzywinski, M. & Altman, N. (2017) Points of Significance: Classification and regression trees. Nature Methods 14:757–758.

Background reading

Lever, J., Krzywinski, M. & Altman, N. (2016) Points of Significance: Logistic regression. Nature Methods 13:541-542.

Altman, N. & Krzywinski, M. (2015) Points of Significance: Multiple Linear Regression Nature Methods 12:1103-1104.

Lever, J., Krzywinski, M. & Altman, N. (2016) Points of Significance: Classifier evaluation. Nature Methods 13:603-604.

Lever, J., Krzywinski, M. & Altman, N. (2016) Points of Significance: Model Selection and Overfitting. Nature Methods 13:703-704.

Lever, J., Krzywinski, M. & Altman, N. (2016) Points of Significance: Regularization. Nature Methods 13:803-804.

...more about the Points of Significance column

Personalized Oncogenomics Program 5 Year Anniversary Art

Wed 26-07-2017

The artwork was created in collaboration with my colleagues at the Genome Sciences Center to celebrate the 5 year anniversary of the Personalized Oncogenomics Program (POG).

5 Years of Personalized Oncogenomics Program at Canada's Michael Smith Genome Sciences Centre. The poster shows 545 cancer cases. (left) Cases ordered chronologically by case number. (right) Cases grouped by diagnosis (tissue type) and then by similarity within group.

The Personalized Oncogenomics Program (POG) is a collaborative research study including many BC Cancer Agency oncologists, pathologists and other clinicians along with Canada's Michael Smith Genome Sciences Centre with support from BC Cancer Foundation.

The aim of the program is to sequence, analyze and compare the genome of each patient's cancer (the entire DNA and RNA inside tumor cells) in order to understand what is enabling it and to identify less toxic and more effective treatment options.

Principal component analysis

Thu 06-07-2017
PCA helps you interpret your data, but it will not always find the important patterns.

Principal component analysis (PCA) simplifies the complexity in high-dimensional data by reducing its number of dimensions.

Nature Methods Points of Significance column: Principal component analysis. (read)

To retain trends and patterns in the reduced representation, PCA finds linear combinations of canonical dimensions that maximize the variance of the projection of the data.

PCA is helpful in visualizing high-dimensional data and scatter plots based on 2-dimensional PCA can reveal clusters.
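The idea of a direction that maximizes the variance of the projection can be sketched without any libraries. The example below finds only the first principal component, using power iteration on the sample covariance matrix; that method is my choice for a short dependency-free illustration, not the procedure described in the column, and the function name is hypothetical.

```python
import random


def first_principal_component(rows, iterations=200, seed=0):
    """Return (direction, variance) of the first principal component.

    rows: list of equal-length numeric tuples, one per observation.
    Uses power iteration on the sample covariance matrix (an assumption
    made for brevity; real analyses use a full eigendecomposition or SVD).
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # A seeded random start makes it very unlikely that the start vector
    # is exactly orthogonal to the leading component.
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0:
            raise ValueError("data have no variance along the start vector")
        v = [x / norm for x in w]
    # Variance of the data projected onto v is v^T * cov * v.
    variance = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
    return v, variance
```

The sign of the returned direction is arbitrary: a component and its negation describe the same axis.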

Altman, N. & Krzywinski, M. (2017) Points of Significance: Principal component analysis. Nature Methods 14:641–642.

Background reading

Altman, N. & Krzywinski, M. (2017) Points of Significance: Clustering. Nature Methods 14:545–546.

...more about the Points of Significance column

`k` index: a weightlifting and Crossfit performance measure

Wed 07-06-2017

Similar to the `h` index in publishing, the `k` index is a measure of fitness performance.

To achieve a `k` index for a movement you must perform `k` unbroken reps at `k`% 1RM.

The expected value for the `k` index is probably somewhere in the range of `k = 26` to `k = 35`, with higher values progressively more difficult to achieve.
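The definition can be turned into a small calculation over a training log. This sketch assumes that `k` is a whole number and that a set with more reps, or a heavier weight, than a given `k` requires still counts toward it; both are my reading of the one-line definition, and the function name and log format are hypothetical.

```python
from math import floor


def k_index(one_rep_max, sets):
    """Largest k such that some set has at least k unbroken reps
    at a weight of at least k% of the one-rep max.

    sets: list of (weight, unbroken_reps) for a single movement,
    with weight in the same units as one_rep_max.
    """
    if one_rep_max <= 0:
        raise ValueError("one_rep_max must be positive")
    best = 0
    for weight, reps in sets:
        percent = 100.0 * weight / one_rep_max
        # A set demonstrates every k up to min(reps, percent of 1RM).
        # The small epsilon guards against floating-point round-off.
        best = max(best, min(reps, floor(percent + 1e-9)))
    return best


# 30 unbroken reps at 30 out of a 100 one-rep max gives k = 30.
print(k_index(100, [(30, 30)]))
```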

In my `k` index introduction article I provide a detailed explanation, a rep scheme table and a WOD example.

Dark Matter of the English Language—the unwords

Wed 07-06-2017

I've applied the char-rnn recurrent neural network to generate new words, names of drugs and countries.

The effect is intriguing and facetious—yes, those are real words.

But these are not: necronology, abobionalism, gabdologist, and nonerify.

These places only exist in the mind: Conchar and Pobacia, Hzuuland, New Kain, Rabibus and Megee Islands, Sentip and Sitina, Sinistan and Urzenia.

And these are the imaginary afflictions of the imagination: ictophobia, myconomascophobia, and talmatomania.

And these, of the body: ophalosis, icabulosis, mediatopathy and bellotalgia.

Want to name your baby? Or someone else's baby? Try Ginavietta Xilly Anganelel or Ferandulde Hommanloco Kictortick.

When taking new therapeutics, never mix salivac and labromine. And don't forget that abadarone is best taken on an empty stomach.

And nothing increases the chance of getting that grant funded more than proposing the study of a new –ome! We really need someone to look into the femome and manome.

Dark Matter of the Genome—the nullomers

Wed 31-05-2017

An exploration of things that are missing in the human genome. The nullomers.

Julia Herold, Stefan Kurtz and Robert Giegerich. Efficient computation of absent words in genomic sequences. BMC Bioinformatics (2008) 9:167
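To make the idea of a missing word concrete, here is a brute-force sketch that lists every word of length `k` that never occurs in a sequence. It enumerates all `4^k` candidates, so it is only practical for small `k`; it is not the efficient algorithm of the paper cited above, and it ignores reverse complements and ambiguous bases. The function name is hypothetical.

```python
from itertools import product


def absent_words(sequence, k, alphabet="ACGT"):
    """Return all length-k words over the alphabet missing from sequence."""
    sequence = sequence.upper()
    # Collect every length-k substring that does occur.
    present = {sequence[i:i + k] for i in range(len(sequence) - k + 1)}
    # Enumerate all possible words and keep those never seen.
    candidates = ("".join(letters) for letters in product(alphabet, repeat=k))
    return [word for word in candidates if word not in present]


# Toy sequence for illustration only.
print(absent_words("AACCGG", 1))
```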

Clustering

Sat 01-07-2017
Clustering finds patterns in data—whether they are there or not.

We've already seen how data can be grouped into classes in our series on classifiers. In this column, we look at how data can be grouped by similarity in an unsupervised way.

Nature Methods Points of Significance column: Clustering. (read)

We look at two common clustering approaches: `k`-means and hierarchical clustering. All clustering methods share the same approach: they first calculate similarity and then use it to group objects into clusters. The details of the methods, and outputs, vary widely.
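The two steps shared by clustering methods (measure similarity, then group) show up directly in `k`-means. The sketch below is a plain version of Lloyd's algorithm with squared Euclidean distance as the similarity measure; the starting centers are supplied by the caller, which is a simplification, and the function names are illustrative.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, centers, max_iterations=100):
    """Group points around the given starting centers.

    Returns (labels, centers), where labels[i] is the index of the
    center that points[i] was assigned to.
    """
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(max_iterations):
        # Step 1: similarity. Assign each point to its nearest center.
        labels = [min(range(len(centers)),
                      key=lambda j: squared_distance(p, centers[j]))
                  for p in points]
        # Step 2: grouping. Move each center to the mean of its members.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                new_centers.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centers.append(old)  # keep a center that lost all points
        if new_centers == centers:
            break  # assignments have stabilized
        centers = new_centers
    return labels, centers
```

Different starting centers can give different clusters, which is one reason the outputs of clustering methods vary.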

Altman, N. & Krzywinski, M. (2017) Points of Significance: Clustering. Nature Methods 14:545–546.

Background reading

Lever, J., Krzywinski, M. & Altman, N. (2016) Points of Significance: Logistic regression. Nature Methods 13:541-542.

Lever, J., Krzywinski, M. & Altman, N. (2016) Points of Significance: Classifier evaluation. Nature Methods 13:603-604.

...more about the Points of Significance column

What's wrong with pie charts?

Thu 25-05-2017

In this redesign of a pie chart figure from a Nature Medicine article [1], I look at how to organize and present a large number of categories.

I first discuss some of the benefits of a pie chart—there are few and specific—and its shortcomings—there are few but fundamental.

I then walk through the redesign process by showing how the tumor categories can be shown more clearly if they are first aggregated into a small number of groups.

(bottom left) Figure 2b from Zehir et al. Mutational landscape of metastatic cancer revealed from prospective clinical sequencing of 10,000 patients. (2017) Nature Medicine doi:10.1038/nm.4333

Tabular Data

Tue 11-04-2017
Tabulating the number of objects in categories of interest dates back to the earliest records of commerce and population censuses.

After 30 columns, this is our first one without a single figure. Sometimes a table is all you need.

In this column, we discuss nominal categorical data, in which data points are assigned to categories in which there is no implied order. We introduce one-way and two-way tables and the `\chi^2` and Fisher's exact tests.
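For a two-way table, the `\chi^2` statistic compares observed counts with the counts expected if rows and columns were independent. The sketch below computes the statistic and its degrees of freedom only; turning it into a P value needs the `\chi^2` distribution, which is left out here, and Fisher's exact test is not covered. The function name and the example counts are made up.

```python
def chi_square_statistic(table):
    """Return (statistic, degrees_of_freedom) for a two-way table.

    table: list of rows of observed counts, all rows the same length,
    with no row or column summing to zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Made-up counts for illustration.
print(chi_square_statistic([[10, 20], [20, 10]]))
```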

Altman, N. & Krzywinski, M. (2017) Points of Significance: Tabular data. Nature Methods 14:329–330.

...more about the Points of Significance column

Happy 2017 `\pi` Day—Star Charts, Creatures Once Living and a Poem

Tue 14-03-2017


on a brim of echo,

capsized chamber
drawn into our constellation, and cooling.
—Paolo Marcazzan

Celebrate `\pi` Day (March 14th) with a star chart of the digits. The charts draw 40,000 stars generated from the first 12 million digits.

12,000,000 digits of `\pi` interpreted as a star catalogue. (details)

The 80 constellations are extinct animals and plants. Here you'll find old friends and new stories. Read about how Desmodus is always trying to escape or how Megalodon terrorizes the poor Tecopa! Most constellations have a story.

Find friends and stories among the 80 constellations of extinct animals and plants. Oh look, a Dodo guarding his eggs! (details)

This year I collaborate with Paolo Marcazzan, a Canadian poet, who contributes a poem, Of Black Body, about space and things we might find and lose there.

Check out art from previous years: 2013 `\pi` Day, 2014 `\pi` Day, 2015 `\pi` Day and 2016 `\pi` Day.