Martin Krzywinski / Genome Sciences Center / mkweb.bcgsc.ca

driving: fun




fun + amusement

The Google Maps Challenge — Longest Google Maps Driving Routes

Routes updated 13 June 2017.

CHALLENGES

1 — Longest land leg

2,121 km
Great Northern Highway, Australia
Victoria Highway, Australia
view route

2 — Longest land route

15,280 km
Sagres, Portugal
Talon, Russia
view route

3 — Longest land route with ferries

26,340 km
Quoin Point, South Africa
Talon, Russia
view route

4 — Longest 3-point route

48,504 km
Mou Shkola P.kharp, Russia
Kimberley, South Africa
Talon, Russia
view route

other visualizations

Take a road trip to clear your mind. Take in a few sights and bring home a spoon or other collector's item.

According to Google Maps, how far could you go?

And the most pressing question: can you reach the sorrow at the world's end? Yes you can.

the challenges

Each of the challenges below involves finding points A and B that yield the longest driving route in Google Maps. Each challenge has its own parameters, but some rules apply to all of them.

  • the route A→B must be generated by the Google Maps algorithm—it cannot be manually adjusted
  • the shorter of A→B and B→A must be used
  • when multiple routes are available, the shortest must be used
  • avoid highways and avoid tolls options must be off, except for the 3-point route challenge
  • the "no ferry" land route may include a ferry to cross a river or any ferry shorter than 10 km
  • the 3-point route cannot double back along the same path at any point on the route
  • the route must be generated by typing in the waypoints, not by dragging them
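
The distance rules above can be sketched in a few lines of Python. This is only an illustration: the `Route` structure and both function names are hypothetical, and the distances and ferry leg lengths would have to be copied by hand from Google Maps. The river-crossing allowance is simplified to the 10 km length check.

```python
from dataclasses import dataclass, field
from typing import List

# Ferry allowance for "no ferry" land routes, taken from the rules above.
MAX_INCIDENTAL_FERRY_KM = 10.0


@dataclass
class Route:
    """One route option reported by the router (hypothetical structure)."""
    distance_km: float
    ferry_legs_km: List[float] = field(default_factory=list)


def qualifying_distance(a_to_b: List[Route], b_to_a: List[Route]) -> float:
    """Distance that counts for the challenge.

    Take the shortest option in each direction, then the shorter
    of the two directions.
    """
    shortest_ab = min(r.distance_km for r in a_to_b)
    shortest_ba = min(r.distance_km for r in b_to_a)
    return min(shortest_ab, shortest_ba)


def is_land_route(route: Route) -> bool:
    """True if every ferry leg is shorter than the 10 km allowance."""
    return all(leg < MAX_INCIDENTAL_FERRY_KM for leg in route.ferry_legs_km)
```

For example, the challenge 3 history lists a route that was 33,557 km in one direction and 32,433 km in reverse, so `qualifying_distance([Route(33557.0)], [Route(32433.0)])` gives 32,433 km and the 33,000 km barrier stays unbroken.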

Any solution to the challenge will surely have a shorter route (not available to the routing algorithm) as well as many more longer ones (duh—it's always easy to pessimize a route).

Updates to Google Maps are frequent and unannounced, making most of what you read here a historical artefact. I try to periodically make sure that the current challenge routes are at least functional.

This topic has been previously discussed on xkcd forums.

continuous updates to Google Maps — old routes unavailable, records reset

As Google Maps updates the routing network, some of the old routes are no longer available, or are significantly shorter. This maps challenge page may therefore be out of date.

Don't be surprised if links to old routes show a significantly different distance, or point to a route that no longer exists. Such links, or out-dated entries, are annotated as historical.

30,000 km limit broken—again

historical Thanks to a route by Dean Sych, the 30,000 km limit has been broken again. And how! By over 3,600 km. Thanks Dean!

Previously, because of new routes through central Africa, route distances dropped by about 6,000 km. Whereas before routing avoided Congo and Algeria, they now go right through them. This made a lot of the 30,000+ km routes much shorter.

historical The longest route used to be 33,540 km from Quoin Point, South Africa to a dirt road in Indonesia. It starts with "Head northeast toward R317." and ends with "Turn left". After Malaysia, it's mostly ferries. Unfortunately, it's gone. Also, routing in Africa is now more optimized, shortening all the trips from South Africa.

historical (Paren' no longer available as a destination) | The 30,000 km limit has been broken by a route from Paren' to Pearly Beach. This was furthered by discovering the bizarrely remote Chimchememel', Russia (from Chimchememel' to Danger Point).

historical (Uelen still not accessible) | The next milestone for a route with ferries is 32,000 km. Unless you have the money to build a road to Uelen, this new limit is a significant challenge. Interested individuals should start digging immediately.

historical Many of the interesting past routes are now gone—the world is getting boring. This fun 191 hour drive from Portugal to Malaysia, shown below, is no longer available. Too bad. Stop off in Turkey to go to the bathroom. Pick up a few aluminum centrifuge tubes from Iran, too. That sounds like fun.

change for the worse

The details of the routing algorithm are the epitome of Rumsfeld's unknown unknowns. It used to be that Portugal to Malaysia was a very doable and relaxing 8 day drive at 15,787 km. And you managed to avoid China.

Google Maps Challenge - Longest Driving Routes / Martin Krzywinski @MKrzywinski mkweb.bcgsc.ca
Lisbon, Portugal to Pahang, Malaysia takes 7 days 23 hours and 15,797 km (view route)

Later, the route included China but excluded Iran. Apparently, we're no longer on driving terms with Iran. If you want to go from Lisbon to Singapore, you no longer need your Iran for Dummies book.

Still, the updated routing was longer and burned more gas. Prius C drivers with advanced road endurance skills need apply. You know who you are.

Lisbon, Portugal to Pahang, Malaysia takes 8 days 16 hours and 17,482 km (view route)

The 20,000 km all-land route limit still stands. Africa's complex routing may provide a solution. Or a local conflict that severs a road or two.

Blind Spots and Humor

If you play with routes in Google Maps you'll quickly notice that some parts of the world do not appear to be connected to the smarts of the routing algorithm. For example, you could not drive from Beijing to New Delhi (as of 31 Aug 2015 this is no longer true: a 5,559 km route exists). These holes in the driving fabric pose a challenge in finding long routes.

historical (no more kayak routes) | Google's subtle humour can be found everywhere (though apparently not anymore, since kayaking has been removed from maps and replaced by decidedly un-funny flight routes), such as in step 9 of this Seattle to Hawaii route, which states "Kayak across the Pacific Ocean — 4,436 km". If you have endurance training, you might wish to continue kayaking to Tokyo, for another 6,243 km. For the purpose of this challenge, kayaking is not allowed.

These examples amply demonstrate that the internet was created by people with humor and is used largely by people without it. I say this because the kayak option has been removed. Doubtlessly because someone tried it, got hurt and the litigation lawyers got involved. It's always been the case that the 1% ruins it for the 99% of us.

Darién Gap

No routes from North to South America exist because of this boggy marsh.

Routing Changes

As new routes become available, long trips become shorter. For example, introduction of a route across Niger and Algeria cut the original land route with no ferry record from 18,260 km to a mere 15,576 km.

Other Google Data Munging and Visualizations

If you are interested in visualization and information, explore my global visualization of Google searches by language and find out where in the US people are searching in Chinese.

For the morbidly curious, of interest might be all the really stupid questions people ask Google.

mobile users

Google Maps routes linked to from this page do not appear to work on iPhone's Safari browser (I have not tested iPad or iPod). A "driving direction not found" error appears. Rest assured, these routes do exist, and can be viewed on a browser on a PC or Mac. Weird.

The routes below are the current answers to the challenge. Do you have a better (longer!) route? Let me know.

starting points

Popular starting point for many routes is Quoin Point in South Africa. (view route)

terminus station

Talon is at the end of R481, which goes from Magadan to Talon. This road forks off the P-504 which runs for about 2,000 km from Manday to Magadan.

All roads lead to Talon near Magadan in Russia. Good luck. (view route)

CHALLENGE 1 — Longest Land Leg

Thanks to James Robertson for finding this exhausting 2,121 km single leg trip. It's a route along Australia's National Highway between Katherine Hot Springs and (2,121 km later) a fork in the road that splits into the NW Coastal Highway.

I'm sad to see that the route no longer includes Magadan, known as the sorrow at the world's end, which is such an important destination for the Google Maps Challenge.

On the upside, this route is so long that "Your destination is in a different time zone". Depending on which way you drive, this might save you an hour.

Along the National Highway in Australia for 2,121 km. (view route)

What happens when you get to the end of the route? You get a much deserved rest stop. Hilarious Aussie humor at its best.

Once you've finished your 2,121 km ride, you get a rest. (view route)

challenge 1 version history

These routes may no longer be available in Google Maps or their routing may have changed.

v6 route Sakha Republits - Magadan 1,601 km (+44 km)

Longest land leg route entry (v6). Sakha Republits in Russia to Magadanskaya Oblast in Russia (1,597 km). (view route)

v5 route Sakha Republits - Magadan 1,601 km (+44 km)

Longest land leg route entry (v5). Sakha Republits in Russia to Magadanskaya Oblast in Russia (1,601 km). (view route)

v4 route Haines Junction - Dawson Creek 1,557 km (+25 km)

Longest land leg route entry (v4). Haines Junction in Canada to Dawson Creek in Canada (1,557 km). (view route)

v3 route Haines Junction - Farmington 1,532 km (+32 km) (submitted by David Jackson)

Longest land leg route entry (v3). Haines Junction in Canada to Farmington in Canada (1,532 km). (view route)

v2 route Haines Junction - South Taylor 1,500 km (+25 km)

Longest land leg route entry (v2). Haines Junction in Canada to South Taylor in Canada (1,500 km). (view route)

v1 route Haines Junction - Charlie Lake 1,475 km

Longest land leg route entry (v1). Haines Junction in Canada to Charlie Lake in Canada (1,475 km). (view route)

CHALLENGE 2 — Longest Land Route

The longest Google Maps route that does not use ferries takes us from Sagres in Portugal to Talon in Russia. Thanks to Pieter Vandromme for submitting this route.

(13 Jun 2017) I just noticed that there's now a 9.4 km ferry in this route, which crosses the Aldan River. Technically, this disqualifies the route. Instead, I'll change the rules to allow a ferry that crosses a river and is no more than 10 km :)

Outside Sagres in Portugal to Talon, Russia. This trip is 184 hours and 15,280 km. (view route)

challenge 2 version history

v7 route Sagres in Portugal to Khasan, Russia 14,069 km. This route is now significantly shorter (13,322 km).

Longest land leg route entry (v7). Sagres in Portugal to Khasan, Primorsky Krai in Russia (14,096 km). (view route)

v6 route Sagres in Portugal to Tanjung Pengelih Pengerang Johor Malaysia 16,280 km. Thanks to Jørgen Adam Holen for pointing out a better destination in Malaysia. (Google Map route no longer available)

Longest land leg route entry (v6). Sagres in Portugal to Tanjung Pengelih Pengerang Johor in Malaysia (16,280 km). (view route)

v5 route Duyker Eiland, South Africa - Sidi Bettache, Morocco 15,594 km (+18 km) (Duyker Eiland submitted by David Jackson on xkcd) (Google map route, 12,523 km on 26 Nov 2014, significantly shorter with routing through Congo)

Longest land leg route entry (v5). Duyker Eiland in South Africa to Sidi Bettache in Morocco (15,594 km). (view route)

v4 route Pearly Beach, South Africa - Sidi Bettache, Morocco 18,260 km (+84 km) (Google map route 12,673 km on 26 Nov 2014)

Longest land leg route entry (v4). Pearly Beach in South Africa to Sidi Bettache in Morocco (18,260 km). (view route)

v3 route Pearly Beach, South Africa - Casablanca, Morocco, 18,176 km (+2,180 km) (submitted by ElWanderer via xkcd) (Google map route 12,708 km on 24 Nov 2014)

v2 route Gibraltar - Paren', Russia 15,996 km (+602 km) (submitted by ElWanderer via xkcd) (Google map route no longer exists on 6 Jun 2013)

Longest land leg route entry (v2). Gibraltar to Paren' in Russia (15,996 km). (view route)

v1 route Gibraltar - Magadan, Russia 15,394 km (Google map route now 15,051 km on 26 Nov 2014 and now includes a 9.4 km ferry (a/d Kolyma/M56))

Longest land leg route entry (v1). Gibraltar to Magadan in Russia (15,394 km). (view route)

CHALLENGE 3—Longest Land Route with Ferries

The longest Google Maps A–B route that uses ferries. The ferry distance cannot be more than 25% of the entire trip.
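
The 25% ferry limit is a simple ratio check. Here is a minimal sketch, assuming you have already tallied the ferry legs by hand from the directions; the function names are made up for illustration.

```python
from typing import Sequence


def ferry_fraction(total_km: float, ferry_legs_km: Sequence[float]) -> float:
    """Fraction of the whole trip that is spent on ferries."""
    if total_km <= 0:
        raise ValueError("total_km must be positive")
    return sum(ferry_legs_km) / total_km


def within_ferry_limit(total_km: float,
                       ferry_legs_km: Sequence[float],
                       limit: float = 0.25) -> bool:
    """True if ferries make up no more than `limit` of the trip."""
    return ferry_fraction(total_km, ferry_legs_km) <= limit
```

A hypothetical 1,000 km trip with 200 km and 100 km ferry legs has a ferry fraction of 0.3 and would not qualify.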

Sadly, it appears you can no longer drive from Quoin Point to Indonesia. So, Talon in the Magadanskaya oblast' is back as everyone's favourite gulag town destination—truly the last stop for sadness. To make matters worse, this route is now only 26,320 km, which is shorter than the original entry for this challenge. Sad!

Longest land leg route. Quoin Point in South Africa to Talon in Russia. The trip takes 359 hours and 26,320 km. (view route)

challenge 3 version history

v12 route (7 Sep 2015). This route takes you from Quoin Point on Cape Agulhas to the edge of Indonesia, where there are things to do.

Longest land leg route entry (v12). Serai, Indonesia to Quoin Point, South Africa. The trip takes 529 hours (22 days) and 33,612 km. (view route)

For some reason this excellent route was dropped by Google Maps. Its creator, Christopher Camitta, points out that if you assist the routing algorithm by adding a mid-point in Pakistan, the route can be recovered.

Longest land leg route entry (v12-mod). Smitswinkel Bay, Cape Town, South Africa to Konotikero, Sakron Tane, Tor Atas, Sarmi Regency, Papua, Indonesia via Ghawindi Border, Ghawindi, Lahore 54030, Pakistan. The trip takes 655 hours and 27,402 km. (view route)

v11 route (7 Sep 2013) Quoin Point, South Africa to Talon, Russia 27,224 km (Google map route). This route takes you from Quoin Point on Cape Agulhas to the edge of nowhere in Russia. During your voyage, you experience the endless visual monotony of sprawling Africa (giving Angola a wide berth), only to end up at the sorrow at the world's end: Magadan, Russia's gulag region.

Longest land leg route entry (v11). Quoin Point in South Africa to Ola, Magadan Oblast in Russia (27,224 km). (view route)

v10 route (18 Jul 2013) Quoin Point, South Africa to Ola, Russia 24,201 km (Google map route). Thanks to Pieter Vandromme for submitting this route.

Longest land leg route entry (v10). Quoin Point in South Africa to Ola, Magadan Oblast in Russia (24,201 km). (view route)

v9 route (18 Jul 2013) Quoin Point, South Africa to Unknown Road, Indonesia 33,540 km (Google map route no longer available). Map geek Earl Higgins found this route after invalidating v7. The 33,000 km barrier is again, and legitimately, broken.

Longest land leg route entry (v9). Quoin Point in South Africa to the eerie Unknown Road in Indonesia (33,540 km). (view route)

v8 route (16 Jul 2013) Unknown Road, Indonesia to Groot Paternoster Reserve, South Africa 32,433 km (Google map route no longer available). Map geek Earl Higgins pointed out that the reverse of the v7 route is significantly shorter. This means that the 33,000 km barrier has, in fact, not been broken. Pity.

Longest land leg route entry (v8). Unknown Road in Indonesia to Groot Paternoster Reserve in South Africa (32,433 km). (view route)

v7 route (6 Jun 2013) Groot Paternoster Reserve, South Africa to Unknown Road, Indonesia 33,634 km (Google map route no longer available, submitted by Sue DoNym on xkcd). (33,557 on 16 Jul 2013; reverse of route is significantly shorter—see v8)

v6 route Chimchememel' Russia - Duyker Eiland, South Africa 31,766 km (+626 km) (Google map route no longer exists, submitted by David Jackson on xkcd)

Longest land leg route entry (v6). Chimchememel' in Russia to Duyker Eiland in South Africa (31,766 km). (view route)

v5 route Chimchememel' Russia - Pearly Beach, South Africa 31,140 km (+15 km) (Google map route no longer exists)

Longest land leg route entry (v5). Chimchememel' in Russia to Pearly Beach in South Africa (31,140 km). (view route)

v4 route Chimchememel' Russia - Danger Point, South Africa 31,125 km (+692 km) (Google map route no longer exists, submitted by nerd-7hi+42e via xkcd)

v3 route Pearly Beach - Paren', Russia 30,433 km (+602 km) (Google map route no longer exists, submitted by ElWanderer via xkcd)

Longest land leg route entry (v3). Pearly Beach in South Africa to Paren' in Russia (30,433 km). (view route)

v2 route Pearly Beach - Magadan 29,831 km (+66 km) (Google map route, 24,191 km on 26 Nov 2014)

Longest land leg route entry (v2). Pearly Beach in South Africa to Magadan in Russia (29,831 km). (view route)

v1 route Bredasdorp - Magadan 29,765 km (Google map route, 24,125 km on 26 Nov 2014)

Longest land leg route entry (v1). Bredasdorp in South Africa to Magadan in Russia (29,765 km). (view route)

CHALLENGE 4—Longest 3-Point Land Route

Thanks to Nico Tsaousis for suggesting the format of this challenge. The 3-point route must be constructed by typing in the waypoints, not dragging them.

Nico came up with the first entry (see v1 route below), a 33,156 km route from an Unnamed Road in South Africa to Magadan via Norway.

The current longest route in this category is a 48,504 km trip from the Mou Shkola P.kharp high school in Severnyy Kvartal in Russia via South Africa back to Talon in Russia. The route was submitted by Brad Bailey and requires the "avoid ferries" option.

Longest 3-point route. Mou Shkola P.kharp via Kimberley, South Africa to Talon, Magadan Oblast, Russia. 48,504 km. Requires 'avoid ferries' option. (view route)

v3 route Shchukozero in Russia via Izingolweni in South Africa to Talon in Russia (Google map route 46,176 km). This route requires that you select "avoid ferries" as well as "avoid tolls". What makes this route really unique is that it starts and ends in the same country.

v2 route Mpahlane in South Africa via Riksveg in Norway to Talon in Russia (31,377 km).

Longest 3-point route (v2). Mpahlane in South Africa via Riksveg in Norway to Talon in Russia (31,377 km). (view route)
Small differences in starting positions can make a large difference in the overall route. (A) Starting at Mpahlane gives a north route through Durban for a total trip of 31,377 km. (B) Starting in Xolobeni, only 63 km south of Mpahlane, routes south of Lesotho and shaves 110 km to create a 31,267 km route. (view route)

v1 route Unnamed road in South Africa to Talon in Russia (Google map route, 33,156 km on 14 Jun 2017). Because this route starts at an "unnamed road" rather than a specific address, I'm retiring it and replacing it with one that starts at Mpahlane in South Africa.

Longest 3-point route (v1) by Nico Tsaousis. Unnamed Road in South Africa via Riksveg in Norway to Talon in Russia (33,156 km). (view route)

news + thoughts

Machine learning: a primer

Tue 05-12-2017
Machine learning extracts patterns from data without explicit instructions.

In this primer, we focus on essential ML principles— a modeling strategy to let the data speak for themselves, to the extent possible.

The benefits of ML arise from its use of a large number of tuning parameters or weights, which control the algorithm’s complexity and are estimated from the data using numerical optimization. Often ML algorithms are motivated by heuristics such as models of interacting neurons or natural evolution—even if the underlying mechanism of the biological system being studied is substantially different. The utility of ML algorithms is typically assessed empirically by how well extracted patterns generalize to new observations.

Martin Krzywinski @MKrzywinski mkweb.bcgsc.ca
Nature Methods Points of Significance column: Machine learning: a primer. (read)

We present a data scenario in which we fit to a model with 5 predictors using polynomials and show what to expect from ML when noise and sample size vary. We also demonstrate the consequences of excluding an important predictor or including a spurious one.

Bzdok, D., Krzywinski, M. & Altman, N. (2017) Points of Significance: Machine learning: a primer. Nature Methods 14:1119–1120.

...more about the Points of Significance column

Snowflake simulation

Tue 14-11-2017
Symmetric, beautiful and unique.

Just in time for the season, I've simulated a snow-pile of snowflakes based on the Gravner-Griffeath model.

A few of the beautiful snowflakes generated by the Gravner-Griffeath model. (explore)

Gravner, J. & Griffeath, D. (2007) Modeling Snow Crystal Growth II: A mesoscopic lattice map with plausible dynamics.

Genes that make us sick

Thu 02-11-2017
Where disease hides in the genome.

My illustration of the location of genes in the human genome that are implicated in disease appears in The Objects that Power the Global Economy, a book by Quartz.

The location of genes implicated in disease in the human genome, shown here as a spiral. (more...)

Ensemble methods: Bagging and random forests

Mon 16-10-2017
Many heads are better than one.

We introduce two common ensemble methods: bagging and random forests. Both of these methods repeat a statistical analysis on a bootstrap sample to improve the accuracy of the predictor. Our column shows these methods as applied to Classification and Regression Trees.

Nature Methods Points of Significance column: Ensemble methods: Bagging and random forests. (read)

For example, we can sample the space of values more finely when using bagging with regression trees because each sample has potentially different boundaries at which the tree splits.

Random forests generate a large number of trees by not only generating bootstrap samples but also randomly choosing which predictor variables are considered at each split in the tree.

Krzywinski, M. & Altman, N. (2017) Points of Significance: Ensemble methods: bagging and random forests. Nature Methods 14:933–934.

Background reading

Krzywinski, M. & Altman, N. (2017) Points of Significance: Classification and regression trees. Nature Methods 14:757–758.

...more about the Points of Significance column

Classification and regression trees

Mon 16-10-2017
Decision trees are a powerful but simple prediction method.

Decision trees classify data by splitting it along the predictor axes into partitions with homogeneous values of the dependent variable. Unlike logistic or linear regression, CART does not develop a prediction equation. Instead, data are predicted by a series of binary decisions based on the boundaries of the splits. Decision trees are very effective and the resulting rules are readily interpreted.

Trees can be built using different metrics that measure how well the splits divide up the data classes: Gini index, entropy or misclassification error.
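
As a rough illustration of these three metrics, here is a minimal sketch using their standard textbook definitions, applied to the class labels that fall into one partition. The function names are mine and this is not code from the column.

```python
from collections import Counter
from math import log2
from typing import List, Sequence


def class_proportions(labels: Sequence[str]) -> List[float]:
    """Proportion of each class among the labels in one partition."""
    if not labels:
        raise ValueError("labels must not be empty")
    total = len(labels)
    return [count / total for count in Counter(labels).values()]


def gini(labels: Sequence[str]) -> float:
    """Gini index: 1 minus the sum of squared class proportions."""
    return 1.0 - sum(p * p for p in class_proportions(labels))


def entropy(labels: Sequence[str]) -> float:
    """Shannon entropy in bits: -sum(p * log2(p)) over the classes."""
    return -sum(p * log2(p) for p in class_proportions(labels))


def misclassification(labels: Sequence[str]) -> float:
    """Error made by always predicting the majority class."""
    return 1.0 - max(class_proportions(labels))
```

All three are zero for a partition that holds a single class and largest when the classes are evenly mixed, which is why a split that lowers them is considered a better split.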

Nature Methods Points of Significance column: Classification and regression trees. (read)

When the predicted variable is quantitative and not categorical, regression trees are used. Here, the data are still split but now the predicted variable is estimated by the average within the split boundaries. Tree growth can be controlled using the complexity parameter, a measure of the relative improvement of each new split.

Individual trees can be very sensitive to minor changes in the data and even better prediction can be achieved by exploiting this variability. Using ensemble methods, we can grow multiple trees from the same data.

Krzywinski, M. & Altman, N. (2017) Points of Significance: Classification and regression trees. Nature Methods 14:757–758.

Background reading

Lever, J., Krzywinski, M. & Altman, N. (2016) Points of Significance: Logistic regression. Nature Methods 13:541-542.

Altman, N. & Krzywinski, M. (2015) Points of Significance: Multiple Linear Regression Nature Methods 12:1103-1104.

Lever, J., Krzywinski, M. & Altman, N. (2016) Points of Significance: Classifier evaluation. Nature Methods 13:603-604.

Lever, J., Krzywinski, M. & Altman, N. (2016) Points of Significance: Model Selection and Overfitting. Nature Methods 13:703-704.

Lever, J., Krzywinski, M. & Altman, N. (2016) Points of Significance: Regularization. Nature Methods 13:803-804.

...more about the Points of Significance column

Personalized Oncogenomics Program 5 Year Anniversary Art

Wed 26-07-2017

The artwork was created in collaboration with my colleagues at the Genome Sciences Center to celebrate the 5 year anniversary of the Personalized Oncogenomics Program (POG).

5 Years of Personalized Oncogenomics Program at Canada's Michael Smith Genome Sciences Centre. The poster shows 545 cancer cases. (left) Cases ordered chronologically by case number. (right) Cases grouped by diagnosis (tissue type) and then by similarity within group.

The Personalized Oncogenomics Program (POG) is a collaborative research study including many BC Cancer Agency oncologists, pathologists and other clinicians along with Canada's Michael Smith Genome Sciences Centre with support from BC Cancer Foundation.

The aim of the program is to sequence, analyze and compare the genome of each patient's cancer (the entire DNA and RNA inside tumor cells) in order to understand what is enabling it and to identify less toxic and more effective treatment options.

Principal component analysis

Thu 06-07-2017
PCA helps you interpret your data, but it will not always find the important patterns.

Principal component analysis (PCA) simplifies the complexity in high-dimensional data by reducing its number of dimensions.

Nature Methods Points of Significance column: Principal component analysis. (read)

To retain trend and patterns in the reduced representation, PCA finds linear combinations of canonical dimensions that maximize the variance of the projection of the data.

PCA is helpful in visualizing high-dimensional data and scatter plots based on 2-dimensional PCA can reveal clusters.

Altman, N. & Krzywinski, M. (2017) Points of Significance: Principal component analysis. Nature Methods 14:641–642.

Background reading

Altman, N. & Krzywinski, M. (2017) Points of Significance: Clustering. Nature Methods 14:545–546.

...more about the Points of Significance column

`k` index: a weightlifting and Crossfit performance measure

Wed 07-06-2017

Similar to the `h` index in publishing, the `k` index is a measure of fitness performance.

To achieve a `k` index for a movement you must perform `k` unbroken reps at `k`% 1RM.

The expected value for the `k` index is probably somewhere in the range of `k = 26` to `k=35`, with higher values progressively more difficult to achieve.
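
Here is a minimal sketch of how a `k` index could be computed from a training log. It assumes one reading of the definition above: `k` is the largest whole number for which some logged set had at least `k` unbroken reps at a load of at least `k`% of 1RM. The function name and the log format are hypothetical.

```python
import math
from typing import Iterable, Tuple


def k_index(sets: Iterable[Tuple[float, int]], one_rep_max: float) -> int:
    """Largest k with k unbroken reps at a load of at least k% of 1RM.

    `sets` holds (load, unbroken_reps) pairs, with the load in the
    same units as `one_rep_max`.
    """
    if one_rep_max <= 0:
        raise ValueError("one_rep_max must be positive")
    best = 0
    for load, reps in sets:
        percent = math.floor(100.0 * load / one_rep_max)
        # A single set supports any k up to both its reps and its percent.
        best = max(best, min(percent, reps))
    return best
```

With a 1RM of 100 kg, a set of 28 reps at 30 kg supports `k = 28` and a set of 40 reps at 25 kg supports `k = 25`, so the `k` index under this reading is 28.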

In my `k` index introduction article I provide detailed explanation, rep scheme table and WOD example.

Dark Matter of the English Language—the unwords

Wed 07-06-2017

I've applied the char-rnn recurrent neural network to generate new words, names of drugs and countries.

The effect is intriguing and facetious—yes, those are real words.

But these are not: necronology, abobionalism, gabdologist, and nonerify.

These places only exist in the mind: Conchar and Pobacia, Hzuuland, New Kain, Rabibus and Megee Islands, Sentip and Sitina, Sinistan and Urzenia.

And these are the imaginary afflictions of the imagination: ictophobia, myconomascophobia, and talmatomania.

And these, of the body: ophalosis, icabulosis, mediatopathy and bellotalgia.

Want to name your baby? Or someone else's baby? Try Ginavietta Xilly Anganelel or Ferandulde Hommanloco Kictortick.

When taking new therapeutics, never mix salivac and labromine. And don't forget that abadarone is best taken on an empty stomach.

And nothing increases the chance of getting that grant funded more than proposing the study of a new –ome! We really need someone to look into the femome and manome.

Dark Matter of the Genome—the nullomers

Wed 31-05-2017

An exploration of things that are missing in the human genome. The nullomers.
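
To make the idea concrete, here is a brute-force sketch that lists the words of a given length that never occur in a sequence. It is only an illustration for short words and small inputs: it looks at a single strand, ignores reverse complements, and is not the efficient algorithm from the paper cited below. The function names are mine.

```python
from itertools import product
from typing import List, Optional, Tuple


def absent_words(sequence: str, k: int, alphabet: str = "ACGT") -> List[str]:
    """All length-k words over `alphabet` that do not occur in `sequence`."""
    seq = sequence.upper()
    observed = {seq[i:i + k] for i in range(len(seq) - k + 1)}
    candidates = ("".join(letters) for letters in product(alphabet, repeat=k))
    return sorted(word for word in candidates if word not in observed)


def shortest_absent_words(sequence: str,
                          max_k: int = 8,
                          alphabet: str = "ACGT"
                          ) -> Optional[Tuple[int, List[str]]]:
    """Smallest k (up to max_k) with missing words, and those words."""
    for k in range(1, max_k + 1):
        missing = absent_words(sequence, k, alphabet)
        if missing:
            return k, missing
    return None
```

For the toy sequence `AACCGGTT` every single letter occurs, but only 7 of the 16 possible 2-letter words do, so the shortest absent words have length 2 and there are 9 of them.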

Julia Herold, Stefan Kurtz and Robert Giegerich. Efficient computation of absent words in genomic sequences. BMC Bioinformatics (2008) 9:167