Martin Krzywinski / Genome Sciences Center / mkweb.bcgsc.ca
listen; there's a hell of a good universe next door: let's go.
e.e. cummings

words: lettery


In Silico Flurries: Computing a world of snow. Scientific American. 23 December 2017


language + fiction

Dark Matter of the English Language—the unwords

Words are easy, like the wind;
Faithful friends are hard to find.
—William Shakespeare

uncountries

The uncountries are places that don't exist, but perhaps should. If you're starting your own country or are hoping to secede from your current employer (here's looking at you, States of the US), you might find this list useful.

The list of uncountries is generated by a neural network trained on a list of 257 countries and territories.
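To give a feel for how names can be generated from a training list, here is a minimal sketch. It is not the neural network used for this page: it swaps in a much simpler character-level Markov chain, and the short `TRAINING_NAMES` list, the `ORDER` setting and all function names are assumptions made for illustration.

```python
import random
from collections import defaultdict

# Hypothetical stand-in for the 257-name training list (real country names).
TRAINING_NAMES = ["Albania", "Algeria", "Armenia", "Austria", "Bulgaria",
                  "Croatia", "Estonia", "Georgia", "Romania", "Slovakia",
                  "Slovenia", "Tanzania", "Tunisia", "Zambia"]

ORDER = 2      # number of preceding characters used as context (assumed)
START = "^"    # padding symbol marking the start of a name
END = "$"      # symbol marking the end of a name


def train(names, order=ORDER):
    """Record which character follows each context of `order` characters."""
    model = defaultdict(list)
    for name in names:
        padded = START * order + name.lower() + END
        for i in range(order, len(padded)):
            model[padded[i - order:i]].append(padded[i])
    return model


def generate(model, rng, order=ORDER, max_len=20):
    """Sample one name character by character until END or max_len."""
    context = START * order
    letters = []
    while len(letters) < max_len:
        nxt = rng.choice(model[context])
        if nxt == END:
            break
        letters.append(nxt)
        context = (context + nxt)[-order:]
    return "".join(letters).capitalize()


def uncountries(names, count, seed=0):
    """Generate up to `count` distinct names that are not in the training list."""
    rng = random.Random(seed)
    model = train(names)
    known = {n.lower() for n in names}
    found = set()
    for _ in range(10_000):          # bounded number of attempts
        name = generate(model, rng)
        if name and name.lower() not in known:
            found.add(name)
        if len(found) >= count:
            break
    return sorted(found)


print(uncountries(TRAINING_NAMES, 5))
```

A Markov chain only recombines letter sequences it has already seen, so its output tends to stay closer to the training names than the stranger entries in the lists on this page.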

Here's my bucket list of where I'm going next:

  • Conchar and Pobacia
  • Hzuuland
  • New Kain
  • Rabibus and Megee Islands
  • Sentip and Sitina
  • Sinistan
  • Tuskia
  • Urzenia
  • Vontila

Below are the alphabetically first single-word uncountries (up to two per letter) of 4–11 letters. In some cases, no names of a given length were generated for a given letter.

—4—
Aani
Aemo
Ball
Bang
Cada
Caga
Dafa
Dalr
Eira
Eran
Fani
Fato
Gaar
Gace
Hana
Hhen
Iaou
Inor
Kala
Kani
Lain
Lale
Mabe
Mage
Naom
Nare
Pein
Peis
Ragi
Raiy
Saen
Saic
Taga
Tans
Ucan
Uica
Venr
Viam
—5—
Aalce
Aanie
Babat
Baica
Caane
Cacae
Daaca
Demga
Eamat
Eanda
Fabta
Fania
Gaane
Gaare
Haemy
Harre
Imina
Imoba
Jacin
Jania
Kande
Kangi
Labua
Laiir
Mabia
Maeky
Nague
Nalgo
Paine
Paiti
Qetia
Qiria
Relat
Renir
Sabga
Sabta
Tabia
Tagan
Ucite
Uenha
Vamar
Vanga
Wiady
Zepek
—6—
Aamade
Afanen
Baaira
Baggia
Cabomo
Caemia
Darlye
Darzan
Eagero
Eattas
Faitia
Farado
Gabbak
Gabiai
Hauima
Henlia
Icalin
Iganda
Jaigio
Jartar
KSonko
Kabolo
Labopa
Lacuua
Mabana
Mabiak
Nalgin
Nandar
Ougagu
Paiada
Paldol
Rabgen
Ramgui
Saadae
Sabite
Taikua
Takkia
Udapor
Uermin
Vandes
Vangan
Wantia
Wengan
—7—
Aanigah
Agentan
Bagtera
Baliadi
Caathea
Cabibia
Dahmomo
Dalhais
Ebaniar
Eboniat
Falinha
Falttin
Gaereta
Gainana
Haratan
Hasland
Iafercy
Icrotii
Jelican
Jeliwia
Kanfono
Kargice
Lacitia
Laitgoe
Mabaden
Maellha
Naendia
Nalhind
Onbagin
Pabapua
Pagonia
Rarralo
Rarrkak
SEerral
Sabaida
Taateaa
Tahiria
Uagayas
Ujoland
Vaairta
Valonds
—8—
Aegtomis
Aeirania
Balcosda
Baltages
Cacbilii
Calnaria
Dargonda
Darirnio
Eatisasa
Eeniicia
Femanlan
Ferlanda
Gacterat
Gadeqtam
Hinaniti
Hndiwiun
Ihibiano
Ilriload
JenogRan
Jenorala
Kcinisan
Keberlan
Laenhuan
Lalostia
Mabdadde
Mabhalin
Naitieta
Naltenis
Oinvalia
Orringia
Pajtbava
Palalian
Qerbacon
Raliutan
Raphenla
SDatelan
Sabmutia
Taigonia
Talbiela
Uginbiam
Uityeate
Vasguace
Velentin
—9—
Aiwontmia
Alememter
Bawelilia
Beciradue
Cankeslas
Cansiaila
Dacucania
Elorbhiad
Epubhulon
Fredelapo
Gakgasdan
Gantiulan
Hotutuias
Ilallasda
Imroldian
Jendiulia
Jitgodien
Kemadicis
Kerndhand
Lazekatis
Lectarada
MInledian
Mabertima
Nacasnand
Nadordinh
Palcotiis
Panciland
Raecsatas
Rentelisy
SDirniata
Saentolia
Tarhhaldi
Tarnoigan
Untensian
Vantanira
Ventalica
Wensatial
—10—
Amodedhani
Andhsituia
Badetcinia
Bandesland
Camegessow
Canoniitia
Damalhania
Denwarinia
Ensriitrui
Eremgosdon
Garilsista
Gebticiita
Hatendacan
Hecrapband
Inteniania
Irrhalipan
Kendestand
Lantunutan
Lenkalland
Macgalland
Malbaninis
Namgelasta
Naniheomie
Pamestitia
Pilimintan
Reboteisia
Ricanlands
Sacgsainas
Samanhalaa
Tenheposda
Tezadtinia
Vagioliale
Veuthalian
—11—
Adrelebcima
Amurenoilan
Berniwhpana
Buenaslatda
Catdhtilard
Cerrertoria
Daniacsalon
Eirniatiars
Gaundaniani
Getnicistan
Kiaghbaliaa
Lurolelicam
Madcesladts
Melhellunds
Naporrestan
Niuritantsa
Pebenigisia
Rancitolian
Sarcestalta
Sengrerolia
Varenchales
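The selection above (the first few names per starting letter, grouped by length) can be sketched as follows. The function name, its parameters and the sample input are assumptions; `Aeno` is a made-up third entry added only to show the cut-off.

```python
from collections import defaultdict


def first_per_letter(names, per_group=2, min_len=4, max_len=11):
    """Return {length: [names]} holding the alphabetically first `per_group`
    single-word names for each starting letter."""
    groups = defaultdict(list)
    for name in sorted(set(names)):          # plain code-point ordering
        if " " in name or not (min_len <= len(name) <= max_len):
            continue
        key = (len(name), name[0].upper())
        if len(groups[key]) < per_group:
            groups[key].append(name)
    table = defaultdict(list)
    for (length, _letter), picked in sorted(groups.items()):
        table[length].extend(picked)
    return dict(table)


# "Aeno" is hypothetical; the other names appear in the lists on this page.
sample = ["Aani", "Aemo", "Aeno", "Ball", "Bang", "Aalce", "New Kain", "Sinistan"]
for length, picked in first_per_letter(sample).items():
    print(length, " ".join(picked))
```

Plain code-point sorting places capital letters before lowercase ones, which would be consistent with entries such as KSonko appearing ahead of Kabolo, although the ordering actually used for the page is not stated.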

And below are uncountries that are composed of more than one word. The neural network doesn't always do a good job with capitalization.

—6—
Bel Eo
—7—
Ar Neli
Bei Ros
Co Naf,
Es onda
Gob and
Ka -uca
Lex Sen
Mec Len
Se amar
Ton San
—8—
Anru Ran
Baqta Aa
Can Kanc
Dr Belle
Eone Vue
Giinu an
In Gecan
Leen Kon
Mons ald
New Kain
Pobia io
Se Mawan
Tanv Wag
Un Sayth
Vanbo ia
—9—
Arte olia
Ban Tenka
Cui iepes
Dant Sion
Enu Balra
Fem Feriu
Gia honia
Kamen San
Lan Giane
Ma orepan
Nak Manti
Pat Gamia
Saa hetiu
Tar Itlin
Uem Ladde
—10—
Aem Latlia
Banh Cerra
Cairt Aani
Dal Vclcin
Ee Riritan
Gaine Sora
Ken Sonras
Laen Lalor
Mal Rilteh
Nib Carean
Paoth enia
Ran borado
Saed Canua
Tamr eatos
Vamin ores
—11—
Agim Niidea
Baman oshon
Catil Menia
Eil Mitakeo
Freni Niray
Ge Manlando
Jhs Lelland
Ktuct Calia
Leg Saltima
Macan Taman
Namet Gacia
Panan Rerni
Sab Nelieda
Tanc Talind
Vuitan Sera
—12—
Aginh Erpata
Bamen Island
Cancd Samcua
Demun Bondan
Eginan Kerta
Gaicc Iutand
Inanon Gapia
Kanan Island
Lionh Irania
Nameon Raran
Rarve Iitand
Saeci Inlond
Talth Isliin
Unde Narpisa

—13—
Aipen Sabtars
Bab il Rinvee
Caneg Iclands
Dont Calioica
Emanh Naytani
Gaetia esland
Iatin Islands
Jamedh Island
Karch Sartier
Laine Islands
Maind Veltant
Nadtin Launua
Pau Meny ings
Rarciua Oaros
Saini Islands
Tacen Goetian
Urran Teviina
Vopenkb Toppa
—14—
Actatiat Nalto
Bahten Gojilan
Cela Ticgialia
Dototiat Lieda
Eusruuta essan
Fentch Nopyvon
Gaicd Eingasya
Inoral Islands
Jounh Miiticci
Kinera en Cime
Ladten Reperta
Maponit Peraco
Nalton Islands
Pamari eslands
Raqton Beparte
Salnen Islands
Tamokan Asteto
Vekmad Islands
—15—
Aepgenin Veliin
Bemanan Supatii
Caddad Gialerna
Eertiton Ialesd
Fuparrat Asiben
Gamgad Rerradia
Hitetal Iflasds
Jawshanad Meemi
Launiad Istands
Mariadec Geruur
Negend Peregine
Pacabara islans
Rortan Rertaria
Saldusl Islands
Talion Reputlic
Umzenon Islands
—16—
Anevares Inlands
Bangan Aulentand
Caun Nelch afdsa
Dita onu Islands
Eurania Tomontia
Fontican Aviisli
Gaitinl Indipige
Izerlind Reraumo
Jarenran Islands
Krdirtel Cenuria
Lazinan Bestilin
Maltinun Islands
Namnan Sceneilin
Pauran and Rarbe
Ramgiad Cruerava
Sagcas and Narol
Tanin Kenthurian
Urarcan Sudhiety
Vharkiin Ralands
—17—
Avaryi Barsseland
Binicon Areonieme
Cbiathan Islicssa
Dulial and Carora
Eciircan Rarandin
Gainonial Islands
Jumo iten Serlacd
Kaldtan Fiamusaia
Lattioalan CSubia
Matperacd ofdente
Neni Nicch Dorova
Sanon Perbelobgie
Tengepoth Iscands
Wapu et Lat Miban

—18—
Amanes Vontenseiar
Bamntal Asitrcanos
Canomad Rertheqiad
Detuereilin Inlatd
Eeet orwen Seworas
Gantarten Afan and
Kary escin Islands
Lamorho an Lebbati
Mem aynol Islandsi
Nibitayd ordheriio
Oenginh Kinderland
Pacaut Martirlalon
Salend Fe Selerdan
Terviniot Islandss
Vetbited Afobeslen
—19—
Agen Tand and Suoni
Bont Marrtan Toraza
Conchar and Pobacia
Eith Miap Minebanis
Galenta Gueutinanes
Kucaran and Samosea
Loettiruan Atereeti
Mon Varmands Island
Notk Maeg Lemtorban
Pestsian and Kupeta
Sarretonien endanda
Tinba Rand Banciton
Utabtin Anian hulan
—20—
Alenra Varin Mepubla
Cenialas and Malalia
Dagic Islands Island
Ekmhula andi Leprico
Garthda and Geekiree
Khin Matib Gebuyston
Liqan and Nantorepua
Maloil Iulands Ncoty
Nanuu Ieean Iymoldir
Samtes iucis Inlends
Teni emctan Sanucias
Urpoe an Rec Sucilia
—21—
Borne iarran Devarece
Car Rethil Nacinitana
Emytican Geru atgania
Gaind Marchan Islonds
Molticon Cini ondiles
Naund Ramen Arigariar
Pomithan Micrilanimia
Ramtitan Saruan Gaico
Sala am Ton Pameruidi

Here are some lists of uncountries with common suffixes.

*nia Ariania Aruenia Bamenia Bolsnia Bukania Caminia Carenia Copania Eniania Eruinia Eryinia Eyuinia Fvounia Gapania Gorania Guyinia Imgania Lebania Lepania Mezania Pagonia Pamonia Piainia Pirania Saminia Sesinia Simania Somenia Sorinia Tinonia Turunia Urzenia Badetcinia Damalhania Denwarinia Inteniania Mangevinia Seregiania Tezadtinia Tudennenia Akinia Arenia Arunia Bocnia Boinia Bounia Buinia Burnia Byunia Caunia Eminia Gainia Geenia Geinia Giania Guania Guinia Guonia Gwinia Jhunia Jiinia Jirnia Kcenia Leinia Lornia Neenia Rernia Ruenia Sannia Shinia Siinia Siunia Suinia Uninia Vasnia Arefeonia Bevomania Dacucania Eziboonia Gibstania Klbininia Setrounia Shlatania Suunienia Teroninia EwDirireonia Aeirania Bemginia Bunyonia Canmania Carginia Carnania Cosrania Culiinia Cumiinia Duinania Ezupinia Geziania Guinenia Guurania Konvonia Lalzinia Lertania Marbania Nandania Narnania Nenconia Pastania Sadiania Sazcinia Sigwenia Smeminia Sonconia Surbania Taigonia Tebcania Tendania Unyrania Cania Conia Fania Henia Jania Jonia Kinia Lonia Mania Ninia Nonia Sania Tenia Tonia Vania

*lan Anualan Binelan Biselan Comelan Donolan Eduulan Iferlan Ilaslan Iudelan Papilan Potalan Srinlan Takilan Tamglan Cemuneilan Gehsyanlan Mecineslan Amurenoilan Aralan Cralan Geilan Inilan Innlan Kerlan Nanlan Sorlan Tnulan Beugeilan Condamlan Cunogslan Gantiulan Geevallan Gienyslan Memsinlan Mertorlan Minnaulan Mururolan Neminolan Sandeslan Sennerlan Titorilan Vertonlan Andenlan Betarlan Ceneslan Cunmelan Curislan Femanlan Geamilan Keberlan Larielan Meloelan Menrulan Molielan Otenelan Redallan SDatelan Selenlan Alan Glan Tlan Bolan Bulan Culan Galan Malan Selan Solan

*land Garland Hasland Ujoland Bandesland Benhelland Bhqlalland Dhinioland Lenkalland Macgalland Vuleslland Caland Feland Maland Saland Anderland Cemerland Geunoland Lutkaland Mowurland Panciland Parraland Anreland Asealand Hzuuland Maerland Masrland Memoland Namaland Navaland Ponoland Tuysland Vetaland

*ana Amynana Balpana Burgana Congana Fuubana Gainana Gaulana Guiiana Somuana Tartana Vehcana Cunheqrana Berniwhpana Antana Argana Buvana Mabana Merana Mobana Relana Rucana Semana Sikana Nteradana Gitanana Hana Lana Mana Sana Giana Guana Gvana Toana

*ica Cinuica Deyrica Goitica Maltica Mannica Merlica Peotica Raryica Sortica Stamica Sumhica Tektica Tiumica Utiuica Bemgbicica Aniica Bapica Narica Sanica Selica Sibica Gatuitica Iuperiica Ventalica Buuntica Bwentica Sorgeica Uica Baica Umica

*can Banecan Celican Jelican Pelecan Deslisacan Hatendacan Leucan Noccan Tircan Tlycan Shaylican Suniracan Cerarcan Emunecan Gepuucan Mamescan Salgican Vongican Ucan

*dan Euvadan Gtardan Monmdan Seundan Srisdan Unendan Banitisdan Ringkeldan Bildan Landan Saldan Soldan Sordan Tamdan Gakgasdan Mremaldan Stelosdan Lapardan Siwesdan Srunadan

*stan Baystan Caistan Velstan Gentiastan Getnicistan Naporrestan Gistan Mastan Tengastan Sinistan

*tar Lalatar Sanktar Simntar Somytar Swettar Temitar Burekertar Jartar Tantar Unitar Gornitar Satar
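Grouping names by suffix, as in the lists above, can be sketched in a few lines. The `SUFFIXES` list mirrors the headings above; the function name and the sample input are assumptions, and names within each group are simply sorted rather than ordered as on this page.

```python
from collections import defaultdict

SUFFIXES = ["nia", "lan", "land", "ana", "ica", "can", "dan", "stan", "tar"]


def group_by_suffix(names, suffixes=SUFFIXES):
    """Return {suffix: sorted names ending in that suffix}, skipping empty groups."""
    groups = defaultdict(list)
    for name in names:
        for suffix in suffixes:
            if name.lower().endswith(suffix):
                groups[suffix].append(name)
    return {s: sorted(groups[s]) for s in suffixes if groups[s]}


sample = ["Urzenia", "Sinistan", "Hasland", "Ujoland", "Jartar", "Ucan", "Tuskia"]
for suffix, members in group_by_suffix(sample).items():
    print(f"*{suffix}", " ".join(members))
```

Names that match none of the suffixes, such as Tuskia in this sample, are left out.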


news + thoughts

Curse(s) of dimensionality

Tue 05-06-2018
There is such a thing as too much of a good thing.

We discuss the many ways in which analysis can be confounded when data has a large number of dimensions (variables). Collectively, these are called the "curses of dimensionality".

Martin Krzywinski @MKrzywinski mkweb.bcgsc.ca
Nature Methods Points of Significance column: Curse(s) of dimensionality.

Some of these are unintuitive, such as the fact that the volume of the hypersphere increases and then shrinks beyond about 7 dimensions, while the volume of the hypercube always increases. This means that high-dimensional space is "mostly corners" and the distance between points increases greatly with dimension. This has consequences for correlation and classification.
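The volume behavior can be checked numerically with the standard formula for the volume of a d-dimensional ball, $V_d(r) = \pi^{d/2} r^d / \Gamma(d/2 + 1)$. The sketch below assumes unit radius and a cube of side 2 that just encloses the ball; neither choice is stated in the text. Under these assumptions the ball's volume peaks at d = 5 and declines steadily after that, so it is well into its decline by 7 dimensions. A larger radius would push the peak to a higher dimension.

```python
import math


def ball_volume(d, radius=1.0):
    """Volume of a d-dimensional ball: pi^(d/2) * r^d / Gamma(d/2 + 1)."""
    return math.pi ** (d / 2) * radius ** d / math.gamma(d / 2 + 1)


def cube_volume(d, radius=1.0):
    """Volume of the d-dimensional cube of side 2r that encloses the ball."""
    return (2 * radius) ** d


dims = range(1, 21)
peak = max(dims, key=ball_volume)   # dimension with the largest ball volume

for d in (1, 2, 3, 5, 7, 10, 20):
    inside = ball_volume(d) / cube_volume(d)
    print(f"d={d:2d}  ball={ball_volume(d):8.4f}  cube={cube_volume(d):9.0f}  "
          f"fraction inside ball={inside:.2e}")
print("ball volume peaks at d =", peak)
```

The shrinking fraction in the last column is the sense in which the space is "mostly corners": almost all of the cube's volume lies outside the inscribed ball.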

Altman, N. & Krzywinski, M. (2018) Points of significance: Curse(s) of dimensionality Nature Methods 15:399–400.

Statistics vs Machine Learning

Tue 03-04-2018
We conclude our series on Machine Learning with a comparison of two approaches: classical statistical inference and machine learning. The boundary between them is subject to debate, but important generalizations can be made.

Inference creates a mathematical model of the data-generation process to formalize understanding or test a hypothesis about how the system behaves. Prediction aims at forecasting unobserved outcomes or future behavior. Typically we want to do both and know how biological processes work and what will happen next. Inference and ML are complementary in pointing us to biologically meaningful conclusions.

Nature Methods Points of Significance column: Statistics vs machine learning.

Statistics asks us to choose a model that incorporates our knowledge of the system, and ML requires us to choose a predictive algorithm by relying on its empirical capabilities. Justification for an inference model typically rests on whether we feel it adequately captures the essence of the system. The choice of pattern-learning algorithms often depends on measures of past performance in similar scenarios.

Bzdok, D., Krzywinski, M. & Altman, N. (2018) Points of Significance: Statistics vs machine learning. Nature Methods 15:233–234.

Background reading

Bzdok, D., Krzywinski, M. & Altman, N. (2017) Points of Significance: Machine learning: a primer. Nature Methods 14:1119–1120.

Bzdok, D., Krzywinski, M. & Altman, N. (2018) Points of Significance: Machine learning: supervised methods. Nature Methods 15:5–6.

...more about the Points of Significance column

Happy 2018 `\pi` Day—Boonies, burbs and boutiques of `\pi`

Wed 14-03-2018

Celebrate `\pi` Day (March 14th) and go to brand new places. Together with Jake Lever, this year we shrink the world and play with road maps.

Streets from across the world are joined seamlessly. Finally, a halva shop on the same block!

A great 10 km run loop between Istanbul, Copenhagen, San Francisco and Dublin. Stop off for halva, smørrebrød, espresso and a Guinness on the way.

Intriguing and personal patterns of urban development for each city appear in the Boonies, Burbs and Boutiques series.

In the Boonies, Burbs and Boutiques of `\pi` we draw progressively denser patches using the digit sequence 159 to inform density.

No color—just lines. Lines from Marrakesh, Prague, Istanbul, Nice and other destinations for the mind and the heart.

Roads from cities rearranged according to the digits of `\pi`.

The art is featured in Pi City on the Scientific American SA Visual blog.

Check out art from previous years: 2013 `\pi` Day, 2014 `\pi` Day, 2015 `\pi` Day, 2016 `\pi` Day and 2017 `\pi` Day.

Machine learning: supervised methods (SVM & kNN)

Thu 18-01-2018
Supervised learning algorithms extract general principles from observed examples guided by a specific prediction objective.

We examine two very common supervised machine learning methods: linear support vector machines (SVM) and k-nearest neighbors (kNN).

SVM is often less computationally demanding than kNN and is easier to interpret, but it can identify only a limited set of patterns. On the other hand, kNN can find very complex patterns, but its output is more challenging to interpret.

Nature Methods Points of Significance column: Machine learning: supervised methods (SVM & kNN).

We illustrate SVM using a data set in which points fall into two categories, which are separated in SVM by a straight line "margin". SVM can be tuned using a parameter that influences the width and location of the margin, permitting points to fall within the margin or on the wrong side of the margin. We then show how kNN relaxes explicit boundary definitions, such as the straight line in SVM, and how kNN too can be tuned to create more robust classification.
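As a rough illustration of how kNN classifies without an explicit boundary, and how it can be tuned, here is a minimal sketch in plain Python. It is not the analysis from the column: the data points, labels and function name are hypothetical, and the number of neighbors k is assumed to be the tuning parameter.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    order = sorted(range(len(train_points)),
                   key=lambda i: math.dist(train_points[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-class data: class "a" near the origin, class "b" near (3, 3),
# plus one stray "b" point sitting inside the "a" cluster.
points = [(0.0, 0.0), (0.5, 0.2), (0.2, 0.6), (0.4, 0.5),
          (3.0, 3.0), (3.2, 2.7), (2.8, 3.3), (0.3, 0.3)]
labels = ["a", "a", "a", "a", "b", "b", "b", "b"]

query = (0.35, 0.35)
print(knn_predict(points, labels, query, k=1))   # follows the single stray neighbor
print(knn_predict(points, labels, query, k=5))   # majority of the surrounding cluster
```

With k = 1 the prediction follows the single nearest point, even if it is an outlier. A larger k averages over more neighbors and is less sensitive to such points, at the cost of blurring fine detail in the class boundary.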

Bzdok, D., Krzywinski, M. & Altman, N. (2018) Points of Significance: Machine learning: supervised methods. Nature Methods 15:5–6.

Background reading

Bzdok, D., Krzywinski, M. & Altman, N. (2017) Points of Significance: Machine learning: a primer. Nature Methods 14:1119–1120.

...more about the Points of Significance column

Human Versus Machine

Tue 16-01-2018
Balancing subjective design with objective optimization.

In a Nature graphics blog article, I present my process behind designing the stark black-and-white Nature 10 cover.

Nature 10, 18 December 2017

Machine learning: a primer

Thu 18-01-2018
Machine learning extracts patterns from data without explicit instructions.

In this primer, we focus on essential ML principles: a modeling strategy to let the data speak for themselves, to the extent possible.

The benefits of ML arise from its use of a large number of tuning parameters or weights, which control the algorithm’s complexity and are estimated from the data using numerical optimization. Often ML algorithms are motivated by heuristics such as models of interacting neurons or natural evolution—even if the underlying mechanism of the biological system being studied is substantially different. The utility of ML algorithms is typically assessed empirically by how well extracted patterns generalize to new observations.

Nature Methods Points of Significance column: Machine learning: a primer.

We present a data scenario in which we fit to a model with 5 predictors using polynomials and show what to expect from ML when noise and sample size vary. We also demonstrate the consequences of excluding an important predictor or including a spurious one.

Bzdok, D., Krzywinski, M. & Altman, N. (2017) Points of Significance: Machine learning: a primer. Nature Methods 14:1119–1120.

...more about the Points of Significance column