Martin Krzywinski / Genome Sciences Center / mkweb.bcgsc.ca

words: fun


EMBO Practical Course: Bioinformatics and Genome Analysis, 5–17 June 2017.


language + fiction

Dark Matter of the English Language—the unwords

Words are easy, like the wind;
Faithful friends are hard to find.
—William Shakespeare

uncountries

The uncountries are places that don't exist, but perhaps should. If you're starting your own country or are hoping to secede from your current employer (here's looking at you, States of the US), you might find this list useful.

The list of uncountries is generated by training on a list of 257 countries and territories.
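
According to the summary further down this page, the names come from the char-rnn recurrent neural network. As a much simpler stand-in, and not the method used here, a character-level Markov chain shows the same basic idea: learn which letters tend to follow which, then sample new names one character at a time. The six-name training list, the context length of 2 and the function names below are assumptions for illustration only.

```python
import random
from collections import defaultdict

START, END = "^", "$"  # padding symbols, assumed absent from real names

def train(names, order=2):
    """Record which character follows each `order`-character context."""
    model = defaultdict(list)
    for name in names:
        padded = START * order + name + END
        for i in range(len(padded) - order):
            context = padded[i:i + order]
            model[context].append(padded[i + order])
    return model

def generate(model, order=2, max_len=25, rng=random):
    """Sample one name character by character until END is drawn."""
    context = START * order
    out = []
    while len(out) < max_len:
        nxt = rng.choice(model[context])
        if nxt == END:
            break
        out.append(nxt)
        context = (context + nxt)[-order:]
    return "".join(out)

# Tiny illustrative training set (the real list has 257 entries).
countries = ["Canada", "Panama", "Ghana", "Guyana", "Uganda", "Rwanda"]
model = train(countries)
rng = random.Random(0)
print([generate(model, rng=rng) for _ in range(5)])
```

A Markov chain only sees the last few characters, whereas a recurrent network can in principle carry longer-range context, so the output of this sketch will be cruder than the lists on this page.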

Here's my bucket list of where I'm going next:

  • Conchar and Pobacia
  • Hzuuland
  • New Kain
  • Rabibus and Megee Islands
  • Sentip and Sitina
  • Sinistan
  • Tuskia
  • Urzenia
  • Vontila

Below are the alphabetically first single-word uncountries (up to two per starting letter) for each length from 4 to 11 letters. In some cases, no names of a given length were generated for a given letter.

—4—
Aani
Aemo
Ball
Bang
Cada
Caga
Dafa
Dalr
Eira
Eran
Fani
Fato
Gaar
Gace
Hana
Hhen
Iaou
Inor
Kala
Kani
Lain
Lale
Mabe
Mage
Naom
Nare
Pein
Peis
Ragi
Raiy
Saen
Saic
Taga
Tans
Ucan
Uica
Venr
Viam
—5—
Aalce
Aanie
Babat
Baica
Caane
Cacae
Daaca
Demga
Eamat
Eanda
Fabta
Fania
Gaane
Gaare
Haemy
Harre
Imina
Imoba
Jacin
Jania
Kande
Kangi
Labua
Laiir
Mabia
Maeky
Nague
Nalgo
Paine
Paiti
Qetia
Qiria
Relat
Renir
Sabga
Sabta
Tabia
Tagan
Ucite
Uenha
Vamar
Vanga
Wiady
Zepek
—6—
Aamade
Afanen
Baaira
Baggia
Cabomo
Caemia
Darlye
Darzan
Eagero
Eattas
Faitia
Farado
Gabbak
Gabiai
Hauima
Henlia
Icalin
Iganda
Jaigio
Jartar
KSonko
Kabolo
Labopa
Lacuua
Mabana
Mabiak
Nalgin
Nandar
Ougagu
Paiada
Paldol
Rabgen
Ramgui
Saadae
Sabite
Taikua
Takkia
Udapor
Uermin
Vandes
Vangan
Wantia
Wengan
—7—
Aanigah
Agentan
Bagtera
Baliadi
Caathea
Cabibia
Dahmomo
Dalhais
Ebaniar
Eboniat
Falinha
Falttin
Gaereta
Gainana
Haratan
Hasland
Iafercy
Icrotii
Jelican
Jeliwia
Kanfono
Kargice
Lacitia
Laitgoe
Mabaden
Maellha
Naendia
Nalhind
Onbagin
Pabapua
Pagonia
Rarralo
Rarrkak
SEerral
Sabaida
Taateaa
Tahiria
Uagayas
Ujoland
Vaairta
Valonds
—8—
Aegtomis
Aeirania
Balcosda
Baltages
Cacbilii
Calnaria
Dargonda
Darirnio
Eatisasa
Eeniicia
Femanlan
Ferlanda
Gacterat
Gadeqtam
Hinaniti
Hndiwiun
Ihibiano
Ilriload
JenogRan
Jenorala
Kcinisan
Keberlan
Laenhuan
Lalostia
Mabdadde
Mabhalin
Naitieta
Naltenis
Oinvalia
Orringia
Pajtbava
Palalian
Qerbacon
Raliutan
Raphenla
SDatelan
Sabmutia
Taigonia
Talbiela
Uginbiam
Uityeate
Vasguace
Velentin
—9—
Aiwontmia
Alememter
Bawelilia
Beciradue
Cankeslas
Cansiaila
Dacucania
Elorbhiad
Epubhulon
Fredelapo
Gakgasdan
Gantiulan
Hotutuias
Ilallasda
Imroldian
Jendiulia
Jitgodien
Kemadicis
Kerndhand
Lazekatis
Lectarada
MInledian
Mabertima
Nacasnand
Nadordinh
Palcotiis
Panciland
Raecsatas
Rentelisy
SDirniata
Saentolia
Tarhhaldi
Tarnoigan
Untensian
Vantanira
Ventalica
Wensatial
—10—
Amodedhani
Andhsituia
Badetcinia
Bandesland
Camegessow
Canoniitia
Damalhania
Denwarinia
Ensriitrui
Eremgosdon
Garilsista
Gebticiita
Hatendacan
Hecrapband
Inteniania
Irrhalipan
Kendestand
Lantunutan
Lenkalland
Macgalland
Malbaninis
Namgelasta
Naniheomie
Pamestitia
Pilimintan
Reboteisia
Ricanlands
Sacgsainas
Samanhalaa
Tenheposda
Tezadtinia
Vagioliale
Veuthalian
—11—
Adrelebcima
Amurenoilan
Berniwhpana
Buenaslatda
Catdhtilard
Cerrertoria
Daniacsalon
Eirniatiars
Gaundaniani
Getnicistan
Kiaghbaliaa
Lurolelicam
Madcesladts
Melhellunds
Naporrestan
Niuritantsa
Pebenigisia
Rancitolian
Sarcestalta
Sengrerolia
Varenchales
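
The selection used for the lists above (group by length, then keep the alphabetically first names for each starting letter) could be sketched as follows. This is a guess at the procedure, not the script that produced the page. The function name and the `per_letter=2` default are assumptions, and the sample names are taken from elsewhere on this page.

```python
from collections import defaultdict

def first_per_letter(names, per_letter=2):
    """Group single-word names by length, then keep the alphabetically
    first `per_letter` names for each starting letter."""
    by_length = defaultdict(lambda: defaultdict(list))
    for name in sorted(set(names)):
        if " " in name:
            continue  # multi-word names are listed separately
        by_length[len(name)][name[0].upper()].append(name)
    return {
        length: [n for letter in sorted(groups) for n in groups[letter][:per_letter]]
        for length, groups in sorted(by_length.items())
    }

sample = ["Alan", "Bang", "Aemo", "Ball", "Aani", "Tuskia", "Urzenia", "New Kain"]
print(first_per_letter(sample))
# {4: ['Aani', 'Aemo', 'Ball', 'Bang'], 6: ['Tuskia'], 7: ['Urzenia']}
```

Python's default string sort puts uppercase letters before lowercase ones, which would also explain why a name like KSonko is listed ahead of Kabolo.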

And below are uncountries that are made up of more than one word. The neural network doesn't always do a good job of capitalization.

—6—
Bel Eo
—7—
Ar Neli
Bei Ros
Co Naf,
Es onda
Gob and
Ka -uca
Lex Sen
Mec Len
Se amar
Ton San
—8—
Anru Ran
Baqta Aa
Can Kanc
Dr Belle
Eone Vue
Giinu an
In Gecan
Leen Kon
Mons ald
New Kain
Pobia io
Se Mawan
Tanv Wag
Un Sayth
Vanbo ia
—9—
Arte olia
Ban Tenka
Cui iepes
Dant Sion
Enu Balra
Fem Feriu
Gia honia
Kamen San
Lan Giane
Ma orepan
Nak Manti
Pat Gamia
Saa hetiu
Tar Itlin
Uem Ladde
—10—
Aem Latlia
Banh Cerra
Cairt Aani
Dal Vclcin
Ee Riritan
Gaine Sora
Ken Sonras
Laen Lalor
Mal Rilteh
Nib Carean
Paoth enia
Ran borado
Saed Canua
Tamr eatos
Vamin ores
—11—
Agim Niidea
Baman oshon
Catil Menia
Eil Mitakeo
Freni Niray
Ge Manlando
Jhs Lelland
Ktuct Calia
Leg Saltima
Macan Taman
Namet Gacia
Panan Rerni
Sab Nelieda
Tanc Talind
Vuitan Sera
—12—
Aginh Erpata
Bamen Island
Cancd Samcua
Demun Bondan
Eginan Kerta
Gaicc Iutand
Inanon Gapia
Kanan Island
Lionh Irania
Nameon Raran
Rarve Iitand
Saeci Inlond
Talth Isliin
Unde Narpisa

—13—
Aipen Sabtars
Bab il Rinvee
Caneg Iclands
Dont Calioica
Emanh Naytani
Gaetia esland
Iatin Islands
Jamedh Island
Karch Sartier
Laine Islands
Maind Veltant
Nadtin Launua
Pau Meny ings
Rarciua Oaros
Saini Islands
Tacen Goetian
Urran Teviina
Vopenkb Toppa
—14—
Actatiat Nalto
Bahten Gojilan
Cela Ticgialia
Dototiat Lieda
Eusruuta essan
Fentch Nopyvon
Gaicd Eingasya
Inoral Islands
Jounh Miiticci
Kinera en Cime
Ladten Reperta
Maponit Peraco
Nalton Islands
Pamari eslands
Raqton Beparte
Salnen Islands
Tamokan Asteto
Vekmad Islands
—15—
Aepgenin Veliin
Bemanan Supatii
Caddad Gialerna
Eertiton Ialesd
Fuparrat Asiben
Gamgad Rerradia
Hitetal Iflasds
Jawshanad Meemi
Launiad Istands
Mariadec Geruur
Negend Peregine
Pacabara islans
Rortan Rertaria
Saldusl Islands
Talion Reputlic
Umzenon Islands
—16—
Anevares Inlands
Bangan Aulentand
Caun Nelch afdsa
Dita onu Islands
Eurania Tomontia
Fontican Aviisli
Gaitinl Indipige
Izerlind Reraumo
Jarenran Islands
Krdirtel Cenuria
Lazinan Bestilin
Maltinun Islands
Namnan Sceneilin
Pauran and Rarbe
Ramgiad Cruerava
Sagcas and Narol
Tanin Kenthurian
Urarcan Sudhiety
Vharkiin Ralands
—17—
Avaryi Barsseland
Binicon Areonieme
Cbiathan Islicssa
Dulial and Carora
Eciircan Rarandin
Gainonial Islands
Jumo iten Serlacd
Kaldtan Fiamusaia
Lattioalan CSubia
Matperacd ofdente
Neni Nicch Dorova
Sanon Perbelobgie
Tengepoth Iscands
Wapu et Lat Miban

—18—
Amanes Vontenseiar
Bamntal Asitrcanos
Canomad Rertheqiad
Detuereilin Inlatd
Eeet orwen Seworas
Gantarten Afan and
Kary escin Islands
Lamorho an Lebbati
Mem aynol Islandsi
Nibitayd ordheriio
Oenginh Kinderland
Pacaut Martirlalon
Salend Fe Selerdan
Terviniot Islandss
Vetbited Afobeslen
—19—
Agen Tand and Suoni
Bont Marrtan Toraza
Conchar and Pobacia
Eith Miap Minebanis
Galenta Gueutinanes
Kucaran and Samosea
Loettiruan Atereeti
Mon Varmands Island
Notk Maeg Lemtorban
Pestsian and Kupeta
Sarretonien endanda
Tinba Rand Banciton
Utabtin Anian hulan
—20—
Alenra Varin Mepubla
Cenialas and Malalia
Dagic Islands Island
Ekmhula andi Leprico
Garthda and Geekiree
Khin Matib Gebuyston
Liqan and Nantorepua
Maloil Iulands Ncoty
Nanuu Ieean Iymoldir
Samtes iucis Inlends
Teni emctan Sanucias
Urpoe an Rec Sucilia
—21—
Borne iarran Devarece
Car Rethil Nacinitana
Emytican Geru atgania
Gaind Marchan Islonds
Molticon Cini ondiles
Naund Ramen Arigariar
Pomithan Micrilanimia
Ramtitan Saruan Gaico
Sala am Ton Pameruidi

Here are some lists of uncountries with common suffixes.

*nia Ariania Aruenia Bamenia Bolsnia Bukania Caminia Carenia Copania Eniania Eruinia Eryinia Eyuinia Fvounia Gapania Gorania Guyinia Imgania Lebania Lepania Mezania Pagonia Pamonia Piainia Pirania Saminia Sesinia Simania Somenia Sorinia Tinonia Turunia Urzenia Badetcinia Damalhania Denwarinia Inteniania Mangevinia Seregiania Tezadtinia Tudennenia Akinia Arenia Arunia Bocnia Boinia Bounia Buinia Burnia Byunia Caunia Eminia Gainia Geenia Geinia Giania Guania Guinia Guonia Gwinia Jhunia Jiinia Jirnia Kcenia Leinia Lornia Neenia Rernia Ruenia Sannia Shinia Siinia Siunia Suinia Uninia Vasnia Arefeonia Bevomania Dacucania Eziboonia Gibstania Klbininia Setrounia Shlatania Suunienia Teroninia EwDirireonia Aeirania Bemginia Bunyonia Canmania Carginia Carnania Cosrania Culiinia Cumiinia Duinania Ezupinia Geziania Guinenia Guurania Konvonia Lalzinia Lertania Marbania Nandania Narnania Nenconia Pastania Sadiania Sazcinia Sigwenia Smeminia Sonconia Surbania Taigonia Tebcania Tendania Unyrania Cania Conia Fania Henia Jania Jonia Kinia Lonia Mania Ninia Nonia Sania Tenia Tonia Vania

*lan Anualan Binelan Biselan Comelan Donolan Eduulan Iferlan Ilaslan Iudelan Papilan Potalan Srinlan Takilan Tamglan Cemuneilan Gehsyanlan Mecineslan Amurenoilan Aralan Cralan Geilan Inilan Innlan Kerlan Nanlan Sorlan Tnulan Beugeilan Condamlan Cunogslan Gantiulan Geevallan Gienyslan Memsinlan Mertorlan Minnaulan Mururolan Neminolan Sandeslan Sennerlan Titorilan Vertonlan Andenlan Betarlan Ceneslan Cunmelan Curislan Femanlan Geamilan Keberlan Larielan Meloelan Menrulan Molielan Otenelan Redallan SDatelan Selenlan Alan Glan Tlan Bolan Bulan Culan Galan Malan Selan Solan

*land Garland Hasland Ujoland Bandesland Benhelland Bhqlalland Dhinioland Lenkalland Macgalland Vuleslland Caland Feland Maland Saland Anderland Cemerland Geunoland Lutkaland Mowurland Panciland Parraland Anreland Asealand Hzuuland Maerland Masrland Memoland Namaland Navaland Ponoland Tuysland Vetaland

*ana Amynana Balpana Burgana Congana Fuubana Gainana Gaulana Guiiana Somuana Tartana Vehcana Cunheqrana Berniwhpana Antana Argana Buvana Mabana Merana Mobana Relana Rucana Semana Sikana Nteradana Gitanana Hana Lana Mana Sana Giana Guana Gvana Toana

*ica Cinuica Deyrica Goitica Maltica Mannica Merlica Peotica Raryica Sortica Stamica Sumhica Tektica Tiumica Utiuica Bemgbicica Aniica Bapica Narica Sanica Selica Sibica Gatuitica Iuperiica Ventalica Buuntica Bwentica Sorgeica Uica Baica Umica

*can Banecan Celican Jelican Pelecan Deslisacan Hatendacan Leucan Noccan Tircan Tlycan Shaylican Suniracan Cerarcan Emunecan Gepuucan Mamescan Salgican Vongican Ucan

*dan Euvadan Gtardan Monmdan Seundan Srisdan Unendan Banitisdan Ringkeldan Bildan Landan Saldan Soldan Sordan Tamdan Gakgasdan Mremaldan Stelosdan Lapardan Siwesdan Srunadan

*stan Baystan Caistan Velstan Gentiastan Getnicistan Naporrestan Gistan Mastan Tengastan Sinistan

*tar Lalatar Sanktar Simntar Somytar Swettar Temitar Burekertar Jartar Tantar Unitar Gornitar Satar
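
Suffix lists like the ones above can be produced with a simple filter over the generated names. The sketch below is one way to do it, under the assumption that matching is case-insensitive. The function name is hypothetical and the sample names are taken from this page.

```python
def group_by_suffix(names, suffixes):
    """Map each suffix to the sorted names that end with it."""
    groups = {s: [] for s in suffixes}
    for name in names:
        for s in suffixes:
            if name.lower().endswith(s):
                groups[s].append(name)
    return {s: sorted(members) for s, members in groups.items()}

names = ["Urzenia", "Sinistan", "Hzuuland", "Tuskia", "Vontila", "Hasland", "Jartar"]
for suffix, members in group_by_suffix(names, ["nia", "land", "stan", "tar"]).items():
    print(f"*{suffix}", " ".join(members))
```

If one suffix is itself the ending of another (for example `tan` and `stan`), a name would land in both groups, so a stricter version might keep only the longest match.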


news + thoughts

`k` index: a weightlifting and CrossFit performance measure

Wed 07-06-2017

Similar to the `h` index in publishing, the `k` index is a measure of fitness performance.

To achieve a `k` index for a movement you must perform `k` unbroken reps at `k`% 1RM.

The expected value for the `k` index is probably somewhere in the range of `k = 26` to `k = 35`, with higher values progressively more difficult to achieve.
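
The definition above translates into a short calculation. In this sketch the function names and the sample attempts are hypothetical, and it assumes that extra reps or a load heavier than `k`% still count toward `k`, which the definition on this page does not state explicitly.

```python
import math

def load_for_k(k, one_rep_max):
    """Weight to lift for a k-index attempt: k% of the 1RM."""
    return one_rep_max * k / 100

def k_index(attempts):
    """Largest k supported by any attempt, where an attempt is a pair
    (percent_of_1rm, unbroken_reps). Under the assumption stated above,
    each attempt scores min(floor(percent), reps)."""
    return max((min(math.floor(pct), reps) for pct, reps in attempts), default=0)

attempts = [(50, 12), (30, 28), (35, 31)]  # made-up training log
print(k_index(attempts))    # 31
print(load_for_k(30, 100))  # 30.0
```

For example, with a 1RM of 100 kg, an attempt at `k = 30` means 30 unbroken reps at 30 kg.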

In my `k` index introduction article I provide a detailed explanation, a rep scheme table and a WOD example.

Dark Matter of the English Language—the unwords

Wed 07-06-2017

I've applied the char-rnn recurrent neural network to generate new words, names of drugs and countries.

The effect is intriguing and facetious—yes, those are real words.

But these are not: necronology, abobionalism, gabdologist, and nonerify.

These places only exist in the mind: Conchar and Pobacia, Hzuuland, New Kain, Rabibus and Megee Islands, Sentip and Sitina, Sinistan and Urzenia.

And these are the imaginary afflictions of the imagination: ictophobia, myconomascophobia, and talmatomania.

And these, of the body: ophalosis, icabulosis, mediatopathy and bellotalgia.

Want to name your baby? Or someone else's baby? Try Ginavietta Xilly Anganelel or Ferandulde Hommanloco Kictortick.

When taking new therapeutics, never mix salivac and labromine. And don't forget that abadarone is best taken on an empty stomach.

And nothing increases the chance of getting that grant funded more than proposing the study of a new –ome! We really need someone to look into the femome and manome.

Dark Matter of the Genome—the nullomers

Wed 31-05-2017

An exploration of things that are missing in the human genome. The nullomers.

Julia Herold, Stefan Kurtz and Robert Giegerich. Efficient computation of absent words in genomic sequences. BMC Bioinformatics (2008) 9:167
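
To make the idea of absent words concrete, here is a brute-force sketch that enumerates every word of length `k` and reports those that never occur in a sequence. It is not the efficient algorithm of the paper cited above. It ignores the reverse strand, and the toy sequence and function names are assumptions for illustration.

```python
from itertools import product

def absent_words(sequence, k, alphabet="ACGT"):
    """Return all length-k words over `alphabet` that never occur in `sequence`."""
    seen = {sequence[i:i + k] for i in range(len(sequence) - k + 1)}
    words = ("".join(p) for p in product(alphabet, repeat=k))
    return sorted(w for w in words if w not in seen)

def shortest_absent_words(sequence, max_k=12):
    """Increase k until some word is missing; those are the shortest absent words."""
    for k in range(1, max_k + 1):
        missing = absent_words(sequence, k)
        if missing:
            return k, missing
    return None, []

k, missing = shortest_absent_words("ACGTTGCA")
print(k, missing)
```

Enumerating all 4^k words is only practical for small `k`; a genome-scale search needs the kind of index-based method the paper describes.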

Clustering

Wed 31-05-2017
Clustering finds patterns in data—whether they are there or not.

We've already seen how data can be grouped into classes in our series on classifiers. In this column, we look at how data can be grouped by similarity in an unsupervised way.

Nature Methods Points of Significance column: Clustering.

We look at two common clustering approaches: `k`-means and hierarchical clustering. All clustering methods share the same approach: they first calculate similarity and then use it to group objects into clusters. The details of the methods, and outputs, vary widely.
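
The two steps named above, measure similarity and then group, are easy to see in a bare-bones `k`-means loop. This is a minimal sketch and not code from the column: it uses squared Euclidean distance as the (dis)similarity measure, takes the starting centers as an argument instead of choosing them randomly, and runs on made-up 2-D points.

```python
def kmeans(points, centers, iterations=10):
    """Lloyd's algorithm on 2-D points with user-supplied starting centers."""
    for _ in range(iterations):
        # Step 1: measure distance to every center and assign to the nearest
        clusters = [[] for _ in centers]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centers]
            clusters[distances.index(min(distances))].append(p)
        # Step 2: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center)
        centers = [
            (sum(p[0] for p in cl) / len(cl), sum(p[1] for p in cl) / len(cl)) if cl else c
            for cl, c in zip(clusters, centers)
        ]
    return centers, clusters

points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centers, clusters = kmeans(points, centers=[(0, 0), (10, 10)])
print(centers)
print(clusters)
```

Note that the loop always returns `k` clusters, whether or not the data contain that many groups.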

Altman, N. & Krzywinski, M. (2017) Points of Significance: Clustering. Nature Methods 14:545–546.

Background reading

Lever, J., Krzywinski, M. & Altman, N. (2016) Points of Significance: Logistic regression. Nature Methods 13:541-542.

Lever, J., Krzywinski, M. & Altman, N. (2016) Points of Significance: Classifier evaluation. Nature Methods 13:603-604.


What's wrong with pie charts?

Thu 25-05-2017

In this redesign of a pie chart figure from a Nature Medicine article [1], I look at how to organize and present a large number of categories.

I first discuss some of the benefits of a pie chart—there are few and specific—and its shortcomings—there are few but fundamental.

I then walk through the redesign process by showing how the tumor categories can be shown more clearly if they are first aggregated into a small number of groups.

(bottom left) Figure 2b from Zehir et al. Mutational landscape of metastatic cancer revealed from prospective clinical sequencing of 10,000 patients. (2017) Nature Medicine doi:10.1038/nm.4333

Tabular Data

Tue 11-04-2017
Tabulating the number of objects in categories of interest dates back to the earliest records of commerce and population censuses.

After 30 columns, this is our first one without a single figure. Sometimes a table is all you need.

In this column, we discuss nominal categorical data, in which data points are assigned to categories in which there is no implied order. We introduce one-way and two-way tables and the `\chi^2` and Fisher's exact tests.
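
For a two-way table with two rows and two columns, both tests mentioned above fit in a few lines of standard-library Python. This is a sketch under stated assumptions, not code from the column: the chi-square statistic has no continuity correction, the Fisher p-value is the common two-sided version that sums all tables no more likely than the observed one, every margin is assumed nonzero, and the counts are made up.

```python
from math import comb

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    return stat

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact p-value from hypergeometric probabilities."""
    r1, r2, c1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Probability of a table with x in the top-left cell, margins fixed
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    # Small tolerance so ties with the observed table are included
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

print(chi_square_2x2(3, 1, 1, 3))    # 2.0
print(fisher_exact_2x2(3, 1, 1, 3))  # 34/70, about 0.486
```

With counts this small the chi-square approximation is unreliable, which is the usual reason for preferring the exact test.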

Altman, N. & Krzywinski, M. (2017) Points of Significance: Tabular data. Nature Methods 14:329–330.
