In 2001, the first human genome sequence was published. Now, just over 10 years later, we are capable of sequencing a genome in just a few days. Massively parallel sequencing projects now make it possible to study the cancers of thousands of individuals. New data mining approaches are required to robustly interrogate the data for causal relationships among the inherently noisy biology. How does one identify genetic changes that are specific and causal to a disease within the rich variation that is either natural or merely correlated? The problem is one of finding a needle in a stack of needles. I will provide a non-specialist introduction to data mining methods and challenges in genomics, with a focus on the role visualization plays in the exploration of the underlying data.
The title of the talk was drawn from the paper
Gregory M. Cooper & Jay Shendure. Needles in stacks of needles: finding disease-causal variants in a wealth of genomic data. Nature Reviews Genetics 12, 628-640 (September 2011).
I will be posting a full list of references for the talk shortly.
Our cover on the 11 January 2023 Cell Genomics issue depicts the process of determining the parent-of-origin using differential methylation of alleles at imprinted regions (iDMRs), imagined as a circuit.
Designed in collaboration with Carlos Urzua.
Akbari, V. et al. Parent-of-origin detection and chromosome-scale haplotyping using long-read DNA methylation sequencing and Strand-seq (2023) Cell Genomics 3(1).
Browse my gallery of cover designs.
My cover design on the 6 January 2023 Science Advances issue depicts DNA sequencing read translation in high-dimensional space. The image shows 672 bases of sequencing barcodes generated by three different single-cell RNA sequencing platforms, encoded as oriented triangles on the faces of three 7-dimensional cubes.
More details about the design.
Kijima, Y. et al. A universal sequencing read interpreter (2023) Science Advances 9
Browse my gallery of cover designs.
If you sit on the sofa for your entire life, you’re running a higher risk of getting heart disease and cancer. —Alex Honnold, American rock climber
In a follow-up to our Survival analysis — time-to-event data and censoring article, we look at how regression can be used to account for additional risk factors in survival analysis.
We explore accelerated failure time regression (AFTR) and the Cox Proportional Hazards model (Cox PH).
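To give a flavor of how censoring enters a Cox PH fit, here is a minimal sketch of the Cox partial log-likelihood for a single covariate. It is an illustration only, not the code from the article: the function name and the toy data are hypothetical, and it assumes no tied event times.

```python
import math

def cox_partial_log_likelihood(beta, times, events, x):
    """Cox partial log-likelihood for one covariate, assuming no tied event times."""
    total = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            # Censored subjects add no term of their own;
            # they only appear in the risk sets of earlier events.
            continue
        # Risk set: everyone still under observation at time t_i.
        risk = sum(math.exp(beta * x[j]) for j, t_j in enumerate(times) if t_j >= t_i)
        total += beta * x[i] - math.log(risk)
    return total

# Hypothetical toy data: follow-up time, event flag (1 = event, 0 = censored),
# and a binary risk factor.
times = [2.0, 3.0, 5.0, 7.0, 11.0]
events = [1, 1, 0, 1, 0]
x = [1, 1, 0, 0, 0]

ll0 = cox_partial_log_likelihood(0.0, times, events, x)  # no covariate effect
ll1 = cox_partial_log_likelihood(1.0, times, events, x)  # hazard ratio exp(1)
```

In this toy data the two earliest events occur in subjects with the risk factor, so a positive coefficient (hazard ratio above 1) yields a higher partial likelihood than a coefficient of zero.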
Dey, T., Lipsitz, S.R., Cooper, Z., Trinh, Q., Krzywinski, M. & Altman, N. (2022) Points of significance: Regression modeling of time-to-event data with censoring. Nature Methods 19.
My 5-dimensional animation sets the visual stage for Max Cooper's Ascent from the album Unspoken Words. I have previously collaborated with Max on telling a story about infinity for his Yearning for the Infinite album.
I provide a walkthrough of the video, describe the animation system I created to generate the frames, and show you all the keyframes.
The video recently premiered on YouTube.
Renders of the full scene are available as NFTs.
I am more than my genome and my genome is more than me.
The MIT Museum reopened at its new location on 2nd October 2022. The new Gene Cultures exhibit featured my visualization of the human genome, which walks through the size and organization of the genome and some of the important structures.