Visual Analytics for Genomics. Cydney Nielsen, BC Cancer Agency, Vancouver, BC, Canada
Outline: Part 1, Introduction to Genomics; Part 2, Visual Design for Genomics; Part 3, Hands-On Design Exercise
Part 1 Introduction to Genomics
Genomics Workflow. genome: the complete genetic material of a cell. Part 1. Intro to Genomics
Sequencing Experiment: G-C, T-A
Genomics Workflow: sample → data → insight
Genomics Workflow: sample → experiment (sequencing technology) → data → insight
Genomics Workflow: sample → experiment (sequencing technology) → data + analysis (visualization, computation) → insight
Genomics Workflow (molecular biology): sample → experiment (sequencing technology) → data
Genomics Workflow (computational biology / bioinformatics, visual analytics): data + analysis (visualization, computation) → insight
Sequencing Experiment: Sequencing machine → millions of short sequences ("reads"), e.g. 75 nt each, compared to >3 billion nt in the human genome. Example reads: TACACCGATACACCAGA, ACCAGATGGATTAGATGTA, AAAAAAAAAAAAAAGATGT, AAAGATGTATACCACCAG, CACCAGTACACCGATA
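The gap between read length and genome size can be made concrete with a little arithmetic. The sketch below uses the slide's figures (75 nt reads, a genome rounded to 3 billion nt). The function names and the 30-fold example are illustrative assumptions, not part of the talk.

```python
import math

GENOME_LENGTH_NT = 3_000_000_000  # slide says ">3 billion nt"; rounded down here
READ_LENGTH_NT = 75               # example read length from the slide


def reads_for_coverage(genome_nt: int, read_nt: int, fold: float) -> int:
    """Reads needed so that total read length equals `fold` times the genome."""
    return math.ceil(fold * genome_nt / read_nt)


def mean_coverage(n_reads: int, read_nt: int, genome_nt: int) -> float:
    """Average number of reads covering each base, ignoring mapping losses."""
    return n_reads * read_nt / genome_nt


if __name__ == "__main__":
    # Reads needed to cover every base once on average.
    print(reads_for_coverage(GENOME_LENGTH_NT, READ_LENGTH_NT, 1))
    # 30-fold is a hypothetical target chosen only for illustration.
    print(reads_for_coverage(GENOME_LENGTH_NT, READ_LENGTH_NT, 30))
```

Even a single pass over the genome needs tens of millions of reads, which is why the slide speaks of "millions of short sequences".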
Sequencing Experiment: ~$5,000 in 2001; ~10¢ in 2011
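The size of that drop is easier to appreciate as a fold change. The sketch below uses only the two figures on the slide. The slide does not state the unit of sequence being priced, so none is assumed here, and the constant-rate exponential decline used for the halving time is a simplifying assumption.

```python
import math

# Slide figures expressed in cents to keep the arithmetic exact.
COST_2001_CENTS = 500_000  # ~$5,000
COST_2011_CENTS = 10       # ~10 cents


def fold_reduction(old_cost: float, new_cost: float) -> float:
    """How many times cheaper the new cost is than the old one."""
    return old_cost / new_cost


def halving_time_years(old_cost: float, new_cost: float, years: float) -> float:
    """Years per halving of cost, assuming a steady exponential decline."""
    return years / math.log2(old_cost / new_cost)


if __name__ == "__main__":
    print(fold_reduction(COST_2001_CENTS, COST_2011_CENTS))
    print(round(halving_time_years(COST_2001_CENTS, COST_2011_CENTS, 2011 - 2001), 2))
```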
Sequencing Experiments: De novo assembly. [Figure: reads (e.g. AGCTTCAGATGGACAGATAA, GGCATACAGACTTAGACATA, CCAGACAAGACAGACACAGTA, TACAAGACATAAGCAATACAGA) combined into a genome assembly.]
Sequencing Experiments: De novo assembly; Re-sequencing. [Figure: reads combined into a genome assembly; reads placed along a reference genome.]
Sequencing Experiments: De novo assembly; Re-sequencing; Enrichment. [Figure: reads combined into a genome assembly; reads placed along a reference genome for re-sequencing and for enrichment.]
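Re-sequencing and enrichment experiments both come down to placing reads on a reference and counting how many reads cover each base. The toy sketch below does this by exact string matching. Real aligners tolerate mismatches and use indexed search, so this is only an illustration. The function name and the short reference and reads are made up for the example.

```python
def read_depth(reference: str, reads: list[str]) -> list[int]:
    """Per-base depth from placing each read at every exact match in the reference."""
    depth = [0] * len(reference)
    for read in reads:
        start = reference.find(read)
        while start != -1:
            # Every base under this placement gains one covering read.
            for i in range(start, start + len(read)):
                depth[i] += 1
            start = reference.find(read, start + 1)
    return depth


if __name__ == "__main__":
    toy_reference = "AACCGGTT"           # hypothetical reference
    toy_reads = ["AACC", "CCGG", "CCGG"]  # hypothetical reads
    print(read_depth(toy_reference, toy_reads))
```

Positions with unusually high depth are the kind of signal an enrichment experiment looks for, while roughly even depth is what a re-sequencing experiment aims at.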
Sequencing Experiments
• What sequence variations appear in cancer patients, but not in unaffected individuals?
• Are these variations predictive of survival outcome?
• Are these variations causal for the disease (driver mutations) or not?
Part 1 - Summary
1. Large and ever-increasing volume of sequencing data
2. Improved analysis techniques are essential for biologists and clinicians to make the most of these data
3. Great potential for visual analytics to facilitate insight and understanding
Part 2 Visual Design for Genomics
Challenge 1: Large number of samples for comparison. Part 2. Visual Design for Genomics
Challenge 1: Large number of samples for comparison. “To systematically characterize the genomic changes in hundreds of tumors… and thousands of samples over the next five years” (The Cancer Genome Atlas, www.cancergenome.nih.gov)
Genome Browsers: Stacked data tracks along a common genome x-axis. [Figure axes: data samples (y-axis), genome coordinate (x-axis).]
Genome Browsers: UCSC Cancer Genomics Heatmaps. [Figure: Glioblastoma Copy Number Abnormality, Agilent 244A array (n=200); rows are data samples annotated by gender and tumor vs normal; x-axis is genome coordinate.] Zhu et al., Nature Methods, 2009
Challenge 1: Large number of samples for comparison. Critically consider what you need to display, e.g. replace primary data with a biologically meaningful summary, such as significant changes between samples.
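One way to read "replace primary data with a summary" is to collapse many sample tracks into a short list of genomic bins where two groups differ. The sketch below keeps bins whose group means differ by at least a threshold. The threshold is a crude stand-in for a proper statistical test, and the function name, data layout and cutoff are all assumptions made for illustration.

```python
from statistics import mean


def summarize_changes(tumor, normal, threshold=1.0):
    """Return (bin_index, mean_difference) for bins where the groups differ.

    `tumor` and `normal` are lists of samples; each sample is a list of
    per-bin values (for example copy number) over the same genomic bins.
    """
    n_bins = len(tumor[0])
    changed = []
    for b in range(n_bins):
        diff = mean(sample[b] for sample in tumor) - mean(sample[b] for sample in normal)
        if abs(diff) >= threshold:
            changed.append((b, diff))
    return changed


if __name__ == "__main__":
    # Hypothetical values: two tumor and two normal samples over three bins.
    tumor = [[2.0, 4.0, 2.0], [2.0, 6.0, 2.0]]
    normal = [[2.0, 2.0, 2.0], [2.0, 2.0, 2.0]]
    print(summarize_changes(tumor, normal))
```

A display built from the returned list needs one mark per changed bin rather than one row per sample, which is the reduction the slide argues for.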
Challenge 2: Genomic features are small and sparse
Genome Browsers: LOCAL VIEW
Genome Browsers: LOCAL VIEW. Human chr1, 1 pt corresponds to 480 kb, which is larger than 98% of all human genes!
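The scale problem is simple division: at a fixed number of bases per point, most features shrink to a fraction of a point. The sketch below uses the slide's 480 kb per point. The chr1 length (249,250,621 bp, the hg19/GRCh37 value) and the example gene lengths are assumptions added for illustration, not figures from the talk.

```python
def bases_per_point(chrom_length_bp: int, width_pt: float) -> float:
    """Bases represented by one point when a chromosome spans `width_pt` points."""
    return chrom_length_bp / width_pt


def feature_width_pt(feature_length_bp: int, bp_per_pt: float) -> float:
    """On-screen width of a feature, in points, at the given scale."""
    return feature_length_bp / bp_per_pt


if __name__ == "__main__":
    BP_PER_PT = 480_000           # scale quoted on the slide
    CHR1_LENGTH_BP = 249_250_621  # assumed: chr1 length in hg19/GRCh37
    # Width of the whole chromosome at this scale (a bit over 500 points).
    print(CHR1_LENGTH_BP / BP_PER_PT)
    # Hypothetical gene lengths, chosen only to show the effect.
    for gene_bp in (10_000, 100_000):
        print(gene_bp, feature_width_pt(gene_bp, BP_PER_PT))
```

Any gene shorter than 480 kb occupies less than one point at this zoom level, so it cannot be drawn to scale in a whole-chromosome view.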
Hilbert Curve: GLOBAL VIEW. [Figure: chromatin states 1-9 along Chromosome 3L laid out on a Hilbert curve, with labeled regions: cluster of small expressed genes, open chromatin domain, PcG domains, heterochromatin-like domain, pericentromeric heterochromatin.] Kharchenko et al., Nature, 2011; Anders, Bioinformatics, 2009
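A Hilbert curve folds a one-dimensional sequence of genome bins into a square, so that bins adjacent on the genome land in adjacent cells. The sketch below uses the standard iterative index-to-coordinate conversion. It is not the code behind the cited figures, and the names `d2xy` and `hilbert_layout` are chosen here for illustration.

```python
def d2xy(order: int, d: int) -> tuple[int, int]:
    """Map index d (0 <= d < 4**order) to (x, y) on a 2**order square Hilbert curve."""
    n = 1 << order
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate or flip the quadrant so sub-curves join end to end.
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_layout(values: list, order: int) -> list[list]:
    """Place per-bin values on a square grid following the Hilbert curve."""
    n = 1 << order
    grid = [[None] * n for _ in range(n)]
    for d, value in enumerate(values[: n * n]):
        x, y = d2xy(order, d)
        grid[y][x] = value
    return grid


if __name__ == "__main__":
    # Sixteen hypothetical genome bins, labeled by their order along the genome.
    for row in hilbert_layout(list(range(16)), order=2):
        print(row)
```

Because consecutive bins stay next to each other, a small feature shows up as a compact patch instead of a sub-pixel sliver on a long axis, which is what makes the global view readable.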
Challenge 2: Genomic features are small and sparse. Connect overview and detail.
Challenge 3: Genomic features involve non-adjacent positions
Challenge 3: Structural rearrangements. [Figure: panels a-e comparing variant and reference arrangements of segments J, J′, K and K′.]
Challenge 3: Structural rearrangements. Circos, Martin Krzywinski
Challenge 3: Structural rearrangements. VISTA-Dot
Challenge 3: All these representations use a genomic coordinate system, which emphasizes base-pair distance between points. Is this the best use of positional information?
Challenge 3: [Figure: M. Krzywinski, adapted from Mackinlay J (1986) ACM Trans Graph 5: 110-141.]
Challenge 3: Genomic features involve non-adjacent positions. Encode important information in position.
Challenge 4: Large number of data types
Genomic rearrangement in cancer. [Figure: SNU-C1 (colorectal), Chr 15; rearrangement types: deletion-type, tandem dup-type, tail-to-tail inverted, head-to-head inverted; tracks: non-inverted orientation, copy number, allelic ratio, inverted orientation; x-axis: genomic location (Mb).] Stephens et al., Cell, 2011
17 mouse genomes. [Figure: circular plot of chromosomes 1-19 and X for mouse strains including CAST/EiJ, WSB/EiJ, PWK/PhJ and SPRET/EiJ, with tracks for SNPs (0 to >100,000), SVs (0 to 742), TEs (0 to 179) and uncallable regions (0 to 836).] Keane et al., Nature, 2011
Challenge 4: Large number of data types. Exploit domain-specific details in your design.
Challenge 5: No longer one genome but many
Single nucleotide variation. Ossowski et al., Genome Research, 2008
Single nucleotide variation. Integrative Genomics Viewer (IGV). Robinson et al., Nature Biotechnology, 2011
Challenge 5: No longer one genome but many. Be open to change (genomics is evolving quickly).