Visualization In Biology Alexander Lex CS 171 Guest Lecture, 18.04.2013

WHAT DO I MEAN: VISUALIZATION IN BIOLOGY? 2

Visualizing the Flight of Bats? [Bergou 2011] 3

Visualizing Bird Populations? [Ferreira 2011] 4

Visualizing Fish Swarms? [Boosherian 2012] 5

Visualizing CT/MRI Data? [Bruckner 2007] 6

NO! IN THIS LECTURE: MOLECULAR BIOLOGY (MB) 7

Why is MB important? Causes of Death in the USA 2011 (bar chart, axis from 0 to 700,000; categories: Heart disease, Cancer, Chronic lower…, Stroke, Accidents, Alzheimer's disease, Diabetes, Kidney-Related, Influenza and Pneumonia, Suicide) [Data from CDC Death and Mortality Report 2011] 8

Why is MB important? Causes of Death in the USA 2011 (bar chart, axis from 0 to 700,000; categories: Heart disease, Cancer, Chronic lower…, Stroke, Accidents, Alzheimer's disease, Diabetes, Kidney-Related, Influenza and Pneumonia, Suicide) [Data from CDC Death and Mortality Report 2011] 9

Why is MB important? Understanding Fundamentals in Biology; Disease Prevention; Targeted Diagnosis (BioMarkers); Personalized Medicine; Drug Development; Targeted Modification of Organisms 10

Why is Vis for MB important? Biology is experiencing a revolution! Transformation from a wet-lab/experimental to a computational science. Challenge in MB is shifting from Data Acquisition to Data Processing & Analysis. 11

Why is Vis for MB important? 12

What does this mean? We can now do very large experiments 13

Why is the Analysis Hard? 20,000 protein coding genes (1.5% of the genome); 3 billion basepairs. Gene -> Protein -> Function: each of these steps is influenced by many processes! Very complex interplay of functional aspects. 14
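To get a feel for the scale behind these figures, the following back-of-the-envelope sketch combines the slide's own round numbers. It assumes that "1.5% of the genome" means the share of basepairs that is protein coding; the resulting average per gene is only a consequence of those rounded inputs, not a measured value.

```python
# Round figures quoted on the slide, taken at face value.
GENOME_BASEPAIRS = 3_000_000_000
PROTEIN_CODING_GENES = 20_000
CODING_PERMILLE = 15  # 1.5% written as parts per thousand to stay in integers

# Basepairs that fall into protein-coding regions under this reading.
coding_basepairs = GENOME_BASEPAIRS * CODING_PERMILLE // 1000

# Average coding basepairs per gene implied by the rounded inputs.
avg_coding_bp_per_gene = coding_basepairs // PROTEIN_CODING_GENES

print(f"protein-coding basepairs: {coding_basepairs:,}")
print(f"average per gene: {avg_coding_bp_per_gene:,}")
```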

Major Areas for Vis in MB Genome Structure Genome Activity - Omics Data Biological Networks Macromolecular Structures Phylogenetics 15

Genome Structure. What is the sequence of bases in a genome? Common “Defects” (by scale): Chromosomal alterations, Copy-number variation, Mutations, SNPs. How do these influence the phenotype? 16

Genome Structure Vis: “Track-based” Visualization 17

Circular Layouts [Meyer 2009] [Krzywinski 2009] 18

Genome Activity. Which genes are active? How active are they? Gene Expression, Protein Expression, miRNA Expression, Epigenetics: methylation. What is the function of a gene? 19

Heat Maps [Eisen 1999] 20

New Approaches! [Meyer 2010] 21

Biological Networks How do proteins and other (bio)chemical products interact? Protein-Protein interaction Pathways What are the processes in a cell? 23

Protein Interaction Networks [Cytoscape] 24

Pathways [KEGG] 25

Pathways [KEGG] 26

Pathways – Free Layouts [Barsky 2008] 27

CASE STUDIES http://caleydo.org 28

What is Caleydo? Software for visualizing biomolecular data: tabular data (numerical & categorical, e.g., mRNA, microRNA, copy number variation, methylation, mutation status, etc.); clinical data; pathways (KEGG, WikiPathways) 29

Caleydo Core Features Multi-Dataset Analysis. Want to see…. …relationships between multiple datasets? …relationships between tabular and graph data? 30

What is Caleydo? Software for doing research in visualization developed in academic setting platform for trying out radically new visualization ideas Quest for compromise between academic prototyping and ready-to-use software Marc Streit & Alexander Lex 31

Who is Caleydo? Marc Streit (Johannes Kepler University Linz, AT); Alexander Lex (Harvard University, Cambridge, USA); Christian Partl (Graz University of Technology, AT); Samuel Gratzl (Johannes Kepler University Linz, AT); Nils Gehlenborg (Harvard Medical School, Boston, USA); Dieter Schmalstieg (Graz University of Technology, AT); Hanspeter Pfister (Harvard University, Cambridge, USA) 32

Case Study CANCER SUBTYPE VISUALIZATION 33

Motivation. Cancer types are not homogeneous. They are divided into Subtypes: different histology, different molecular alterations. Subtypes have serious implications: different treatment for subtypes, prognosis varies between subtypes. 34

Cancer Subtype Analysis: Done using many different types of data, for large numbers of patients. 35

Large-scale project to catalogue genetic mutations responsible for cancer 20 tumor types 500 patient samples each Extensive molecular profiling for each patient 36

TCGA Data: mRNA expression, microRNA expression, methylation levels, mutation status, copy number status, clinical parameters, pathways 37

Subtype Identification Patients 38

Our goal is to support tumor subtype characterization through integrative visual analysis of cancer genomics data sets. 39

Challenge 1: Manage complex setup of multiple datasets, multiple stratifications & multiple views (Data-View Integrator). Challenge 2: Visualize complex interdependencies between multiple, heterogeneous, large datasets (StratomeX). 40

Subtype Identification Process Step 1: Determine candidate subtypes Step 2: Find supporting evidence 41

Tabular Data Stratification (figure labels: Patients; Candidate Subtypes; Genes, Proteins, etc.) 42

Stratification of a Single Dataset Cluster A1 Cluster A2 Cluster A3 43

Stratification. Subtypes are identified by stratifying datasets, e.g., based on an expression pattern, a mutation status, a copy number alteration, or a combination of these. 44
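A minimal sketch of what stratifying a patient cohort can look like in code. The patient records, the field names, and the expression threshold of 1.0 are all hypothetical values invented for illustration; they are not taken from the lecture or from TCGA.

```python
from collections import defaultdict

def stratify(patients, key):
    """Group patient IDs by the label that key(record) returns."""
    groups = defaultdict(list)
    for patient_id, record in patients.items():
        groups[key(record)].append(patient_id)
    return dict(groups)

# Hypothetical toy records; every value here is made up.
patients = {
    "P1": {"mutated": True, "expression": 2.9},
    "P2": {"mutated": False, "expression": -1.2},
    "P3": {"mutated": True, "expression": 0.4},
    "P4": {"mutated": False, "expression": 3.3},
}

# Stratify by a mutation status (categorical attribute).
by_mutation = stratify(patients, lambda r: "mutated" if r["mutated"] else "wild type")

# Stratify by an expression pattern, here reduced to an assumed threshold.
by_expression = stratify(patients, lambda r: "high" if r["expression"] > 1.0 else "low")

# Stratify by a combination of both criteria.
by_both = stratify(patients, lambda r: (r["mutated"], r["expression"] > 1.0))
```

Each resulting group is a candidate subtype in the sense of the slide; a real analysis would typically replace the threshold with a clustering of the full expression matrix.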

Subtype Identification Process Step 1: Determine candidate subtypes Step 2: Find supporting evidence 45

Tasks: T1 Evaluate whether stratifications support each other; T2 Review effect of stratifications on clinical outcomes, on pathways; T3 Show expression patterns in subtypes 46

Stratification of Multiple Datasets. T1 Evaluate whether stratifications support each other. Tabular, e.g., mRNA: Cluster A1, Cluster A2, Cluster A3. Categorical, e.g., mutation status: B1, B2. 47

Example: Titanic Dataset. Multi-dimensional dataset: Age, Name, Gender, Survival status, Class (1st class, 2nd class, 3rd class and crew). How many male crew members survived? http://lib.stat.cmu.edu/S/Harrell/data/descriptions/titanic.html 48

Mosaic Plot Matrix [Friendly 1999] How many male crew members survived? 49

Parallel Sets [Kosara 2006] How many male crew members survived? 50

Stratification of Multiple Datasets. T1 Evaluate whether stratifications support each other. Tabular, e.g., mRNA: Cluster A1, Cluster A2, Cluster A3. Categorical, e.g., mutation status: B1, B2. 51
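One way to make task T1 concrete is to count how many patients each pair of groups from two stratifications has in common, which is the same kind of question the Titanic example asks of its categories. The sketch below is an illustration under assumptions, not the StratomeX implementation; the patient IDs and group assignments are hypothetical, and only the group names A1 to A3 and B1, B2 follow the slide.

```python
from collections import Counter

def overlap_counts(strat_a, strat_b):
    """Count patients shared by each pair of groups from two stratifications.

    Both arguments map patient ID -> group label. Only patients that
    appear in both stratifications are counted.
    """
    shared = strat_a.keys() & strat_b.keys()
    return Counter((strat_a[p], strat_b[p]) for p in shared)

# Hypothetical assignments: a clustering of tabular data (A1..A3) and
# a categorical stratification such as mutation status (B1, B2).
clusters = {"P1": "A1", "P2": "A1", "P3": "A2", "P4": "A3", "P5": "A3"}
mutation = {"P1": "B1", "P2": "B1", "P3": "B2", "P4": "B2", "P5": "B1"}

counts = overlap_counts(clusters, mutation)
for (a, b), n in sorted(counts.items()):
    print(f"{a} & {b}: {n} patient(s)")
```

If most patients of one cluster land in a single group of the other stratification, the two stratifications support each other; an even spread suggests they are unrelated.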

Stratification of Multiple Datasets. T2 Review effect of stratification. Tabular, e.g., mRNA: Cluster A1, Cluster A2, Cluster A3. Categorical, e.g., mutation status: B1, B2. Dependent Data, e.g., clinical data: Dep. C1, Dep. C2. 52

Columns = Genes; Rows = Patients; Band = Subset of Patients 53

Patients stratified by Copy Number; Patients stratified by Clustering 54

Table, Categorical, Dependent. T3 Show expression patterns in subtypes 55

(Figure; column labels: Survival, EGFR Copy Number Status, mRNA Levels, Glioma Pathway, Survival) 56

Live-Demo! http://stratomex.caleydo.org 57

Case Study PATHWAY & EXPERIMENTAL DATA 58

Experimental Data and Pathways. Pathways represent consensus knowledge for a healthy organism or specific disease. Cannot account for variation found in real-world data. Branches can be (in)activated due to mutation, changed gene expression, modulation due to drug treatment, etc. 59

Why use Visualization? Efficient communication of information. (Figure: graph with nodes A to F next to a table of values: A -3.4, B 2.8, C 3.1, D -3, E 0.5, F 0.3) 60
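The contrast between the table and the graph can be sketched as a mapping from each node's value to a color. The node values below are the ones shown on the slide; the blue-white-red scale and the symmetric limit are assumptions chosen for illustration, not necessarily the encoding used in the lecture.

```python
def diverging_color(value, limit):
    """Map a value in [-limit, limit] to an (r, g, b) tuple.

    Negative values fade from white to blue, positive values from white
    to red. Values beyond the limit are clamped.
    """
    t = max(-1.0, min(1.0, value / limit))
    fade = round(255 * (1 - abs(t)))
    if t >= 0:
        return (255, fade, fade)
    return (fade, fade, 255)

# Node values as listed on the slide.
values = {"A": -3.4, "B": 2.8, "C": 3.1, "D": -3.0, "E": 0.5, "F": 0.3}

# Assumed choice: a scale symmetric around zero, sized to the largest magnitude.
limit = max(abs(v) for v in values.values())
node_colors = {node: diverging_color(v, limit) for node, v in values.items()}
```

Coloring the nodes with `node_colors` lets a reader spot strongly negative and strongly positive nodes at a glance instead of scanning the numbers one by one.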

Experimental Data and Pathways [KEGG] [Lindroos2002] 61

REQUIREMENTS ANALYSIS 62

What to Consider when Visualizing Experimental Data and Pathways: Five Requirements. Ideal visualization technique addresses all. Talking about 3 today. 63

R I: Data Scale. Large number of experiments: large datasets have more than 500 experiments. Multiple groups/conditions. 64

R II: Data Heterogeneity. Different types of data, e.g., mRNA expression (numerical), mutation status (categorical), copy number variation (ordered categorical), metabolite concentration (numerical). Require different visualization techniques. 65

R V: Supporting Multiple Tasks. Two central tasks: Explore topology of pathway; Explore the attributes of the nodes (experimental data). Need to support both! (Figure: graph with nodes A to F) 66

VISUALIZATION TECHNIQUES 67

Visualization Approaches: On-Node Mapping, Separate Linked Views, Small Multiples, Layout Adaption, Linearization, Path-Extraction [Lindroos 2002] [Meyer 2010] [Junker 2006] 68
