Characterization, Modeling, and Simulation of Mouse Microarray Data - PowerPoint PPT Presentation

Characterization, Modeling, and Simulation of Mouse Microarray Data • David S. Lalush • Bioinformatics Research Center, North Carolina State University

Acknowledgments • Assistance from: – Jeff Tucker (NIEHS) – Pierre Bushel (NIEHS) – Bruce Weir (NCSU) • Funded by K01 HG02428, National Human Genome Research Institute

Outline • Microarray Simulation Project • Characterization of Microarray Images • Results of Characterization • Simulations • Conclusion

Microarray in Diagnosis • Type I tumors → Microarray → Gene expression pattern • Type II tumors → Microarray → Gene expression pattern

Microarray in Diagnosis • Unknown tumor → Microarray → Gene expression pattern • Type I or type II? • Probability of misclassification?

Research Focus • Evaluating classification methods • Studying variability in microarray data Problems: • Many replications are required to evaluate error rates. • Microarray experiments are expensive. • True patterns are unknown in real data.

Microarray Simulation • Creating a realistic simulation of microarray data • Accounting for various sources of variability in the system Advantages: • Generates many replications cheaply. • True patterns are known. • Can control sources of variability.

Microarray System • Slide Printing • Sample Preparation • Hybridization • Scanning • Image Processing • Data Analysis

Simulation Model • Components: Sample, Slide, Pin, Array Printing and Hybridization, Scanning

Simulation Model: Sample • Gene expression variation modeled as multivariate normal • Global expression variations modeled as normal

Simulation Model: Slide • Background level modeled as normal (dye-dependent) • Defects modeled as 2D causal Markov random field

Simulation Model: Pin • Spot size, shape, and orientation modeled as normal • Spot defects modeled with 2D causal Markov random field

Simulation Model: Array Printing and Hybridization • Instantiates spots based on properties from sample, slide, and pin

Simulation Model: Scanning • Creates discretized image based on spots, SNR, gain, resolution, and blur parameters
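
The scanning step can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the presenter's implementation: the function name `render_spot_image`, the box blur, the additive Gaussian noise with standard deviation `intensity / snr`, and the clipping to a 16-bit range are all choices made here for simplicity. The image size stands in for the resolution parameter.

```python
import random


def render_spot_image(size, center, radius, intensity, background,
                      blur, snr, gain, rng):
    """Render one circular spot as a noisy, quantized square image."""
    # Ideal (noise-free) image: a flat disc on a flat background.
    ideal = [[intensity
              if (r - center[0]) ** 2 + (c - center[1]) ** 2 <= radius ** 2
              else background
              for c in range(size)] for r in range(size)]

    def box_blur(r, c):
        # Average over a (2*blur+1)-wide window, truncated at the image edge.
        window = [ideal[rr][cc]
                  for rr in range(max(0, r - blur), min(size, r + blur + 1))
                  for cc in range(max(0, c - blur), min(size, c + blur + 1))]
        return sum(window) / len(window)

    noise_sd = intensity / snr  # assumed definition of SNR
    image = []
    for r in range(size):
        row = []
        for c in range(size):
            value = gain * (box_blur(r, c) + rng.gauss(0.0, noise_sd))
            row.append(min(65535, max(0, int(round(value)))))  # assumed 16-bit range
        image.append(row)
    return image


rng = random.Random(0)  # fixed seed so the example is repeatable
image = render_spot_image(size=15, center=(7, 7), radius=4, intensity=1000.0,
                          background=100.0, blur=1, snr=50.0, gain=2.0, rng=rng)
```

A fuller simulation would draw the spot size, shape, and defects from the pin and slide models before this step; here the spot is a plain disc.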

Characterization • Characterization of existing microarray images – Spot properties (size, shape, uniformity) – Pin properties (spot uniformity) – Slide properties (background, signal-to-noise) – Gene properties (mean, variance, covariance)

Characterization • Characterization of mouse kidney dataset – Six mice – Four slides each (2x2 fluor flip) – 24 slides in all – 5520 spots in 16 blocks, 4x4 block pattern

Characterization of Spots • Step 1: Spot Detection

Characterization of Spots • Step 2: Spot Morphology Measures – Cast rays from centroid – Radius – Area – Eccentricity
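
The morphology measures might be computed along these lines. This is a hedged sketch: the slide does not define eccentricity or the number of rays, so the code assumes 16 rays, takes the area as the pixel count of a binary spot mask, and derives eccentricity from second central moments. The name `spot_morphology` and its parameters are hypothetical.

```python
import math


def spot_morphology(mask, n_rays=16, step=0.25):
    """Return (mean ray radius, area in pixels, eccentricity) of a binary mask."""
    pixels = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    area = len(pixels)
    cr = sum(r for r, _ in pixels) / area  # centroid row
    cc = sum(c for _, c in pixels) / area  # centroid column

    def inside(r, c):
        # Nearest-pixel lookup; anything off the image counts as outside.
        ri, ci = math.floor(r + 0.5), math.floor(c + 0.5)
        return 0 <= ri < len(mask) and 0 <= ci < len(mask[0]) and bool(mask[ri][ci])

    # Cast rays from the centroid and walk outward until the spot ends.
    radii = []
    for k in range(n_rays):
        angle = 2.0 * math.pi * k / n_rays
        t = 0.0
        while inside(cr + (t + step) * math.sin(angle),
                     cc + (t + step) * math.cos(angle)):
            t += step
        radii.append(t)
    mean_radius = sum(radii) / n_rays

    # Eccentricity from second central moments (assumed definition).
    mrr = sum((r - cr) ** 2 for r, _ in pixels) / area
    mcc = sum((c - cc) ** 2 for _, c in pixels) / area
    mrc = sum((r - cr) * (c - cc) for r, c in pixels) / area
    half_gap = math.sqrt(((mrr - mcc) / 2.0) ** 2 + mrc ** 2)
    major = (mrr + mcc) / 2.0 + half_gap
    minor = (mrr + mcc) / 2.0 - half_gap
    eccentricity = math.sqrt(max(0.0, 1.0 - minor / major)) if major > 0 else 0.0
    return mean_radius, area, eccentricity


# A disc of radius 5 pixels centred in a 21 x 21 image.
disc = [[1 if (r - 10) ** 2 + (c - 10) ** 2 <= 25 else 0 for c in range(21)]
        for r in range(21)]
mean_radius, area, eccentricity = spot_morphology(disc)
```

For a symmetric disc the eccentricity comes out at zero, while an elongated spot gives a value approaching one.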

Characterization of Spots • Step 3: Spot Intensity Measures – Mean and standard deviation of spot pixels – Mean and standard deviation of background pixels

Characterization of Spots • Step 4: Secondary Intensity Measures – Separability: $\dfrac{\overline{\text{signal}} - \overline{\text{background}}}{\sqrt{\sigma_{\text{signal}}^{2} + \sigma_{\text{background}}^{2}}}$

Characterization of Spots • Step 4: Secondary Intensity Measures – Spot Uniformity: $\dfrac{\sigma_{\text{signal}}}{\overline{\text{signal}}}$
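
Both secondary measures follow directly from the Step 3 statistics. The sketch below assumes that separability is the difference of the signal and background means divided by the square root of the summed variances, and that uniformity is the signal standard deviation divided by the signal mean. The use of the sample standard deviation and the name `spot_intensity_measures` are also assumptions.

```python
import math
import statistics


def spot_intensity_measures(signal_pixels, background_pixels):
    """Return (separability, uniformity) for one spot in one channel."""
    mean_s = statistics.mean(signal_pixels)
    mean_b = statistics.mean(background_pixels)
    sd_s = statistics.stdev(signal_pixels)      # sample standard deviation
    sd_b = statistics.stdev(background_pixels)
    # Difference of means scaled by the combined spread (assumed form).
    separability = (mean_s - mean_b) / math.sqrt(sd_s ** 2 + sd_b ** 2)
    # Coefficient of variation of the spot pixels; lower means more uniform.
    uniformity = sd_s / mean_s
    return separability, uniformity


separability, uniformity = spot_intensity_measures(
    [10.0, 12.0, 14.0], [1.0, 2.0, 3.0])
```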

Characterization of Spot Defects • Spots often exhibit characteristic nonuniformities – Low center – Spot breaks

Characterization of Spot Defects Consider each spot to have two regions: • Normal region • Defect region

Characterization of Spot Defects Each region acts as a hidden state. Each state has its own distribution of emitted intensities. • State 0: N (normal) • State 1: D (defect)

Characterization of Spot Defects The probability of a pixel being in a given state depends on its neighbors. • Example: pixel X with neighbors N, N, D, D has state probability P(X | N, N, D, D)

Characterization of Spot Defects Region Model (2D causal MRF): • 16 parameters for state transition • 2 parameters for intensity of D region pixels relative to N region (mean, s.d.)

Characterization of Spot Defects Applying the Region Model • Pixel is in D region if: – It is in the spot – It is below the spot average intensity in BOTH channels

Characterization of Spot Defects Applying the Region Model • Smooth region boundary • Compute the 18 parameters for each spot
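
The labeling rule and the 16 transition parameters could be estimated roughly as below. This is a sketch under assumptions: the four causal neighbors are taken to be the left, upper-left, upper, and upper-right pixels (the slides do not say which four are used), pixels outside the spot or image are treated as state N, and the boundary smoothing and the two intensity parameters are left out. The function names are hypothetical.

```python
from collections import defaultdict

# Causal neighbours as (row offset, column offset):
# left, upper-left, upper, upper-right (an assumed neighbourhood).
CAUSAL_OFFSETS = [(0, -1), (-1, -1), (-1, 0), (-1, 1)]


def classify_defects(channel_a, channel_b, spot_mask):
    """Label a pixel 1 (D) if it is in the spot and below the spot mean in both channels."""
    spot = [(r, c) for r, row in enumerate(spot_mask) for c, v in enumerate(row) if v]
    mean_a = sum(channel_a[r][c] for r, c in spot) / len(spot)
    mean_b = sum(channel_b[r][c] for r, c in spot) / len(spot)
    states = [[0] * len(row) for row in spot_mask]
    for r, c in spot:
        if channel_a[r][c] < mean_a and channel_b[r][c] < mean_b:
            states[r][c] = 1
    return states


def transition_probabilities(states, spot_mask):
    """Estimate P(X = D | four causal neighbours) by counting over spot pixels."""
    rows, cols = len(states), len(states[0])
    counts = defaultdict(lambda: [0, 0])  # pattern -> [count of N, count of D]
    for r in range(rows):
        for c in range(cols):
            if not spot_mask[r][c]:
                continue
            pattern = tuple(
                states[r + dr][c + dc]
                if 0 <= r + dr < rows and 0 <= c + dc < cols else 0
                for dr, dc in CAUSAL_OFFSETS)
            counts[pattern][states[r][c]] += 1
    # Only neighbour patterns that actually occur get an estimate.
    return {p: n_d / (n_n + n_d) for p, (n_n, n_d) in counts.items()}


# Tiny 3 x 3 spot with one dim pixel in the centre, identical in both channels.
channel = [[10, 10, 10], [10, 1, 10], [10, 10, 10]]
mask = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
states = classify_defects(channel, channel, mask)
probs = transition_probabilities(states, mask)
```

With four binary neighbors there are 2^4 = 16 possible patterns, which matches the 16 state-transition parameters of the region model.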

Characterization of Background • Base level and variation – Modeled as stationary across slide • Background defects – Marks, scratches, bright spots, other features – Modeled with 2D Markov random field

Characterization of Background • Classify all background pixels as normal or defect – Defect is 2σ above background mean • Compute statistics on normal background • Apply 2D MRF to model defect state – Similar to region model – Intensities are modeled as beta distribution • Measures taken only by slide [Figure: Probability vs. Relative Defect Intensity]
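
The background classification step can be illustrated as follows. The sketch assumes the mean and standard deviation are estimated once from all background pixels before thresholding; the slides do not say whether the estimate is iterated. The MRF fit and the beta-distributed defect intensities are not shown, and the function name `classify_background` is hypothetical.

```python
import statistics


def classify_background(background_pixels, n_sigma=2.0):
    """Split background pixels into (normal, defect) lists using a mean + n_sigma * sd cut."""
    mean_all = statistics.mean(background_pixels)
    sd_all = statistics.pstdev(background_pixels)
    threshold = mean_all + n_sigma * sd_all
    normal = [v for v in background_pixels if v <= threshold]
    defect = [v for v in background_pixels if v > threshold]
    return normal, defect


# Twenty ordinary background pixels and one bright mark.
pixels = [10.0] * 20 + [100.0]
normal, defect = classify_background(pixels)

# Statistics of the normal background only, as on the slide.
normal_mean = statistics.mean(normal)
normal_sd = statistics.pstdev(normal)
```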

Characterization of Gene Expression • Multivariate normal distribution for each sample (test or reference) – Mean vector – Covariance matrix • Linear model to account for global effects from slide to slide and dye effects Sample = (mean gene expression) + slope * (slide perturbation) + (variable expression)
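
A draw from this linear model might look like the sketch below. It is an illustration only: the variable expression term is sampled with a Cholesky factor of the covariance matrix, a technique swapped in here for brevity, and the slide perturbation is assumed to be a single standard normal value shared by all genes on a slide. The two-gene mean, slope, and covariance values are made up for the example.

```python
import math
import random


def cholesky(matrix):
    """Lower-triangular factor L with L * L^T == matrix (symmetric positive definite)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def simulate_sample(mean, slope, lower, rng):
    """Draw one expression vector: mean + slope * perturbation + correlated noise."""
    perturbation = rng.gauss(0.0, 1.0)        # one global value per slide (assumed)
    z = [rng.gauss(0.0, 1.0) for _ in mean]   # independent standard normals
    variable = [sum(lower[i][k] * z[k] for k in range(i + 1))
                for i in range(len(mean))]
    return [m + s * perturbation + v for m, s, v in zip(mean, slope, variable)]


# Hypothetical two-gene example.
mean = [8.0, 10.0]
slope = [0.5, 0.5]
covariance = [[4.0, 2.0], [2.0, 3.0]]
lower = cholesky(covariance)
rng = random.Random(1)  # fixed seed so the example is repeatable
samples = [simulate_sample(mean, slope, lower, rng) for _ in range(2000)]
```

Averaged over many draws, each gene's simulated expression should settle near its entry in `mean`, since both random terms have zero mean.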

Characterization of Gene Expression • Problem: Covariance matrix is BIG (5200x5200) – In simulation, we will have to diagonalize it. • Model the most significant correlations – Compute correlations between each pair of genes on each slide – Cluster genes by correlation distance – Each gene in a cluster has greater than .48 absolute correlation with every other gene in the cluster
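
One simple way to obtain clusters with this property is a greedy pass that admits a gene to a cluster only if its absolute correlation with every current member exceeds the threshold. The sketch below is an assumption about how such a rule could be applied, not the clustering algorithm used in the talk. Each profile is taken to be one gene's expression across slides, and the names `pearson` and `cluster_genes` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cluster_genes(profiles, threshold=0.48):
    """Greedy clustering: every pair inside a cluster has |correlation| > threshold."""
    clusters = []
    for gene, profile in enumerate(profiles):
        for cluster in clusters:
            if all(abs(pearson(profile, profiles[member])) > threshold
                   for member in cluster):
                cluster.append(gene)
                break
        else:
            clusters.append([gene])  # no compatible cluster found
    return clusters


# Hypothetical profiles: genes 0 to 2 are strongly (anti)correlated, gene 3 is not.
profiles = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [4.0, 3.0, 2.0, 1.0],
    [1.0, -1.0, -1.0, 1.0],
]
clusters = cluster_genes(profiles)
```

Restricting the covariance model to within-cluster correlations keeps each block small, which is what makes the simulation tractable for thousands of genes.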

Analyzing Characterization Data • Two-way ANOVA – By slide (fixed) – By pin (random) • Which properties varied more? – By slide – By pin – By spot
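
The question of which factor contributes more variation can be illustrated with a sum-of-squares decomposition for a balanced slide-by-pin layout. This is a simplified stand-in: it reports each term's share of the total sum of squares and does not compute the formal variance components of a mixed model with pin as a random effect. The function name and the toy data are hypothetical.

```python
def _mean(values):
    return sum(values) / len(values)


def two_way_variance_fractions(data):
    """data[slide][pin] is a list of per-spot measurements (balanced design)."""
    n_slides, n_pins, n_reps = len(data), len(data[0]), len(data[0][0])
    cell = [[_mean(data[s][p]) for p in range(n_pins)] for s in range(n_slides)]
    slide_mean = [_mean(cell[s]) for s in range(n_slides)]
    pin_mean = [_mean([cell[s][p] for s in range(n_slides)]) for p in range(n_pins)]
    grand = _mean(slide_mean)

    ss = {
        "slide": n_pins * n_reps * sum((m - grand) ** 2 for m in slide_mean),
        "pin": n_slides * n_reps * sum((m - grand) ** 2 for m in pin_mean),
        "interaction": n_reps * sum(
            (cell[s][p] - slide_mean[s] - pin_mean[p] + grand) ** 2
            for s in range(n_slides) for p in range(n_pins)),
        # Residual variation among spots within the same slide and pin.
        "spot": sum((x - cell[s][p]) ** 2
                    for s in range(n_slides) for p in range(n_pins)
                    for x in data[s][p]),
    }
    total = sum(ss.values())
    return {name: value / total for name, value in ss.items()}


# Toy data: 2 slides x 2 pins x 2 spots, with a large slide effect.
data = [[[0.0, 2.0], [1.0, 3.0]],
        [[10.0, 12.0], [11.0, 13.0]]]
fractions = two_way_variance_fractions(data)
```

A small interaction share in such a breakdown is the kind of evidence that supports modeling slide and pin effects separately.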

Analyzing Characterization Data • Spot morphology measures • Spot secondary intensity measures • Spot defect model parameters • Background defect model parameters (by slide measurement only - no ANOVA) • Only spots with separability > 1 were used in the ANOVA

Results Sometimes the images have their own story to tell.

Results: Spot Morphology • Most variation (75% for size measures) was attributed to variation by spot • Pins behaved similarly (mostly) • Slides showed some differences in last eight slides (mice five and six)

Results: Spot Morphology [Figure: Spot size vs. Pin Number; y-axis: Radius (pixels); x-axis: Pin Number]

Results: Spot Morphology [Figure: Spot size vs. Slide Number; y-axis: Radius (pixels); x-axis: Slide Number, with slides labeled by mouse (Mouse 1 to Mouse 6)]

Results: Spot Intensities • Most variation in separability (83-90%) was attributed to variation by spot • Spot uniformity varied considerably by slide, mostly due to last eight slides

Results: Spot Intensities [Figure: Spot uniformity (532 nm) vs. Slide Number; y-axis: Uniformity (532 nm); x-axis: Slide Number, with slides labeled by mouse (Mouse 1 to Mouse 6)]

Results: Spot Defect MRF • The 16 region transition probability parameters varied by pin – Model the MRF as a property of a pin, not a slide • The mean intensity of defect region was strongly dependent on the pin. • Mean intensity of defect region varied considerably by slide.

Results: Spot Defect MRF [Figure: Defect region intensity vs. Slide Number; y-axis: Low region mean intensity relative to normal region mean; x-axis: Slide Number, with slides labeled by mouse (Mouse 1 to Mouse 6)]

Results: Background MRF • Last eight slides had more intense background defects • Last eight also had higher probabilities of generating a defect

Results: Background MRF [Figure: Background defect intensity vs. Slide Number; y-axis: Intensity of Background Defects Relative to Background Mean; x-axis: Slide Number, with slides labeled by mouse (Mouse 1 to Mouse 6)]

Results: General • Slide-pin interactions were small (<5% of variance in all cases) • Therefore, modeling of slide and pin effects separately is justified.

Results: Summary • Characterization shows differences in the properties of slides for mice five and six: – Spots were more likely to be broken. – Spot breaks were more severe. – Background defects were more numerous. – Background defects were more intense. Did this impact the estimated mouse-to-mouse variation?
