Diagnostics Gad Kimmel

Outline ● Introduction. ● Bootstrap method. ● Cross validation. ● ROC plot.

Introduction

Motivation ● Estimating properties of an estimator (an estimator is a function of the input points). − Given data samples $x_1, x_2, \ldots, x_N$, evaluate some estimator, say the average: $\frac{1}{N}\sum_i x_i$. − How can we estimate its properties (e.g., its variance)? $\mathrm{var}\left(\frac{1}{N}\sum_i x_i\right) = \frac{1}{N^2}\,\mathrm{var}\left(\sum_i x_i\right)$ ● Model selection. − How many parameters should we use?

Bootstrap Method

Evaluating Accuracy ● A simple approach for accuracy estimation is to provide the bias or variance of the estimator. ● Example: suppose the samples are independent and identically distributed (i.i.d.), with mean $\mu$ and finite variance $\sigma^2$. − We know, by the central limit theorem, that $\sqrt{n}\,(\bar{x}_n - \mu)/\sigma \to Z \sim N(0, 1)$. − Roughly speaking, $\bar{x}_n$ is normally distributed with expectation $\mu$ and variance $\sigma^2 / n$.

Assumptions Do Not Hold ● What if the r.v. are not i.i.d.? ● What if we want to evaluate another estimator (and not $\bar{x}_n$)? ● It would be nice to have many different samples, each consisting of n data points. ● In that case, one could calculate the estimator for each of these samples, and infer its distribution. ● But... we don't have them.

Solution - Bootstrap ● Estimating the sampling distribution of an estimator by resampling with replacement from the original sample. ● Efron, The Annals of Statistics, 1979.

Bootstrap - Illustration ● Goal: sampling $x_1, x_2, x_3, x_4, \ldots, x_n$ from P, in order to estimate the variance of an estimator.

Bootstrap - Illustration ● Ideally, m samples are drawn from P, and the estimator is evaluated on each:
  Samples                                                      Estimator
  $x_{1,1}, x_{1,2}, x_{1,3}, \ldots, x_{1,n}$                 $e_1$
  $x_{2,1}, x_{2,2}, x_{2,3}, \ldots, x_{2,n}$                 $e_2$
  $x_{3,1}, x_{3,2}, x_{3,3}, \ldots, x_{3,n}$                 $e_3$
  $x_{4,1}, x_{4,2}, x_{4,3}, \ldots, x_{4,n}$                 $e_4$
  ...
  $x_{m,1}, x_{m,2}, x_{m,3}, \ldots, x_{m,n}$                 $e_m$
● What is the variance of $e$? ● Estimate the variance by $\mathrm{var}(e) = \frac{1}{m}\sum_{i=1}^{m}(e_i - \bar{e})^2$, where $\bar{e} = \frac{1}{m}\sum_{i=1}^{m} e_i$.

Bootstrap - Illustration ● We only have 1 sample: $x_1, x_2, x_3, x_4, \ldots, x_n \sim P$.

Bootstrap - Illustration ● Sampling is done from the empirical distribution of the one sample $x_1, x_2, x_3, x_4, \ldots, x_n$ drawn from P:
  Samples                                                      Estimator
  $z_{1,1}, z_{1,2}, z_{1,3}, \ldots, z_{1,n}$                 $e_1$
  $z_{2,1}, z_{2,2}, z_{2,3}, \ldots, z_{2,n}$                 $e_2$
  $z_{3,1}, z_{3,2}, z_{3,3}, \ldots, z_{3,n}$                 $e_3$
  $z_{4,1}, z_{4,2}, z_{4,3}, \ldots, z_{4,n}$                 $e_4$
  ...
  $z_{m,1}, z_{m,2}, z_{m,3}, \ldots, z_{m,n}$                 $e_m$

Formalization ● The data is $x_1, x_2, \ldots, x_n \sim P$. Note that the distribution function P is unknown. ● We sample m samples $Y_1, Y_2, \ldots, Y_m$. Each $Y_i = (z_{i,1}, z_{i,2}, \ldots, z_{i,n})$ contains n samples drawn from the empirical distribution of the data: $\Pr[z_{j,k} = x_i] = \frac{\# x_i}{n}$, where $\# x_i$ is the number of times $x_i$ appears in the original data.

The Main Idea ● $Y_i \sim \hat{P}$. ● We wish that $P = \hat{P}$. Is it (always) true? NO. ● Rather, $\hat{P}$ is an approximation of $P$.
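The resampling scheme formalized above can be sketched in a few lines of Python. This is a minimal illustration, not code from the presentation: the function name `bootstrap_variance`, the number of replicates, and the synthetic Gaussian data are all assumptions. For the sample mean, the result can be compared with the plug-in value of $\sigma^2 / n$ from the central limit theorem.

```python
import random
import statistics

def bootstrap_variance(data, estimator, m=2000, seed=0):
    """Estimate var(estimator) by resampling `data` with replacement m times."""
    rng = random.Random(seed)
    n = len(data)
    # Each replicate Y_i holds n draws from the empirical distribution of the data.
    estimates = [estimator(rng.choices(data, k=n)) for _ in range(m)]
    e_bar = sum(estimates) / m
    # (1/m) * sum_i (e_i - e_bar)^2
    return sum((e - e_bar) ** 2 for e in estimates) / m

rng = random.Random(1)
data = [rng.gauss(10.0, 2.0) for _ in range(200)]  # synthetic sample, for illustration only

boot_var = bootstrap_variance(data, statistics.fmean)
clt_var = statistics.pvariance(data) / len(data)   # sigma^2 / n with sigma estimated from the data
print(boot_var, clt_var)
```

Passing a different `estimator` (for example `statistics.median`) gives a variance estimate in cases where no simple closed form is available.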

Example 1 ● The yield of the Dow Jones Index over the past two years is ~12%. ● You are considering a broker who had a yield of 25%, by picking specific stocks from the Dow Jones. ● Let x be a r.v. that represents the yield of randomly selected stocks. ● Do we know the distribution of x?

Example 1 ● Prepare a sample $x_1, x_2, \ldots, x_{10{,}000}$, where each $x_i$ is the yield of randomly selected stocks. ● Approximate the distribution of x using this sample.

Evaluation of Estimators ● Using the approximate distribution, we can evaluate estimators. E.g.: − Variance of the mean. − Confidence intervals.
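As an illustration of the second item, a confidence interval can be read off the sorted bootstrap estimates. The sketch below uses the percentile method, which is one common choice and is not necessarily the one intended in the slides; the function name `bootstrap_percentile_ci` and the synthetic exponential data are assumptions.

```python
import random
import statistics

def bootstrap_percentile_ci(data, estimator, m=2000, level=0.95, seed=0):
    """Percentile confidence interval from m bootstrap replicates."""
    rng = random.Random(seed)
    n = len(data)
    estimates = sorted(estimator(rng.choices(data, k=n)) for _ in range(m))
    tail = (1.0 - level) / 2.0
    lower = estimates[int(tail * m)]
    upper = estimates[min(m - 1, int((1.0 - tail) * m))]
    return lower, upper

rng = random.Random(2)
data = [rng.expovariate(1.0) for _ in range(300)]  # synthetic, skewed sample

lo, hi = bootstrap_percentile_ci(data, statistics.median)
print(lo, hi)
```

The median is used here because, unlike the mean, its sampling distribution has no simple textbook formula, which is the situation the bootstrap is meant for.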

Example 1 ● What is the probability of obtaining a yield larger than 25% (p-value)? 30%
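The calculation in Example 1 amounts to counting how often a randomly picked portfolio beats the broker's 25%. The sketch below shows only the mechanics: the per-stock yields are synthetic numbers, and the portfolio size, the equal weighting, and the name `random_portfolio_yield` are assumptions, so the printed value is not expected to match the 30% quoted in the slides.

```python
import random

rng = random.Random(0)

# Hypothetical two-year yields for 30 stocks (synthetic, not market data).
stock_yields = [rng.gauss(0.12, 0.40) for _ in range(30)]

def random_portfolio_yield(k=5):
    """Yield of an equally weighted portfolio of k stocks picked at random."""
    return sum(rng.sample(stock_yields, k)) / k

# The sample x_1, ..., x_10000 approximates the distribution of x.
sample = [random_portfolio_yield() for _ in range(10_000)]

# Fraction of random portfolios that beat the broker's 25% yield.
p_value = sum(y > 0.25 for y in sample) / len(sample)
print(p_value)
```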

Example 2 - Decision tree ● Decision tree - short introduction.

Example 2 ● Building a decision tree.

Example 2 ● Many other trees can be built, using different algorithms. ● For a specific tree one can calculate the prediction accuracy: (# of elements classified correctly) / (total # of elements). ● To calculate error bars for this value, we need to sample more, apply the algorithm many times, and evaluate the prediction each time.

Example 2 - Applying Bootstrap ● But we have only one data set! Use bootstrap to prepare many samples. ● Build a decision tree for each sample: $T_1, T_2, \ldots, T_n$. ● Calculate the prediction for each tree: $p_1, p_2, \ldots, p_n$. ● Evaluate error bars based on the predictions: $\pm 1.96\,\mathrm{STD}(p_1, p_2, \ldots, p_n)$.
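The procedure can be sketched as follows. To keep the example short and dependency-free, a one-level decision tree (a threshold "stump") stands in for a full decision tree, the data are synthetic, and each tree is scored on the points left out of its bootstrap sample. The slides do not say which points the prediction is evaluated on, so the out-of-sample scoring is an assumption, as are the names `fit_stump` and `accuracy`.

```python
import random
import statistics

def fit_stump(points):
    """Fit a one-level decision tree: predict class 1 when x > threshold."""
    best_t, best_acc = None, -1.0
    for t in sorted({x for x, _ in points}):
        acc = sum((x > t) == bool(y) for x, y in points) / len(points)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

def accuracy(threshold, points):
    """(# of elements classified correctly) / (total # of elements)."""
    return sum((x > threshold) == bool(y) for x, y in points) / len(points)

rng = random.Random(0)
# Synthetic 1-D data set: class 1 tends to have larger x.
labels = [rng.randint(0, 1) for _ in range(100)]
data = [(rng.gauss(1.5 * y, 1.0), y) for y in labels]

accs = []
for _ in range(200):
    idx = [rng.randrange(len(data)) for _ in data]   # resample with replacement
    chosen = set(idx)
    train = [data[i] for i in idx]
    held_out = [data[i] for i in range(len(data)) if i not in chosen]
    accs.append(accuracy(fit_stump(train), held_out))

mean_acc = statistics.fmean(accs)
half_width = 1.96 * statistics.stdev(accs)           # +/- 1.96 STD(p_1, ..., p_n)
print(f"accuracy = {mean_acc:.3f} +/- {half_width:.3f}")
```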

Cross Validation

Objective ● Model selection.

Formalization ● Let (x, y) be drawn from distribution P, where $x \in \Re^n$ and $y \in \Re$. ● Let $f_\alpha : \Re^n \to \Re$ be a learning algorithm, with parameter(s) $\alpha$.

Example ● Regression model.

What Do We Want? ● We want the method that is going to predict future data most accurately, assuming they are drawn from the distribution P. ● Niels Bohr: "It is very difficult to make an accurate prediction, especially about the future."

Choosing the Best Model ● For a sample (x, y) which is drawn from the distribution function P, the prediction error can be measured by: $(f_\alpha(x) - y)^2$ or $|f_\alpha(x) - y|$ ● Since (x, y) is a r.v. we are usually interested in: $E[(f_\alpha(x) - y)^2]$

Choosing the Best Model (cont.) ● Choose the parameter(s) $\alpha$: $\arg\min_\alpha E[(f_\alpha(x) - y)^2]$ ● The problem is that we don't know how to sample from P.

Regression − Order of 1 (Linear) [plot: linear fit to the data points]

Regression − Order of 2 [plot: quadratic fit to the data points]

Regression − Join the Dots [plot: curve passing through every data point]

Solution - Cross Validation ● Partition the data into 2 sets: − Training set T. − Test set S. ● Calculate $f_\alpha$ using only the training set T. ● Given $f_\alpha$, calculate $\frac{1}{|S|}\sum_{(x_i, y_i) \in S} (f_\alpha(x_i) - y_i)^2$

Back to the Example ● In our case, we should try different orders for the regression (or different # of params). ● Each time apply the regression only on the training set, and calculate estimation error on the test set. ● The # of parameters will be the one minimizing the error.
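A minimal sketch of this selection loop is given below, using synthetic data generated from a quadratic with noise. The helper names (`polyfit`, `predict`, `mse`), the 40/20 split, and the range of orders tried are assumptions; the polynomial fit is a plain least-squares solve via the normal equations, written out so that the example needs only the standard library.

```python
import random

def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients (lowest power first) via normal equations."""
    k = degree + 1
    # Augmented matrix [A^T A | A^T y] for the monomial basis 1, x, x^2, ...
    m = [[sum(x ** (i + j) for x in xs) for j in range(k)]
         + [sum(y * x ** i for x, y in zip(xs, ys))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(k):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return [m[i][k] / m[i][i] for i in range(k)]

def predict(coeffs, x):
    return sum(c * x ** i for i, c in enumerate(coeffs))

def mse(coeffs, xs, ys):
    """Mean squared error of the fitted polynomial on the given points."""
    return sum((predict(coeffs, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

rng = random.Random(0)
xs = [rng.uniform(-1.0, 1.0) for _ in range(60)]
ys = [1.0 + 2.0 * x - 3.0 * x * x + rng.gauss(0.0, 0.3) for x in xs]  # synthetic data

train_x, train_y = xs[:40], ys[:40]   # training set T
test_x, test_y = xs[40:], ys[40:]     # test set S

# Fit each order on T only, then measure the error on S.
errors = {d: mse(polyfit(train_x, train_y, d), test_x, test_y) for d in range(7)}
best_degree = min(errors, key=errors.get)
print(best_degree, errors[best_degree])
```

Because the test set is small, the order with the lowest test error can vary from one split to another, which is part of the motivation for the k-fold variant.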

Variants of Cross Validation ● Test set. ● Leave one out. ● k-fold cross validation.

K-fold Cross Validation [diagram: the data split into five folds, Train | Train | Test | Train | Train]

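K-fold Cross Validation ● We want to find a parameter $\alpha$ that minimizes the cross validation estimate of prediction error: $CV(\alpha) = \frac{1}{N}\sum_{i=1}^{N} L\left(y_i, f^{-k(i)}(x_i, \alpha)\right)$, where $k(i)$ is the fold containing sample i and $f^{-k(i)}$ is the model fitted with that fold removed.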

K-fold Cross Validation ● How to choose K? ● K=N ( = leave one out) - CV is approximately unbiased for the true prediction error, but can have high variance. ● When K is smaller (e.g., K=5) - CV has lower variance, but bias could be a problem (depending on how the performance of the learning method varies with the size of the training set).
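The K-fold estimate can be sketched as below. To keep the block short, a nearest-neighbour average stands in for the learning algorithm, with the tuning parameter $\alpha$ taken to be the number of neighbours; that choice, the squared loss, the synthetic data, and the names `knn_predict` and `cv_error` are all assumptions made for illustration.

```python
import math
import random

def knn_predict(train, x, alpha):
    """Average y over the alpha training points nearest to x (stand-in learner)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:alpha]
    return sum(y for _, y in nearest) / len(nearest)

def cv_error(data, alpha, n_folds, seed=0):
    """CV(alpha): mean squared loss, each point predicted without its own fold."""
    idx = list(range(len(data)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]
    total = 0.0
    for fold in folds:
        held = set(fold)
        train = [data[i] for i in idx if i not in held]   # all folds except this one
        for i in fold:
            x, y = data[i]
            total += (knn_predict(train, x, alpha) - y) ** 2
    return total / len(data)

rng = random.Random(1)
xs = [rng.uniform(0.0, 6.0) for _ in range(100)]
data = [(x, math.sin(x) + rng.gauss(0.0, 0.2)) for x in xs]  # synthetic data

alphas = [1, 3, 5, 10, 20, 50]
scores = {a: cv_error(data, a, n_folds=5) for a in alphas}
best_alpha = min(scores, key=scores.get)
print(best_alpha, scores[best_alpha])
```

Calling `cv_error(data, alpha, n_folds=len(data))` gives the leave-one-out variant.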

ROC Plot (Receiver Operating Characteristic)
