1. A Reader Study on a 14-head Microscope
   Brandon D. Gallas, Qi Gong (FDA/CDRH/OSEL/DIDSR, Silver Spring, MD, US)
   Jamal Benhamida, Matthew G. Hanna, S. Joseph Sirintrapun, Kazuhiro Tabata, Yukako Yagi (Memorial Sloan Kettering Cancer Center (MSKCC), Pathology Informatics, New York, NY, US)
   Partha P. Mitra (Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, US)

2. A Reader Study on a 14-head Microscope: Purpose
   • Purpose of this work: demonstration, proof of concept, technology demonstration, method development
   • Technology evaluation, not clinical performance
   • Task: detection and classification of mitotic figures (MFs)
     – A clinically relevant task, part of every pathologist's training
     – Task-based evaluation of image quality
     – A challenging task (substantial reader variability)
   • Images: glass slides and WSI (convenient samples)
   • Readers: pathologists
   • Performance: within- and between-reader agreement (no ground truth)
     – Count differences (calibration)
     – Pairwise concordance (correlation)
   • "MRMC" analyses account for variability from Multiple Readers and Multiple Cases

3. eeDAP: Evaluation Environment for Digital and Analog Pathology
   • The microscope is still the gold standard
   • Remove search from technology evaluation
     – Eliminate location variability for faster and more precise results
   • Registration allows pathologists to evaluate the same fields of view
   • Clinical practice: pathologists choose the fields of view to evaluate
   • Technology evaluation: all pathologists evaluate the same fields of view
   [Figure: camera image of the glass slide, WSI of the same slide, and the registered patch as seen by Pathologists 1-4]

4. NIH Mitotic Counting Study
   • NIH study data (Mark Simpson)
   • Counts come from different tissue: clinical practice vs. technology evaluation
   • FOV locations saved for each pathologist in digital mode
   • Preliminary agreement results given during the WSIWG meeting
   [Figure: pHH3 at 20x, H&E at 20x, H&E at 40x]

5. eeDAP on the road last year
   • Monitor, computer, motorized stage with joystick, microscope with mounted camera, reticle in the eyepiece

6. Mitotic Counting and Classification: Study Design
   • Install, demo, and train at MSKCC (eeDAP on loan to MSKCC)
   • 4 slides from Mark Simpson at NCI
     – H&E: canine oral melanoma
   • 10 ROIs per slide from tumor
     – ROI = 800 x 800 pixels @ 0.25 µm/pixel = 200 µm x 200 µm, about 17% of the entire FOV (0.24 mm²); the arithmetic is worked out below
   • 4 pathologists from MSKCC
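
As a quick check on the slide's geometry, the ROI arithmetic works out as follows (the 0.24 mm² FOV area is taken from the slide):

$$
800\ \text{px} \times 0.25\ \mu\text{m/px} = 200\ \mu\text{m},
\qquad
\frac{A_{\mathrm{ROI}}}{A_{\mathrm{FOV}}} = \frac{(0.2\ \text{mm})^2}{0.24\ \text{mm}^2} = \frac{0.04}{0.24} \approx 17\%.
$$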

7. Quick look at the first study
   • Circles: mitotic figures identified by pathologists on the WSI image
   • "Candidate MFs" = marked cells
   • Each color corresponds to a different pathologist

8. Readers per Candidate MF
   • 92 candidate MFs in total
   • 45/92 = 49% marked by only one reader
   • 21/92 = 23% unanimously marked
   • Build these candidate MFs into the next study: a classification task
   • Also need some low-probability candidates from ROIs with zero or one candidates; this yields 34 more
   [Figure: histogram of readers per candidate (total = 92), peaked at one reader (45 candidates) with a second mode at all four readers (21 candidates)]
   A sketch of this tally appears below.
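
A minimal sketch, in Python, of how a readers-per-candidate tally could be computed. The (reader, candidate) pair format, the function name, and the toy data are hypothetical, chosen only to mirror the kind of summary reported on the slide:

```python
from collections import Counter

def readers_per_candidate(marks):
    """Count how many distinct readers marked each candidate MF.

    `marks` is an iterable of (reader_id, candidate_id) pairs; this
    layout is an assumption, not the study software's actual format.
    """
    readers = {}
    for reader_id, candidate_id in marks:
        readers.setdefault(candidate_id, set()).add(reader_id)
    return {cand: len(rs) for cand, rs in readers.items()}

# Toy example with 4 readers and 3 candidates
marks = [("r1", "c1"), ("r1", "c2"), ("r2", "c2"),
         ("r3", "c2"), ("r4", "c2"), ("r2", "c3")]
rpc = readers_per_candidate(marks)
n = len(rpc)
print("fraction marked by only one reader:", sum(v == 1 for v in rpc.values()) / n)
print("fraction unanimously marked (all 4):", sum(v == 4 for v in rpc.values()) / n)
print("histogram of readers per candidate:", Counter(rpc.values()))
```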

9. Can we use eeDAP on this multi-head microscope?
   • Same microscope frame, but 14 heads!
   • The stage mounts fine
   • The camera mounts fine
   • Let's do it.

10. Mitotic Counting and Classification: Multi-head Microscope
    High-throughput reader study: study design
    • 4 slides from Mark Simpson at NCI
      – H&E: canine oral melanoma
    • 10 ROIs per slide from tumor
      – ROI = 800 x 800 pixels @ 0.25 µm/pixel = 200 µm x 200 µm, about 17% of the entire FOV (0.24 mm²)
    • 126 (= 92 + 34) candidate MFs
    • 10 pathologists*
    • Collect data on paper
      – ~1 hour of training
      – ~2 hours for data collection

11. Mitotic Counting and Classification: Multi-head Microscope
    High-throughput reader study: workflow
    • Mark and count in the ROI
    • Classify candidates in the same ROI

12. Readers per Candidate: Multi-head Study
    • Similar characteristics as before; 158 candidate MFs in total
    • 79/158 = 49% marked by only one reader
    • 21/158 = 13% unanimously marked
      – 13 agree with the previous study; 8 are new
    [Figure: histogram of readers per candidate (total = 158), peaked at one reader (79 candidates) with a second mode at all ten readers (21 candidates)]

13. Counting Results
    • Each point in the scatter plot:
      – One ROI and a pair of readers
      – Appears twice (x and y transposed)
      – Has noise added for visualization
    • How do we summarize this? Agreement, since there is no ground truth:
      – Count differences (calibration)
      – Pairwise concordance (correlation)
    • "MRMC" analyses account for variability from Multiple Readers and Multiple Cases
    [Figure: between-reader scatter plot, micro14 vs. micro14; reader counts (0-8) on the 14-head microscope on both axes]

14. Results: Count Differences
    • Rotate 45° and rescale the x-axis -> Bland-Altman plot (the transform is written out below)
    [Figure: between-reader scatter plot, micro14 vs. micro14; reader counts (0-8) on the 14-head microscope on both axes]
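
In symbols, "rotate 45° and rescale" is the standard Bland-Altman change of coordinates: the rotation sends a pair of counts $(x, y)$ to $\left(\tfrac{x+y}{\sqrt{2}},\ \tfrac{y-x}{\sqrt{2}}\right)$, and rescaling the axes leaves the average on the horizontal axis and the difference on the vertical axis:

$$
(x, y) \;\longmapsto\; \left( \frac{x+y}{2},\ \ y - x \right).
$$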

15. Results: Count Differences
    • Bland-Altman plot of between-reader differences (micro14), N = 3420
    • Limits of agreement (a code sketch follows below)
      – Characterize the spread of the differences
      – σ = 1.07
    • This is not the standard error: the SE characterizes the spread of the mean difference
    • "MRMC" analyses account for variability from Multiple Readers and Multiple Cases
    [Figure: Bland-Altman plot; count differences (-6 to 6) vs. count averages (0-8), σ = 1.07]
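
Below is a minimal sketch of the count-difference summary in Python. The readers-by-ROIs array layout, the function name, and the toy data are assumptions, not the study's actual software; the ±1.96σ limits of agreement and the naive SE are textbook formulas. A full MRMC analysis would go further and account for the correlation between differences that share a reader or an ROI, which the naive SE ignores:

```python
import numpy as np
from itertools import combinations

def count_difference_summary(counts):
    """Limits of agreement for between-reader count differences.

    `counts[r, i]` is reader r's mitotic count in ROI i; this array
    layout is an assumption, not the study's actual data format.
    (In the Bland-Altman plot, (counts[r1] + counts[r2]) / 2 would be
    the x-axis and the differences below would be the y-axis.)
    """
    n_readers, _ = counts.shape
    diffs = np.concatenate([counts[r2] - counts[r1]
                            for r1, r2 in combinations(range(n_readers), 2)])
    sigma = diffs.std(ddof=1)  # spread of the differences (slide: 1.07)
    # Limits of agreement: where roughly 95% of the differences fall
    loa = (diffs.mean() - 1.96 * sigma, diffs.mean() + 1.96 * sigma)
    # Naive SE of the mean difference; it understates uncertainty because
    # differences sharing a reader or an ROI are correlated -- accounting
    # for that is the point of an MRMC analysis (not attempted here).
    se_naive = sigma / np.sqrt(len(diffs))
    return sigma, loa, se_naive

# Toy data: 10 readers x 40 ROIs of Poisson counts
rng = np.random.default_rng(0)
counts = rng.poisson(1.5, size=(10, 40))
print(count_difference_summary(counts))
```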

16-18. Results: Count Differences
    • Study 1:
      – More MFs with the microscope
      – Count differences were larger with digital
    • Study 2:
      – Microscope results consistent with Study 1

                                      Average    SE of Average    Std of Between-Reader
                                      Counts     Counts           Count Differences
    Study 1: Digital                  1.22       0.23             1.29
    Study 1: Microscope               1.48       0.27             1.12
    Study 1: Microscope - Digital     0.26       0.12             1.20
    Study 2: 14-head Microscope       1.54       0.25             1.07

19. Pairwise Concordance
    • A probability that tracks with correlation
    • Select two ROIs and consider the counts from two pathologists:
      – Pathologist 1: X1, X2
      – Pathologist 2: Y1, Y2
    • Possible outcomes (tallied in the sketch below):
      – Concordance: X1>X2, Y1>Y2
      – Discordance: X1>X2, Y1<Y2
      – Tie for Pathologist 1: X1=X2, Y1≠Y2
      – Tie for Pathologist 2: X1≠X2, Y1=Y2
      – Tie for both pathologists: X1=X2, Y1=Y2
    • No time for concordance results
    [Figure: between-reader scatter plot, micro14 vs. micro14; reader counts (0-8) on the 14-head microscope on both axes]
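
A minimal sketch of tallying the outcome table above over all ROI pairs for one pair of pathologists. The data layout and names are assumed; the slide does not specify how the tie categories enter the final concordance probability, so this sketch simply reports each category's share:

```python
from itertools import combinations

def concordance_summary(x, y):
    """Share of each outcome category over all ROI pairs for two readers.

    `x[i]` and `y[i]` are the two pathologists' counts in ROI i; the
    layout is an assumption, not the study's actual data format.
    """
    tallies = {"concordant": 0, "discordant": 0,
               "tie_reader1": 0, "tie_reader2": 0, "tie_both": 0}
    for i, j in combinations(range(len(x)), 2):
        dx, dy = x[i] - x[j], y[i] - y[j]
        if dx == 0 and dy == 0:
            tallies["tie_both"] += 1
        elif dx == 0:
            tallies["tie_reader1"] += 1
        elif dy == 0:
            tallies["tie_reader2"] += 1
        elif (dx > 0) == (dy > 0):   # counts move in the same direction
            tallies["concordant"] += 1
        else:                        # counts move in opposite directions
            tallies["discordant"] += 1
    n_pairs = sum(tallies.values())
    return {k: v / n_pairs for k, v in tallies.items()}

# Toy example: two readers' counts over six ROIs
print(concordance_summary([0, 2, 3, 1, 5, 2], [1, 2, 4, 0, 6, 2]))
```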

20. Classification Scores
    • Summarize with concordance (no time for concordance results)
    • For some cases:
      – "Definitely Not a MF"
      – "Definitely Is a MF"
    • Can this impact AI training?
    • Can still binarize this data (a sketch follows below)
      – 15% in the red zone
    [Figure: between-reader scatter plot, micro14 vs. micro14; reader classification scores (0-100) on both axes]
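
A minimal sketch of binarizing the 0-100 classification scores for a pair of readers. The slide gives neither the cut point nor a definition of the "red zone"; the threshold of 50 and the reading of the red zone as post-binarization disagreement are assumptions made here for illustration:

```python
import numpy as np

def binarized_disagreement(scores_a, scores_b, threshold=50):
    """Fraction of cases where two readers' binarized calls disagree.

    `threshold=50` is an assumed cut point on the 0-100 scale; the
    slide does not state the value used, nor that the "red zone" is
    defined this way.
    """
    a = np.asarray(scores_a) >= threshold
    b = np.asarray(scores_b) >= threshold
    return np.mean(a != b)

# Toy example: two readers' scores on eight candidate MFs
scores_a = [5, 95, 40, 60, 100, 0, 55, 45]
scores_b = [10, 90, 60, 70, 100, 5, 45, 40]
print(binarized_disagreement(scores_a, scores_b))  # 0.375 on this toy data
```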

21. Generalize to Evaluating Computational Pathology
    • FDA qualification of images with annotations
      – MDDT: Medical Device Development Tools
      – Support FDA submissions of computational pathology
    • Generate candidates from pathologists AND algorithm(s)
      – Reduces bias in the comparison
      – Candidates cover the range in likelihood that the candidate is a MF
    • Similar plot: AI likelihood instead of readers per candidate
    • Use the same agreement measures
    [Figure: histogram of readers per candidate (total = 158), as in slide 12]
