Agreement Between Saliency Maps and Human-Labeled Regions of Interest: Applications to Skin Disease Classification
Nalini Singh, Kang Li, David Coz, Christof Angermueller, Aaron Loh, Susan Huang, Yuan Liu
6.15.2020
ISIC Skin Image Analysis Workshop @ CVPR 2020
Project Overview
Goal: Determine whether a skin disease classification model makes decisions for surprising reasons
Approach: Quantify agreement between model explanations and human-labeled regions of interest
Experiment Pipeline
[Figure: pipeline diagram. Stages: Input Image; ROI Labeling by 3 Human Graders; Majority Consensus; Overlay; Saliency Map; Saliency Segmentation. Metrics: Spearman Rank Correlation; Thresholded Dice Score.]
Experiment Pipeline: Input Image
Model Development Dataset*
● 19,870 de-identified adult dermatology cases
● 1-6 consumer-grade camera images + metadata per case
● Classes: 26 skin conditions + 'Other'
● Labels from aggregated board-certified dermatologist opinions
Saliency Evaluation Dataset
● 1,309 de-identified adult dermatology cases sampled at random from model development test set
*Liu, Y., Jain, A., Eng, C. et al. A deep learning system for differential diagnosis of skin diseases. Nat Med (2020).
Experiment Pipeline: ROI Labeling
[Figure: pipeline diagram, ROI branch. Stages: Input Image; ROI Labeling by 3 Human Graders; Majority Consensus; Overlay.]
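The slide names three human graders and a majority consensus but does not say how the consensus is formed. A minimal sketch, assuming each grader supplies a binary ROI mask and the consensus is a pixelwise majority vote (the function name `majority_consensus` and the toy masks are hypothetical):

```python
import numpy as np

def majority_consensus(masks):
    """Pixelwise majority vote over binary ROI masks of shape (graders, H, W).

    Assumption: a pixel belongs to the consensus ROI when more than half
    of the graders marked it. The slides do not specify the exact rule.
    """
    masks = np.asarray(masks, dtype=bool)
    votes = masks.sum(axis=0)           # number of graders marking each pixel
    return votes * 2 > masks.shape[0]   # strict majority

# Toy 2x3 masks from three hypothetical graders
grader_masks = np.array([
    [[1, 1, 0], [0, 1, 0]],
    [[1, 0, 0], [0, 1, 1]],
    [[1, 1, 0], [0, 0, 1]],
])
consensus = majority_consensus(grader_masks)
```

With three graders a strict majority means at least two votes, so ties cannot occur.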
Experiment Pipeline: Model Architecture*
[Figure: architecture diagram. Clinical images (1-6) each pass through Inception-v4 and are averaged; metadata (45 features, e.g. Age: 31, Have fever?: No) pass through input feature metadata transforms; the image and metadata features are concatenated and fed to a softmax over 27 classes (Acne, Alopecia Areata, Cyst, Eczema, Psoriasis, Melanoma, Tinea, ..., Other). A saliency map is shown for an input image.]
*Liu, Y., Jain, A., Eng, C. et al. A deep learning system for differential diagnosis of skin diseases. Nat Med (2020).
Experiment Pipeline: Saliency Map
Model Architecture
● Top-1 accuracy: 66%
Saliency Map Generation*
● Integrated Gradients
[Figure: input image and its saliency map.]
*Sundararajan, Mukund, Ankur Taly, and Qiqi Yan. "Axiomatic attribution for deep networks." Proceedings of the 34th International Conference on Machine Learning, Volume 70. JMLR.org, 2017.
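For reference, Integrated Gradients (Sundararajan et al.) attributes to input feature i the quantity (x_i - x'_i) times the integral, over alpha in [0, 1], of dF/dx_i evaluated at x' + alpha (x - x'), where x' is a baseline input. The sketch below approximates that integral with a midpoint Riemann sum on a toy function with a known gradient. It is not the presenters' implementation; the toy model, the zero baseline, and the step count are assumptions made for illustration.

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline, steps=50):
    """Approximate Integrated Gradients with a midpoint Riemann sum.

    grad_fn(z) must return dF/dz at point z (same shape as z).
    """
    alphas = (np.arange(steps) + 0.5) / steps  # midpoints of [0, 1] sub-intervals
    grads = np.stack([grad_fn(baseline + a * (x - baseline)) for a in alphas])
    return (x - baseline) * grads.mean(axis=0)

# Hypothetical toy model F(x) = sum(x^2), whose gradient is 2x
def toy_model(x):
    return float(np.sum(x ** 2))

def toy_grad(x):
    return 2.0 * x

x = np.array([1.0, -2.0, 3.0])
baseline = np.zeros_like(x)  # assumed all-zero baseline
attributions = integrated_gradients(toy_grad, x, baseline)
```

A useful sanity check is the completeness property: the attributions should sum to F(x) - F(baseline), which holds here up to floating-point error.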
Experiment Pipeline: Saliency Segmentation
[Figure: pipeline diagram, saliency branch. Stages: Input Image; Saliency Map; Saliency Segmentation.]
Experiment Pipeline
[Figure: metric aggregation. For each input case, the Image Spearman Rank Correlation and the Image Thresholded Dice Score are computed per image; the maximum over the case's images gives the Case Spearman Rank Correlation and the Case Thresholded Dice Score.]
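The two agreement metrics and the per-case maximum can be sketched as follows. Several details are assumptions not stated on the slides: Spearman correlation is computed between flattened saliency values and the binary consensus ROI, the saliency map is segmented by keeping pixels at or above a quantile (the hypothetical `quantile` parameter), and all function names are illustrative.

```python
import numpy as np

def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    values = np.asarray(values, dtype=float).ravel()
    _, inverse, counts = np.unique(values, return_inverse=True, return_counts=True)
    ends = np.cumsum(counts)
    starts = ends - counts
    return ((starts + ends + 1) / 2.0)[inverse]

def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return float(np.corrcoef(average_ranks(a), average_ranks(b))[0, 1])

def thresholded_dice(saliency, roi, quantile=0.5):
    """Dice score between a quantile-thresholded saliency map and a binary ROI."""
    saliency = np.asarray(saliency, dtype=float)
    roi = np.asarray(roi, dtype=bool)
    saliency_mask = saliency >= np.quantile(saliency, quantile)  # assumed rule
    denom = saliency_mask.sum() + roi.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return float(2.0 * np.logical_and(saliency_mask, roi).sum() / denom)

def case_scores(images):
    """Case-level scores: maximum of each image-level score over a case's images."""
    rho = max(spearman(s, r) for s, r in images)
    dice = max(thresholded_dice(s, r) for s, r in images)
    return rho, dice

# Hypothetical case with two 2x2 images sharing the same ROI
roi = np.array([[1, 1], [0, 0]])
image_a = (np.array([[0.9, 0.8], [0.1, 0.2]]), roi)  # saliency on the ROI
image_b = (np.array([[0.1, 0.2], [0.9, 0.8]]), roi)  # saliency off the ROI
rho, dice = case_scores([image_a, image_b])
```

Taking the maximum means a case scores well if any one of its images shows agreement, which matches the "maximum" step on the slide.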
Examples: High Agreement
[Figure: example images grouped as Correctly Classified and Incorrectly Classified.]
Examples: Low Agreement
[Figure: example images grouped as Correctly Classified and Incorrectly Classified.]
Results by Condition
[Figure: per-condition plots of Spearman Rank Correlation and Dice Score.]
Results by Condition
[Figure: per-condition plots of Spearman Rank Correlation and Dice Score, with Androgenetic Alopecia (5) and Acne (1) labeled.]
Results by Condition
[Figure: per-condition plots of Spearman Rank Correlation and Dice Score, with SK/ISK (18) and Melanoma (12) labeled.]
Results by Demographics
Summary & Conclusions
Quantitatively compared model explanations to human-labeled ROIs:
● Notably, found that model explanations point to 'normal anatomy' (e.g. hair, nails, and lips).
● Insights from analysis will guide targeted data collection and data augmentation strategies.
● Workflow could be used to identify differences between model explanations and human regions of interest for any model.
Related Work
● Eng, Clara, Y. Liu, and R. Bhatnagar. "Measuring clinician–machine agreement in differential diagnoses for dermatology." British Journal of Dermatology (2019).
● Liu, Yuan, et al. "A deep learning system for differential diagnosis of skin diseases." Nature Medicine (2020): 1-9.
● Ghorbani, Amirata, et al. "DermGAN: Synthetic Generation of Clinical Skin Images with Pathology." NeurIPS ML4H Workshop (2019).
● Singh, Nalini, et al. "Agreement Between Saliency Maps and Human-Labeled Regions of Interest: Applications to Skin Disease Classification." CVPR ISIC Workshop (2020).