FDA Regulatory Framework and Evaluation Methods for AI/ML-Based Decision Tools
Nicholas Petrick, Ph.D.
Division of Imaging, Diagnostics and Software Reliability (DIDSR)
Office of Science and Engineering Labs
Center for Devices and Radiological Health (CDRH)
U.S. Food and Drug Administration (FDA)
nicholas.petrick@fda.hhs.gov
Disclosures
• None
5/9/2019
Outline
• Overview of medical device regulatory framework
  – Regulatory overview
  – ML-based medical devices
  – Software as a medical device (SaMD)
• Assessment
  – Imaging-based machine learning (ML) SaMD assessment
  – Framework for assessing ML SaMD modifications
Center for Devices and Radiological Health
• Protect and promote the health of the public by ensuring the safety and effectiveness of medical devices and the safety of radiation-emitting electronic products
Device Class & Pre-Market Requirements
Device Class               Controls                             Premarket Review Process
Class I (lowest risk)      General Controls                     Most are exempt
Class II                   General Controls, Special Controls   Premarket Notification [510(k)]
Class III (highest risk)   General Controls                     Premarket Approval [PMA]
Device Class & Pre-Market Requirements: Premarket Notification [510(k)]
• Class II: General Controls and Special Controls
• Demonstrate substantial equivalence to predicate device
Device Class & Pre-Market Requirements: De Novo
• Means for new device, without a valid predicate, to be classified into Class I or II
Device Class & Pre-Market Requirements: Premarket Approval [PMA]
• Class III (highest risk): General Controls
• Demonstrate reasonable assurance of safety and effectiveness
Examples of ML-Based Medical Software
• Viz.Ai: FDA News Release, "FDA permits marketing of clinical decision support software for alerting providers of a potential stroke in patients," February 13, 2018
• IDx-DR: FDA News Release, "FDA permits marketing of artificial intelligence-based device to detect certain diabetes-related eye problems," April 11, 2018
ML-Based Medical Devices Are Not New
• Mostly imaging or physiological signal analysis applications
  – ECG signal analysis
  – Analysis of radiology images
  – Analysis of cytology/pathology images
    • Semi-automated cervical cytology slide reader
    • Reduce false negatives due to human error
    • FDA approval in 1994
ML-Based Medical Devices
• Potential to fundamentally transform the delivery of health care
  – E.g., earlier disease detection, more accurate diagnosis, new insights into human physiology, personalized diagnostics and therapeutics
• Ability for ML to learn from the wealth of real-world data and improve performance
• Already seeing ML lead to the development of novel medical devices
[Figure: IDx-DR]
ML-Based Medical Devices: Challenges
• Need for large, high-quality, well-curated data sets
• Explainability of "black box" approaches
• Identifying and removing bias
• Oversight of ML-based algorithms that learn/change over time
[Figure: QuantX]
Software as a Medical Device (SaMD)
http://www.imdrf.org/workitems/wi-samd.asp
SaMD Risk Categorization
Significance of Information Provided by SaMD to Healthcare Decision (columns) vs. State of Healthcare Situation or Condition (rows)
State of Situation   Treat or Diagnose   Drive Clinical Management   Inform Clinical Management
Critical             IV                  III                         II
Serious              III                 II                          I
Non-Serious          II                  I                           I
Significance increases from Inform to Treat or Diagnose; criticality increases from Non-Serious to Critical.
http://www.imdrf.org/workitems/wi-samd.asp
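The categorization table can be read as a two-key lookup. The sketch below encodes only the nine cells shown on the slide; the key strings and the function name `samd_category` are illustrative choices, not part of the IMDRF framework.

```python
# Minimal sketch: the SaMD risk table as a nested lookup.
# Key names ("critical", "treat_or_diagnose", ...) are hypothetical labels
# chosen for this example; the category values come from the slide's table.
SAMD_CATEGORY = {
    "critical":    {"treat_or_diagnose": "IV",  "drive": "III", "inform": "II"},
    "serious":     {"treat_or_diagnose": "III", "drive": "II",  "inform": "I"},
    "non_serious": {"treat_or_diagnose": "II",  "drive": "I",   "inform": "I"},
}


def samd_category(state: str, significance: str) -> str:
    """Return the risk category for a healthcare state and information significance."""
    try:
        return SAMD_CATEGORY[state][significance]
    except KeyError:
        raise ValueError(f"unknown state/significance: {state!r}, {significance!r}") from None


if __name__ == "__main__":
    for state in SAMD_CATEGORY:
        for significance in SAMD_CATEGORY[state]:
            print(state, significance, samd_category(state, significance))
```

Category IV is the highest-impact cell (critical condition, treat or diagnose) and category I the lowest.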
Fundamentals of Image-Based ML Assessment
• Device description
• Data
• Performance assessment
  – Standalone performance
  – Reader performance (when appropriate)
  – …
• Human factors or other information/testing as appropriate
• …
Device Description
• Device & algorithm descriptions
  – Device usage (mode of operation, patient population, …)
  – Algorithm design and function
    • Including structure of traditional and deep learning networks
    • Inputs: type and range of signals/data
    • Outputs
  – Training process
  – Training/test database
  – Reference standard
  – …
Data
• ML algorithms are data-driven
  – Versus, for example, physics or biology based
• ML algorithm development now facilitated by standardized ML platforms
  – Brings ML to a wider array of users
  – The good
    • Access to high-quality data streamlines design of novel ML applications
  – The bad
    • Garbage in, garbage out
[Figure: ML features and ML training]
Performance Testing
• Performance of ML algorithm on an independent data set
  – Ideally, identifies problems with training process
[Figure: training data (training set and tuning set) feed the learning process, which produces learned models; the selected model is fixed and then applied to independent test data to obtain test performance]
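The separation between training, tuning, and independent test data can be sketched as below. This is a toy illustration under stated assumptions: the split fractions, the synthetic one-feature data, and the threshold "models" are all invented for the example, and the training step is stubbed out. The point is only that model selection uses the tuning set, and the test set is touched once, after the model is fixed.

```python
import random


def split_cases(case_ids, tune_frac=0.2, test_frac=0.2, seed=0):
    """Split case IDs into disjoint training, tuning, and test sets."""
    ids = sorted(set(case_ids))
    random.Random(seed).shuffle(ids)
    n_test = int(round(len(ids) * test_frac))
    n_tune = int(round(len(ids) * tune_frac))
    test = ids[:n_test]
    tune = ids[n_test:n_test + n_tune]
    train = ids[n_test + n_tune:]
    return train, tune, test


def accuracy(threshold, ids, data):
    """Fraction of cases where (feature > threshold) matches the label."""
    correct = sum((data[i][0] > threshold) == bool(data[i][1]) for i in ids)
    return correct / len(ids)


# Hypothetical data: case id -> (feature value, reference-standard label).
data = {i: (i / 100.0, int(i / 100.0 > 0.5)) for i in range(100)}
train, tune, test = split_cases(data.keys())

# Candidate "learned models" (here just thresholds; training is stubbed out).
candidates = [0.3, 0.5, 0.7]

# Model selection uses the tuning set only.
selected = max(candidates, key=lambda t: accuracy(t, tune, data))

# The selected model is now fixed; the test set is evaluated once.
test_accuracy = accuracy(selected, test, data)
print("selected threshold:", selected, "test accuracy:", test_accuracy)
```

Splitting by case ID (rather than by image or sample) is one way to keep data from the same patient out of both training and test sets.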
Performance Testing
• Standalone performance
  – Performance of algorithm alone
  – Assesses robustness and generalizability of algorithm
• Clinical reader performance
  – Assessment of clinical aids
  – Clinicians' performance utilizing device
    • Multi-reader multi-case designs
    • Compare clinicians' performance with the ML SaMD aid to performance without the aid
[Figure, standalone: test dataset → apply ML SaMD → statistical/regression analysis → standalone performance]
[Figure, reader: test dataset → clinical read without aid, and apply ML SaMD → clinical aided read → statistical analysis → reader performance (difference)]
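As a rough illustration of the two assessments, the sketch below computes an empirical AUC for the algorithm alone and the average aided-minus-unaided AUC difference across readers. The choice of AUC as the metric, the scores, and the reader ratings are assumptions made for this example. It gives point estimates only; a full multi-reader multi-case analysis also models reader and case variability, which is omitted here.

```python
def empirical_auc(pos_scores, neg_scores):
    """Mann-Whitney AUC estimate: P(positive score > negative score), ties count 0.5."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def mean_reader_auc_difference(readings):
    """Average (aided AUC - unaided AUC) over readers."""
    diffs = []
    for reader in readings.values():
        auc_aided = empirical_auc(*reader["aided"])
        auc_unaided = empirical_auc(*reader["unaided"])
        diffs.append(auc_aided - auc_unaided)
    return sum(diffs) / len(diffs)


# Standalone: hypothetical algorithm scores for diseased and non-diseased cases.
standalone_auc = empirical_auc([0.9, 0.8, 0.4], [0.1, 0.3, 0.5])

# Reader study: hypothetical 1-5 ratings as (diseased ratings, non-diseased ratings).
readings = {
    "reader_A": {"unaided": ([3, 4, 2], [1, 2, 3]), "aided": ([4, 5, 3], [1, 2, 2])},
    "reader_B": {"unaided": ([2, 4, 4], [2, 1, 3]), "aided": ([3, 4, 5], [2, 1, 3])},
}
mean_diff = mean_reader_auc_difference(readings)

print("standalone AUC:", standalone_auc)
print("mean aided - unaided AUC:", mean_diff)
```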
Proposed Regulatory Framework for AI/ML Algorithm Modifications
• Agency is proposing a framework that gives manufacturers the option to submit a plan for AI/ML modifications during initial premarket review
• Initial premarket phase would include
  – Review initial SaMD performance
  – Review plan for modifications
  – Review ability to manage/control resultant risks of modifications
• FDA is asking for community feedback on this document
https://www.regulations.gov/document?D=FDA-2019-N-1185-0001
Components of Change Control Plan
• Good ML Practices (GMLP):
  – Accepted practices in AI/ML algorithm design, development, training, and testing that facilitate the quality development and assessment of AI/ML-based algorithms
    • Based on concepts from quality systems, software reliability, machine learning, data analysis, etc.
• SaMD Pre-Specifications (SPS):
  – Delineates the proposed types of modifications to the SaMD (i.e., what types of changes the sponsor plans to achieve)
    • Determine "range of potential changes" around the initial specifications and labeling of original device
• Algorithm Change Protocol (ACP):
  – Describes the methods for performing and validating the changes pre-specified in SPS (i.e., how the sponsor intends to achieve the changes)
    • Typically specific to the device and type of change
    • Expected to contain a step-by-step delineation of the procedures to be followed
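One way to picture how SPS and ACP relate is as a small data structure: the SPS lists the change types anticipated in advance, and the ACP maps each type to the steps to be followed. The sketch below is purely illustrative. The class name, the change-type labels, and the step names are hypothetical and are not taken from the FDA proposal, and what happens to a change that falls outside the SPS is not specified on the slide, so the code only flags it.

```python
from dataclasses import dataclass


@dataclass
class ChangeControlPlan:
    """Hypothetical container pairing pre-specified change types with protocol steps."""
    sps_change_types: set   # SPS: what types of changes are anticipated
    acp_steps: dict         # ACP: change type -> ordered list of required steps

    def check(self, change_type, completed_steps):
        """Return (within_sps, missing_steps) for a proposed modification."""
        if change_type not in self.sps_change_types:
            return False, []
        required = self.acp_steps.get(change_type, [])
        missing = [step for step in required if step not in completed_steps]
        return True, missing


# Illustrative labels only; a real plan would be device- and change-specific.
plan = ChangeControlPlan(
    sps_change_types={"retrain_with_new_data"},
    acp_steps={
        "retrain_with_new_data": [
            "curate_new_data",
            "retrain_model",
            "evaluate_on_independent_test_set",
        ]
    },
)

print(plan.check("retrain_with_new_data", ["curate_new_data", "retrain_model"]))
print(plan.check("new_intended_use", []))
```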
Current AI/ML Workflow
[Figure: workflow under Good Machine Learning Practices]
• AI model development
  – Data selection and management
  – Model training and tuning (including data for re-training)
  – Model validation: performance evaluation, clinical evaluation
• AI production model
  – Deployed model receiving new (live) data
  – Model monitoring: log and track, evaluate performance
• AI device modifications
  – Premarket assurance of safety and effectiveness for modified AI/ML algorithm
Proposed TPLC Approach Overlaid on AI/ML Workflow
[Figure: the AI/ML workflow (Good Machine Learning Practices; AI model development with data selection and management, model training and tuning, and model validation; AI production model with deployed model, new (live) data, and model monitoring; AI device modifications) with the proposed TPLC approach overlaid]
• Culture of Quality and Organizational Excellence
• Premarket Assurance of Safety and Effectiveness
• Review of SaMD Pre-Specifications and Algorithm Change Protocol
• Real-World Performance Monitoring