Introspection-Based Fault Tolerance for Future On-Board Computing Systems
Mark L. James and Hans P. Zima
Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA
{mjames,zima}@jpl.nasa.gov
High Performance Embedded Computing (HPEC) Workshop, MIT Lincoln Laboratory, 23-25 September 2008
Contents
1. Requirements and Challenges for Space Missions
2. Emerging Multi-Core Systems
3. High Capability Computation in Space
4. An Introspection Framework for Fault Tolerance
5. Concluding Remarks
More than 50 NASA Missions Explore Our Solar System
- Spitzer studying stars and galaxies in the infrared
- Cassini studying Saturn
- Ulysses studying the sun
- CALIPSO studying Earth’s climate
- GALEX surveying galaxies in the ultraviolet
- Aqua studying Earth’s oceans
- Mars Odyssey, rovers “Spirit” and “Opportunity” studying Mars
- Two Voyagers on an interstellar mission
- MESSENGER on its way to Mercury
- QuikScat, Jason 1, CloudSat, and GRACE (plus ASTER, MISR, AIRS, MLS and TES instruments) monitoring Earth
- Aura studying Earth’s atmosphere
- Hubble studying the universe
- Chandra studying the x-ray universe
- New Horizons on its way to Pluto
Space Challenges: Environment
Constraints on Spacecraft Hardware
- Radiation
  - Total Ionizing Dose (TID): the amount of ionizing radiation absorbed over time; can lead to long-term cumulative degradation and permanent damage
  - Single Event Effects: caused by a single high-energy particle traveling through a semiconductor and leaving an ionized trail
    - Single Event Latchup (SEL): catastrophic failure of the device (prevented by Silicon-On-Insulator (SOI) technology)
    - Single Event Upset (SEU) and Multiple Bit Upset (MBU): change of bits in memory; a transient effect, causing no lasting damage
- Temperature
  - wide range (from -170 °C on Europa to >400 °C on Venus)
  - short cycles (about 50 °C on MER)
- Vibration
  - launch
  - Planetary Entry, Descent, Landing (EDL)
Space Challenges: Communication and Navigation
Constraints on Mission Operations
- Bandwidth
  - 6 Mbit/s maximum, but typically much less (100 bit/s)
  - spacecraft transmitter power is less than that of the light bulb in a refrigerator
- Latency (one way)
  - 20 minutes to Mars
  - 13 hours to Voyager 1
- Navigation
  - Position
  - Velocity
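To make the bandwidth and latency figures above concrete, the following minimal sketch works through the arithmetic. It computes the distance implied by each one-way light time and the time needed to downlink a payload at the two quoted data rates. The 1 MB payload and the function names are illustrative assumptions, not values from the slides, and protocol overhead is ignored.

```python
# Illustrative arithmetic only. The 1 MB payload is a hypothetical example,
# and both helper names are invented for this sketch.
C_KM_PER_S = 299_792.458   # speed of light in vacuum, km/s
AU_KM = 149_597_870.7      # one astronomical unit, km


def implied_distance_au(one_way_latency_s: float) -> float:
    """Distance (in AU) a radio signal covers in the given one-way light time."""
    return one_way_latency_s * C_KM_PER_S / AU_KM


def transfer_time_s(payload_bits: float, rate_bit_per_s: float) -> float:
    """Seconds needed to send a payload over a link, ignoring all overhead."""
    return payload_bits / rate_bit_per_s


if __name__ == "__main__":
    # Latencies quoted on the slide: 20 minutes (Mars), 13 hours (Voyager 1).
    print(f"20 min one way -> {implied_distance_au(20 * 60):.1f} AU")
    print(f"13 h one way   -> {implied_distance_au(13 * 3600):.1f} AU")

    payload_bits = 8_000_000  # hypothetical 1 MB data product
    for rate in (100, 6_000_000):  # typical and maximum rates from the slide
        hours = transfer_time_s(payload_bits, rate) / 3600
        print(f"{rate:>9} bit/s -> {hours:.4f} h")
```

At the typical rate, even this small payload occupies the link for most of a day, which is the quantitative core of the argument for on-board analysis, filtering, and compression.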
Space Challenges: Engineering
- Typically only flight-qualified parts are used
  - systems are at least 5 years out of date when launched: two generations behind the commercial state of the art
- Power and mass restrictions
  - 20-30 W for a flight computer
- Often the final system can be tested only when it is flown
  - hence the importance of modeling and simulation
- Long mission durations challenge the maintainability of ground assets in the operations phase
  - Voyager is based on a custom flight computer designed with MSI parts and ferrite core memory of the late 1960s (programmed in assembler)
Duck Bay: Site of Opportunity’s descent into Victoria Crater
NASA/JPL: Potential Future Missions (artist concepts)
- Mars Sample Return
- Neptune Triton Explorer
- Europa Explorer
- Titan Explorer
- Europa Astrobiology Laboratory
Future Mission Applications
- New Types of Science
  - Opportunistic science (event detection: e.g., dust devils or volcanic eruptions)
  - Model-based autonomous mission planning
  - Smart high resolution sensors (e.g., Gigapixel, SAR, …)
  - Hyperspectral imaging
- Entry, Descent & Landing
  - Flight control through disparate flight regimes
  - Landing zone identification
  - Lateral winds
  - Soft touchdown
- Surface Mobility
  - Terrain traversal, obstacle avoidance
  - Science target identification
  - Image/video compression
- Communication with Earth is a limiting factor
  - Small bandwidth requires reduction of data transfer volume: on-board data analysis, filtering, and compression
New Requirements
New applications and the limited downlink to Earth lead to two major new requirements:
1. Autonomy
2. High-Capability On-Board Computing
Such missions require on-board computational power ranging from tens of Gigaflops to hundreds of Teraflops.
The Traditional Approach Will Not Scale
- Traditional approach based on radiation-hardened processors and fixed redundancy (e.g., Triple Modular Redundancy, TMR)
  - Current generation (Phoenix and Mars Science Lab, ’09 launch)
    - Single BAE RAD750 processor
    - 256 MB of DRAM and 2 GB of flash memory (MSL)
    - 200 MIPS peak, 14 W available power (14 MIPS/W)
- Radiation-hardened processors today lag commercial architectures by a factor of about 100 (and growing)
- By 2015, a single rad-hard processor may deliver about 1 GFLOPS: orders of magnitude below requirements
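The slide names Triple Modular Redundancy as the traditional form of fixed redundancy but does not show how it works. As a minimal sketch, assuming three replicas of a data word and a software bitwise majority voter, the example below shows a single event upset in one replica being masked. The names `tmr_vote` and `flip_bit` are hypothetical, and flight systems typically implement voting in hardware rather than in code like this.

```python
# Minimal TMR sketch. All names are hypothetical and chosen for illustration;
# this is not the voting logic of any particular flight system.

def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority: each output bit is the value held by at least two inputs."""
    return (a & b) | (a & c) | (b & c)


def flip_bit(word: int, bit: int) -> int:
    """Model a single event upset (SEU) by inverting one bit of a word."""
    return word ^ (1 << bit)


if __name__ == "__main__":
    golden = 0b1011_0010

    # One replica suffers an SEU: the other two outvote it.
    replicas = [golden, flip_bit(golden, 5), golden]
    print(tmr_vote(*replicas) == golden)  # True: the upset is masked

    # The same bit flipped in two replicas defeats the voter.
    corrupted = flip_bit(golden, 5)
    print(tmr_vote(corrupted, corrupted, golden) == golden)  # False
```

The voter masks any fault confined to one replica, at the price of tripling the hardware for every protected component. That cost is fixed regardless of how much protection a given computation actually needs.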
Contents
1. Requirements and Challenges for Space Missions
2. Emerging Multi-Core Systems
3. High Capability Computation in Space
4. An Introspection Framework for Fault Tolerance
5. Concluding Remarks