CS626 Data Analysis and Simulation



  1. CS626 Data Analysis and Simulation
     Instructor: Peter Kemper, R 104A, phone 221-3462, email: kemper@cs.wm.edu
     Office hours: Monday, Wednesday 2-4 pm
     Today: Overview & Introduction

  2. Organizational Issues
     Class materials: http://www.cs.wm.edu/~kemper
     - PDF files of slides
     - Homework, projects, …
     - Supplementary material: tutorials, references, …
     Schedule
     - MWF: 10:00-10:50 am
     - Office hours: Monday, Wednesday 2-4 pm, and by appointment
     - No class February 18, due to the DSN PC meeting; this gives extra time for assignments. Just ask if you are interested in what I am doing there, I am happy to tell you more …

  3. References
     We will use texts from multiple sources! Many documents and books are available online through the SWEM library.
     Data Analysis:
     - Berthold, Borgelt, Hoeppner, Klawonn, Guide to Intelligent Data Analysis
     - NIST/SEMATECH e-Handbook of Statistical Methods, http://www.itl.nist.gov/div898/handbook/
     - P. Dalgaard, Introductory Statistics with R, Springer 2002, online at SWEM
     - B. Everitt, A Handbook of Statistical Analyses Using R, CRC Press 2010, online at SWEM
     Simulation:
     - Law/Kelton, Simulation Modeling and Analysis, McGraw-Hill

  4. Big Picture: Model-based Analysis of Systems
     [Figure labels: real world, portion/facet, perception, real world problem, description, formal model (probability model, stochastic process, rewards), transformation, formal / computer-aided qualitative and quantitative analysis, properties, solution, presentation, decision, transfer of solution to real world problem]

  5. CS 626 – Focus: Stochastic Models
     Stochastic models rely on probability theory.
     What is probability?
     - A much beloved topic for students
     - A particular type of function f : S -> [0,1]
     - A mathematical means to process something we are not sure about
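To make "a function f : S -> [0,1]" concrete, here is a minimal sketch in Python. It assumes a hypothetical fair six-sided die and reads S as the collection of events (subsets of the outcomes); neither the die nor the names `sample_space`, `weights` and `prob` come from the slides.

```python
from fractions import Fraction

# Hypothetical example (not from the slides): a fair six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}
weights = {outcome: Fraction(1, 6) for outcome in sample_space}


def prob(event):
    """Map an event (a subset of the sample space) to a number in [0, 1]."""
    return sum((weights[o] for o in event), Fraction(0))


even = {2, 4, 6}
print("P(even) =", prob(even))
print("P(anything) =", prob(sample_space))
print("P(nothing) =", prob(set()))
```

The function is additive over disjoint events and assigns 1 to the whole sample space, which is what separates a probability from an arbitrary function into [0,1].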

  6. Calculating with something we are not sure about?
     What to give as input?
     - Separate "system" from "environment", clarify interactions
     - Separate "subsystems" and their dependencies within "system"
     - Quantify the likelihood of relevant elementary "events"
     What to know?
     - How to calculate with probabilities?
     - How to handle dependencies?
     What to gain?
     - Information on the likelihood of overall behavior, quantified information on expected behavior, to evaluate a single system or to compare systems

  7. Grading: How to get an "A" for CS 626
     Class participation: 0%
     - Seems useless but is key to meeting the other criteria
     - It will be more fun if you actively participate
     - Ask questions, make suggestions, contribute …
     Homework: 30%
     - About 5-6 assignments
     - How to get an A? Just do it, hand in on time, present your results …
     Projects: 20%
     - Will require some effort, time and creativity
     - Adheres to the "learning by doing" approach
     - How to get an A? Start early, get things done, hand in on time, reflect on what you are doing, present your results …
     In-class exams: 50%
     - Midterm: 20%
     - Final: 30%
     - How to get an A? The usual game …

  8. Overview – this is the plan
     - Probability Theory Primer & Statistics Concepts
     - Stochastic Input Modeling
       - Different types of stochastic workloads
       - Relies heavily on data analysis
     - Simulation Models
       - Static
       - Dynamic
       - Discrete Event Dynamic Systems
       - Continuous, ODE models
     - Output Analysis
       - Data analysis strikes back again!
     - Verification, Validation, Testing of Simulation Models
     - Data Analysis Classics
       - Preparation
       - Finding Patterns, Explanations, and Predictors
     Tools: Mobius, BioPEPA, AnyLogic, R, KNIME

  9. Probability Theory has its origins in an interest in games
     - Cards, roulette, dice
     Gerolamo Cardano, 1501-1576
     - From Wikipedia: … notoriously short of money and kept himself solvent by being an accomplished gambler and chess player. His book about games of chance, Liber de ludo aleae, written in the 1560s but published only in 1663 after his death, contains the first systematic treatment of probability, as well as a section on effective cheating methods.
     Pierre-Simon Laplace, 1749-1827
     - "It is remarkable that a science which began with the consideration of games of chance should have become the most important object of human knowledge." Théorie analytique des probabilités, 1812.
     Other VIPs of probability theory:
     - Andrey Kolmogorov
     - Andrey Markov

  10. So where to start? With an application example!
      My choice: Dependability of a LEO satellite network
      Reference: E. Athanasopoulou, P. Thakker, W.H. Sanders. Evaluating the dependability of a LEO satellite network for scientific applications. In Proc. 2nd Int. Conf. Quantitative Evaluation of Systems, pp. 95-104, IEEE, 2005.
      What to learn from this:
      - An impression of what can be achieved with stochastic models
      - Some terminology, techniques and tools we need to take a closer look at
      Please keep in mind:
      - LEO satellite modeling is one application among many
      - Dependability modeling is one application area among many
      - The terms and techniques are a subset of what is known
      => there is more to learn here, and it is interesting!

  11. A Low Earth Orbit Satellite Network
      The story:
      - University of Illinois student-developed satellite network
      - Based on Illinois Observing Nanosatellite (ION)
      - Purpose: collect scientific data
        - E.g. natural disaster monitoring, earthquake monitoring, mapping of Earth's magnetic field, measuring radiation flux for space weather …
        - In particular: measurement of light emissions from oxygen chemistry in the atmosphere
      - Several mission objectives:
        - Testing new thrusters, a new processor for small satellites, a new CMOS camera, demonstration of attitude control on a CubeSat
      Some issues from the list of challenges:
      - What minimum radiation shielding is necessary?
      - What level of redundancy is necessary? … to achieve a 6-month target lifetime with COTS components …
      - How could sharing resources through a network improve communication with the ground?

  12. Dependability Assessment
      Properties of interest, goals of study
      - Reliability of network R(0,t): probability of no permanent critical system failures during time interval [0,t]
      - Interval availability A(0,t): fraction of time the system delivers proper service during time interval [0,t]
      System vs Environment
      System:
      - Ground station, 45° inclination, northern hemisphere, repairable failures
      - 4 satellites
      - 7 critical subsystems (5V/9V regulators, battery, solar panel, comm hardware, processor, telemetry) + experiment hardware, with temporary and permanent failures
      - Orbits:
        - Sat 1-3: 90 min period, Sat 4: 720 min period
        - Inclination: Sat 1 90°, Sat 2 90° orth, Sat 3 45°, Sat 4 0°
      Environment: "the rest", with a foreseen influence based on
      - Lightning storms etc. cause preamp failures at ground station
      - Radiation causes failures for satellite processors and CMOS devices
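Interval availability A(0,t) can be estimated by simulating one repairable component, such as the ground station, and measuring the fraction of [0,t] it spends working. The sketch below is only an illustration: it assumes exponentially distributed times to failure and to repair, and the rates (1 failure and 9 repairs per time unit) are hypothetical values, not figures from the study.

```python
import random


def interval_availability(fail_rate, repair_rate, horizon, rng):
    """Fraction of [0, horizon] spent in the 'up' state, starting up."""
    t = 0.0
    up_time = 0.0
    up = True
    while True:
        # Time spent in the current state before the next failure/repair.
        stay = rng.expovariate(fail_rate if up else repair_rate)
        end = min(t + stay, horizon)
        if up:
            up_time += end - t
        if end >= horizon:
            return up_time / horizon
        t = end
        up = not up


# Hypothetical rates, chosen only for illustration.
estimate = interval_availability(1.0, 9.0, 20000.0, random.Random(626))
print("estimated A(0,t):", estimate)
print("long-run value repair/(fail+repair):", 9.0 / (1.0 + 9.0))
```

For a long interval the estimate should settle near repair_rate / (fail_rate + repair_rate), the long-run availability of this two-state model.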

  13. Communication
      - Intersatellite link (ISL), satellite - satellite: if within communication range
      - Gateway link (GWL), satellite - ground station: if within footprint
      - We discretize orbits and identify matching periods
      - Elevation ε: angle with respect to the center of the radiation cone and the earth's surface
      [Figure labels: ISL, GWL, footprint, ε]

  14. Clarification: Inclination
      [Figure: inclination δ, shown between the plane of the satellite orbit and the equatorial plane; further labels: satellite orbit, closest point to earth]

  15. More input data
      Radiation dose, shielding and its mapping to failure rates
      - 0.007 failures per year for processor and CMOS components
      - 0.0001 failures per year for discrete components
      - Scale factor r = 1 for 1 mm shielding for higher orbit
      Consider several model configurations!
      Communication:
      - Data collection rate: 2 GB/yr while memory is available; all data lost at failure
      - Uplink communication considered negligible
      - Simple routing mechanism
      - ISL communication rate: 115 kbps with 50% overhead, 226665 MB/yr
      - GWL rate: double the ISL rate
      - ISL with commercial satellite networks
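The ISL figure of 226665 MB/yr can be reproduced from 115 kbps and 50% overhead. The sketch below assumes decimal units (1 kbps = 1000 bit/s, 1 MB = 10^6 bytes), a 365-day year, and that "50% overhead" means half of the raw rate carries payload; these conventions are assumptions that happen to match the slide's number, and the function name is made up for the example.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: 365-day year


def payload_mb_per_year(raw_kbps, overhead_fraction):
    """Yearly payload volume in MB for a link with protocol overhead."""
    payload_bits_per_s = raw_kbps * 1000 * (1 - overhead_fraction)
    payload_bytes_per_s = payload_bits_per_s / 8
    return payload_bytes_per_s * SECONDS_PER_YEAR / 1e6


isl_mb_per_year = payload_mb_per_year(115, 0.5)
gwl_mb_per_year = 2 * isl_mb_per_year  # GWL rate is double the ISL rate

print("ISL payload per year (MB):", isl_mb_per_year)
print("GWL payload per year (MB):", gwl_mb_per_year)
```

Compared with the data collection rate of 2 GB/yr, both link capacities are larger by about two orders of magnitude.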

  16. Where are the probabilities?
      Dependability study:
      - Ground station: rates of failure / repair actions
      - Satellite subsystems: rates of failure / repair actions
      These are modeled with a random variable that follows a negative exponential distribution with a given rate. Rate 5.0 means on average 5 events per time unit.
      - Total Ionizing Dose (TID): taken into account by a scaling factor r applied to the failure rates of components
      What is then analyzed
      - Reliability and availability
      - For different levels of radiation shielding
      - For different levels of redundancy of components
      What type of analysis is used
      - Transient analysis of Markov chains: for single satellite design
      - Discrete event simulation of stochastic models: valid alternative, used for evaluation of overall network
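The statement "rate 5.0 means on average 5 events per time unit" can be checked empirically. The sketch below draws exponentially distributed gaps between events with Python's standard `random.expovariate` and counts how many events fall into a long observation window; the window length and the seed are arbitrary choices for the example.

```python
import random


def count_events(rate, horizon, rng):
    """Count events in [0, horizon] when gaps between events are Exp(rate)."""
    t = 0.0
    n = 0
    while True:
        t += rng.expovariate(rate)  # mean gap is 1 / rate
        if t > horizon:
            return n
        n += 1


rate = 5.0
horizon = 10000.0  # arbitrary long window, in the same time unit as the rate
events = count_events(rate, horizon, random.Random(626))
events_per_unit = events / horizon

print("observed events per time unit:", events_per_unit)
print("mean gap between events should be near:", 1.0 / rate)
```

The same reading applies to the failure and repair rates of the study: a rate of 0.007 per year corresponds to a mean time to failure of 1/0.007 years.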

  17. Outcome of calculations
      Satellite reliability in time interval [0,t], with t in months, for different levels of radiation shielding
      - Base radiation rate r = 1 (matches 1 mm of shielding)
      - Increased shielding: r = 0.4, r = 0.2
      - Reduced shielding: r = 2, r = 5
      [Plot: probability of no permanent critical system failures during [0,t]]
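How the scaling factor r moves such a curve can be seen on a much simpler model than the one in the study. The sketch assumes a single component with an exponentially distributed lifetime, so that R(0,t) = exp(-r * lambda * t), and uses the 0.007 failures per year quoted for processor and CMOS components as lambda. It does not reproduce the reported curves, which come from a Markov model of the whole satellite.

```python
import math


def reliability(t_months, base_rate_per_year, r=1.0):
    """P(no failure in [0, t]) for one component with rate r * base_rate."""
    t_years = t_months / 12.0
    return math.exp(-r * base_rate_per_year * t_years)


BASE_RATE = 0.007  # failures per year, processor / CMOS components

# Shielding levels from the slide: increased (r < 1) and reduced (r > 1).
for r in (0.2, 0.4, 1.0, 2.0, 5.0):
    print("r =", r, " R(0, 6 months) =", reliability(6, BASE_RATE, r))
```

Smaller r (more shielding) always gives higher reliability for the same t in this simplified model, which is the qualitative ordering one would expect of the curves.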

  18. Some more results wrt reliability
      Different levels of redundancy for
      - Batteries
        - Note: failures are unrelated to radiation
        - Two batteries seem ok
      - Regulators
        - Redundancy does not improve TID immunity, since the processor or communication system fails due to TID much earlier
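Why extra regulators stop helping can be illustrated with textbook reliability formulas. The sketch assumes independent components, redundant units that all run in parallel without repair, and a system that needs every subsystem to work (a series structure). All numeric values are hypothetical and only serve the illustration.

```python
def parallel(r_single, n):
    """Reliability of n independent redundant units (at least one works)."""
    return 1.0 - (1.0 - r_single) ** n


def series(*parts):
    """Reliability of a system that needs every part to work."""
    result = 1.0
    for p in parts:
        result *= p
    return result


# Hypothetical reliabilities over some mission time, for illustration only.
processor = 0.60   # weakest subsystem, e.g. limited by TID
regulator = 0.95

for n in (1, 2, 3):
    system = series(processor, parallel(regulator, n))
    print(n, "regulator(s): system reliability =", system)
```

However many regulators are added, the series system can never be more reliable than its weakest non-redundant part, here the processor.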
