
Detecting Anomalous Computation - PowerPoint PPT Presentation

Detecting Anomalous Computation with RNNs on GPU-Accelerated HPC Machines. Pengfei Zou, Rong Ge, Clemson University. Ang Li, Kevin Barker, Pacific


  1. Detecting Anomalous Computation with RNNs on GPU-Accelerated HPC Machines. Pengfei Zou, Rong Ge (Clemson University); Ang Li, Kevin Barker (Pacific Northwest National Laboratory). ICPP 2020

  2. Overview
     - The new threat in HPC
       - Illicit workloads exploit powerful GPUs committed to HPC workloads
     - Our approach
       - Leverage identifiable patterns of HPC workloads
       - Treat illicit workload detection as a classification problem
       - Devise RNN models to infer workloads from high-level profiles
     - Contribution
       - Online illicit workload detection suitable for practical use
         - Over 95% accuracy, with lightweight system-level profiling only
       - Techniques to handle data heterogeneity, irregularity, and loss
       - Advanced RNN modeling for inference accuracy

  3. Illicit Applications on HPC Systems
     - Illicit computations begin running on HPC systems
       - Crypto mining
       - Password cracking
       - Denial-of-service (DoS) attacks
     - Common characteristics
       - For-profit or malicious attacks instead of science
       - Resource intensive
         - Powerful GPU accelerators are ideal
       - Long execution time: days to weeks or longer
     - Risks and security issues to HPC
       - Mission-critical applications deprived of computing cycles
       - Data leaks, system damage, etc.
       - Empowered hacks and attacks

  4. A Unique, New Threat
     - Penetrating login nodes imposes the risks
       - HPC systems only protect login nodes
     - Authorized users can run illicit computations
       - Authorization and authentication are easily passed
     - Few barriers and guards exist
       - Due to performance priority in HPC systems
       - Little or no network traffic monitoring and host auditing
     - Computations are masked and offloaded to accelerators
       - CPU-side monitoring and detection measures would fail
     - Novel security measures are needed to detect illicit computation in HPC

  5. Opportunities and Challenges
     - HPC workloads have unique patterns identifiable by ML
       - A small set of programs with specific resource usage patterns
       - Certain kernels and functions, e.g., FFT, BLAS
     - Accurate ML models use many HW counters as input
       - Large overhead for online detection
       - Intrusive to user applications

  6. Our Approach
     - Online illicit workload detection
       - Illicit GPU computation detection as a classification problem
       - Lightweight, common system-level profiling for model input
       - Multiple input sequences for inference accuracy
       - Synergistic multi-RNNs to handle complex, heterogeneous inputs
     - [Figure: model inputs labeled Aperiodic and Periodic]
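As background for the recurrent models named on this slide, the sketch below shows what a single standard LSTM time step computes. It is a generic textbook cell in plain Python, not the authors' multi-RNN architecture; the names `lstm_step` and `zero_params` and the weight layout are assumptions made for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, W, U, b):
    """One standard LSTM step for a single sequence (illustrative only).

    W[k], U[k], b[k] hold the input weights (hidden x input), recurrent
    weights (hidden x hidden) and bias (hidden) for gate k, where k is
    "i" (input), "f" (forget), "o" (output) or "g" (candidate).
    """
    def affine(k):
        return [
            sum(w * xi for w, xi in zip(W[k][row], x))
            + sum(u * hi for u, hi in zip(U[k][row], h_prev))
            + b[k][row]
            for row in range(len(h_prev))
        ]

    i = [sigmoid(v) for v in affine("i")]
    f = [sigmoid(v) for v in affine("f")]
    o = [sigmoid(v) for v in affine("o")]
    cand = [math.tanh(v) for v in affine("g")]
    # New cell state: keep part of the old state, add part of the candidate.
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, cand)]
    # New hidden state: gated view of the squashed cell state.
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


def zero_params(n_in, n_hidden):
    """Hypothetical helper: all-zero parameters, handy for sanity checks."""
    W = {k: [[0.0] * n_in for _ in range(n_hidden)] for k in "ifog"}
    U = {k: [[0.0] * n_hidden for _ in range(n_hidden)] for k in "ifog"}
    b = {k: [0.0] * n_hidden for k in "ifog"}
    return W, U, b
```

A classifier built on such a cell would feed one profile sample per step and map the final hidden state to a workload class; how the slides' models fuse several such sequences is not specified here.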

  7. Data Heterogeneity
     - Heterogeneity in data sequences
       - Varying sample losses in resource utilization sequences
       - Asynchronism between the types
     - Irregularity of the event-based data sequence

  8. Sample Losses in Utilization Data
     - nvidia-smi profiling loses samples
       - E.g., 30% on average
     - Losses depend on application and sampling interval
       - Different temporal information from different training apps
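To make the notion of a sample loss rate concrete, here is a minimal sketch that estimates it from the timestamps of the samples that did arrive. The function name `loss_ratio`, the slot-based counting, and the example timestamps are assumptions for illustration; the slides do not say how the 30% figure was measured.

```python
def loss_ratio(timestamps, t0, t_end, interval):
    """Fraction of expected periodic samples that are missing.

    Assumes the profiler was asked for one sample every `interval`
    seconds from `t0` to `t_end` inclusive, and that each received
    timestamp falls near one of those expected slots.
    """
    expected = int(round((t_end - t0) / interval)) + 1
    # Map each received sample to its nearest expected slot index.
    seen = {
        int(round((t - t0) / interval))
        for t in timestamps
        if t0 <= t <= t_end
    }
    return 1.0 - len(seen) / expected


# Made-up trace: 10 slots at 0.1 s spacing, 3 of them never delivered.
received = [0.0, 0.1, 0.2, 0.4, 0.5, 0.7, 0.9]
print(f"loss ratio: {loss_ratio(received, 0.0, 0.9, 0.1):.2f}")
```

Running the same function over traces collected at different sampling intervals or for different applications would show the dependence the slide describes.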

  9. LSTM Layers for Advanced Training
     - Split layers for the event-based driver runtime
     - Interpolation layer for the resource utilization sequences
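The slide places interpolation inside the model as a layer. As a simpler stand-in, the sketch below fills missing utilization samples by linear interpolation onto a regular time grid before any model sees them. Linear interpolation, the function name `interpolate_to_grid`, and the hold-at-the-edges behavior are assumptions, not the authors' layer.

```python
def interpolate_to_grid(samples, t0, interval, n):
    """Resample (time, value) pairs onto n regular grid points.

    `samples` must be non-empty and sorted by strictly increasing time.
    Grid points before the first or after the last sample take the
    nearest available value instead of being extrapolated.
    """
    if not samples:
        raise ValueError("need at least one sample")
    out = []
    j = 0
    for k in range(n):
        t = t0 + k * interval
        # Advance j to the last sample at or before grid time t.
        while j + 1 < len(samples) and samples[j + 1][0] <= t:
            j += 1
        t_a, v_a = samples[j]
        if t <= t_a or j + 1 == len(samples):
            out.append(v_a)
        else:
            t_b, v_b = samples[j + 1]
            w = (t - t_a) / (t_b - t_a)
            out.append(v_a + w * (v_b - v_a))
    return out


# Made-up power readings with the sample at t = 2 s lost.
power = [(0.0, 10.0), (1.0, 20.0), (3.0, 40.0), (4.0, 30.0)]
print(interpolate_to_grid(power, 0.0, 1.0, 6))
```

Doing this as preprocessing keeps the model input regular; a learned layer, as the slide suggests, could instead adapt to how losses vary across applications.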

  10. Model Training and Validation
     - Workloads
       - 83 authorized applications
         - Rodinia, Parboil, SHOC, PolyBench, exascale Proxy Apps, etc.
       - 17 unauthorized applications from GitHub and BitBucket
         - Crypto mining, password cracking, brute-force attacking, …
     - Data collection
       - Periodic resource utilization
         - Power, core utilization, memory footprint, memory bandwidth
       - Event-based driver runtime
         - Kernel events: starting time, duration, configuration
         - Data transfer events: starting time, latency, direction, bandwidth
       - HW performance counters for counterpart comparison
     - Three generations of GPUs: K40, P100, and V100
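The two kinds of collected data can be pictured as simple records. The sketch below mirrors the fields listed on this slide; the class names, units, the reading of "configuration" as launch grid and block dimensions, and the `merge_events` helper are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class UtilizationSample:      # periodic resource utilization
    time_s: float
    power_w: float
    core_util_pct: float
    mem_footprint_mb: float
    mem_bandwidth_pct: float


@dataclass
class KernelEvent:            # event-based driver runtime: kernel launch
    start_s: float
    duration_s: float
    grid: Tuple[int, int, int]    # assumed meaning of "configuration"
    block: Tuple[int, int, int]


@dataclass
class TransferEvent:          # event-based driver runtime: data transfer
    start_s: float
    latency_s: float
    direction: str            # e.g. "H2D" or "D2H" (assumed encoding)
    bandwidth_gbs: float


def merge_events(kernels: List[KernelEvent],
                 transfers: List[TransferEvent]) -> List[Tuple[str, float]]:
    """Interleave both event types into one start-time-ordered sequence."""
    tagged = [("kernel", k.start_s) for k in kernels]
    tagged += [("transfer", t.start_s) for t in transfers]
    return sorted(tagged, key=lambda item: item[1])


kernels = [KernelEvent(0.30, 0.02, (64, 1, 1), (256, 1, 1)),
           KernelEvent(0.90, 0.05, (128, 1, 1), (128, 1, 1))]
transfers = [TransferEvent(0.10, 0.004, "H2D", 11.5)]
print(merge_events(kernels, transfers))
```

The utilization records arrive on a fixed period while the event records arrive whenever the application launches work, which is the asynchronism between the two sequence types.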

  11. Selected Evaluation Results [Figure: Accuracy and False NR, vs. HMC based]
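The two metrics named in the figure can be computed from labeled predictions as sketched below, assuming "False NR" stands for false negative rate (illicit workloads classified as authorized). The labels, the function name `accuracy_and_fnr`, and the example data are hypothetical and are not results from the presentation.

```python
def accuracy_and_fnr(y_true, y_pred, positive="illicit"):
    """Return (accuracy, false negative rate) for a binary labeling.

    False negative rate = missed positives / all true positives,
    where the positive class is the illicit workload.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    positives = sum(t == positive for t in y_true)
    missed = sum(t == positive and p != positive
                 for t, p in zip(y_true, y_pred))
    accuracy = correct / len(y_true)
    fnr = missed / positives if positives else 0.0
    return accuracy, fnr


# Made-up labels: 4 illicit runs (1 missed), 6 authorized runs (1 flagged).
truth = ["illicit"] * 4 + ["authorized"] * 6
guess = ["illicit", "illicit", "illicit", "authorized",
         "authorized", "authorized", "authorized", "authorized",
         "authorized", "illicit"]
print(accuracy_and_fnr(truth, guess))
```

For a detector like this one, the false negative rate matters separately from accuracy because authorized applications far outnumber illicit ones in the workload set.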

  12. Conclusion
     - A new threat in HPC
       - Illicit computation takes execution cycles and empowers attacks
     - Our proposed online detection
       - Lightweight profiling
       - Accurate detection with fused LSTMs using multiple data sequences
     - Our findings
       - Illicit workloads have different patterns from HPC workloads
       - Profiling multiple system-level data sequences is sufficient for accurate detection
       - Fused RNNs are suitable for online detection
