Detecting Anomalous Computation with RNNs on GPU-Accelerated HPC Machines
Pengfei Zou, Rong Ge (Clemson University)
Ang Li, Kevin Barker (Pacific Northwest National Laboratory)
ICPP2020

Overview
- The new threat in HPC
  - Illicit workloads exploit powerful GPUs committed to HPC workloads
- Our approach
  - Leverage identifiable patterns of HPC workloads
  - Treat illicit workload detection as a classification problem
  - Devise RNN models to infer workloads from high-level profiles
- Contribution
  - An online illicit workload detection method suitable for practical use
    - > 95% accuracy, with lightweight system-level profiling only
  - Techniques to handle data heterogeneity, irregularity, and loss
  - Advanced RNN modeling for inference accuracy

Illicit Applications on HPC Systems
- Illicit computations begin running on HPC systems
  - Crypto mining
  - Password cracking
  - Denial-of-service (DoS) attacks
- Common characteristics
  - For-profit or malicious attacks instead of science
  - Resource intensive
    - Powerful GPU accelerators are ideal
  - Long execution time: days to weeks or longer
- Risks and security issues to HPC
  - Mission-critical applications deprived of computing cycles
  - Data leaking, system damage, etc.
  - Empowered hacks and attacks

A Unique, New Threat
- Penetrating login nodes poses the risk
  - HPC systems only protect login nodes
- Authorized users can run illicit computations
  - Authorization and authentication easily passed
- Few barriers and guards exist
  - Due to performance priority in HPC systems
  - Little or no network traffic monitoring and host auditing
- Computations masked and offloaded to accelerators
  - CPU-side monitoring and detection measures would fail
Novel security measures are needed to detect illicit computation in HPC

Opportunities and Challenges
- HPC workloads have unique patterns identifiable by ML
  - A small set of programs with specific resource usage patterns
  - Certain kernels and functions, e.g., FFT, BLAS
- Accurate ML models use many HW counters as input
  - Large overhead for online detection
  - Intrusive to user applications

Our Approach
- Online illicit workload detection
  - Illicit GPU computation detection as a classification problem
  - Lightweight, common system-level profiling for model input
  - Multiple input sequences for inference accuracy
  - Synergistic multi-RNNs to handle complex, heterogeneous inputs
[Figure labels: Aperiodic, Periodic]

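
The idea of feeding several input sequences to cooperating RNNs can be illustrated with a minimal sketch. It assumes one LSTM branch per sequence type (periodic utilization and aperiodic events), fused by concatenating the final hidden states ahead of a two-class softmax. The layer sizes, the fusion by concatenation, the random untrained weights, and all function names are assumptions made for illustration; the slides do not specify the authors' architecture.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_lstm(rng, input_dim, hidden_dim):
    """Random (untrained) weights for one LSTM layer; the four gates are stacked."""
    W = rng.normal(scale=0.1, size=(input_dim, 4 * hidden_dim))
    U = rng.normal(scale=0.1, size=(hidden_dim, 4 * hidden_dim))
    b = np.zeros(4 * hidden_dim)
    return W, U, b

def lstm_last_hidden(seq, params):
    """Run the LSTM over seq (T x D) and return the final hidden state."""
    W, U, b = params
    hidden_dim = U.shape[0]
    h = np.zeros(hidden_dim)
    c = np.zeros(hidden_dim)
    for x in seq:
        i, f, o, g = np.split(x @ W + h @ U + b, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h

def classify(periodic_seq, event_seq, p_lstm, e_lstm, V, d):
    """Fuse both branches and return [P(authorized), P(illicit)]."""
    fused = np.concatenate([lstm_last_hidden(periodic_seq, p_lstm),
                            lstm_last_hidden(event_seq, e_lstm)])
    logits = fused @ V + d
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

rng = np.random.default_rng(0)
periodic = rng.random((50, 4))   # placeholder data: 50 utilization samples, 4 metrics
events = rng.random((20, 3))     # placeholder data: 20 events, 3 features
p_lstm = init_lstm(rng, 4, 8)
e_lstm = init_lstm(rng, 3, 8)
V = rng.normal(scale=0.1, size=(16, 2))
d = np.zeros(2)
probs = classify(periodic, events, p_lstm, e_lstm, V, d)
```

Because each branch consumes its own sequence, the two inputs may differ in length and sampling pattern, which is the property the heterogeneous-input design relies on.
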
Data Heterogeneity
- Heterogeneity in data sequences
  - Varying sample losses in resource utilization sequences
  - Asynchronism between the types
- Irregularity of event-based data sequence

Sample Losses in Utilization Data
- nvidia-smi profiling loses samples
  - E.g., 30% on average
- Losses depend on application and sampling interval
  - Different temporal information from different training apps

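
As a rough illustration of how a sample loss rate could be measured, the sketch below compares the number of samples actually recorded with the number expected at the nominal sampling interval. The function name and the toy timestamps are hypothetical, and the slides do not state how the authors computed their figure.

```python
import numpy as np

def sample_loss_rate(timestamps, interval):
    """Fraction of expected samples missing between the first and last timestamp."""
    t = np.asarray(timestamps, dtype=float)
    expected = int(round((t[-1] - t[0]) / interval)) + 1
    return 1.0 - len(t) / expected

# Toy trace (made up): nominal 0.1 s interval, 4 of the 10 expected samples missing.
trace = [0.0, 0.1, 0.2, 0.5, 0.6, 0.9]
loss = sample_loss_rate(trace, 0.1)
```
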
LSTM Layers for Advanced Training
- Split layers for the event-based driver runtime
- Interpolation layer for the resource utilization sequences

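
The slides do not describe how the interpolation layer works internally. As a stand-in, the sketch below fills lost utilization samples by plain linear interpolation onto a uniform time grid, done as preprocessing rather than inside the network. The function name and data are hypothetical, and the authors' layer may differ substantially.

```python
import numpy as np

def resample_uniform(timestamps, values, interval):
    """Linearly interpolate an irregularly sampled series onto a uniform grid."""
    t = np.asarray(timestamps, dtype=float)
    v = np.asarray(values, dtype=float)
    n = int(round((t[-1] - t[0]) / interval)) + 1
    grid = t[0] + interval * np.arange(n)
    return grid, np.interp(grid, t, v)

# Toy series (made up): the sample at t = 2 was lost.
grid, filled = resample_uniform([0, 1, 3, 4], [10, 20, 40, 50], interval=1)
```
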
Model Training and Validation
- Workloads
  - 83 authorized applications
    - Rodinia, Parboil, SHOC, PolyBench, exascale proxy apps, etc.
  - 17 unauthorized applications from GitHub and Bitbucket
    - Crypto mining, password cracking, brute-force attacking…
- Data collection
  - Periodic resource utilization
    - Power, core utilization, memory footprint, memory bandwidth
  - Event-based driver runtime
    - Kernel events: starting time, duration, configuration
    - Data transfer events: starting time, latency, direction, bandwidth
  - HW performance counters for counterpart comparison
- Three generations of GPUs: K40, P100, and V100

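
The collected data could be represented along the following lines. The record fields mirror the items listed on the slide, but the class names, field names, units, and the reading of kernel "configuration" as grid and block sizes are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List
import numpy as np

@dataclass
class UtilizationSample:      # one periodic sample
    time_s: float
    power_w: float
    core_util_pct: float
    mem_footprint_mb: float
    mem_bandwidth_pct: float

@dataclass
class KernelEvent:            # one kernel launch
    start_s: float
    duration_s: float
    grid_size: int            # assumed meaning of "configuration"
    block_size: int           # assumed meaning of "configuration"

@dataclass
class TransferEvent:          # one data transfer
    start_s: float
    latency_s: float
    host_to_device: bool      # direction
    bandwidth_gbps: float

def utilization_matrix(samples: List[UtilizationSample]) -> np.ndarray:
    """Stack periodic samples into a (T x 4) model input."""
    return np.array([[s.power_w, s.core_util_pct,
                      s.mem_footprint_mb, s.mem_bandwidth_pct] for s in samples])

def kernel_matrix(events: List[KernelEvent]) -> np.ndarray:
    """Stack kernel events into an (N x 4) model input."""
    return np.array([[e.start_s, e.duration_s,
                      e.grid_size, e.block_size] for e in events], dtype=float)

# Made-up values, for shape illustration only.
util = utilization_matrix([UtilizationSample(0.0, 120.0, 95.0, 2048.0, 60.0),
                           UtilizationSample(0.1, 125.0, 97.0, 2048.0, 62.0)])
kern = kernel_matrix([KernelEvent(0.02, 0.005, 1024, 256)])
```
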
Selected Evaluation Results
[Figure panels: Accuracy; False NR; vs. HMC based]

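
For reference, the two reported metrics could be computed as sketched below. This assumes that "False NR" denotes the false negative rate, with illicit workloads as the positive class. The labels are toy values, not the authors' results.

```python
def accuracy_and_fnr(y_true, y_pred):
    """Return (accuracy, false negative rate); label 1 = illicit, 0 = authorized."""
    pairs = list(zip(y_true, y_pred))
    correct = sum(1 for t, p in pairs if t == p)
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = correct / len(pairs)
    fnr = fn / (tp + fn) if (tp + fn) else 0.0
    return accuracy, fnr

# Toy labels (made up): one illicit run is missed, one authorized run is flagged.
acc, fnr = accuracy_and_fnr([1, 1, 1, 0, 0, 0, 0, 1],
                            [1, 0, 1, 0, 0, 1, 0, 1])
```

A missed illicit workload is the costly error for this kind of detector, so the false negative rate is reported alongside accuracy.
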
Conclusion
- A new threat in HPC
  - Illicit computation takes execution cycles and empowers attacks
- Our proposed online detection
  - Lightweight profiling
  - Accurate detection with fused LSTMs using multiple data sequences
- Our findings
  - Illicit workloads have different patterns from HPC workloads
  - Multiple system-level profiles are sufficient for accurate detection
  - Fused RNNs are suitable for online detection