Learn more from your logfiles using machine learning [DEV1156]
Adam.Spiers@suse.com
Dirk.Mueller@suse.com
CC BY-NC 2.0 Thomas Hawk
We are SUSE OpenStack Cloud software engineers
We love green CI
We care about upstream OpenStack CI too
OpenStack Health
?
Did you find it?
Manual Process
Idea: Reducing scrolling by pattern matching
    warning /(?i)warning/
    error /Traceback \(most recent call last\)/
    error /(?i)error/
    error /(?i)\bfail(ure|ed)?\b/
    error /(?i)fatal/
    error /$h1!!/
Dealing with false positives
    # Successful tempest run
    ok /^ - (Expected Fail|Failed): 0$/
    ok /Warning: Turning on '--gpg-auto-import-keys'/
    ok /Warning: Permanently added .* to the list of known hosts/
    ok /WARNING: Device for PV .* not found or rejected by a filter/
    ok /WARNING: \w+ signature detected on .* offset \d+. Wipe it?/
    ok /grep -v failed\b/
    # rpms containing "Error"
    ok /perl-Error[ -]|libsamba-errors|mariadb-errormessages/
    # https://bugzilla.suse.com/show_bug.cgi?id=1030822
    warning /Cleaning up (vip-admin-\S+) on \S+, removing fail-count-\1/
    # https://bugzilla.suse.com/show_bug.cgi?id=971832
    ok /Failed to try-restart vsftpd@.service: Unit name vsftpd@.service is not va
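The rule lists above can be tried out with a few lines of Python. This is only a sketch: it assumes that rules are checked in order and that the first matching rule decides the line's status, which is why the "ok" exceptions are listed before the generic patterns. The `classify` helper and the sample log lines in the usage note are hypothetical; the patterns are a subset of the ones on the slides.

```python
import re

# (status, pattern) pairs, checked top to bottom. Assumption: first match wins,
# so the "ok" exceptions must come before the generic warning/error patterns.
RULES = [
    ("ok", r"Warning: Permanently added .* to the list of known hosts"),
    ("ok", r"perl-Error[ -]|libsamba-errors|mariadb-errormessages"),
    ("warning", r"(?i)warning"),
    ("error", r"Traceback \(most recent call last\)"),
    ("error", r"(?i)error"),
    ("error", r"(?i)\bfail(ure|ed)?\b"),
    ("error", r"(?i)fatal"),
]
COMPILED = [(status, re.compile(pattern)) for status, pattern in RULES]


def classify(line):
    """Return the status of the first rule matching the line, or None."""
    for status, regex in COMPILED:
        if regex.search(line):
            return status
    return None
```

For example, `classify("Installing perl-Error-0.17025")` yields `"ok"` because the package-name exception matches before the generic error pattern, while `classify("mkcloud step failed")` yields `"error"`. Every new false positive needs another hand-written exception, which is the maintenance burden that motivates the machine learning approach.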
Vision: Machine Learning
Log-Classify
Today's plan
• Intro to Machine Learning
• Log-Classify
• Implementation
• Demo
AI vs ML vs DL
Why Machine Learning?
[Figure: word cloud of machine learning terms]
CI Logfiles: ML Challenges
• Each instance of a CI logfile executes the same steps
  – Install, Build, Test
  – Result is recorded (success, failures)
• The individual logfiles are quickly evolving
  – Every check-in changes them 😑
• Each run has a lot of completely unique noise 😓
  – Timestamps, UUIDs, passwords
  – Ordering due to parallel execution
Learning model Variations
Instance-based (e.g. k-Nearest-Neighbor)
• Directly stores instances of training data
• Derives hypotheses directly from training instances
• Model can quickly react to new training input
• Model can be incrementally updated, discarding old training input
Generalizing (e.g. Artificial Neural Networks (DL))
• Abstracts a model from training data
• Requires a much longer training phase
• Cannot "untrain" previously learned data
Overfitting / Underfitting
Machine Learning Variations
Supervised
• Classification: Naive Bayes, NearestNeighbor, Support Vector Machines (SVM), Neural Networks, ...
• Regression: Decision Trees, Linear Regression, Neural Networks, ...
Unsupervised
• Clustering: K-Means, Hidden Markov Model, Neural Networks, ...
Supervised Learning: Classification
Banana
Using machine learning for CI log files
Machine Learning Workflow
• Build: an individual CI log file
• Baseline: collection of log files from good CI runs
• Target: the log file of the failed CI run to be analyzed
log-classify: Analogy using pictures
Generic Training Workflow
Generic Testing Workflow
Generic Testing Workflow
Log Input transformation example
Splitting by lines
    Mar 11 02:43:28 localhost sudo[5195]: pam_unix(sudo:session): session opened for user root by (uid=5)
Tokenization
    DATE localhost sudo pam_unix sudo session session opened for user root uid
Hashing
    hash(DATE) hash(localhost) hash(sudo) hash(pam_unix) hash(sudo) hash(session) hash(opened) ...
Transformation
    [0, ...., 0, 1, 0, ..., 0, 1, 0, ...]
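The four steps above can be sketched with the Python standard library alone. This is an illustration, not log-classify's actual code: the tokenizer regex, the SHA-256 based bucket function, and the tiny vector size `N_FEATURES` are all assumptions chosen so that the output for the slide's example line is easy to inspect.

```python
import hashlib
import re

N_FEATURES = 32  # assumption: tiny on purpose; real vectorizers use far more buckets

# Assumption: a syslog-style timestamp at the start of the line becomes DATE.
SYSLOG_DATE = re.compile(r"^[A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}")


def tokenize(line):
    """Replace the timestamp, then keep words of 3+ letters or underscores."""
    line = SYSLOG_DATE.sub("DATE", line)
    return re.findall(r"[A-Za-z_]{3,}", line)


def bucket(token):
    """Map a token to a stable vector index (the hashing trick)."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % N_FEATURES


def vectorize(line):
    """Return a fixed-length 0/1 vector marking which buckets are hit."""
    vector = [0] * N_FEATURES
    for token in tokenize(line):
        vector[bucket(token)] = 1
    return vector
```

Hashing means no vocabulary has to be stored or updated when new words appear in a log, at the cost of occasional collisions where two different tokens share a bucket.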
Input transformation: Replace irrelevant pieces with fixed strings
    Raw text                                      Token
    months/days/date                              DATE
    UUIDs                                         RNGU
    IPv4 or IPv6 addresses                        RNGI
    words that are exactly 32, 64 or 128 chars    RNGN
    numbers of at least 3 digits                  RNGD
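A rough sketch of such a replacement pass is shown below. The token names come from the table; the regular expressions are assumptions written for this example and are simpler than what a real tool needs (for instance, only IPv4 addresses and syslog-style timestamps are handled).

```python
import re

# Order matters: specific patterns run before the generic number rule,
# otherwise the digits inside UUIDs or IP addresses would be replaced first.
REPLACEMENTS = [
    (re.compile(r"\b[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}\b"), "RNGU"),
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "RNGI"),  # IPv4 only in this sketch
    (re.compile(r"\b(?:\w{128}|\w{64}|\w{32})\b"), "RNGN"),
    (re.compile(r"\b(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)"
                r"\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}\b"), "DATE"),
    (re.compile(r"\b\d{3,}\b"), "RNGD"),
]


def normalize(line):
    """Replace run-specific noise with fixed tokens so similar lines compare equal."""
    for regex, token in REPLACEMENTS:
        line = regex.sub(token, line)
    return line
```

After this pass, two runs that differ only in timestamps, process IDs, addresses, or request IDs produce identical lines, so they map to the same feature vector.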
Example matrix of a CI logfile
k-Nearest Neighbors (k=1)
Example distance calculation in kNeighbors queries
• VARIABLE IS NOT DEFINED is not part of the baseline
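The idea behind the distance query can be sketched in plain Python: every baseline line is stored as a vector, and a target line is scored by its distance to the closest stored vector (k=1). The bag-of-words vectors, the cosine distance, and the baseline lines below are assumptions made for this illustration; a brute-force scan stands in for a library nearest-neighbor index.

```python
import math
from collections import Counter


def vectorize(line):
    """Bag-of-words vector: lowercase token -> count."""
    return Counter(line.lower().split())


def cosine_distance(a, b):
    """0.0 for identical direction, 1.0 when no token is shared."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return 1.0 if norm == 0 else 1.0 - dot / norm


def nearest_distance(model, line):
    """Distance from the line to its single nearest baseline vector (k=1)."""
    target = vectorize(line)
    return min(cosine_distance(target, stored) for stored in model)


# Hypothetical baseline taken from "good" runs.
baseline = ["TASK setup ok", "TASK install packages ok", "TASK run tests ok"]
model = [vectorize(line) for line in baseline]
```

A line that also occurs in the baseline, such as `"TASK run tests ok"`, scores a distance of 0, while `"VARIABLE IS NOT DEFINED"` shares no token with any baseline line and scores 1.0, so it stands out as an anomaly worth showing to the user.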
Limitations
• Nearest Neighbor performs a linear search in the model
• Complexity grows linearly with sample size
• Unfiltered noise may distract from important information
• Logs containing too many features
Unique vectors over training set instances
Lookup time per sample size
Introducing log-classify
Log-classify
• Python 3
• scikit: http://scikit-learn.org/
  (not yet: https://www.tensorflow.org/)
• https://github.com/facebookresearch/pysparnn
• Multiple Text Extraction Models
• Assumes text, line-based, log-like input
scikit-learn
log-classify: Installation
openSUSE Leap/Tumbleweed/SLE 15 SUSE Package Hub:
    $ zypper install python3-logreduce
Others install from PyPI:
    $ pip3 install --user logreduce
NOTE:
• log-classify is the new name
• Rename from logreduce hasn't been completed yet