An Introduction to Neural Network Rule Extraction Algorithms
By Sarah Jackson
Can we trust magic?
• Neural networks are machine learning black boxes: they produce seemingly magical, unexplainable results.
• Problems:
  - People are reluctant to trust neural networks because it is difficult to understand how they reach their answers.
  - The end result is not always the only thing we are looking for.
  - This opacity is an unacceptable risk in certain scenarios.
Why do we want them then?
• Neural networks have been shown to classify data accurately.
• Neural networks are capable of learning and classifying in ways that other machine learning techniques may not be.
Who cares about rules?
• Rules help bridge the gap between connectionist and symbolic methods.
• Rule extraction from neural networks will increase their acceptance.
• Rules also improve the usefulness of the knowledge gathered from neural networks.
What do we do with these rules?
• Validation: we can verify that something meaningful has been learned.
• Integration: rules can be used with symbolic systems.
• Theory discovery: rules may reveal relationships that would not have been seen otherwise.
• Explanation ability: rules allow exploration of the knowledge in the network.
Are the rules good?
• Accuracy: the rules correctly classify unseen examples.
• Fidelity: the rules exhibit the same behavior as the neural network.
• Consistency: rules extracted under different training sessions classify unseen examples the same way.
• Comprehensibility: measured by the size of the rule set and the number of clauses per rule.
The first two criteria are sketched in code below.
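As a concrete reading of accuracy and fidelity, here is a minimal Python sketch; rules_predict and network_predict are hypothetical stand-ins for an extracted rule set and a trained network, each a callable mapping one example to a class label.

    def accuracy(rules_predict, examples, true_labels):
        """Fraction of unseen examples the extracted rules classify correctly."""
        hits = sum(rules_predict(x) == y for x, y in zip(examples, true_labels))
        return hits / len(examples)

    def fidelity(rules_predict, network_predict, examples):
        """Fraction of examples on which the rules mimic the network's output."""
        agree = sum(rules_predict(x) == network_predict(x) for x in examples)
        return agree / len(examples)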
How does extraction work?
• Knowledge in a neural network is represented by numerical weights.
• Extraction algorithms analyze this numerical data either directly or indirectly.
• The network's behavior is then re-expressed in a human-readable, symbolic form.
Decompositional Algorithms
• Knowledge is extracted from each node in the network individually; each node's rules are expressed in terms of the previous layer.
• The extracted rules are usually simply described and accurate.
• Require a threshold approximation for each node.
• Limitations: restricted generalization and scalability; may require a special training procedure, a special network architecture, or sigmoidal transfer functions for hidden nodes.
A toy single-neuron example follows.
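To make the per-node idea concrete, here is a toy Python sketch (not any specific published algorithm): for a single thresholded neuron with binary inputs, exhaustively enumerate the input vectors whose weighted sum exceeds the threshold; each satisfying vector is a rule under which the neuron fires. The weights and bias are illustrative (the same example neuron reappears in the BDT bounds table later), and the exponential enumeration is exactly the scalability problem noted above.

    from itertools import product

    def neuron_rules(weights, bias, threshold=0.0):
        rules = []
        for inputs in product([0, 1], repeat=len(weights)):
            activation = bias + sum(w * x for w, x in zip(weights, inputs))
            if activation > threshold:
                rules.append(inputs)  # this input pattern makes the neuron fire
        return rules

    print(neuron_rules([-0.25, 0.65, -0.48, 0.72], bias=-1.0))
    # [(0, 1, 0, 1), (1, 1, 0, 1)]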
Global Algorithms
• Describe output nodes as functions of the input nodes; the internal structure of the network is not important.
• Typically represent the network as a decision tree, then extract rules from the constructed tree (see the sketch below).
• May become inefficient as the complexity of the network grows.
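A minimal sketch of the global idea, using scikit-learn as a stand-in for the C4.5/CART-style learners the sources use; the data is synthetic and purely illustrative.

    import numpy as np
    from sklearn.neural_network import MLPClassifier
    from sklearn.tree import DecisionTreeClassifier, export_text

    # Synthetic stand-in data: 9 binary features, majority-vote target.
    rng = np.random.default_rng(0)
    X = rng.integers(0, 2, size=(500, 9))
    y = (X.sum(axis=1) > 4).astype(int)

    # Train the "black box" network...
    net = MLPClassifier(hidden_layer_sizes=(20,), max_iter=2000).fit(X, y)

    # ...then fit a tree to the NETWORK'S predictions, not the true labels,
    # treating the network purely as an oracle; its internals are ignored.
    tree = DecisionTreeClassifier(max_depth=4).fit(X, net.predict(X))
    print(export_text(tree))

Because the tree is fit to the network's outputs rather than the true labels, the tree's agreement with the network is its fidelity.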
Combinatorial Algorithms
• Use aspects of both decompositional and global algorithms.
• Both the network architecture and the values of the weights are needed.
• Attempt to gain the advantages of each approach without the disadvantages.
TREPAN
• "Trees Parroting Networks."
• A global method: represents the network's knowledge through a decision tree.
• Uses the same basic tree construction as C4.5 and CART, but grows the tree best-first rather than depth-first.
TREPAN
• The classes used for the decision tree are those defined by the neural network.
• A list of leaf nodes is kept, each with associated data:
  - a subset of the training data,
  - a set of complementary (artificially generated) data,
  - a set of constraints that the node's data must satisfy.
• These data sets determine whether a node should be divided further or left as a terminal leaf.
TREPAN
• Nodes are removed from the list when they are split or become terminal leaves; they are never added back, but their children are added.
• The decision function determines the type of decision tree constructed (illustrated below):
  - M-of-N: each node represents an m-of-n test.
  - 1-of-N: each node represents a 1-of-n test.
  - Simple: each node tests a single attribute (true or false).
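A minimal sketch of the m-of-n test (the conditions below are made up): it passes when at least m of its n boolean conditions hold, so a 1-of-n test is the m=1 case and a simple single-attribute test is the m=n=1 case.

    def m_of_n(m, conditions, example):
        """True if at least m of the n condition predicates hold on example."""
        return sum(cond(example) for cond in conditions) >= m

    # A 2-of-3 test over a feature vector x:
    test = [lambda x: x[0] == 1, lambda x: x[2] == 1, lambda x: x[5] == 0]
    print(m_of_n(2, test, [1, 0, 1, 0, 0, 1]))  # True: the first two conditions hold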
TREPAN
• Comparison on the UCI Tic-Tac-Toe data set.
• Network: 27 inputs, 20 hidden nodes, 2 outputs.
TREPAN
• Typically, the shortest tree is the easiest to understand.
• The m-of-n tree has the fewest nodes, however, yet is very difficult to understand, since each node aggregates many conditions.
• TREPAN provides higher-quality information.
Another Global Algorithm
• Uses only the training data to construct the decision tree (TREPAN uses the training data and may also use artificially generated data).
• Based on the CN2 and C4.5 symbolic learning algorithms.
BDT
• Bound Decomposition Tree.
• A decompositional algorithm.
• Designed with the goals of no retraining, high accuracy, and low complexity.
• Works for multi-layer perceptrons.
BDT
• The maximum upper bound of a neuron occurs when all inputs with positive weights have a value of 1 and all inputs with negative weights have a value of 0.
• The minimum lower bound occurs when only the inputs with negative weights have a value of 1 and the inputs with positive weights have a value of 0.
BDT
• Each neuron has its own minimum and maximum bounds.
• The minimum is the bias plus the sum of all negative weights; the maximum is the bias plus the sum of all positive weights (sketched in code below). For example:

    Input      Weight   Min Bound   Max Bound
    I1         -0.25    -0.25
    I2          0.65                 0.65
    I3         -0.48    -0.48
    I4          0.72                 0.72
    Bias (-1)   1       -1          -1
    Total               -1.73        0.37
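The table's arithmetic as a tiny Python function (the function name is mine; the values are the example above):

    def neuron_bounds(weights, bias):
        lo = bias + sum(w for w in weights if w < 0)  # min: negative weights on
        hi = bias + sum(w for w in weights if w > 0)  # max: positive weights on
        return lo, hi

    print(neuron_bounds([-0.25, 0.65, -0.48, 0.72], bias=-1.0))
    # approximately (-1.73, 0.37)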
BDT
• Each neuron's input space (a cube) is divided into two subcubes based on the first input: one subcube assumes the value 0 and the other assumes 1.
• The remaining inputs are used to construct the input vectors for each subcube.
• Bounds are calculated for each subcube:
  - Positive subcube: the lower bound is positive.
  - Negative subcube: the upper bound is negative.
  - Uncertain subcube: the lower bound is negative and the upper bound is positive.
BDT
• A positive subcube always fires, so it represents a rule for the neuron; a negative subcube never fires.
• Uncertain subcubes must be subdivided further until only positive and/or negative subcubes remain (see the sketch below).
• The rules for a neuron are the set of all input vectors on its positive subcubes.
• A threshold Δ > 0 can be used in place of 0 to prune borderline subcubes and simplify the neuron.
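A minimal sketch of the subdivision loop under the assumptions above; the names are my own, and the paper's tree bookkeeping and Δ pruning are omitted (replacing the 0 thresholds with ±Δ would implement the pruning). A returned tuple shorter than the weight vector leaves the remaining inputs as don't-cares.

    def extract_rules(weights, bias, fixed=()):
        i = len(fixed)                                    # next input to fix
        base = bias + sum(w * v for w, v in zip(weights, fixed))
        lo = base + sum(w for w in weights[i:] if w < 0)  # all-negatives completion
        hi = base + sum(w for w in weights[i:] if w > 0)  # all-positives completion
        if lo > 0:
            return [fixed]   # positive subcube: fires for any completion
        if hi <= 0:
            return []        # negative subcube: never fires
        # Uncertain subcube: subdivide on input i.
        return (extract_rules(weights, bias, fixed + (0,)) +
                extract_rules(weights, bias, fixed + (1,)))

    # The example neuron from the bounds table:
    print(extract_rules([-0.25, 0.65, -0.48, 0.72], bias=-1.0))
    # [(0, 1, 0, 1), (1, 1, 0, 1)] -- the neuron fires iff I2=1, I3=0, I4=1

This matches the brute-force enumeration sketched earlier, but the bound tests let entire subcubes be accepted or rejected without enumerating every completion.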
Sources
Milare, R., De Carvalho, A., & Monard, M. (2002). An Approach to Explain Neural Networks Using Symbolic Algorithms. International Journal of Computational Intelligence and Applications, 2(4), 365-376.
Heh, J. S., Chen, J. C., & Chang, M. (2008). Designing a decompositional rule extraction algorithm for neural networks with bound decomposition tree. Neural Computing and Applications, 17, 297-309.
Nobre, C., Martinelle, E., Braga, A., De Carvalho, A., Rezende, S., Braga, J. L., & Ludermir, T. (1999). Knowledge Extraction: A Comparison between Symbolic and Connectionist Methods. International Journal of Neural Systems, 9(3), 257-264.