Recursive Binary Partitioning: Old Dogs with New Tricks. KDD Conference 2009. David J. Slate and Peter W. Frey

Background • D. J. Slate and L. R. Atkin, “Chess 4.5 – the Northwestern University chess program”. In P. W. Frey (Ed.), Chess Skill in Man and Machine, Springer-Verlag, 1977, 1978, 1983. • P. W. Frey, “Algorithmic Strategies for Improving the Performance of Game-Playing Programs”. In D. Farmer, A. Lapedes, N. Packard and B. Wendroff (Eds.), Evolution, Games and Learning, North-Holland Physics Publishing, Amsterdam, 1986. • P. W. Frey and D. J. Slate, “Letter Recognition Using Holland-Style Adaptive Classifiers”, Machine Learning, 6, 1991, 161-182.

Database Characteristics • Hundreds of Thousands of Records • Missing Data • Erroneous Data Entries

Forecasting Challenges • Categorical Attributes and/or Outcomes • Non-Monotonic Relationships between Attributes and the Outcome • Skewed or Bimodal Numerical Distributions • Non-Additive Attribute Influence on Outcomes • Multiple Attribute Combinations that Produce Desirable Outcomes

Recursive Binary Partitioning • J. A. Sonquist and J. N. Morgan, “The Detection of Interaction Effects”, Institute for Social Research Monograph No. 35, Ann Arbor: University of Michigan, 1964. • G. V. Kass, “An Exploratory Technique for Investigating Large Quantities of Categorical Data”, Applied Statistics, 29:2, 1980, 119-127. • L. Breiman, J. H. Friedman, R. A. Olshen and C. J. Stone, Classification and Regression Trees, Pacific Grove, CA: Wadsworth, 1984.

Advantages of RBP • Rational Treatment of Missing Data • Numerical Distribution Is Not Relevant • Monotonic Relationship Not Required • Okay with Multiple “Flavors” of a Good Outcome • Non-Additive Relationships Are Not a Problem • Large Data Sets Are an Advantage • Computational Time Is Reasonable • Methodological Transparency
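To illustrate why a monotonic relationship is not required, here is a minimal sketch of recursive binary partitioning. It assumes numeric attributes, a 0/1 outcome, and reduction in squared error as the split score. The function names (`sse`, `best_split`, `grow`, `predict`) are hypothetical, and the slides do not say which split rule the authors used.

```python
from statistics import mean

def sse(ys):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not ys:
        return 0.0
    m = mean(ys)
    return sum((y - m) ** 2 for y in ys)

def best_split(rows, ys):
    """Return (gain, attribute, threshold) for the best binary split, or None."""
    best = None
    for j in range(len(rows[0])):
        values = sorted({r[j] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # midpoint between adjacent distinct values
            left = [y for r, y in zip(rows, ys) if r[j] <= t]
            right = [y for r, y in zip(rows, ys) if r[j] > t]
            gain = sse(ys) - sse(left) - sse(right)
            if best is None or gain > best[0]:
                best = (gain, j, t)
    return best

def grow(rows, ys):
    """Recursively partition the records until no split improves the fit."""
    split = best_split(rows, ys)
    if split is None or split[0] <= 1e-12:
        return {"value": mean(ys)}  # leaf: average outcome of its records
    _, j, t = split
    left = [(r, y) for r, y in zip(rows, ys) if r[j] <= t]
    right = [(r, y) for r, y in zip(rows, ys) if r[j] > t]
    return {
        "attr": j,
        "threshold": t,
        "left": grow([r for r, _ in left], [y for _, y in left]),
        "right": grow([r for r, _ in right], [y for _, y in right]),
    }

def predict(node, record):
    """Follow the splits down to a leaf and return its value."""
    while "value" not in node:
        go_left = record[node["attr"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["value"]

# Non-monotonic toy data: the outcome is 1 only in the middle of the range.
rows = [[x] for x in range(1, 10)]
ys = [0, 0, 0, 1, 1, 1, 0, 0, 0]
tree = grow(rows, ys)
```

Two successive splits isolate the middle range, which a single linear term in `x` could not do.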

Problems With RBP • A Greedy, Myopic Algorithm • Overfits the Training Sample • Overshadowing of Useful Attributes

Attacking the Problems • Look-Ahead Search • Minimum Record Count for Leaf Node • Minimum Split Score for Leaf Node • Random Perturbation of Attribute Availability at Each Node • Random Perturbation of Record Availability at Each Node
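The stopping rules and random perturbations listed above could be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the helper names and the parameters `omit_prob`, `keep_fraction`, `min_records` and `min_score` are hypothetical, and the slides do not describe how the perturbation is actually drawn.

```python
import random

def available_attributes(n_attributes, omit_prob, rng):
    """Randomly omit attributes at a node, always keeping at least one."""
    kept = [j for j in range(n_attributes) if rng.random() >= omit_prob]
    return kept or [rng.randrange(n_attributes)]

def available_records(n_records, keep_fraction, rng):
    """Randomly choose which of a node's records may be used to score splits."""
    k = max(1, round(n_records * keep_fraction))
    return sorted(rng.sample(range(n_records), k))

def should_stop(n_left, n_right, split_score, min_records, min_score):
    """Declare a leaf if either child is too small or the split is too weak."""
    too_small = n_left < min_records or n_right < min_records
    return too_small or split_score < min_score
```

Because the random draw is repeated at every node, an attribute that is usually overshadowed by a stronger one still gets chosen in some nodes of some trees.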

Ensemble RBP • Split Rule • Terminal Nodes • Leaf Node Values • Missing Values • Ensemble of Decision Trees • Parameter Tuning
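One way the pieces listed above might fit together at prediction time is sketched below. It assumes that each tree is a nested dictionary, that a missing value is represented by `None` and routed to a branch stored on the split under a hypothetical `"missing"` key, and that the ensemble output is the plain average of the trees' leaf values. The slides name these components but do not specify any of these details.

```python
def predict_tree(node, record):
    """Walk one tree from the root to a leaf and return the leaf value."""
    while "value" not in node:
        x = record[node["attr"]]
        if x is None:
            branch = node["missing"]  # assumed: split stores a branch for missing values
        else:
            branch = "left" if x <= node["threshold"] else "right"
        node = node[branch]
    return node["value"]

def predict_ensemble(trees, record):
    """Average the leaf values produced by every tree in the ensemble."""
    return sum(predict_tree(t, record) for t in trees) / len(trees)

# Two hand-built trees, for illustration only.
tree_a = {"attr": 0, "threshold": 5, "missing": "right",
          "left": {"value": 0.2}, "right": {"value": 0.8}}
tree_b = {"value": 0.4}
```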

KDD Cup: Preprocessing • Removed Attributes with a Constant Value • No Normalization • Retained Missing Values • No Limit on Range of Numerical Attributes • Retained Duplicate Attributes • No Generation of Additional Features • No Modification of Categorical Attributes
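The only transformation in this list, dropping constant attributes, is small enough to sketch. The function name is hypothetical, and treating a missing value (`None` here) as a distinct value is an assumption, so an attribute that is constant apart from missing entries is retained.

```python
def drop_constant_attributes(rows):
    """Return (kept column indices, rows restricted to those columns)."""
    n_attributes = len(rows[0])
    keep = [j for j in range(n_attributes)
            if len({r[j] for r in rows}) > 1]  # more than one distinct value
    return keep, [[r[j] for j in keep] for r in rows]

# Column 1 is constant; column 2 is missing (None) in most records.
rows = [[1, 5, None], [2, 5, None], [3, 5, 0.1]]
keep, reduced = drop_constant_attributes(rows)
```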

KDD Cup: Attribute Selection • Preliminary Ensemble Construction for Selection of Attributes • Preliminary Traditional RBP for Selection of Attributes

KDD Cup: Model Building • Ensemble RBP Methodology Using Random Attribute Omission at Each Node • 40,000 Record Construction Set • 10,000 Record Test Set • 5-Fold Cross Validation to Select Parameters • Final Models Built on 50,000 Records
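The record counts above are what 5-fold cross validation on 50,000 records produces: each fold holds out 10,000 records and builds on the other 40,000. A minimal sketch of such a split follows; the function name and the fixed shuffling seed are assumptions, since the slides do not say how records were assigned to folds.

```python
import random

def cross_validation_splits(n_records, n_folds=5, seed=0):
    """Yield (construction_indices, test_indices) for each fold."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        test = folds[k]
        construction = [i for m, fold in enumerate(folds) if m != k for i in fold]
        yield construction, test

splits = list(cross_validation_splits(50_000))
```

A parameter setting would be scored on the five test sets and the best one reused to build the final model on all 50,000 records.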

Observations • 15,000 Attributes and 50,000 Records • Binary rather than Numeric Outcomes • Categorical Attributes without Identifying Information
