  1. Discovering Conditionally Salient Features with Statistical Guarantees. Jaime Roquero Gimenez, James Zou. Stanford University.

  2. Feature Selection
     Setting the problem: a dataset with $d$ features $X_1, \ldots, X_d$ and a response variable $Y$.
     Goal: find the set of important variables $H_1 \subset \{1, \ldots, d\}$.
     A variable $j \in H_0$ is null (i.e. irrelevant for predicting $Y$) if $X_j \perp\!\!\!\perp Y \mid X_{-j}$; otherwise, we say that $j \in H_1$ is non-null.
     Construct a procedure that outputs an estimate $\hat{S}$ of $H_1$, with False Discovery Rate control as the statistical guarantee:
     $$\mathrm{FDR} = \mathbb{E}\left[\frac{|\hat{S} \cap H_0|}{|\hat{S}| \vee 1}\right]$$
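
A minimal sketch in code (the helper and the toy numbers are ours, not the slides'): the false discovery proportion of a selection, whose expectation over repeated experiments is the FDR.

```python
# False discovery proportion (FDP) of a selection S_hat when the true
# null set H0 is known, e.g. in a simulation. FDR = E[FDP].
def fdp(S_hat: set, H0: set) -> float:
    """|S_hat ∩ H0| / (|S_hat| ∨ 1)."""
    return len(S_hat & H0) / max(len(S_hat), 1)

# Toy example with d = 9 features: features 1 and 3 are non-null.
H0 = {2, 4, 5, 6, 7, 8, 9}
print(fdp({1, 3, 4}, H0))  # one false discovery out of three -> 0.333...
```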

  3. Feature Selection in a Linear Model
     Fit a linear model to the data:
     $$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4 + \cdots + \beta_d X_d + \epsilon$$
     Which variables are important? Those whose coefficients are non-zero:
     $\beta_1, \beta_3 \neq 0 \Rightarrow 1, 3 \in H_1$
     $\beta_2 = \beta_4 = \cdots = \beta_d = 0 \Rightarrow 2, 4, \ldots, d \in H_0$
     In this model, non-null features are global non-nulls: we have $H_1 = \{1, 3\}$ regardless of the value of $X$.
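
A toy illustration of this slide (the data-generating setup and the lasso penalty are assumptions of ours; note that selecting the non-zero lasso coefficients by itself carries no FDR guarantee):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, d = 1000, 10
X = rng.standard_normal((n, d))
Y = 2.0 * X[:, 0] + 1.5 * X[:, 2] + rng.standard_normal(n)  # true H1 = {1, 3} (1-based)

# Select the features whose fitted coefficients are non-zero.
beta_hat = Lasso(alpha=0.1).fit(X, Y).coef_
S_hat = {j + 1 for j in np.flatnonzero(beta_hat)}  # 1-based indices
print(S_hat)  # ideally {1, 3}
```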

  4. Global vs. Local Non-nulls
     What if a feature is non-null depending on the value of other features?
     $$Y = \begin{cases} X_2 + \epsilon & \text{if } X_1 > c \\ X_3 + \epsilon & \text{if } X_1 \le c \end{cases}$$
     so that, informally,
     $$H_1 = \begin{cases} \{1, 2\} & \text{if } X_1 > c \\ \{1, 3\} & \text{if } X_1 \le c \end{cases}$$
     From a global perspective, $H_1 = \{1, 2, 3\}$.
     Can we build a procedure that selects non-null features locally, while retaining statistical guarantees? Potentially yes, if we model interactions in parametric models of $Y \mid X$. But what if such models are not available?
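
A quick simulation of this switch model (the cutoff $c = 0$, the noise level, and the sample size are our choices): which feature carries signal depends on the side of the switch.

```python
import numpy as np

rng = np.random.default_rng(1)
n, c = 100_000, 0.0
X1, X2, X3 = rng.standard_normal((3, n))
Y = np.where(X1 > c, X2, X3) + 0.1 * rng.standard_normal(n)

# Locally, Y tracks X2 on one side of the switch and X3 on the other.
hi = X1 > c
print(np.corrcoef(Y[hi], X2[hi])[0, 1])    # close to 1
print(np.corrcoef(Y[~hi], X3[~hi])[0, 1])  # close to 1
```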

  5. Local Definition of a Null Variable
     A variable $j \in H_0(x)$ is a local null at $X = x$ if $X_j \perp\!\!\!\perp Y \mid X_{-j} = x_{-j}$.
     We define / construct:
     - the sets of local nulls $H_0(x)$ and local non-nulls $H_1(x)$ at points in feature space;
     - a procedure that returns a local estimate $\hat{S}(x)$ of the local non-nulls;
     - a generalization of the FDR to a local FDR (written out below).
     How do we retain FDR control in a local setting, without using a parametric model for $Y \mid X$?
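
By direct analogy with the global definition on slide 2, the local FDR can be written as below. The deck does not display this formula, so treat it as a plausible reconstruction rather than the paper's verbatim definition.

```latex
% Local FDR at a point x: same ratio as the global FDR, with the selection
% and the null set both evaluated locally (our reconstruction by analogy).
\[
  \mathrm{FDR}(x) = \mathbb{E}\left[
    \frac{|\hat{S}(x) \cap H_0(x)|}{|\hat{S}(x)| \vee 1}
  \right]
\]
```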

  6. Knockoff Procedure
     Most feature selection procedures construct a score $T_j$ for each feature:
     $$X_1, X_2, \ldots, X_d, Y \;\longrightarrow\; T_1, T_2, \ldots, T_d$$
     The scores are then ranked, and some cutoff yields $\hat{S}$.
     - A statistical model is needed to obtain guarantees on the FDR.
     - In high-dimensional settings, the statistical assumptions may fail.
     - For local feature selection, subsetting the data could limit power and break assumptions based on asymptotic behavior.
     These limitations make local feature selection a hard problem for the usual methods.
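
A generic instance of this score-then-cutoff pipeline (the choice of absolute lasso coefficients as scores and a top-$k$ cutoff is ours, purely for illustration; nothing in this sketch controls the FDR by itself):

```python
import numpy as np
from sklearn.linear_model import Lasso

def score_and_cut(X, Y, k: int, alpha: float = 0.1) -> set:
    """Scores T_j = |lasso coefficient of feature j|; keep the top k."""
    T = np.abs(Lasso(alpha=alpha).fit(X, Y).coef_)
    return {int(j) + 1 for j in np.argsort(T)[-k:]}  # 1-based indices
```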

  7. Knockoff Procedure
     The knockoff procedure generates a new, synthetic dataset $\tilde{X}$ and constructs scores as before:
     $$X_1, X_2, \ldots, X_d, \tilde{X}_1, \tilde{X}_2, \ldots, \tilde{X}_d, Y \;\longrightarrow\; T_1, T_2, \ldots, T_d, \tilde{T}_1, \tilde{T}_2, \ldots, \tilde{T}_d$$
     Ranking the differences $W_j = T_j - \tilde{T}_j$ allows us to select features with FDR control.
     - FDR control does not require modeling $Y \mid X$.
     - The statistical guarantees depend only on the validity of the process that generates $\tilde{X}$.
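
The selection step in code (generating $\tilde{X}$ is the hard part and is omitted; the vector $W$ is assumed given). The threshold below is the standard knockoff+ rule of Barber and Candès, which the slides do not spell out, so take it as background rather than this paper's exact recipe.

```python
import numpy as np

def knockoff_select(W: np.ndarray, q: float = 0.1) -> set:
    """Select {j : W_j >= tau}, where tau is the smallest t > 0 with
    (1 + #{j : W_j <= -t}) / max(1, #{j : W_j >= t}) <= q."""
    for t in np.sort(np.abs(W[W != 0])):  # candidate thresholds, ascending
        fdp_hat = (1 + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if fdp_hat <= q:
            return {int(j) + 1 for j in np.flatnonzero(W >= t)}  # 1-based
    return set()  # no threshold achieves the target level
```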

  8. Localize the Knockoff Procedure
     Our work generalizes the knockoff procedure to tackle local feature selection:
     - Generalize the distributional properties of the knockoff variables $\tilde{X}$ to the local setting, without additional constraints.
     - Generalize the construction of the scores to capture local dependence (one possible instantiation is sketched below).
     By generating $\tilde{X}$ as in the usual knockoff procedure, using the whole dataset, the statistical guarantees hold for the localized procedure.
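
One plausible instantiation of locally dependent scores (entirely our illustration; the paper's actual construction may differ): weight samples by a kernel centered at the query point $x$ before fitting, while the knockoffs $\tilde{X}$ are generated once from the whole dataset, as the slide stipulates.

```python
import numpy as np
from sklearn.linear_model import Lasso

def local_W(X, X_tilde, Y, x, bandwidth: float = 1.0, alpha: float = 0.1):
    """W_j(x) = T_j(x) - T_tilde_j(x) from a kernel-weighted lasso fit on
    the augmented matrix [X, X_tilde] (hypothetical score construction)."""
    w = np.exp(-np.sum((X - x) ** 2, axis=1) / (2 * bandwidth ** 2))
    coef = Lasso(alpha=alpha).fit(np.hstack([X, X_tilde]), Y, sample_weight=w).coef_
    d = X.shape[1]
    return np.abs(coef[:d]) - np.abs(coef[d:])  # feed to knockoff_select above
```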

  9. Example: Switch Variable Model
     Three switch features $X_{s_0}, X_{s_1}, X_{s_2}$ and four different sets of local non-nulls $S_{00}, S_{01}, S_{10}, S_{11}$. $Y$ has a linear response in $X_{S_{ij}}$.

  10. Local FDR Control
      [Figure: two panels against the number of samples (5000 to 50000). Top: average power; bottom: average FDR. Each panel shows three curves: global (full space), local with medium radius (2 partitions), and local with small radius (4 partitions).]

  11. Thank you
