Aggregated tests of independence based on HSIC measures

HAL Id: cea-02617133
https://hal-cea.archives-ouvertes.fr/cea-02617133
Submitted on 25 May 2020

To cite this version: Anouar Meynaoui, Mélisande Albert, Béatrice Laurent, Amandine Marrel. Aggregated tests of independence based on HSIC measures. EMS 2019 - European Meeting of Statisticians, Bernoulli Society, Jul 2019, Palerme, Italy. cea-02617133

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

Aggregated tests of independence based on HSIC measures (part 2)

Anouar Meynaoui, Mélisande Albert, Béatrice Laurent, Amandine Marrel

INSA de Toulouse, Institut de Mathématiques de Toulouse, France
CEA, DEN, DER, France

European Meeting of Statisticians, 2019

Outline

1. Introduction
2. The aggregated testing procedure
3. Simulation results
4. Conclusion and Prospect

Introduction

We study the independence of two real random vectors $X = (X^{(1)}, \ldots, X^{(p)})$ and $Y = (Y^{(1)}, \ldots, Y^{(q)})$, with marginal densities denoted $f_1$ and $f_2$ respectively, and joint density $f$.

We observe an i.i.d. sample $Z_n = (X_i, Y_i)_{1 \le i \le n}$ of $(X, Y)$.

We rely on HSIC-based independence tests with Gaussian kernels $k_\lambda$ and $l_\mu$, associated with $X$ and $Y$ respectively.

In the previous talk, we first proposed, for each pair of values $(\lambda, \mu)$, a theoretical HSIC test of independence of level $\alpha \in (0, 1)$, followed by a non-asymptotic permutation-based test of the same level $\alpha$.

The power of the permutation test is shown to be approximately the same as the theoretical power if enough permutations are used.
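To make the single permutation-based test concrete, here is a minimal sketch in Python. It rests on several assumptions that are not taken from the slides: $X$ and $Y$ are scalar ($p = q = 1$), the Gaussian kernel is used in its unnormalized form $\exp(-(a-b)^2 / (2\lambda^2))$, and HSIC is estimated by the biased V-statistic $\frac{1}{n^2} \mathrm{tr}(KHLH)$ rather than by the estimator used by the authors. The function names are hypothetical.

```python
import math
import random


def gaussian_gram(values, bandwidth):
    """Gram matrix of the (unnormalized) Gaussian kernel on scalar data."""
    return [[math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2)) for b in values]
            for a in values]


def center(gram):
    """Double-center a symmetric Gram matrix: H K H with H = I - (1/n) 1 1^T."""
    n = len(gram)
    row_means = [sum(row) / n for row in gram]
    total_mean = sum(row_means) / n
    return [[gram[i][j] - row_means[i] - row_means[j] + total_mean
             for j in range(n)] for i in range(n)]


def hsic_stat(k_centered, l_gram, perm):
    """Biased HSIC estimate (1/n^2) tr(K H L H), with the Y sample permuted."""
    n = len(k_centered)
    return sum(k_centered[i][j] * l_gram[perm[i]][perm[j]]
               for i in range(n) for j in range(n)) / n ** 2


def hsic_permutation_test(x, y, lam, mu, n_perm=200, seed=0):
    """Return the observed statistic and a permutation p-value."""
    rng = random.Random(seed)
    k_centered = center(gaussian_gram(x, lam))
    l_gram = gaussian_gram(y, mu)
    identity = list(range(len(x)))
    observed = hsic_stat(k_centered, l_gram, identity)
    exceed = 0
    for _ in range(n_perm):
        perm = identity[:]
        rng.shuffle(perm)  # permuting Y mimics independence of X and Y
        if hsic_stat(k_centered, l_gram, perm) >= observed:
            exceed += 1
    # The +1 terms count the observed statistic among the permuted ones.
    return observed, (1 + exceed) / (1 + n_perm)
```

The test of level $\alpha$ rejects independence when the returned p-value is at most $\alpha$. Each permutation costs $O(n^2)$ operations, so this sketch is only suitable for small samples.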

When $f - f_1 \otimes f_2$ belongs to a Sobolev ball with regularity $\delta \in (0, 2]$, sharp upper bounds of the uniform separation rate with respect to the values of $\lambda$ and $\mu$ are provided.

The HSIC test with the optimal upper bound is shown to be minimax over Sobolev balls.

This optimal test is not adaptive, since it depends on the regularity $\delta$.

In this talk, we provide an adaptive procedure for testing independence that does not depend on the regularity $\delta$.

This procedure is based on the aggregation of a collection of HSIC tests with a collection of different bandwidths $\lambda$ and $\mu$.

Numerical studies are then provided to assess the performance of the procedure and to compare methodological choices.

The aggregated testing procedure

A single HSIC-based test raises the question of how to choose the kernel bandwidths $\lambda$ and $\mu$.

Heuristic choices are adopted in practice, with no theoretical justification.

We propose here an aggregated testing procedure combining a collection of single tests based on different bandwidths.

We consider a finite or countable collection $\Lambda \times U$ of bandwidths in $(0, +\infty)^p \times (0, +\infty)^q$ and a collection of positive weights $\{\omega_{\lambda,\mu} \,;\, (\lambda, \mu) \in \Lambda \times U\}$ such that $\sum_{(\lambda,\mu) \in \Lambda \times U} e^{-\omega_{\lambda,\mu}} \le 1$.

For a given $\alpha \in (0, 1)$, we define the aggregated test $\Delta_\alpha$, which rejects $(H_0)$ if there is at least one $(\lambda, \mu) \in \Lambda \times U$ such that

$$\widehat{\mathrm{HSIC}}_{\lambda,\mu} > q^{\lambda,\mu}_{1 - u_\alpha e^{-\omega_{\lambda,\mu}}},$$

where $u_\alpha$ is the least conservative value such that the test is of level $\alpha$, defined by

$$u_\alpha = \sup\left\{ u > 0 \,;\, P_{f_1 \otimes f_2}\left( \sup_{(\lambda,\mu) \in \Lambda \times U} \left( \widehat{\mathrm{HSIC}}_{\lambda,\mu} - q^{\lambda,\mu}_{1 - u e^{-\omega_{\lambda,\mu}}} \right) > 0 \right) \le \alpha \right\}.$$

The test function $\Delta_\alpha$ associated with this aggregated test takes values in $\{0, 1\}$ and is defined by

$$\Delta_\alpha = 1 \iff \sup_{(\lambda,\mu) \in \Lambda \times U} \left( \widehat{\mathrm{HSIC}}_{\lambda,\mu} - q^{\lambda,\mu}_{1 - u_\alpha e^{-\omega_{\lambda,\mu}}} \right) > 0.$$
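A short check shows why the value $u = 1$ is always admissible in the definition of $u_\alpha$. It assumes, as the notation suggests, that $q^{\lambda,\mu}_{1-a}$ denotes the $(1-a)$-quantile of $\widehat{\mathrm{HSIC}}_{\lambda,\mu}$ under $(H_0)$, so that $P_{f_1 \otimes f_2}\big(\widehat{\mathrm{HSIC}}_{\lambda,\mu} > q^{\lambda,\mu}_{1-a}\big) \le a$. A union bound over the collection then gives

$$P_{f_1 \otimes f_2}\left( \exists (\lambda,\mu) \in \Lambda \times U : \widehat{\mathrm{HSIC}}_{\lambda,\mu} > q^{\lambda,\mu}_{1 - \alpha e^{-\omega_{\lambda,\mu}}} \right) \le \sum_{(\lambda,\mu) \in \Lambda \times U} \alpha e^{-\omega_{\lambda,\mu}} \le \alpha.$$

Hence $u_\alpha \ge 1$: every single test inside the aggregated procedure is run at a level at least $\alpha e^{-\omega_{\lambda,\mu}}$, and the choice $u = 1$ corresponds to a weighted Bonferroni correction.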

Oracle type conditions for the second kind error

The aggregated testing procedure $\Delta_\alpha$ is of level $\alpha$.

The second kind error of the aggregated testing procedure $\Delta_\alpha$ satisfies the inequality

$$P_f(\Delta_\alpha = 0) \le \inf_{(\lambda,\mu) \in \Lambda \times U} P_f\left( \Delta^{\lambda,\mu}_{\alpha e^{-\omega_{\lambda,\mu}}} = 0 \right),$$

where $\Delta^{\lambda,\mu}_{\alpha e^{-\omega_{\lambda,\mu}}}$ is the single test of level $\alpha e^{-\omega_{\lambda,\mu}}$ associated with the bandwidths $(\lambda, \mu)$.

The aggregated testing procedure has a second kind error at most equal to $\beta$ if there exists at least one $(\lambda, \mu) \in \Lambda \times U$ such that the test $\Delta^{\lambda,\mu}_{\alpha e^{-\omega_{\lambda,\mu}}}$ has a probability of second kind error at most equal to $\beta$.
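The single tests of level $\alpha e^{-\omega_{\lambda,\mu}}$ that appear in this inequality can be combined directly. The sketch below is a simplified, conservative variant of the aggregated procedure, not the authors' implementation: it fixes $u = 1$ instead of computing $u_\alpha$, and it assumes that each single test is summarized by a p-value, so that rejecting at level $a$ means observing a p-value of at most $a$. The function name and the dictionary layout are hypothetical.

```python
import math


def aggregated_decision(p_values, weights, alpha):
    """Weighted Bonferroni aggregation (the conservative choice u = 1).

    p_values and weights are dictionaries keyed by the bandwidth pair
    (lambda, mu). H0 is rejected as soon as one single test rejects at
    its corrected level alpha * exp(-omega).
    """
    if sum(math.exp(-w) for w in weights.values()) > 1.0 + 1e-12:
        raise ValueError("weights must satisfy sum(exp(-omega)) <= 1")
    rejecting = [key for key, p in p_values.items()
                 if p <= alpha * math.exp(-weights[key])]
    return len(rejecting) > 0, rejecting


# Two bandwidth pairs with equal weights log(2): each test runs at alpha / 2.
weights = {(0.5, 0.5): math.log(2), (1.0, 1.0): math.log(2)}
p_values = {(0.5, 0.5): 0.01, (1.0, 1.0): 0.2}
reject, which = aggregated_decision(p_values, weights, alpha=0.05)
```

Because $u_\alpha \ge 1$, the procedure defined in the talk rejects at least as often as this variant while keeping the level $\alpha$.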

Theorem

Let $\alpha, \beta \in (0, 1)$, let $\{(k_\lambda, l_\mu) \,;\, (\lambda, \mu) \in \Lambda \times U\}$ be a collection of Gaussian kernels and let $\{\omega_{\lambda,\mu} \,;\, (\lambda, \mu) \in \Lambda \times U\}$ be a collection of positive weights such that $\sum_{(\lambda,\mu) \in \Lambda \times U} e^{-\omega_{\lambda,\mu}} \le 1$. We assume that $f$, $f_1$ and $f_2$ are bounded. We also assume that all bandwidths $(\lambda, \mu)$ in $\Lambda \times U$ satisfy the following conditions:

$$\max(\lambda_1 \cdots \lambda_p, \ \mu_1 \cdots \mu_q) < 1 \quad \text{and} \quad n \sqrt{\lambda_1 \cdots \lambda_p \, \mu_1 \cdots \mu_q} > \log\left(\frac{1}{\alpha}\right) > 1.$$

Then the uniform separation rate $\rho\big(\Delta_\alpha, S^\delta_{p+q}(R), \beta\big)$, where $\delta \in (0, 2]$ and $R > 0$, can be upper bounded as follows:

$$\left[ \rho\big(\Delta_\alpha, S^\delta_{p+q}(R), \beta\big) \right]^2 \le C(M_f, p, q, \beta, \delta) \inf_{(\lambda,\mu) \in \Lambda \times U} \left\{ \frac{1}{n \sqrt{\lambda_1 \cdots \lambda_p \, \mu_1 \cdots \mu_q}} \left( \log\left(\frac{1}{\alpha}\right) + \omega_{\lambda,\mu} \right) + \sum_{i=1}^{p} \lambda_i^{2\delta} + \sum_{j=1}^{q} \mu_j^{2\delta} \right\},$$

where $M_f = \max(\|f\|_\infty, \|f_1\|_\infty, \|f_2\|_\infty)$ and $C(M_f, p, q, \beta, \delta)$ is a positive constant depending only on its arguments.

This theorem gives an oracle type condition for the uniform separation rate. Indeed, without knowing the regularity of $f - f_1 \otimes f_2$, we prove that the uniform separation rate of $\Delta_\alpha$ is of the same order as the smallest uniform separation rate over $(\lambda, \mu) \in \Lambda \times U$, up to $\omega_{\lambda,\mu}$.

Adaptive procedure of testing independence

We consider the bandwidth collections $\Lambda$ and $U$ defined by

$$\Lambda = \left\{ (2^{-m_{1,1}}, \ldots, 2^{-m_{1,p}}) \,;\, (m_{1,1}, \ldots, m_{1,p}) \in (\mathbb{N}^*)^p \right\}, \quad (1)$$

$$U = \left\{ (2^{-m_{2,1}}, \ldots, 2^{-m_{2,q}}) \,;\, (m_{2,1}, \ldots, m_{2,q}) \in (\mathbb{N}^*)^q \right\}. \quad (2)$$

We associate with every $\lambda = (2^{-m_{1,1}}, \ldots, 2^{-m_{1,p}})$ in $\Lambda$ and $\mu = (2^{-m_{2,1}}, \ldots, 2^{-m_{2,q}})$ in $U$ the positive weights

$$\omega_{\lambda,\mu} = 2 \sum_{i=1}^{p} \log\left( m_{1,i} \times \frac{\pi}{\sqrt{6}} \right) + 2 \sum_{j=1}^{q} \log\left( m_{2,j} \times \frac{\pi}{\sqrt{6}} \right), \quad (3)$$

so that $\sum_{(\lambda,\mu) \in \Lambda \times U} e^{-\omega_{\lambda,\mu}} = 1$.
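The normalization follows from $e^{-\omega_{\lambda,\mu}} = \prod_{i} \frac{6}{\pi^2 m_{1,i}^2} \prod_{j} \frac{6}{\pi^2 m_{2,j}^2}$ together with $\sum_{m \ge 1} m^{-2} = \pi^2 / 6$. The following sketch checks it numerically by truncating the collections to exponents $m \le$ `max_m`; the function names and the truncation are illustrative choices, not part of the talk.

```python
import math
from itertools import product


def omega(m1, m2):
    """Weight (3) for lambda = (2^-m) with m in m1 and mu = (2^-m) with m in m2."""
    c = math.pi / math.sqrt(6.0)
    return (2.0 * sum(math.log(m * c) for m in m1)
            + 2.0 * sum(math.log(m * c) for m in m2))


def partial_weight_sum(p, q, max_m):
    """Sum of exp(-omega) over all exponent tuples with entries in 1..max_m."""
    grid = range(1, max_m + 1)
    total = 0.0
    for m in product(grid, repeat=p + q):
        total += math.exp(-omega(m[:p], m[p:]))
    return total


# The truncated sum stays below 1 and approaches it as max_m grows.
print(partial_weight_sum(p=1, q=1, max_m=300))
```

The number of terms grows as `max_m ** (p + q)`, so the check is only practical for small dimensions.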

Corollary

Assume that $\log\log(n) > 1$, let $\alpha, \beta \in (0, 1)$ and let $\Delta_\alpha$ be the aggregated testing procedure with the particular choice of $\Lambda$, $U$ and the weights $(\omega_{\lambda,\mu})_{(\lambda,\mu) \in \Lambda \times U}$ defined in (1), (2) and (3). Then the uniform separation rate $\rho\big(\Delta_\alpha, S^\delta_{p+q}(R), \beta\big)$ of the aggregated test $\Delta_\alpha$ over Sobolev balls, where $\delta \in (0, 2]$, can be upper bounded as follows:

$$\rho\big(\Delta_\alpha, S^\delta_{p+q}(R), \beta\big) \le C(M_f, p, q, \alpha, \beta, \delta) \left( \frac{\log\log(n)}{n} \right)^{\frac{2\delta}{4\delta + (p+q)}},$$

where $M_f = \max(\|f\|_\infty, \|f_1\|_\infty, \|f_2\|_\infty)$.

The rate of the aggregated procedure over the classes of Sobolev balls is of the same order as the smallest rate of the single tests, up to a $\log\log(n)$ factor.

Combined with the lower bound over Sobolev balls, this shows that the aggregated test is adaptive over these regularity classes.
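The exponent in the corollary can be recovered heuristically from the oracle bound of the theorem. The sketch below reads that bound as $\frac{1}{n\sqrt{\lambda_1 \cdots \lambda_p \mu_1 \cdots \mu_q}} \big(\log(1/\alpha) + \omega_{\lambda,\mu}\big) + \sum_i \lambda_i^{2\delta} + \sum_j \mu_j^{2\delta}$, takes all bandwidths equal to $h = 2^{-m}$, and ignores constants and the rounding of $m$ to an integer; it is an informal balancing argument, not the authors' proof.

With this choice, $\omega_{\lambda,\mu} = 2(p+q) \log(m \pi / \sqrt{6})$, and since $m$ is at most of order $\log(n)$, the weight is of order $\log\log(n)$. The bound then behaves like

$$\frac{\log\log(n)}{n \, h^{(p+q)/2}} + (p+q) \, h^{2\delta}.$$

Balancing the two terms gives

$$h^{2\delta + (p+q)/2} \asymp \frac{\log\log(n)}{n} \quad \Longrightarrow \quad h \asymp \left( \frac{\log\log(n)}{n} \right)^{\frac{2}{4\delta + (p+q)}},$$

so that the squared separation rate is of order $h^{2\delta} \asymp \left( \frac{\log\log(n)}{n} \right)^{\frac{4\delta}{4\delta + (p+q)}}$, and taking the square root yields the exponent $\frac{2\delta}{4\delta + (p+q)}$ stated in the corollary.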
