Statistical inference in high dimension & application to brain imaging

  1. Statistical inference in high dimension & application to brain imaging. Imaging and machine learning workshop. Bertrand Thirion, bertrand.thirion@inria.fr, 03/04/2019

  2. Cognitive neuroscience: how are cognitive activities affected or controlled by neural circuits in the brain?

  3. The brain, the mind and the scanner
     [Diagram labels: cognitive theories, brain, experimental paradigm, stimuli, scanner, fMRI data, brain mapping]

  4. The brain, the mind and the scanner
     [Diagram labels: cognitive theories, brain, experimental paradigm, stimuli, scanner, fMRI data, brain mapping, encoding, decoding]

  5. Encoding: mapping cognitive functions to brain activity
     [Figure: brain maps for a set of task contrasts, covering among others: sentence reading and listening, checkerboards, left- and right-hand button presses, grasping and orientation judgement, false-belief vs. mechanistic stories (auditory and visual), speech vs. non-speech, expression vs. intention, computation vs. instructions, saccades vs. fixation, face gender and trustworthiness, intention vs. random]

  6. Resolution increases
     ● 2007: 3 mm, p = 50,000
     ● 2014: 1.5 mm, p = 400,000
     ● 2021: 0.5 mm (?), p = 10^7
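The voxel counts on this slide follow from simple arithmetic: with isotropic voxels and a fixed field of view (both assumptions made here, not stated on the slide), the number of voxels grows as the cube of the resolution ratio. A minimal sketch, where `scaled_voxel_count` is a hypothetical helper name:

```python
def scaled_voxel_count(p_ref, res_ref_mm, res_mm):
    """Voxel count at resolution res_mm, given p_ref voxels at res_ref_mm.

    Assumes isotropic voxels and an unchanged field of view.
    """
    return p_ref * (res_ref_mm / res_mm) ** 3


p_2007 = 50_000                                 # 3 mm voxels
p_2014 = scaled_voxel_count(p_2007, 3.0, 1.5)   # halving the voxel size: x8
p_2021 = scaled_voxel_count(p_2007, 3.0, 0.5)   # one sixth of the voxel size: x216
print(round(p_2014), round(p_2021))
```

The 2014 figure matches the slide exactly, and the 0.5 mm projection lands at about 1.08 × 10^7, in line with the order of magnitude quoted.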

  7. Better estimators for large-scale brain imaging
     ● A causal framework for brain activity decoding
     ● Dimension reduction for images
     ● Fast regularized ensembles of models
     ● Statistical inference for high-dimensional models

  8. Causal reasoning on encoding/decoding [Weichwald et al., NeuroImage 2015]
     Causal chain: Task (T) → Brain activity (X) → Behavior (B)
     ● Causal encoding models: P(X|T)
     ● Anti-causal decoding models: P(T|X)
     ● Causal decoding models: P(B|X)
     ● Anti-causal encoding models: P(X|B)

  9. Causal interpretation
     ● Task → X_1, X_2, ..., X_p: encoding is causal, decoding is anti-causal
     ● X_1, X_2, ..., X_p → Behavior: encoding is anti-causal, decoding is causal

  10. Causal reasoning on encoding/decoding [Weichwald et al., NeuroImage 2015]
      [Diagram: Task and brain features X_1, X_2, ..., X_p]

  11. Causal reasoning on encoding/decoding [Weichwald et al., NeuroImage 2015]
      [Diagram: Task and brain features X_1, X_2, ..., X_p]

  12. Causal reasoning on encoding/decoding [Weichwald et al., NeuroImage 2015]

  13. Causal reasoning on encoding/decoding [Weichwald et al., NeuroImage 2015]

  14. Joint encoding and decoding [Schwartz et al., NIPS 2013; Varoquaux et al., PCB 2018]
      [Figure panels: “Decoding”, “Encoding”]

  15. Decoding maps

  16. Joint encoding and decoding [Schwartz et al., NIPS 2013; Varoquaux et al., PCB 2018]

  17. Statistical associations and causal reasoning
      ● Problems:
        – Establish non-independence based on finite datasets → statistical tests
        – Large number of conditioning variables
        – Encoding models: multiple comparison issues
        – Decoding problem: statistical tests in multiple regression

  18. Brain activity decoding
      [Diagram: brain features X_1, X_2, ..., X_p, weights w, target y]
      ● behavior = f(brain activity)

  19. Outline
      ● A causal framework for brain activity decoding
      ● Dimension reduction for images
      ● Fast regularized ensembles of models
      ● Statistical inference for high-dimensional models

  20. Compression in the image domain
      ● Reduce the complexity of learning algorithms: p → k ≪ p
      ● Random projections = fast generic solution, but
        – Sub-optimal for structured signals
        – Not invertible when p and k are large
      ● Local redundancy → feature grouping strategies / clustering: “super-pixels”
        – Fast clustering procedures needed (large-k regime)
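Feature grouping can be sketched as a pair of linear operators: one that averages the p features inside each of k clusters, and one that maps the k cluster values back to a piecewise-constant image. The sketch below uses NumPy on a toy 1D signal; the function names and the fixed contiguous clustering are illustrative assumptions, not the procedure used in the talk.

```python
import numpy as np


def grouping_matrix(labels, k):
    """Return the (k, p) matrix whose rows average the features of each cluster.

    Assumes every cluster 0..k-1 contains at least one feature.
    """
    p = labels.shape[0]
    phi = np.zeros((k, p))
    phi[labels, np.arange(p)] = 1.0           # one-hot cluster membership
    return phi / phi.sum(axis=1, keepdims=True)


def reduce_features(X, labels, k):
    """Compress (n, p) data to (n, k) cluster averages."""
    return X @ grouping_matrix(labels, k).T


def expand_features(Z, labels):
    """Approximate inverse: (n, k) -> (n, p), constant within each cluster."""
    return Z[:, labels]


# Toy example: p = 12 features, k = 4 contiguous clusters of 3 features each.
labels = np.repeat(np.arange(4), 3)
X = np.linspace(0.0, 1.0, 12)[None, :]        # one smooth (Lipschitz) signal
Z = reduce_features(X, labels, 4)
X_hat = expand_features(Z, labels)
max_error = np.abs(X - X_hat).max()           # bounded by the within-cluster variation
```

Unlike a random projection, this compression has an obvious approximate inverse, and for a smooth signal the reconstruction error is controlled by how much the signal varies inside a cluster.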

  21. Superpixels as an image operator

  22. Crafting good image compression
      ● Key assumption: signal of interest L-Lipschitz
      ● Feature grouping matrix
      ● Almost trivially: [equation not recovered]
      ● Worst case: [equation not recovered]
      Need a fast method to learn balanced clusters

  23. Denoising properties
      ● Noisy signal model
      ● Denoising
      ● Equal-size clusters
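The denoising effect of equal-size clusters can be illustrated numerically: averaging q features that share the same underlying value divides the variance of independent noise by q. The sketch below assumes a piecewise-constant signal and i.i.d. Gaussian noise, which are simplifying assumptions chosen for the illustration rather than the model on the slide.

```python
import numpy as np

rng = np.random.default_rng(0)
p, q = 6000, 10                                   # p features, clusters of q features
labels = np.repeat(np.arange(p // q), q)          # equal-size clusters

# Assumed model: signal constant within each cluster, unit-variance noise.
signal = np.repeat(rng.normal(size=p // q), q)
noisy = signal + rng.normal(scale=1.0, size=p)

# Replace each feature by the mean of its cluster.
cluster_means = np.bincount(labels, weights=noisy) / np.bincount(labels)
denoised = cluster_means[labels]

mse_noisy = np.mean((noisy - signal) ** 2)        # close to the noise variance
mse_denoised = np.mean((denoised - signal) ** 2)  # close to noise variance / q
```

When the signal is not constant within clusters, averaging also introduces a bias, which is why balanced, spatially compact clusters matter.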

  24. Recursive neighbor agglomeration (ReNA) [Thirion et al., STAMLINS 2015; Hoyos Idrobo et al., PAMI 2018]
      ● Based on local decisions = fast (linear time)
        – Avoids percolation

  25. Effect on data analysis tasks [Hoyos Idrobo et al., IEEE PAMI 2018]
      ● Impressive speed-up and increased accuracy with respect to the non-compressed representation
        – Clustering has a denoising effect

  26. Outline
      ● A causal framework for brain activity decoding
      ● Dimension reduction for images
      ● Fast regularized ensembles of models
      ● Statistical inference for high-dimensional models

  27. Bagging of clustered models
      [Diagram, inputs X and y: clustering (create contiguous regions), repeated for various clusterings → solve regression on cluster-based representation → ... → average]
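The pipeline on this slide can be sketched in a few lines: draw several clusterings, fit a regression on each compressed representation, map each weight vector back to feature space, and average. The sketch below makes several assumptions that are not in the slide: clusters are random contiguous segments of a 1D feature axis (a stand-in for spatially constrained, data-driven clustering), each cluster is represented by its mean, and the regression is a closed-form ridge. All function names are hypothetical.

```python
import numpy as np


def random_contiguous_labels(p, k, rng):
    """Split features 0..p-1 into k contiguous, non-empty groups at random cuts."""
    cuts = np.sort(rng.choice(np.arange(1, p), size=k - 1, replace=False))
    return np.searchsorted(cuts, np.arange(p), side="right")


def bagged_clustered_ridge(X, y, k, n_clusterings, alpha, rng):
    """Average, in feature space, ridge models fitted on several clusterings."""
    n, p = X.shape
    w_sum = np.zeros(p)
    for _ in range(n_clusterings):
        labels = random_contiguous_labels(p, k, rng)
        counts = np.bincount(labels, minlength=k)
        onehot = np.zeros((p, k))
        onehot[np.arange(p), labels] = 1.0
        Z = (X @ onehot) / counts                 # (n, k) cluster averages
        w_k = np.linalg.solve(Z.T @ Z + alpha * np.eye(k), Z.T @ y)
        # Z @ w_k equals X @ w_p with w_p constant within each cluster:
        w_sum += (w_k / counts)[labels]
    return w_sum / n_clusterings


rng = np.random.default_rng(0)
n, p = 50, 200
w_true = np.zeros(p)
w_true[80:120] = 1.0                              # one contiguous active region
X = rng.normal(size=(n, p))
y = X @ w_true + 0.1 * rng.normal(size=n)
w_hat = bagged_clustered_ridge(X, y, k=20, n_clusterings=30, alpha=1.0, rng=rng)
```

Each individual model is piecewise constant on its own clustering; averaging over clusterings smooths out the arbitrary cluster boundaries, which is the motivation for bagging here.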

  28. Computationally efficient structure
      ● “Fast regularized ensembles of models” (FReM)
      ● State of the art solution: not very stable, but cheap

  29. Computationally efficient structure

  30. Effect on prediction accuracy [Hoyos Idrobo et al., PRNI 2015, NeuroImage 2017, PAMI 2018]
      ● “Fast regularized ensembles of models”

  31. More results [Hoyos Idrobo et al., PRNI 2015, NeuroImage 2017, PAMI 2018]

  32. Outline
      ● A causal framework for brain activity decoding
      ● Dimension reduction for images
      ● Fast regularized ensembles of models
      ● Statistical inference for high-dimensional models

  33. Statistical inference on w
      ● Inference: find {j: w_j > 0} with some statistical guarantees
      ● Standard solutions for high-dimensional linear models (p ≅ n)
        – Corrected ridge [Bühlmann 2013]
        – Desparsified Lasso [Zhang & Zhang 2014, Montanari 2014]
        – Multi-split [Meinshausen 2009], knockoffs [Candès 2015+]
      ● Fail for p ≫ n

  34. Desparsified Lasso [Zhang & Zhang, JRSS Series B (Stat. Meth.) 2014]

  35. Desparsified Lasso

  36. Preliminary assessment

  37. Large p → need dimension reduction [Chevalier et al., subm. to MICCAI]
      [Figure, p = 2000, n = 100; annotations: “Large p kills statistical power”, “CDL tames variance”]

  38. Adaptation to brain imaging
      ● Step 1: compression by clustering
      ● Step 2: inference on compressed representations
        – Steps 1 and 2: Clustered Desparsified Lasso (CDL)
      ● Step 3: ensembling: iterate with different parcellations → aggregate p-values (see also FReM)
        – Steps 1 to 3: Ensemble of Clustered Desparsified Lasso (ECDL)

  39. From CDL to ECDL
      [Diagram: desparsified Lasso (DL) p-values from different clusterings → aggregation]
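The slide does not say which aggregation rule combines the p-values obtained from different clusterings. One standard option, assumed here purely for illustration, is the quantile aggregation used in multi-split inference (Meinshausen et al. 2009) with γ = 0.5: take twice the median p-value per feature and cap it at 1. The function name `aggregate_pvalues` is hypothetical.

```python
import numpy as np


def aggregate_pvalues(pvals):
    """Aggregate p-values across clusterings by twice the median, capped at 1.

    pvals: array of shape (n_clusterings, p), one row per clustering.
    Returns an array of shape (p,).
    """
    pvals = np.asarray(pvals, dtype=float)
    return np.minimum(1.0, 2.0 * np.median(pvals, axis=0))


# Three clusterings, three features: only the first is consistently small.
pvals = np.array([
    [0.01, 0.4, 0.9],
    [0.02, 0.6, 0.8],
    [0.30, 0.5, 0.7],
])
aggregated = aggregate_pvalues(pvals)
```

The factor 2 is the price paid for not assuming independence between the p-values of different clusterings; a feature is selected only if it is significant in a majority of them.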

  40. ECDL for brain imaging

  41. δ-error control

  42. δ-error control

  43. δ-FWER control

  44. δ-FWER control

  45. Simulations: ECDL > CDL [Chevalier et al., MICCAI 2018]

  46. Experiments: PR and FWER control [Chevalier et al., MICCAI 2018]
      ● Better PR with ECDL
      ● More accurate FWER control

  47. Effects on real data [Nguyen et al., IPMI 2019; Chevalier et al., MICCAI 2018]
      ● HCP dataset, n = 900
      [Figure panels: social cognition, visual feature discrimination, language vs. maths]

  48. Conclusion
      ● Causal reasoning → conditional association analysis
      ● Large-p data bring challenges:
        – Computation cost
        – Difficulty of statistical inference
      ● Solutions: ensembling, subsampling, compression
      Work in progress:
      ● Classification setting
      ● Efficient stochastic regularizers
      ● Use of bootstrap
      ● Ongoing comparison with knockoff
      [Nguyen et al., IPMI 2019] [Aydore et al., subm.]

  49. From good ideas to good practices: software
      ● Machine learning in Python
      ● Machine learning for neuroimaging: http://nilearn.github.io
      ● BSD, Python, OSS
        – Classification of (neuroimaging) data
        – Network analysis

  50. Acknowledgements (Parietal and other collaborators)
      G. Varoquaux, A. Gramfort, P. Ciuciu, D. Wassermann, R. Poldrack, D. Engemann, J. Haxby, B. Nguyen, A.L. Grilo Pinho, C. F. Gorgolevski, E. Dohmatob, J. Salmon, A. Mensch, S. Arlot, J.A. Chevalier, A. Hoyos Idrobo, M. Lerasle, D. Bzdok, J. Dockès, P. Cerda, C. Lazarus, D. La Rocca, G. Lemaitre, L. El Gueddari, O. Grisel, M. Massias, P. Ablin, H. Janati, J. Massich, K. Dadi, H. Richard, C. Petitot
