Statistical Analysis of Persistent Homology
Genki Kusano (Tohoku University, D1)
Topology and Computer 2016, Oct 28 @ Akita
Collaborators: Kenji Fukumizu (The Institute of Statistical Mathematics), Yasuaki Hiraoka (Tohoku University, AIMR)
Based on: Persistence weighted Gaussian kernel for topological data analysis. Proceedings of the 33rd ICML, pp. 2004–2013, 2016.
Self introduction
• Interests: applied topology, topological data analysis
  B3: homology groups, homological algebra
  B4: persistent homology, computational homology
  M1: applied topology for sensor networks
      "Relative interleavings and applications to sensor networks", JJIAM, 33(1):99–120, 2016.
  M2: statistics, machine learning, kernel methods
      "Persistence weighted Gaussian kernel for topological data analysis", ICML, pp. 2004–2013, 2016.
  D1 (now): time series analysis, dynamics, information geometry, …
• Announcement: Joint Mathematics Meetings, January 4, 2017, Atlanta
  ★ Statistical Methods in Computational Topology and Applications
  ★ Sheaves in Topological Data Analysis
Motivation of this work
Topological Data Analysis (TDA): mathematical methods for characterizing the "shape of data".
[Figure: atomic configurations of the liquid, glass, and solid states of silica (SiO2, composed of silicon and oxygen)]
At the configuration level, it is difficult to distinguish the liquid and glass states.
Motivation of this work
Persistent homology / persistence diagram: a topological descriptor of data.
[Figure: atomic configurations of liquid, glass, and solid silica and their persistence diagrams (birth/death axes in Å², color = multiplicity)]
Y. Hiraoka et al., Hierarchical structures of amorphous solids characterized by persistent homology, PNAS, 113(26):7035–7040, 2016.
Motivation of this work
Classification problem: liquid vs. glass.
[Figure: persistence diagrams of the liquid and glass states (axes in Å², color = multiplicity)]
Q. Can we distinguish them mathematically?
A. Build a statistical framework for persistence diagrams.
Section 1: What is a persistence diagram?
Persistence diagram
For a finite point set X ⊂ R², consider the union of balls X_r = ∪_{x_i ∈ X} B(x_i; r), where B(x; r) = { z | d(x, z) ≤ r }.
As r grows, a ring (1-cycle) β appears at some radius b_β (its birth) and is filled in at some radius d_β (its death).
Persistence diagram
Recording each 1-cycle α by its birth-death pair gives the 1st persistence diagram
D_1(X) = { x_α, x_β, x_γ, … }, where x_α = (b_α, d_α).
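As a computational aside (not part of the original slides' toolchain), here is a minimal sketch of computing D_1(X) for a noisy circle sample. It assumes the third-party ripser package and uses a Vietoris-Rips filtration, a standard computational stand-in for the union-of-balls filtration above.

```python
# Minimal sketch: D_1(X) of a noisy circle, assuming the `ripser` package
# (pip install ripser). Rips filtration, not the Cech/union-of-balls one.
import numpy as np
from ripser import ripser

rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, 100)
X = np.c_[np.cos(theta), np.sin(theta)] + 0.05 * rng.normal(size=(100, 2))

dgms = ripser(X, maxdim=1)["dgms"]  # dgms[q] = array of (b, d) pairs in degree q
print(dgms[1])  # D_1(X): one point far from the diagonal (the circle), plus noise
```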
Definition of persistence diagram
Definition. For a filtration X: X_1 ⊂ X_2 ⊂ ⋯ ⊂ X_n, we compute the homology groups with a field coefficient K and obtain a sequence
H_q(X): H_q(X_1) → H_q(X_2) → ⋯ → H_q(X_n).
This sequence is called a persistent homology.
In this talk, the filtration is given by the unions of balls X_r = ∪_{x_i ∈ X} B(x_i; r).
A persistent homology can be seen as a representation of the A_n-quiver.
Definition of persistence diagram
By the Gabriel and Krull–Remak–Schmidt theorems, there is a decomposition
H_q(X) ≅ ⊕_{i ∈ I} I[b_i, d_i]   (I is a finite set),
where I[b, d] is the interval representation
0 → ⋯ → 0 → K → K → ⋯ → K → 0 → ⋯ → 0,
with the copies of K in positions b through d (identity maps between them).
From this decomposition, the persistence diagram is defined by
D_q(X) = { (b_i, d_i) | i ∈ I }.
Remark: ⊕_r H_q(X_r) can be seen as a module over K[t], and the decomposition also follows from the structure theorem for modules over a PID.
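In degree 0 the decomposition can be made completely explicit: every connected component is born at r = 0 and dies when its growing ball merges with an older component, which happens at half the corresponding minimum-spanning-tree edge length. A from-scratch sketch (pure numpy; an illustration, not the authors' software):

```python
# Minimal sketch of the degree-0 interval decomposition for the
# union-of-balls filtration: D_0(X) = {(0, w/2) : w an MST edge weight}
# plus one essential class (0, inf). Kruskal + union-find.
import numpy as np

def diagram_0(X):
    n = len(X)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    edges = sorted((D[i, j], i, j) for i in range(n) for j in range(i + 1, n))
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    deaths = []
    for w, i, j in edges:        # process edges by increasing length
        ri, rj = find(i), find(j)
        if ri != rj:             # two components merge: one interval dies
            parent[ri] = rj
            deaths.append(w / 2)  # balls of radius r touch when 2r = w
    return [(0.0, d) for d in deaths] + [(0.0, np.inf)]

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 3.0]])
print(diagram_0(X))  # [(0.0, 0.5), (0.0, 1.5), (0.0, inf)]
```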
Persistence
Definition. For a point x_α = (b_α, d_α) ∈ D_1(X), the lifetime
pers(x_α) = d_α − b_α
is called its persistence. Equivalently, the ℓ^∞-distance from x_α to the diagonal Δ = { (a, a) | a ∈ R } equals pers(x_α)/2.
A cycle with small persistence can be seen as a small, and often noisy, cycle.
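Since low-persistence points are often noise, a common first step is to threshold the diagram. A minimal sketch (the threshold value is an assumption, chosen per application):

```python
# Minimal sketch: persistence as a noise filter. Diagram points with
# lifetime below an (assumed, user-chosen) threshold are discarded.
import numpy as np

def denoise(diagram, threshold):
    diagram = np.asarray(diagram)          # rows are (b, d) pairs
    pers = diagram[:, 1] - diagram[:, 0]   # pers(x) = d - b
    return diagram[pers >= threshold]

D1 = np.array([[0.1, 0.15], [0.2, 1.3], [0.05, 0.1]])
print(denoise(D1, 0.5))  # keeps only the prominent cycle (0.2, 1.3)
```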
Metric structure of persistence diagrams
The set of persistence diagrams is defined by
D = { D | D is a multiset in R²_ul and |D| < ∞ },
where R²_ul = { (b, d) | b ≤ d, b, d ∈ R }.
The bottleneck (∞-Wasserstein) metric
d_B(D, E) = inf_γ sup_{x ∈ D ∪ Δ} ‖x − γ(x)‖_∞   (γ: D ∪ Δ → E ∪ Δ a bijection),
where Δ = { (a, a) | a ∈ R } is the diagonal set, is a distance on the set of persistence diagrams.
Remark: (D, d_B) is a metric space.
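For intuition, the bottleneck distance between two very small diagrams can be brute-forced directly from the definition; real implementations use binary search plus bipartite matching instead. A minimal sketch:

```python
# Minimal sketch of d_B for *small* diagrams, brute force over all bijections
# (exponential; illustration only). Each diagram is augmented with the diagonal
# projections of the other's points; diagonal-to-diagonal pairs cost 0.
import numpy as np
from itertools import permutations

def proj(p):  # nearest diagonal point in the inf-norm
    m = (p[0] + p[1]) / 2
    return (m, m)

def bottleneck(D, E):
    Dp = [(p, False) for p in D] + [(proj(q), True) for q in E]
    Ep = [(q, False) for q in E] + [(proj(p), True) for p in D]
    best = np.inf
    for perm in permutations(range(len(Ep))):
        cost = 0.0
        for i, j in enumerate(perm):
            (x, xdiag), (y, ydiag) = Dp[i], Ep[j]
            if not (xdiag and ydiag):  # diagonal-to-diagonal pairs are free
                cost = max(cost, max(abs(x[0] - y[0]), abs(x[1] - y[1])))
        best = min(best, cost)
    return best

D = [(0.0, 2.0)]
E = [(0.0, 2.2), (1.0, 1.1)]
print(bottleneck(D, E))  # 0.2: match (0,2)-(0,2.2); (1,1.1) goes to the diagonal
```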
Stability theorem
Theorem [Cohen-Steiner et al., 2007]. For finite subsets X, Y ⊂ R^d,
d_B(D_q(X), D_q(Y)) ≤ d_H(X, Y),
where d_H(X, Y) = max{ max_{p ∈ X} min_{q ∈ Y} d(p, q), max_{q ∈ Y} min_{p ∈ X} d(p, q) } is the Hausdorff distance.
Significant property: the map X ↦ D_q(X) is Lipschitz continuous.
(The Betti number X ↦ β_q(X) = dim H_q(X) is not continuous.)
[Figure: two nearby point sets X, Y and their overlaid persistence diagrams (birth/death axes)]
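The Hausdorff distance on the right-hand side of the stability theorem is straightforward to compute for finite point sets; a minimal numpy sketch:

```python
# Minimal sketch of the Hausdorff distance between finite point sets,
# matching the formula in the stability theorem above.
import numpy as np

def hausdorff(X, Y):
    D = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=2)  # pairwise d(p, q)
    return max(D.min(axis=1).max(), D.min(axis=0).max())

X = np.array([[0.0, 0.0], [1.0, 0.0]])
Y = np.array([[0.0, 0.1], [1.0, 0.0], [5.0, 0.0]])
print(hausdorff(X, Y))  # 4.0: the point (5, 0) is far from X
```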
Statistical Topological Data Analysis
Pipeline: data X → persistence diagram D_q(X) → vector → prediction, classification, testing, estimation.
Facts:
(1) A persistence diagram is not a vector.
(2) Standard statistical methods are designed for vectors (multivariate analysis).
This work: make a vector representation of persistence diagrams by the kernel method.
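As a rough illustration of what "vector representation by the kernel method" can mean, here is a minimal sketch in the spirit of the persistence weighted Gaussian kernel from the ICML 2016 paper cited on the title slide. The weight w(x) = arctan(C · pers(x)^p) follows that paper; the closed-form inner product and all parameter values below are illustrative assumptions, not the talk's exact construction.

```python
# Minimal sketch: embed a diagram D as mu_D = sum_{x in D} w(x) k_G(., x)
# with a Gaussian kernel k_G, so <mu_D, mu_E> has a closed form.
# Weight w(x) = arctan(C * pers(x)^p); sigma, C, p are assumed values.
import numpy as np

def pwgk_inner(D, E, sigma=1.0, C=1.0, p=1):
    D, E = np.asarray(D), np.asarray(E)
    wD = np.arctan(C * (D[:, 1] - D[:, 0]) ** p)  # down-weights noisy points
    wE = np.arctan(C * (E[:, 1] - E[:, 0]) ** p)
    sq = ((D[:, None, :] - E[None, :, :]) ** 2).sum(axis=2)
    return float(wD @ np.exp(-sq / (2 * sigma**2)) @ wE)

D = [(0.2, 1.3), (0.1, 0.15)]
E = [(0.25, 1.2)]
print(pwgk_inner(D, E))  # inner product of the two embedded diagrams
```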
Section 2: Kernel method (statistical methods for non-vector data)
Statistics for non-vector data
Let Ω be a data space and x_1, …, x_n ∈ Ω be the observed data.
To study statistical properties of the data, we often need to compute summaries such as the mean/average:
x_1, …, x_n ↦ (1/n) Σ_{i=1}^n x_i.
To compute such summaries, Ω needs addition, multiplication by scalars, and an inner product; that is, Ω should be an inner product space.
The space of persistence diagrams is not an inner product space.
Statistics for non-vector data
While Ω does not always carry an inner product, by defining a map φ: Ω → H, where H is an inner product space, we can compute statistical summaries in H:
x_1, …, x_n ↦ φ(x_1), …, φ(x_n) ↦ (1/n) Σ_{i=1}^n φ(x_i) ∈ H   (well-defined).
Fact: many statistical summaries and machine learning techniques can be computed from the values of inner products ⟨φ(x_i), φ(x_j)⟩_H alone, as the sketch below illustrates.
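To make the "inner products suffice" fact concrete: the squared distance between two sample means in H expands into averages of kernel values only, so φ is never built explicitly. A minimal sketch with a Gaussian kernel (the kernel choice and parameters are assumptions):

```python
# Minimal sketch: ||m_X - m_Y||^2 in H equals
# mean k(x,x') - 2 mean k(x,y) + mean k(y,y'), using only k-values.
import numpy as np

def gram(X, Y, sigma=1.0):
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq / (2 * sigma**2))

def mean_distance_sq(X, Y, sigma=1.0):
    return (gram(X, X, sigma).mean()
            - 2 * gram(X, Y, sigma).mean()
            + gram(Y, Y, sigma).mean())

rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(50, 2))
Y = rng.normal(2.0, 1.0, size=(50, 2))
print(mean_distance_sq(X, Y))  # large value: the two samples differ
```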
Kernel method
In the kernel method, a positive definite kernel k: Ω × Ω → R is used as a "non-linear" inner product on the data space:
k(x, y) = ⟨φ(x), φ(y)⟩_H.
For an element x ∈ Ω, k(·, x): Ω → R is a function, i.e., a vector in the function space C(Ω), and the feature map is φ(x_i) = k(·, x_i).
In many cases, what we need is just the Gram matrix (k(x_i, x_j))_{i,j=1,…,n}: statistics and machine learning algorithms run on (k(x_i, x_j)) without ever constructing φ explicitly (the kernel trick); see the sketch below.
[Diagram: data x_i ∈ Ω ↦ φ(x_i) = k(·, x_i) ∈ H; the Gram matrix (k(x_i, x_j)) feeds statistics and machine learning]
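As referenced above, a sketch of the kernel trick in practice: the learning algorithm consumes only the Gram matrix. This assumes scikit-learn's support-vector classifier with its precomputed-kernel option; the data and parameters are illustrative.

```python
# Minimal sketch: a classifier that sees only the Gram matrix (k(x_i, x_j)).
# Assumes scikit-learn's SVC with kernel="precomputed".
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = np.r_[rng.normal(0, 1, (30, 2)), rng.normal(3, 1, (30, 2))]
y = np.r_[np.zeros(30), np.ones(30)]

def gram(A, B, sigma=1.0):
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq / (2 * sigma**2))

clf = SVC(kernel="precomputed").fit(gram(X, X), y)  # trained on the Gram matrix
X_new = np.array([[0.1, -0.2], [3.2, 2.8]])
print(clf.predict(gram(X_new, X)))  # expected: [0., 1.]
```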