A primer in persistent homology


  1. A primer in persistent homology Bastian Rieck

  2. Motivation: What is the ‘shape’ of data?

  3. A simple example: What is the shape of this set of points? Technically, a set of points does not have a ‘shape’. Still, we perceive the points to be arranged in a circle. How can we quantify this?

  4. A simple example: What is the shape of this set of points? We can ‘squint’ our eyes and look at how the connectivity of the points changes. The more we squint, the more connections we see.

  8. What did we see? The points are arranged in a circle, as long as the radius of the disks we use to cover them does not exceed a certain critical threshold. How can we formulate this more precisely?

  9. Algebraic topology: The branch of mathematics that is concerned with finding invariant properties of high-dimensional objects. Simple invariants: (1) Dimension: $\mathbb{R}^2 \neq \mathbb{R}^3$ because $2 \neq 3$. (2) Determinant: If matrices $A$ and $B$ are similar, their determinants are equal.

  10. Betti numbers: A topological invariant. Informally, they count the number of holes in different dimensions that occur in an object: $\beta_0$ counts connected components, $\beta_1$ counts tunnels, and $\beta_2$ counts voids. Values of $(\beta_0, \beta_1, \beta_2)$ for some spaces: Point $(1, 0, 0)$; Circle $(1, 1, 0)$; Sphere $(1, 0, 1)$; Torus $(1, 2, 1)$.

  11. Calculating Betti numbers: The $k$th Betti number $\beta_k$ is the rank of the $k$th homology group $H_k(X)$ of the topological space $X$. To define this formally, we require a notion of ‘holes’ in simplicial complexes. This, in turn, requires the concepts of boundaries and cycles. Technically, I should write simplicial homology group every time. I am not going to do this. Instead, let us first talk about simplicial complexes.

  12. Simplicial complexes: A set $K$ with a collection of subsets $S$ is called an abstract simplicial complex if: (1) $\{v\} \in S$ for all $v \in K$. (2) If $\sigma \in S$ and $\emptyset \neq \tau \subseteq \sigma$, then $\tau \in S$. The elements of a simplicial complex are called simplices. A $k$-simplex consists of $k + 1$ vertices.
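
The two conditions above can be checked mechanically. The following is a minimal sketch, not code from the talk: the function name `is_simplicial_complex` and the example complexes are illustrative assumptions, and simplices are represented as Python sets of vertex labels.

```python
from itertools import combinations


def is_simplicial_complex(vertices, simplices):
    """Check the two conditions of an abstract simplicial complex.

    `vertices` plays the role of K and `simplices` the role of S
    (both names are illustrative assumptions).
    """
    S = {frozenset(s) for s in simplices}
    # Condition 1: every vertex is present as a singleton simplex.
    if any(frozenset([v]) not in S for v in vertices):
        return False
    # Condition 2: every non-empty proper subset (face) of a simplex is in S.
    for sigma in S:
        for k in range(1, len(sigma)):
            for tau in combinations(sigma, k):
                if frozenset(tau) not in S:
                    return False
    return True


# A filled triangle with all of its faces: valid.
valid = [{0}, {1}, {2}, {0, 1}, {0, 2}, {1, 2}, {0, 1, 2}]
# A triangle whose edges {0, 2} and {1, 2} are missing: invalid.
invalid = [{0}, {1}, {2}, {0, 1}, {0, 1, 2}]

print(is_simplicial_complex([0, 1, 2], valid))
print(is_simplicial_complex([0, 1, 2], invalid))
```

The check enumerates all faces of every simplex, so it is only meant for small examples.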

  13. Simplicial complexes: Example. [Figure: a valid and an invalid simplicial complex]

  14. Chain groups: We need chain groups to algebraically express the concept of a boundary. Given a simplicial complex $K$, the $p$th chain group $C_p$ of $K$ contains all linear combinations of $p$-simplices in the complex. Coefficients are in $\mathbb{Z}_2$, hence all elements of $C_p$ are of the form $\sum_j \sigma_j$, for $\sigma_j \in K$. The group operation is addition with $\mathbb{Z}_2$ coefficients.

  15. Boundary homomorphism: Given a simplicial complex $K$, the $p$th boundary homomorphism is the homomorphism that assigns each simplex $\sigma = \{v_0, \dots, v_p\} \in K$ to its boundary: $\partial_p \sigma = \sum_i \{v_0, \dots, \hat{v}_i, \dots, v_p\}$. In the equation above, $\hat{v}_i$ indicates that the set does not contain the $i$th vertex. The function $\partial_p \colon C_p \to C_{p-1}$ is thus a homomorphism between the chain groups.

  16. Fundamental lemma & chain complex: For all $p$, we have $\partial_{p-1} \circ \partial_p = 0$: Boundaries do not have a boundary themselves. This leads to the chain complex: $0 \xrightarrow{\partial_{n+1}} C_n \xrightarrow{\partial_n} C_{n-1} \xrightarrow{\partial_{n-1}} \dots \xrightarrow{\partial_2} C_1 \xrightarrow{\partial_1} C_0 \xrightarrow{\partial_0} 0$
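
To make the boundary homomorphism and the fundamental lemma concrete, here is a small sketch with $\mathbb{Z}_2$ coefficients. It assumes a chain is stored as a Python set of simplices, so that addition mod 2 becomes the symmetric difference of sets; the name `boundary` and the triangle example are illustrative and not taken from the talk.

```python
from itertools import combinations


def boundary(chain):
    """Boundary of a Z_2 chain, given as a set of simplices (frozensets).

    Adding a face twice cancels it, which is addition mod 2.
    """
    result = set()
    for sigma in chain:
        if len(sigma) == 1:
            continue  # the boundary of a vertex is zero
        for face in combinations(sorted(sigma), len(sigma) - 1):
            result ^= {frozenset(face)}  # symmetric difference = sum mod 2
    return result


triangle = {frozenset({0, 1, 2})}
edges = boundary(triangle)          # the three edges of the triangle
print(sorted(sorted(e) for e in edges))
print(boundary(edges))              # empty set: the boundary has no boundary
```

Each vertex of the triangle occurs in exactly two edges, so all vertices cancel mod 2 in the second application, which is the lemma in its smallest non-trivial instance.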

  17. Cycle and boundary groups: Cycle group $Z_p = \ker \partial_p$. Boundary group $B_p = \operatorname{im} \partial_{p+1}$. We have $B_p \subseteq Z_p$ in the group-theoretical sense. In other words, every boundary is also a cycle.

  18. Homology groups & Betti numbers: The $p$th homology group $H_p$ is a quotient group, defined by ‘removing’ cycles that are boundaries from a higher dimension: $H_p = Z_p / B_p = \ker \partial_p / \operatorname{im} \partial_{p+1}$. With this definition, we may finally calculate the $p$th Betti number: $\beta_p = \operatorname{rank} H_p$. Intuitively: Calculate all boundaries, remove the boundaries that come from higher-dimensional objects, and count what is left.
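
With $\mathbb{Z}_2$ coefficients, the definition above reduces to linear algebra: $\beta_p = \dim C_p - \operatorname{rank} \partial_p - \operatorname{rank} \partial_{p+1}$. The sketch below is one possible way to compute this, assuming the complex is given as a list of vertex sets; the helper names `rank_gf2` and `betti_numbers` are hypothetical, and matrix rows are stored as integer bitmasks purely for brevity.

```python
from itertools import combinations


def rank_gf2(rows):
    """Rank over Z_2 of a matrix whose rows are integer bitmasks."""
    pivots = {}  # position of leading bit -> reduced row
    for row in rows:
        while row:
            high = row.bit_length() - 1
            if high not in pivots:
                pivots[high] = row
                break
            row ^= pivots[high]  # eliminate the leading bit
    return len(pivots)


def betti_numbers(simplices):
    """Betti numbers (Z_2 coefficients) of a complex given as vertex sets."""
    by_dim = {}
    for s in {frozenset(s) for s in simplices}:
        by_dim.setdefault(len(s) - 1, []).append(s)
    top = max(by_dim)
    # ranks[p] is the rank of the boundary map from C_p to C_{p-1}.
    ranks = {0: 0, top + 1: 0}
    for p in range(1, top + 1):
        index = {f: i for i, f in enumerate(by_dim.get(p - 1, []))}
        rows = []
        for s in by_dim.get(p, []):
            row = 0
            for face in combinations(s, p):  # faces with p vertices
                row |= 1 << index[frozenset(face)]
            rows.append(row)
        ranks[p] = rank_gf2(rows)
    return [len(by_dim.get(p, [])) - ranks[p] - ranks[p + 1]
            for p in range(top + 1)]


# Hollow triangle (a combinatorial circle) and hollow tetrahedron (a sphere).
circle = [{0}, {1}, {2}, {0, 1}, {0, 2}, {1, 2}]
sphere = ([{v} for v in range(4)]
          + [set(e) for e in combinations(range(4), 2)]
          + [set(t) for t in combinations(range(4), 3)])
print(betti_numbers(circle))  # [1, 1]
print(betti_numbers(sphere))  # [1, 0, 1]
```

The input is assumed to be a valid simplicial complex; a missing face would raise a `KeyError` when the boundary matrix is assembled.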

  19. Real-world multivariate data: Often: Unstructured point clouds: $n$ items with $D$ attributes; $n \times D$ matrix. Non-random sample from $\mathbb{R}^D$. Manifold hypothesis: There is an unknown $d$-dimensional manifold $\mathcal{M} \subseteq \mathbb{R}^D$, with $d \ll D$, from which our data have been sampled.

  20. Converting unstructured data into a simplicial complex: Rips graph $R_\epsilon$: Use a distance measure $\operatorname{dist}(\cdot, \cdot)$ such as the Euclidean distance and a threshold parameter $\epsilon$. Connect $u$ and $v$ if $\operatorname{dist}(u, v) \leq \epsilon$.

  21. How to get a simplicial complex from $R_\epsilon$? Construct the Vietoris–Rips complex $V_\epsilon$ by adding a $k$-simplex whenever all of its $(k-1)$-dimensional faces are present.
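
The construction can be sketched directly from its description: build the Rips graph first, then add higher-dimensional simplices whose faces are all present. This is a brute-force illustration that assumes Euclidean distance and small inputs; the function name `vietoris_rips` and the `max_dim` cut-off are assumptions made for the example, not part of the talk.

```python
from itertools import combinations
from math import dist


def vietoris_rips(points, eps, max_dim=2):
    """Vietoris-Rips complex of `points` at threshold `eps`, up to `max_dim`.

    Returns a list of simplices as frozensets of point indices.
    Brute force: every candidate vertex set is tested.
    """
    n = len(points)
    # Rips graph: connect u and v if dist(u, v) <= eps.
    edges = {frozenset((u, v)) for u, v in combinations(range(n), 2)
             if dist(points[u], points[v]) <= eps}
    simplices = [frozenset([v]) for v in range(n)] + sorted(edges, key=sorted)
    current = edges
    for k in range(2, max_dim + 1):
        # Add a k-simplex whenever all of its (k-1)-dimensional faces exist.
        new = {frozenset(verts) for verts in combinations(range(n), k + 1)
               if all(frozenset(f) in current for f in combinations(verts, k))}
        simplices.extend(sorted(new, key=sorted))
        current = new
    return simplices


square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(len(vietoris_rips(square, 1.0)))  # 4 vertices + 4 sides
print(len(vietoris_rips(square, 1.5)))  # diagonals and 4 triangles appear
```

At $\epsilon = 1.0$ the unit square yields only its four sides, so the complex has a hole; at $\epsilon = 1.5$ the diagonals (length $\sqrt{2}$) enter and the triangles fill it in.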

  22. How to calculate Betti numbers? Direct calculations are unstable. [Figure: Vietoris–Rips complexes at $\epsilon = 0.35$, $\epsilon = 0.53$, $\epsilon = 0.88$, and $\epsilon = 1.05$, with a plot of $\beta_1$ against $\epsilon$]

  23. Persistent homology: Note that the ‘correct’ Betti number of the data persists over a certain range of the threshold parameter $\epsilon$. To formalize this, assume that simplices in the Vietoris–Rips complex are added one after the other with an associated weight. This gives rise to a filtration, $\emptyset = K_0 \subseteq K_1 \subseteq \dots \subseteq K_{n-1} \subseteq K_n = K$, such that each $K_i$ is a valid simplicial subcomplex of $K$. We write $w(K_i)$ to denote the weight of $K_i$.

  24. Similar to what we have previously seen, this gives rise to a sequence of homomorphisms, $f_p^{i,j} \colon H_p(K_i) \to H_p(K_j)$, and a sequence of homology groups, i.e. $0 = H_p(K_0) \xrightarrow{f_p^{0,1}} H_p(K_1) \xrightarrow{f_p^{1,2}} \dots \xrightarrow{f_p^{n-2,n-1}} H_p(K_{n-1}) \xrightarrow{f_p^{n-1,n}} H_p(K_n) = H_p(K)$, where $p$ denotes the dimension of the homology groups.

  25. Persistent homology group: Given two indices $i \leq j$, the $p$th persistent homology group $H_p^{i,j}$ is defined as $H_p^{i,j} := Z_p(K_i) / (B_p(K_j) \cap Z_p(K_i))$, which contains all the homology classes of $K_i$ that are still present in $K_j$. We may now track the different homology classes through the individual homology groups.

  26. Tracking of homology classes: Creation in $K_i$: $c \in H_p(K_i)$, but $c \notin H_p^{i-1,i}$. Destruction in $K_j$: $f_p^{i,j-1}(c) \notin H_p^{i-1,j-1}$ and $f_p^{i,j}(c) \in H_p^{i-1,j}$. The persistence of a class $c$ that is created in $K_i$ and destroyed in $K_j$ is defined as $\operatorname{pers}(c) = |w(K_j) - w(K_i)|$, and measures the ‘scale’ at which a certain topological feature occurs.
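
In practice, creation and destruction are usually paired by reducing the boundary matrix of the filtration. The sketch below uses the standard column-reduction approach over $\mathbb{Z}_2$, which is an assumption on my part: the slides do not say which algorithm is used. The function name `persistence_pairs`, the input format, and the example weights are all illustrative.

```python
def persistence_pairs(filtration):
    """Pair creators and destroyers in a filtration (Z_2 coefficients).

    `filtration` is a list of (weight, simplex) with simplices as vertex
    tuples; every face must appear with a weight not larger than its
    cofaces. Returns (dimension, birth, death) triples, with death equal
    to infinity for classes that are never destroyed.
    """
    # Sort by weight, then dimension, so faces precede their cofaces.
    ordered = sorted(filtration, key=lambda ws: (ws[0], len(ws[1])))
    index = {frozenset(s): i for i, (_, s) in enumerate(ordered)}
    columns, low_to_col, paired, pairs = [], {}, set(), []
    for j, (w, s) in enumerate(ordered):
        s = frozenset(s)
        # Boundary column: indices of all codimension-1 faces.
        col = {index[s - {v}] for v in s} if len(s) > 1 else set()
        # Reduce: cancel the lowest entry against earlier columns.
        while col and max(col) in low_to_col:
            col = col ^ columns[low_to_col[max(col)]]
        columns.append(col)
        if col:  # simplex j destroys the class created by simplex `low`
            low = max(col)
            low_to_col[low] = j
            paired.update((low, j))
            pairs.append((len(s) - 2, ordered[low][0], w))
    for j, (w, s) in enumerate(ordered):
        if j not in paired:  # creator that is never destroyed
            pairs.append((len(s) - 1, w, float("inf")))
    return pairs


# Vertices at weight 0, edges at weight 1, the filling triangle at weight 2.
triangle = ([(0.0, (v,)) for v in range(3)]
            + [(1.0, e) for e in [(0, 1), (0, 2), (1, 2)]]
            + [(2.0, (0, 1, 2))])
pairs = persistence_pairs(triangle)
loops = [(b, d) for dim, b, d in pairs if dim == 1]
print(loops)  # [(1.0, 2.0)]: the loop has persistence 2.0 - 1.0 = 1.0
```

In this example the third edge closes a loop at weight 1 and the triangle fills it at weight 2, so the one-dimensional class has persistence 1. Two of the three components merge at weight 1, and one component is never destroyed.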

  27. [Figure: Vietoris–Rips complexes at $\epsilon = 0.35$, $\epsilon = 0.53$, $\epsilon = 0.88$, and $\epsilon = 1.05$] Here, the topological feature is the circle that underlies the data. It persists from $\epsilon = 0.53$ to $\epsilon = 1.05$, so its persistence is: $\operatorname{pers} = 1.05 - 0.53 = 0.52$. In general, a high persistence indicates relevant features.

  28. How to represent topological information? Persistence diagram: Given a topological feature created in $K_i$ and destroyed in $K_j$, add a point with coordinates $(w(K_i), w(K_j))$ to a diagram. This summarizing description is always two-dimensional, regardless of the dimensionality of the input data!

  29. Uses for persistence diagrams: Well-defined distance measures. [Figure: persistence diagrams from the same object] Some noise has been added to the object, resulting in spurious topological features. Large-scale features remain the same, though!

  30. Distance measure: Second Wasserstein distance: $W_2(X, Y) = \sqrt{\inf_{\eta \colon X \to Y} \sum_{x \in X} \| x - \eta(x) \|_\infty^2}$
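
For very small diagrams, the formula can be evaluated literally by trying every bijection $\eta$. The sketch below does exactly that and is only meant to illustrate the definition: it assumes both diagrams have the same number of points, it takes exponential time, and the name `wasserstein_2` is hypothetical. Practical implementations typically also allow points to be matched to the diagonal and solve an assignment problem instead of enumerating permutations.

```python
from itertools import permutations
from math import sqrt


def wasserstein_2(X, Y):
    """Second Wasserstein distance between two equally sized diagrams.

    X and Y are lists of (birth, death) points. All bijections are
    enumerated, so this is only feasible for a handful of points.
    """
    if len(X) != len(Y):
        raise ValueError("this sketch assumes diagrams of equal size")

    def cost(x, y):
        # Squared L-infinity distance between two diagram points.
        return max(abs(x[0] - y[0]), abs(x[1] - y[1])) ** 2

    best = min(sum(cost(x, y) for x, y in zip(X, perm))
               for perm in permutations(Y))
    return sqrt(best)


X = [(0.0, 1.0), (0.0, 2.0)]
Y = [(0.0, 1.5), (0.0, 2.5)]
print(wasserstein_2(X, Y))
```

Here the best matching pairs each point with its shifted copy at cost $0.5^2$ per point, so the distance is $\sqrt{0.5}$.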
