Persistent Homology in Data Science
Stefan Huber <stefan.huber@fh-salzburg.ac.at>
Salzburg University of Applied Sciences, Austria
iDSC 2020¹, 127.0.0.1, May 13, 2020
¹ Not at Dornbirn, Austria due to COVID-19. Partially supported by Digitales Transferzentrum, Salzburg.
Stefan Huber: Persistent Homology in Data Science, 1 of 15
Data has shape
Topological Data Analysis: Often data displays some shape that carries valuable information.
◮ Persistent homology gives us the notion of components, holes, tunnels, cavities, and so on, and quantifies their “significance”.
Fourier analysis : signal ≙ persistent homology : shape
An intuitive approach: Mountains and volcanoes
Let $f\colon [0,1]^2 \to [0,1]$ be in $C^0$, say, a height profile of a geographic map.
What mathematical notion is natural to capture “mountains” or “volcanoes”?
◮ Mountains are local maxima in $f$. Data has noise. How to filter to get “real mountains”?
◮ What about significance, which is not height? What about volcanoes?
Topological evolution
In our simple setting, the method of persistent homology is known as watershed transformation:
◮ The super-level set $U_c$ is the landmass above sea level $c$: $U_c = f^{-1}([c,1]) = \{ x \in [0,1]^2 : f(x) \ge c \}$
◮ $U_c$ grows as $c$ declines, starting at $c = 1$.
Persistent homology keeps track of the topological evolution of $U_c$.
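The growth of $U_c$ can be watched on a small example. The sketch below is only an illustration under stated assumptions: the height profile is sampled on a hypothetical 3×5 grid, grid cells count as adjacent when they share an edge, and the names `count_components` and `height` are made up for this example. It counts the connected pieces of landmass for a declining sea level.

```python
from collections import deque

def count_components(height, c):
    """Count connected components of the super-level set {height >= c}.

    Assumption: cells are neighbours if they share an edge (4-connectivity).
    """
    rows, cols = len(height), len(height[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for q in range(cols):
            if height[r][q] >= c and not seen[r][q]:
                # A new piece of landmass: flood-fill it with a BFS.
                count += 1
                seen[r][q] = True
                queue = deque([(r, q)])
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and not seen[ny][nx] and height[ny][nx] >= c):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return count

# Hypothetical height profile: two peaks (0.9 and 0.6) separated by a saddle (0.3).
height = [
    [0.1, 0.2, 0.1, 0.1, 0.1],
    [0.2, 0.9, 0.3, 0.6, 0.2],
    [0.1, 0.2, 0.1, 0.1, 0.1],
]

for c in (1.0, 0.8, 0.5, 0.3, 0.0):
    print(c, count_components(height, c))
```

In this example the first island appears once $c$ drops below 0.9, a second one appears below 0.6, and the two merge at the saddle height 0.3. These births and merges are the events that persistent homology records.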
General setting
We have a simplicial complex $S$ as underlying space.
An $n$-simplex is the convex hull of $n+1$ affinely independent points: a point ($n=0$), a segment ($n=1$), a triangle ($n=2$), a tetrahedron ($n=3$).
◮ A filtration $(S_i)$ is a sequence of simplicial complexes $\emptyset = S_0 \subset \cdots \subset S_m = S$. Think of $(S_i)$ as iteratively adding simplices.
◮ At each step a feature¹ is born or dies.
◮ The lifespan of a feature (component, hole, ...) is its significance.
¹ Independent classes in the persistent homology group.
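To make "born or dies" concrete, here is a minimal sketch of the standard column reduction of the boundary matrix over $\mathbb{Z}/2$, applied to a filtration that adds one simplex per step. It is an illustration and not taken from the talk: the function name `persistence_pairs` and the encoding of simplices as sorted vertex tuples are assumptions of this example.

```python
def persistence_pairs(filtration):
    """Pair each feature's birth step with its death step.

    filtration: list of simplices (sorted tuples of vertex ids) in insertion
    order; every face of a simplex must appear before the simplex itself.
    Returns (pairs, essential): pairs holds (birth_index, death_index),
    essential holds the indices of simplices whose feature never dies.
    """
    index = {s: i for i, s in enumerate(filtration)}
    reduced = []      # reduced boundary columns, each a set of row indices
    low_to_col = {}   # largest row index of a column -> that column's index
    pairs, paired = [], set()
    for j, simplex in enumerate(filtration):
        # Boundary over Z/2: all faces obtained by dropping one vertex.
        col = set()
        if len(simplex) > 1:
            for k in range(len(simplex)):
                col.add(index[simplex[:k] + simplex[k + 1:]])
        # Add earlier columns (symmetric difference) until the largest
        # row index is unique or the column becomes empty.
        while col and max(col) in low_to_col:
            col ^= reduced[low_to_col[max(col)]]
        reduced.append(col)
        if col:
            i = max(col)
            low_to_col[i] = j
            pairs.append((i, j))   # simplex j kills the feature born at step i
            paired.update((i, j))
    essential = [i for i in range(len(filtration)) if i not in paired]
    return pairs, essential

# A filled triangle, built up vertex by vertex, edge by edge.
triangle = [(0,), (1,), (2,), (0, 1), (1, 2), (0, 2), (0, 1, 2)]
pairs, essential = persistence_pairs(triangle)
print(pairs, essential)
```

For the triangle, the edges (0, 1) and (1, 2) each kill a component, the edge (0, 2) gives birth to a hole, and the 2-simplex kills that hole. The component started by vertex 0 never dies.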
Persistence diagram
We associate a timestamp $t_i \in \mathbb{R}$ to the $i$-th step in the filtration $(S_i)$ with $t_0 \le t_1 \le \cdots \le t_m$.
◮ The multiplicity $\mu_p^{i,j}$, derived from the persistent Betti numbers, counts how many $p$-dimensional features were born at time $t_i$ and died at time $t_j$.
The $p$-th persistence diagram is a summary description:
◮ We place a point $(t_i, t_j)$ with multiplicity $\mu_p^{i,j}$.
◮ Persistence is $t_j - t_i$.
[Figure: persistence diagram with axes birth and death; a point $(t_i, t_j)$ with its persistence marked.]
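As a small illustration of the definition, the following sketch turns birth/death index pairs of one fixed dimension $p$ into a diagram, that is, a multiset of points $(t_i, t_j)$. The function name `persistence_diagram` and the input values are hypothetical and chosen only for this example.

```python
from collections import Counter

def persistence_diagram(pairs, timestamps):
    """Map (birth_index, death_index) pairs of one dimension p to a
    multiset of points (t_i, t_j); the count is the multiplicity."""
    return Counter((timestamps[i], timestamps[j]) for i, j in pairs)

# Hypothetical input: timestamps t_0 <= ... <= t_5 and three paired features.
timestamps = [0.0, 0.0, 0.5, 1.0, 1.0, 2.0]
pairs = [(0, 3), (1, 4), (2, 5)]

diagram = persistence_diagram(pairs, timestamps)
for (birth, death), multiplicity in sorted(diagram.items()):
    print(birth, death, multiplicity, death - birth)
```

Two features share the point (0.0, 1.0), so it carries multiplicity 2 and persistence 1.0. The point (0.5, 2.0) has multiplicity 1 and persistence 1.5.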
Application: Peak detection for signal analysis
The function $P$ stems from a system identification for a closed-loop controller in motion control.
◮ Task: Detect the peak at non-zero frequency, which is the natural frequency of the system.
[Figure: plot of $P$, amplitude over frequency.]
0-th persistence diagram of the super-level set filtration of $P$.
Can be computed in a few dozen lines of code in C, as fast as sorting numbers.
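The 0-th persistence of a sampled one-dimensional signal can be sketched with a sort followed by a union-find sweep, which is why the cost is dominated by sorting. The code below is a Python illustration under stated assumptions, not the C implementation mentioned above: the function name `peak_persistence`, the sample values, and the convention of reporting the global maximum with death `None` are choices made for this example.

```python
def peak_persistence(values):
    """0-th persistence of the super-level set filtration of a 1D signal.

    Returns tuples (peak_index, birth, death), most persistent first.
    The global maximum never dies and is reported with death None.
    """
    n = len(values)
    # Lower the sea level: visit samples from highest to lowest.
    order = sorted(range(n), key=lambda i: values[i], reverse=True)
    rank = {idx: r for r, idx in enumerate(order)}
    parent = [None] * n   # None means "still below sea level"

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    pairs = []   # (peak_index, index of the sample where it dies)
    for i in order:
        parent[i] = i
        for nb in (i - 1, i + 1):
            if 0 <= nb < n and parent[nb] is not None:
                a, b = find(i), find(nb)
                if a == b:
                    continue
                # Elder rule: the component whose peak surfaced later dies.
                if rank[a] > rank[b]:
                    a, b = b, a
                if b != i:   # skip the sample that merely joins a neighbour
                    pairs.append((b, i))
                parent[b] = a   # the root of a component is always its peak
    result = [(order[0], values[order[0]], None)] if n else []
    result += [(p, values[p], values[d]) for p, d in pairs]
    result.sort(key=lambda t: float("inf") if t[2] is None else t[1] - t[2],
                reverse=True)
    return result

# Hypothetical amplitude samples with peaks at indices 1, 3 and 5.
signal = [0.0, 1.0, 0.2, 0.5, 0.1, 1.5, 0.3]
for peak, birth, death in peak_persistence(signal):
    print(peak, birth, death)
```

Thresholding the persistence `birth - death` then separates significant peaks from noise. In this example the peak at index 3 has a much smaller persistence than the peak at index 1, even though both are local maxima.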