Representing Data with Graphs Gregory Clark gclark@math.utk.edu March 12, 2014 Gregory Clark () Representing Data with Graphs March 12, 2014 1 / 16
Introduction
Given a large number of data points sitting in some high-dimensional space, we can use graphs (or generalizations of graphs) in two main ways to help analyze the data:
1. We can represent each data point with a graph-like structure in order to compare data points more easily. (Our main goal may be to reduce the dimension of the data.)
2. We can represent the entire point cloud with a graph-like structure in order to observe features of the shape of the data and visualize it.
http://comptop.stanford.edu/u/preprints/mapper_slides.pdf
Motivation
Neural networks are useful for data processing, but they are like black boxes in that it’s difficult to tell exactly what they are doing. We want to be able to analyze data and understand everything about our results.
http://en.wikipedia.org/wiki/Artificial_neural_network
Comparing 3D Shapes (1)
http://comptop.stanford.edu/u/preprints/mapperPBG.pdf
Comparing 3D Shapes (2)
In order to compare 3D shapes, Singh et al. created a weighted graph from each 3D point cloud by:
- defining a “filter function” to be the distance from the center of the shape,
- covering the range of the filter function with overlapping intervals,
- clustering the preimage of each interval,
- representing each cluster with a vertex, and
- connecting vertices whose clusters overlap.
The resulting graphs were then compared with each other using a sophisticated distance function (the Gromov-Hausdorff distance).
http://comptop.stanford.edu/u/preprints/mapperPBG.pdf
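The five steps of this construction can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used by Singh et al.: the function names, the parameters `n_intervals`, `overlap`, and `eps`, and the choice of single-linkage clustering with a fixed threshold are all assumptions made here, and the vertex weights are omitted.

```python
import math
from itertools import combinations


def single_linkage_clusters(indices, points, eps):
    """Group indices into connected components of the 'closer than eps' graph."""
    unvisited = set(indices)
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        component, stack = {seed}, [seed]
        while stack:
            i = stack.pop()
            near = {j for j in unvisited
                    if math.dist(points[i], points[j]) < eps}
            unvisited -= near
            component |= near
            stack.extend(near)
        clusters.append(frozenset(component))
    return clusters


def mapper_graph(points, n_intervals=4, overlap=0.25, eps=0.5):
    """Return (vertices, edges); each vertex is a frozenset of point indices."""
    # Step 1: filter function = distance from the centroid of the point cloud.
    dim = len(points[0])
    center = [sum(p[k] for p in points) / len(points) for k in range(dim)]
    f = [math.dist(p, center) for p in points]

    # Step 2: cover the range of f with overlapping intervals.
    lo, hi = min(f), max(f)
    width = (hi - lo) / n_intervals
    pad = overlap * width  # hypothetical overlap parameter

    vertices = []
    for k in range(n_intervals):
        a = lo + k * width - pad
        b = lo + (k + 1) * width + pad
        # Step 3: cluster the preimage of the interval [a, b].
        preimage = [i for i, value in enumerate(f) if a <= value <= b]
        # Step 4: every cluster becomes one vertex.
        vertices.extend(single_linkage_clusters(preimage, points, eps))

    # Step 5: connect vertices whose clusters share at least one data point.
    edges = {(u, v) for u, v in combinations(range(len(vertices)), 2)
             if vertices[u] & vertices[v]}
    return vertices, edges


# Usage: evenly spaced points on a line segment in the plane.
segment = [(0.25 * i, 0.0) for i in range(41)]
vertices, edges = mapper_graph(segment)
print(len(vertices), "vertices,", len(edges), "edges")
```

Because neighboring intervals overlap, a point near an interval boundary lands in two clusters, and that shared membership is what produces the edges.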
Diabetes Study
The Miller-Reaven diabetes study discovered that there are two types of diabetes (left), and Singh et al. showed that their graph-based approach reveals the same two “flares” in the data when a Gaussian density estimate is used as the filter function (right).
http://comptop.stanford.edu/u/preprints/mapper_slides.pdf
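A density-based filter function of this kind might look like the sketch below. The kernel form exp(-d(x, y)^2 / eps), the missing normalizing constant, and the names `gaussian_density` and `eps` are assumptions made for illustration; the slides do not specify them.

```python
import math


def gaussian_density(points, eps):
    """Unnormalized Gaussian density estimate at every point of the cloud.

    Each point receives the sum of exp(-d(x, y)^2 / eps) over all points y,
    so points in crowded regions get large values and outliers get small ones.
    """
    return [sum(math.exp(-math.dist(x, y) ** 2 / eps) for y in points)
            for x in points]


# Usage: three nearby points and one isolated point.
cloud = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
density = gaussian_density(cloud, eps=1.0)
print(density)
```

With a filter like this, the dense core of the data and the sparse flares fall into different intervals of the cover.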
The graphs generated above can be viewed as approximations of the Reeb graphs of the data that they came from. They are useful, but graphs are limited in the amount of information they can represent. Suppose our data is spherical. A graph has only vertices and edges, so it cannot record the two-dimensional surface of the sphere; if we reduce the data to a graph, we will lose that information.
Simplices
A simplex is a generalized edge that connects any number of vertices. A 0-simplex contains one vertex, a 1-simplex contains two vertices, and in general an n-simplex contains (n + 1) vertices. For example, a 2-simplex is a triangle and a 3-simplex is a tetrahedron.
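As a small illustration, a simplex can be stored as a set of vertex labels, and its faces are then its nonempty subsets. The helper name `faces` is an assumption made here, not notation from the slides.

```python
from itertools import combinations


def faces(simplex):
    """Return every nonempty subset of the simplex, the simplex included."""
    vertices = sorted(simplex)
    return [frozenset(subset)
            for size in range(1, len(vertices) + 1)
            for subset in combinations(vertices, size)]


# Usage: a 2-simplex (triangle) on the vertices a, b, c.
triangle = {"a", "b", "c"}
for face in faces(triangle):
    print(sorted(face))
```

An n-simplex has n + 1 vertices and therefore 2^(n + 1) - 1 nonempty faces: 7 for a triangle, 15 for a tetrahedron.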
Abstract Simplicial Complexes
We can generalize the notion of a graph using simplices instead of edges.
Definition. A graph is an ordered pair G = (V, E) where V is a set of vertices, and E is a set of 2-element subsets of V.
Definition. An abstract simplicial complex is an ordered pair S = (V, A) where V is a set of vertices, and A is a finite collection of nonempty subsets of V such that α ∈ A and ∅ ≠ β ⊆ α imply β ∈ A.
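The closure condition in the definition is easy to test mechanically. The sketch below is one possible check, assuming simplices are given as sets of vertex labels and that the empty set is ignored; the function names and the example collection are hypothetical.

```python
from itertools import combinations


def missing_faces(collection):
    """Return the nonempty proper faces that the collection fails to contain."""
    simplices = {frozenset(s) for s in collection}
    missing = set()
    for simplex in simplices:
        for size in range(1, len(simplex)):
            for face in combinations(simplex, size):
                if frozenset(face) not in simplices:
                    missing.add(frozenset(face))
    return missing


def is_simplicial_complex(collection):
    """True when every nonempty face of every simplex is in the collection."""
    return not missing_faces(collection)


def closure(collection):
    """Smallest abstract simplicial complex containing the collection."""
    return {frozenset(s) for s in collection} | missing_faces(collection)


# Usage: a lone triangle is not a complex until its edges and vertices are added.
example = [{"x", "y", "z"}]
print(is_simplicial_complex(example))
print(is_simplicial_complex(closure(example)))
```

Adding the missing faces once is enough, because every face of a face of α is already a face of α.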
Why “abstract”?
We often blur the distinction between a graph G = (V, E) and its embedding in R^n.
G = ({a, b, c, d}, {{a, b}, {b, c}, {c, d}})
Similarly, we rarely distinguish between an abstract simplicial complex and its embedding in R^n.
S = ({a, b, c, d}, {{a}, {b}, {c}, {d}, {a, b}, {b, c}, {c, d}, {a, c}, {a, b, c}})
Example of a Simplicial Complex
Complexes in Topology
Simplicial complexes are often used in topology to model well-behaved surfaces and higher-dimensional manifolds. Here is a simplicial complex that is homeomorphic to a torus.
(Figure from Computational Topology by Edelsbrunner and Harer.)
Ayasdi
In 2008, Gunnar Carlsson, Gurjeet Singh, and Harlan Sexton founded the company Ayasdi. This company applies the state of the art in topological data analysis. Here is a video of Gunnar Carlsson talking about topological data analysis:
http://www.youtube.com/embed/XfWibrh6stw
Persistent Homology
While creating a simplicial complex from a point cloud, vertices that are closer than some length δ are connected, and vertices farther apart than δ are not connected. The choice of δ is often arbitrary, and choosing the wrong δ can lead to artificial features. The features that are present for a large range of choices for δ are more likely to be significant features of the data. We can plot the features for varying δ as “barcodes” to see which are significant features and which are artifacts.
(Figure from Topology and Data by Gunnar Carlsson.)
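For the simplest kind of feature, connected components, the barcode can be computed with a union-find structure. The sketch below handles only this dimension-0 case (holes and voids need more machinery), and the name `h0_barcode` is an assumption made here. Every point starts its own component at δ = 0, and a bar ends at the value of δ where its component merges into another.

```python
import math
from itertools import combinations


def h0_barcode(points):
    """Return (birth, death) bars for connected components as delta grows."""
    if not points:
        return []
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the representative, halving paths on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Process all pairs in order of increasing distance (increasing delta).
    pairs = sorted((math.dist(points[i], points[j]), i, j)
                   for i, j in combinations(range(len(points)), 2))
    bars = []
    for length, i, j in pairs:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j          # two components merge here
            bars.append((0.0, length))       # so one bar dies at this delta
    bars.append((0.0, math.inf))             # the last component never dies
    return bars


# Usage: two well-separated groups of three points each.
cloud = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0),
         (10.0, 0.0), (11.0, 0.0), (12.0, 0.0)]
for birth, death in h0_barcode(cloud):
    print(birth, death)
```

Short bars correspond to points merging within a group, while a long finite bar signals that the data splits into separate groups over a wide range of δ.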
References
G. Carlsson. Topology and Data. Bull. Amer. Math. Soc., 46:255-308, 2009.
H. Edelsbrunner and J. L. Harer. Computational Topology: An Introduction. Amer. Math. Soc., 2010.
G. Reeb. Sur les points singuliers d’une forme de Pfaff complètement intégrable ou d’une fonction numérique. C. R. Acad. Sci. Paris, 222:847-849, 1946.
G. Singh, F. Mémoli, and G. Carlsson. Topological Methods for the Analysis of High Dimensional Data Sets and 3D Object Recognition. Point Based Graphics 2007, Prague, September 2007.
A. J. Zomorodian. Topology for Computing. Cambridge University Press, 2005.
Homework
1. Write out the collection of sets that makes up the simplicial complex shown. (The entire tetrahedral simplex {E, F, G, H} is included.)
2. Let S = (V, A) with V = {a, b, c, d, e, f} and A = {{a, b}, {b, c}, {c, d}, {d, f}, {c, d, f}, {a, b, d}}. Explain why S is not an abstract simplicial complex, and then add simplices to A so that S is an abstract simplicial complex.