topology and data: topological data analysis and manifold learning

by Joshua Tan, for Ufora & NYU Capstone, 12/16/2014: a library for topological data analysis and manifold learning

what is… dimensionality reduction
Given some input space X and a sample set S, dimensionality reduction seeks to find a lower-dimensional manifold M s.t. S ⊂ M ⊂ X.
Also known as manifold learning.

examples
❖ Kernel PCA projects up into the feature space, projects down onto the principal components, and ranks them by eigenvalue
❖ Isomap (MDS applied to geodesic distances) embeds high-d points in a low-d space while preserving a dissimilarity (distance) matrix
❖ Projection pursuit projects onto the most “interesting” components according to some objective function
❖ DBSCAN considers not only distances but also “density-reachability” from a cluster
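To make the "preserve a dissimilarity matrix" idea concrete, here is a minimal sketch of classical MDS in NumPy. The function names `classical_mds` and `pairwise_distances` and the toy data are illustrative assumptions, not part of the library described in this talk. Isomap would pass geodesic (graph shortest-path) distances into the same step instead of straight-line distances.

```python
import numpy as np

def classical_mds(D, dim):
    """Embed n points in `dim` dimensions from an n x n distance matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ (D ** 2) @ J              # Gram matrix of the centered points
    w, V = np.linalg.eigh(B)                 # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:dim]          # keep the `dim` largest
    w_top = np.clip(w[idx], 0.0, None)       # guard against tiny negative values
    return V[:, idx] * np.sqrt(w_top)

def pairwise_distances(X):
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Toy data (assumed): 2-d points placed on a flat plane inside R^5.
rng = np.random.default_rng(0)
flat = rng.normal(size=(30, 2))
Q, _ = np.linalg.qr(rng.normal(size=(5, 2)))  # 5 x 2, orthonormal columns
X = flat @ Q.T                                # shape (30, 5)
D = pairwise_distances(X)
Y = classical_mds(D, dim=2)                   # shape (30, 2)
```

Because the toy points lie on a flat 2-d plane, the 2-d embedding can reproduce the original distances; on curved data the straight-line version generally cannot, which is the motivation for geodesic distances.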

Mapper
❖ Like DBSCAN, Mapper is a clustering/dimensionality reduction algorithm based on varying both a distance parameter and a “density” parameter
❖ Unlike DBSCAN, Mapper is designed to be less dependent on the choice of parameters

example: breast cancer, from Nicolau et al. 2011

computing Mapper
1. generate a sample data set as a DataFrame object
2. compute a 1-d dissimilarity matrix of distances
3. evaluate the points using a k-nearest-neighbors (kNN) filter function
4. define a covering of the resulting image
5. use the pre-image of this covering to define a covering of the original data
6. from the covering, generate a clustering of the data
7. visualize the result as a graph
For more complicated filter functions f : X → R^2, the generated graph will be a simplicial complex.
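The steps above can be sketched in plain NumPy. This is an illustrative outline under stated assumptions, not the Fora implementation: the sample is a NumPy array rather than a DataFrame, the distance matrix is square rather than 1-d for readability, the filter is the distance to the k-th nearest neighbor, the cover uses equal-length overlapping intervals, and clustering is connected components at a fixed scale `eps`. All function names and parameter defaults are hypothetical.

```python
import numpy as np
from itertools import combinations

def distance_matrix(X):
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def knn_filter(D, k):
    """Filter value of each point: distance to its k-th nearest neighbor."""
    return np.sort(D, axis=1)[:, k]           # column 0 is the point itself

def interval_cover(values, n_intervals, overlap):
    """Cover [min, max] with equal-length intervals overlapping by `overlap`."""
    lo, hi = float(values.min()), float(values.max())
    length = (hi - lo) / (n_intervals - (n_intervals - 1) * overlap)
    step = length * (1 - overlap)
    cover = [(lo + i * step, lo + i * step + length) for i in range(n_intervals)]
    cover[-1] = (cover[-1][0], hi)            # guard against rounding at the top
    return cover

def components(D, members, eps):
    """Connected components of the eps-neighborhood graph on `members`."""
    unseen, clusters = set(members), []
    while unseen:
        stack, comp = [unseen.pop()], set()
        while stack:
            p = stack.pop()
            comp.add(p)
            near = {q for q in unseen if D[p, q] <= eps}
            unseen -= near
            stack.extend(near)
        clusters.append(frozenset(comp))
    return clusters

def mapper(X, k=5, n_intervals=6, overlap=0.3, eps=0.5):
    D = distance_matrix(X)                                    # step 2
    f = knn_filter(D, k)                                      # step 3
    nodes = []
    for a, b in interval_cover(f, n_intervals, overlap):      # step 4
        hits = np.flatnonzero((f >= a) & (f <= b))            # step 5: pre-image
        nodes.extend(components(D, [int(i) for i in hits], eps))  # step 6
    # step 7: one graph node per cluster, one edge per pair sharing a point
    edges = [(i, j) for i, j in combinations(range(len(nodes)), 2)
             if nodes[i] & nodes[j]]
    return nodes, edges

# Step 1 (assumed toy sample): a noisy circle in the plane.
rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, size=200)
X = np.c_[np.cos(theta), np.sin(theta)] + rng.normal(scale=0.05, size=(200, 2))
nodes, edges = mapper(X)
```

The returned `nodes` are sets of point indices and `edges` are pairs of node indices, so the result can be handed to any graph-drawing tool for the visualization step.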

“connecting” the dots
(figures borrowed from Michael Lesnick, on IAS eNews)

persistent homology
❖ Persistent homology is a technique (read: a technical tool) for computing the “shape” of data sets
❖ In some sense, it is the global counterpart to Mapper

computing persistent homology
❖ Take your point cloud S and turn it into a nested sequence of simplicial complexes, a.k.a. a filtration.
❖ Zomorodian and Carlsson (2004) specify a natural algorithm for computing the homology of a filtered d-dimensional simplicial complex K, assuming we evaluate the homology over a field.
❖ This returns a “persistent bar code”.
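As a rough illustration of how a bar code falls out of a filtration, the sketch below runs the standard boundary-matrix column reduction with coefficients in the field Z/2. It is written in the spirit of the algorithm cited above but is not the Fora library's code; the function name `persistence_pairs`, the input format, and the toy filtration (a triangle whose edges and face appear one at a time) are assumptions for the example.

```python
from itertools import combinations

def persistence_pairs(filtration):
    """Bars (dimension, birth, death) of a filtered simplicial complex.

    `filtration` is a list of (time, simplex) pairs, each simplex a tuple of
    vertices, ordered so that every face appears before its cofaces.
    Homology is computed over Z/2.
    """
    index = {tuple(sorted(s)): i for i, (_, s) in enumerate(filtration)}
    columns = []                              # boundary of each simplex
    for _, s in filtration:
        s = tuple(sorted(s))
        faces = combinations(s, len(s) - 1) if len(s) > 1 else []
        columns.append({index[f] for f in faces})

    pivot_owner = {}                          # lowest row -> owning column
    bars = []
    for j, col in enumerate(columns):
        # Add earlier reduced columns (mod 2) until the pivot is new or col = 0.
        while col and max(col) in pivot_owner:
            col ^= columns[pivot_owner[max(col)]]
        if col:                               # simplex j kills the class born at i
            i = max(col)
            pivot_owner[i] = j
            bars.append((len(filtration[i][1]) - 1,
                         filtration[i][0], filtration[j][0]))

    paired = set(pivot_owner) | set(pivot_owner.values())
    for j, (t, s) in enumerate(filtration):
        if j not in paired:                   # class that never dies
            bars.append((len(s) - 1, t, float("inf")))
    return sorted(bars)

# Toy filtration (assumed): three vertices, then three edges, then the face.
filtration = [
    (0, (0,)), (0, (1,)), (0, (2,)),
    (1, (0, 1)), (2, (1, 2)), (3, (0, 2)),
    (4, (0, 1, 2)),
]
bars = persistence_pairs(filtration)
```

Here `bars` contains two finite 0-dimensional bars (components merging at times 1 and 2), one infinite 0-dimensional bar (the surviving component), and one 1-dimensional bar from time 3 to time 4 (the loop closed by the last edge and filled by the triangle).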

example: natural image statistics
Data from Mumford et al.: 4167 images; randomly sample 5000 patches of 3 pixels by 3 pixels from each image. Take the ones with highest contrast, obtain 8,000,000 points in R^9.
Normalize w.r.t. mean intensity, project onto high-contrast images (those away from the origin). Obtain points on S^7.
M[k,T] is the subset of M in the upper T percent of density as measured by δ_k (the k-nn distance).
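The normalization and the density cut M[k,T] can be sketched as follows. This is a simplified illustration: random vectors stand in for real image patches, contrast is measured with the plain Euclidean norm (the original study may use a different contrast norm), and the names `to_sphere`, `knn_distance`, and `densest_subset` are hypothetical. Subtracting the mean puts each 9-vector in an 8-dimensional subspace, and scaling to unit length places it on S^7. A small δ_k means a dense neighborhood, so the densest T percent are the points with the smallest δ_k.

```python
import numpy as np

def to_sphere(patches):
    """Subtract each patch's mean intensity, then scale to unit norm."""
    centered = patches - patches.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(centered, axis=1, keepdims=True)
    return centered / norms

def knn_distance(X, k):
    """delta_k: distance from each point to its k-th nearest neighbor."""
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))
    return np.sort(D, axis=1)[:, k]           # column 0 is the point itself

def densest_subset(X, k, T):
    """Indices of the T percent of points with the smallest delta_k."""
    delta = knn_distance(X, k)
    keep = max(1, int(round(len(X) * T / 100.0)))
    return np.argsort(delta)[:keep]

rng = np.random.default_rng(0)
patches = rng.normal(size=(500, 9))           # stand-in for 3x3 patches
M = to_sphere(patches)
core = densest_subset(M, k=15, T=25)          # indices of M[15, 25]
```

Varying k changes how local the density estimate is: a small k picks up fine structure, while a large k smooths it out.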

Ufora
❖ Ufora is a data analytics startup based in NYC
❖ For my project, I implemented both the Mapper algorithm and a persistent homology library in their proprietary language, Fora
❖ https://dev.ufora.com/#/projects/mapper/HEAD/mapper

future directions

bibliography
❖ Carlsson, Gunnar. “Topology and data”.
❖ Zomorodian, Afra. “Computing persistent homology”.
❖ Ghrist, Robert. “Barcodes: the persistent homology of data”.
❖ Singh, Gurjeet. “Topological methods for the analysis of high dimensional data sets and 3D object recognition”.
❖ Müllner, Daniel. Python Mapper at danifold.net/mapper
❖ Blum, Avrim. “Thoughts on clustering”.
