Data Mining and Machine Learning: Fundamental Concepts and - PowerPoint PPT Presentation

Data Mining and Machine Learning: Fundamental Concepts and Algorithms
dataminingbook.info

Mohammed J. Zaki (1), Wagner Meira Jr. (2)
(1) Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY, USA
(2) Department of Computer Science, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil

Chapter 6: High-dimensional Data

Zaki & Meira Jr. (RPI and UFMG) Data Mining and Machine Learning Chapter 6: High-dimensional Data 1 / 21

High-dimensional Space

Let $D$ be an $n \times d$ data matrix. In data mining the data is typically very high dimensional. Understanding the nature of high-dimensional space, or hyperspace, is very important, especially because it does not behave like the more familiar geometry in two or three dimensions.

Hyper-rectangle: The data space is a $d$-dimensional hyper-rectangle

$$R_d = \prod_{j=1}^{d} \Big[ \min(X_j), \max(X_j) \Big]$$

where $\min(X_j)$ and $\max(X_j)$ specify the range of $X_j$.

Hypercube: Assume the data is centered, and let $m$ denote the maximum attribute value

$$m = \max_{j=1}^{d} \max_{i=1}^{n} \big\{ |x_{ij}| \big\}$$

The data hyperspace can be represented as a hypercube, centered at $\mathbf{0}$, with all sides of length $l = 2m$, given as

$$H_d(l) = \Big\{ \mathbf{x} = (x_1, x_2, \ldots, x_d)^T \;\Big|\; \forall i,\; x_i \in [-l/2, l/2] \Big\}$$

The unit hypercube has all sides of length $l = 1$, and is denoted as $H_d(1)$.

Hypersphere

Assume that the data has been centered, so that $\boldsymbol{\mu} = \mathbf{0}$. Let $r$ denote the largest magnitude among all points:

$$r = \max_{i} \big\{ \|\mathbf{x}_i\| \big\}$$

The data hyperspace can be represented as a $d$-dimensional hyperball centered at $\mathbf{0}$ with radius $r$, defined as

$$B_d(r) = \big\{ \mathbf{x} \;\big|\; \|\mathbf{x}\| \le r \big\} \quad \text{or} \quad B_d(r) = \Big\{ \mathbf{x} = (x_1, x_2, \ldots, x_d) \;\Big|\; \sum_{j=1}^{d} x_j^2 \le r^2 \Big\}$$

The surface of the hyperball is called a hypersphere, and it consists of all the points exactly at distance $r$ from the center of the hyperball

$$S_d(r) = \big\{ \mathbf{x} \;\big|\; \|\mathbf{x}\| = r \big\} \quad \text{or} \quad S_d(r) = \Big\{ \mathbf{x} = (x_1, x_2, \ldots, x_d) \;\Big|\; \sum_{j=1}^{d} (x_j)^2 = r^2 \Big\}$$
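The hypercube and hyperball definitions can be sketched in a few lines of Python. This is a minimal illustration only: the 4 × 2 data matrix is made up, and the helper names `center`, `hypercube_side`, and `hyperball_radius` are hypothetical, not taken from the slides.

```python
import math

def center(data):
    """Subtract the per-attribute mean so that the data has mean 0."""
    n = len(data)
    d = len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in data]

def hypercube_side(centered):
    """l = 2m, where m is the largest absolute attribute value."""
    m = max(abs(v) for row in centered for v in row)
    return 2 * m

def hyperball_radius(centered):
    """r is the largest Euclidean norm among all (centered) points."""
    return max(math.sqrt(sum(v * v for v in row)) for row in centered)

# Hypothetical 4 x 2 data matrix, made up for illustration.
D = [[1.0, 2.0], [3.0, 0.0], [5.0, 4.0], [3.0, 2.0]]
Z = center(D)
l = hypercube_side(Z)
r = hyperball_radius(Z)
print(f"l = {l}, r = {r:.3f}")
```

Every centered point then lies inside both the hypercube $H_d(l)$ and the hyperball $B_d(r)$, although the two regions generally differ.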

Iris Data Hyperspace: Hypercube and Hypersphere

$l = 4.12$ and $r = 2.19$

[Figure: scatter plot of the centered Iris data with the enclosing hypercube and the hypersphere of radius $r$. Axes: $X_1$: sepal length, $X_2$: sepal width.]

High-dimensional Volumes

Hypercube: The volume of a hypercube with edge length $l$ is given as

$$\text{vol}(H_d(l)) = l^d$$

Hypersphere: The volume of a hyperball and its corresponding hypersphere is identical. The volume of a hypersphere is given as

In 1D: $\text{vol}(S_1(r)) = 2r$
In 2D: $\text{vol}(S_2(r)) = \pi r^2$
In 3D: $\text{vol}(S_3(r)) = \frac{4}{3} \pi r^3$
In $d$ dimensions:

$$\text{vol}(S_d(r)) = K_d \, r^d = \left( \frac{\pi^{d/2}}{\Gamma\left(\frac{d}{2} + 1\right)} \right) r^d$$

where

$$\Gamma\left(\frac{d}{2} + 1\right) = \begin{cases} \left(\frac{d}{2}\right)! & \text{if } d \text{ is even} \\ \sqrt{\pi} \left( \frac{d!!}{2^{(d+1)/2}} \right) & \text{if } d \text{ is odd} \end{cases}$$
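As a quick sanity check, the general formula can be evaluated with the gamma function from Python's standard library and compared against the 1D, 2D, and 3D closed forms. The function names below are illustrative assumptions.

```python
import math

def hypercube_volume(d, l):
    """vol(H_d(l)) = l^d."""
    return l ** d

def hypersphere_volume(d, r):
    """vol(S_d(r)) = K_d * r^d with K_d = pi^(d/2) / Gamma(d/2 + 1)."""
    k_d = math.pi ** (d / 2) / math.gamma(d / 2 + 1)
    return k_d * r ** d

r = 2.0
closed_forms = {
    1: 2 * r,                   # length of the interval [-r, r]
    2: math.pi * r ** 2,        # area of a disk
    3: 4 / 3 * math.pi * r ** 3 # volume of a ball
}
for d, expected in closed_forms.items():
    print(d, hypersphere_volume(d, r), expected)
```

Using `math.gamma` directly avoids having to treat the even and odd cases of $\Gamma(d/2 + 1)$ separately.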

Volume of Unit Hypersphere

With increasing dimensionality the hypersphere volume first increases up to a point, and then starts to decrease, and ultimately vanishes. In particular, for the unit hypersphere with $r = 1$,

$$\lim_{d \to \infty} \text{vol}(S_d(1)) = \lim_{d \to \infty} \frac{\pi^{d/2}}{\Gamma\left(\frac{d}{2} + 1\right)} \to 0$$

[Figure: $\text{vol}(S_d(1))$ plotted against the dimensionality $d$, for $d$ from 0 to 50.]
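The rise and fall of the unit hypersphere volume can be reproduced numerically. The sketch below scans $d = 1, \ldots, 50$ (the range shown on the plot axis) and reports where the volume peaks; the variable names are assumptions made for illustration.

```python
import math

def unit_hypersphere_volume(d):
    """vol(S_d(1)) = pi^(d/2) / Gamma(d/2 + 1)."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)

# Volume for each dimensionality in the plotted range.
volumes = {d: unit_hypersphere_volume(d) for d in range(1, 51)}

# Dimensionality at which the volume is largest.
peak_d = max(volumes, key=volumes.get)
print(f"peak at d = {peak_d}, volume = {volumes[peak_d]:.4f}")
print(f"volume at d = 50: {volumes[50]:.3e}")
```

Over integer dimensions the maximum is reached at $d = 5$, where the volume equals $8\pi^2/15$, after which it decays rapidly toward zero.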

Hypersphere Inscribed within Hypercube

Consider the space enclosed within the largest hypersphere that can be accommodated within a hypercube (which represents the data space). The ratio of the volume of the hypersphere of radius $r$ to the hypercube with side length $l = 2r$ is given as

In 2 dimensions:

$$\frac{\text{vol}(S_2(r))}{\text{vol}(H_2(2r))} = \frac{\pi r^2}{4 r^2} = \frac{\pi}{4} = 78.5\%$$

In 3 dimensions:

$$\frac{\text{vol}(S_3(r))}{\text{vol}(H_3(2r))} = \frac{\frac{4}{3} \pi r^3}{8 r^3} = \frac{\pi}{6} = 52.4\%$$

In $d$ dimensions:

$$\lim_{d \to \infty} \frac{\text{vol}(S_d(r))}{\text{vol}(H_d(2r))} = \lim_{d \to \infty} \frac{\pi^{d/2}}{2^d \, \Gamma\left(\frac{d}{2} + 1\right)} \to 0$$

As the dimensionality increases, most of the volume of the hypercube is in the "corners," whereas the center is essentially empty.
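To see how quickly the inscribed hypersphere loses its share of the hypercube, the ratio can be tabulated for a few dimensionalities. This is a small sketch; the function name `inscribed_ratio` is an assumption. The radius cancels in the ratio, so it does not appear as a parameter.

```python
import math

def inscribed_ratio(d):
    """vol(S_d(r)) / vol(H_d(2r)) = pi^(d/2) / (2^d * Gamma(d/2 + 1))."""
    return math.pi ** (d / 2) / (2 ** d * math.gamma(d / 2 + 1))

for d in (2, 3, 5, 10, 20):
    print(f"d = {d:2d}  ratio = {inscribed_ratio(d):.3e}")
```

The first two rows reproduce the $\pi/4$ and $\pi/6$ values above; by $d = 10$ the inscribed hypersphere already occupies less than 0.3% of the hypercube.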

Hypersphere Inscribed inside a Hypercube

[Figure: a hypersphere of radius $r$ inscribed inside a hypercube; both axes range from $-r$ to $r$.]

Conceptual View of High-dimensional Space

Two, three, four, and higher dimensions: essentially all of the volume of the hyperspace is in the corners, with the center being essentially empty. High-dimensional space looks like a rolled-up porcupine!

[Figure panels: (a) 2D, (b) 3D, (c) 4D, (d) $d$D]

Volume of a Thin Shell

The volume of a thin hypershell of width $\epsilon$ is given as

$$\text{vol}(S_d(r, \epsilon)) = \text{vol}(S_d(r)) - \text{vol}(S_d(r - \epsilon)) = K_d r^d - K_d (r - \epsilon)^d$$

The ratio of the volume of the thin shell to the volume of the outer sphere:

$$\frac{\text{vol}(S_d(r, \epsilon))}{\text{vol}(S_d(r))} = \frac{K_d r^d - K_d (r - \epsilon)^d}{K_d r^d} = 1 - \left( 1 - \frac{\epsilon}{r} \right)^d$$

As $d$ increases, we have

$$\lim_{d \to \infty} \frac{\text{vol}(S_d(r, \epsilon))}{\text{vol}(S_d(r))} = \lim_{d \to \infty} 1 - \left( 1 - \frac{\epsilon}{r} \right)^d \to 1$$

[Figure: hypershell with outer radius $r$, inner radius $r - \epsilon$, and width $\epsilon$.]
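The shell ratio depends only on $\epsilon / r$ and $d$, so it is easy to tabulate. The sketch below uses an arbitrarily chosen shell of width 1% of the radius; the function name and the sample values are assumptions for illustration.

```python
def shell_fraction(d, r, eps):
    """Fraction of vol(S_d(r)) in the outer shell of width eps: 1 - (1 - eps/r)^d."""
    if not 0 <= eps <= r:
        raise ValueError("eps must satisfy 0 <= eps <= r")
    return 1.0 - (1.0 - eps / r) ** d

# Shell whose width is 1% of the radius (values chosen for illustration).
for d in (2, 3, 10, 100, 1000):
    print(f"d = {d:4d}  shell fraction = {shell_fraction(d, r=1.0, eps=0.01):.5f}")
```

In two dimensions the shell holds about 2% of the volume, while in 1000 dimensions the same shell holds more than 99.99% of it.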

Diagonals in Hyperspace

Consider a $d$-dimensional hypercube, with origin $\mathbf{0}_d = (0_1, 0_2, \ldots, 0_d)$, and bounded in each dimension in the range $[-1, 1]$. Each "corner" of the hyperspace is a $d$-dimensional vector of the form $(\pm 1_1, \pm 1_2, \ldots, \pm 1_d)^T$.

Let $\mathbf{e}_i = (0_1, \ldots, 1_i, \ldots, 0_d)^T$ denote the $d$-dimensional canonical unit vector in dimension $i$, and let $\mathbf{1}$ denote the $d$-dimensional diagonal vector $(1_1, 1_2, \ldots, 1_d)^T$.

Consider the angle $\theta_d$ between the diagonal vector $\mathbf{1}$ and the first axis $\mathbf{e}_1$, in $d$ dimensions:

$$\cos \theta_d = \frac{\mathbf{e}_1^T \mathbf{1}}{\|\mathbf{e}_1\| \, \|\mathbf{1}\|} = \frac{\mathbf{e}_1^T \mathbf{1}}{\sqrt{\mathbf{e}_1^T \mathbf{e}_1} \, \sqrt{\mathbf{1}^T \mathbf{1}}} = \frac{1}{\sqrt{1} \, \sqrt{d}} = \frac{1}{\sqrt{d}}$$

Diagonals in Hyperspace

As $d$ increases, we have

$$\lim_{d \to \infty} \cos \theta_d = \lim_{d \to \infty} \frac{1}{\sqrt{d}} \to 0$$

which implies that

$$\lim_{d \to \infty} \theta_d \to \frac{\pi}{2} = 90^\circ$$
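The convergence of $\theta_d$ to $90^\circ$ can be checked by computing the angle directly from the dot product, without assuming the $1/\sqrt{d}$ shortcut. This is a minimal sketch; the function name is an assumption.

```python
import math

def diagonal_angle_degrees(d):
    """Angle (in degrees) between the diagonal vector 1 and the axis e_1 in d dimensions."""
    e1 = [1.0] + [0.0] * (d - 1)   # canonical unit vector along the first axis
    ones = [1.0] * d               # diagonal vector
    dot = sum(a * b for a, b in zip(e1, ones))
    norms = math.sqrt(sum(a * a for a in e1)) * math.sqrt(sum(b * b for b in ones))
    return math.degrees(math.acos(dot / norms))

for d in (2, 3, 10, 100, 10_000):
    print(f"d = {d:6d}  theta = {diagonal_angle_degrees(d):.2f} degrees")
```

The angle is $45^\circ$ in 2D and about $54.7^\circ$ in 3D, and it exceeds $89^\circ$ once $d$ reaches the thousands.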

Angle between Diagonal Vector $\mathbf{1}$ and $\mathbf{e}_1$

[Figure panels: (a) In 2D, (b) In 3D. Each panel shows the diagonal vector $\mathbf{1}$, the axis $\mathbf{e}_1$, and the angle $\theta$ between them, with axes ranging from $-1$ to $1$.]

In high dimensions all of the diagonal vectors are perpendicular (or orthogonal) to all of the coordinate axes! Each of the $2^{d-1}$ new axes connecting pairs of the $2^d$ corners is essentially orthogonal to all of the $d$ principal coordinate axes! Thus, in effect, high-dimensional space has an exponential number of orthogonal "axes."

Density of the Multivariate Normal

Consider the standard multivariate normal distribution with $\boldsymbol{\mu} = \mathbf{0}$ and $\boldsymbol{\Sigma} = \mathbf{I}$

$$f(\mathbf{x}) = \frac{1}{(\sqrt{2\pi})^d} \exp\left\{ -\frac{\mathbf{x}^T \mathbf{x}}{2} \right\}$$

The peak of the density is at the mean. Consider the set of points $\mathbf{x}$ with density at least an $\alpha$ fraction of the density at the mean:

$$\frac{f(\mathbf{x})}{f(\mathbf{0})} \ge \alpha$$

$$\exp\left\{ -\frac{\mathbf{x}^T \mathbf{x}}{2} \right\} \ge \alpha$$

$$\mathbf{x}^T \mathbf{x} \le -2 \ln(\alpha)$$

$$\sum_{i=1}^{d} (x_i)^2 \le -2 \ln(\alpha)$$

The sum of $d$ squared IID standard normal random variables follows a chi-squared distribution with $d$ degrees of freedom, $\chi^2_d$. Thus,

$$P\left( \frac{f(\mathbf{x})}{f(\mathbf{0})} \ge \alpha \right) = F_{\chi^2_d}(-2 \ln(\alpha))$$

where $F_{\chi^2_d}$ is the CDF of the $\chi^2_d$ distribution.
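The probability $F_{\chi^2_d}(-2 \ln \alpha)$ can be evaluated without external libraries. The sketch below is one possible approach, not the method used in the slides: it computes the chi-squared CDF as the regularized lower incomplete gamma function $P(d/2, t/2)$ through a standard recurrence. The function names are assumptions, and the recurrence is only intended for small to moderate $d$ (it loses precision through cancellation and `math.gamma` overflows for very large $d$).

```python
import math

def chi2_cdf(t, d):
    """CDF of the chi-squared distribution with d degrees of freedom at t.

    Uses F(t) = P(d/2, t/2), where P is the regularized lower incomplete
    gamma function, together with the recurrence
        P(a + 1, x) = P(a, x) - x^a * exp(-x) / Gamma(a + 1),
    started from P(1/2, x) = erf(sqrt(x)) or P(1, x) = 1 - exp(-x).
    """
    if t <= 0:
        return 0.0
    x = t / 2.0
    if d % 2 == 1:
        a, p = 0.5, math.erf(math.sqrt(x))
    else:
        a, p = 1.0, 1.0 - math.exp(-x)
    while a < d / 2.0:
        p -= x ** a * math.exp(-x) / math.gamma(a + 1.0)
        a += 1.0
    return p

def mass_above_fraction(alpha, d):
    """P(f(x) / f(0) >= alpha) for the standard d-dimensional normal."""
    return chi2_cdf(-2.0 * math.log(alpha), d)

for d in (1, 2, 3, 10):
    print(f"d = {d:2d}  P = {mass_above_fraction(0.5, d):.5f}")
```

For $\alpha = 0.5$ the probability is about 0.76 in one dimension and exactly 0.5 in two dimensions, and it drops below 0.001 by $d = 10$, so almost all of the probability mass then lies in the low-density tails.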

Density Contour for α Fraction of the Density at the Mean: One Dimension

Let $\alpha = 0.5$, then $-2 \ln(0.5) = 1.386$ and $F_{\chi^2_1}(1.386) = 0.76$. Thus, 24% of the density is in the tail regions.

[Figure: one-dimensional standard normal density for $x$ from $-4$ to $4$, with the $\alpha = 0.5$ threshold marked.]

Density Contour for α Fraction of the Density at the Mean: Two Dimensions

Let $\alpha = 0.5$, then $-2 \ln(0.5) = 1.386$ and $F_{\chi^2_2}(1.386) = 0.50$. Thus, 50% of the density is in the tail regions.

[Figure: two-dimensional standard normal density $f(\mathbf{x})$ over $X_1$ and $X_2$, each from $-4$ to $4$, with the $\alpha = 0.5$ contour marked.]
