Application of Heterogeneous Parallel Computing to EO and Remote Sensing Antonio Plaza, David Valencia, Javier Plaza & Pablo Martínez Department of Technology of Computers and Communications Computer Science Department, University of Extremadura Contact e-mail: aplaza@unex.es URL: http://www.umbc.edu/rssipl/people/aplaza Meeting on Parallel Routines Optimization & Applications, University of Murcia, 12-13 June 2007
Talk outline
• Introduction to EO & remote sensing
• Detection algorithms
• Classification algorithms
• Heterogeneous implementations
• Use of HeteroMPI
• Conclusions
• Future lines
JRCA 2006
Levels of information in EO & RS
• Remote sensing technology has evolved from panchromatic and multispectral data, with only a few bands, to hyperspectral imagery with hundreds of bands.
• The evolution in sensor technology has introduced changes in algorithm design:
  - Quantification: determines the abundance of materials (e.g. chemical/biological).
  - Identification: determines the unique identity of the foregoing generic categories (i.e. material identification).
  - Discrimination: determines generic categories of the foregoing classes.
  - Classification: separates materials into spectrally similar groups.
  - Detection: determines the presence of materials, objects, activities, or events.
[Figure: sensor types by spectral resolution: panchromatic, multispectral (10’s of bands), hyperspectral (100’s or 1000’s of bands).]
Hyperspectral imaging concept
• One of the most relevant problems is the presence of mixed pixels (in which several substances may be present at sub-pixel levels).
[Figure: three reflectance spectra (reflectance vs. wavelength, 300-2400 nm): a mixed pixel (soil + rocks), a pure pixel (water), and a mixed pixel (vegetation + soil).]
Hyperspectral applications
• Hyperspectral image processing algorithms are computationally very expensive.
• High computing performance is essential in many applications (environmental monitoring, fire tracking, chemical and biological detection, target detection in military applications, etc.).
[Figures: AVIRIS scene over New York WTC; debris and dust map (USGS).]
IEEE International Conference on Cluster Computing – HeteroPar’2006, Barcelona
Why heterogeneous computing?
Problems:
• High computational complexity of data processing algorithms.
• Large amounts of collected hyperspectral data are never used:
  - Analyses and information mining should be conducted in reasonable processing times.
  - Results might allow for the extraction of relevant knowledge (e.g. spectral libraries).
Solutions:
• High-performance computers at low cost.
• Commodity computers made up of off-the-shelf, low-cost computing components.
• Networks of workstations interconnecting distributed platforms (Grid computing).
Applications:
• Data mining and information extraction from large data repositories.
Classic analysis methodology
• The standard analysis methodology relies on the following steps:
[Figure: diagram of the analysis steps.]
Detection algorithms
• One of the most robust sub-pixel analysis techniques consists of extracting extreme “pure” pixels (endmembers) and then modeling mixed pixels as combinations of pure spectral signatures:
$$ s = \sum_{i=1}^{3} c_i \cdot e_i + \varepsilon $$
[Figure: scatter plot of pixels in the Band i / Band j plane, with endmembers e_1, e_2, and e_3.]
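The mixture equation above can be illustrated with a minimal Python sketch. The 4-band signatures and abundances below are hypothetical values chosen for illustration, and the assumption that the abundances sum to one is a common convention rather than something stated on the slide.

```python
def mix(endmembers, abundances, noise=None):
    """Return s = sum_i c_i * e_i + eps for a single pixel (band by band)."""
    bands = len(endmembers[0])
    if noise is None:
        noise = [0.0] * bands  # assumption: noise-free pixel unless eps is given
    return [
        sum(c * e[b] for c, e in zip(abundances, endmembers)) + noise[b]
        for b in range(bands)
    ]

# Hypothetical 4-band endmember signatures (reflectance units are illustrative).
e1 = [1000.0, 2000.0, 3000.0, 4000.0]
e2 = [4000.0, 3000.0, 2000.0, 1000.0]
e3 = [500.0, 500.0, 500.0, 500.0]

# Hypothetical abundances: half e1, half e2, no e3.
c = [0.5, 0.5, 0.0]
s = mix([e1, e2, e3], c)
```

Unmixing is the inverse problem: given s and the endmembers, estimate the abundances c_i.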
Pixel purity index (PPI)
• The PPI is one of the most popular endmember detection algorithms (available in Kodak’s Research Systems ENVI software).
[Figure: pixels projected onto Skewer 1, Skewer 2, and Skewer 3; the pixels at the ends of each projection are marked as extreme pixels.]
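The skewer idea can be sketched in a few lines of Python. This is an illustrative outline, not the ENVI implementation: the function name `ppi_counts`, the Gaussian skewer generation, and the toy 2-band pixels are assumptions made for the example.

```python
import random

def ppi_counts(pixels, num_skewers, seed=0):
    """Count how often each pixel is an extreme of a random projection."""
    rng = random.Random(seed)
    bands = len(pixels[0])
    counts = [0] * len(pixels)
    for _ in range(num_skewers):
        # Random direction ("skewer"); its length does not change which pixels are extreme.
        skewer = [rng.gauss(0.0, 1.0) for _ in range(bands)]
        proj = [sum(p[b] * skewer[b] for b in range(bands)) for p in pixels]
        counts[proj.index(min(proj))] += 1
        counts[proj.index(max(proj))] += 1
    return counts

# Toy 2-band data: three "pure" pixels and one mixture lying inside their triangle.
pixels = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.3, 0.3]]
counts = ppi_counts(pixels, num_skewers=200)
```

Pixels with high counts are endmember candidates; the interior (mixed) pixel is never selected as an extreme.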
Morphological classification
• Mathematical morphology is a well-established technique in the spatial domain that can be extended to the spectral domain.
• It relies on a (partial) ordering relationship between the pixels of the image, and on the application of a so-called structuring element.
[Figure: a 3x3 structuring element B defines the neighborhood around pixel P of the original grayscale image f(x,y); dilation takes the maximum (Max) and erosion the minimum (Min) over that neighborhood.]
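A minimal sketch of grayscale erosion and dilation with a flat 3x3 structuring element follows. Clipping the window at the image border is an assumption made to keep the example short; other border policies are possible.

```python
def neighborhood(img, x, y):
    """Values inside the 3x3 window centred on (x, y), clipped at the image border."""
    rows, cols = len(img), len(img[0])
    return [
        img[i][j]
        for i in range(max(0, x - 1), min(rows, x + 2))
        for j in range(max(0, y - 1), min(cols, y + 2))
    ]

def erode(img):
    """Grayscale erosion: each pixel becomes the minimum of its neighborhood."""
    return [[min(neighborhood(img, x, y)) for y in range(len(img[0]))]
            for x in range(len(img))]

def dilate(img):
    """Grayscale dilation: each pixel becomes the maximum of its neighborhood."""
    return [[max(neighborhood(img, x, y)) for y in range(len(img[0]))]
            for x in range(len(img))]

# Toy image with one bright pixel.
img = [
    [1, 1, 1, 1],
    [1, 9, 1, 1],
    [1, 1, 1, 1],
]
eroded = erode(img)    # the bright pixel disappears
dilated = dilate(img)  # the bright pixel spreads to its 3x3 neighborhood
```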
Morphological filtering
• Morphological opening (erosion + dilation).
[Figure: opening with structuring element K.]
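As a sketch of the opening operation, the snippet below applies an erosion followed by a dilation with a flat 3x3 structuring element. The helper `morph` and the toy image are hypothetical; the window is clipped at the border for brevity.

```python
def morph(img, op):
    """Apply op (min for erosion, max for dilation) over each clipped 3x3 window."""
    rows, cols = len(img), len(img[0])
    return [
        [
            op(
                img[i][j]
                for i in range(max(0, x - 1), min(rows, x + 2))
                for j in range(max(0, y - 1), min(cols, y + 2))
            )
            for y in range(cols)
        ]
        for x in range(rows)
    ]

def opening(img):
    """Opening = erosion followed by dilation."""
    return morph(morph(img, min), max)

# Toy image: a 3x3 bright block (value 5) and an isolated bright pixel (value 9).
img = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 5, 5, 5, 0, 0, 0, 0],
    [0, 5, 5, 5, 0, 0, 9, 0],
    [0, 5, 5, 5, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
opened = opening(img)
```

In this toy case the opening removes the isolated pixel, which is smaller than the structuring element, and keeps the 3x3 block unchanged.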
Extended math morphology
• Extended mathematical morphology allows for spatial/spectral integration:
$$ f : Z^2 \rightarrow Z^N $$
$$ (f \oplus K)(x,y) = \operatorname{arg\_Max}_{(s,t) \in Z^2(K)} \{ D(f(x,y)) \} $$
$$ (f \otimes K)(x,y) = \operatorname{arg\_Min}_{(s,t) \in Z^2(K)} \{ D(f(x,y)) \} $$
[Figure: pixels labeled 100% vegetation, 50% vegetation + 50% soil, and 100% soil; labels MEI, +, -.]
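The slide does not define the distance D. The sketch below assumes, for illustration only, that D is the cumulative spectral angle between a pixel vector and all pixel vectors in the 3x3 window, so that extended dilation picks the most spectrally distinct vector and extended erosion the most mixed one. The function names and the 2-band signatures are hypothetical.

```python
import math

def sad(a, b):
    """Spectral angle (radians) between two pixel vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return math.acos(max(-1.0, min(1.0, dot / norm)))

def window(img, x, y):
    """Pixel vectors in the 3x3 window around (x, y), clipped at the border."""
    rows, cols = len(img), len(img[0])
    return [
        img[i][j]
        for i in range(max(0, x - 1), min(rows, x + 2))
        for j in range(max(0, y - 1), min(cols, y + 2))
    ]

def cumulative_distance(pixel, neigh):
    """Assumed form of D: sum of spectral angles to every vector in the window."""
    return sum(sad(pixel, other) for other in neigh)

def extended_dilation(img, x, y):
    neigh = window(img, x, y)
    return max(neigh, key=lambda p: cumulative_distance(p, neigh))

def extended_erosion(img, x, y):
    neigh = window(img, x, y)
    return min(neigh, key=lambda p: cumulative_distance(p, neigh))

# Hypothetical 2-band signatures echoing the slide's labels.
veg = [0.1, 0.9]
soil = [0.6, 0.4]
mixed = [0.5 * v + 0.5 * s for v, s in zip(veg, soil)]
img = [[veg, mixed, soil]]  # 1 line x 3 samples

dilated = extended_dilation(img, 0, 1)  # a pure vector
eroded = extended_erosion(img, 0, 1)    # the mixed vector
```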
Data partitioning strategies
• Spectral-domain partitioning: A single pixel vector (spectral signature) may be stored in different processing units, and communications would be required for individual pixel-based calculations such as those in the PPI algorithm.
• Spatial-domain partitioning: A pixel vector (spectral signature) is always stored in the same processing unit. As a result, the entire spectral signature of each hyperspectral image pixel is never partitioned, thus reducing the cost of inter-processor communications.
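The difference between the two strategies can be seen on a toy data cube. This sketch assumes a cube stored as nested lists indexed as cube[line][sample][band]; the function names and the cube dimensions are hypothetical.

```python
def spatial_partition(cube, workers):
    """Split the cube into contiguous blocks of lines; every spectrum stays whole."""
    lines = len(cube)
    bounds = [lines * w // workers for w in range(workers + 1)]
    return [cube[bounds[w]:bounds[w + 1]] for w in range(workers)]

def spectral_partition(cube, workers):
    """Split the cube by bands; every pixel's spectrum is spread over the workers."""
    bands = len(cube[0][0])
    bounds = [bands * w // workers for w in range(workers + 1)]
    return [
        [[pixel[bounds[w]:bounds[w + 1]] for pixel in line] for line in cube]
        for w in range(workers)
    ]

# Hypothetical cube: 4 lines x 2 samples x 6 bands.
cube = [
    [[line * 100 + sample * 10 + band for band in range(6)] for sample in range(2)]
    for line in range(4)
]
spatial = spatial_partition(cube, 2)    # each worker: 2 lines, full 6-band spectra
spectral = spectral_partition(cube, 2)  # each worker: all 4 lines, 3 bands per pixel
```

With the spectral split, any per-pixel computation over the full signature needs data from both workers, which is the communication cost the slide refers to.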
Spatial-domain partitioning
[Figure: the original image is scattered into PSSP 1 and PSSP 2; processing nodes #1 and #2 each apply the 3x3 SE to produce MEI 1 and MEI 2; the results are gathered into the MEI classification map.]
Parallel implementation of MM
Handling communications:
(1) Communication is needed when the structuring element is centered on a border pixel of a local partition.
(2) An overlapping scatter reduces the communication cost for small structuring element sizes (the proposed classification algorithm is based on a constant 3x3 structuring element).
[Figure: two adjacent partitions (indexed i and i+1) and pixel f(5,3): (1) border-pixel data needed by a 3x3 kernel; (2) overlapping scatter for a 3x3 kernel.]
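One way to picture the overlapping scatter is to compute, for each worker, both the lines it owns and the slightly larger range it receives. The sketch below is an assumption-laden illustration (row-wise partitions, a hypothetical `overlapping_scatter` helper, a halo of se_size // 2 lines), not the authors' implementation.

```python
def overlapping_scatter(num_lines, workers, se_size=3):
    """Return, per worker, the owned line range and the range sent with overlap."""
    halo = se_size // 2  # extra lines needed on each side of a partition
    bounds = [num_lines * w // workers for w in range(workers + 1)]
    parts = []
    for w in range(workers):
        start, stop = bounds[w], bounds[w + 1]
        parts.append({
            "own": (start, stop),  # lines whose results this worker produces
            "sent": (max(0, start - halo), min(num_lines, stop + halo)),
        })
    return parts

parts = overlapping_scatter(num_lines=10, workers=2)
```

Each worker can then process its border pixels without further messages, at the price of sending a few redundant lines in the initial scatter.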
Definition of benchmark function
Definition of a performance model for the morphological processing algorithm (mpC):

    algorithm MM_perf(int m, int n, int se_size, int iter, int p, int q, int partition_size[p*q]) {
        coord I = p, J = q;
        node { I >= 0 && J >= 0: benchmark * (partition_size[I*q+J] * iter); };
        parent[0,0];
    }

• Parameter m specifies the number of samples of the data cube.
• Parameter n specifies the number of lines.
• Parameters se_size and iter respectively denote the size of the SE and the number of iterations executed by the algorithm.
• Parameters p and q indicate the dimensions of the computational grid (in columns and rows, respectively), which are used to map the spatial coordinates of the individual processors within the processor grid layout.
• Finally, parameter partition_size is an array that indicates the size of the local PSSPs (calculated automatically from the relative estimated computing power of the heterogeneous processors, as measured by the benchmark function).
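To make the role of partition_size concrete, the following Python sketch distributes image lines in proportion to relative processor speeds. It is only an illustration of the idea: the function `partition_sizes`, the speed values, and the largest-remainder rounding are assumptions, not the procedure used by mpC or HeteroMPI.

```python
def partition_sizes(num_lines, speeds):
    """Assign lines to processors in proportion to their relative speeds."""
    total = sum(speeds)
    exact = [num_lines * s / total for s in speeds]
    sizes = [int(e) for e in exact]  # round down first
    leftover = num_lines - sum(sizes)
    # Hand the remaining lines to the processors with the largest fractional parts.
    by_remainder = sorted(range(len(speeds)),
                          key=lambda i: exact[i] - sizes[i], reverse=True)
    for i in by_remainder[:leftover]:
        sizes[i] += 1
    return sizes

# Hypothetical benchmark results: the third processor is twice as fast as the others.
sizes = partition_sizes(100, [1.0, 1.0, 2.0])
uneven = partition_sizes(10, [1.0, 2.0])
```

A faster processor receives a proportionally larger partition, so that all processors are expected to finish their iterations at about the same time.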