 
SPATIAL SEARCHES IN ASTRONOMY DATABASES
MULTI-DIMENSIONAL INDEXING FOR SIMULATIONS AND OBSERVATIONS
Tamás Budavári (Johns Hopkins University)
7/16/2012
Storing Simulations
  • Millennium Run (MPA)
    • 10 billion particles, 64 snapshots
    • FoF groups and merger trees
  • Millennium XXL
    • 300 billion particles
  • MultiDark – Bolshoi
  • Turbulence simulations (JHU)
    • 1024^4 grid, 27 TB
  Kai Bürger (TUM, JHU)
  ISSAC at HiPACC 7/16/2012
Observing Simulations
  • Comparison to real observations
  • Lots of spatial searches
  • In the database?
Sky Coverage
  • For precise window function
  • Virtual surveys
Outline
  • Query shapes in SQL
  • Indexing with space-filling curve
  • Combine for spatial searches
    • Periodic boxes
    • Celestial sphere
Databases
  • Which one to use depends on the task
    • SQLite, MySQL, PostgreSQL, DB2, Oracle, SQL Server
    • Free “express versions” of the big ones, too
  • Customization is a must
    • There is always something missing
    • Extend by loading your libraries
Query Shapes
  • IShape interface
      TopoPoint Contains(Point p);
      TopoShape GetTopo(Box b);
      Box GetBoundingBox();
  • Geometric primitives
    • Sphere, Box, Cone…
  • Composites
    • Intersect, Union, Difference…
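The shape interface and its composites can be illustrated with a minimal Python sketch. This is not the talk's C# code: the class names mirror the slide, but `contains` here returns a plain boolean (the slide's `Contains` returns a `TopoPoint`), and `GetTopo` and `GetBoundingBox` are left out. Everything below is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Protocol, Tuple

Point = Tuple[float, float, float]


class Shape(Protocol):
    """Anything that can answer a point-containment query."""

    def contains(self, p: Point) -> bool: ...


@dataclass(frozen=True)
class Sphere:
    center: Point
    radius: float

    def contains(self, p: Point) -> bool:
        # Compare squared distances to avoid a square root.
        d2 = sum((a - b) ** 2 for a, b in zip(p, self.center))
        return d2 <= self.radius ** 2


@dataclass(frozen=True)
class Box:
    lo: Point
    hi: Point

    def contains(self, p: Point) -> bool:
        return all(l <= x <= h for l, x, h in zip(self.lo, p, self.hi))


@dataclass(frozen=True)
class Intersect:
    a: Shape
    b: Shape

    def contains(self, p: Point) -> bool:
        return self.a.contains(p) and self.b.contains(p)


@dataclass(frozen=True)
class Union:
    a: Shape
    b: Shape

    def contains(self, p: Point) -> bool:
        return self.a.contains(p) or self.b.contains(p)


@dataclass(frozen=True)
class Difference:
    a: Shape
    b: Shape

    def contains(self, p: Point) -> bool:
        return self.a.contains(p) and not self.b.contains(p)


# A spherical shell, then the part of it inside one octant.
origin = (0.0, 0.0, 0.0)
shell = Difference(Sphere(origin, 2.0), Sphere(origin, 1.0))
octant = Intersect(shell, Box(origin, (2.0, 2.0, 2.0)))
```

Because composites are shapes themselves, they nest freely, which is what makes a small set of primitives sufficient for fairly complex queries.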
Query Shapes
  • In SQL: UDT (user-defined type)
  • Generic UDT
  • Boolean methods
Indexing Tables
  • Better performance of queries
    • Instantaneous range searches
    • Fast JOINs
  • Syntax
      CREATE INDEX ix_Name ON Table (X ASC, …) INCLUDE (V, …)
Multi-Dimensional
  • Map the space to a simple index
  • Different kinds of Space-Filling Curves
    • Morton’s Z-curve
    • Peano-Hilbert Curve
Peano-Hilbert Curve
  • Hierarchical space filling
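For readers without the figures, the hierarchical construction can be sketched in code. The function below is the widely published iterative 2D mapping from cell coordinates to a Hilbert key; the talk does not say which formulation it uses, so treat the function name and the 2D restriction as assumptions.

```python
def hilbert_key(n, x, y):
    """Map cell (x, y) on an n-by-n grid (n a power of two) to its Hilbert key."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        # Which quadrant we are in at this level decides the key digit.
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/reflect so the sub-quadrant looks like the base pattern.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


# Walk an 8x8 grid in key order.
n = 8
cells = [(x, y) for x in range(n) for y in range(n)]
path = sorted(cells, key=lambda c: hilbert_key(n, c[0], c[1]))
```

Sorting cells by key gives a path in which every step moves to a neighboring cell, which is the locality property that makes the curve useful as a one-dimensional index.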
Also others…
  • Morton Z-order
    • Simple bit interleave
  • Etc…
  • Which one to use?
    • Statistical analyses
    • Correlation function
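The "simple bit interleave" behind the Morton Z-order can be written in a few lines. This is a sketch for 3D integer cell coordinates; the function names, the bit order (x in the lowest bit) and the default of 10 bits per axis are assumptions, not details from the talk.

```python
def morton3(x, y, z, bits=10):
    """Interleave the bits of x, y, z into one Z-order key (x is the lowest bit)."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (3 * i)
        key |= ((y >> i) & 1) << (3 * i + 1)
        key |= ((z >> i) & 1) << (3 * i + 2)
    return key


def morton3_inverse(key, bits=10):
    """Recover (x, y, z) from a key produced by morton3."""
    x = y = z = 0
    for i in range(bits):
        x |= ((key >> (3 * i)) & 1) << i
        y |= ((key >> (3 * i + 1)) & 1) << i
        z |= ((key >> (3 * i + 2)) & 1) << i
    return x, y, z
```

Unlike the Peano-Hilbert curve, consecutive Z-order keys can jump between distant cells, but every aligned octant still maps to one contiguous key range, which is what range searches need.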
Divide and Conquer
Covers for Shapes
  • Inside approximation
  • Outside overshoot
  • They are Key ranges
    • Key between 0 and 3
    • Key between 0 and 7
    • Key between 0 and 3 or Key between 8 and 11
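How a shape turns into key ranges can be sketched with a small quadtree recursion. The code assumes a 2D Z-order layout with x in the low bit, a hypothetical `classify` callback standing in for the `GetTopo(Box)` idea, and an axis-aligned rectangle as the query shape. Under those assumptions it reproduces the key ranges quoted on the slide, though the slide's actual figure and curve may differ.

```python
INSIDE, OUTSIDE, PARTIAL = "inside", "outside", "partial"


def rect_classifier(qx0, qy0, qx1, qy1):
    """Classify a square cell against the query rectangle [qx0,qx1) x [qy0,qy1)."""
    def classify(x0, y0, size):
        x1, y1 = x0 + size, y0 + size
        if x1 <= qx0 or x0 >= qx1 or y1 <= qy0 or y0 >= qy1:
            return OUTSIDE
        if qx0 <= x0 and x1 <= qx1 and qy0 <= y0 and y1 <= qy1:
            return INSIDE
        return PARTIAL
    return classify


def merge(ranges):
    """Join key ranges that touch; input must be sorted by key."""
    out = []
    for lo, hi in ranges:
        if out and lo == out[-1][1] + 1:
            out[-1] = (out[-1][0], hi)
        else:
            out.append((lo, hi))
    return out


def cover(classify, depth, max_level):
    """Return (inner, outer) key ranges; keys have `depth` levels of resolution."""
    inner, outer = [], []

    def visit(x0, y0, size, key_lo, level):
        topo = classify(x0, y0, size)
        if topo == OUTSIDE:
            return
        n_keys = 4 ** (depth - level)
        rng = (key_lo, key_lo + n_keys - 1)
        if topo == INSIDE:
            inner.append(rng)
            outer.append(rng)
        elif level == max_level:
            outer.append(rng)  # boundary cell: only the outer cover keeps it
        else:
            half = size // 2
            for q in range(4):  # children in Z-order, so ranges stay sorted
                dx, dy = q & 1, q >> 1
                visit(x0 + dx * half, y0 + dy * half, half,
                      key_lo + q * (n_keys // 4), level + 1)

    visit(0, 0, 2 ** depth, 0, 0)
    return merge(inner), merge(outer)


# 4x4 grid (depth 2): lower-left quadrant, lower half, left half.
quadrant = cover(rect_classifier(0, 0, 2, 2), depth=2, max_level=2)
lower_half = cover(rect_classifier(0, 0, 4, 2), depth=2, max_level=2)
left_half = cover(rect_classifier(0, 0, 2, 4), depth=2, max_level=2)
# A thin strip, refined only one level: nothing fully inside, outer overshoots.
coarse = cover(rect_classifier(0, 0, 3, 1), depth=2, max_level=1)
```

The inner ranges can be accepted without further testing, while rows found only in the outer ranges still need the exact containment test.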
Periodic Boundaries
  • Infinite with periodicity
  • Have to search all boxes
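In practice only the periodic images that actually touch the primary box need to be searched. The sketch below assumes a cubic box of side `box` with particles stored in [0, box) and a spherical query; it lists the shift vectors to apply to the query before searching. The function name and the per-axis bounding-box test are illustrative choices, not the talk's implementation.

```python
import itertools
import math


def periodic_shifts(center, radius, box):
    """Shifts (multiples of the box size) for which the shifted query sphere
    still overlaps the primary box [0, box) along every axis."""
    per_axis = []
    for c in center:
        # The shifted extent [c + k*box - r, c + k*box + r] must overlap (0, box).
        k_min = math.floor((-radius - c) / box) + 1
        k_max = math.ceil((box + radius - c) / box) - 1
        per_axis.append([k * box for k in range(k_min, k_max + 1)])
    return list(itertools.product(*per_axis))


# A sphere poking through the x = 0 face of a 100-unit box.
shifts = periodic_shifts(center=(5.0, 50.0, 50.0), radius=10.0, box=100.0)
```

Each returned shift is added to the query center and the shifted shape is searched like any other; a sphere far from every face yields only the zero shift.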
Searching in SQL
  • Key filter
    • By Cover
  • ShiftX,-Y,-Z
  • Where?
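The two-step pattern (a coarse key filter from the cover, then an exact geometric test) can be tried with Python's built-in `sqlite3` module. The table name, column names, the 4x4 toy grid and the hard-coded outer cover are all assumptions made for this sketch; the talk's own queries run on a different database with user-defined types.

```python
import sqlite3


def morton2(ix, iy, bits=2):
    """2D Z-order key of an integer cell (x in the low bit)."""
    key = 0
    for i in range(bits):
        key |= ((ix >> i) & 1) << (2 * i)
        key |= ((iy >> i) & 1) << (2 * i + 1)
    return key


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE particles (key INTEGER, x REAL, y REAL)")
# One particle at the center of every unit cell of a 4x4 box.
rows = [(morton2(i, j), i + 0.5, j + 0.5) for i in range(4) for j in range(4)]
conn.executemany("INSERT INTO particles VALUES (?, ?, ?)", rows)
conn.execute("CREATE INDEX ix_particles_key ON particles (key)")


def search(conn, key_ranges, cx, cy, r):
    """Key filter by cover ranges, then the exact circle test."""
    key_filter = " OR ".join(["key BETWEEN ? AND ?"] * len(key_ranges))
    sql = (
        f"SELECT x, y FROM particles WHERE ({key_filter}) "
        "AND (x - ?) * (x - ?) + (y - ?) * (y - ?) <= ? * ? "
        "ORDER BY key"
    )
    params = [k for rng in key_ranges for k in rng] + [cx, cx, cy, cy, r, r]
    return conn.execute(sql, params).fetchall()


# Circle of radius 0.7 around (0.9, 0.9): it fits in the lower-left
# quadrant, whose outer cover is the single key range 0..3.
outer_cover = [(0, 3)]
candidates = conn.execute(
    "SELECT COUNT(*) FROM particles WHERE key BETWEEN 0 AND 3"
).fetchone()[0]
hits = search(conn, outer_cover, 0.9, 0.9, 0.7)
```

The index narrows sixteen rows to four candidates, and the exact test keeps the one particle that is really inside the circle.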
Real!
  • E.g.,
Online Interfaces
  • Largest simulations
    • Search and visualize
    • 10 billion+ objects and growing…
  • Indra 512 simulations
    • Coming soon at JHU
Millennium XXL
Web Services
  • Programming interfaces
    • Execute SQL queries
    • Most flexible
  • Inject probes in simulations
    • Turbulence
    • Cosmology
Sky Coverage
No Sky Coverage?
Spherical Geometry
  • Green area: A ∩ (B − ε)
    • Should find B if it contains an A and is not masked
  • Yellow area: A ∩ (B ± ε)
    • An edge case: may find B if it contains an A
Approaches to Consider
  • Pixel maps
    • Sensitivity, etc…
  • Equations of shapes
    • Spherical “vector graphics”
  • And beyond…
An Observation
  • FITS header with WCS
    • Image dimensions map to the geometry
  • More exposures?
    • No common pixel coordinate-system
    • Overlapping areas
Common Pixels
  • Pre-defined pages of an atlas
    • Standard in cartography
  • Image pyramids of hierarchical pixels
    • Including HTM, Igloo, HEALPix, SDSSPix, etc…
  • Always approximate!
Practical Implementation
  • Looking at Terapixels
    • We know how to work with images
    • Now have commodity Internet
    • We have cheap hard-drives
  • WorldWideTelescope.org
  • Sky in Google Earth
  • Integrated catalogs for efficiency
    • How about more surveys?
Drawing with Equations
  • Working with 3D normal vectors
  • Benefits include
    • No wraparound
    • No projections
    • No singularities
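The "no wraparound, no singularities" benefit is easy to check numerically. The sketch below converts (RA, Dec) in degrees to unit vectors and measures separations with an `atan2` formula; the helper names are hypothetical and the formula is a common textbook choice, not necessarily the one used in the talk's library.

```python
import math


def radec_to_xyz(ra_deg, dec_deg):
    """Unit vector for a sky position given in degrees."""
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    return (math.cos(dec) * math.cos(ra),
            math.cos(dec) * math.sin(ra),
            math.sin(dec))


def separation_deg(a, b):
    """Angle between two unit vectors, in degrees."""
    dot = sum(p * q for p, q in zip(a, b))
    cross = (a[1] * b[2] - a[2] * b[1],
             a[2] * b[0] - a[0] * b[2],
             a[0] * b[1] - a[1] * b[0])
    # atan2 of |a x b| and a . b stays accurate for tiny and near-180 angles.
    return math.degrees(math.atan2(math.sqrt(sum(c * c for c in cross)), dot))


# Two points straddling RA = 0: a naive RA difference would say 359 degrees.
across_zero = separation_deg(radec_to_xyz(359.5, 0.0), radec_to_xyz(0.5, 0.0))
# At the pole, RA is meaningless, and the vectors agree on that.
at_pole = separation_deg(radec_to_xyz(0.0, 90.0), radec_to_xyz(180.0, 90.0))
```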
Drawing with Equations
  • Direct 3D approach
  • Halfspace
    • Circle/Cap
  • Convex
    • Simple shapes
  • Region
    • Unions of convexes
    • Patches on the sphere
Point in Region Test
  • Halfspace: one side of a plane $(\vec{n}, c)$
    • Inside, when $\vec{n} \cdot \vec{x} > c$
  • Convex: a collection of halfspaces
    • Inside, when inside all halfspaces
  • Region: a collection of convexes
    • Inside, when inside any convex
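The three nested tests translate almost directly into code. In this sketch a halfspace is a `(normal, c)` pair, a convex is a list of halfspaces, and a region is a list of convexes. The strict `>` comparison, the `cap` helper (a circle of given angular radius, with c = cos(radius)) and all names are assumptions chosen for illustration.

```python
import math
from typing import List, Tuple

Vec = Tuple[float, float, float]
Halfspace = Tuple[Vec, float]   # (unit normal n, offset c)
Convex = List[Halfspace]        # intersection of halfspaces
Region = List[Convex]           # union of convexes


def unit(ra_deg: float, dec_deg: float) -> Vec:
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    return (math.cos(dec) * math.cos(ra),
            math.cos(dec) * math.sin(ra),
            math.sin(dec))


def cap(ra_deg: float, dec_deg: float, radius_deg: float) -> Halfspace:
    """Circle on the sky: normal at the center, c = cos(radius)."""
    return (unit(ra_deg, dec_deg), math.cos(math.radians(radius_deg)))


def in_halfspace(h: Halfspace, x: Vec) -> bool:
    n, c = h
    return n[0] * x[0] + n[1] * x[1] + n[2] * x[2] > c


def in_convex(convex: Convex, x: Vec) -> bool:
    return all(in_halfspace(h, x) for h in convex)


def in_region(region: Region, x: Vec) -> bool:
    return any(in_convex(convex, x) for convex in region)


north_cap = cap(0.0, 90.0, 10.0)
lens = [cap(0.0, 0.0, 10.0), cap(15.0, 0.0, 10.0)]  # overlap of two circles
region = [lens, [north_cap]]
```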
Shape Operations
  • Intersection
    • Concat halfspace lists
  • Union
    • Concat convex lists
  • Unique coverage
  • Analytic area
  • Boolean algebra
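With shapes stored as lists, the two operations are short. The sketch below uses the same list-of-lists representation (a region is a list of convexes, each a list of `(normal, c)` halfspaces). Extending the intersection from convexes to whole regions by pairing every convex of one with every convex of the other is the distributive law, added here as an assumption rather than taken from the slide.

```python
def union(r1, r2):
    """Union of two regions: concatenate their convex lists."""
    return r1 + r2


def intersect(r1, r2):
    """Intersection of two regions: for each pair of convexes,
    concatenate their halfspace lists."""
    return [ca + cb for ca in r1 for cb in r2]


def contains(region, x):
    """Inside any convex, where inside a convex means inside all halfspaces."""
    return any(
        all(n[0] * x[0] + n[1] * x[1] + n[2] * x[2] > c for n, c in convex)
        for convex in region
    )


# Two hemispheres: z > 0 and x > 0.
north = [[((0.0, 0.0, 1.0), 0.0)]]
east = [[((1.0, 0.0, 0.0), 0.0)]]

s = 2 ** -0.5  # components of unit vectors at 45 degrees
both = intersect(north, east)
either = union(north, east)
```

Neither operation checks for redundant or empty pieces, which is why a separate simplification step is needed afterwards.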
Difference of Convexes is a Region
  • The set of Regions is closed under the Boolean ops
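One way to see why the difference of two convexes is a region: the complement of a halfspace (n, c) is the halfspace (−n, −c), so A minus B splits into one convex per halfspace of B. The decomposition below also keeps the pieces disjoint. It is a sketch under the list representation assumed here, it ignores the measure-zero boundary, and it is not claimed to be the library's algorithm.

```python
import math


def complement(h):
    """Opposite side of the plane: n.x > c becomes (-n).x > -c."""
    n, c = h
    return ((-n[0], -n[1], -n[2]), -c)


def difference(a, b):
    """A minus B for convexes (lists of halfspaces), as a list of disjoint
    convexes: piece i is inside A, inside b[0..i-1], and outside b[i]."""
    return [a + b[:i] + [complement(h)] for i, h in enumerate(b)]


def in_convex(convex, x):
    return all(n[0] * x[0] + n[1] * x[1] + n[2] * x[2] > c for n, c in convex)


# A: northern hemisphere.  B: the part with z > 0.5 and x > 0.
A = [((0.0, 0.0, 1.0), 0.0)]
B = [((0.0, 0.0, 1.0), 0.5), ((1.0, 0.0, 0.0), 0.0)]
pieces = difference(A, B)

p_west = (-0.5, math.sqrt(0.11), 0.8)   # in A, high z, but x < 0
p_low = (0.5, math.sqrt(0.66), 0.3)     # in A, z below 0.5
p_in_b = (0.5, math.sqrt(0.11), 0.8)    # inside B
p_south = (0.0, 0.0, -1.0)              # outside A


def count(x):
    """Number of pieces containing x (0 or 1 if the pieces are disjoint)."""
    return sum(in_convex(piece, x) for piece in pieces)
```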
Simplification
  • Eliminate redundant halfspaces
    • First handle trivial combinations of constraints
    • Then solve geometry on the surface
    • Derive Roots, Arcs, Patches
  • Eliminate redundant convexes
    • Some trivial cases, but …
  • Make convexes disjoint
    • Unique coverage, area, etc.
  • Stitch together convexes
    • When possible
SphericalLib .NET
  • C# code
    • 10k lines
    • OS independent (Windows, Un*x w/ Mono)
    • Documentation via Sandcastle
  • Great performance!
    • Sloan Digital Sky Survey in 10 s (13× larger than USA in area)
Numerical Imprecision
  • Double precision calculations
    • Lots of tricks from Graphics Gems
    • IEEE 754 standard
  • Degeneracy
    • When are two vectors the same?
  • Spatial resolution limit
    • Roughly 30 cm on Earth
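A back-of-the-envelope estimate shows where a resolution limit of this size comes from. If two unit vectors are compared through their dot product, angles with cos θ indistinguishable from 1 cannot be resolved. Since 1 − cos θ ≈ θ²/2, the limit is around the square root of the machine epsilon. The slides do not state the library's tolerance, so the sketch only aims at the order of magnitude; the Earth radius value and the variable names are assumptions.

```python
import math
import sys

EARTH_RADIUS_M = 6.371e6          # mean Earth radius, assumed for scale

eps = sys.float_info.epsilon      # IEEE 754 double: 2**-52
theta = math.sqrt(eps)            # angle (radians) where 1 - cos(theta) ~ eps / 2
resolution_m = theta * EARTH_RADIUS_M

# Well below that angle the cosine rounds to exactly 1.0,
# so a dot-product test cannot tell the two directions apart.
unresolved = math.cos(theta / 4) == 1.0
```

Under these assumptions the estimate is about 10 cm, the same order as the roughly 30 cm quoted on the slide; the exact figure depends on the tolerance the library chooses.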
Sky coverage of the Sloan Digital Sky Survey’s 5th Data Release and the Galaxy Evolution Explorer’s 2nd Public Release
Region in SQL