SPATIAL SEARCHES IN ASTRONOMY DATABASES: MULTI-DIMENSIONAL INDEXING


  1. Tamás Budavári (Johns Hopkins University) 7/16/2012

  2. SPATIAL SEARCHES IN ASTRONOMY DATABASES: MULTI-DIMENSIONAL INDEXING FOR SIMULATIONS AND OBSERVATIONS. Tamás Budavári (Johns Hopkins University), 7/16/2012

  3. Storing Simulations: Millennium Run (MPA); 10 billion particles, 64 snapshots; FoF groups and merger trees; Millennium XXL; 300 billion particles; MultiDark – Bolshoi; turbulence simulations (JHU); 1024^4 grid, 27 TB. ISSAC at HiPACC, 7/16/2012

  4. Storing Simulations: Millennium Run (MPA); 10 billion particles, 64 snapshots; FoF groups and merger trees; Millennium XXL; 300 billion particles; MultiDark – Bolshoi; turbulence simulations (JHU); 1024^4 grid, 27 TB. Kai Bürger (TUM, JHU)

  5. Observing Simulations: comparison to real observations; lots of spatial searches; in the database?

  6. Sky Coverage: for precise window function; virtual surveys

  7. Outline: query shapes in SQL; indexing with a space-filling curve; combining the two for spatial searches; periodic boxes; celestial sphere

  8. Databases: which one to use depends on the task; SQLite, MySQL, PostgreSQL, DB2, Oracle, SQL Server; free “express versions” of the big ones, too; customization is a must; there is always something missing; extend by loading your libraries

  9. Query Shapes: IShape interface: TopoPoint Contains(Point p); TopoShape GetTopo(Box b); Box GetBoundingBox(). Geometric primitives: Sphere, Box, Cone…

  11. Query Shapes: IShape interface: TopoPoint Contains(Point p); TopoShape GetTopo(Box b); Box GetBoundingBox(). Composites: Intersect, Union, Difference…
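
The three IShape methods and one composite can be sketched in Python as below. This is only an illustration, not the C# code behind the slides: the names `Box`, `Sphere` and `Intersect`, the string results "inside" / "outside" / "partial" standing in for TopoShape, and the boolean result of `contains` (the slide's return type is TopoPoint) are all assumptions made for the example.

```python
import itertools
import math
from dataclasses import dataclass
from typing import Tuple

Point = Tuple[float, float, float]


@dataclass(frozen=True)
class Box:
    lo: Point
    hi: Point

    def corners(self):
        # All 8 corners of the axis-aligned box.
        return itertools.product(*zip(self.lo, self.hi))


@dataclass(frozen=True)
class Sphere:
    center: Point
    radius: float

    def contains(self, p: Point) -> bool:
        return math.dist(self.center, p) <= self.radius

    def get_topo(self, box: Box) -> str:
        # A sphere is convex, so the box is inside iff all corners are inside.
        if all(self.contains(c) for c in box.corners()):
            return "inside"
        # Closest point of the box to the sphere center.
        nearest = tuple(min(max(c, lo), hi)
                        for c, lo, hi in zip(self.center, box.lo, box.hi))
        return "partial" if self.contains(nearest) else "outside"

    def get_bounding_box(self) -> Box:
        return Box(tuple(c - self.radius for c in self.center),
                   tuple(c + self.radius for c in self.center))


@dataclass(frozen=True)
class Intersect:
    a: object
    b: object

    def contains(self, p: Point) -> bool:
        return self.a.contains(p) and self.b.contains(p)

    def get_topo(self, box: Box) -> str:
        ta, tb = self.a.get_topo(box), self.b.get_topo(box)
        if "outside" in (ta, tb):
            return "outside"
        if ta == tb == "inside":
            return "inside"
        # Conservative: a "partial" box may still miss the intersection.
        return "partial"

    def get_bounding_box(self) -> Box:
        ba, bb = self.a.get_bounding_box(), self.b.get_bounding_box()
        return Box(tuple(map(max, ba.lo, bb.lo)), tuple(map(min, ba.hi, bb.hi)))
```

A composite only needs the answers of its operands, which is why the same three methods are enough for both primitives and Boolean combinations.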

  12. Query Shapes in SQL: UDT (user-defined type)

  13. Query Shapes: Generic; UDT; Boolean; Methods

  15. Indexing Tables: better performance of queries; instantaneous range searches; fast JOINs; syntax: CREATE INDEX ix_Name ON Table (X ASC, …) INCLUDE (V, …)

  16. Multi-Dimensional: map the space to a simple index; different kinds of space-filling curves; Morton’s Z-curve; Peano-Hilbert curve

  17. Peano-Hilbert Curve: hierarchical space filling
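
For reference, a 2D Hilbert index can be computed with the widely used iterative quadrant-rotation algorithm. The sketch below is not taken from the slides: the function name `hilbert_key` and the restriction to two dimensions and a power-of-two grid are assumptions made to keep the example short.

```python
def hilbert_key(n: int, x: int, y: int) -> int:
    """Map cell (x, y) of an n-by-n grid (n a power of two) to its Hilbert index."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        # Quadrants are visited in the order (0,0), (0,1), (1,1), (1,0).
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/reflect so the sub-curve has the standard orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d
```

Sorting the cells by this key visits every cell once, and consecutive keys always belong to neighbouring cells; within each quadrant the keys form one contiguous block, which is the hierarchical property the slide refers to.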

  21. Also others… Morton Z-order; simple bit interleave; etc. Which one to use? Statistical analyses; correlation function
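
The "simple bit interleave" of the Morton Z-order can be sketched as follows for three integer grid coordinates. The function names, the default of 10 bits per axis, and the choice of x as the lowest bit are assumptions of this example rather than anything stated on the slide.

```python
def morton_encode(x: int, y: int, z: int, bits: int = 10) -> int:
    """Interleave the low `bits` bits of x, y, z into one Z-order key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (3 * i)
        key |= ((y >> i) & 1) << (3 * i + 1)
        key |= ((z >> i) & 1) << (3 * i + 2)
    return key


def morton_decode(key: int, bits: int = 10):
    """Inverse of morton_encode: recover (x, y, z) from the key."""
    x = y = z = 0
    for i in range(bits):
        x |= ((key >> (3 * i)) & 1) << i
        y |= ((key >> (3 * i + 1)) & 1) << i
        z |= ((key >> (3 * i + 2)) & 1) << i
    return x, y, z
```

Like the Hilbert index, all cells of one octant share a key prefix, so an octant at any level is a single contiguous key range; unlike the Hilbert curve, consecutive keys are not always spatial neighbours.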

  22. Divide and Conquer

  23. Covers for Shapes: inside approximation; outside overshoot

  24. Covers for Shapes: inside approximation; outside overshoot; they are Key ranges

  25. Covers for Shapes: inside approximation; outside overshoot; they are Key ranges: Key between 0 and 3

  26. Covers for Shapes: inside approximation; outside overshoot; they are Key ranges: Key between 0 and 3; Key between 0 and 7

  27. Covers for Shapes: inside approximation; outside overshoot; they are Key ranges: Key between 0 and 3; Key between 0 and 7; Key between 0 and 3 or Key between 8 and 11
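
One way to turn a shape into such key ranges is a recursive descent over the quadtree: cells fully inside contribute to the inner cover, cells still undecided at the finest level are added to the outer cover. The sketch below assumes a circle in the unit square and 2D Z-order keys; the names `cover_circle` and `merge_ranges` are hypothetical.

```python
def merge_ranges(ranges):
    """Sort inclusive (lo, hi) key ranges and fuse adjacent or overlapping ones."""
    merged = []
    for lo, hi in sorted(ranges):
        if merged and lo <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(hi, merged[-1][1]))
        else:
            merged.append((lo, hi))
    return merged


def cover_circle(cx, cy, r, max_level):
    """Return (inner, outer) covers of a circle as Z-order key ranges."""
    inner, partial = [], []

    def visit(level, ix, iy, prefix):
        size = 1.0 / (1 << level)
        x0, y0 = ix * size, iy * size
        x1, y1 = x0 + size, y0 + size
        # Nearest point of the cell to the center: farther than r means outside.
        nx, ny = min(max(cx, x0), x1), min(max(cy, y0), y1)
        if (nx - cx) ** 2 + (ny - cy) ** 2 > r * r:
            return
        # Keys of all finest-level cells below this cell form one range.
        shift = 2 * (max_level - level)
        lo, hi = prefix << shift, ((prefix + 1) << shift) - 1
        # Farthest corner within r means the whole cell is inside.
        fx = x0 if abs(cx - x0) > abs(cx - x1) else x1
        fy = y0 if abs(cy - y0) > abs(cy - y1) else y1
        if (fx - cx) ** 2 + (fy - cy) ** 2 <= r * r:
            inner.append((lo, hi))
        elif level == max_level:
            partial.append((lo, hi))
        else:
            for q in range(4):  # x is the low bit, y the high bit
                visit(level + 1, 2 * ix + (q & 1), 2 * iy + (q >> 1),
                      (prefix << 2) | q)

    visit(0, 0, 0, 0)
    return merge_ranges(inner), merge_ranges(inner + partial)
```

With 16 cells (`max_level=2`), a circle that just encloses the lower-left quadrant gives the inner cover "Key between 0 and 3", while the outer cover adds the boundary cells that the circle only touches.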

  29. Periodic Boundaries: infinite with periodicity; have to search all boxes
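
In practice only the periodic images that actually overlap the stored box need to be searched. A minimal sketch for a spherical query, assuming a cubic box [0, L]^3 and a radius smaller than L so that only the 26 nearest images matter (the function name `periodic_shifts` is hypothetical):

```python
import itertools


def periodic_shifts(center, radius, box_size):
    """Shift vectors for which the translated sphere overlaps the base box."""
    shifts = []
    for s in itertools.product((-1, 0, 1), repeat=3):
        shifted = [c + k * box_size for c, k in zip(center, s)]
        # Squared distance from the shifted center to the base box.
        d2 = sum((c - min(max(c, 0.0), box_size)) ** 2 for c in shifted)
        if d2 <= radius * radius:
            shifts.append(tuple(k * box_size for k in s))
    return shifts
```

A sphere in the middle of the box yields only the zero shift, one near a face yields two shifts, and one near a corner yields up to eight.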

  31. Searching in SQL: Key filter; by Cover; ShiftX, ShiftY, ShiftZ; Where?
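
The combination of key filter, cover and shifts can be tried out with Python's built-in `sqlite3` module. Everything here is an assumption for illustration: the table and column names, the 2D unit square with 16 Z-order cells, and the hard-coded cover of a circle of radius 0.36 around (0.25, 0.25); the slides target a server database with user-defined types instead.

```python
import sqlite3


def z_key(x, y, level=2):
    """Z-order key of the cell containing (x, y) in the unit square."""
    n = 1 << level
    ix, iy = int(x * n), int(y * n)
    key = 0
    for i in range(level):
        key |= ((ix >> i) & 1) << (2 * i)
        key |= ((iy >> i) & 1) << (2 * i + 1)
    return key


def build_db(points, cover):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE points (zkey INTEGER, x REAL, y REAL)")
    conn.execute("CREATE INDEX ix_points_zkey ON points (zkey)")
    conn.executemany("INSERT INTO points VALUES (?, ?, ?)",
                     [(z_key(x, y), x, y) for x, y in points])
    # One row per key range of the cover, with the shift of its periodic image.
    conn.execute(
        "CREATE TABLE cover (lo INTEGER, hi INTEGER, shift_x REAL, shift_y REAL)")
    conn.executemany("INSERT INTO cover VALUES (?, ?, ?, ?)", cover)
    return conn


# Coarse filter: index range search on zkey. Fine filter: exact circle test.
QUERY = """
SELECT p.x + c.shift_x AS x, p.y + c.shift_y AS y
FROM cover AS c
JOIN points AS p ON p.zkey BETWEEN c.lo AND c.hi
WHERE (p.x + c.shift_x - :cx) * (p.x + c.shift_x - :cx)
    + (p.y + c.shift_y - :cy) * (p.y + c.shift_y - :cy) <= :r * :r
ORDER BY p.zkey
"""


def search(conn, cx, cy, r):
    return conn.execute(QUERY, {"cx": cx, "cy": cy, "r": r}).fetchall()


POINTS = [(0.1, 0.1), (0.3, 0.3), (0.55, 0.3), (0.9, 0.9), (0.6, 0.1)]
COVER = [(0, 4, 0.0, 0.0), (6, 6, 0.0, 0.0), (8, 9, 0.0, 0.0), (12, 12, 0.0, 0.0)]
```

The key ranges discard most rows through the index, and the exact test in the WHERE clause removes the few candidates that lie in partially covered cells.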

  32. Real! E.g.,

  33. Online Interfaces: largest simulations; search and visualize; 10 billion+ objects and growing…; Indra 512 simulations; coming soon at JHU

  34. Millennium XXL

  35. Web Services: programming interfaces; execute SQL queries; most flexible; inject probes in simulations; turbulence; cosmology

  36. Sky Coverage

  37. No Sky Coverage?

  38. Spherical Geometry: Green area, A ∩ (B − ε): should find B if it contains an A and is not masked. Yellow area, A ∩ (B ± ε): an edge case; may find B if it contains an A.

  39. Approaches to Consider: pixel maps; sensitivity, etc…; equations of shapes; spherical “vector graphics”; and beyond…

  40. An Observation: FITS header with WCS; image dimensions map to the geometry; more exposures?; no common pixel coordinate system; overlapping areas

  41. Common Pixels: pre-defined pages of an atlas; standard in cartography; image pyramids of hierarchical pixels; including HTM, Igloo, HEALPix, SDSSPix, etc…; always approximate!

  42. Practical Implementation: looking at terapixels; we know how to work with images; now have commodity Internet; we have cheap hard drives; WorldWideTelescope.org; Sky in Google Earth; integrated catalogs for efficiency; how about more surveys?

  43. Drawing with Equations: working with 3D normal vectors; benefits include: no wraparound; no projections; no singularities

  44. Drawing with Equations: direct 3D approach; Halfspace; Circle/Cap; Convex; simple shapes; Region; unions of convexes; patches on the sphere

  45. Point in Region Test: Halfspace: one side of a plane, $(\vec{n}, c)$; inside when $\vec{n} \cdot \vec{x} > c$. Convex: a collection of halfspaces; inside when inside all halfspaces. Region: a collection of convexes; inside when inside any convex.
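
The three-level test translates almost literally into code. In this sketch the class and function names, the helper `cap` that builds a halfspace from a sky position and angular radius, and the strict inequality in the halfspace test are assumptions of the example; it is not the SphericalLib API.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Vec = Tuple[float, float, float]


def radec_to_vec(ra_deg: float, dec_deg: float) -> Vec:
    """Unit vector for a right ascension / declination pair in degrees."""
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    return (math.cos(dec) * math.cos(ra), math.cos(dec) * math.sin(ra), math.sin(dec))


@dataclass(frozen=True)
class Halfspace:
    n: Vec      # unit normal of the plane
    c: float    # offset: cosine of the cap's angular radius

    def contains(self, x: Vec) -> bool:
        return sum(a * b for a, b in zip(self.n, x)) > self.c


def cap(ra_deg: float, dec_deg: float, radius_deg: float) -> Halfspace:
    return Halfspace(radec_to_vec(ra_deg, dec_deg), math.cos(math.radians(radius_deg)))


def in_convex(convex: List[Halfspace], x: Vec) -> bool:
    # Inside a convex means inside all of its halfspaces.
    return all(h.contains(x) for h in convex)


def in_region(region: List[List[Halfspace]], x: Vec) -> bool:
    # Inside a region means inside any of its convexes.
    return any(in_convex(convex, x) for convex in region)
```

Because everything is expressed with dot products of 3D unit vectors, the test needs no special handling at the poles or at the wraparound of right ascension.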

  46. Shape Operations: Intersection: concatenate halfspace lists. Union: concatenate convex lists. Unique coverage; analytic area; Boolean algebra
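
With a halfspace stored as a pair (n, c), a convex as a list of halfspaces and a region as a list of convexes, the two list operations look as follows. The function names and the strict inequality in the point test are assumptions of this sketch, and no simplification is attempted, so a union may contain overlapping convexes.

```python
def union(region_a, region_b):
    """Union of two regions: concatenate their convex lists."""
    return region_a + region_b


def intersect_convexes(convex_a, convex_b):
    """Intersection of two convexes: concatenate their halfspace lists."""
    return convex_a + convex_b


def intersect(region_a, region_b):
    """Intersection of two regions: intersect every pair of convexes."""
    return [intersect_convexes(a, b) for a in region_a for b in region_b]


def contains(region, x):
    """Point test: inside any convex, i.e. inside all of its halfspaces."""
    return any(
        all(sum(ni * xi for ni, xi in zip(n, x)) > c for n, c in convex)
        for convex in region
    )
```

The pairwise form of `intersect` follows from distributing the intersection over the unions; making the resulting convexes disjoint, as needed for unique coverage and area, is a separate step.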

  47. Difference of Convexes is a Region: the set of Regions is closed under the Boolean operations

  48. Simplification: eliminate redundant halfspaces (first handle trivial combinations of constraints, then solve geometry on the surface; derive Roots, Arcs, Patches); eliminate redundant convexes (some trivial cases, but …); make convexes disjoint (unique coverage, area, etc.); stitch together convexes (when possible)

  49. SphericalLib .NET: C# code; 10k lines; OS independent (Windows, Un*x w/ Mono); documentation via Sandcastle; great performance!; Sloan Digital Sky Survey in 10 s (13× larger than USA in area)

  50. Numerical Imprecision: double precision calculations; lots of tricks from Graphics Gems; IEEE 754 standard; degeneracy; when are two vectors the same?; spatial resolution limit; roughly 30 cm on Earth
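
A small experiment shows why "when are two vectors the same?" is a real question in double precision. The slide does not say how its resolution figure is derived; the sketch below only illustrates one plausible source, namely that the cosine of a very small angle is indistinguishable from 1. The function names and the Earth radius of 6371 km are assumptions.

```python
import math
import sys


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def angle_from_dot(a, b):
    """Angle via arccos of the dot product: loses small angles."""
    return math.acos(max(-1.0, min(1.0, dot(a, b))))


def angle_from_cross(a, b):
    """Angle via atan2 of cross and dot products: keeps small angles."""
    return math.atan2(math.sqrt(dot(cross(a, b), cross(a, b))), dot(a, b))


# Two directions 1e-9 rad apart; b is a unit vector to within rounding.
a = (1.0, 0.0, 0.0)
b = (1.0, 1e-9, 0.0)

# Angle below which 1 - cos(theta) ~ theta^2 / 2 drops under machine epsilon,
# converted to a distance on the surface of the Earth (meters).
resolution_m = math.sqrt(2 * sys.float_info.epsilon) * 6.371e6
```

Here `angle_from_dot(a, b)` returns exactly zero while `angle_from_cross(a, b)` recovers the separation, and `resolution_m` comes out at roughly a tenth of a meter, the same order of magnitude as the limit quoted on the slide.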

  51. Sky coverage of the Sloan Digital Sky Survey’s 5th Data Release and the Galaxy Evolution Explorer’s 2nd Public Release

  52. Region in SQL
