Empirical Comparisons of Fast Methods
Dustin Lang and Mike Klaas, University of British Columbia


  1. Empirical Comparisons of Fast Methods. Dustin Lang and Mike Klaas, {dalang, klaas}@cs.ubc.ca, University of British Columbia. Fast N-Body Learning, December 17, 2004.

  2. A Map of Fast Methods. Sum-Kernel methods: Fast Multipole Method; Dual-Tree (kd-tree, Anchors); Gaussian kernel on a regular grid: Fast Gauss Transform, Improved FGT, Box Filter. Max-Kernel methods: Dual-Tree (regular grid, kd-tree, Anchors); Distance Transform.

  3. The Role of Fast Methods. We claim that, to be useful to other researchers, fast methods need:
     • guaranteed, adjustable error bounds: users can set the error bound low during development, then experiment with looser tolerances once they know their code works;
     • no parameters that users need to adjust (other than the error tolerance);
     • documented error behaviour: we must explain the properties of our approximation errors.

  4. Testing Framework. We tested:
     Sum-Kernel: f_j = \sum_{i=1}^{N} w_i \exp( -\|x_i - y_j\|_2^2 / h^2 )
     Max-Kernel: x^*_j = \arg\max_{i=1,\dots,N} w_i \exp( -\|x_i - y_j\|_2^2 / h^2 )
     Gaussian kernel, fixed bandwidth h, non-negative weights w_i, j = 1, ..., N.
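For reference, a minimal sketch of the naive O(N^2) computations that these fast methods approximate, written in plain C to match the slide's definitions (the function and variable names are illustrative, not taken from the authors' code):

```c
#include <math.h>
#include <stddef.h>

/* Naive Sum-Kernel: f[j] = sum_i w[i] * exp(-||x_i - y_j||^2 / h^2),
 * for N source points x (with weights w) and N target points y,
 * each D-dimensional, stored row-major. Cost is O(N^2 * D). */
void sum_kernel_naive(const double *x, const double *w, const double *y,
                      size_t N, size_t D, double h, double *f)
{
    for (size_t j = 0; j < N; j++) {
        double acc = 0.0;
        for (size_t i = 0; i < N; i++) {
            double d2 = 0.0;
            for (size_t k = 0; k < D; k++) {
                double diff = x[i * D + k] - y[j * D + k];
                d2 += diff * diff;
            }
            acc += w[i] * exp(-d2 / (h * h));
        }
        f[j] = acc;
    }
}

/* Naive Max-Kernel: for each target j, return the index i maximising
 * w[i] * exp(-||x_i - y_j||^2 / h^2). */
void max_kernel_naive(const double *x, const double *w, const double *y,
                      size_t N, size_t D, double h, size_t *best)
{
    for (size_t j = 0; j < N; j++) {
        double best_val = -1.0;   /* weights are non-negative, so any value wins */
        size_t best_i = 0;
        for (size_t i = 0; i < N; i++) {
            double d2 = 0.0;
            for (size_t k = 0; k < D; k++) {
                double diff = x[i * D + k] - y[j * D + k];
                d2 += diff * diff;
            }
            double val = w[i] * exp(-d2 / (h * h));
            if (val > best_val) { best_val = val; best_i = i; }
        }
        best[j] = best_i;
    }
}
```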

  5. Testing Framework (2). For the Sum-Kernel problem, we allow a given error tolerance ε: |f_j - f_j^true| ≤ ε for each j. We tested:
     • Fast Gauss Transform (FGT)
     • Improved Fast Gauss Transform (IFGT)
     • Dual-Tree with kd-tree (KDtree)
     • Dual-Tree with a ball-tree constructed via the Anchors Hierarchy (Anchors)
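As an illustration of this acceptance criterion (a sketch only, not the authors' actual test harness), a fast method's output can be compared element-wise against the naive result:

```c
#include <math.h>
#include <stdbool.h>
#include <stddef.h>

/* Check the guarantee |f_approx[j] - f_true[j]| <= eps for every target j,
 * where f_true is computed by the naive O(N^2) sum. */
bool within_tolerance(const double *f_approx, const double *f_true,
                      size_t N, double eps)
{
    for (size_t j = 0; j < N; j++) {
        if (fabs(f_approx[j] - f_true[j]) > eps)
            return false;
    }
    return true;
}
```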

  6. Methods Tested. The Fast Gauss Transform (FGT) code is by Firas Hamze of UBC. The KDtree and Anchors Dual-Tree code is by Dustin; the same Dual-Tree code was used for both KDtree and Anchors.

  7. Methods Tested (2). Ramani Duraiswami and Changjiang Yang generously gave us their code for the Improved Fast Gauss Transform (IFGT). To make the IFGT fit in our testing framework, we had to devise a method for choosing its parameters; our method seems reasonable but is probably not optimal. All methods are written in C with Matlab bindings.

  8. Results (1): A Worst-Case Scenario. Uniformly distributed points, uniformly distributed weights, 3 dimensions, large bandwidth h = 0.1, ε = 10^-6: CPU time.
     • Naive is usually fastest.
     • Only FGT is faster, and only by about 3x.
     • IFGT may become faster, but only after 1.5 hours of compute time.
     [Log-log plot of CPU time (s) versus N for Naive, FGT, IFGT, Anchors, and KDtree.]

  9. Results (1): A Worst-Case Scenario. Uniformly distributed points, uniformly distributed weights, 3 dimensions, large bandwidth h = 0.1, ε = 10^-6: memory.
     • Dual-Tree memory requirements are an issue.
     [Log-log plot of memory usage (bytes) versus N for FGT, IFGT, Anchors, and KDtree.]

  10. Results (2). Uniformly distributed points, uniformly distributed weights, 3 dimensions, smaller bandwidth h = 0.01, ε = 10^-6: CPU time.
     • IFGT cannot be run: more than 10^10 expansion terms are required even for N = 100 points.
     • Dual-Tree and FGT are fast, but not O(N).
     [Log-log plot of CPU time (s) versus N for Naive, FGT, Anchors, and KDtree, with Order N·sqrt(N) and Order N reference lines.]

  11. Results (2). Uniformly distributed points, uniformly distributed weights, 3 dimensions, smaller bandwidth h = 0.01, ε = 10^-6: memory.
     • Memory requirements are still an issue.
     [Log-log plot of memory usage (bytes) versus N for FGT, Anchors, and KDtree.]

  12. Results (3). Uniform data and weights, N = 10,000, ε = 10^-3, h = 0.01, varying dimension: CPU time.
     • IFGT is very fast in 1D, infeasible beyond 2D.
     • KDtree and Anchors show (unexpected?) optimal behaviour around 3 or 4 dimensions.
     [Log-log plot of CPU time (s) versus dimension for Naive, FGT, IFGT, Anchors, and KDtree.]

  13. Results (3). Uniform data and weights, N = 10,000, ε = 10^-3, h = 0.01, varying dimension: memory usage.
     [Log-log plot of memory usage (bytes) versus dimension for Naive, FGT, IFGT, Anchors, and KDtree.]

  14. Results (4). Uniform sources, uniform targets, N = 10,000, h = 0.01, D = 3, varying ε: CPU time.
     • The cost of the Dual-Tree methods increases slowly with accuracy.
     • The FGT cost rises more quickly.
     [Plot of CPU time versus ε (from 10^-1 down to 10^-11) for Naive, FGT, Anchors, and KDtree.]

  15. Results (4). Uniform sources, uniform targets, N = 10,000, h = 0.01, D = 3, varying ε: actual error achieved.
     • The error of the Dual-Tree methods is almost exactly as large as allowed (ε).
     • FGT (and presumably IFGT) overestimates the error, and thus does more work than required.
     [Log-log plot of real error versus ε for FGT, Anchors, and KDtree.]

  16.-22. Clumpy Data. Uniform data is a worst-case scenario for these methods. Next: clumpy data!
     [A sequence of example data sets with Clumpiness = 1.0, 1.1, 1.2, 1.3, 1.5, 2.0, and 3.0.]

  23. Results (5): Clumpy Sources. Clumpy sources, uniform targets, N = 10,000, h = 0.01, D = 3, ε = 10^-6, varying clumpiness: CPU time.
     • As clumpiness increases, the Dual-Tree methods get faster.
     [Plot of CPU time versus data clumpiness for Naive, FGT, Anchors, and KDtree.]

  24. Results (5): Clumpy Sources. Clumpy sources, uniform targets, N = 10,000, h = 0.01, D = 3, ε = 10^-6, varying clumpiness: CPU time relative to uniform data.
     • The speed-up is especially large for Anchors.
     [Plot of CPU usage relative to uniform data versus data clumpiness for Naive, FGT, Anchors, and KDtree.]

  25. Results (6): Clumpy Sources and Targets. Clumpy sources, clumpy targets, N = 10,000, h = 0.01, D = 3, ε = 10^-6, varying clumpiness: CPU time.
     • Even bigger improvements!
     [Plot of CPU time versus data clumpiness for Naive, FGT, Anchors, and KDtree.]

  26. Results (6): Clumpy Sources and Targets. Clumpy sources, clumpy targets, N = 10,000, h = 0.01, D = 3, ε = 10^-6, varying clumpiness: CPU time relative to uniform data.
     • Large variance; perhaps due to details of the particular clumpy data sets?
     [Plot of CPU usage relative to uniform data versus data clumpiness for Naive, FGT, Anchors, and KDtree.]

  27. Results (7): Clumpy, Dimensionality. Clumpy sources and targets (C = 2), N = 10,000, h = 0.01, ε = 10^-3, varying dimension: CPU time.
     • Not qualitatively different from the uniform data!
     [Log-log plot of CPU time (s) versus dimension for Naive, IFGT, Anchors, and KDtree.]

  28. Results (7): Clumpy, Dimensionality. For reference, the non-clumpy results (uniform data, same N, h, and ε, varying dimension): CPU time.
     [Log-log plot of CPU time (s) versus dimension for Naive, FGT, IFGT, Anchors, and KDtree.]

  29. Summary (1).
     • Synthetic-data tests; each algorithm is required to guarantee results within a given error tolerance.
     • IFGT:
       • We devised a method of choosing its parameters; a different method might work better.
       • Its error bounds seem to be very loose, so it does much more work than necessary.
