
FastANN for High Quality Collaborative Filtering


  1. FastANN for High Quality Collaborative Filtering. Yun-Ta Tsai (1), Markus Steinberger (2), Dawid Pająk (3), Kari Pulli (4). Affiliations: (1) Google, Inc.; (2) TU Graz; (3) NVIDIA; (4) Light Research

  2. Story

  3. Collaborative Filtering - Aggregation 

  4. Collaborative Filtering - Result

  5. Collaborative Filtering - Result

  6. Related Work

  7. Distance Table (Garcia et al. ICIP 2010)

         Distance Table                        Distance Table (sorted)
         0.13  1.43  0.98  1.33  1.21  2.33    0.13  0.98  1.21  1.33  1.43  2.33
         1.22  0.31  0.45  2.01  1.75  0.48    0.31  0.45  0.48  1.22  1.75  2.01
         2.11  1.12  0.92  3.16  0.33  0.21    0.21  0.33  0.92  1.12  2.11  3.16
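The distance table on slide 7 can be reproduced with a short sketch. It assumes that each row holds the distances from one query to every candidate, so sorting a row puts the nearest candidates first. The helper name `k_nearest_from_table` is hypothetical and is not taken from Garcia et al.

```python
# Illustrative sketch only: brute-force kNN from a precomputed distance table.
# The table values are the ones shown on the slide; the helper name is made up.

distance_table = [
    [0.13, 1.43, 0.98, 1.33, 1.21, 2.33],
    [1.22, 0.31, 0.45, 2.01, 1.75, 0.48],
    [2.11, 1.12, 0.92, 3.16, 0.33, 0.21],
]


def k_nearest_from_table(row, k):
    """Return (distance, candidate_index) pairs for the k smallest entries of one row."""
    ranked = sorted((d, j) for j, d in enumerate(row))
    return ranked[:k]


# Sorting every row yields the "sorted" table from the slide.
sorted_table = [sorted(row) for row in distance_table]

for row in distance_table:
    print(k_nearest_from_table(row, k=2))
```

Sorting a full row costs more than is needed when k is small, which is one reason approximate methods are attractive for this problem.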

  8. Cayton et al. ADMS 2010

  9. Adams et al. SIGGRAPH 2009

  10. Limitation

  11. Our solution

  12. Our solution: Efficient implementation on GPU; General solution for different filters; High image quality; Applicable to different applications

  13. Our solution

  14. Our solution: Clustering, kNN Query, Filtering

  15. Design challenges: Register pressure; Memory access pattern; Thread divergence; Kernel launch overhead; Memory footprint

  16. Our solution: Clustering, kNN Query, Filtering

  17. Tiling  

  18. Clustering

  19. Clustering

  20. Warp-wide operation: 3 8 2 6 3 9 1 4

  21. Warp-wide operation: 3 8 2 6 3 9 1 4, shifted by one lane: 3 8 2 6 3 9 1

  22. Warp-wide operation: 3 11 10 8 9 12 10 5

  23. Warp-wide operation: 3 11 10 8 9 12 10 5, shifted by two lanes: 3 11 10 8 9 12

  24. Warp-wide operation: 3 11 13 19 19 20 19 17

  25. Warp-wide operation: 3 11 13 19 19 20 19 17, shifted by four lanes: 3 11 13 19

  26. Warp-wide operation: 3 11 13 19 22 31 32 36. Reduce register usage; Better parallelism; Minimize thread and warp divergence
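The numbers on slides 20 to 26 are consistent with an inclusive prefix sum computed in log2(n) shift-and-add steps (a Hillis-Steele style scan). The following is a CPU simulation of that pattern, not the authors' kernel: each list element stands in for one lane's register, and the 8-lane width is chosen only to match the slides.

```python
# Illustrative simulation of a warp-wide inclusive prefix sum.
# On a GPU the "read from lane i - offset" step would typically be a
# warp shuffle; here a plain list stands in for the lanes' registers.


def warp_inclusive_scan(values):
    lanes = list(values)
    steps = [list(lanes)]
    offset = 1
    while offset < len(lanes):
        # Every lane i >= offset adds the value held by lane i - offset.
        # The comprehension reads the old list, so all lanes update in lockstep.
        lanes = [
            lanes[i] + lanes[i - offset] if i >= offset else lanes[i]
            for i in range(len(lanes))
        ]
        steps.append(list(lanes))
        offset *= 2
    return lanes, steps


result, steps = warp_inclusive_scan([3, 8, 2, 6, 3, 9, 1, 4])
for s in steps:
    print(s)
```

Because every lane executes the same instruction at every step, the pattern needs no shared-memory staging and no branching beyond the lane-index test, which fits the goals listed on slide 26.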

  27. Clustering kNN kNN kNN kNN

  28. Our solution Clustering

  29. Our solution kNN Query

  30. kNN Query kNN kNN kNN kNN

  31. kNN Query (figure: kNN table with rows p0 to p4 and columns p0 to p4)

  32. kNN Query (figure: kNN table with rows p0 to p4 and columns p0 to p4)

  33. kNN Query: distances from p0 to (p0, p1, p2, p3, p4): 0.0 0.4 1.3 0.9 0.7

  34. kNN Query: distances from p0 to (p0, p1, p2, p3, p4): 0.0 0.4 1.3 0.9 0.7

         ≤ 0.8:  1 1 0 0 1

  35. kNN Query: distances from p0 to (p0, p1, p2, p3, p4): 0.0 0.4 1.3 0.9 0.7

         ≤ 0.0:  1 0 0 0 0

  36. kNN Query: distances from p0 to (p0, p1, p2, p3, p4): 0.0 0.4 1.3 0.9 0.7

         ≤ 0.0:  1 0 0 0 0
         ≤ 0.4:  1 1 0 0 0

  37. kNN Query: distances from p0 to (p0, p1, p2, p3, p4): 0.0 0.4 1.3 0.9 0.7

         ≤ 0.0:  1 0 0 0 0
         ≤ 0.4:  1 1 0 0 0
         ≤ 1.3:  1 1 1 1 1

  38. kNN Query: distances from p0 to (p0, p1, p2, p3, p4): 0.0 0.4 1.3 0.9 0.7

         ≤ 0.0:  1 0 0 0 0
         ≤ 0.4:  1 1 0 0 0
         ≤ 1.3:  1 1 1 1 1
         ≤ 0.9:  1 1 0 1 1

  39. kNN Query: distances from p0 to (p0, p1, p2, p3, p4): 0.0 0.4 1.3 0.9 0.7

         ≤ 0.0:  1 0 0 0 0
         ≤ 0.4:  1 1 0 0 0
         ≤ 1.3:  1 1 1 1 1
         ≤ 0.9:  1 1 0 1 1
         ≤ 0.7:  1 1 0 0 1

  40. kNN Query: distances from p0 to (p0, p1, p2, p3, p4): 0.0 0.4 1.3 0.9 0.7

         ≤ 0.0:  1 0 0 0 0
         ≤ 0.4:  1 1 0 0 0
         ≤ 1.3:  1 1 1 1 1
         ≤ 0.9:  1 1 0 1 1
         ≤ 0.7:  1 1 0 0 1

  41. kNN Query: binary kNN table (rows and columns p0 to p4)

         1 1 1 1 0
         1 1 1 0 1
         0 1 1 1 1
         1 1 1 0 1
         1 1 0 1 1
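The threshold rows on the kNN Query slides suggest a voting scheme: each distance is tried as a threshold, every candidate votes on whether it falls within it, and the threshold that admits k candidates yields a bit mask of the k nearest neighbors. The sketch below is a sequential interpretation of that idea using the distances shown for p0. The function names and the tie-handling rule (keep the tightest threshold with at least k votes) are assumptions, not details confirmed by the slides.

```python
# Illustrative sketch of kNN selection by voting, using the p0 row from the slides.
# On a GPU each threshold would likely be tested by a different lane, with a
# warp vote producing the bit mask; here the loop runs sequentially.


def ballot(distances, threshold):
    """One bit per candidate: 1 if its distance is within the threshold."""
    return [1 if d <= threshold else 0 for d in distances]


def knn_by_voting(distances, k):
    """Return a binary mask selecting the k nearest candidates, or None if k is too large."""
    best_mask = None
    for threshold in distances:
        mask = ballot(distances, threshold)
        votes = sum(mask)  # population count of the mask
        # Assumed tie rule: keep the tightest threshold that admits at least k candidates.
        if votes >= k and (best_mask is None or votes < sum(best_mask)):
            best_mask = mask
    return best_mask


distances_from_p0 = [0.0, 0.4, 1.3, 0.9, 0.7]
print(knn_by_voting(distances_from_p0, k=3))
```

With k=3 the selected mask is the `≤ 0.7` row, `1 1 0 0 1`. Storing the result as one bit per candidate is what makes the compact binary table on slide 41 possible.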

  42. Filtering

  43. Our solution

         Clustering: Hierarchical 2-mean clustering; Warp-wide operators
         kNN Query: kNN search; Distance table; Voting; Binary coding
         Filtering: Filtering and aggregation
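Hierarchical 2-means clustering, named in the summary above, can be sketched as a recursive split: run 2-means on a set of patch vectors, then recurse on each half until every cluster is small enough to search exhaustively. The slides do not give the seeding rule, iteration count, or stopping size, so all of those are assumptions here, and this CPU version says nothing about how the work is mapped to GPU threads.

```python
# Illustrative sketch of hierarchical 2-means clustering on small vectors.
# Seeding, iteration count, and the max_size stopping rule are assumptions.


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    n = len(points)
    return [sum(c) / n for c in zip(*points)]


def two_means(points, iterations=8):
    """Split points into two groups with a few Lloyd iterations."""
    # Assumed deterministic seeding: the first point and the point farthest from it.
    c0 = points[0]
    c1 = max(points, key=lambda p: sq_dist(p, c0))
    left, right = list(points), []
    for _ in range(iterations):
        left = [p for p in points if sq_dist(p, c0) <= sq_dist(p, c1)]
        right = [p for p in points if sq_dist(p, c0) > sq_dist(p, c1)]
        if not left or not right:
            break  # degenerate split, e.g. all points identical
        c0, c1 = mean(left), mean(right)
    return left, right


def hierarchical_two_means(points, max_size):
    """Recursively split until every cluster holds at most max_size points."""
    if max_size < 1:
        raise ValueError("max_size must be at least 1")
    if len(points) <= max_size:
        return [points]
    left, right = two_means(points)
    if not left or not right:
        # Fall back to halving when distances cannot separate the points.
        half = len(points) // 2
        left, right = points[:half], points[half:]
    return hierarchical_two_means(left, max_size) + hierarchical_two_means(right, max_size)


patches = [[0.0], [0.1], [0.2], [10.0], [10.1], [10.2]]
clusters = hierarchical_two_means(patches, max_size=3)
print(clusters)
```

Bounding the cluster size keeps the per-cluster distance table small, which is what allows the later kNN query to run exhaustively inside each cluster.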

  44. Results

  45. NN quality: % of patches matching the kNN result, k=16 (the higher the better)

         Methods: Randomized KD-trees, K-means, Composite, Hierarchical Clustering, Generalized PatchMatch, Random Ball Cover, Ours
         Values (highest first): 97.88, 39.01, 35.21, 34.86, 24.87, 7.18, 0.22

  46. NN quality: D_ann / D_knn, k=16 (the lower the better)

         Methods: Randomized KD-trees, K-means, Composite, Hierarchical Clustering, Generalized PatchMatch, Random Ball Cover, Ours
         Values (highest first): 23.91, 7.38, 3.01, 2.00, 1.99, 1.32, 1.01
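The two NN quality measures can be illustrated with a small sketch. The slides do not spell out the formulas, so the definitions below are assumptions: the match rate is taken as the share of exact kNN patches that the approximate search also returns, and D_ann / D_knn as the ratio of summed neighbor distances. The example numbers are invented for illustration and are not results from the presentation.

```python
# Illustrative sketch of two nearest-neighbor quality measures.
# Both definitions are assumptions; the example inputs are made up.


def match_rate(ann_ids, knn_ids):
    """Percentage of exact kNN patches that also appear in the approximate result."""
    exact = set(knn_ids)
    return 100.0 * len(set(ann_ids) & exact) / len(exact)


def distance_ratio(ann_dists, knn_dists):
    """Summed approximate-neighbor distance over summed exact-neighbor distance (1.0 is ideal)."""
    return sum(ann_dists) / sum(knn_dists)


# Hypothetical query: exact neighbors are patches 1 to 4; the approximate search misses one.
print(match_rate([1, 2, 3, 9], [1, 2, 3, 4]))
print(distance_ratio([1.0, 2.0, 3.0], [1.0, 2.0, 2.0]))
```

The two measures can disagree: a method may return few exact matches yet still score a ratio close to 1 if its substitutes are almost as close as the true neighbors.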

  47. Single Frame Noise Reduction: Nonlocal Means, PSNR [dB], k=16 (the higher the better)

         Methods: Randomized KD-trees, K-means, Composite, Hierarchical Clustering, Generalized PatchMatch, Random Ball Cover, Ours, Exhaustive Search
         Values (highest first): 28.55, 27.83, 27.79, 27.13, 27.02, 26.88, 25.65, 21.24

  48. Single Frame Noise Reduction: BM3D, PSNR [dB], k=16 (the higher the better)

         Methods: Randomized KD-trees, K-means, Composite, Hierarchical Clustering, Generalized PatchMatch, Random Ball Cover, Ours, Exhaustive Search
         Values (highest first): 31.10, 31.05, 30.72, 30.71, 30.68, 30.57, 29.87, 28.92
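PSNR, the quality measure used for the noise reduction results, is 10·log10(peak² / MSE) between a reference image and the denoised one. The sketch below assumes 8-bit images (peak value 255) passed as flat lists of pixel values; the slides do not state the peak value or image format used in the evaluation.

```python
import math

# Illustrative PSNR computation on flat pixel lists.
# The 8-bit peak value of 255 is an assumption.


def psnr(reference, denoised, peak=255.0):
    """Peak signal-to-noise ratio in dB; returns infinity for identical inputs."""
    mse = sum((r - d) ** 2 for r, d in zip(reference, denoised)) / len(reference)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak * peak / mse)


# Hypothetical two-pixel example with an error of 10 gray levels per pixel.
print(psnr([100, 100], [110, 90]))
```

Because the scale is logarithmic, a gain of about 3 dB corresponds to roughly halving the mean squared error.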

  49. Single Frame Noise Reduction: Run-time [ms], k=16, 0.25MPix (the lower the better)

         Legend: Query, Clustering
         Methods: Randomized KD-trees, K-means, Composite, Hierarchical Clustering, Generalized PatchMatch, Ours (GPU)
         Values (highest first): 8930, 1024, 1017, 982, 788, 8.19

  50. Single Frame Noise Reduction: Run-time [ms], k=16, 0.25MPix (the lower the better)

         Legend: Query, Clustering
         Methods: kNN-Garcia (GPU), Random Ball Cover (GPU), Window Search (GPU), Window Search Opt (GPU), Ours (GPU)
         Values (highest first): 26359, 11328, 594.99, 48.3, 8.19

  51. Architectures: MPix/s/Watt, k=16, 0.25MPix (the higher the better)

         Platforms: Core i7-950, GeForce GTX 680, Tegra K1
         Values (highest first): 0.29, 0.16, 0.02

  52. Burst Denoising – Single Frame. First frame of stack: 26.45dB

  53. Burst Denoising – Single Frame. First frame of stack: 26.45dB

  54. Burst Denoising – Single Frame. GKD-Trees / first frame: 31.01dB / 11.3s; Ours NLM / first frame: 31.90dB / 0.02s; First frame of stack: 26.45dB

  55. Burst Denoising – All Frames. GKD-Trees / stack: 31.53dB / 1080s; Ours NLM / stack: 34.10dB / 0.52s; First frame of stack: 26.45dB

  56. Burst Denoising – All Frames. GKD-Trees / stack: 31.53dB / 1080s; Ours NLM / stack: 34.10dB / 0.52s; First frame of stack: 26.45dB

  57. Global Illumination

  58. Global Illumination

  59. Global Illumination

  60. Geometry Noise Reduction Noisy Input

  61. Geometry Noise Reduction Ours

  62. Geometry Noise Reduction Exhaustive Search Ours

  63. Conclusions: Efficient implementation on GPU; High image quality; Applicable to different applications

  64. Thank you. Paper and Binary: http://bit.ly/fast-ann
