Beyond the Wall: Near-Data Processing for Databases, Sam Xi


  1. Beyond the Wall: Near-Data Processing for Databases. Sam Xi, Ore Babarinsa, Manos Athanassoulis, Stratos Idreos. Harvard University

  2. Memory Wall

  3. Memory Wall

  4. Row store, Column store (figure labels: tuple, tuple)

  5. Memory-optimized data systems

  6. Data access remains the bottleneck

  7. [Figure only]

  8. σ Σ π

  9. We are not the first to visit this pyramid!

  10. Intelligent RAM, DIVA, Near-data processing, RADram, Logic-in-memory, Terasys

  11. Why did NDP not take off? DRAM vs. logic. Leakage: low (DRAM), high (logic). Switching speed: slow (DRAM), fast (logic). Fabrication processes are incompatible.

  12. Moore’s Law + Dennard scaling provided consistent performance scaling for years. Metric: scaling factor. Area: 1/κ². Delay: 1/κ. Power: 1. Moore’s Law. Dennard scaling. Not the case anymore!
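The scaling factors on the slide above follow from the classical Dennard argument, sketched here as a short derivation. It assumes that device dimensions and supply voltage both shrink by 1/κ, and it reads the slide's "Power: 1" row as power density (power per unit area). That reading is an assumption, since the slide does not say which power metric it means.

```latex
% Classical Dennard scaling: dimensions and supply voltage shrink by 1/\kappa,
% so capacitance C \propto 1/\kappa and voltage V \propto 1/\kappa.
\begin{align*}
A &\propto \frac{1}{\kappa^{2}} && \text{(area per transistor)} \\
\tau &\propto \frac{1}{\kappa} && \text{(gate delay, so } f \propto \kappa\text{)} \\
P &= C V^{2} f \propto \frac{1}{\kappa}\cdot\frac{1}{\kappa^{2}}\cdot\kappa
   = \frac{1}{\kappa^{2}} && \text{(power per transistor)} \\
\frac{P}{A} &\propto \frac{1/\kappa^{2}}{1/\kappa^{2}} = 1 && \text{(power density)}
\end{align*}
```

Once supply voltage can no longer shrink with feature size, the last line stops holding and power density rises with each generation, which is the sense in which this is "not the case anymore".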

  13. HARP, Q100, Widx, Our approach, Ibex

  14. Outline: Intro; NDP for data systems: Past and present; The architecture of JAFAR; Experimental results; Conclusion

  15. Opportunity for NDP. Figure labels: Query, Lots of data, Host server, Database … Many rows fail the query predicate and are discarded. Filter data before it is sent to CPU.

  16. JAFAR: “Just” A Filtering Accelerator on Relations. Figure labels: CPU (×4), Last level cache, System bus + memory controller, JAFAR (×2), DRAM (×2)

  17. Figure labels: Rank, Chip, Row address decoder, Bank 0 (×4), Sense Amps (×4), Column address decoder

  18. Figure labels: Rank, Bank, Row address decoder, Bank 0 (×4), Array 0, Array 1, Array 2, Array 3, Sense Amps (×4), Column address decoder

  19. JAFAR: Overall design. Figure labels: CPU (×4), Last level cache, System bus + memory controller, JAFAR (×2), DRAM (×2)

  20. JAFAR context. Figure labels: From CPU, Memory access arbiter, RAS, CAS, Bank 0 (×4), Sense Amps (×4), IO buffer, JAFAR

  21. JAFAR architecture. Figure labels: From IO buffer, Data latch, Left Opcode, Right Opcode, ALU (×2), Comparison is true?, page offset bitmask, write enable, Page offset counter, Output buffer

  22. Programming JAFAR. int errno = select_jafar(void* col_data, int range_low, int range_high, uint8_t* out_buf, size_t num_input_rows, size_t* num_output_rows);
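To make the call on the slide above concrete, here is a minimal software model of what a range select with that argument list might compute. It is a sketch, not the JAFAR driver: the name `select_jafar_reference` is hypothetical, and it assumes 32-bit `int` column values, inclusive bounds, an output bitmask with one bit per input row (bit `i % 8` of byte `i / 8`), and a return value of 0 on success. The slide confirms none of these details.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical CPU-side model of a range select over an int column.
 * Assumptions (not from the slides): bounds are inclusive, out_buf is a
 * bitmask with one bit per input row, and 0 means success. */
int select_jafar_reference(const void *col_data, int range_low, int range_high,
                           uint8_t *out_buf, size_t num_input_rows,
                           size_t *num_output_rows)
{
    assert(col_data != NULL && out_buf != NULL && num_output_rows != NULL);

    const int *col = (const int *)col_data;
    size_t matches = 0;

    /* Clear the bitmask: one bit per row, rounded up to whole bytes. */
    memset(out_buf, 0, (num_input_rows + 7) / 8);

    for (size_t i = 0; i < num_input_rows; i++) {
        /* Two comparisons per value, combined with a logical AND. */
        if (col[i] >= range_low && col[i] <= range_high) {
            out_buf[i / 8] |= (uint8_t)(1u << (i % 8));
            matches++;
        }
    }

    *num_output_rows = matches;
    return 0;
}
```

A caller would allocate `(num_input_rows + 7) / 8` bytes for `out_buf`, then read the set bits to find qualifying row positions.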

  23. Handling multiple modules. Figure labels: CPU (×4), Last level cache, System bus + memory controller, JAFAR (×2), DRAM (×2)

  24. Handling multiple modules. Fill up each module first. Figure labels: CPU (×4), Last level cache, System bus + memory controller, JAFAR (×2), DRAM (×2)

  25. Handling multiple modules. Interleave data across modules. Figure labels: CPU (×4), Last level cache, System bus + memory controller, JAFAR (×2), DRAM (×2)
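The two placement policies named on the slides above ("fill up each module first" and "interleave data across modules") can be written as two row-to-module mappings. This is an illustrative sketch only: the function names, the per-module capacity in rows, and the interleaving chunk size are assumptions, and the slides do not give the actual mapping or granularity.

```c
#include <assert.h>
#include <stddef.h>

/* Fill-first policy: rows [0, rows_per_module) go to module 0, the next
 * rows_per_module rows to module 1, and so on. (Hypothetical helper.) */
size_t module_fill_first(size_t row, size_t rows_per_module)
{
    assert(rows_per_module > 0);
    return row / rows_per_module;
}

/* Interleaved policy: consecutive chunks of chunk_rows rows are assigned
 * to modules round-robin. (Hypothetical helper; chunk size is assumed.) */
size_t module_interleaved(size_t row, size_t chunk_rows, size_t num_modules)
{
    assert(chunk_rows > 0 && num_modules > 0);
    return (row / chunk_rows) % num_modules;
}
```

Under the interleaved mapping, a scan over one column touches every module. Under fill-first, a column smaller than one module's capacity stays entirely within a single module.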

  26. Coordinating memory access. The CPU and JAFAR cannot simultaneously attempt to access memory. CPU grants JAFAR ownership of a DRAM rank for a period of time. Possible mechanism: DRAM mode registers.

  27. Experimental setup. Simulation framework: gem5. Out-of-order CPU, Classic cache model, SimpleDRAM

  28. Experimental setup. Queries, input data, and database. select * from table where column < n; In-house column-store database. 4 million rows of unsorted integers, 0 to 1M
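For reference, the benchmark query on the slide above (`select * from table where column < n`) reduces to a single-column scan in a column store. The sketch below shows a plain CPU version that collects the positions of qualifying rows. The function name and the position-list output are assumptions for illustration, and this is not necessarily the baseline used in the experiments.

```c
#include <assert.h>
#include <stddef.h>

/* Scans an int column and writes the positions of rows with
 * column[i] < n into out_pos (which must hold num_rows entries).
 * Returns the number of qualifying rows. (Hypothetical helper.) */
size_t scan_less_than(const int *column, size_t num_rows, int n,
                      size_t *out_pos)
{
    assert(column != NULL && out_pos != NULL);
    size_t count = 0;
    for (size_t i = 0; i < num_rows; i++) {
        if (column[i] < n)
            out_pos[count++] = i;
    }
    return count;
}
```

Every value in the column crosses the memory bus to the CPU here, even when most rows fail the predicate.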

  29. Experimental results

  30. Memory contention. Scheduling of ownership transfers will be important. What would JAFAR’s performance look like without a scheduler?

  31. Memory contention. Figure labels: CPU, Memory requests, Idle period, Memory requests, JAFAR can execute
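The idea in the slide above, that JAFAR can execute during idle periods between bursts of CPU memory requests, can be explored with a small trace analysis: given the cycle timestamps of CPU memory requests, count the gaps long enough to fit a JAFAR operation. The function name, the trace format, and the minimum-gap threshold are all assumptions made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Counts gaps between consecutive CPU memory requests that last at least
 * min_gap cycles. req_cycles must be sorted in non-decreasing order.
 * (Hypothetical helper for illustration.) */
size_t count_usable_idle_periods(const uint64_t *req_cycles, size_t n,
                                 uint64_t min_gap)
{
    assert(req_cycles != NULL || n == 0);
    size_t usable = 0;
    for (size_t i = 1; i < n; i++) {
        assert(req_cycles[i] >= req_cycles[i - 1]);
        if (req_cycles[i] - req_cycles[i - 1] >= min_gap)
            usable++;
    }
    return usable;
}
```

For a trace with requests at cycles 0, 10, 500, 520 and 2000 and a threshold of 400 cycles, two gaps qualify: 10 to 500 and 520 to 2000.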

  32. Idle periods on TPC-H

  33. JAFAR as a framework. More operators: Aggregations, Projections, Sort, Joins (?)

  34. JAFAR as a framework. Data types and layouts. Row-stores and hybrids: multiple filters per row, efficient projections. Variable length datatypes: process on CPU?

  35. NDP is an exciting opportunity for innovation in data systems

  36. NDP is a promising solution to the memory wall for data systems. JAFAR provides up to 9x speedup on simple select queries. JAFAR is built on an extensible framework for accelerating data systems.

  37. Thank you
