how fast indexing makes databases greener

How Fast Indexing Makes Databases Greener Martin Farach-Colton - PowerPoint PPT Presentation


  1. How Fast Indexing Makes Databases Greener
     Martin Farach-Colton (Rutgers and Tokutek), Michael A. Bender (Stony Brook and Tokutek), Bradley C. Kuszmaul (MIT and Tokutek)

  3. Fast Indexing Makes Databases Greener
     Obligatory reference to EPA study.
     • Data centers used 1.5% of US electricity in 2006.
     • Servers: 50% of data-center power
     • Storage systems: 27% of data-center power
     [Battles, Belleville, Grabau, Maurier ’07]
     Databases are both storage and CPU intensive.
     We believe big energy savings & performance gains are still on the table.

  5. Modern indexing structures overcome disk-seek bottlenecks of traditional structures
     Insert/delete cost:
     • B-tree: $O(\log_B N) = O\left(\frac{\log N}{\log B}\right)$
     • Fractal Tree® structure: $O\left(\frac{\log N}{B^{1-\varepsilon}}\right)$
     If $B = 1024$, then $B/\log B \approx 100$. → 100x speedup. (No asymptotic loss in point queries.)
     Other structures supporting fast inserts:
     • [O'Neil, Cheng, Gawlick, O'Neil 96]
     • [Arge 03]
     • [Graefe 03]
     • [Brodal, Fagerberg 03]
     • [Buchsbaum, Goldwasser, Venkatasubramanian, Westbrook 00]
     • [Brodal, Demaine, Fineman, Iacono, Langerman, Munro 00]
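
The arithmetic behind the "100x" figure can be checked with a short sketch. It plugs numbers into the two bounds from the slide and ignores the constants hidden by the big-O notation, so treat the output as an order-of-magnitude illustration, not a measurement. The function names and the default `eps=0.0` (which reproduces the slide's $B/\log B$ ratio) are assumptions made for this example.

```python
import math


def btree_insert_ios(n, b):
    """Block transfers per insert under the B-tree bound: log_B N = log N / log B."""
    return math.log2(n) / math.log2(b)


def fractal_insert_ios(n, b, eps=0.0):
    """Amortized block transfers per insert under the bound log N / B^(1 - eps).

    eps=0.0 is an assumption chosen to match the slide's B / log B ratio.
    """
    return math.log2(n) / b ** (1.0 - eps)


def insert_speedup(b, eps=0.0):
    """Ratio of the two bounds; N cancels, leaving B^(1 - eps) / log B."""
    return b ** (1.0 - eps) / math.log2(b)


if __name__ == "__main__":
    n, b = 10**9, 1024  # one billion rows, 1024 keys per block (illustrative values)
    print("B-tree I/Os per insert:      ", btree_insert_ios(n, b))
    print("Fractal Tree I/Os per insert:", fractal_insert_ios(n, b))
    print("Speedup (B / log B):         ", insert_speedup(b))
    print("Speedup with eps = 0.5:      ", insert_speedup(b, eps=0.5))
```

With `eps=0.5` the same formula gives a much smaller ratio, which shows how strongly the claimed speedup depends on where the structure sits on the insert/query trade-off.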

  7. Ex.: TokuDB® supports >20,000 index inserts/sec even on high-entropy workloads.
     • Effectively transforms random I/O into sequential I/O.
     [Figure: iiBench 1B Row Insert Test. Rows/second vs. rows inserted (0 to 1,000,000,000) for InnoDB and TokuDB.]

  10. One reason why
      Fast insertions mean
      ➡ we can efficiently maintain sophisticated indexes,
      ➡ both insert- & query-dominated workloads also can be more energy-efficient.
      Customer hat: Many users who think they have query bottlenecks actually have insertion bottlenecks. Customer issues can be solved by fast inserts into sophisticated indexes.

  11. Another reason why
      Promise of green algorithms: enable more power-efficient hardware.
      Data centers are already designed around algorithmic specs because existing algorithms should run well on existing hardware.
      Algorithms + Enabled Hardware = Big Win

  12. Another reason why
      Example: Data centers use many small-capacity disks rather than a few large-capacity disks
      • Why? One reason is to get more I/Os.
      • Fractal Tree indexes don’t need more spindles.
      Power consumption of disks
      • Enterprise 80 to 160 GB disk runs at 4W (idle power).
      • Enterprise 1-2 TB disk runs at 8W (idle power).
      Savings on the table: ~10x in storage
      • Other considerations modify this factor
        ‣ e.g., CPUs necessary to drive disks, scale-out infrastructure, cooling, etc.
      Algorithms + Enabled Hardware = Big Win
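
As a rough check on the "~10x in storage" figure, the sketch below converts the slide's idle-power numbers into watts per terabyte. It assumes capacity is the only thing being bought (no extra spindles needed for I/O rate) and leaves out the other considerations the slide lists, such as CPUs, scale-out infrastructure, and cooling. The helper name `watts_per_tb` is hypothetical.

```python
def watts_per_tb(idle_watts, capacity_gb):
    """Idle power divided by capacity, in watts per terabyte (1 TB = 1000 GB)."""
    return idle_watts * 1000.0 / capacity_gb


# Figures taken from the slide: small disks idle at 4 W, large disks at 8 W.
small_disks = [watts_per_tb(4.0, gb) for gb in (80, 160)]
large_disks = [watts_per_tb(8.0, gb) for gb in (1000, 2000)]

# Pessimistic case: biggest small disk against smallest large disk.
low_saving = min(small_disks) / max(large_disks)
# Optimistic case: smallest small disk against biggest large disk.
high_saving = max(small_disks) / min(large_disks)

if __name__ == "__main__":
    print("small disks, W/TB:", small_disks)
    print("large disks, W/TB:", large_disks)
    print("saving factor range:", low_saving, "to", high_saving)
```

Under these assumptions the saving ranges from about 3x to about 12x depending on which disk sizes are compared, so the slide's ~10x sits toward the optimistic end.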

  15. Open Prob 1: Highly Concurrent & Multithreaded Indexing
      Develop concurrent, multithreaded indexing data structures for slow, high-core-count machines
      • server CPU: ~100 W
      • laptop CPU: 5-10 W
        ‣ 4x less capable, 10-20x less power hungry
        ‣ 5x more energy efficient
      • mobile-phone CPU
        ‣ another factor of 5 is on the table
      Fractal Trees drive more CPUs than B-trees
      • CPU intensive, e.g., TokuDB is CPU bound
      • big efficiency gains are on the table
      [Figure: iiBench 1B Row Insert Test. Rows/second vs. rows inserted (0 to 1,000,000,000) for InnoDB and TokuDB.]
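
The "5x more energy efficient" estimate follows from dividing the power saving by the capability loss. The sketch below works through that arithmetic using only the slide's round numbers. The function name is hypothetical, and the calculation assumes work per joule scales linearly with both factors, which real hardware only approximates.

```python
def efficiency_gain(times_less_capable, times_less_power):
    """Work-per-joule improvement of the slower chip relative to the faster one.

    A chip that does 1/c of the work on 1/p of the power delivers p / c
    times as much work per joule.
    """
    return times_less_power / times_less_capable


# Laptop vs. server CPU, using the slide's figures: 4x less capable,
# 10-20x less power hungry (5-10 W against ~100 W).
laptop_low = efficiency_gain(4.0, 10.0)
laptop_high = efficiency_gain(4.0, 20.0)

# The slide suggests a mobile-phone CPU offers "another factor of 5".
phone_high = laptop_high * 5.0

if __name__ == "__main__":
    print("laptop vs. server:", laptop_low, "to", laptop_high)
    print("phone vs. server (optimistic):", phone_high)
```

The slide's 5x corresponds to the 20x-less-power end of the range. At the 10x end the gain is 2.5x.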

  16. Open Prob 2: Energy-Efficient SSD/Rotational Disk Hybrid
      Design an SSD/rotational disk hybrid for a streaming-B-tree-based storage system.
      • Rotational devices are more efficient for sequential I/O.
      • SSDs are more efficient for random I/O.
      Can a hybrid offer energy savings by using each device for the workload it is best suited for?
      [Figure: insertion rate vs. cumulative insertions (0 to 1.5e+08) for TokuDB and InnoDB, each on FusionIO, X25-E, and RAID10 storage.]
      Fractal Trees deliver >10x speedups on SSDs vs B-trees

  18. Open Prob 3: The proof is in the pudding
      "Ten thousand? We were talking about a lot more money than this."
      "Yes, sir, we were, but this is genuine coin of the realm. With a dollar of this, you can buy ten dollars of talk."
      We require research in the classics: algorithms, parallelism, concurrency, data structures, storage systems, etc.
