Runaway complexity in Big Data

Runaway complexity in Big Data, and a plan to stop it. Nathan Marz

Nathan Marz (@nathanmarz). Agenda: common sources of complexity in data systems; design for a fundamentally better data system. What is a data system? A system that manages the storage


  1. Batch views are optimized for the queries they serve

  2. Batch views: batch-writable from MapReduce; fast random reads; examples: ElephantDB, Voldemort

  3. Batch view database: no random writes required!

  4. Properties (All data → Function → Batch view): Simple. ElephantDB is only a few thousand lines of code

  5. Properties (All data → Function → Batch view): Scalable

  6. Properties (All data → Function → Batch view): Highly available

  7. Properties (All data → Function → Batch view): Can be heavily optimized (b/c no random writes)

  8. Properties (All data → Function → Batch view): Normalized
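
The idea running through these slides, a batch view computed as a function of all the data and then served from a store that needs only fast random reads and no random writes, can be illustrated with a small Python sketch. It is not ElephantDB or Voldemort code. The pageview dataset, `compute_batch_view`, and `ReadOnlyBatchView` are hypothetical names chosen for illustration, and a plain in-process function stands in for the MapReduce job.

```python
from bisect import bisect_left
from collections import Counter


def compute_batch_view(all_data):
    """Pure function of the full dataset (hypothetical query: pageviews per URL)."""
    counts = Counter(url for url, _timestamp in all_data)
    # Sort once at build time; the result is never mutated afterwards.
    return sorted(counts.items())


class ReadOnlyBatchView:
    """Immutable key/value index: bulk-built, random reads only, no random writes."""

    def __init__(self, sorted_items):
        self._keys = tuple(key for key, _ in sorted_items)
        self._values = tuple(value for _, value in sorted_items)

    def get(self, key, default=None):
        # Binary search over the sorted keys gives a fast random read.
        i = bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._values[i]
        return default


# Illustrative (url, timestamp) records standing in for "all data".
all_data = [("/home", 1), ("/about", 2), ("/home", 3)]
view = ReadOnlyBatchView(compute_batch_view(all_data))
print(view.get("/home"))

# New data arrives: recompute the whole view and swap it in,
# rather than updating the existing view in place.
all_data.append(("/about", 4))
view = ReadOnlyBatchView(compute_batch_view(all_data))
```

Because the view is rebuilt and swapped instead of being updated in place, the serving side needs no write path at all, which is what lets a real batch view database stay small and be heavily optimized for reads.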
