i hate your database i hate your database
play

I Hate Your Database I Hate Your Database Andrew Godwin Andrew - PowerPoint PPT Presentation

flickr.com/96dpi flickr.com/96dpi I Hate Your Database I Hate Your Database Andrew Godwin Andrew Godwin @andrewgodwin @andrewgodwin flickr.com/parkerblohm flickr.com/parkerblohm Hate? Databases? Hate? Databases? Misuse Misuse


  1. flickr.com/96dpi flickr.com/96dpi I Hate Your Database I Hate Your Database Andrew Godwin Andrew Godwin @andrewgodwin @andrewgodwin

  2. flickr.com/parkerblohm flickr.com/parkerblohm Hate? Databases? Hate? Databases?

  3. · Misuse · Misuse · Ignorance · Ignorance · Lies · Lies

  4. flickr.com/maistora flickr.com/maistora Different databases, Different databases, different occasions different occasions

  5. Graph Graph Relational Relational Object / Hierarchial Object / Hierarchial Document Document Spatial Spatial Key-Value Time-series / RRD Key-Value Time-series / RRD Search Search

  6. PostgreSQL PostgreSQL PostgreSQL PostgreSQL Relational Relational MySQL MySQL MySQL MySQL SQLite SQLite SQLite SQLite MongoDB MongoDB Document Document CouchDB CouchDB Redis Redis Key-value Key-value Cassandra Cassandra Riak Riak

  7. flickr.com/iamdabe flickr.com/iamdabe Some Quick Theory Some Quick Theory

  8. Atomicity tomicity A Consistency onsistency C Isolation solation I Durability urability D

  9. Consistency onsistency C Availability vailability A Partition Tolerance artition Tolerance P

  10. flickr.com/maistora flickr.com/maistora Relational Databases Relational Databases

  11. Common Pitfalls Common Pitfalls SELECT item1, item2, item3 FROM basket; SELECT item1, item2, item3 FROM basket; INSERT INTO groups (name, people) INSERT INTO groups (name, people) VALUES ('friends', 'aaron,betty,chris,damien') VALUES ('friends', 'aaron,betty,chris,damien') SELECT file_bytes FROM pictures WHERE filename = SELECT file_bytes FROM pictures WHERE filename = 'foo.jpg' 'foo.jpg'

  12. names = set() names = set() for book in Book.objects.filter(year=2012): for book in Book.objects.filter(year=2012): names.add(book.author.name) names.add(book.author.name)

  13. names = Author.objects.filter( names = Author.objects.filter( books__year=2012 books__year=2012 ).values_list('name', flat=True).distinct() ).values_list('name', flat=True).distinct()

  14. MySQL MySQL No transactional DDL No transactional DDL Poor query optimiser Poor query optimiser MyISAM: Full-table locking, no transactions MyISAM: Full-table locking, no transactions Oracle Oracle Very fast for some operations Very fast for some operations

  15. SQLite SQLite Little integrity checking (slowly being fixed) Little integrity checking (slowly being fixed) Impossible to do some table alterations Impossible to do some table alterations No concurrent access No concurrent access Very low overhead Very low overhead Very portable Very portable

  16. PostgreSQL PostgreSQL Slow default configuration Slow default configuration Can be a little harder to learn / less familiar Can be a little harder to learn / less familiar Almost too many features Almost too many features Incredibly reliable Incredibly reliable

  17. flickr.com/babyowls flickr.com/babyowls Document Databases Document Databases

  18. db.insert({ db.insert({ 'name': 'Sally', 'name': 'Sally', 'tags': ['django', 'python', 'search'], 'tags': ['django', 'python', 'search'], 'addresses': [ 'addresses': [ {'type': 'jabber', 'jid': 'sally@eg.com'}, {'type': 'jabber', 'jid': 'sally@eg.com'}, {'type': 'phone', 'number': '011899981199'}, {'type': 'phone', 'number': '011899981199'}, ], ], }) }) db.find({ db.find({ 'tags': 'python', 'tags': 'python', 'addresses.type': 'jabber', 'addresses.type': 'jabber', }) })

  19. Advantages Advantages No fixed schema No fixed schema Low barrier to entry Low barrier to entry Closer to Python datatypes Closer to Python datatypes

  20. Problems Problems Immature (but improving) Immature (but improving) No transactions No transactions No integrity checking No integrity checking

  21. flickr.com/zebble flickr.com/zebble Key-value stores Key-value stores

  22. db.set('foo', 'bar) db.set('foo', 'bar) x = db.get('foo') x = db.get('foo') db.sadd('names', 'andrew') db.sadd('names', 'andrew') db.sadd('names', 'brian') db.sadd('names', 'brian') y = db.scard('names') y = db.scard('names')

  23. Traits Traits Horizontal scaling (but with drawbacks) Horizontal scaling (but with drawbacks) Extremely fast Extremely fast Can only fetch objects by key Can only fetch objects by key Batch/map-reduce queries Batch/map-reduce queries Transactions not possible Transactions not possible

  24. flickr.com/tusnelda flickr.com/tusnelda Other database types Other database types

  25. Spatial Spatial Knowledge of projections useful Knowledge of projections useful Spatial indexes really speed up some problems Spatial indexes really speed up some problems Generally add-on to existing DB Generally add-on to existing DB

  26. Filesystems Filesystems Hierarchial key-value store Hierarchial key-value store Allows multiple writers for appends Allows multiple writers for appends Supports very large files Supports very large files

  27. Graph Databases Graph Databases Allow efficient neighbour queries Allow efficient neighbour queries Generally not much use for anything else Generally not much use for anything else

  28. Round-Robin Database Round-Robin Database Deliberately loses old data Deliberately loses old data Useful for logging or statistics Useful for logging or statistics

  29. flickr.com/pagedooley flickr.com/pagedooley Final Thoughts Final Thoughts

  30. It's unlikely your data all It's unlikely your data all fits in one paradigm. fits in one paradigm.

  31. Just buying bigger servers Just buying bigger servers goes a long way goes a long way

  32. If it sounds too good to be true, If it sounds too good to be true, it probably is. it probably is.

  33. flickr.com/oimax flickr.com/oimax Fin. Fin. Andrew Godwin / @andrewgodwin Andrew Godwin / @andrewgodwin

Recommend


More recommend