
Scaling Hibernate - Emmanuel Bernard, Max Ross



  1. Scaling Hibernate Emmanuel Bernard - Max Ross

  2. Emmanuel Bernard • Hibernate Search in Action • blog.emmanuelbernard.com • twitter.com/emmanuelbernard

  3. Max Ross • Google App Engine • Hibernate Shards

  4. What is scalability?

  5. What is scalability? • Users • Resource • Data • Uptime

  6. How does Hibernate stand? • Limitations? • SQL optimizations • 2nd level cache • Conversation [Diagram: three Nodes, each with several Sessions and a 2nd level cache, all connected to one DB]

  7. Changes in mass • Bulk insert / update / delete • Stateless session

  8. To Googolzillions and beyond

  9. Googolzillion things? Who are you? • Social network • SaaS

  10. Problem • Same data model • Too much load • Too much data • Too many lawyers

  11. Separating customer data

  12. Logical separation • All customers share tables • Manual or Hibernate Filter [Diagram: Application → SessionFactory → Schema → DB]

  13. One user per schema • One SessionFactory per schema [Diagram: Application with two SessionFactories, each pointing to its own Schema in the DB] • Rewrite SQL [Diagram: Application with a single SessionFactory serving two Schemas in the DB]

  14. Use database security • Map JAAS credentials to DB credentials • One connection (pool) per user

  15. Oracle security • Oracle VPD • Application defines active user

  16. Storing in multiple databases

  17. SessionFactory == DB • Same schema across DBs • Expensive in RAM • Sharing state across SessionFactories • Data isolated is probably doable

  18. How many customers per DB? • One • One per schema • Several per schema • Dispatch customer to the right SessionFactory
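
The last bullet, dispatching a customer to the right SessionFactory, can be sketched as a simple lookup table. This is only an illustration: `CustomerRouter` and its methods are hypothetical names, and the type parameter `F` stands in for `org.hibernate.SessionFactory` so that the sketch compiles without Hibernate on the classpath.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, not part of Hibernate. F stands in for SessionFactory.
class CustomerRouter<F> {
    private final Map<String, F> factoriesByCustomer = new ConcurrentHashMap<>();

    // Record which factory (and therefore which DB or schema) serves a customer.
    void register(String customerId, F factory) {
        factoriesByCustomer.put(customerId, factory);
    }

    // Look up the factory for a customer; fail loudly rather than fall back
    // to a default, so one customer never lands in another customer's data.
    F factoryFor(String customerId) {
        F factory = factoriesByCustomer.get(customerId);
        if (factory == null) {
            throw new IllegalArgumentException(
                    "No SessionFactory registered for customer " + customerId);
        }
        return factory;
    }
}
```

Several customers can be registered against the same factory, which covers the "several per schema" case; the lookup itself does not change.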

  19. Adjusting the application layer

  20. Homogeneous nodes • Memory • Too many connections • Slow to start [Diagram: three Application nodes, each holding a SessionFactory and a connection pool for each of two DBs]

  21. Specialized nodes • Load balancing rules • Easy scalability • Efficient resource-wise [Diagram: requests dispatched per user to four Application nodes; seven SessionFactory / connection pool pairs in total, connected to five DBs]

  22. What if you need to query all your data?

  23. Hibernate Shards

  24. Simplified Horizontal Partitioning • Separates app logic from federation logic • Standard Hibernate API • Unified view of your data

  25. Shard Strategy • Federation logic is application specific • Selection • Resolution • Access [Diagram: Model Object → ? ? ? → Shard 1, Shard 2, Shard 3]

  26. Shard Selection • On which shard do we create the record? • Round robin • Capacity based • Attribute based • Performance based
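
As an illustration of the round robin option, here is a minimal sketch of a selector that hands out shard IDs in turn. Hibernate Shards defines its own strategy interfaces, which this sketch does not reproduce; `RoundRobinShardSelector` and `selectShardForNewObject` are hypothetical names, and shards are identified by plain integers for simplicity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper: picks the shard on which a new record is created.
class RoundRobinShardSelector {
    private final List<Integer> shardIds;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinShardSelector(List<Integer> shardIds) {
        if (shardIds.isEmpty()) {
            throw new IllegalArgumentException("At least one shard is required");
        }
        this.shardIds = new ArrayList<>(shardIds); // defensive copy
    }

    // Thread-safe: each call advances a shared counter by one.
    // floorMod keeps the index non-negative even after the counter overflows.
    int selectShardForNewObject() {
        int index = Math.floorMod(next.getAndIncrement(), shardIds.size());
        return shardIds.get(index);
    }
}
```

With shard IDs 10, 20 and 30, four successive calls return 10, 20, 30 and then 10 again. Capacity-, attribute- or performance-based selection would replace the counter with a decision based on shard statistics or on the object being saved.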

  27. Shard Resolution • On which shard do we find the record? • Exhaustive search • Map ID ranges to shards • Distributed cache
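
The "map ID ranges to shards" option can be sketched with a sorted map keyed by the first ID of each range. This is an assumption about one way to do it, not the Hibernate Shards implementation; `IdRangeShardResolver` and its methods are hypothetical names, and IDs are assumed to be numeric.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: finds the shard that holds an existing record by its ID.
class IdRangeShardResolver {
    // Key: first ID of a range. Value: shard owning IDs from that key
    // up to (but excluding) the next key.
    private final TreeMap<Long, Integer> shardByRangeStart = new TreeMap<>();

    void addRange(long firstId, int shardId) {
        shardByRangeStart.put(firstId, shardId);
    }

    int resolveShard(long id) {
        // floorEntry returns the entry with the greatest key <= id, or null.
        Map.Entry<Long, Integer> entry = shardByRangeStart.floorEntry(id);
        if (entry == null) {
            throw new IllegalArgumentException("No shard covers id " + id);
        }
        return entry.getValue();
    }
}
```

Compared with an exhaustive search, this resolves a record with one in-memory lookup and no database round trip, at the cost of requiring ID generation to respect the ranges.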

  28. Shard Access • How do we apply operations across shards? • Serially • In parallel (bring your own thread pool) • Hybrid
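
The parallel option, with a caller-supplied thread pool, might look like the sketch below. It is a generic fan-out and merge, not the Hibernate Shards access strategy itself: `ParallelShardAccess` and `queryAllShards` are hypothetical names, `S` stands in for whatever handle represents one shard (for example a Session), and `R` is the result row type.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper: runs the same query on every shard and merges the results.
class ParallelShardAccess {
    static <S, R> List<R> queryAllShards(List<S> shards,
                                         Function<S, List<R>> query,
                                         ExecutorService pool)
            throws InterruptedException, ExecutionException {
        // One task per shard; the pool decides how many actually run at once.
        List<Callable<List<R>>> tasks = new ArrayList<>();
        for (S shard : shards) {
            tasks.add(() -> query.apply(shard));
        }
        List<R> merged = new ArrayList<>();
        // invokeAll blocks until every task is done and returns the futures
        // in the same order as the tasks, so the merge order is deterministic.
        for (Future<List<R>> future : pool.invokeAll(tasks)) {
            merged.addAll(future.get());
        }
        return merged;
    }
}
```

Passing a single-threaded executor gives the serial behavior with the same code path. Sorting, paging and aggregation across shards are not handled here and would have to be applied to the merged list.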

  29. Writing the app is the easy part • Operational challenges/risks are amplified • Virtual shards can help

  30. Virtual Shards [Diagram: Application → Sharded SessionFactory → Virtual Shard 1, Virtual Shard 2, Virtual Shard 3 → Physical Shard 1]

  31. Virtual Shards [Diagram: Application → Sharded SessionFactory → Virtual Shard 1, Virtual Shard 2, Virtual Shard 3 → Physical Shard 1 and Physical Shard 2]
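
The idea behind the two virtual shard slides can be sketched as one level of indirection: records are tied to a virtual shard ID, and a separate table says which physical shard currently hosts each virtual shard. `VirtualShardMap` is a hypothetical name, and which virtual shard moves to the second physical shard is an assumption made for the example; the slides do not say.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: maps virtual shard IDs to the physical shard hosting them.
class VirtualShardMap {
    private final Map<Integer, Integer> physicalByVirtual = new HashMap<>();

    // Assign (or later reassign) a virtual shard to a physical shard.
    void assign(int virtualShardId, int physicalShardId) {
        physicalByVirtual.put(virtualShardId, physicalShardId);
    }

    int physicalShardFor(int virtualShardId) {
        Integer physical = physicalByVirtual.get(virtualShardId);
        if (physical == null) {
            throw new IllegalArgumentException(
                    "Unknown virtual shard " + virtualShardId);
        }
        return physical;
    }
}
```

Starting with virtual shards 1, 2 and 3 all assigned to physical shard 1, adding a second database only requires moving the data of, say, virtual shard 3 and calling `assign(3, 2)`. The virtual shard IDs that records were created with stay unchanged.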

  32. Coming Soon • Static Data • Full-fledged ShardedQuery • JPA

  33. Hibernate Search

  34. Full-text search your domain objects • Hibernate + Lucene • Same programmatic model • Index synchronized

  35. Human queries • Data set • Word centric • Typos / Synonyms • Relevance

  36. SQL underperforms • Wildcard • Table/Index full scan • Multiple joins • Relevance?

  37. DBA Customer

  38. Full-text search • Move load away from the DB • Replace or complement searches

  39. Scalability: Symmetric cluster • Distributed lock • Immediate visibility • Affects front end [Diagram: two Hibernate + Hibernate Search nodes, each sending search requests and index updates to a shared Lucene Directory (Index), alongside the Database]

  40. Scalability: Asymmetric cluster • Search local / change sent to master • Asynchronous indexing (delay) • No front end extra cost / good scalability [Diagram: Slave (Hibernate + Hibernate Search) answers search requests from its own copy of the Lucene Directory (Index) and sends index update orders to the JMS Master queue; Master (Hibernate + Hibernate Search) processes the queue and updates the master Lucene Directory (Index), which is copied to the slaves; Database]
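
The flow on this slide (slaves queue index changes, the master applies them later, slaves receive copies of the master index) can be sketched in memory. This is a toy model under stated assumptions, not Hibernate Search code: an in-process `BlockingQueue` stands in for the JMS queue, a list of document IDs stands in for the Lucene index, and `MasterIndexer` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Toy model of the master node: a queue of pending updates plus the index.
class MasterIndexer {
    private final BlockingQueue<String> updateQueue = new LinkedBlockingQueue<>();
    private final List<String> masterIndex = new ArrayList<>();

    // Called from a slave: only enqueues, so the front end pays no indexing cost.
    void sendUpdate(String documentId) {
        updateQueue.add(documentId);
    }

    // Called on the master (for example periodically): applies queued changes
    // and returns how many were applied.
    synchronized int applyPendingUpdates() {
        List<String> batch = new ArrayList<>();
        updateQueue.drainTo(batch);
        masterIndex.addAll(batch);
        return batch.size();
    }

    // Copy of the master index, as a slave would receive it for local searches.
    synchronized List<String> snapshotForSlaves() {
        return new ArrayList<>(masterIndex);
    }
}
```

A snapshot taken between `sendUpdate` and `applyPendingUpdates` does not contain the new document yet, which is the indexing delay the slide mentions.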

  41. Scalabilities (sic) • Hibernate a good citizen • Isolating customer data • Deal with multiple databases • Hibernate Shards • Hibernate Search

  42. Q&A • For more info • Hibernate Search in Action • Java Persistence with Hibernate • Max’s podcasts • http://google-code-updates.blogspot.com/2007/08/google-developer-podcast-episode-six.html • http://www.javaworld.com/podcasts/jtech/2008/072408jtech.html • hibernate.org
