Scaling Hibernate Emmanuel Bernard - Max Ross
Emmanuel Bernard • Hibernate Search in Action • blog.emmanuelbernard.com • twitter.com/emmanuelbernard
Max Ross • Google App Engine • Hibernate Shards
What is scalability?
What is scalability? • Users • Resource • Data • Uptime
How does Hibernate stand? • Limitations? • SQL optimizations • 2nd level cache • Conversation [Diagram: several nodes, each with its own Sessions and 2nd level cache, sharing one DB]
Changes in mass • Bulk insert / update / delete • Stateless session
To Googolzillions and beyond
Googolzillion things? Who are you? • Social network • SaaS
Problem • Same data model • Too much load • Too much data • Too many lawyers
Separating customer data
Logical separation • All customers share tables • Manual or Hibernate Filter [Diagram: Application, SessionFactory, Schema, DB]
One user per schema • One SessionFactory per schema • Rewrite SQL [Diagrams: an Application with two SessionFactories, one per schema in the DB; an Application with a single SessionFactory serving two schemas in the DB]
Use database security • Map JAAS credentials to DB credentials • One connection (pool) per user
Oracle security • Oracle VPD • Application defines active user
Storing in multiple databases
SessionFactory == DB • Same schema across DBs • Expensive in RAM • Sharing state across SessionFactories • Data isolated is probably doable
How many customers per DB? • One • One per schema • Several per schema • Dispatch customer to the right SessionFactory
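The "dispatch customer to the right SessionFactory" step can be sketched as a two-level lookup: customer to database, then database to factory. This is a minimal sketch, not code from the talk. `CustomerRouter` and its method names are hypothetical, and the type parameter `F` stands in for `org.hibernate.SessionFactory` so the example compiles without Hibernate on the classpath.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical router: finds the factory that owns a customer's database.
 * F stands in for org.hibernate.SessionFactory (an assumption of this sketch).
 */
final class CustomerRouter<F> {
    // customer id -> name of the database holding that customer's data
    private final Map<String, String> databaseByCustomer = new ConcurrentHashMap<String, String>();
    // database name -> the single factory built for that database
    private final Map<String, F> factoryByDatabase = new ConcurrentHashMap<String, F>();

    /** Registers the one factory that was built for the given database. */
    void registerDatabase(String database, F factory) {
        factoryByDatabase.put(database, factory);
    }

    /** Places a customer on a database that has already been registered. */
    void assignCustomer(String customerId, String database) {
        if (!factoryByDatabase.containsKey(database)) {
            throw new IllegalArgumentException("Unknown database: " + database);
        }
        databaseByCustomer.put(customerId, database);
    }

    /** Returns the factory to open sessions from for this customer. */
    F factoryFor(String customerId) {
        String database = databaseByCustomer.get(customerId);
        if (database == null) {
            throw new IllegalArgumentException("Unknown customer: " + customerId);
        }
        return factoryByDatabase.get(database);
    }
}
```

Several customers can share one database (and therefore one factory), which covers the "several per schema" case; the "one per DB" case is simply a map where every customer has its own entry.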
Adjusting the application layer
Homogeneous nodes • Memory • Too many connections • Slow to start [Diagram: three Application nodes, each holding a SessionFactory and connection pool for each of the two DBs]
Specialized nodes • Load balancing rules • Easy scalability • Efficient resource-wise [Diagram: requests dispatched per user to four Application nodes; each node holds SessionFactories and connection pools only for the DBs it serves, across five DBs]
What if you need to query all your data?
Hibernate Shards
Simplified Horizontal Partitioning • Separates app logic from federation logic • Standard Hibernate API • Unified view of your data
Shard Strategy • Federation logic is application specific • Selection • Resolution • Access [Diagram: a model object to be routed to one of Shard 1, Shard 2, Shard 3]
Shard Selection • On which shard do we create the record? • Round robin • Capacity based • Attribute based • Performance based
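Round robin is the simplest of the selection policies listed above. The following is a minimal plain-Java sketch of the idea, not the Hibernate Shards API; the class name `RoundRobinShardSelector` and its method are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical selector: hands out shard indexes 0..shardCount-1 in rotation. */
final class RoundRobinShardSelector {
    private final int shardCount;
    private final AtomicInteger counter = new AtomicInteger(0);

    RoundRobinShardSelector(int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        this.shardCount = shardCount;
    }

    /** Returns the zero-based index of the shard that should receive the next new record. */
    int selectShard() {
        // floorMod keeps the result non-negative even after the counter overflows
        return Math.floorMod(counter.getAndIncrement(), shardCount);
    }
}
```

Capacity-, attribute- and performance-based policies would keep the same shape and only change how `selectShard` picks its answer.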
Shard Resolution • On which shard do we find the record? • Exhaustive search • Map ID ranges to shards • Distributed cache
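"Map ID ranges to shards" can be sketched with a sorted map keyed by the first id of each range. This is an illustrative sketch under assumed names (`IdRangeShardResolver`, `addRange`, `resolveShard`), not the resolution API of Hibernate Shards, and it assumes numeric ids.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical resolver: each range starts at its key and runs up to the next key. */
final class IdRangeShardResolver {
    // first id of a range -> index of the shard that owns the range
    private final TreeMap<Long, Integer> shardByRangeStart = new TreeMap<Long, Integer>();

    /** Declares that ids from firstId (inclusive) onward belong to the given shard. */
    void addRange(long firstId, int shard) {
        shardByRangeStart.put(firstId, shard);
    }

    /** Finds the shard owning the id by locating the closest range start at or below it. */
    int resolveShard(long id) {
        Map.Entry<Long, Integer> range = shardByRangeStart.floorEntry(id);
        if (range == null) {
            throw new IllegalArgumentException("No shard owns id " + id);
        }
        return range.getValue();
    }
}
```

Compared with an exhaustive search, this answers with a single in-memory lookup, at the price of having to allocate ids so that they fall inside the right range.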
Shard Access • How do we apply operations across shards? • Serially • In parallel (bring your own thread pool) • Hybrid
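The "in parallel (bring your own thread pool)" option can be sketched with the standard `ExecutorService`. This is a plain-Java illustration, not the Hibernate Shards access strategy itself; `ParallelShardAccess` and `queryAll` are hypothetical names, and each `Callable` stands in for running the same query against one shard.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

/** Hypothetical helper: runs one query per shard on a caller-supplied pool. */
final class ParallelShardAccess {
    private ParallelShardAccess() {
    }

    /** Returns the per-shard results concatenated in shard order. */
    static <T> List<T> queryAll(List<Callable<List<T>>> shardQueries, ExecutorService pool) {
        try {
            // invokeAll waits for every shard and returns futures in input order
            List<Future<List<T>>> answers = pool.invokeAll(shardQueries);
            List<T> merged = new ArrayList<T>();
            for (Future<List<T>> answer : answers) {
                merged.addAll(answer.get());
            }
            return merged;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            throw new IllegalStateException("Interrupted while querying shards", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A shard query failed", e.getCause());
        }
    }
}
```

The serial option is the same loop without the pool. In this sketch any sorting or limiting across shards would still have to be applied to the merged list.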
Writing the app is the easy part • Operational challenges/risks are amplified • Virtual shards can help
Virtual Shards [Diagram: Application, Sharded SessionFactory, Virtual Shards 1, 2 and 3, all hosted on Physical Shard 1]
Virtual Shards [Diagram: Application, Sharded SessionFactory, Virtual Shards 1, 2 and 3, now spread over Physical Shard 1 and Physical Shard 2]
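The two diagrams above suggest the core idea: records belong to a virtual shard for life, and only the virtual-to-physical mapping changes when a physical shard is added. A minimal sketch of that indirection follows. `VirtualShardMap` and its methods are hypothetical names, the modulo placement rule is an assumption of this sketch, and none of it is the Hibernate Shards configuration API.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical mapping from fixed virtual shards to movable physical shards. */
final class VirtualShardMap {
    private final int virtualShardCount;
    // virtual shard index -> physical shard currently hosting it
    private final Map<Integer, Integer> physicalByVirtual = new HashMap<Integer, Integer>();

    VirtualShardMap(int virtualShardCount) {
        if (virtualShardCount <= 0) {
            throw new IllegalArgumentException("virtualShardCount must be positive");
        }
        this.virtualShardCount = virtualShardCount;
    }

    /** Hosts (or re-hosts) a virtual shard on a physical shard. */
    void assign(int virtualShard, int physicalShard) {
        if (virtualShard < 0 || virtualShard >= virtualShardCount) {
            throw new IllegalArgumentException("No such virtual shard: " + virtualShard);
        }
        physicalByVirtual.put(virtualShard, physicalShard);
    }

    /** Depends only on the id, so it never changes (assumed rule: id modulo count). */
    int virtualShardOf(long recordId) {
        return (int) Math.floorMod(recordId, (long) virtualShardCount);
    }

    /** Follows the current mapping to the physical shard that holds the record. */
    int physicalShardOf(long recordId) {
        Integer physical = physicalByVirtual.get(virtualShardOf(recordId));
        if (physical == null) {
            throw new IllegalStateException("Virtual shard not hosted anywhere");
        }
        return physical;
    }
}
```

Moving from one physical shard to two is then a data copy plus one `assign` call for each relocated virtual shard; the application keeps addressing the same virtual shards.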
Coming Soon • Static Data • Full-fledged ShardedQuery • JPA
Hibernate Search
Full-text search your domain objects • Hibernate + Lucene • Same programmatic model • Index synchronized
Human queries • Data set • Word centric • Typos / Synonyms • Relevance
SQL underperforms • Wildcard • Table/Index full scan • Multiple joins • Relevance?
DBA Customer
Full-text search • Move load away from the DB • Replace or complement searches
Scalability: Symmetric cluster • Distributed lock • Immediate visibility • Affects front end [Diagram: two Hibernate + Hibernate Search nodes, each serving search requests and sending index updates to a shared Lucene Directory (Index), alongside the Database]
Scalability: Asymmetric cluster • Search local / change sent to master • Asynchronous indexing (delay) • No front end extra cost / good scalability [Diagram: a slave (Hibernate + Hibernate Search) serves search requests from its own copy of the Lucene Directory (Index) and sends index update orders to a JMS queue; the master (Hibernate + Hibernate Search) processes the queue, updates the master Lucene Directory (Index), and that index is copied to the slaves; both use the Database]
Scalabilities (sic) • Hibernate a good citizen • Isolating customer data • Deal with multiple databases • Hibernate Shards • Hibernate Search
Q&A • For more information • Hibernate Search in Action • Java Persistence with Hibernate • Max’s podcasts • http://google-code-updates.blogspot.com/2007/08/google-developer-podcast-episode-six.html • http://www.javaworld.com/podcasts/jtech/2008/072408jtech.html • hibernate.org