Scaling Dropbox
PRESLAV LE, NOVEMBER 7TH, 2016
[Diagram: Zone (west), Zone (east) and Zone (central) behind block.dropbox.com]
Fear of the unknown
MEMORY LEAK
SYNCHRONIZATION EVENT
Success story
TODAY'S TALK
• 2012
• Scaling challenges
• 2016
• Q&A
PRESLAV LE
• At Dropbox since 2013
• Projects: Magic Pocket, Infrastructure Performance, Traffic team
FILE, SYNC & SHARE
500 MILLION USERS
2012
[Architecture diagram. Labels: Dropbox's datacenters, AWS, S3, Memcached, DB, metaserver, blockserver, notification server, async processing, nginx, LB, clients]
BLOCK DATA IN S3
[2012 architecture diagram; highlighted: AWS]
METADATA IN MYSQL
[2012 architecture diagram; highlighted: Dropbox's datacenters]
1. FETCH METADATA
[2012 architecture diagram; highlighted: clients, LB, metaserver, Memcached, DB]
2. DOWNLOAD BLOCKS
[2012 architecture diagram; highlighted: clients, LB, blockserver, S3]
3. WAIT FOR NOTIFICATIONS
[2012 architecture diagram; highlighted: clients, metaserver, notification server]
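The three steps above can be read as a simple client loop. The following is a minimal sketch, not Dropbox's client: `FakeMetaserver`, `FakeBlockserver` and `sync_once` are hypothetical in-memory stand-ins, and the idea that metadata is an ordered list of block identifiers per file is an assumption made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FakeMetaserver:
    """Hypothetical stand-in: maps a file path to its ordered block ids."""
    files: dict = field(default_factory=dict)

    def fetch_metadata(self, path):
        return self.files[path]


@dataclass
class FakeBlockserver:
    """Hypothetical stand-in: maps a block id to the block's bytes."""
    blocks: dict = field(default_factory=dict)

    def download(self, block_id):
        return self.blocks[block_id]


def sync_once(path, metaserver, blockserver):
    """Run steps 1 and 2 for one file and return its contents.

    Step 3 (waiting on the notification server) is left out: a real client
    would block there and call sync_once again when a change is announced.
    """
    # 1. Fetch metadata: which blocks make up the file (assumed format).
    block_ids = metaserver.fetch_metadata(path)
    # 2. Download blocks and reassemble them in order.
    return b"".join(blockserver.download(b) for b in block_ids)
```

For example, with `FakeMetaserver({"/a.txt": ["b1", "b2"]})` and `FakeBlockserver({"b1": b"hello ", "b2": b"world"})`, `sync_once("/a.txt", ...)` reassembles the two blocks into one byte string.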
PYTHON EVERYWHERE
[2012 architecture diagram]
CLUSTER ISOLATION
[Diagram: separate metaserver clusters in Dropbox's datacenters: meta-client, meta-api, meta-mobile, meta-web]
Scaling Databases • Scaling as Organization • Scaling Software • Managing Complexity
SCALING DATABASES
[Diagram: metaserver and Memcached in front of MySQL shards shard0, shard1, … shardN, each with a master and replicas]
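A sharded master/replica layout like the one in this diagram needs some rule for picking a shard. The sketch below is one common approach, not necessarily the one Dropbox used: the hash-modulo scheme, the `NUM_SHARDS` value, and the choice to send reads to replicas are all assumptions, and `shard_for` and `route` are hypothetical names.

```python
import zlib

NUM_SHARDS = 4  # hypothetical; the slide only shows shard0 … shardN


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a key to a shard index with a stable hash (assumed scheme)."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


def route(key: str, is_write: bool) -> str:
    """Send writes to the shard's master; reads may go to a replica."""
    role = "master" if is_write else "replica"
    return f"shard{shard_for(key)}/{role}"
```

`zlib.crc32` is used instead of the built-in `hash` so that the mapping is the same in every process.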
HORIZONTAL SCALING
[Diagram: many metaservers in front of shards shard0, shard1, … shardN, each with a master and replicas]
CONNECTIONS
[Diagram: many metaservers in front of shards shard0, shard1, … shardN, each with a master and replicas]
SQL PROXY
[Diagram: SQL Proxy instances between the metaservers and shards shard0, shard1, … shardN]
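One plausible reading of the CONNECTIONS and SQL PROXY slides is that a proxy tier bounds the number of connections each database master has to hold. The arithmetic below is a sketch under that assumption; the function names and all numbers are hypothetical and are not taken from the talk.

```python
def connections_per_master_direct(num_metaservers: int, conns_per_metaserver: int) -> int:
    """Every metaserver holds its own connections to the master."""
    return num_metaservers * conns_per_metaserver


def connections_per_master_proxied(num_proxies: int, pool_size: int) -> int:
    """Only the proxies connect to the master, each with a bounded pool."""
    return num_proxies * pool_size


# Hypothetical numbers, chosen only to show the shape of the problem.
direct = connections_per_master_direct(num_metaservers=1000, conns_per_metaserver=4)
proxied = connections_per_master_proxied(num_proxies=10, pool_size=50)
print(direct, proxied)  # 4000 500
```

In the direct case the connection count grows with the metaserver fleet; with a proxy tier it depends only on the number of proxies and their pool size.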
Scaling Databases • Scaling as Organization • Scaling Software • Managing Complexity
GLOBAL DATABASE
AVAILABILITY ISSUES
PLAYBOOK
1. Check for ongoing deployments or newly enabled features
2. Check for recently started background jobs
3. DBA oncall, please help!
Dropbox grew from 100 to 500 employees
• Slow queries would adversely impact performance across the board
• More features => Managing more independent MySQL
• Reactively (re)sharding individual databases as they hit capacity
• Impacted developer productivity
SCALABLE METADATA STORE DESIGNED FOR MULTI-TENANCY
2013 to present
SHARDING AND CACHING BEHIND THE SCENES
ENTITIES AND ASSOCIATIONS
FIRST GO SERVICE
Scaling Databases • Scaling as Organization • Scaling Software • Managing Complexity
PERFECT STORM
SHARDING
PHOTO ALBUMS
TEAM ADMIN CONSOLE
REQUEST FANOUT
[Diagram: a request fanning out]
GLOBAL ID
Colocation ID (8 bytes) | Counter (8 bytes)
• Colocation ID: Identifies a shard
• Counter: Unique ID within the shard
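The 16-byte layout above can be sketched with two unsigned 64-bit fields. This is an illustration only: the big-endian byte order and the function names `pack_global_id` and `unpack_global_id` are assumptions, since the slide gives just the field sizes and their order.

```python
import struct

# Two unsigned 64-bit integers, big-endian (byte order is an assumption).
_GLOBAL_ID = struct.Struct(">QQ")


def pack_global_id(colocation_id: int, counter: int) -> bytes:
    """Pack (colocation ID, counter) into a 16-byte global ID."""
    return _GLOBAL_ID.pack(colocation_id, counter)


def unpack_global_id(global_id: bytes) -> tuple:
    """Split a 16-byte global ID back into (colocation ID, counter)."""
    return _GLOBAL_ID.unpack(global_id)


gid = pack_global_id(colocation_id=7, counter=42)
print(len(gid))               # 16
print(unpack_global_id(gid))  # (7, 42)
```

Because the colocation ID is part of the ID itself, the shard can be read straight from the ID, and objects that share a colocation ID land on the same shard.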
Lack of colocation also hurts performance
NEW SERVICE: FILE JOURNAL
[Diagram: File Journal instances between the metaservers and shards shard0, shard1, … shardN, each with a master and replicas]
SHARD FAILURE
[File Journal diagram; highlighted: shard1 master]
SHARDING (PART II)
LONG TIMEOUTS
[File Journal diagram; highlighted: shard1 master]
RUN OUT OF WORKERS
[File Journal diagram; highlighted: shard1 master, File Journal instances]
CASCADING FAILURE
[File Journal diagram; highlighted: shard1 master, File Journal instances, metaservers]
SHARD ISOLATION
Limit resources dedicated to processing a single shard
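One way to limit the resources dedicated to a single shard is a per-shard cap on in-flight requests, so that a slow shard cannot occupy every worker. The sketch below illustrates that idea; `ShardLimiter` is a hypothetical helper, not File Journal's actual mechanism, and failing fast once the cap is reached is an assumed policy.

```python
import threading


class ShardLimiter:
    """Caps the number of in-flight requests per shard (hypothetical helper)."""

    def __init__(self, max_in_flight_per_shard: int):
        self._max = max_in_flight_per_shard
        self._lock = threading.Lock()
        self._in_flight = {}  # shard index -> current in-flight count

    def try_acquire(self, shard: int) -> bool:
        """Reserve a slot for the shard, or return False if it is at its cap."""
        with self._lock:
            current = self._in_flight.get(shard, 0)
            if current >= self._max:
                return False  # fail fast instead of tying up another worker
            self._in_flight[shard] = current + 1
            return True

    def release(self, shard: int) -> None:
        """Free a slot; call only after a successful try_acquire."""
        with self._lock:
            self._in_flight[shard] -= 1
```

With a cap of 2, a third concurrent request to a stuck shard is rejected immediately, while requests to healthy shards are still admitted.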
Scaling Databases • Scaling as Organization • Scaling Software • Managing Complexity
MAGIC POCKET
BLOCK STORAGE SYSTEM
500PB+ user block data
3+ geographic regions
500+ million users
[Diagram: put and get requests against Zone (west), Zone (east) and Zone (central)]
[Diagram: one component labeled "simple ☺" among components labeled "complicated! ☹"]
PYTHON, GO & RUST
https://blogs.dropbox.com/tech/
2016
[Architecture diagram. Labels: Magic Pocket, File Journal, Auth, Cape, Edgestore, Blockservice, Routing service, Search, Riviera, Thumbnail service, Presence & Notifications, meta-client, meta-api, meta-mobile, meta-web, blockserver]