This presentation uses the following non-standard fonts: Lato - - PowerPoint PPT Presentation

This presentation uses the following non-standard fonts: Lato (http://www.latofonts.com/lato-free-fonts/) FontAwesome (https://fortawesome.github.io/Font-Awesome/) You will need to download and install the TTF versions of these fonts for the


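  • Progress Points. Little's Law: W = L / λ, that is, latency = transactions / throughput. The count of in-flight transactions is incremented (transactions++) at the start of a transaction and decremented (transactions--) at the end. Bob wants to minimize response time, so he adds latency progress points.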
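
The Little's Law relation on this slide can be worked through in a few lines of C. This is a minimal sketch, not Coz's implementation: the function names `average_in_flight` and `littles_law_latency` are hypothetical, and the sketch assumes L is estimated by periodically sampling the counter that transactions++ and transactions-- maintain.

```c
#include <assert.h>
#include <stddef.h>

/* Estimate L, the average number of in-flight transactions, from periodic
   samples of the counter maintained by transactions++ / transactions--.
   (Hypothetical helper; the sampling scheme is an assumption.) */
double average_in_flight(const long *samples, size_t n) {
    assert(n > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum += (double)samples[i];
    }
    return sum / (double)n;
}

/* Little's Law: W = L / lambda.
   avg_in_flight is L (transactions), throughput is lambda
   (transactions per second), and the result is W (seconds). */
double littles_law_latency(double avg_in_flight, double throughput) {
    assert(throughput > 0.0);
    return avg_in_flight / throughput;
}
```

For example, an average of 4 in-flight transactions at a throughput of 8 transactions per second gives a latency of 0.5 seconds.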

  • Coz: a Causal Profiler for Linux > coz run --- ./ogle_server deploy

  • Coz Produces Causal Profiles. Let's use it to improve Ogle. [Figure: a causal profile; axis label: Program Speedup.]

  • Using Causal Profiling on Ogle. Ogle found 8,000,000 similar images.

  • Using Causal Profiling on Ogle: dedup (compression), ferret (image comparison)

  • Ferret: image comparison. Pipeline stages: input, segmentation, feature extraction, indexing, ranking, output.

  • Ferret. [Figure: causal profiles for line 320 (ranking), line 358 (indexing), and line 255 (segmentation). x-axis: Line Speedup (0% to 100%); y-axis: Program Speedup (0% to 100%).]

  • Ferret. Pipeline stages: input, segmentation, feature extraction, indexing, ranking, output.

  • Ferret: Probably doesn't need as many threads. Pipeline stages: input, segmentation, feature extraction, indexing, ranking, output.

  • Ferret: 21% Speedup. Pipeline stages: input, segmentation, feature extraction, indexing, ranking, output.

  • What did Causal Profiling predict? Ranking: increased from 16 to 22 threads, a 27% increase in ranking throughput.

  • What did Causal Profiling predict? A 27% increase in ranking throughput. Causal Profiling predicted a 21% improvement, exactly what we observed. [Figure: causal profile for line 320 (ranking). x-axis: Line Speedup (0% to 100%); y-axis: Program Speedup (0% to 100%).]

  • Using Causal Profiling on Ogle: dedup (compression), ferret (image comparison)

  • Dedup: Compression via deduplication

  • Dedup: Compression via deduplication. Example files: grumpycat1.jpg, funisawful.jpg

  • Dedup: Compression via deduplication. i = hash_function( )

  • Dedup: Compression via deduplication. The hash table is accessed concurrently by many threads. The Causal Profiler says the loop that accesses this list is important.

  • Dedup: Compression via deduplication. More hash buckets should lead to fewer collisions.

  • Dedup: Compression via deduplication. More hash buckets should lead to fewer collisions. Result: no performance improvement.

  • Dedup: Compression via deduplication. What else could be causing collisions? i = hash_function( )

  • Dedup: Compression via deduplication. Horrible hash function! [Figure: Bin Utilization, Original Version: 2.3%, with collisions marked. x-axis: Bucket Index (0 to 1000); y-axis: Items per bucket (0 to 150).]

  • Dedup: Compression via deduplication. [Figure: Bin Utilization. Original Version: 2.3%, with collisions marked. After Optimization: 82%, a 9% speedup. x-axis: Bucket Index (0 to 1000); y-axis: Items per bucket (0 to 150).]
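
The bin utilization metric on this slide (the share of hash buckets that hold at least one item) can be measured with a short C sketch. Everything here is an illustrative assumption rather than dedup's code: `weak_hash` is a deliberately poor byte-sum hash, `fnv1a_hash` is the standard 32-bit FNV-1a hash swapped in as a better-behaved comparison, and the keys are synthetic sequential integers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_BUCKETS 1024

/* Deliberately weak hash (illustrative, not dedup's): adding up the bytes
   maps many different keys to the same small range of values. */
uint32_t weak_hash(const unsigned char *key, size_t len) {
    uint32_t h = 0;
    for (size_t i = 0; i < len; i++) {
        h += key[i];
    }
    return h;
}

/* 32-bit FNV-1a, used here only as a better-distributed comparison. */
uint32_t fnv1a_hash(const unsigned char *key, size_t len) {
    uint32_t h = 2166136261u;          /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= key[i];
        h *= 16777619u;                /* FNV prime */
    }
    return h;
}

typedef uint32_t (*hash_fn)(const unsigned char *, size_t);

/* Hash the synthetic keys 0 .. n_keys-1 into NUM_BUCKETS buckets and
   return the fraction of buckets that received at least one key. */
double bin_utilization(hash_fn hash, uint32_t n_keys) {
    unsigned counts[NUM_BUCKETS] = {0};
    assert(hash != NULL);
    for (uint32_t k = 0; k < n_keys; k++) {
        /* Serialize the key byte by byte so the result does not depend
           on the machine's endianness. */
        unsigned char key[4] = {
            (unsigned char)(k & 0xff),
            (unsigned char)((k >> 8) & 0xff),
            (unsigned char)((k >> 16) & 0xff),
            (unsigned char)((k >> 24) & 0xff)
        };
        counts[hash(key, sizeof key) % NUM_BUCKETS]++;
    }
    unsigned used = 0;
    for (size_t b = 0; b < NUM_BUCKETS; b++) {
        if (counts[b] > 0) {
            used++;
        }
    }
    return (double)used / NUM_BUCKETS;
}
```

With 10,000 sequential keys, the byte-sum hash can reach fewer than 300 of the 1,024 buckets, so adding more buckets would not help it; only a change of hash function raises utilization.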

  • Dedup: Compression via deduplication. What did Causal Profiling predict? Blocks per bucket: before 76.7, after 2.09, a 96% traversal speedup. Causal Profiling predicted a 9% speedup, exactly what we observed.

  • Using Causal Profiling on Ogle: dedup (compression), ferret (image comparison)

  • Simple SQL Database
      #if THREAD_SAFE
      config_t global_config = {
        …
        .unlock = pthread_mutex_unlock,
        .getsize = sqlite_usable_size,
        .nextitem = sqlite_pagecache_next,
        …
      };
      #endif

  • Simple SQL Database
      void pthreadMutexLeave(lock* l) {
        global_config.unlock(l);  /* indirect call */
      }
    The indirect call is cheap, but it costs almost as much as pthread_mutex_unlock itself.

  • Simple SQL Database. Coz highlights these lines:
      void pthreadMutexLeave(lock* l) { global_config.unlock(l); }
      void sqlite3MemSize(void* p) { global_config.getsize(p); }
      void pcache1Fetch(item* i) { global_config.nextitem(i); }
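
The difference between an indirect call through a configuration struct and a direct call can be seen in a small, self-contained C sketch. All names here (`demo_lock`, `demo_unlock`, `demo_config`, `leave_indirect`, `leave_direct`) are hypothetical stand-ins, not the database's own types or functions, and the direct-call variant is shown only for contrast; the slide itself does not prescribe a fix.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a lock; not the database's own type. */
typedef struct {
    int held;
} demo_lock;

/* Hypothetical unlock routine standing in for the real one. */
int demo_unlock(demo_lock *l) {
    assert(l != NULL);
    l->held = 0;
    return 0;
}

/* The pattern on the slide: the function to call is stored in a global
   configuration struct and reached through a function pointer. */
typedef struct {
    int (*unlock)(demo_lock *);
} demo_config;

demo_config demo_global_config = { .unlock = demo_unlock };

/* Indirect call: the target is loaded from the struct at run time,
   so the compiler generally cannot inline it. */
void leave_indirect(demo_lock *l) {
    demo_global_config.unlock(l);
}

/* Direct call: the target is fixed at compile time, which lets the
   compiler emit a plain call or inline the callee. */
void leave_direct(demo_lock *l) {
    demo_unlock(l);
}
```

Both wrappers behave identically; only the way the callee is reached differs, which is why the cost of the indirection matters when the callee itself is very cheap.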

  • Simple SQL Database. [Figure: causal profiles for Line 16916, Line 18974, and Line 40345. x-axis: Line Speedup (0% to 50%); y-axis: Program Speedup (−50% to 25%).]