Scuba: Diving into Data at Facebook
Presenter: Lavanya Subramanian


Need for Data Analysis
• Performance monitoring
  – Detect unexpected performance drops/rises
• Pattern mining
  – Understand user response to new features
• Ad revenue monitoring
  – Identify regional drops/rises in ad clicks and revenue

Data Analysis at Facebook
• Large data volumes
• Real-time analysis of this data
• Key requirements
  – Low latency
  – Flexibility
  – Scalability

Proposed Solution: Scuba
• Structure
  – In-memory database
  – Across hundreds of servers
• How does it work?
  – Holds and processes sampled real-time data
  – Query interface to access data
  – Visualization interface to analyze data

Architecture
[Figure: server, leaf nodes]

Data Layout
• Data stored in tables
• Data types supported
  – Integers, strings, sets of strings, vectors of strings
• Different compression for different data types

Table Characteristics
• Table is created upon data arrival at a leaf node
• Table can have empty columns; treated as null

Data Ingestion into Scuba
[Figure: Scribe, leaf nodes]

Data Ingestion into Scuba
• Events are sampled to reduce the data volume
• Use Scribe, a distributed messaging system, to
  – Collect, aggregate and deliver data to Scuba
• For each batch of incoming data
  – Pick two leaf nodes at random
  – Send the batch to the node with more free memory
• Data compressed and sent to disk
• Data then read back and stored in memory
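The batch placement rule above (pick two leaf nodes at random, send the batch to the one with more free memory) can be sketched as follows. This is a minimal illustration, not Scuba's actual code: the `Leaf` class, the `free_memory` field and the fixed `bytes_per_row` cost are all assumptions made for the example.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Leaf:
    # Hypothetical model of a leaf node; field names are assumptions.
    name: str
    free_memory: int                 # bytes still available on this leaf
    rows: list = field(default_factory=list)


def pick_leaf(leaves, rng=random):
    """Sample two distinct leaves and return the one with more free memory."""
    a, b = rng.sample(leaves, 2)
    return a if a.free_memory >= b.free_memory else b


def ingest_batch(leaves, batch, bytes_per_row=100, rng=random):
    """Place one batch of rows on the chosen leaf and charge its memory."""
    leaf = pick_leaf(leaves, rng)
    leaf.rows.extend(batch)
    leaf.free_memory -= bytes_per_row * len(batch)
    return leaf


if __name__ == "__main__":
    leaves = [Leaf(f"leaf{i}", free_memory=1_000_000) for i in range(8)]
    for _ in range(1000):
        ingest_batch(leaves, batch=[{"event": "click"}] * 10)
    # Free memory should end up roughly balanced across leaves.
    print(sorted(leaf.free_memory for leaf in leaves))
```

Comparing only two randomly chosen nodes avoids polling every leaf for its memory state while still steering batches away from the fullest nodes.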

Dealing with Old Data
• Memory capacity is a concern
• Need to add new servers every 2-3 weeks
• Delete data based on
  – Age: Sample and preserve a fraction of old data
  – Space: When exceeding space limits, delete old data
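The two deletion criteria above can be sketched as a single pruning pass. This is an assumed formulation for illustration only: the row layout `(timestamp, payload)` and the parameters `max_age`, `keep_fraction` and `max_rows` are hypothetical, and the slides do not say how Scuba chooses these values.

```python
import random


def prune(rows, now, max_age, keep_fraction, max_rows, rng=random):
    """Prune a table's rows by age, then by space.

    rows: list of (timestamp, payload) tuples (hypothetical layout).
    Age rule: rows older than max_age survive only with probability
    keep_fraction, so a sampled fraction of old data is preserved.
    Space rule: if more than max_rows remain, the oldest rows are dropped.
    """
    kept = [
        (ts, payload)
        for ts, payload in rows
        if now - ts <= max_age or rng.random() < keep_fraction
    ]
    kept.sort(key=lambda row: row[0])          # oldest first
    if len(kept) > max_rows:
        kept = kept[len(kept) - max_rows:]     # keep only the newest rows
    return kept


if __name__ == "__main__":
    table = [(t, f"event-{t}") for t in range(100)]
    pruned = prune(table, now=100, max_age=20, keep_fraction=0.1, max_rows=25)
    print(len(pruned), pruned[0], pruned[-1])
```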

Querying Scuba
• Three kinds of interfaces
  – Web-based
  – SQL
  – API to support querying from application code
• Queries supported
  – Different forms of aggregation
  – Percentiles, histograms
• Joins not supported by Scuba
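One way histogram and percentile queries can be answered across many nodes is to have each node build a fixed-width histogram and merge the bucket counts. The slides do not describe how Scuba computes these aggregates, so the sketch below is only an assumed approach; `leaf_histogram`, `merge` and `approx_percentile` are hypothetical names.

```python
from collections import Counter


def leaf_histogram(values, bucket_width):
    """Count values per bucket, keyed by the bucket's lower edge."""
    return Counter((v // bucket_width) * bucket_width for v in values)


def merge(histograms):
    """Merge per-leaf histograms by adding bucket counts."""
    total = Counter()
    for hist in histograms:
        total.update(hist)
    return total


def approx_percentile(hist, p):
    """Lower edge of the bucket that contains the p-th percentile."""
    n = sum(hist.values())
    if n == 0:
        return None
    target = p * n / 100
    seen = 0
    for bucket in sorted(hist):
        seen += hist[bucket]
        if seen >= target:
            return bucket
    return max(hist)


if __name__ == "__main__":
    leaf_a = leaf_histogram([5, 12, 18, 25], bucket_width=10)
    leaf_b = leaf_histogram([31, 33, 47, 58, 95, 99], bucket_width=10)
    merged = merge([leaf_a, leaf_b])
    print(sorted(merged.items()))
    print(approx_percentile(merged, 50), approx_percentile(merged, 90))
```

The answer is accurate only to the bucket width, which trades precision for a result that is cheap to merge.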

Query Execution
[Figure: root aggregator, intermediate aggregators, leaf aggregators, leaf nodes]

Query Execution
• A leaf node may or may not contain a table’s data
  – Depends on the table size and age
• Data scanning is usually by time range
  – Time is Scuba’s only notion of index
• Results from a node are omitted if it does not respond before a timeout
  – Small missing pieces of data do not affect the accuracy of computations much
  – Low response time is the more important requirement
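The behaviour described above (scan by time range, merge partial results, omit nodes that miss the timeout) can be sketched with simulated latencies. This is an illustrative model under stated assumptions, not Scuba's implementation: `LeafNode`, its `latency_s` field and `run_query` are hypothetical, and the aggregator levels are flattened into one loop for brevity.

```python
from dataclasses import dataclass


@dataclass
class LeafNode:
    rows: list          # (timestamp, value) tuples, hypothetical layout
    latency_s: float    # simulated time this leaf takes to respond

    def scan(self, t_start, t_end):
        """Scan rows in [t_start, t_end) and return a partial (count, sum)."""
        values = [v for ts, v in self.rows if t_start <= ts < t_end]
        return len(values), sum(values)


def run_query(leaves, t_start, t_end, timeout_s):
    """Merge partial results, skipping leaves slower than the timeout."""
    count = total = answered = 0
    for leaf in leaves:
        if leaf.latency_s > timeout_s:
            continue                      # result omitted, query still returns
        c, s = leaf.scan(t_start, t_end)
        count += c
        total += s
        answered += 1
    return {
        "count": count,
        "mean": total / count if count else None,
        "leaves_answered": answered,
        "leaves_total": len(leaves),
    }


if __name__ == "__main__":
    leaves = [
        LeafNode([(1, 10), (2, 20)], latency_s=0.01),
        LeafNode([(1, 30), (5, 40)], latency_s=0.02),
        LeafNode([(1, 1000)], latency_s=5.0),   # too slow, will be omitted
    ]
    print(run_query(leaves, t_start=0, t_end=3, timeout_s=1.0))
```

Returning `leaves_answered` alongside the aggregate lets the caller judge how much data the answer is based on.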

Performance Model
• Breaks down the latencies of different components
• Function of fanout, processing time at each aggregator, depth of tree
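The slide names the inputs of the model (fanout, per-aggregator processing time, tree depth) but not its formula. The sketch below uses one plausible form purely as an assumption: latency = leaf scan time + depth × (network hop + fanout × merge time per child). The function and parameter names are hypothetical.

```python
def tree_depth(num_leaves, fanout):
    """Number of aggregator levels needed to cover num_leaves leaves."""
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    depth, capacity = 0, 1
    while capacity < num_leaves:
        capacity *= fanout
        depth += 1
    return depth


def estimated_latency(num_leaves, fanout, leaf_scan_s, merge_per_child_s, hop_s):
    """Assumed model: every level adds one hop plus merging `fanout` children."""
    depth = tree_depth(num_leaves, fanout)
    return leaf_scan_s + depth * (hop_s + fanout * merge_per_child_s)


if __name__ == "__main__":
    # 160 leaves, one per machine in a 4 x 40 machine setup (an assumption).
    for fanout in (2, 5, 20):
        latency = estimated_latency(160, fanout, leaf_scan_s=0.1,
                                    merge_per_child_s=0.001, hop_s=0.002)
        print(fanout, tree_depth(160, fanout), round(latency, 4))
```

Under this assumed form, a larger fanout gives a shallower tree but more merge work per aggregator, which is the trade-off such a model is meant to expose.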

Experimental Setup and Queries
• 4 racks of 40 machines
• Machine configuration
  – Intel Xeon E5-2660
  – 2.2 GHz
  – 144 GB DRAM memory
• 10G Ethernet
• Scan query, time series query

Speedup and Scaleup
[Figure]

Throughput
[Figure]

Discussion
• Details on the kind of data stored and analyzed
• Performance numbers for a wider set of queries
• Are these query throughputs good enough?
  – Might be fine for an internal system
