Real-Time Databases Meghan Russ Miriam Speert Pete Dempsey Sedat Behar Yevgeny Ioffe Zachi Klopman

Timeline 1:40 - 1:50: Introduction 1:50 - 3:00: Real-Time Databases/Scheduling 3:00 - 3:10: Break 3:10 - 4:00: Operator Scheduling in Aurora 4:00 - 4:25: Discussion 4:25 - 4:30: Comments

Real-Time Databases/Scheduling • General Introduction • Scheduling Policies • Resource Allocation • Properties of Data: consistency and validity • Conclusions

References • http://www.fpa.org/newsletter_info2584/newsletter_info.htm (info on Scud missiles) • http://www.fas.org/spp/starwars/gao/im92026.htm (info on Patriot Missile System)

Imagine this… • We are at war with Iraq • Our soldiers find a potential target • Military intelligence consults a database to determine course of action

Imagine this… • We are at war with Iraq • Air control system constantly monitors hundreds of aircraft and records them in a database • Intelligence systems constantly query the database for potential threats

Suddenly… • Hundreds of missiles are launched • We suspect some are nuclear • Need info which will allow us to determine a course of action • Need this info to make rapid decision • The costs of indecision are catastrophic

What could go wrong? • Limited number of missiles we can intercept • Once they’re launched, we have limited time to react • Our traditional database is slowed by less critical queries • Finally, our queries may not be answered in time due to system load

We need a system that: • Handles time-sensitive queries • Returns only temporally valid data • Supports priority scheduling • Solution: Real-Time Databases!

Real-Time Databases and Streams • Scheduling – Streams: priority based on QoS optimization – Real-Time: priority based on deadlines and user-supplied values • Load Shedding – Streams: dropping tuples from queues – Real-Time: missing deadlines, dropping transactions • Freshness of data: – Streams: not guaranteed – Real-Time: resample

Real-Time Databases • An extension to traditional databases • Motivated by class of applications that require reliable responses • Predictable (not necessarily fast)

Real-Time Database Features • Priority – Classification of transactions – Assigns value to transactions • Deadlines – Transactions specify explicit time requirements – Transaction scheduling takes time requirements into account – Predictability that transactions will complete by deadline or not at all

Transactions and Streams • A transaction is an operation on the database that performs a combination of reads/writes in one atomic step – Queries are a subset of transactions • Streams are read-only data (may create new tuples) • Data Consistency

Characteristics of Transactions • Manner in which transactions use data • Nature of time constraints • Significance of executing a transaction by its deadline – consequence of missing specified time constraints

Transaction Classification • Effect of missing transaction deadlines • Value to user is dependent on timeliness: – Soft: have some value after deadline – Firm: have no value after deadline – Hard: have negative value after deadline • Special case: no deadline • Idea for Streams: Queries have periodic deadlines
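
The soft/firm/hard distinction above can be read as a value function over completion time. The following is a minimal sketch under assumptions not stated in the slides: the function name, the linear decay used for soft deadlines, and the fixed penalty used for hard deadlines are all illustrative choices.

```python
def transaction_value(kind, finish_time, deadline,
                      full_value=1.0, decay=0.5, penalty=1.0):
    """Value a transaction delivers when it completes at finish_time.

    kind is "soft", "firm", or "hard". decay and penalty are
    hypothetical parameters chosen only for illustration.
    """
    if finish_time <= deadline:
        return full_value              # on time: full value for every class
    lateness = finish_time - deadline
    if kind == "soft":
        # Assumed shape: value decays linearly after the deadline, never below zero.
        return max(0.0, full_value - decay * lateness)
    if kind == "firm":
        return 0.0                     # no value after the deadline
    if kind == "hard":
        return -penalty                # negative value after the deadline
    raise ValueError(f"unknown transaction kind: {kind!r}")
```

A scheduler could use such a function to decide which late transactions are still worth finishing: a late firm transaction can be dropped at no further cost, while a late hard transaction has already done damage.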

Scheduling and Streams • Streams: schedules queries in terms of QoS • Real-Time Databases: schedule transactions in terms of scheduling policy

Real-Time Databases/Scheduling • General Introduction • Scheduling Policies • Resource Allocation • Properties of Data: consistency and validity • Conclusions

Scheduling Policies • Earliest deadline first (PMM, PAQRS) • Highest value first • Highest value per unit computation time first • Longest executed transaction first
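
Each of the four policies above amounts to a different ordering of the ready queue. A minimal sketch, assuming a hypothetical `Txn` record whose fields (`deadline`, `value`, `cost`, `executed`) stand in for whatever the scheduler actually tracks:

```python
from dataclasses import dataclass


@dataclass
class Txn:
    name: str
    deadline: float   # absolute deadline
    value: float      # user-assigned value
    cost: float       # estimated computation time (must be > 0)
    executed: float   # time already spent executing


# Each policy is a sort key; the transaction with the smallest key runs next.
POLICIES = {
    "earliest_deadline_first": lambda t: t.deadline,
    "highest_value_first": lambda t: -t.value,
    "highest_value_per_unit_time_first": lambda t: -t.value / t.cost,
    "longest_executed_first": lambda t: -t.executed,
}


def pick_next(ready, policy):
    """Return the transaction the given policy would schedule next."""
    return min(ready, key=POLICIES[policy])
```

The same ready queue can yield a different winner under each policy, which is why the choice of policy matters under overload.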

PMM • Priority Memory Management • Admission Control – Decide whether to run a query. • Memory Allocation – Decide how much memory each running query gets.

Memory Allocation: Two Strategies • Max – Queries get their maximum required memory or no memory at all. • MinMax – High priority queries get their maximum required memory and low priority queries get their minimum.
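
The two strategies can be contrasted with a small sketch. The details are assumptions made for illustration, not the PMM algorithm itself: queries arrive as `(name, min_mem, max_mem)` tuples already sorted from highest to lowest priority, a query that does not fit is skipped, and MinMax tops up to the maximum on an all-or-nothing basis.

```python
def allocate_max(queries, total):
    """Max: a query gets its maximum memory or nothing at all."""
    alloc, free = {}, total
    for name, _min_mem, max_mem in queries:
        if max_mem <= free:
            alloc[name] = max_mem
            free -= max_mem
        else:
            alloc[name] = 0            # not admitted
    return alloc


def allocate_minmax(queries, total):
    """MinMax: everyone admitted gets the minimum; high priority is topped up."""
    alloc, free = {}, total
    # Pass 1: give each query its minimum, in priority order, while it fits.
    for name, min_mem, _max_mem in queries:
        if min_mem <= free:
            alloc[name] = min_mem
            free -= min_mem
        else:
            alloc[name] = 0
    # Pass 2: raise admitted queries to their maximum, highest priority first.
    for name, _min_mem, max_mem in queries:
        extra = max_mem - alloc[name]
        if alloc[name] > 0 and extra <= free:
            alloc[name] = max_mem
            free -= extra
    return alloc
```

With three identical queries needing 2 to 6 units and 10 units of memory, Max runs one query while MinMax runs all three, one of them at full memory.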

Admission Control • Goal: minimize the miss ratio (number of queries that miss their deadline/total queries). • MultiProgramming Level (MPL) = number of queries to run. • Optimize system resource use: optimal MPL.

Relating MPL to Streams • Real-Time: One time queries • Stream: Continuous Queries • Possibilities for future DSMS: – Using MPL for QoS

Oh no! Missiles are launched again. • We are running two types of queries: – Query1 – Where should CNN’s cameras face to see the missile? – Query2 – Should we shoot the missile down? • Queries of type 2 are obviously more important, but how does the db know? • Consider: Applications for relative query values in stream systems.

PAQRS – extension of PMM • Priority Adaptation Query Resource Scheduling. • PMM only minimizes miss ratio for the entire system. • We would like to be able to specify a ratio between query classes for missed deadlines. • RelMissRatio (Relative Miss Ratio) = {99:1} Query1:Query2.

Why do we care? • Think of the missile example. – Same problems still exist in stream systems. • Potential Stream Additions: – Relative Priority Scheduling. • Not all queries are equal • Another form of QoS – Periodic Query Deadlines. • Deadlines for continuous queries

Bias Control • Puts queries into two groups: – Regular – Queries run with normal priority – Reserve – Queries run with priority lower than regular. • Manages groups on a per query basis – Each class gets RegQuota regular queries. – The rest have to run as reserve queries.

Relative Weights
Weight should reflect a class’ RelMissRatio.
Weight_i = (1/RelMissRatio_i) / Σ_j (1/RelMissRatio_j)
Weight_cnn = (1/99) / (1/99 + 1) = .01
Weight_mis = (1) / (1/99 + 1) = .99
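
The weight formula can be checked numerically. This is a minimal sketch; the function name and the dictionary representation of the classes are assumptions made for illustration.

```python
def relative_weights(rel_miss_ratio):
    """Map each class to (1/RelMissRatio_i) / sum_j (1/RelMissRatio_j)."""
    inverse = {cls: 1.0 / ratio for cls, ratio in rel_miss_ratio.items()}
    total = sum(inverse.values())
    return {cls: inv / total for cls, inv in inverse.items()}


# The 99:1 example: CNN queries may miss 99 times as often as missile queries.
weights = relative_weights({"cnn": 99, "mis": 1})
```

The class that is allowed to miss more often ends up with the smaller weight, and the weights always sum to 1.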

Bias Control using Relative Weights
WeightedMissRatio = Σ_i (Weight_i * MissRatio_i)
All terms are equal when the ratio is correct.
Example, with miss ratios of 99x% (cnn) and x% (mis):
WeightedMissRatio = (.01 * 99x%) + (.99 * x%) = .99x% + .99x%

Back to Missiles and CNN
• The actual miss ratio is not correct, the miss ratio is 50:50!
• RegQuota_i^new = RegQuota_i^old * (Weight_i * MissRatio_i) / (WeightedMissRatio / NumClasses)

Missiles and CNN Calculations
WeightedMissRatio = (.01 * .50) + (.99 * .50) = .5
The terms are not equal: .005 ≠ .495
RegQuota_cnn^new = RegQuota_cnn^old * (.01 * .50) / (.5 / 2) = RegQuota_cnn^old * .02 (98% less)
RegQuota_mis^new = RegQuota_mis^old * (.99 * .50) / (.5 / 2) = RegQuota_mis^old * 1.98 (98% more)
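
The quota update can be reproduced in a few lines. This is a sketch of the formula as written on the slides, not of the full PAQRS controller; the function name, the dictionary layout, and the handling of the no-misses case are assumptions.

```python
def update_reg_quota(reg_quota, weight, miss_ratio):
    """Scale each class's RegQuota by its weighted miss term over the per-class average."""
    weighted_miss_ratio = sum(weight[c] * miss_ratio[c] for c in reg_quota)
    if weighted_miss_ratio == 0:
        # Assumption: with no misses at all there is nothing to rebalance.
        return dict(reg_quota)
    target = weighted_miss_ratio / len(reg_quota)   # WeightedMissRatio / NumClasses
    return {c: reg_quota[c] * (weight[c] * miss_ratio[c]) / target
            for c in reg_quota}


# The 50:50 example: both classes currently miss half of their deadlines.
new_quota = update_reg_quota(
    reg_quota={"cnn": 100, "mis": 100},
    weight={"cnn": 0.01, "mis": 0.99},
    miss_ratio={"cnn": 0.50, "mis": 0.50},
)
```

When the observed miss ratios already match the target 99:1 proportion, every weighted term equals the per-class average and the quotas are left unchanged.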

Does it really work?

Real-Time Databases/Scheduling • General Introduction • Scheduling Policies • Resource Allocation • Properties of Data: consistency & validity • Conclusions

Essence of Real Time • Although adaptive systems give better throughput, IT DOESN’T MATTER! • RT is about dependability, not throughput. A 1% miss rate is (usually) unacceptable. • Throughput can be handled (usually) with extra hardware (i.e. money). Dependability needs a special design.

Resources in Databases
• Physical
  – CPU(s)
  – Memory
    • Cache
    • Work Area
  – I/O Bandwidth
    • Disks & Storage
    • Network for Distributed Processing
• Logical
  – Locks
  – Time...

Cost of a Transaction • Waiting for locks to release • Work memory needed (e.g. O(n) for in-memory hash join, O(sqrt(n)) for disk-assisted) • I/O amount (e.g. worst case join: multiplication) • CPU needed to process • Cost of aborting a transaction (negligible for queries) If success cannot be guaranteed, don't start!

Physical Resources – Now and Then
                        1995 (Paper)   2003 (opt. RAID)
CPU Speed (MIPS)        40             2 X 2500
Memory Buffers (MB)     20             800
I/O Bandwidth (MB/s)    10             100
Disk Size (GB)          1              120 (1TB)
Disk Cache (kB)         256            8192 (1GB)
# Disks                 10             4 (16)
Disk Latency (ms)       16.7           6
Latency is Forever…

Memory Allocation Strategies (I) • Max – all memory needed or nothing (don't admit) • MinMax – all memory needed for high-priority – min memory needed for low-priority • M&M – feedback-based allocation – adaptive – small amount of memory set aside for small transactions

Memory Allocation Strategies (II) • Multiclass Dependent – Small get all the memory they need – Large get a minimum amount – Medium get according to level load • Classes are: – Small – less than 10% of memory – Large – more than memory – Medium – between them.
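
The class boundaries above translate directly into a small classifier. A minimal sketch, assuming the comparison is made on a transaction's memory requirement against total system memory (the function and parameter names are illustrative):

```python
def classify(memory_needed, total_memory):
    """Classify a transaction as small, medium, or large by memory demand."""
    if memory_needed < 0.10 * total_memory:
        return "small"     # gets all the memory it needs
    if memory_needed > total_memory:
        return "large"     # gets only a minimum amount
    return "medium"        # gets memory according to the load level
```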

Allocating Memory [Figure: S M L M S L S]
