Real-Time Databases • Presented by: Meghan Russ, Miriam Speert, Pete Dempsey, Sedat Behar, Yevgeny Ioffe, Zachi Klopman
Timeline • 1:40 - 1:50: Introduction • 1:50 - 3:00: Real-Time Databases/Scheduling • 3:00 - 3:10: Break • 3:10 - 4:00: Operator Scheduling in Aurora • 4:00 - 4:25: Discussion • 4:25 - 4:30: Comments
Real-Time Databases/Scheduling • General Introduction • Scheduling Policies • Resource Allocation • Properties of Data: consistency and validity • Conclusions
References • http://www.fpa.org/newsletter_info2584/newsletter_info.htm (info on Scud missiles) • http://www.fas.org/spp/starwars/gao/im92026.htm (info on Patriot Missile System)
Imagine this… • We are at war with Iraq • Our soldiers find a potential target • Military intelligence consults a database to determine course of action
Imagine this… • We are at war with Iraq • Air control system constantly monitors hundreds of aircraft and records them in a database • Intelligence systems constantly query the database for potential threats
Suddenly… • Hundreds of missiles are launched • We suspect some are nuclear • Need info which will allow us to determine a course of action • Need this info to make rapid decision • The costs of indecision are catastrophic
What could go wrong? • Limited number of missiles we can intercept • Once they’re launched, we have limited time to react • Our traditional database is slowed by less critical queries • Finally, our queries may not be answered in time due to system load
We need a system that: • Handles time-sensitive queries • Returns only temporally valid data • Supports priority scheduling • Solution: Real-Time Databases!
Real-Time Databases and Streams • Scheduling – Streams: priority based on QoS optimization – Real-Time: priority based on deadlines and user-supplied values • Load Shedding – Streams: dropping tuples from queues – Real-Time: missing deadlines, dropping transactions • Freshness of data: – Streams: not guaranteed – Real-Time: resample
Real-Time Databases • An extension to traditional databases • Motivated by class of applications that require reliable responses • Predictable (not necessarily fast)
Real-Time Database Features • Priority – Classification of transactions – Assigns value to transactions • Deadlines – Transactions specify explicit time requirements – Transaction scheduling takes time requirements into account – Predictability that transactions will complete by deadline or not at all
Transactions and Streams • Transactions are operations on the database that perform combinations of reads/writes in an atomic step – Queries are a subset of transactions • Streams are read-only data (may create new tuples) • Data Consistency
Characteristics of Transactions • Manner in which transactions use data • Nature of time constraints • Significance of executing a transaction by its deadline – consequence of missing specified time constraints
Transaction Classification • Effect of missing transaction deadlines • Value to user is dependent on timeliness: – Soft: have some value after deadline – Firm: have no value after deadline – Hard: have negative value after deadline • Special case: no deadline • Idea for Streams: Queries have periodic deadlines
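The soft/firm/hard distinction can be read as a value function over completion time. The following is a minimal sketch, not taken from the slides: the names `DeadlineKind` and `value_at`, the linear decay for soft deadlines, and the fixed penalty for hard deadlines are all illustrative assumptions.

```python
from enum import Enum


class DeadlineKind(Enum):
    SOFT = "soft"
    FIRM = "firm"
    HARD = "hard"


def value_at(kind, finish, deadline, value=1.0, decay=0.1, penalty=1.0):
    """Value delivered by a transaction that completes at time `finish`."""
    if finish <= deadline:
        # Any transaction that meets its deadline delivers its full value.
        return value
    lateness = finish - deadline
    if kind is DeadlineKind.SOFT:
        # Soft: some value remains after the deadline.
        # Linear decay is an assumption made for this sketch.
        return max(0.0, value - decay * lateness)
    if kind is DeadlineKind.FIRM:
        # Firm: no value after the deadline.
        return 0.0
    # Hard: a late result has negative value (assumed fixed penalty).
    return -penalty
```

A scheduler that sees a firm transaction already past its deadline can drop it, since finishing it adds nothing; a late hard transaction is the case the system must be designed to prevent.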
Scheduling and Streams • Streams: schedule queries in terms of QoS • Real-Time Databases: schedule transactions in terms of scheduling policy
Real-Time Databases/Scheduling • General Introduction • Scheduling Policies • Resource Allocation • Properties of Data: consistency and validity • Conclusions
Scheduling Policies • Earliest deadline first (PMM, PAQRS) • Highest value first • Highest value per unit computation time first • Longest executed transaction first
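Each of the four policies amounts to a different sort key over the ready transactions. The sketch below is illustrative only: the `Txn` fields, the `POLICIES` table, and `pick_next` are hypothetical names, and a real scheduler would also handle ties and preemption.

```python
from dataclasses import dataclass


@dataclass
class Txn:
    name: str
    deadline: float        # absolute deadline
    value: float           # user-supplied value
    cost: float            # estimated computation time (assumed > 0)
    executed: float = 0.0  # time already spent executing


# Each policy maps a transaction to a sort key; the smallest key runs first.
POLICIES = {
    "earliest_deadline_first": lambda t: t.deadline,
    "highest_value_first": lambda t: -t.value,
    "highest_value_per_unit_time_first": lambda t: -t.value / t.cost,
    "longest_executed_first": lambda t: -t.executed,
}


def pick_next(ready, policy):
    """Return the transaction the given policy would run next."""
    return min(ready, key=POLICIES[policy])
```

The same ready set can yield a different choice under each policy, which is why the choice of policy matters under overload.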
PMM • Priority Memory Management • Admission Control – Decide if we run a query. • Memory Allocation – How much memory does each running query get.
Memory Allocation: Two Strategies • Max – Queries get their maximum required memory or no memory at all. • MinMax – High priority queries get their maximum required memory and low priority queries get their minimum.
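The difference between the two strategies shows up when memory is scarce. This is a rough sketch under stated assumptions, not PMM's actual algorithm: queries arrive as `(name, min_mem, max_mem)` tuples already sorted by descending priority, minimums are positive, and the function names are hypothetical.

```python
def allocate_max(queries, total):
    """Max: a query gets its maximum memory or nothing at all."""
    alloc, free = {}, total
    for name, _min_mem, max_mem in queries:
        if max_mem <= free:
            alloc[name] = max_mem
            free -= max_mem
        else:
            alloc[name] = 0
    return alloc


def allocate_minmax(queries, total):
    """MinMax: high-priority queries get their maximum, the rest their minimum."""
    alloc, free = {}, total
    # Pass 1: reserve the minimum for every query that fits.
    for name, min_mem, _max_mem in queries:
        if min_mem <= free:
            alloc[name] = min_mem
            free -= min_mem
        else:
            alloc[name] = 0
    # Pass 2: upgrade admitted queries to their maximum, highest priority first.
    for name, min_mem, max_mem in queries:
        extra = max_mem - min_mem
        if alloc[name] > 0 and extra <= free:
            alloc[name] = max_mem
            free -= extra
    return alloc
```

With three queries that each need between 2 and 6 units and 10 units available, Max runs only the first query, while MinMax runs all three with the first at its maximum.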
Admission Control • Goal: minimize the miss ratio (number of queries that miss their deadline/total queries). • MultiProgramming Level (MPL) = number of queries to run. • Optimize system resource use: optimal MPL.
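The two quantities on this slide can be written down directly. A minimal sketch, assuming the waiting queue is already sorted by priority; `miss_ratio` and `admit` are hypothetical helper names, and finding the optimal MPL is left out.

```python
def miss_ratio(missed, total):
    """Fraction of queries that missed their deadline."""
    return missed / total if total else 0.0


def admit(waiting, running, mpl):
    """Admit waiting queries (highest priority first) up to the MPL."""
    slots = max(0, mpl - len(running))
    return waiting[:slots]
```

An admission controller would adjust `mpl` over time and watch how `miss_ratio` responds.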
Relating MPL to Streams • Real-Time: One time queries • Stream: Continuous Queries • Possibilities for future DSMS: – Using MPL for QoS
Oh no! Missiles are launched again. • We are running two types of queries: – Query1 – Where should CNN’s cameras face to see the missile? – Query2 – Should we shoot the missile down? • Queries of type 2 are obviously more important, but how does the db know? • Consider: Applications for relative query values in stream systems.
PAQRS – extension of PMM • Priority Adaptation Query Resource Scheduling. • PMM only minimizes miss ratio for the entire system. • We would like to be able to specify a ratio between query classes for missed deadlines. • RelMissRatio (Relative Miss Ratio) = {99:1} Query1:Query2.
Why do we care? • Think of the missile example. – Same problems still exist in stream systems. • Potential Stream Additions: – Relative Priority Scheduling. • Not all queries are equal • Another form of QoS – Periodic Query Deadlines. • Deadlines for continuous queries
Bias Control • Puts queries into two groups: – Regular – Queries run with normal priority – Reserve – Queries run with priority lower than regular. • Manages groups on a per query basis – Each class gets RegQuota regular queries. – The rest have to run as reserve queries.
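The quota mechanism can be sketched as a per-class counter. This is an illustration only: it assumes queries are considered in arrival order and that `reg_quota` maps each class name to its RegQuota; `assign_groups` is a hypothetical name.

```python
def assign_groups(queries, reg_quota):
    """Label each query 'regular' or 'reserve' using per-class quotas.

    queries: list of (query_id, class_name) in arrival order (assumed).
    reg_quota: dict mapping class_name to its RegQuota.
    """
    used = {}
    groups = {}
    for query_id, class_name in queries:
        count = used.get(class_name, 0)
        if count < reg_quota.get(class_name, 0):
            groups[query_id] = "regular"
            used[class_name] = count + 1
        else:
            # Quota exhausted: the query runs at lower-than-regular priority.
            groups[query_id] = "reserve"
    return groups
```

Shrinking a class's quota pushes more of its queries into the reserve group, which raises that class's miss ratio relative to the others.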
Relative Weights
Weight should reflect a class’ RelMissRatio.
$$\mathrm{Weight}_i = \frac{1/\mathrm{RelMissRatio}_i}{\sum_j \left(1/\mathrm{RelMissRatio}_j\right)}$$
$$\mathrm{Weight}_{cnn} = \frac{1/99}{1/99 + 1} = 0.01$$
$$\mathrm{Weight}_{mis} = \frac{1}{1/99 + 1} = 0.99$$
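The weight formula is a normalized inverse of each class's RelMissRatio. A small sketch that reproduces the CNN/missile numbers; the function name `relative_weights` and the dictionary keys are illustrative choices.

```python
def relative_weights(rel_miss_ratio):
    """Weight_i = (1 / RelMissRatio_i) / sum_j (1 / RelMissRatio_j)."""
    inverse = {cls: 1.0 / ratio for cls, ratio in rel_miss_ratio.items()}
    total = sum(inverse.values())
    return {cls: inv / total for cls, inv in inverse.items()}


weights = relative_weights({"cnn": 99, "missile": 1})
```

The class that is allowed to miss 99 times as often ends up with a weight of about 0.01, and the weights always sum to 1.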
Bias Control using Relative Weights
$$\mathrm{WeightedMissRatio} = \sum_i \left(\mathrm{Weight}_i \cdot \mathrm{MissRatio}_i\right)$$
All terms are equal when the ratio is correct.
$$\mathrm{WeightedMissRatio}_{ex} = (0.01 \cdot 99x\%) + (0.99 \cdot x\%) = 0.99x\% + 0.99x\%$$
Back to Missiles and CNN • The actual miss ratio is not correct, the miss ratio is 50:50!
$$\mathrm{RegQuota}_i^{new} = \mathrm{RegQuota}_i^{old} \cdot \frac{\mathrm{Weight}_i \cdot \mathrm{MissRatio}_i}{\mathrm{WeightedMissRatio}/\mathrm{NumClasses}}$$
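Missiles and CNN Calculations
$$\mathrm{WeightedMissRatio} = (0.01 \cdot 0.50) + (0.99 \cdot 0.50) = 0.5$$
$$0.005 \neq 0.495$$
$$\mathrm{RegQuota}_{cnn}^{new} = \mathrm{RegQuota}_{cnn}^{old} \cdot \frac{0.01 \cdot 0.50}{0.5/2} = \mathrm{RegQuota}_{cnn}^{old} \cdot 0.02 \quad \text{(98\% less)}$$
$$\mathrm{RegQuota}_{mis}^{new} = \mathrm{RegQuota}_{mis}^{old} \cdot \frac{0.99 \cdot 0.50}{0.5/2} = \mathrm{RegQuota}_{mis}^{old} \cdot 1.98 \quad \text{(98\% more)}$$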
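The quota update can be checked numerically. The sketch below applies the RegQuota formula to the 50:50 case; `update_quotas` is a hypothetical name, the starting quota of 100 per class is an arbitrary assumption, and the guard for a zero weighted miss ratio is an addition not discussed on the slides.

```python
def update_quotas(quota, weight, miss):
    """Scale each class's RegQuota by its share of the weighted miss ratio."""
    classes = list(quota)
    weighted = sum(weight[c] * miss[c] for c in classes)
    if weighted == 0:
        # No class missed any deadline: leave quotas unchanged (assumption).
        return dict(quota)
    target = weighted / len(classes)
    return {c: quota[c] * (weight[c] * miss[c]) / target for c in classes}


new_quota = update_quotas(
    quota={"cnn": 100.0, "missile": 100.0},
    weight={"cnn": 0.01, "missile": 0.99},
    miss={"cnn": 0.50, "missile": 0.50},
)
```

When the weighted terms are already equal, every scale factor is 1 and the quotas stay where they are, which is the fixed point the feedback loop aims for.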
Does it really work?
Real-Time Databases/Scheduling • General Introduction • Scheduling Policies • Resource Allocation • Properties of Data: consistency & validity • Conclusions
Essence of Real Time • Although adaptive systems give better throughput, IT DOESN’T MATTER! • RT is about dependability, not throughput. 1% miss rate is (usually) unacceptable. • Throughput can be handled (usually) with extra hardware (i.e. money). Dependability needs a special design.
Resources in Databases • Physical – CPU(s) – Memory (Cache, Work Area) – I/O Bandwidth (Disks & Storage, Network for Distributed Processing) • Logical – Locks – Time...
Cost of a Transaction • Waiting for locks to release • Work memory needed (e.g. O(n) for in-memory hash join, O(sqrt(n)) for disk-assisted) • I/O amount (e.g. worst case join: multiplication) • CPU needed to process • Cost of aborting a transaction (negligible for queries) If success cannot be guaranteed, don't start!
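The rule "if success cannot be guaranteed, don't start" can be phrased as a feasibility test over worst-case cost estimates. A minimal sketch: `can_guarantee` and its parameters are hypothetical, and obtaining trustworthy worst-case bounds for lock waits, I/O, and CPU is the hard part that this code assumes away.

```python
def can_guarantee(now, deadline, lock_wait, io_time, cpu_time,
                  mem_needed, mem_free):
    """Return True only if worst-case estimates fit before the deadline."""
    if mem_needed > mem_free:
        # Not enough work memory: do not start.
        return False
    worst_case_finish = now + lock_wait + io_time + cpu_time
    return worst_case_finish <= deadline
```

A transaction that fails this test is rejected up front, which avoids spending resources on work that would be aborted later.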
Physical Resources – Now and Then
Resource                 1995 (Paper)    2003 (opt. RAID)
CPU Speed (MIPS)         40              2 X 2500
Memory Buffers (MB)      20              800
I/O Bandwidth (MB/s)     10              100
Disk Size (GB)           1               120 (1TB)
Disk Cache (kB)          256             8192 (1GB)
# Disks                  10              4 (16)
Disk Latency (ms)        16.7            6
Latency is Forever…
Memory Allocation Strategies (I) • Max – all memory needed or nothing (don't admit) • MinMax – all memory needed for high-priority – min memory needed for low-priority • M&M – feedback-based allocation – adaptive – small amount of memory set aside for small transactions
Memory Allocation Strategies (II) • Multiclass Dependent – Small get all the memory they need – Large get a minimum amount – Medium get an amount that depends on the load level • Classes are: – Small – need less than 10% of memory – Large – need more than all of memory – Medium – everything in between.
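The class boundaries translate into a simple threshold check. A sketch, assuming "memory" on the slide means the total buffer memory available to queries; `classify` is a hypothetical name.

```python
def classify(mem_needed, total_memory):
    """Classify a transaction by its memory demand relative to total memory."""
    if mem_needed < 0.10 * total_memory:
        return "small"
    if mem_needed > total_memory:
        return "large"
    return "medium"
```

The class then selects the allocation rule: small transactions get everything they ask for, large ones a minimum, and medium ones an amount that varies with load.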
Allocating Memory (figure: memory partitioned among transactions labeled S, M, L, M, S, L, S)