The Future Home of Data?

Michael Franklin
UC Berkeley
VLDB Conf. August 2002

The Pervasive Computing Argument
• Increasingly ubiquitous networking at all scales.
  – ad hoc sensor nets, wireless, global Internet
• Explosion in number, locations, and types of data sources and sinks.
  – mobile devices, P2P networks, data centers
• Emerging software infrastructure to put it all together.
  – pub/sub, XML, web services, …
• As a result…

Prediction
Looking back in ten years, debating the “Future Home of Data” will seem as sensible as debating the “Future Home of Air”.
The relevant issue for our community is instead to determine what our role should be in this new world.

Data Management in a Networked World
• The concept of a “home” for data shows our intellectual bias.
  – This stems from our success to date in solving previous enterprise data management problems.
• Data is a commodity, and like all commodities, its value is realized only when it is moved to where it is needed.
• Thus, we need to apply data management techniques and insights to data that is constantly in motion.

Implications
• Shift emphasis from “storage” to movement.
  – must process data “on-the-fly”.
  – inspiration from and collaboration with networks and systems research.
• Can’t orchestrate processing, must be reactive.
  – static, global (and probably local) planning and optimization won’t work.
  – can be opportunistic (for resources and data).
• Transactional and semantic consistency are huge bottlenecks (they cause blocking).
  – Will be used/expected only in limited and extreme cases.
  – CWA is inappropriate.

Example 1 - Data Stream Processing
• Current “hot” trend in our field.
  – This is not an accident.
• Existing technology can’t cut it:
  – Need event-based push/pull processing.
  – Need continuously-adaptive, shared processing.
  – Need appropriate data model & query lang.
    • Time & Window semantics: input and output
    • Continuously improving answers
    • Notification semantics & thresholds
  – Approximation, satisficing, and QoS
    • Must be driven by user needs and context
    • Adapt to available resources & time constraints
  – Integration & interaction with “pooled” data.
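
To make the "Time & Window semantics" and "Notification semantics & thresholds" bullets of Example 1 concrete, here is a minimal Python sketch of a continuous query over a time-based sliding window. It is an illustration only, not a system described in the talk: the class name `SlidingWindowAverage`, its parameters, and the `notify` callback are all hypothetical, and it assumes tuples arrive in non-decreasing timestamp order with a positive window length.

```python
from collections import deque
from typing import Callable, Deque, Tuple


class SlidingWindowAverage:
    """Hypothetical continuous query: average of values in the last `window_seconds`."""

    def __init__(self, window_seconds: float, threshold: float,
                 notify: Callable[[float, float], None]) -> None:
        self.window_seconds = window_seconds  # assumed > 0
        self.threshold = threshold
        self.notify = notify  # called with (timestamp, average) on a threshold crossing
        self._items: Deque[Tuple[float, float]] = deque()  # (timestamp, value)
        self._total = 0.0

    def push(self, timestamp: float, value: float) -> float:
        """Process one arriving tuple on the fly and return the updated average."""
        self._items.append((timestamp, value))
        self._total += value
        # Evict tuples that have slid out of the time window.
        while self._items[0][0] <= timestamp - self.window_seconds:
            _, old_value = self._items.popleft()
            self._total -= old_value
        average = self._total / len(self._items)
        # Push a notification only when the answer exceeds the threshold.
        if average > self.threshold:
            self.notify(timestamp, average)
        return average


alerts = []
query = SlidingWindowAverage(window_seconds=10.0, threshold=25.0,
                             notify=lambda t, avg: alerts.append((t, avg)))
for ts, reading in [(0.0, 20.0), (5.0, 30.0), (12.0, 40.0)]:
    query.push(ts, reading)
```

Each arriving tuple updates the answer incrementally, so nothing needs to be stored beyond the current window; the consumer is notified reactively instead of polling a stored table.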

Example 2: Sensor Networks
• Tiny (or not so tiny) devices measure the physical world.
  – Berkeley “motes”, Smart Dust, Smart Tags
  – Applications: Transportation, Seismic, Energy, Military…
  – 2-way: can actuate to affect or actively monitor the environment
• Form dynamic ad hoc networks.
• Aggregate and communicate streams of values.
  – Query Proc. & Routing Protocols work together.
• Database insights are crucial here.
  – programmability + semantic optimizations.
  – see Madden et al. OSDI 02

Conclusions
• Data will be everywhere and constantly in motion.
• Need to shift our focus to address the wealth of new issues raised by pervasive connectivity.
• Data management is central to this emerging environment, and database insights are crucial.
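
As a small illustration of the Example 2 bullet "Aggregate and communicate streams of values", the Python sketch below computes an average over a routing tree by merging partial (sum, count) records at each node, so a node forwards one record to its parent instead of every raw reading. This is an assumed, simplified model and not the method of the cited paper: the function name `aggregate_up_tree`, the dictionary-based tree, and the node names are hypothetical.

```python
from typing import Dict, List, Tuple

Partial = Tuple[float, int]  # (sum, count): partial state for an AVG query


def aggregate_up_tree(children: Dict[str, List[str]],
                      readings: Dict[str, float],
                      node: str) -> Partial:
    """Return the (sum, count) partial aggregate for the subtree rooted at `node`."""
    total, count = readings[node], 1  # start with this node's own reading
    for child in children.get(node, []):
        # Each child contributes a single merged record for its whole subtree.
        child_total, child_count = aggregate_up_tree(children, readings, child)
        total += child_total
        count += child_count
    return total, count


# Hypothetical four-node routing tree: root -> a, b and a -> c.
children = {"root": ["a", "b"], "a": ["c"]}
readings = {"root": 10.0, "a": 20.0, "b": 30.0, "c": 40.0}
total, count = aggregate_up_tree(children, readings, "root")
network_average = total / count
```

Because the aggregation follows the same tree used for routing, the amount of data sent per link stays constant regardless of subtree size, which is one sense in which query processing and routing can work together.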