9/25/2009

What Goes Around Comes Around
Michael Stonebraker, Joseph M. Hellerstein
Slides based on slides originally by Garth Shoemaker

Administrative Notes
- HW1 due now
- A few comments about the paper response sample grades
  - Most 2's needed more discussion (though a few were a tad too short on summary)
  - 2.5 – really close
- Don't turn in late

Goals of the day:
- To cover the first paper
- To give an idea about how I would suggest presenting/leading discussion
- I'll be wearing at least three hats:
  - Presenter
  - Discusser
  - Me

Presenter Hat: Summary
- 9 epochs in database research:
  - We are repeating old ideas.
  - We are failing to learn from old mistakes.
- We'll cover most of the epochs and lessons

Hierarchical (IMS) (late 60s-70s)
- Pros:
  - Uses simple data manipulation language (DL/I)
- Cons:
  - Information is repeated
  - Existence depends on parents
  - No physical data independence (can't tune physical level without tuning app)
  - Not much logical data independence either (can't tune schema without changing app (think views))

Lesson 1. Physical and logical data independence are highly desirable
- IMS (hierarchical) was particularly bad at this
  - Done to avoid very bad performance
- This is like the example we saw last week
- You can't tune an application and guarantee that the DL/I program can run
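A minimal sketch of the two cons listed on the Hierarchical (IMS) slide, repeated information and existence depending on a parent. The supplier/part hierarchy, the field names, and the helper functions are all hypothetical and chosen only for illustration; this is not IMS or DL/I code.

```python
# Hypothetical hierarchy: each supplier record owns its child part records.
# A part sold by two suppliers has to be stored once under each parent.
suppliers = {
    "S1": {"name": "Acme", "parts": [{"pno": "P1", "pname": "bolt", "color": "red"}]},
    "S2": {"name": "Apex", "parts": [{"pno": "P1", "pname": "bolt", "color": "red"}]},
}


def copies_of_part(db, pno):
    """Count how many times a part's description is physically stored."""
    return sum(1 for s in db.values() for p in s["parts"] if p["pno"] == pno)


def part_exists(db, pno):
    """A part is reachable only through some parent supplier."""
    return copies_of_part(db, pno) > 0


# Information is repeated: the same part description lives under both parents.
repeated = copies_of_part(suppliers, "P1")

# Existence depends on parents: removing the suppliers loses the part too.
del suppliers["S1"]
del suppliers["S2"]
orphan_survives = part_exists(suppliers, "P1")
```

Here `repeated` counts the duplicate copies of part P1, and `orphan_survives` shows that nothing about P1 remains once its parents are gone.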
Lesson 2. Tree structured data models are very restrictive
- Information is repeated
  - You have to have a single parent, so sometimes you have to duplicate
- Existence depends on parents
  - What do you do if there is no parent value?

Lesson 3. It's a challenge to provide sophisticated logical reorganizations of tree structured data
- IMS allowed 2 tree-structured databases to be combined
- Handy thing to do, but…
  - Created a separate "view", and views were handled differently for users (a real pain)
  - Mapping the view to other databases was very, very challenging

Directed Graph (CODASYL) (70s)
- Pros:
  - Yeah! Graphs, not trees!
  - Can model many-to-many relationships
- Cons:
  - Still no physical data independence
  - Much more complex than IMS

Lesson 6: Loading and recovering directed graphs is more complex than hierarchies
- Independence:
  - In IMS, each database could be independently loaded from a source
  - In CODASYL, it's all connected, so everything had to be loaded at once
- Need to think carefully about disk seeks (no general loading utility)

Relational (70s-early 80s)
- The proposal in a nutshell:
  - Store the data in a simple data structure
  - Access through a high level set-at-a-time DML
  - No need for a physical storage proposal
- Lots of good arguing by various sides: "the great debate"

Discussion
- Do you think structuring your data as a graph instead of a tree is inherently too complicated, or does this seem like an implementation issue?
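A rough sketch of what "set-at-a-time DML" on the Relational slide buys over record-at-a-time access, using Python's built-in `sqlite3` module. The `supply` table, its columns, and both helper functions are assumptions made up for this example; the record-at-a-time version only imitates navigational access by looping over rows in application code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE supply (sno TEXT, pno TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO supply VALUES (?, ?, ?)",
    [("S1", "P1", 100), ("S1", "P2", 50), ("S2", "P1", 300)],
)


def suppliers_of_navigational(conn, pno):
    """Record-at-a-time style: the program walks rows and filters by hand."""
    found = []
    for sno, row_pno, _qty in conn.execute("SELECT sno, pno, qty FROM supply"):
        if row_pno == pno:
            found.append(sno)
    return sorted(found)


def suppliers_of_declarative(conn, pno):
    """Set-at-a-time style: say what is wanted; the DBMS picks the access path."""
    rows = conn.execute(
        "SELECT sno FROM supply WHERE pno = ? ORDER BY sno", (pno,)
    )
    return [sno for (sno,) in rows]


nav = suppliers_of_navigational(conn, "P1")
decl = suppliers_of_declarative(conn, "P1")

# Adding an index changes only the physical level; the query text is untouched.
conn.execute("CREATE INDEX supply_pno ON supply (pno)")
decl_after_index = suppliers_of_declarative(conn, "P1")
```

Both functions return the same suppliers, but only the declarative one leaves the choice of access path to the system, which is the physical data independence that the earlier models lacked.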
Lesson 9: Technical debates are usually settled by the elephants of the marketplace, and often for reasons not related to technology
- What really brought down IMS?
  - IBM had both IMS and DB/2
  - IBM put DB/2 on VAX, but IMS on mainframes
  - Mainframes had most of the DB market
  - They tried to implement DB/2 on top of IMS and failed (complexity of IMS)
  - Releasing DB/2 and IMS for mainframes
  - Curtains for IMS

Lesson 10: query optimizers can beat all but the best record at a time DBMS application programmers
- Surprising at the time, but true
- Like playing chess – the computer can think of many more options than a human, even if not all
- Also similar to compilers

Entity-Relationship (70s)
- Response to normalization
- Standard wisdom: create table, then normalize. Problems for DBAs:
  1. Where do I get initial tables
  2. Can't understand functional dependencies
- Lesson 11: Functional dependencies are too difficult for mere mortals to understand. Another reason for KISS

Extended Relational (80s)
- How many features can relational databases have…
  - Set valued attributes
  - Aggregation
  - Generalization
  - And many, many more
- Lesson 12: unless there is a big performance or functionality advantage, new constructs will go nowhere

Semantic (late 70's and 80's) (SDM)
- Similar ideas, but more radical; change whole model to be semantically richer.
- Lots of machinery, little benefit. Died without a trace.

Discussion
- The last two epochs didn't make much lasting impact. Were they worth doing? Why or why not?
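For readers unsure what the functional dependencies in Lesson 11 are, here is a small sketch that checks whether a dependency X -> Y holds in a given set of rows. The `holds` helper and the employee data are hypothetical, invented for this example. The check only reports whether the sample data violates the dependency; it cannot prove that the dependency is a rule of the schema.

```python
def holds(rows, lhs, rhs):
    """Return True if the dependency lhs -> rhs is not violated by rows.

    rows is a list of dicts; lhs and rhs are lists of column names.
    The dependency fails when two rows agree on lhs but differ on rhs.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False
    return True


# Hypothetical sample data.
employees = [
    {"emp": "Ann", "dept": "Toys", "floor": 1},
    {"emp": "Bob", "dept": "Toys", "floor": 1},
    {"emp": "Cal", "dept": "Shoes", "floor": 2},
]

dept_determines_floor = holds(employees, ["dept"], ["floor"])
floor_determines_emp = holds(employees, ["floor"], ["emp"])
```

In this data every department sits on one floor, so `dept -> floor` holds, while floor 1 has two employees, so `floor -> emp` does not.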