why is this important
play

Why Is This Important? How does a relational database conceptually - PDF document

Why Is This Important? How does a relational database conceptually represent data? The Relational Model How can we access specific values in a database? How do we map an ER diagram to an actual database? Chapter 3 1 2 Why Study


  1. Why Is This Important?  How does a relational database conceptually represent data? The Relational Model  How can we access specific values in a database?  How do we map an ER diagram to an actual database? Chapter 3 1 2 Why Study the Relational Model? Relational Database: Definitions  Most widely used model.  Relational database: a set of relations  Big vendors: Oracle, IBM, Microsoft, MySQL  Relation: made up of two parts  “Legacy systems” in older models, e.g., IBM’s IMS  Instance: a table, with rows and columns.  Other competitors • #Rows = cardinality  Tuple stores: Hadoop HBase, Google BigTable, Amazon • #Fields = degree / arity.  Schema: specifies name of relation, plus name and type of SimpleDB and Dynamo  Document stores: CouchDB, MongoDB each column.  Graph databases: Sones, AllegroGraph • E.g., Students(sid: string, name: string, login: string, age: integer, gpa: real).  Object-oriented databases: ObjectStore, Versant, Objectivity  Can think of a relation as a set of rows or tuples • A synthesis: object-relational model by all major relational vendors  XML databases: Oracle Berkeley DB XML, Tamino, MarkLogic,  All rows are distinct. (Not necessarily true for DBMS eXist tables.) 3 4 Example Instance of Students Relation Relational Query Languages  A major strength of the relational model: supports sid name login age gpa simple, powerful querying of data. 53666 Jones jones@cs 18 3.4  Queries can be written intuitively, and the DBMS is 53688 Smith smith@eecs 18 3.2 responsible for efficient evaluation. 53650 Smith smith@math 19 3.8  Specify WHAT you want, not HOW to get it efficiently • Declarative query language plus automatic optimizer  The key: precise semantics for relational queries.  Cardinality = 3, degree = 5, all rows distinct • Simplicity and elegance of relational model and operators also  Do all columns in a relation instance have to be crucial distinct?  Allows the optimizer to extensively re-order operations, and still ensure that the answer does not change. 5 6

  2. The SQL Query Language The SQL Query Language  Developed by IBM (System R) in the 1970s  To find all 18 year old students, we can write:  Need for a standard since it is used by many vendors sid name login age gpa SELECT *  Standards: FROM Students S 53666 Jones jones@cs 18 3.4  SQL-86 WHERE S.age=18 53688 Smith smith@ee 18 3.2  SQL-89 (minor revision)  SQL-92 (major revision)  To find just names and logins, replace the first line:  SQL-99 (major extensions, current standard) SELECT S.name, S.login  Careful, not all vendors implement the complete standard and often there are vendor-specific extensions 7 8 Querying Multiple Relations Creating Relations in SQL SELECT S.name, E.cid  What does this  Creates the Students relation. CREATE TABLE Students FROM Students S, Enrolled E (sid CHAR(20) , query compute? WHERE S.sid=E.sid AND E.grade =“A”  Type (domain) of each field is name CHAR(20) , specified and enforced by the  Given the following instances of Enrolled and login CHAR(10), DBMS whenever tuples are Students: age INTEGER , added or modified. sid name login age gpa gpa REAL ) sid cid grade 53831 Carnatic101 C 53666 Jones jones@cs 18 3.4 53831 Reggae203 B  The Enrolled table holds 53688 Smith smith@eecs 18 3.2 53650 Topology112 A CREATE TABLE Enrolled information about courses 53666 History105 B (sid CHAR(20) , 53650 Smith smith@math 19 3.8 that students take. cid CHAR(20) , S.name E.cid grade CHAR (2))  We get Smith Topology112 9 10 Destroying and Altering Relations Adding and Deleting Tuples  Can insert a single tuple using: DROP TABLE Students  Destroys the relation Students. The schema INSERT INTO Students (sid, name, login, age, gpa) VALUES (53688, „Smith‟, „ smith@ee ‟, 18, 3.2) information and the tuples are deleted.  Can delete all tuples satisfying some condition (e.g., ALTER TABLE Students name = Smith): ADD COLUMN firstYear: integer DELETE  The schema of Students is altered by adding a new FROM Students S field; every tuple in the current instance is extended WHERE S.name = „Smith‟ with a null value in the new field. 11 12

  3. Integrity Constraints (ICs) Primary Key Constraints  IC: condition that must be true for every instance of  A set of fields is a key for a relation if : the database; e.g., domain constraints. 1. No two distinct tuples can have the same values in all key fields, and  ICs are specified when schema is defined. 2. This is not true for any subset of the key.  ICs are checked when relations are modified.  Part 2 false? A superkey.  A legal instance of a relation is one that satisfies all specified ICs.  If there’s >1 key for a relation, one of the keys is  DBMS should not allow illegal instances. chosen (by DBA) to be the primary key.  E.g., sid is a key for Students.  If the DBMS checks ICs, stored data is more faithful  What about student name? to real-world meaning.  The set {sid, gpa} is a superkey.  Avoids data entry errors, too! 13 14 Primary and Candidate Keys in SQL Foreign Keys, Referential Integrity  Possibly many candidate keys  Foreign key: Set of fields in one relation that is used CREATE TABLE Enrolled (specified using UNIQUE), one of to `refer’ to a tuple in another relation. (sid CHAR (20) which is chosen as the primary cid CHAR(20) ,  Must correspond to primary key of the second relation. key. grade CHAR (2),  “For a given student and course,  Like a `logical pointer’. there is a single grade.” vs. PRIMARY KEY (sid, cid) )  E.g., sid in Enrolled is a foreign key referring to “Students can take only one Students: course, and receive a single grade for that course; further, no two  Enrolled(sid: string, cid: string, grade: string) CREATE TABLE Enrolled students in a course receive the (sid CHAR (20)  If all foreign key constraints are enforced, referential same grade.” cid CHAR(20) , integrity is achieved, i.e., no dangling references.  Used carelessly, an IC can prevent grade CHAR (2),  Can you name a data model w/o referential integrity? the storage of database instances that arise in practice! PRIMARY KEY (sid), • Links in HTML! UNIQUE (cid, grade) ) 15 16 Foreign Keys in SQL Enforcing Referential Integrity  Consider Students and Enrolled  Only students listed in the Students relation should  sid in Enrolled is a foreign key that references Students. be allowed to enroll for courses.  What should be done if an Enrolled tuple with a non- CREATE TABLE Enrolled existent student id is inserted? (sid CHAR (20), cid CHAR(20) , grade CHAR (2),  Reject it. PRIMARY KEY (sid, cid),  What should be done if a Students tuple is deleted? FOREIGN KEY (sid) REFERENCES Students )  Also delete all Enrolled tuples that refer to it. Students Enrolled  Disallow deletion of a Students tuple that is referred to. sid name login age gpa sid cid grade  Set sid in Enrolled tuples that refer to it to a default sid. 53666 Jones jones@cs 18 3.4 53666 Carnatic101 C  (In SQL, also: Set sid in Enrolled tuples that refer to it to a 53688 Smith smith@eecs 18 3.2 53666 Reggae203 B special value null , denoting `unknown’ or `inapplicable’.) 53650 Smith smith@math 19 3.8 53650 Topology112 A  Similar if primary key of Students tuple is updated. 53666 History105 B 17 18

Recommend


More recommend