Why Is This Important? How does a relational database conceptually represent data? The Relational Model How can we access specific values in a database? How do we map an ER diagram to an actual database? Chapter 3 1 2 Why Study the Relational Model? Relational Database: Definitions Most widely used model. Relational database: a set of relations Big vendors: Oracle, IBM, Microsoft, MySQL Relation: made up of two parts “Legacy systems” in older models, e.g., IBM’s IMS Instance: a table, with rows and columns. Other competitors • #Rows = cardinality Tuple stores: Hadoop HBase, Google BigTable, Amazon • #Fields = degree / arity. Schema: specifies name of relation, plus name and type of SimpleDB and Dynamo Document stores: CouchDB, MongoDB each column. Graph databases: Sones, AllegroGraph • E.g., Students(sid: string, name: string, login: string, age: integer, gpa: real). Object-oriented databases: ObjectStore, Versant, Objectivity Can think of a relation as a set of rows or tuples • A synthesis: object-relational model by all major relational vendors XML databases: Oracle Berkeley DB XML, Tamino, MarkLogic, All rows are distinct. (Not necessarily true for DBMS eXist tables.) 3 4 Example Instance of Students Relation Relational Query Languages A major strength of the relational model: supports sid name login age gpa simple, powerful querying of data. 53666 Jones jones@cs 18 3.4 Queries can be written intuitively, and the DBMS is 53688 Smith smith@eecs 18 3.2 responsible for efficient evaluation. 53650 Smith smith@math 19 3.8 Specify WHAT you want, not HOW to get it efficiently • Declarative query language plus automatic optimizer The key: precise semantics for relational queries. Cardinality = 3, degree = 5, all rows distinct • Simplicity and elegance of relational model and operators also Do all columns in a relation instance have to be crucial distinct? Allows the optimizer to extensively re-order operations, and still ensure that the answer does not change. 5 6
The SQL Query Language The SQL Query Language Developed by IBM (System R) in the 1970s To find all 18 year old students, we can write: Need for a standard since it is used by many vendors sid name login age gpa SELECT * Standards: FROM Students S 53666 Jones jones@cs 18 3.4 SQL-86 WHERE S.age=18 53688 Smith smith@ee 18 3.2 SQL-89 (minor revision) SQL-92 (major revision) To find just names and logins, replace the first line: SQL-99 (major extensions, current standard) SELECT S.name, S.login Careful, not all vendors implement the complete standard and often there are vendor-specific extensions 7 8 Querying Multiple Relations Creating Relations in SQL SELECT S.name, E.cid What does this Creates the Students relation. CREATE TABLE Students FROM Students S, Enrolled E (sid CHAR(20) , query compute? WHERE S.sid=E.sid AND E.grade =“A” Type (domain) of each field is name CHAR(20) , specified and enforced by the Given the following instances of Enrolled and login CHAR(10), DBMS whenever tuples are Students: age INTEGER , added or modified. sid name login age gpa gpa REAL ) sid cid grade 53831 Carnatic101 C 53666 Jones jones@cs 18 3.4 53831 Reggae203 B The Enrolled table holds 53688 Smith smith@eecs 18 3.2 53650 Topology112 A CREATE TABLE Enrolled information about courses 53666 History105 B (sid CHAR(20) , 53650 Smith smith@math 19 3.8 that students take. cid CHAR(20) , S.name E.cid grade CHAR (2)) We get Smith Topology112 9 10 Destroying and Altering Relations Adding and Deleting Tuples Can insert a single tuple using: DROP TABLE Students Destroys the relation Students. The schema INSERT INTO Students (sid, name, login, age, gpa) VALUES (53688, „Smith‟, „ smith@ee ‟, 18, 3.2) information and the tuples are deleted. Can delete all tuples satisfying some condition (e.g., ALTER TABLE Students name = Smith): ADD COLUMN firstYear: integer DELETE The schema of Students is altered by adding a new FROM Students S field; every tuple in the current instance is extended WHERE S.name = „Smith‟ with a null value in the new field. 11 12
Integrity Constraints (ICs) Primary Key Constraints IC: condition that must be true for every instance of A set of fields is a key for a relation if : the database; e.g., domain constraints. 1. No two distinct tuples can have the same values in all key fields, and ICs are specified when schema is defined. 2. This is not true for any subset of the key. ICs are checked when relations are modified. Part 2 false? A superkey. A legal instance of a relation is one that satisfies all specified ICs. If there’s >1 key for a relation, one of the keys is DBMS should not allow illegal instances. chosen (by DBA) to be the primary key. E.g., sid is a key for Students. If the DBMS checks ICs, stored data is more faithful What about student name? to real-world meaning. The set {sid, gpa} is a superkey. Avoids data entry errors, too! 13 14 Primary and Candidate Keys in SQL Foreign Keys, Referential Integrity Possibly many candidate keys Foreign key: Set of fields in one relation that is used CREATE TABLE Enrolled (specified using UNIQUE), one of to `refer’ to a tuple in another relation. (sid CHAR (20) which is chosen as the primary cid CHAR(20) , Must correspond to primary key of the second relation. key. grade CHAR (2), “For a given student and course, Like a `logical pointer’. there is a single grade.” vs. PRIMARY KEY (sid, cid) ) E.g., sid in Enrolled is a foreign key referring to “Students can take only one Students: course, and receive a single grade for that course; further, no two Enrolled(sid: string, cid: string, grade: string) CREATE TABLE Enrolled students in a course receive the (sid CHAR (20) If all foreign key constraints are enforced, referential same grade.” cid CHAR(20) , integrity is achieved, i.e., no dangling references. Used carelessly, an IC can prevent grade CHAR (2), Can you name a data model w/o referential integrity? the storage of database instances that arise in practice! PRIMARY KEY (sid), • Links in HTML! UNIQUE (cid, grade) ) 15 16 Foreign Keys in SQL Enforcing Referential Integrity Consider Students and Enrolled Only students listed in the Students relation should sid in Enrolled is a foreign key that references Students. be allowed to enroll for courses. What should be done if an Enrolled tuple with a non- CREATE TABLE Enrolled existent student id is inserted? (sid CHAR (20), cid CHAR(20) , grade CHAR (2), Reject it. PRIMARY KEY (sid, cid), What should be done if a Students tuple is deleted? FOREIGN KEY (sid) REFERENCES Students ) Also delete all Enrolled tuples that refer to it. Students Enrolled Disallow deletion of a Students tuple that is referred to. sid name login age gpa sid cid grade Set sid in Enrolled tuples that refer to it to a default sid. 53666 Jones jones@cs 18 3.4 53666 Carnatic101 C (In SQL, also: Set sid in Enrolled tuples that refer to it to a 53688 Smith smith@eecs 18 3.2 53666 Reggae203 B special value null , denoting `unknown’ or `inapplicable’.) 53650 Smith smith@math 19 3.8 53650 Topology112 A Similar if primary key of Students tuple is updated. 53666 History105 B 17 18
Recommend
More recommend