course intro
play

Course Intro & Relational Model Lecture # 01 Database Systems - PowerPoint PPT Presentation

Course Intro & Relational Model Lecture # 01 Database Systems Andy Pavlo AP AP Computer Science 15-445/15-645 Carnegie Mellon Univ. Fall 2018 2 Leon Wrinkles (1 946-201 8) CMU 15-445/645 (Fall 2018) 3 CMU 15-445/645 (Fall 2018)


  1. Course Intro & Relational Model Lecture # 01 Database Systems Andy Pavlo AP AP Computer Science 15-445/15-645 Carnegie Mellon Univ. Fall 2018

  2. 2 Leon Wrinkles (1 946-201 8) CMU 15-445/645 (Fall 2018)

  3. 3 CMU 15-445/645 (Fall 2018)

  4. #1 KB #2 Databases

  5. 5 Wait List Overview Course Logistics Relational Model Relational Algebra CMU 15-445/645 (Fall 2018)

  6. 6 WAIT LIST There are currently 150 people on the waiting list. Max capacity is 100. We will enroll people from the waiting list in the order that you complete Homework #1. CMU 15-445/645 (Fall 2018)

  7. 7 CO URSE OVERVIEW This course is on the design and implementation of disk-oriented database management systems. This is not a course on how to use a database to build applications or how to administer a database. → See CMU 95-703 (Heinz College) Database Applications (15-415/615) is cancelled this semester. CMU 15-445/645 (Fall 2018)

  8. 8 CO URSE O UTLIN E Relational Databases Storage Execution Concurrency Control Recovery Distributed Databases Potpourri CMU 15-445/645 (Fall 2018)

  9. 9 CO URSE LO GISTICS Course Policies + Schedule: → Refer to course web page. Academic Honesty: → Refer to CMU policy page. → If you’re not sure, ask the professors. → Don’t be stupid. All discussion + announcements will be on Piazza. CMU 15-445/645 (Fall 2018)

  10. 10 TEXTBO O K Database System Concepts 6th Edition Silberschatz, Korth, & Sudarshan We will also provide lecture notes that covers topics not found in textbook. CMU 15-445/645 (Fall 2018)

  11. 11 CO URSE RUBRIC Homeworks (15%) Projects (45%) Midterm Exam (20%) Final Exam (20%) Extra Credit (+10%) CMU 15-445/645 (Fall 2018)

  12. 12 H O M EWO RKS Five homework assignments throughout the semester. First homework is a SQL assignment. The rest will be pencil-and-paper assignments. All homeworks should be done individually. CMU 15-445/645 (Fall 2018)

  13. 13 PRO J ECTS You will build your own storage manager from scratch of the course of the semester. Each project builds on the previous one. We will not teach you how to write/debug C++11 code. CMU 15-445/645 (Fall 2018)

  14. 14 LATE PO LICY You are allowed 4 slip days for either homeworks or projects. You lose 25% of an assignment’s points for every 24hrs it is late. Mark on your submission (1) how many days you are late and (2) how many late days you have left. CMU 15-445/645 (Fall 2018)

  15. 15 PLAGIARISM WARN IN G The homeworks and projects must be your own work. They are not group assignments. You may not copy source code from other people or the web. Plagiarism will not be tolerated. See CMU's Policy on Academic Integrity for additional information. CMU 15-445/645 (Fall 2018)

  16. DATABASE RESEARCH Database Group Meetings → Mondays @ 4:30pm (GHC 8102) → https://db.cs.cmu.edu Peloton Developer Meetings → Tuesdays @ 12:00pm (GHC 8115) → https://pelotondb.io Database Seminar Series → Thursdays @ 12:00pm (CIC 4 th Floor) → https://db.cs.cmu.edu/seminar2018 CMU 15-445/645 (Fall 2018)

  17. DATABASE RESEARCH Database Group Meetings → Mondays @ 4:30pm (GHC 8102) → https://db.cs.cmu.edu Peloton Developer Meetings → Tuesdays @ 12:00pm (GHC 8115) → https://pelotondb.io Database Seminar Series → Thursdays @ 12:00pm (CIC 4 th Floor) → https://db.cs.cmu.edu/seminar2018 CMU 15-445/645 (Fall 2018)

  18. Databases

  19. 18 DATABASE Organized collection of inter-related data that models some aspect of the real-world. Databases are core the component of most computer applications. CMU 15-445/645 (Fall 2018)

  20. 19 DATABASE EXAM PLE Create a database that models a digital music store to keep track of artists and albums. Things we need store: → Information about Artists → What Albums those Artists released CMU 15-445/645 (Fall 2018)

  21. 20 FLAT FILE STRAWM AN Store our database as comma-separated value (CSV) files that we manage in our own code. → Use a separate file per entity. → The application has to parse the files each time they want to read/update records. CMU 15-445/645 (Fall 2018)

  22. 21 FLAT FILE STRAWM AN Create a database that models a digital music store. Album (name, artist, year) Artist (name, year, country) "Enter the Wu Tang","Wu Tang Clan",1993 "Wu Tang Clan",1992,"USA" "St.Ides Mix Tape","Wu Tang Clan",1994 "Notorious BIG",1992,"USA" "AmeriKKKa's Most Wanted","Ice Cube",1990 "Ice Cube",1989,"USA" CMU 15-445/645 (Fall 2018)

  23. 22 FLAT FILE STRAWM AN Example: Get the year that Ice Cube went solo. Artist (name, year, country) for line in file: "Wu Tang Clan",1992,"USA" record = parse (line) "Notorious BIG",1992,"USA" if “Ice Cube” == record[0]: print int (record[1]) "Ice Cube",1989,"USA" CMU 15-445/645 (Fall 2018)

  24. 23 FLAT FILES: DATA IN TEGRITY How do we ensure that the artist is the same for each album entry? What if somebody overwrites the album year with an invalid string? How do we store that there are multiple artists on an album? CMU 15-445/645 (Fall 2018)

  25. 24 FLAT FILES: IM PLEM EN TATIO N How do you find a particular record? What if we now want to create a new application that uses the same database? What if two threads try to write to the same file at the same time? CMU 15-445/645 (Fall 2018)

  26. 25 FLAT FILES: DURABILITY What if the machine crashes while our program is updating a record? What if we want to replicate the database on multiple machines for high availability? CMU 15-445/645 (Fall 2018)

  27. 26 DATABASE M AN AGEM EN T SYSTEM A DBMS is software that allows applications to store and analyze information in a database. A general-purpose DBMS is designed to allow the definition, creation, querying, update, and administration of databases. CMU 15-445/645 (Fall 2018)

  28. 27 EARLY DBM Ss Database applications were difficult to build and maintain. Tight coupling between logical and physical layers. You have to (roughly) know what queries your app would execute before you deployed the database. Edgar F. Codd CMU 15-445/645 (Fall 2018)

  29. 27 EARLY DBM Ss Database applications were difficult to build and maintain. Tight coupling between logical and physical layers. You have to (roughly) know what queries your app would execute before you deployed the database. Edgar F. Codd CMU 15-445/645 (Fall 2018)

  30. 28 RELATIO N AL M O DEL Proposed in 1970 by Ted Codd. Database abstraction to avoid this maintenance: → Store database in simple data structures. → Access data through high-level language. → Physical storage left up to implementation. Edgar F. Codd CMU 15-445/645 (Fall 2018)

  31. 29 DATA M O DELS A data model is collection of concepts for describing the data in a database. A schema is a description of a particular collection of data, using a given data model. CMU 15-445/645 (Fall 2018)

  32. 30 DATA M O DEL Relational Key/Value Graph Document Column-family Array / Matrix Hierarchical Network CMU 15-445/645 (Fall 2018)

  33. 30 DATA M O DEL Relational Key/Value Graph Document Column-family Array / Matrix Hierarchical Network CMU 15-445/645 (Fall 2018)

  34. 30 DATA M O DEL Relational Key/Value Graph Document Column-family Array / Matrix Hierarchical Network CMU 15-445/645 (Fall 2018)

  35. 30 DATA M O DEL Relational Key/Value Graph Document Column-family Array / Matrix Hierarchical Network CMU 15-445/645 (Fall 2018)

  36. 30 DATA M O DEL Relational Key/Value Graph Document Column-family Array / Matrix Hierarchical Network CMU 15-445/645 (Fall 2018)

  37. 31 RELATIO N AL M O DEL Structure: The definition of relations and their contents. Integrity: Ensure the database’s contents satisfy constraints. Manipulation: How to access and modify a database’s contents. CMU 15-445/645 (Fall 2018)

  38. 32 RELATIO N AL M O DEL Artist (name, year, country) A relation is unordered set that contain the relationship of attributes name year country that represent entities. Wu Tang Clan 1992 USA Notorious BIG 1992 USA A tuple is a set of attribute values (also Ice Cube 1989 USA known as its domain) in the relation. n -ary Relation → Values are (normally) atomic/scalar. = → The special value NULL is a member of every domain. Table with n columns CMU 15-445/645 (Fall 2018)

  39. 33 RELATIO N AL M O DEL: PRIM ARY KEYS Artist (name, year, country) A relation’s primary key uniquely identifies a single tuple. name year country Wu Tang Clan 1992 USA Some DBMSs automatically create an Notorious BIG 1992 USA internal primary key if you don't Ice Cube 1989 USA define one. Auto-generation of unique integer primary keys: → SEQUENCE (SQL:2003) → AUTO_INCREMENT (MySQL) CMU 15-445/645 (Fall 2018)

  40. 33 RELATIO N AL M O DEL: PRIM ARY KEYS Artist (id, name, year, country) A relation’s primary key uniquely identifies a single tuple. id name year country 123 Wu Tang Clan 1992 USA Some DBMSs automatically create an 456 Notorious BIG 1992 USA internal primary key if you don't 789 Ice Cube 1989 USA define one. Auto-generation of unique integer primary keys: → SEQUENCE (SQL:2003) → AUTO_INCREMENT (MySQL) CMU 15-445/645 (Fall 2018)

Recommend


More recommend