entity relationship model
play

Entity-relationship model Introduction So far in the course, we - PDF document

Entity-relationship model Introduction So far in the course, we have studied a lot about extracting useful information from an already-existing database. However, we have only barely brushed up against getting data into a database. The next


  1. Entity-relationship model Introduction So far in the course, we have studied a lot about extracting useful information from an already-existing database. However, we have only barely brushed up against getting data into a database. The next segment of this course will focus on this latter problem: how to design a useful database schema, and how to load (and alter) data in a database. One of the most important steps in creating a database happens before ever touching a database engine: how to model the slice of the real world that the database will eventually represent. A good database design balances detail and simplicity, providing sufficient detail to answer the kinds of questions that will be asked while providing a simplified view of the world that makes those questions reasonably easy to ask and the answers reasonably easy to interpret. To give a concrete example, suppose we had been tasked to create a database back-end for a CAD (computer-aided design) tool that specializes in the automotive industry. Clearly, the kinds of “car database” examples we’ve used in class would be woefully inadequate: a CAD engineer will not care in the slightest about VIN numbers and license plates (other than where they will be found in the final product); the engineer will care very much about the engine and wheels, but will demand far more detail than anything you would find in a car lot advertisement. At the other extreme, a molecule- accurate model of the car is also not useful to a car designer: nobody wants to manipulate individual molecules and atoms when designing something you can see with the naked eye! Instead, a useful database would probably store a collection of polygons in 3-D space, with adequate support to find and group polygons in various ways (by the part they correspond to, the location relative to a virtual camera’s field of view, etc.). Since very few database designers are CAD experts, any real database design for our imaginary application would have to start by exploring with tool developers what, precisely, they need the database to provide for them; similarly, since the CAD tool developers are not database experts, they would also need some guidance in expressing what they want in terms of what a database engine can possibly do (we really can’t do 3-D rendering, for example). In practice, this early design process is iterative, and proposed designs can change rapidly as people explore different options. Some form of visual representation, or model, of the problem is extremely useful to facilitate the design process and allow participants to express their thoughts clearly; we will focus on a particular model known as the Entity-Relationship Model (ERM).

  2. Entity-relationship model The ER model consists of three basic concepts, illustrated in two different styles below: name title Actors StarsIn Movies address year role • Entity set [rectangle] – represents a type of object we wish to model, such as a “customers,” “retailers” or “landmarks.” Each entity set can contain zero or more entities; “Canadian Tire” and “Tim Hortons” might be entities from the entity set “Retailers.” Entities almost always correspond to nouns in plain English descriptions of a design. 1 • Relationship set [diamond] – represents a type of relationship or link between two or more entity sets. Relationship sets often correspond to verbs (e.g. entities doing something to other entities), but sometimes as nouns as well; many relationships are common enough—and important enough—that they have their own name. Examples include “friendship,” “rivalry,” and “marriage.” Note that such relationship nouns can usually be converted into verbs, such as “friends with” or “married to.” • Attribute [oval] – represents a piece of information or detail about an entity or relationship. For example, color is an attribute of many objects, while starting and ending dates are common attributes of relationships. Attributes are often associated with adjectives and quantities in plain English: colors (red, green, blue, etc.) or salary ($10,000, $50,000, etc.) are good examples. As you can see from the above descriptions, there is some overlap between the concepts: some relationships are almost important enough to be entities in their own right; the same is true for some types of attributes, such as addresses. This flexibility is good because it allows different designs to model the real world in different ways and at different levels of detail. We will come back to this issue later on. Entity vs. entity set One point to be careful of: an entity set is a class of things, while an entity is one object of the class. For example, “Batman Returns" would be an entity from the entity set “movies.” Similarly, CSCC43 would be an entity from the entity set "courses." A similar distinction exists for relationships vs. relationship sets; Romeo and Juliet as well as Anthony and Cleopatra would both be relationships from the relationship set "ill-fated romances." In relational algebra these would correspond to relations and tuples, while in object oriented programming these would correspond to classes and objects or instances. Recursive relationships 1 The converse is not always true: not all nouns should be represented as Entities!

  3. Many relationship sets join two or more entity sets, but some join the same entity set twice. These are known as recursive relationships. In some relationships the two entities involved are symmetric, such as friends; in other cases the relationship is asymmetric, such as employee and boss. In the last example, both employee and boss are people, but there's a big difference between them as far as the relationship is concerned. Another example is shown on the right: two employees can be colleagues, and we don't usually distinguish between the two. However in the case of sovereign succession, it matters very much was the predecessor and two is the successor. N-way relationships Many relationships involve more than two entity sets. Roles For example, we may need to model actors in movies using a ternary relationship if we allow actors to play multiple roles. A binary relationship between actors and movies may not suffice, especially if the roles being Actors StarsIn Movies played span multiple movies and have been promoted to proper entities. As another example, each line item from the TPC-H schema involves an order, a part, and a supplier. In practice, it's rare to see anything more complicated than three-way relationships, but in theory any number of entity sets can be involved in a relationship set. Attributes Attributes describe elementary properties of entities or relationships (e.g., Surname, Salary and Age are attributes of Employee, as shown on the right). Attributes are used to describe details, aspects of the relationship or entity set that are not important enough to be entities their own right. For example, most of the time we don't care to track age 10 vs. age 12 separately. In the student example shown here, both entities and relationships have attributes. Note that the set of attributes we choose to add or model depends strongly on the application. Under some circumstances it might be very relevant to store the student's name along with their student number and enrolment date. Similarly, if all courses always had exactly one exam, it might not be very important to track the date the exam is given. Sometimes, it's attractive to model groups of related attributes as composite values. The example of the right groups the different parts of an address into a single composite attribute. It might also make sense to group surname and given name into a single composite name field.

  4. You should always be careful about using composite attributes, however, because they can also be a warning sign that you're trying to represent an entity set with attributes. Many databases track addresses as entities in their own right, for example. In the end, it's up to the database designer to decide whether the composite attribute is important enough to promote to an entity set. Putting it together The figure above gives a more complete example of using entities, relationships, and attributes to model a company hierarchy. Observe that employees and departments can be involved in different kinds of relationships: all employees are presumably members of a department, and some employees also manage departments. Constraints What important purpose of modeling a database is to identify the constraints that the database should impose on the data stored in it. The most common constraints involve cardinality: how many times a given entity may, or must, participate in a given a relationship. In the previous example of a company hierarchy, it would be reasonable to enforce that every employee must belong to one department, and no more than one department. It might also be reasonable to enforce that no employee may manage more than one department, while allowing most employees not to manage any departments. We usually represent cardinality constraints using a pair of numbers in parentheses: (m,n). In this case, m represents the minimum cardinality, and n represents the maximum cardinality. If M is zero, the relationship is optional, in the sense that an entity need not participate. If M is one or greater, the entity must participate at least that many times. N gives a maximum cardinality, and prevents an entity from participating too many times.

Recommend


More recommend