Understanding the Entity Relationship Model (ERM)
The Entity Relationship Model (ERM)
Forms the Basis of an Entity Relationship Diagram (ERD)
The ERM provides a conceptual view of a database as seen by the end user. It defines the main components of a database, including:
- Entities
- Attributes
- Relationships
Entities
Entities represent objects of interest to the end user. They refer to an entity set rather than a single entity occurrence. In the relational environment, an entity corresponds to a table, while an entity instance or occurrence corresponds to a table row. Entities are typically represented by rectangles in Chen, Crow’s Foot, and UML notations.
Attributes
Attributes are characteristics of entities. They can be classified as:
- Required: Must have a value.
- Optional: May be left empty.
Other important attribute concepts include:
- Domain: Set of possible values.
- Identifier: Uniquely identifies each entity instance.
- Composite identifier: Primary key composed of multiple attributes.
- Composite attribute: Can be subdivided into additional attributes.
- Simple attribute: Cannot be subdivided.
- Single-valued attribute: Has only one value.
- Multivalued attribute: Has multiple values.
- Derived attribute: Calculated from other attributes.
Advantages and Disadvantages of Storing Derived Attributes
Stored Derived Attribute
- Advantage: Saves processing time and data access time.
- Disadvantage: Requires maintenance to ensure accuracy.
Non-Stored Derived Attribute
- Advantage: Saves storage space and always yields current value.
- Disadvantage: Uses processing time and increases data access time.
Relationships, Connectivity, and Cardinality
Relationships represent associations between entities. Key concepts include:
- Participants: Entities involved in the relationship.
- Connectivity: Classification of the relationship (e.g., 1:1, 1:M, M:N).
- Cardinality: Minimum and maximum number of entity occurrences associated with one occurrence of a related entity.
Existence Dependence
- Existence-dependent entity: Exists only when associated with another entity.
- Existence-independent entity: Exists apart from related entities (strong entity).
Relationship Strength
- Weak (non-identifying) relationship: Primary key of the related entity does not contain a primary key component of the parent entity.
- Strong (identifying) relationship: Primary key of the related entity contains a primary key component of the parent entity.
Weak Entities
Weak entities are existence-dependent and have a primary key partially or totally derived from the parent entity. Their existence relies on the association with another entity.
Relationship Participation
- Optional participation: An entity occurrence does not require a corresponding entity occurrence in the relationship.
- Mandatory participation: An entity occurrence requires a corresponding entity occurrence in the relationship.
Crow’s Foot Symbols
Crow’s Foot notation uses symbols to represent cardinality and participation in relationships.
Relationship Degree
Relationship degree indicates the number of entities involved:
- Unary: Within a single entity.
- Binary: Two entities.
- Ternary: Three entities.
Recursive Relationships
Recursive relationships occur within a single entity type, where an entity can be related to other occurrences of the same entity set.
Associative (Composite) Entities
Associative entities represent M:N relationships between entities. They have a 1:M relationship with the parent entities and are composed of their primary key attributes.
Database Design Challenges: Conflicting Goals
Database designers often face conflicting goals, such as balancing design standards with processing speed and information generation.
Entity Supertypes and Subtypes
Entity Supertype
A generic entity type related to one or more entity subtypes, containing common characteristics.
Entity Subtype
Contains unique characteristics of each entity subtype. Subtypes exist within the context of a supertype and inherit attributes and relationships from it.
Specialization Hierarchy
Depicts the arrangement of supertypes (parent entities) and subtypes (child entities) in an “is-a” relationship.
Inheritance
Enables subtypes to inherit attributes and relationships from the supertype.
Subtype Discriminator
An attribute in the supertype that determines the related subtype.
Disjoint and Overlapping Constraints
- Disjoint subtypes: Contain unique subsets of the supertype entity set.
- Overlapping subtypes: Contain nonunique subsets of the supertype entity set.
Completeness Constraint
- Partial completeness: Not every supertype occurrence is a member of a subtype.
- Total completeness: Every supertype occurrence must be a member of at least one subtype.
Specialization and Generalization
- Specialization: Top-down process of identifying subtypes from a supertype.
- Generalization: Bottom-up process of identifying a supertype from subtypes.
Entity Integrity: Selecting Primary Keys
Primary Keys
Uniquely identify each entity instance and guarantee entity integrity.
Natural Keys and Primary Keys
Natural keys are real-world identifiers used to uniquely identify objects. They can be used as primary keys.
Primary Key Guidelines
Guidelines for selecting primary keys include considering natural keys, composite keys, and surrogate keys.
Summary
The EER model extends the ER model with supertypes, subtypes, and clusters. Natural keys, composite keys, and surrogate keys play important roles in entity integrity. Time-variant data and fan traps are additional considerations in database design.
The Normalization Process
Normalization is the process of organizing data to reduce redundancy and improve data integrity. The different normal forms (1NF, 2NF, 3NF, BCNF, 4NF) address specific types of data anomalies.
Functional Dependence
Functional dependence occurs when one attribute determines the value of another attribute.
Conversion to First Normal Form (1NF)
1NF eliminates repeating groups and ensures that all attributes are dependent on the primary key.
The Boyce-Codd Normal Form (BCNF)
BCNF is a stricter form of 3NF, ensuring that every determinant is a candidate key.
Understanding Normalization
Normalization helps to organize data efficiently and avoid redundancy. Functional dependencies and transitive dependencies are key concepts in the normalization process.
