Understanding the Entity Relationship Model (ERM)

The Entity Relationship Model (ERM)

Forms the Basis of an Entity Relationship Diagram (ERD)

The ERM provides a conceptual view of a database as seen by the end user. It defines the main components of a database, including:

  • Entities
  • Attributes
  • Relationships

Entities

Entities represent objects of interest to the end user. They refer to an entity set rather than a single entity occurrence. In the relational environment, an entity corresponds to a table, while an entity instance or occurrence corresponds to a table row. Entities are typically represented by rectangles in Chen, Crow’s Foot, and UML notations.

Attributes

Attributes are characteristics of entities. They can be classified as:

  • Required: Must have a value.
  • Optional: May be left empty.

Other important attribute concepts include:

  • Domain: Set of possible values.
  • Identifier: Uniquely identifies each entity instance.
  • Composite identifier: Primary key composed of multiple attributes.
  • Composite attribute: Can be subdivided into additional attributes.
  • Simple attribute: Cannot be subdivided.
  • Single-valued attribute: Has only one value.
  • Multivalued attribute: Has multiple values.
  • Derived attribute: Calculated from other attributes.

Advantages and Disadvantages of Storing Derived Attributes

Stored Derived Attribute

  • Advantage: Saves processing time and data access time.
  • Disadvantage: Requires maintenance to ensure accuracy.

Non-Stored Derived Attribute

  • Advantage: Saves storage space and always yields current value.
  • Disadvantage: Uses processing time and increases data access time.

Relationships, Connectivity, and Cardinality

Relationships represent associations between entities. Key concepts include:

  • Participants: Entities involved in the relationship.
  • Connectivity: Classification of the relationship (e.g., 1:1, 1:M, M:N).
  • Cardinality: Minimum and maximum number of entity occurrences associated with one occurrence of a related entity.

Existence Dependence

  • Existence-dependent entity: Exists only when associated with another entity.
  • Existence-independent entity: Exists apart from related entities (strong entity).

Relationship Strength

  • Weak (non-identifying) relationship: Primary key of the related entity does not contain a primary key component of the parent entity.
  • Strong (identifying) relationship: Primary key of the related entity contains a primary key component of the parent entity.

Weak Entities

Weak entities are existence-dependent and have a primary key partially or totally derived from the parent entity. Their existence relies on the association with another entity.

Relationship Participation

  • Optional participation: An entity occurrence does not require a corresponding entity occurrence in the relationship.
  • Mandatory participation: An entity occurrence requires a corresponding entity occurrence in the relationship.

Crow’s Foot Symbols

Crow’s Foot notation uses symbols to represent cardinality and participation in relationships.

Relationship Degree

Relationship degree indicates the number of entities involved:

  • Unary: Within a single entity.
  • Binary: Two entities.
  • Ternary: Three entities.

Recursive Relationships

Recursive relationships occur within a single entity type, where an entity can be related to other occurrences of the same entity set.

Associative (Composite) Entities

Associative entities represent M:N relationships between entities. They have a 1:M relationship with the parent entities and are composed of their primary key attributes.

Database Design Challenges: Conflicting Goals

Database designers often face conflicting goals, such as balancing design standards with processing speed and information generation.

Entity Supertypes and Subtypes

Entity Supertype

A generic entity type related to one or more entity subtypes, containing common characteristics.

Entity Subtype

Contains unique characteristics of each entity subtype. Subtypes exist within the context of a supertype and inherit attributes and relationships from it.

Specialization Hierarchy

Depicts the arrangement of supertypes (parent entities) and subtypes (child entities) in an “is-a” relationship.

Inheritance

Enables subtypes to inherit attributes and relationships from the supertype.

Subtype Discriminator

An attribute in the supertype that determines the related subtype.

Disjoint and Overlapping Constraints

  • Disjoint subtypes: Contain unique subsets of the supertype entity set.
  • Overlapping subtypes: Contain nonunique subsets of the supertype entity set.

Completeness Constraint

  • Partial completeness: Not every supertype occurrence is a member of a subtype.
  • Total completeness: Every supertype occurrence must be a member of at least one subtype.

Specialization and Generalization

  • Specialization: Top-down process of identifying subtypes from a supertype.
  • Generalization: Bottom-up process of identifying a supertype from subtypes.

Entity Integrity: Selecting Primary Keys

Primary Keys

Uniquely identify each entity instance and guarantee entity integrity.

Natural Keys and Primary Keys

Natural keys are real-world identifiers used to uniquely identify objects. They can be used as primary keys.

Primary Key Guidelines

Guidelines for selecting primary keys include considering natural keys, composite keys, and surrogate keys.

Summary

The EER model extends the ER model with supertypes, subtypes, and clusters. Natural keys, composite keys, and surrogate keys play important roles in entity integrity. Time-variant data and fan traps are additional considerations in database design.

The Normalization Process

Normalization is the process of organizing data to reduce redundancy and improve data integrity. The different normal forms (1NF, 2NF, 3NF, BCNF, 4NF) address specific types of data anomalies.

Functional Dependence

Functional dependence occurs when one attribute determines the value of another attribute.

Conversion to First Normal Form (1NF)

1NF eliminates repeating groups and ensures that all attributes are dependent on the primary key.

The Boyce-Codd Normal Form (BCNF)

BCNF is a stricter form of 3NF, ensuring that every determinant is a candidate key.

Understanding Normalization

Normalization helps to organize data efficiently and avoid redundancy. Functional dependencies and transitive dependencies are key concepts in the normalization process.