Database Data Models: Structure, Keys, and ER Design

Data Models: Blueprint for Data Management

Data models serve as the blueprint for how data is structured, stored, and manipulated. The evolution of these models reflects the growing need for flexibility and the ability to handle complex relationships.

Evolution of Data Models

1. Hierarchical Data Model

The Hierarchical Model is the oldest of the three, popularized by IBM’s Information Management System (IMS) in the 1960s. It organizes data in a top-down, tree-like structure.

  • Structure: Data is represented as a collection of “segments” connected by parent-child links.
  • Relationship: Strictly One-to-Many (1:N). Each child can have only one parent, but a parent can have multiple children.
  • Navigation: Accessing data requires “traversing” the tree from the root node down the branches.
  • Best For: Simple, fixed structures like file systems (folders and files) or organizational charts.

2. Network Data Model

The Network Model was developed to address the rigidity of the hierarchical model. It was formalized by the CODASYL group in the early 1970s.

  • Structure: Data is represented as a graph rather than a tree.
  • Relationship: Supports Many-to-Many (M:N). Unlike the hierarchical model, a child node (called a “member”) can have multiple parent nodes (called “owners”).
  • Navigation: Uses “pointers” or physical addresses to link records, making it faster but more complex to manage manually.
  • Best For: Complex applications like banking or telecommunications where many-to-many relationships are common.

3. Relational Data Model

Introduced by E.F. Codd in 1970, the Relational Model is the foundation for modern databases like MySQL, PostgreSQL, and Oracle.

  • Structure: Data is organized into tables (relations) consisting of rows (tuples) and columns (attributes).
  • Relationship: Links between tables are established using Keys (Primary and Foreign keys) rather than physical pointers. It supports 1:1, 1:N, and M:N relationships.
  • Navigation: Users use a declarative language (SQL) to tell the system what data they want, rather than how to navigate to it.
  • Best For: General-purpose data management, e-commerce, and any application requiring high flexibility and data integrity.

Comparison Summary

FeatureHierarchicalNetworkRelational
Data StructureTree (Parent-Child)Graph (Owner-Member)Tables (Relations)
RelationshipOne-to-Many onlyMany-to-ManyMany-to-Many (via keys)
FlexibilityLow (Rigid)ModerateHigh
Data AccessNavigation PathPointers/LinksSQL Queries
Data IndependenceLowLowHigh

Entity-Relationship (ER) Model Concepts

In the Entity-Relationship (ER) Model, the terms Entity, Entity Type, and Entity Set are often used interchangeably in casual conversation, but they represent distinct levels of abstraction in database design.

1. Entity (The Instance)

An Entity is a specific, unique object or concept in the real world that has an independent existence. It is a single “occurrence” of something you want to track.

  • Physical Existence: A specific person (e.g., “John Doe”), a specific car, or a specific building.
  • Conceptual Existence: A specific bank account, a specific course (e.g., “Intro to CS”), or a job position.
  • Analogy: If your database is a spreadsheet, a single row is an entity.

2. Entity Type (The Blueprint)

An Entity Type is the category or “class” that defines the structure for a group of similar entities. It describes the schema—what properties (attributes) all entities of this type will have.

  • Example: STUDENT is an entity type. It defines that every student will have a Name, Roll_Number, and Date_of_Birth.
  • ER Representation: It is represented by a Rectangle.
  • Analogy: The column headers and the table name (the structure) represent the Entity Type.

3. Entity Set (The Collection)

An Entity Set is the actual collection of all entities of a particular type present in the database at any given moment.

  • Example: The set of all students currently enrolled in a university forms the STUDENT entity set.
  • Key Detail: While an Entity Type is a static definition, an Entity Set is a dynamic “snapshot” of the data currently stored.
  • Analogy: The entire table (all rows combined) is the Entity Set.

Key Differences at a Glance

FeatureEntityEntity TypeEntity Set
LevelInstance (Single)Template (Schema)Collection (Data)
FocusOne specific objectCharacteristics/AttributesAll objects of that type
ER SymbolNot directly shownRectangleThe data within the rectangle
ExampleStudent “Alice”STUDENT (Name, ID){Alice, Bob, Charlie}
Strong vs. Weak Entity Types
  • Strong Entity Type: Can exist on its own and has a Primary Key (e.g., EMPLOYEE with Emp_ID). Represented by a single rectangle.
  • Weak Entity Type: Cannot be uniquely identified by its own attributes and depends on a “parent” entity (e.g., DEPENDENT of an employee). Represented by a double rectangle.

Attributes, Keys, and Integrity Constraints

To build a robust database, you need to define the properties of your data (Attributes), how to find specific records (Keys), and the rules that keep the data accurate (Integrity Constraints).

1. Types of Attributes

Attributes are the characteristics that describe an entity. In an ER diagram, they are represented by ovals.

  • Simple vs. Composite:
    • Simple: Atomic values that cannot be divided (e.g., Age).
    • Composite: Can be divided into sub-parts (e.g., Name into First_Name and Last_Name).
  • Single-valued vs. Multi-valued:
    • Single-valued: Only one value per entity (e.g., Date_of_Birth).
    • Multi-valued: Can have multiple values (e.g., Phone_Number). Represented by a double oval.
  • Derived: Not stored physically but calculated from other attributes (e.g., Age derived from Birth_Date). Represented by a dashed oval.
  • Key Attribute: Uniquely identifies an entity (e.g., Roll_No). Represented by an underlined label.

2. Keys in DBMS

Keys are used to identify rows in a table and establish relationships between tables.

  • Super Key: Any set of one or more attributes that can uniquely identify a record. It may contain extra, unnecessary attributes.
  • Candidate Key: A “minimal” super key. It has no redundant attributes. A table can have multiple candidate keys.
  • Primary Key: The specific candidate key chosen by the DBA to uniquely identify records. It cannot be NULL.
  • Alternate Key: Candidate keys that were not chosen as the primary key.
  • Foreign Key: An attribute in one table that refers to the Primary Key of another table, creating a link between them.

3. Integrity Constraints

These are the “rules of the game” that prevent accidental damage to the database during updates or deletions.

ConstraintDescriptionExample
Domain ConstraintRestricts the values an attribute can take based on data type or range.A Gender column only accepting ‘M’, ‘F’, or ‘O’.
Entity IntegrityEvery table must have a Primary Key, and it cannot be NULL.You can’t have a Student record without a Student_ID.
Referential IntegrityA Foreign Key must either match an existing Primary Key in the parent table or be NULL.You can’t assign an Order to a Customer_ID that doesn’t exist.
Key ConstraintEnsures that the key attributes remain unique across all rows.No two employees can share the same Social_Security_Number.
How They Work Together

Imagine a University Database:

  • Attribute: Student_Email is a multi-valued attribute (students have personal and school emails).
  • Key: Student_ID is the Primary Key.
  • Integrity: If you try to delete a Professor record while they are still assigned to a Course, Referential Integrity will block the deletion to prevent “orphaned” courses.

Designing an Entity-Relationship Diagram (ERD)

Designing an ER Diagram is the process of mapping out a system’s logic before it is built into a physical database. It involves identifying the “things” (entities), their “properties” (attributes), and how they “interact” (relationships).

1. Steps to Design an ER Diagram

Designing an ERD is an iterative process that moves from abstract requirements to a technical map.

  1. Identify Entities: Look for the nouns in your requirements (e.g., Customer, Order, Product).
  2. Define Relationships: Identify the verbs that connect these nouns (e.g., Customer places Order, Order contains Product).
  3. Add Attributes: Determine what data you need to store for each entity (e.g., Customer_Name, Price, Order_Date).
  4. Assign Keys: Choose a Primary Key (underlined) for each entity to ensure every record is unique.
  5. Define Cardinality: Specify the numerical constraints (1:1, 1:N, or M:N).
  6. Review and Refine: Remove redundant relationships and ensure all weak entities have a “parent” (owner) entity.

2. Cardinality and Participation

These constraints define the rules of the relationship between two entities.

Cardinality (Max Limit)

This tells you the maximum number of times an instance in one entity set can relate to instances in another.

  • One-to-One (1:1): One person has one Passport.
  • One-to-Many (1:N): One Customer can place many Orders.
  • Many-to-Many (M:N): Many Students can enroll in many Courses.
Participation (Min Limit)

This tells you if an entity must participate in a relationship to exist.

  • Total Participation: Every entity in the set must be involved in the relationship. Represented by a double line. (e.g., Every Loan must belong to a Customer).
  • Partial Participation: Some entities may not participate. Represented by a single line. (e.g., Not every Customer has to have a Loan).

3. Notation Cheat Sheet (Chen’s Notation)

When drawing, use these standard symbols to communicate clearly with developers and stakeholders:

SymbolMeaning
RectangleEntity Type
Double RectangleWeak Entity Type
DiamondRelationship
Double DiamondIdentifying Relationship (for weak entities)
OvalAttribute
Underlined OvalPrimary Key
Double OvalMulti-valued Attribute
Dashed OvalDerived Attribute
Pro-Tip: Avoid the “Spider Web”

As diagrams grow, they can become unreadable. Best practices suggest grouping related entities and using Crow’s Foot notation if you are moving toward physical implementation, as it is often cleaner for large-scale systems than Chen’s notation.