Core Database Concepts: ER Models, Normalization, and SQL

Entity Relationship Model (ERD)

  • Connectivities: Describes how entities are related (e.g., 1:1, 1:M, M:N).

    • Example: A Student (1) can enroll in many Courses (M), so the relationship is 1:M.
  • Cardinalities: Indicates the number of instances of one entity related to another.

    • Example: A Professor can teach up to 3 Courses (cardinality: 0..3).
  • Strong Relationship: A relationship between two independent, strong entities.

    • Example: Student and Course are strong entities, each with its own primary key.
  • Weak Relationship: A relationship where one entity’s existence depends on another.

    • Example: Dependent (weak entity) depends on Employee (strong entity).
  • PK (Primary Key): A unique identifier for an entity.

    • Example: StudentID is the PK for the Student table.
  • FK (Foreign Key): A key used to link two tables together.

    • Example: CourseID in the Enrollment table is an FK referencing the Course table.
  • Linking Entity: An associative entity used to convert M:N relationships into two 1:M relationships.

    • Example: An Enrollment table links Student and Course.
  • Entity Supertypes and Subtypes: A method of generalization and specialization.

    • Example: Person (supertype) can have subtypes like Student and Professor.

Normalization

  • 1NF (First Normal Form): Ensures that table columns hold atomic values and eliminates repeating groups.

    • Example: A table with multiple phone numbers in a single cell is not in 1NF.
  • 2NF (Second Normal Form): Must be in 1NF and removes partial dependencies, meaning all non-key attributes must depend on the entire primary key.

    • Example: In a table with a composite primary key of (StudentID, CourseID), the Grade attribute should depend on both parts of the key.
  • 3NF (Third Normal Form): Must be in 2NF and removes transitive dependencies, meaning non-key attributes must depend only on the primary key.

    • Example: In a Student table, DepartmentName should not depend on a non-key attribute like DepartmentID; it should be in a separate Department table.

SQL Commands and Operations

  • CREATE TABLE: Defines a new table and its columns.

    CREATE TABLE Student (
        StudentID INT PRIMARY KEY,
        Name VARCHAR(50)
    );
  • INSERT: Adds new rows of data to a table.

    INSERT INTO Student (StudentID, Name) VALUES (1, 'John Doe');
  • SELECT: Retrieves data from one or more tables.

    SELECT Name FROM Student WHERE StudentID = 1;
  • UPDATE: Modifies existing data in a table.

    UPDATE Student SET Name = 'Jane Doe' WHERE StudentID = 1;
  • DELETE: Removes existing rows from a table.

    DELETE FROM Student WHERE StudentID = 1;
  • JOIN: Combines rows from two or more tables based on a related column.

    SELECT s.Name, c.CourseName 
    FROM Student s
    JOIN Enrollment e ON s.StudentID = e.StudentID 
    JOIN Course c ON e.CourseID = c.CourseID;

Database Views

  • View vs. Table: A view is a virtual table based on the result set of a SQL statement. It contains rows and columns, just like a real table, but does not store data itself.

    CREATE VIEW StudentCourses AS 
    SELECT s.Name, c.CourseName 
    FROM Student s 
    JOIN Enrollment e ON s.StudentID = e.StudentID 
    JOIN Course c ON e.CourseID = c.CourseID;

Database Triggers

  • Create Trigger: A stored procedure in a database that automatically executes when a specific event occurs in a table.

    CREATE TRIGGER UpdateStudentCount 
    AFTER INSERT ON Enrollment 
    FOR EACH ROW 
    UPDATE Course SET StudentCount = StudentCount + 1 
    WHERE CourseID = NEW.CourseID;
  • Trigger Events: Triggers can be set to execute on INSERT, UPDATE, or DELETE operations.

    • Example: A trigger can fire after a new row is inserted into the Enrollment table.
  • OLD and NEW Keywords: Within a trigger, these keywords access the column values before (OLD) and after (NEW) the data modification.

    • Example: OLD.CourseID refers to the course ID before an update, and NEW.CourseID refers to the new value.

Stored Procedures

  • Create and Invoke: A reusable block of SQL code that can be saved and executed multiple times.

    CREATE PROCEDURE GetStudentName(
        IN studentID INT,
        OUT studentName VARCHAR(50)
    )
    BEGIN
        SELECT Name INTO studentName FROM Student WHERE StudentID = studentID;
    END;
  • Parameters: Procedures can accept parameters of different modes: IN (input), OUT (output), and INOUT (input and output).

    • Example: IN studentID INT defines an input parameter.
  • Control Flow: Procedures can include control flow statements like IF, CASE, and LOOP.

    IF studentID = 1 THEN
        SET studentName = 'John Doe';
    END IF;

Database Transactions and Concurrency

  • Transaction Control: Commands like START TRANSACTION, COMMIT, and ROLLBACK are used to manage a transaction’s lifecycle.

    START TRANSACTION;
    UPDATE Account SET Balance = Balance - 100 WHERE AccountID = 1;
    UPDATE Account SET Balance = Balance + 100 WHERE AccountID = 2;
    COMMIT;
  • ACID Properties: A set of properties that guarantee transaction reliability.

    • Atomicity: A transaction is an all-or-nothing operation.
    • Consistency: A transaction brings the database from one valid state to another.
    • Isolation: Concurrent transactions execute without interfering with each other.
    • Durability: Once a transaction is committed, its changes are permanent.
  • Schedules and Serializability: Serializability ensures that the outcome of concurrent transactions is equivalent to executing them in some serial order.

  • Concurrency Issues: Problems that can occur in a multi-user environment, such as Dirty Read, Non-repeatable Read, and Phantom Read.

    • Example: A dirty read occurs when one transaction reads data that has been modified by another transaction but not yet committed.
  • Conflict Serializability: A schedule is conflict-serializable if it can be transformed into a serial schedule by swapping non-conflicting operations.

    • Example: A precedence graph can be used to test for conflicts between transactions.