Understanding Relational Databases, SQL, and NoSQL: A Comprehensive Guide
1. Informal Design Guidelines for Relational Schema Design
Informal design guidelines for relational schema design help ensure efficiency, flexibility, and manageability. Here are some key guidelines:
- Simplicity: Aim for an easy-to-understand and maintain schema. Avoid excessive complexity.
- Naming Conventions: Use meaningful and consistent names for tables and columns (e.g., use ‘Employee’ instead of ‘Emp’).
- Eliminate Redundancy: Apply normalization techniques (up to the third normal form) to minimize data redundancy.
- Design for Performance: Consider indexing frequently queried columns and optimize join operations.
- Referential Integrity: Use foreign key constraints to enforce referential integrity between tables.
- Normalization: Normalize the schema to eliminate redundant data and ensure logical data dependencies.
- Scalability: Design the schema to handle future data growth. Consider partitioning large tables.
- Data Integrity Constraints: Define constraints like PRIMARY KEY, UNIQUE, NOT NULL, and CHECK.
- Documentation: Document the schema design, including table purposes, column meanings, and relationships.
- Iteration and Feedback: Continuously review and refine the schema based on feedback and performance analysis.
2. Functional Dependency and Inference Rules
Functional Dependency (FD): In a relational database, an FD is a constraint describing the relationship between attributes. X→Y means that if two tuples have the same values for attributes in X, they must also have the same values for attributes in Y.
Inference Rules:
- Reflexive Rule: If B is a subset of A, then A logically determines B (B ⊆ A then A → B).
- Augmentation Rule: If A determines B, adding any extra attribute doesn’t change the dependency (A → B, then AC → BC).
- Transitive Rule: If A determines B and B determines C, then A indirectly determines C (A → B and B → C then A → C).
- Union Rule: If A determines B and C, then A determines BC (A → B and A → C then A → BC).
- Decomposition Rule: If A determines BC, then it can be decomposed as A → B and A → C (A → BC then A → B and A → C).
- Pseudo Transitive Rule: If A determines B and BC determines D, then AC determines D (A → B and BC → D then AC → D).
3. Equivalence of Functional Dependency Sets
To determine if F and E are equivalent, we need to check if they imply each other.
Sets of Functional Dependencies:
- Set F: A → C, AC → D, E → AD, E → H
- Set E: A → CD, E → AH
Analysis: Both F and E imply each other. Therefore, sets F and E are equivalent.
4. Assertions and Triggers in Databases
Assertions ensure data integrity. Example:
CREATE ASSERTION CheckMinSalary CHECK (NOT EXISTS ( SELECT * FROM Employees WHERE Salary < 30000 ));
Triggers automate actions. Example:
CREATE TABLE Orders ( OrderID INT PRIMARY KEY, OrderDate DATE, LastUpdated TIMESTAMP );
CREATE TRIGGER UpdateLastUpdated BEFORE UPDATE ON Orders FOR EACH ROW BEGIN SET NEW.LastUpdated = CURRENT_TIMESTAMP; END;
5. System Log in Database Transactions
A transaction log records all actions executed by a DBMS, ensuring ACID properties. Key functions include recording transactions, individual transaction recovery, system startup recovery, point-in-time recovery, and supporting replication and high availability.
6. Stored Procedures in SQL
A stored procedure is a set of SQL statements stored and executed on the database server. Benefits include reduced network traffic, reusability, security, and performance.
Example:
CREATE PROCEDURE GetEmployeeDetails @EmployeeID INT AS BEGIN SELECT FirstName, LastName, Department FROM Employees WHERE EmployeeID = @EmployeeID; END;
7. Two-Phase Locking (2PL) Protocol
2PL ensures serializability and concurrency control. It has two phases: growing (acquiring locks) and shrinking (releasing locks). Variations include strict 2PL and rigorous 2PL.
8. The Need for Concurrency Control
Concurrency control ensures data consistency and integrity during simultaneous transaction execution. It prevents data inconsistency, ensures isolation, and maintains database integrity.
9. NoSQL and the CAP Theorem
NoSQL databases handle large volumes of diverse data. The CAP theorem states that a distributed data store cannot simultaneously provide consistency, availability, and partition tolerance.
10. Document-Based NoSQL Systems and MongoDB
Document-based NoSQL systems store data in documents (e.g., JSON). MongoDB is a popular example. CRUD operations in MongoDB include insertOne/insertMany (Create), find (Read), updateOne/updateMany/replaceOne (Update), and deleteOne/deleteMany (Delete).
11. NoSQL Graph Databases and Neo4j
NoSQL graph databases store data in graph format using nodes and edges. Neo4j is a leading example, offering native graph storage, ACID compliance, Cypher query language, high performance, and scalability.
12. Cursors in Embedded SQL
A cursor retrieves and navigates through query results. Properties include declaration, opening, fetching, and closing. Examples demonstrate cursor usage.
15. Assertions and Triggers in SQL
Assertions enforce constraints. Example:
CREATE ASSERTION SALARY_CONSTRAINT CHECK (NOT EXISTS ( SELECT * FROM EMPLOYEE E, EMPLOYEE M, DEPARTMENT D WHERE E.Salary > M.Salary AND E.Dno = D.Dnumber AND D.Mgr_ssn = M.Ssn ));
Triggers automate actions. Example:
CREATE TABLE SALARY_LOG ( EmpID INT, OldSalary DECIMAL(10, 2), NewSalary DECIMAL(10, 2), ChangeDate DATETIME );
CREATE TRIGGER SalaryChangeTrigger AFTER UPDATE OF Salary ON EMPLOYEE FOR EACH ROW BEGIN INSERT INTO SALARY_LOG (EmpID, OldSalary, NewSalary, ChangeDate) VALUES (OLD.EmpID, OLD.Salary, NEW.Salary, NOW()); END;