Database Management System Concepts and Architecture
1
What is DBMS? Describe the advantages and disadvantages of using DBMS
==ADatabase Management System (DBMS)
Is a software system that allows users to create, store, retrieve, update, and manage data in an organized way.
Examples: MySQL, Oracle, SQL Server, PostgreSQL.
Advantages of DBMS : *//Data Redundancy Control
DBMS reduces duplicate data by storing data centrally. *//Data Consistency
Since data is stored in one place, changes are reflected everywhere. *//Data Security
DBMS provides authentication, authorization, and access control. *//Data Sharing
Multiple users can access the same database simultaneously. *//Backup and Recovery
DBMS supports automatic backup and recovery after system failure. *//Data Integrity :
Constraints ensure accuracy and correctness of data.
Disadvantages of DBMS *//High Cost :
DBMS software, hardware, and maintenance are expensive. *//Complexity :
Requires skilled database administrators and trained users. *//Performance Overhead :
Small applications may run slower compared to file systems. *///Single Point of Failure :
If DBMS fails, entire data access may stop.
2. Describe 3-Schema Architecture of Database
The 3-Schema Architecture is used to separate user applications from physical database storage.
1. External Schema (View Level) : */
Represents user views *//
Each user sees only required data *//Example: Student sees marks, admin sees full records *//2. Conceptual Schema (Logical Level) :
Describes entire database structure *//Defines tables, attributes, relationships *//Independent of physical storage *//3. Internal Schema (Physical Level) :
Describes how data is stored *//Includes file organization, indexes, storage paths
Advantages :
Data abstraction *// Data independence *//Improved security
3. What are roles and responsibilities of DBA?
==DBA is responsible formanaging, controlling, and maintaining the database system. *//Roles and Responsibilities : Database Design
Creates schema, tables, relationships. *//Security Management :
Grants/revokes user access privileges. *//Backup and Recovery :
Ensures data safety and recovery after failure. *//Performance Monitoring :
Optimizes queries and indexes. *//Data Integrity
Enforces constraints and consistency rules. *//User Management
Creates users and controls permissions.
4.What is DDL and DML queries?
==DDL is used todefine, modify, and delete the structure of database objects such as tables, schemas, and indexes. **//DDL Commands :CREATE – Creates database objects Example: CREATE TABLE Student(id INT, name VARCHAR(50)); *//ALTER – Modifies table structure. Example: ALTER TABLE Student ADD age INT; *//DROP – Deletes database objects . Example: DROP TABLE Student;
*//TRUNCATE – Removes all records from a table. Example: TRUNCATE TABLE Student;
DML is used to insert, update, delete, and retrieve data from database tables.
DML Commands: INSERT – Adds new records. Example: INSERT INTO Student VALUES (1, 'Ram'); *//UPDATE – Modifies existing records. Example: UPDATE Student SET name='Shyam' WHERE id=1; *//DELETE – Removes records. Example: DELETE FROM Student WHERE id=1; *//SELECT – Retrieves data. Example: SELECT * FROM Student;
1. Define normalization. Importance. Explain 1NF, 2NF, 3NF with examples
== Normalization is the process of organizing data to reduce redundancy and dependency.
Importance of Normalization :
Eliminates data redundancy *//Improves data integrity*
//Reduces update anomalies *//Efficient data storage
First Normal Form (1NF) :
No multivalued attributes *//Atomic values only
Example (Not 1NF) :
Student (ID, Subjects)
1NF Table :
Student(ID, Subject)
Second Normal Form (2NF) :
Must be in 1NF. No partial dependency
Example :
Marks(StudentID, CourseID, StudentName)
StudentName depends only on StudentID
Third Normal Form (3NF) :
Must be in 2NF *// No transitive dependency
Example :
Employee(EmpID, DeptID, DeptName)
DeptName depends on DeptID, not EmpID
1. What is a File System? Explain advantages of DBMS over File System
== A file system is a traditional method of storing data in files using operating system facilities. Each application manages its own files separately.
Advantages of DBMS over File System : *//Reduced Data Redundancy
DBMS stores data centrally, avoiding duplication. *//Improved Data Consistency
Changes in DBMS are reflected everywhere automatically. *//Better Data Security
DBMS provides user authentication and access control. *//Data Sharing
Multiple users can access data at the same time. *//Backup and Recovery
DBMS supports automatic recovery after failures. *//Data Integrity
Constraints ensure correct and valid data. *//Concurrency Control
Prevents conflicts when multiple users access data.
2. Difference between Logical and Physical Data Independence
== Logical Data independence:
//Change logical schema without affecting users //Conceptual level // Logical structure // Harder // Add attribute 2.
Physical Data Independence
//Change physical storage without affecting logical schema //Internal level // Physical storage // Easier //Add index
3. Difference between Primary Key, Super Key, Candidate Key, and Partial Key
==
1. Super Key :
Asuper key is any set of one or more attributes that can uniquely identify a record in a table. *//Characteristics :
May contain extra (unnecessary) attributes *//Ensures uniqueness *// Can be single or composite //Example :
Student(RollNo, Name, Email)
Super Keys: //{RollNo} // {Email} //{RollNo, Name}
2. Candidate Key :
Acandidate key is a minimal super key, meaning no attribute can be removed and still maintain uniqueness.
Characteristics :
No redundant attributes //Uniquely identifies records //One table can have multiple candidate keys Example :
Student(RollNo, Email, Name) *//Candidate Keys:
:; {RollNo} //{Email}
3. Primary Key :
Aprimary key is one candidate key chosen to uniquely identify records in a table. *//Characteristics :
Must be unique //Cannot be NULL //Only one primary key per table *//Example :
Student(RollNo, Name, Email) *//Primary Key: RollNo
4. Partial Key :
Apartial key is an attribute that uniquely identifies weak entities when combined with a primary key of another table. *///Characteristics :
Used in weak entities //
Not sufficient alone to identify a record //Works with a foreign key *// Example :
Employee(EmpID, Name) // Dependent(EmpID, DepName, Age) //Partial Key:
DepName //
Composite Key: (EmpID, DepName)
4. Selection and Projection Operations in Relational Algebra
== Selection (σ) :
Selects rows based on condition *// Syntax: σ condition (Relation) *///Projection (π) :
Selects specific columns // Syntax: π attributes (Relation)
*///RA Query : 1. Schema:
Student(RollNo, Name, Address, Level, DOB)
2. Query: Find names of students in Bachelor level living in Birgunj
π Name ( σ Level = 'Bachelor' AND Address = 'Birgunj' (Student) )
5. How Under-the-Department-With Operation Works
== This refers to Division operation (÷)
in relational algebra.
Purpose :
Find entities related toall values in another relation.
Example : //
Employee(Eid, Dept) // RequiredDept(Dept)
Employee ÷ RequiredDept
Result gives employees working in all departments listed.
6. How Timestamp-Based Algorithm Serializes Concurrent Transactions
==
Timestamp-based algorithm assigns unique timestamps to transactions and executes them in timestamp order.
*//Each transaction gets a timestamp at start. *//Each data item keeps:
*//Read Timestamp (RTS) *//Write Timestamp (WTS) *//Read allowed if TS ≥ WTS
*//Write allowed if TS ≥ RTS and WTS *//Otherwise transaction is aborted and restarted.
Advantages :
Deadlock free *//No locking required *///Disadvantages :
Higher rollback rate *//Starvation possible
7. Types of Indexing and SQL Index Creation
== Types of Indexing :
Primary Index *//Secondary Index *// Dense Index *//Sparse Index *//Single-Level Index *//Multi-Level Index *//Clustered Index *//Non-Clustered Index
Creating Index using SQL :
CREATE INDEX idx_name
ON Student(Name);
Unique Index :
CREATE UNIQUE INDEX idx_roll
ON Student(RollNo);
1. Advantages of Normalization. Explain 3NF and BCNF with example
== Advantages of Normalization :
Reduces redundancy // Improves data integrity // Removes anomalies // Efficient storage //Easy maintenance
Third Normal Form (3NF) :
Must be in 2NF //No transitive dependency
Example
Employee(EmpID, DeptID, DeptName) *//DeptName depends on DeptID → violation
*//Split into: *//Employee(EmpID, DeptID) *//Department(DeptID, DeptName)
Boyce-Codd Normal Form (BCNF) :
Stronger than 3NF *//Every determinant must be a candidate key
Example :
Course(CourseID, Teacher, Room) *// Teacher → Room (Teacher not key) → violation *// Split tables to achieve BCNF.
3.How ER model can be mapped into relational model? Explain with proper example
==Mapping ER model into relational model means converting ER diagram components (entities, attributes, relationships) into tables (relations).
Steps to Map ER Model : 1. Mapping Strong Entity :
//Each strong entity becomes a table //Attributes become columns //Primary key remains primary key
Example : //
ER Entity: //Student(StudentID, Name, Address)
Relational Table:
Student(StudentID PK, Name, Address)
2. Mapping Weak Entity :
Weak entity becomes a table /*//Primary key = partial key + primary key of owner *// Example :
Dependent(EmpID, DepName, Age)
Dependent(EmpID PK/FK, DepName PK, Age)
3. Mapping One-to-One (1:1) Relationship
: Add foreign key in any one table : */Example :
Person ↔ Passport *// Person(PersonID PK, PassportNo FK)
4. Mapping One-to-Many (1:M) Relationship :
Add foreign key on many side
*// Example :
Department → Employee *//Employee(EmpID PK, DeptID FK)5. Mapping Many-to-Many (M:N) Relationship :
Create a new table
*//Primary key = combination of both entity keys //Example :
Student ↔ Course
Enroll(StudentID FK, CourseID FK)2. A) What is Inner Join?
== An INNER JOIN returns only those records that have matching values in both tables.
SQL Inner Join Query
SELECT C.Name//* FROM Customer C *//INNER JOIN Booking B ON C.Cid = B.Cid *//INNER JOIN Flight F ON B.FlightId = F.FlightId *//WHERE F.Destination = 'South Wales';
Customer joined with Booking using Cid *
//Booking joined with Flight using FlightId /*/
Only customers whose destination is South Wales are selected
7. ACID Properties and Serializability
==ACID Properties **//1. Atomicity :
Transaction executes completely or not at all. *// 2. Consistency :
Database remains in valid state. *//3. Isolation :
Transactions do not interfere. *// 4. Durability :
Committed data is permanent.
Serializability :
Serializability ensures that concurrent transactions produce the same result as serial execution. */*//Types of Serializability : 1. Conflict Serializability :
Transactions reordered without conflict *//Uses precedence graph
2. View Serializability : Same final result as serial schedule /*// More general than conflict serializability
Importance :
Maintains data consistency *// Ensures correctness in multi-user environment
6. Discretionary Access Control vs Mandatory Access Control
==Discretionary Access Control (DAC):
//Controlled by data owner //Owner can grant or revoke access *//High flexibility *//Moderate security //User identity and privileges //Owner decides access rights //Commercial DBMS, organizations
Mandatory Access Control (MAC):
Controlled by system policy *//Users cannot change permissions //Low flexibility //Very high security //Security labels and clearance levels //System enforces rules //Military, government systems
1. Define Database System. Write the Merits and Demerits of DBMS
==
A database system consists of:
Database (organized collection of data)
DBMS (software to manage data)
Users & Applications
It allows users to store, retrieve, update, and manage data efficiently.
Merits (Advantages) of DBMS : Reduced Data Redundancy
Same data is not stored multiple times. //Improved Data Consistency
Changes are reflected everywhere automatically. //Data Security
Access control, authentication, authorization. //Data Sharing
Multiple users can access data simultaneously. //Backup and Recovery
Data can be restored after system failure. //Data Integrity
Constraints ensure correctness of data. //Concurrency Control
Prevents conflicts during simultaneous access.
Demerits (Disadvantages) of DBMS : High Cost
Software, hardware, and maintenance cost. //Complexity
Requires trained DBA and skilled users. //Performance Overhead
Slower for small applications. //Single Point of Failure
DBMS failure affects all users.’
2. Three-Schema Architecture of Database System
== Thethree-schema architecture separates database design into three levels to achieve data abstraction and data independence.
1. External Schema (View Level) : //
User-specific views //Shows only required data //Improves security *///Example:
Student sees marks, admin sees full details.
2. Conceptual Schema (Logical Level) :
Describes entire database /
Tables, attributes, relationships //Independent of physical storage
3. Internal Schema (Physical Level) : How data is stored physically // File structure, indexes, storage location
Diagram (Text Form) :
User Views
↓
External Schema
↓
Conceptual Schema
↓
Internal Schema
↓
Physical Database
Advantages :
Data abstraction // Logical & physical data independence //Improved security
3. What are Entity and Attributes? Difference between Strong and Weak Entity Set
== An entity is a real-world object that has independent existence.
Example:
Student, Employee, Course *//Attributes :
Attributes are properties of an entity. *// Example:
Student → RollNo, Name, Address *//Strong Entity Set :
Has primary key //
Exists independently *//Example : Student(RollNo PK, Name, Address)
Weak Entity Set : //
Does not have primary key //Depends on strong entity //Uses partial key
Example:
Employee(EmpID PK)
Dependent(EmpID FK, DepName PartialKey)
Difference Table
4. Why Integrity Constraints are Used? Types with Examples
–Rules applied to database to maintain accuracy and consistency of data.
Types of Integrity Constraints
1. Domain Constraint :
Attribute values must be from a valid domain
Example : Age > 0
2. Entity Integrity Constraint :
Primary key cannot be NULL
Example : Student(RollNo PK) → RollNo cannot be NULL
3. Referential Integrity Constraint :
Foreign key must match primary key
Example :
Booking(Cid FK) must exist in Customer(Cid PK)
4. Key Constraint :
Ensures uniqueness of key values
Example:
RollNo must be unique
1. Define Database and DBMS. Explain the benefits of DBMS in detail
==A database is an organized collection of related data stored electronically so that it can be easily accessed, managed, and updated. **//A DBMS is software that allows users to create, store, retrieve, update, and manage data in a database.
Examples: MySQL, Oracle, SQL Server. *////Benefits of DBMS : Reduced Data Redundancy :
Data is stored centrally, avoiding duplication. *// Improved Data Consistency :
Same data appears consistently throughout the database. */// Data Security :
Access control, authentication, and authorization are provided. *// Data Sharing
: Multiple users can access data simultaneously. *///Backup and Recovery :
Automatic recovery after system failure. *///Data Integrity :
Constraints ensure valid and correct data. *// Concurrency Control :
Prevents conflicts during simultaneous access. **// Data Independence :
Changes in structure or storage do not affect applications.
2. SQL Constraints to Specify Rules for Data
==SQL constraints are rules enforced on data to maintain accuracy and integrity.
Types of SQL Constraints : NOT NULL :*//
Prevents NULL values *//Name VARCHAR(30) NOT NULL *//2.UNIQUE : *//
Ensures unique values *///Email VARCHAR(50) UNIQUE
*// 3.PRIMARY KEY :
Unique and NOT NULL */// RollNo INT PRIMARY KEY *//4.FOREIGN KEY :
Maintains referential integrity */// Cid INT REFERENCES Customer(Cid) *// 5.CHECK :
Validates condition */// Age CHECK (Age >= 18) /*//DEFAULT :
Assigns default value */// Country DEFAULT 'Nepal'
3. What is Relational Algebra? Explain RA operations with examples
==Relational Algebra is a procedural query language that uses mathematical operations to retrieve data from relations.
Relational Algebra Operations
1. Selection (σ)
– Selects rows *//σ Age > 20 (Student) *//2.Projection (π)
– Selects columns *//π Name, Address (Student) *//3.Union (∪)
– Combines relations *////////R ∪ S *//4.Set Difference (−) *//R − S *//5.Cartesian Product (×) : Student × Course *//Join (⨝) : Student ⨝ Enroll *//Division (÷) *//
Used for “for all” queries
4. Candidate Key vs Primary Key & Foreign Key
== Candidate Key :
A candidate key is a minimal set of attributes that can uniquely identify a record in a table. *// A table can have more than one candidate key. *// It cannot contain NULL values. *// All candidate keys are possible choices for the primary key. *// Example :
Student(RollNo, Email, Name) *// Candidate Keys:
RollNo, Email
Primary Key : *//
A primary key is one candidate key selected to uniquely identify each record in a table. *// A table can have only one primary key. *// It cannot be NULL and cannot have duplicate values. *//It is the main identifier of the table. *// Example :
Student(RollNo, Email, Name) *// Primary Key:
RollNo
Foreign Key :
A foreign key is an attribute (or set of attributes) in one table that refers to the primary key of another table.
It is used to establish and maintain relationships between tables. *// Example :
Borrower table: */// Borrower(CardNo PK, Name, Address)
5. What is a Stored Procedure? Use with Example
== A stored procedure is a precompiled set of SQL statements stored in the database and executed when called.
*//Uses :
Improves performance *//Reusable logic *//Enhances security
Example :
CREATE PROCEDURE GetStudents()
BEGIN
SELECT * FROM Student;
END;
Execution: CALL GetStudents();
6. Purposes of Concurrency Control & Two-Phase Locking Protocol
== *//Prevent lost updates *// Maintain consistency *//Avoid dirty reads *//Ensure isolation */// Support multi-user access
Two-Phase Locking (2PL) Protocol
Phase 1: Growing Phase :
Transaction acquires locks *//No locks are released *//Phase 2: Shrinking Phase :
Locks are released *// No new locks can be acquired
Advantages :
Ensures serializability *// Prevents inconsistency
Disadvantages :
Deadlock possible **/// Reduced concurrency
7. Database Security & Steps of Query Processing
== Database security protects data from unauthorized access, misuse, and threats.
Security Measures :
Authentication *// Authorization *// Encryption *// Access control
Steps of Query Processing : Parsing & Translation – SQL converted to internal form *// Query Optimization – Best execution plan chosen *// Execution – Query executed on database *// Result Output – Results returned to user
1. Fully and Partial Functional Dependency. Explain 2NF & 3NF
== Non-key attribute depends on part of composite key.
Example: (StudentID, CourseID) → StudentName
Fully Functional Dependency :
Non-key attribute depends on entire key.
Example: (StudentID, CourseID) → Grade
Second Normal Form (2NF) :
Must be in 1NF *// No partial dependency
Third Normal Form (3NF) :
Must be in 2NF *// No transitive dependency
Example: EmpID → DeptID → DeptName
2. What is Functional Dependency? Explain its Types with Examples
==
An attributeY is functionally dependent on X if X determines Y.
Notation: X → Y
Types of Functional Dependency
1. Trivial FD : If Y ⊆ X *// Example : {RollNo, Name} → Name
2. Non-Trivial FD : Y is not a subset of X //*
Example :
RollNo → Name
3. Fully Functional Dependency : Y depends on whole X *//Example : (StudentID, CourseID) → Grade
4. Partial Dependency : Y depends on part of composite key
Example :
(StudentID, CourseID) → StudentName
StudentID → StudentName
5. Transitive Dependency : X → Y and Y → Z *// Example : EmpID → DeptID DeptID → DeptName
2. Differentiate stored procedure and trigger. Why concurrency control needed? Describe timestamp protocol
Stored Procedure *//
Executed manually –Called by user or application –No specific event required –Can accept parameters –Can return values –Business logic, reuse code
Trigger: *//
Executed automatically –Fired by database event –INSERT, UPDATE, DELETE –Cannot accept parameters –Cannot return values –Enforce rules, auditing
Why Concurrency Control is Needed: Concurrency control is the mechanism used to manage simultaneous access of multiple transactions to the database without causing inconsistency.
*//
Prevents data inconsistency *//Avoids lost updates *//Maintains isolation *//Supports multiple users
Timestamp-Based Concurrency Protocol
The timestamp-based protocol is a concurrency control technique in which each transaction is assigned a unique timestamp, and transactions are executed based on their timestamps *//Assigns timestamp to each transaction **//Orders execution based on timestamps *//Older transaction gets priority *//Prevents conflicts without locking
