Understanding Database Concepts: A Comprehensive Guide

Database Concepts

Relation: Table (e.g., Pets table)

Attribute: Column (e.g., Name)

Domain: Set of possible values (e.g., Age: 0-20)

Tuple: Row (e.g., (1, 'Buddy', 'Dog', 3))

Degree: Number of attributes (e.g., 4 in Pets)

Cardinality: Number of tuples (e.g., 10 rows in Pets)

Candidate Key: Unique identifier (e.g., CourseID)

Primary Key: Selected unique identifier (e.g., CourseID as primary key)

Foreign Key: Reference to primary key in another table (e.g., OwnerID in Pets)

Domain Constraint: Valid value range (e.g., Age >= 0)

Entity Integrity: Primary key cannot be NULL

Referential Integrity: Foreign key must match primary key or be NULL

NULL: Missing or unknown data

Views: Virtual tables with custom views for security, customization, and simplification

Primary Key = unique + not null.

Foreign Key = links tables, referential integrity, duplicates, links to PK in different tables.

Composite Key = multiple columns, combine for uniqueness.

Unique Key = ensures uniqueness, can be null.

Candidate Key = potential primary keys, unique identifiers.

Alternate Key = candidate keys that are not primary keys.

Surrogate Key = auto-generated/incrementing, no meaning.

Superkey = set of all columns that can uniquely identify a record.

Integrity Constraints

Null: Represents unknown or not applicable value/Different from zero or blank spaces.

Entity Integrity: Primary key attributes cannot be Null.

Referential Integrity: Foreign Key must match a Candidate Key in its home relation or be wholly Null.

General Constraints: Additional rules set by users/admins to define or constrain database aspects.

Views

Base Relation: Named relation corresponding to an entity in the conceptual schema, physically stored in the database.

View: Dynamic result of operations on base relations. Produced on request, at query time. Defined by a query on one or more base relations.

Purpose of Views

Definition: Virtual tables created by queries on one or more base tables.

  • Security: Restrict access to specific data.
  • Customization: Present data differently for various users.
  • Simplicity: Simplify complex queries.

COMP 3611: Create a view to show ORDER with product prices:

CREATE VIEW OrderDetails AS
SELECT ORDER.ORD_ID, PRODUCT.PROD_NAME, PRODUCT.PRICE
FROM ORDER
JOIN ORDER_PRODUCT ON ORDER.ORD_ID = ORDER_PRODUCT.ORD_ID
JOIN PRODUCT ON ORDER_PRODUCT.PROD_ID = PRODUCT.PROD_ID;

Erehwon Clinic: Create a view to list all pets by client:

CREATE VIEW ClientPets AS
SELECT Client.Name, Pet.Name
FROM Client
JOIN Pet ON Client.ClientID = Pet.ClientID;

SQL

salary DECIMAL(7,2) CHECK (salary BETWEEN 6000.00 AND 40000.00)
CREATE TABLE Employees (
    ID INT PRIMARY KEY,
    Salary DECIMAL(10, 2) CHECK (Salary BETWEEN 30000 AND 200000)
);
CREATE TABLE Employees (
    EmployeeID INT PRIMARY KEY, -- This creates a primary index
    Name VARCHAR(100),
    DepartmentID INT
); // Database index

Write a query to list all courses with titles + credits:

SELECT COURSE_CODE,COURSE_TITLE,CREDITS FROM COURSE;

Total Credits for all courses:

SELECT SUM(CREDITS) AS TOTAL_CREDITS FROM COURSE;

Write an SQL SUB query to find the course with highest number of credits:

SELECT COURSE_CODE,COURSE_TITLE,CREDITS FROM COURSE WHERE CREDITS = (SELECT MAX (CREDITS) FROM COURSE);

Insert new course into table:

INSERT INTO COURSE (COURSE_CODE, COURSE_TITLE, CREDITS) VALUES('NEW101', 'New Course Title', 3);

UPDATE credits for a course:

UPDATE course SET CREDITS = 4 WHERE COURSE_CODE ='EXIST101';

DELETE course from table:

DELETE FROM COURSE WHERE COURSE_CODE = 'OLD101';

INNER JOIN CLIENT & PET TABLES to combine rows from two or more tables// return only row that match in both tables same ClientID in both filter rows that dont match:

SELECT c.ClientID, c.FirstName, C.LastName, p.PetNbr, p.PetName, P.PetType
FROM CLIENT c
INNER JOIN Pet p on c.ClientID = p.ClientID;

LEFT OUTER JOIN: combine rows from 2 or more tables return rows from Client left table even if no match in right table for the pet puts NULL for no match:

SELECT C.ClientID, C.FirstName, C.LastName, P.PetNbr, P.PetName, P.PetType
FROM CLIENT C
LEFT OUTER JOIN PET P ON C.ClientID = P.ClientID;

RIGHT OUTER JOIN combine rows from 2 tables return rows from right pet table even if no match in left table client:

SELECT C.ClientID, C.FirstName, C.LastName, P.PetNbr, P.PetName, P.PetType
FROM CLIENT C
RIGHT OUTER JOIN PET P ON C.ClientID = P.ClientID;

FULL OUT JOIN combine rows from 2 or more tables return all rows from both even if no match all nulls:

SELECT C.ClientID, C.FirstName, C.LastName, P.PetNbr, P.PetName, P.PetType
FROM CLIENT C
FULL OUTER JOIN PET P ON C.ClientID = P.ClientID;

Subquery to Find all clients who have a pet of type cat:

SELECT C.ClientID, C.FirstName, C.LastName
FROM CLIENT C
WHERE C.ClientID IN (SELECT P.ClientID FROM PET P WHERE P.PetType = 'Cat');

Not In Subquery check if values does not exist Clients who dont have a dog:

SELECT C.ClientID, C.FirstName, C.LastName
FROM CLIENT C
WHERE C.ClientID NOT IN (SELECT P.ClientID FROM PET P WHERE P.PetType = 'Dog');

Correlated Subquery for aggregate op on and use result in outer query find all clients who have more than 2 pets:

Subquery with Join find all clients who have a pet with higher pet number than aver pet number for pet type:

SELECT C.ClientID, C.FirstName, C.LastName
FROM CLIENT C
WHERE C.ClientID IN (
    SELECT P.ClientID
    FROM PET P
    INNER JOIN (
        SELECT PetType, AVG(PetNbr) AS AvgPetNbr
        FROM PET
        GROUP BY PetType
    ) AS AvgPetNbr ON P.PetType = AvgPetNbr.PetType AND P.PetNbr > AvgPetNbr.AvgPetNbr
);

Subquery with Aggregate fn and use reult in outer query find all clients who have a total pet number > than average total pet number for all clients:

SELECT C.ClientID, C.FirstName, C.LastName
FROM CLIENT C
WHERE C.ClientID IN (
    SELECT P.ClientID
    FROM PET P
    GROUP BY P.ClientID
    HAVING SUM(P.PetNbr) > (SELECT AVG(SUM(PetNbr)) FROM PET GROUP BY ClientID)
);

CREATE TRIGGER FOR CLIENT ID: TRIGGER_CLIENT_ID_SEQ: generates a unique ClientID sequence number for each new client inserted.

CREATE TRIGGER TRIGGER_CLIENT_ID_SEQ
BEFORE INSERT ON CLIENT
FOR EACH ROW
BEGIN
    SELECT CLIENT_ID_SEQ.NEXTVAL INTO :NEW.ClientID FROM DUAL;
END;

TRIGGER_CLIENT_NAME_CHECK: checks if the Name column is not null and has a minimum length of 2 characters.

CREATE TRIGGER TRIGGER_CLIENT_NAME_CHECK
BEFORE INSERT OR UPDATE ON CLIENT
FOR EACH ROW
BEGIN
    IF :NEW.Name IS NULL OR LENGTH(:NEW.Name) < 2 THEN
        -- Raise an error or handle the situation accordingly
    END IF;
END;

TRIGGER_PET_ID_SEQ: generates a unique PetNbr sequence number for each new pet inserted, starting from 1 for each client.

CREATE TRIGGER TRIGGER_PET_ID_SEQ
BEFORE INSERT ON PET
FOR EACH ROW
BEGIN
    SELECT PET_ID_SEQ.NEXTVAL INTO :NEW.PetNbr FROM DUAL;
    :NEW.PetNbr := :NEW.PetNbr + (SELECT MAX(PetNbr) FROM PET WHERE ClientID = :NEW.ClientID);
END;

CLIENT_ID_SEQ: generates a unique sequence number for each new client inserted.:

CREATE SEQUENCE CLIENT_ID_SEQ START WITH 1 INCREMENT BY 1 NOCACHE;

DATABASE ANALYSIS & DESIGN

Software crisis/depression = high main costs failure include: lack of req specification, lack of development methods, poor decomposition of design gave way to structured approach to development called the Information System Lifecycle (ISLC). Database = component view at wider perspective

Database System Development Lifecycle (DSDLC):

  1. DB planning
  2. System definition
  3. Requirement collection + analysis
  4. DB design
  5. DBMS selection (optional)
  6. Application design
  7. Prototyping (optional)
  8. Implementation
  9. Data conversion + loading
  10. Testing
  11. Operational maintenance
  1. DB planning: Management activities, SDLC effect/efficiency/integrated with IS strategy/mission statement define aims/clear path/mission objectives per task/standards for collecting data/formatting/documentation/design+implementation. SCOPE/OBJECTIVES/REQUIREMENTS
  2. System Definition: Scope + boundaries of DB system/user view/user view defines requirements of the database for management supervision/one or more user views/don’t leave out users/breakdown requirements using views. Pet manager SYS ENVIR/BOUND
  3. Requirement Collection + analysis: Information about part of organization uses DB use 4 requirements of new system. FACT FINDING gather information for view, description of data use or generation/details of how data is used or generated/additional requirements for new database, information used to identify requirements/how to manage requirements for DB with many views = 3 types: 1. centralized approach/2. view integration approach./combination of approaches.
    • Centralized approach: Requirements for user views merged into a single set of requirements/data model is created representing all user views during the DB design stage.
    • View integration: Requirements for each user are separate lists data models for each view created and merged later in DB design/data model for a single user view or subset of all views = local data model. Each model includes diagrams + documentation describing requirements but not all user views. Local data models merged later in design to produce global data model that has all views.
  4. DB Design: Create design support mission statement+objectives+requirements. User + system requirements
    • Approaches 4 types:
      1. Top Down
      2. Bottom up
      3. Inside out
      4. Mixed
    • Main purpose of data modeling to understand meaning/semantics of data, facilitate communication about requirements, build data model answer questions: entities, relations, attributes.
    • Data model used to understand 3 things:
      1. User perspective of data
      2. Nature of data independent of physical representation
      3. Use of data across views
    • Criteria for optimal data model: Structural validity, simplicity, expressibility, non-redundancy, shareability, extensibility, integrity, diagrammatic representation.
    • 3 DB Design Phases:
      1. Conceptual
      2. Logical
      3. Physical
    • Conceptual DB: Construct model of data independent of physical considerations/data model built using information in user requirements specification/source of logical information.
    • Logical DB Design: Construct model of data model e.g. relational/independent of DBMS and physical considerations/Refined conceptual model mapped onto the logical model.
    • Physical DB design: Produce a description of DB implementation on secondary storage: describes base relations, file organization, indexes to access to data, integrity+ security steps, specific to DBMS.
    • Logical/Conceptual DB design: External schema + conceptual schema.
    • Physical database: Internal schema/physical storage.
  5. DBMS Selection: Select DBMS to support the system done BEFORE LOGICAL DESIGN if sufficient information about system requirements.
    • STEPS for DBMS SELECTION:
      1. Define terms of reference of study
      2. Shortlist 2-3 products
      3. Evaluate products
      4. Recommended selection/produce report. DB Design types: 1. Bottom up: fundamental attribute/entities/relations/ grouped e.g. normalization. Simple DB with a small number of attributes. Not for complex. TOP DOWN for complex systems starts with data models with a few high-level entities and relations and top-down requirements to identify low-level entities use ER MODEL. INSIDE OUT same as bottom-up but identify major entities set + branch out. Mixed strategy bottom-up plus top-down before combining
    • DBMS select factors: Scalability, compatibility, support, security cost. Open/source/Oracle enter/functionality
  6. Application Design Phase: Design user interface and application programs that use/process DB, user GUI, etc.
  7. Prototyping (optional): Building a working data model of the DB system allows users / designers to visualize / final system 1. requirements prototype (discarded) 2. evolutionary prototype (included in the system)
  8. Implementation: Create physical database definitions and application programs. Use DDL
  9. Data Conversion + Loading: Loading data from the old system and when possible convert old existing applications to run on the new DB.
  10. Testing: Tested for faults + errors. Validated against user requirements. Learnability, robust/performance/recover/adapt
  11. Operational Maintenance: System fully implemented. Continuously monitored + maintained new requirements incorporation through SDLC. 5 fact-finding = examining documentation/interviewing/observing enterprise in operation/research/questionnaires

Benefits of CASE:

  1. Automation: simplifies design + code generation
  2. Consistency: uniform design standards Oracle data modeler.

Phases of DB DESIGN:

  1. Conceptual: ABSTRACT represent ERD.
  2. Logical: Convert to relational schema
  3. Physical: Define storage structures indexes.

DB NORMALIZATION: Identify required attributes/ + aggregation into relations into normalized relations based on functional dependencies. Identify keys; Candidate keys; identify possible keys for unique id. PRIMARY KEY: selected unique identifier. Alt Keys candidate keys not chosen as the primary key

Multivalued + Composite Attributes: Multivalued = more than one e.g. phone numbers. Composite: multiple sub-attributes (address: street number, street name, city, zip

DB NORMALIZATION:

  • 1NF: Eliminate repeating groups ensure atomic values.
  • 2NF: Remove partial dependencies (for composite keys).
  • 3NF: Remove transitive dependents (non-key attribute dependent on other non-key attributes).
  • BCNF: Further refine to remove remaining anomalies.

Functional Dependencies: Relationship where one attribute determines another. E.g. StudentId -> StudentName.

Entity-Relationship Model (ERM). Such a model is usually represented as an Entity-Relationship Diagram (ERD).

Enhanced Entity-Relationship Model (EERM) is introduced as a way of representing additional semantic concepts. These include: specialization/generalization, aggregation, and composition.


Specify and use fact-finding techniques for the DSDLC.

Fact-Finding Techniques: These techniques help gather necessary information to design the database system.

  • Interviews: Interview the clinic’s staff to understand what information they need about clients and pets.
  • Questionnaires: Distribute questionnaires to all employees to gather input on what features they need in the new system.
  • Document Analysis: Review existing paper records of clients and pets to understand what data is currently being tracked.
  • Observation: Observe how the staff interacts with the current system or manual processes to identify inefficiencies.
  • Prototyping: Create a prototype of the database to get feedback from users on whether it meets their needs. Method for deciding central or view integration approach or combo = examine overlap in data between user views.


Missions Statement for DB: The purpose of the DB is to support the veterinary clinic and maintain and manage data between __ and the __ and facilitate communication between ____. Mission objectives: to maintain data on xyz, to perform searches on xyz. User requirements = detail of data being held and how it’s used. System requirements: features.

DATA MODELS: ER model: Entities = Client and Pet. Attributes Client=ClientId, Name, Deductible. Pet: PetNbr,PetName,PetType,ClientID(FK). Relationships: 1:M Clients can have one to many pets. M:1 A pet belongs to one client many to one. CUSTOMER: Customers, Products, Orders, Order_product. Order_Product ord_id (fk), and product_id(fk)

EER DIAGRAMS: Enhanced entity relationship diagrams used to overcome limitations of ER diagrams. More semantic concepts for complex systems that need specialization/generalization/inheritance/aggregation and composition. Superclass: entity type of one or more distinct subclass e.g. STAFF. Subclass are distinct subgroups of entity types that can be found in the data model. E.g. types of staff which are a subclass of the superclass allows subclasses to possess all attributes of superclass avoid describing similar concepts more than once and save time and improve readability. Entity and subclass = a type hierarchy type hierarchies are known by different names like specialization hierarchy or generalization hierarchy. A shared subclass has more than one superclass which is multiple inheritance. Specialization process maximizes the difference between members of an entity by different features top-down approach to defining a set of superclasses and related subclasses. Generalization process= minimize differences in entities with common features. Reverse specialization. Types of constraints for Specialization/Generalization: 1. Participation constraint: whether every member in superclass must participate as a member of subclass (mandatory or optional {Optional, OR}. e.g. types of staff don’t have to have a job role. Disjoint constraints: whether superclass has more than one subclass, disjoint subclasses can use OR next to participation constraint. If a subclass generation not disjoint = non-disjoint

Aggregation: represents a HAS A or IS PART OF relationship between entity and type one is whole and other is part. Composition: a type of aggregation that is an association between entities and strong ownership between whole and part

LmscWAMIJuAAAAAElFTkSuQmCC