Mastering SQL Queries and Core Database Concepts
Fundamental SQL Retrieval Examples
Basic Selection and Filtering
List full details of all hotels
SELECT * FROM Hotel;List full details of all hotels in London
SELECT * FROM Hotel WHERE city = 'London';List names and addresses of all guests in London
Alphabetically ordered by name. Strictly speaking, using
LIKE '%London%'might also find rows with an address like: ’10 London Avenue, New York’.SELECT guestName, guestAddress FROM Guest WHERE address LIKE '%London%' ORDER BY guestName;List double or family rooms below £40.00 per night
Ordered in ascending order of price. (Note:
ASCis the default setting).SELECT * FROM Room WHERE price < 40 AND type IN ('D', 'F') ORDER BY price;List bookings with no specified dateTo
SELECT * FROM Booking WHERE dateTo IS NULL;
SQL Aggregate Functions
Count the total number of hotels
SELECT COUNT(*) FROM Hotel;Calculate the average price of a room
SELECT AVG(price) FROM Room;Calculate total revenue per night from all double rooms
SELECT SUM(price) FROM Room WHERE type = 'D';Count distinct guests who booked in August
SELECT COUNT(DISTINCT guestNo) FROM Booking WHERE (dateFrom <= DATE'2004-08-01' AND dateTo >= DATE'2004-08-01') OR (dateFrom >= DATE'2004-08-01' AND dateFrom <= DATE'2004-08-31');
Advanced SQL: Subqueries and Joins
Using Subqueries for Data Retrieval
List price and type of all rooms at the Grosvenor Hotel
SELECT price, type FROM Room WHERE hotelNo = (SELECT hotelNo FROM Hotel WHERE hotelName = 'Grosvenor Hotel');List all guests currently staying at the Grosvenor Hotel
SELECT * FROM Guest WHERE guestNo IN (SELECT guestNo FROM Booking WHERE dateFrom <= CURRENT_DATE AND dateTo >= CURRENT_DATE AND hotelNo = (SELECT hotelNo FROM Hotel WHERE hotelName = 'Grosvenor Hotel'));
Using Joins for Complex Retrieval
List room details at Grosvenor Hotel, including current guest name
Details include the name of the guest staying in the room, if the room is occupied. This uses a LEFT JOIN.
SELECT r.*, XXX.guestName FROM Room r LEFT JOIN (SELECT g.guestName, h.hotelNo, b.roomNo FROM Guest g, Booking b, Hotel h WHERE g.guestNo = b.guestNo AND b.hotelNo = h.hotelNo AND hotelName = 'Grosvenor Hotel' AND dateFrom <= CURRENT_DATE AND dateTo >= CURRENT_DATE) AS XXX ON r.hotelNo = XXX.hotelNo AND r.roomNo = XXX.roomNo;
Grouping and Counting
List the number of rooms in each hotel
SELECT hotelNo, COUNT(roomNo) AS count FROM Room GROUP BY hotelNo;List the number of rooms in each hotel in London
SELECT h.hotelNo, COUNT(r.roomNo) AS count FROM Room r, Hotel h WHERE r.hotelNo = h.hotelNo AND city = 'London' GROUP BY h.hotelNo;
SQL Examples for Accounts Payable (AP) Systems
Filtering and Distinct Values
List all distinct account numbers
SELECT DISTINCT account_number FROM ap.invoice_line_items;List vendor IDs with invoices greater than 2000
SELECT vendor_id FROM invoices WHERE invoice_total > 2000;List vendor IDs with invoice totals between 10000 and 30000
SELECT vendor_id FROM invoices WHERE invoice_total >= 10000 AND invoice_total <= 30000;Find all vendors located in NY, NJ, or PA
Two equivalent methods:
SELECT vendor_id, vendor_contact_last_name, vendor_contact_first_name, vendor_phone, vendor_state FROM ap.vendors WHERE vendor_state IN ('NY', 'NJ', 'PA');SELECT vendor_id, vendor_contact_last_name, vendor_contact_first_name, vendor_phone, vendor_state FROM ap.vendors WHERE vendor_state = 'NY' OR vendor_state = 'NJ' OR vendor_state = 'PA';Find vendors in NY, NJ, or PA, excluding New York and Philadelphia
SELECT vendor_id, vendor_contact_last_name, vendor_contact_first_name, vendor_phone, vendor_state, vendor_city FROM ap.vendors WHERE vendor_state IN ('NY', 'NJ', 'PA') AND vendor_city NOT IN ('new york','philadelphia');Find all vendors who use PO Box as their address
SELECT vendor_id, vendor_contact_last_name, vendor_contact_first_name, vendor_phone, vendor_address1 FROM ap.vendors WHERE vendor_address1 LIKE 'PO Box%';Find vendors without a phone number and using a PO Box address
SELECT vendor_id, vendor_contact_last_name, vendor_contact_first_name, vendor_phone, vendor_address1 FROM ap.vendors WHERE vendor_address1 LIKE 'PO Box%' AND vendor_phone IS NULL;
Advanced Aggregation and Grouping
Count vendors with invoices greater than 10000
SELECT COUNT(vendor_id) AS mycount FROM invoices WHERE invoice_total > 10000;Count distinct vendors with invoices greater than 10000
SELECT COUNT(DISTINCT vendor_id) AS mycount FROM invoices WHERE invoice_total > 10000;Find average, minimum, and maximum invoice total
SELECT MIN(invoice_total) AS mymin, MAX(invoice_total) AS mymax, AVG(invoice_total) AS myavg FROM invoices;Count vendors per state, ordered by count descending
SELECT vendor_state, COUNT(vendor_id) AS mycount FROM ap.vendors GROUP BY vendor_state ORDER BY mycount DESC;Identify the vendor with the highest total invoice sum
SELECT vendor_id, COUNT(invoice_id) AS mycount, SUM(invoice_total) AS mysum FROM ap.invoices GROUP BY vendor_id ORDER BY mysum DESC LIMIT 1;List states having at least three vendors (using HAVING)
SELECT vendor_state, COUNT(vendor_id) AS mycount FROM ap.vendors GROUP BY vendor_state HAVING COUNT(vendor_id) > 3 ORDER BY mycount DESC;
Core Database Management System Concepts
The Three-Level Database Architecture
Three distinct levels describe how data items can be viewed:
- External Level
- Conceptual Level
- Internal Level
Data Independence
There are two kinds of data independence: logical and physical.
Physical Data Independence
- Refers to immunity of the conceptual schema to changes in the internal schema.
- Internal schema changes (e.g., using different file organizations, storage structures/devices) should not require changes to conceptual or external schemas.
Logical Data Independence
- Refers to immunity of external schemas to changes in the conceptual schema.
- Conceptual schema changes (e.g., addition/removal of entities) should not require changes to the external schema or rewrites of application programs.
Objective of the Three Levels
- All users should be able to access the same data.
- Users should not need to know physical database storage details.
- A user’s view is immune to changes made in other views.
- The DBA should be able to change the conceptual structure of the database without affecting all users.
- The DBA should be able to change database storage structures without affecting the users’ views.
- The internal structure of the database should be unaffected by changes to physical aspects of storage.
Data Languages (DDL and DML)
Data Definition Language (DDL)
Allows the DBA or user to describe and name entities, attributes, and relationships required for the application, plus any associated integrity and security constraints.
Data Manipulation Language (DML)
Provides basic data manipulation operations on data held in the database.
- Procedural DML: Allows the user to tell the system exactly how to manipulate data.
- Non-Procedural DML: Allows the user to state what data is needed rather than how it is to be retrieved.
Data Models
A Data Model comprises:
- A structural part: Consisting of a set of rules according to which databases can be constructed.
- A manipulative part: Defining the types of operation that are allowed on the data.
- Possibly a set of integrity rules: Ensures data is accurate.
Purpose of Data Models
To represent data in an understandable way.
Categories of data models include:
- Object-based
- Record-based
- Physical
Functions of a DBMS
- Data Storage, Retrieval, and Update.
- A User-Accessible Catalog (metadata).
- Transaction Support (a series of changes by a single user; ensures all are made or none are made).
- Concurrency Control Services (ensures database updates correctly when multiple users are updating concurrently).
- Recovery Services (recovering damaged database).
- Authorization Services (only authorized users can access the database).
- Support for Data Communication (allows users to access the database over a network).
- Integrity Services (ensures the consistency and correctness of the data by setting constraints).
Relational Model Terminology
- Relation: A table with columns and rows. Has specific characteristics and applies only to the logical structure of the database, not the physical structure.
- Relational Database: A collection of normalized relations with distinct relation names.
- Attribute: A named column of a relation.
- Degree: The number of attributes in a relation.
- Domain: The set of allowable values for one or more attributes.
- Tuple: A row of a relation.
- Cardinality: The number of tuples in a relation.
Database Keys
- Superkey: An attribute, or set of attributes, that identifies tuples of a relation uniquely.
- Candidate Key: A minimal superkey.
- Primary Key: The candidate key chosen for use in identification of tuples. A relation must always have a primary key.
- Foreign Key: An attribute, or set of attributes, within one relation that is the candidate key of another relation.
Purpose of Views
- Provides powerful and flexible security mechanism by hiding parts of the database from certain users.
- Permits users to access data in a customized way, so that the same data can be seen by different users in different ways, at the same time.
- Can simplify complex operations on base relations.
SQL Language Structure and Syntax
SQL is a transform-oriented language with two major components:
- A DDL for defining database structure.
- A DML for retrieving and updating data.
SQL Command Examples
CREATE TABLE example
CREATE TABLE Staff(staffNo VARCHAR(5), lName VARCHAR(15), salary DECIMAL(7,2));INSERT INTO example
INSERT INTO Staff VALUES ('SG16', 'Brown', 8300);SELECT example with filtering
SELECT staffNo, lName, salary FROM Staff WHERE salary > 10000;
SQL Syntax Rules
SQL statements consist of reserved words and user-defined words.
- Reserved words: A fixed part of SQL. Must be spelled exactly as required and cannot be split across lines. (Often represented in UPPER-CASE).
- User-defined words: Made up by the user and represent names of various database objects such as relations, columns, and views. (Often represented in lower-case).
Syntax Notation Conventions
|indicates a choice among alternatives.- Curly braces
{}indicate a required element. - Square brackets
[]indicate an optional element. - Ellipsis
...indicates optional repetition (zero or more).
The Six Main SQL Clauses (Order of Execution)
These clauses define the structure of a standard SQL query:
- FROM: Specifies the table(s) to be used.
- WHERE: Filters rows based on a condition.
- GROUP BY: Forms groups of rows with the same column value.
- HAVING: Filters groups subject to some condition.
- SELECT: Specifies which columns are to appear in the output.
- ORDER BY: Specifies the order of the output.
Standard SQL Aggregate Functions
- COUNT: Returns the number of values in the specified column.
- SUM: Returns the sum of values in the specified column.
- AVG: Returns the average of values in the specified column.
- MIN: Returns the smallest value in the specified column.
- MAX: Returns the largest value in the specified column.
