Database Management: Keys, Relationships, and Normalization
Database Keys and Relationships
A foreign key is the primary key of one table that has been placed into another table to create a common attribute.
In the relational model, keys are important because they are used to ensure that each row in a table is uniquely identifiable.
- A superkey is any key that uniquely identifies each row.
- A candidate key can be described as a minimal superkey, a superkey without any unnecessary attributes.
- A secondary key is defined as a key that is used strictly for data retrieval purposes.
Referential integrity dictates that the foreign key must contain values that match the primary key in the related table or must contain null.
For example, the CUSTOMER table’s primary key is CUS_CODE. The CUSTOMER primary key column has no null entries, and all entries are unique. This is an example of entity integrity.
Entity-Relationship Modeling and Attributes
The UML notation of entity-relationship modeling can be used for both conceptual and implementation modeling. The entity-relationship diagram (ERD) represents the conceptual database as viewed by the end user.
A domain is the set of possible values for a given attribute.
- Ideally, an entity identifier is composed of one attribute.
- A composite attribute can be further subdivided to yield additional attributes.
- A simple attribute is one that cannot be subdivided.
- A derived attribute is indicated in the Chen notation by a dashed line that connects the attribute and an entity. The decision to store derived attributes in database tables depends on the processing requirements and the constraints placed on a particular system.
A relationship is an association between entities.
Database Normalization
Normalization works through a series of stages called normal forms. For most purposes in business database design, three stages are as high as you need to go in the normalization process.
- A table that has all key attributes defined, has no repeating groups, and all its attributes are dependent on the primary key is said to be in 1NF (First Normal Form).
- Some very specialized applications may require normalization beyond the 4NF (Fourth Normal Form).
- Of the following normal forms, DKNF (Domain-Key Normal Form) is mostly of theoretical interest.
Predicate logic, used extensively in mathematics, provides a framework in which an assertion (statement of fact) can be verified as either true or false.
Each table column represents an attribute.
Date attributes contain calendar dates stored in a special format known as the Julian date format.
- Attribute A determines attribute B if all of the rows in the table that agree in value for attribute A also agree in value for attribute B.
- A repeating group derives its name from the fact that a collection of multiple entries of the same type can exist for any single key attribute occurrence.
- A transitive dependency exists when there are functional dependencies such that Y is functionally dependent on X, Z is functionally dependent on Y, and X is the primary key.
- An attribute that is part of a key is known as a prime attribute.
SQL Queries and Data Manipulation
A SELECT query specifies which data should be retrieved and how it should be filtered, aggregated, and displayed.
- The HAVING command restricts the selection of grouped rows based on a condition.
- The query used to list unique values for V_CODE, where the list will produce only a list of those values that are different from one another, is:
SELECT DISTINCT V_CODE FROM PRODUCT;
- A natural join will select only the rows with matching values in the common attribute(s).
- According to the rules of precedence, operations within parentheses should be completed first.
- The DEFAULT command defines a default value for a column when no value is given.
- When using an inner join, only rows from the tables that match on a common value are returned.
- An alias is an alternate name given to a column or table in any SQL statement.
- The SQL command that allows a user to permanently save data changes is COMMIT.
Specialization Hierarchies in Database Design
A specialization hierarchy can have many levels of supertype/subtype relationships.
- The default comparison condition for the subtype discriminator attribute is the equality comparison.
- The property of inheritance enables an entity subtype to inherit the attributes and relationships of the supertype.
- The specialization hierarchy depicts the arrangement of higher-level entity supertypes (parent entities) and lower-level entity subtypes (child entities).
- Within a specialization hierarchy, every subtype can have only one supertype to which it is directly related.
- The extended entity-relationship model (EERM) is sometimes referred to as the enhanced entity relationship model.
- A subtype discriminator is the attribute in the supertype entity that determines to which entity subtype each supertype occurrence is related.
- At the implementation level, the supertype and its subtype(s) depicted in a specialization hierarchy maintain a 1:1 relationship.
- One important inheritance characteristic is that all entity subtypes inherit their primary key attribute from their supertype.
- An entity supertype is a generic entity type that is related to one or more entity subtypes.