Database Keys: Super, Candidate, Primary & 3NF Concepts

Super Key vs. Candidate Key Differences

Understanding the distinctions between Super Keys and Candidate Keys is crucial in database design. Below are their key characteristics:

Super Key Characteristics

  • A Super Key is an attribute (or set of attributes) used to uniquely identify all attributes in a relation.
  • Not all Super Keys can be Candidate Keys.
  • The collection of Super Keys helps establish the criteria for selecting Candidate Keys.
  • In a relation, the number of Super Keys is typically greater than the number of Candidate Keys.
  • Super Key attributes may contain NULL values (if the Super Key is not also a Candidate Key).

Candidate Key Characteristics

  • A Candidate Key is a subset of a Super Key; it is a minimal Super Key.
  • All Candidate Keys are Super Keys.
  • The collection of Candidate Keys helps establish the criteria for selecting the Primary Key.
  • In a relation, the number of Candidate Keys is less than the number of Super Keys.
  • Candidate Key attributes cannot contain NULL values.

Primary Key vs. Candidate Key Explained

The relationship and differences between Primary Keys and Candidate Keys are fundamental to relational database structure:

Primary Key Characteristics

  • A Primary Key is a minimal Super Key. There is one and only one Primary Key in a relation.
  • No attribute of a Primary Key can contain a NULL value.
  • The explicit specification of a Primary Key for a relation, while fundamental for integrity, might be considered optional in some Database Management System (DBMS) implementations, though it is strongly advised.
  • The Primary Key identifies the most important attribute (or set of attributes) for uniquely identifying records in the relation.
  • It is confirmed that a Primary Key is always a Candidate Key.

Candidate Key Characteristics (in relation to Primary Key)

  • A relation can have more than one Candidate Key.
  • Candidate Key attributes cannot contain NULL values.
  • However, a relation inherently possesses Candidate Keys, and one must be designated (implicitly or explicitly) for the relation to be well-defined and to select a Primary Key.
  • A Candidate Key specifies an attribute (or set of attributes) that qualifies to be a Primary Key.
  • However, while any Candidate Key could be chosen as the Primary Key, only one is selected. Not every Candidate Key will become the Primary Key if multiple exist.

Transitive Dependencies and Third Normal Form (3NF)

A transitive dependency in a database occurs when a non-prime attribute (an attribute not part of any Candidate Key) is functionally dependent on another non-prime attribute, which in turn is functionally dependent on the Primary Key. In simpler terms, it means that if A → B and B → C, then A → C represents a transitive dependency, where A is the Primary Key, and B and C are non-prime attributes, and B is not itself a candidate key.

In the context of Third Normal Form (3NF), transitive dependencies are not permitted. To achieve 3NF, all non-prime attributes must be directly dependent on the entire Primary Key, without any transitive dependencies.

Example of Transitive Dependency

Consider a table with columns: EmployeeID (Primary Key), DepartmentID, and ManagerName. Here, DepartmentID is functionally dependent on EmployeeID (as an employee belongs to a department), and ManagerName is functionally dependent on DepartmentID (as each department has a manager). However, ManagerName is indirectly dependent on EmployeeID through DepartmentID, creating a transitive dependency. To normalize this table into 3NF, you would split it into two tables: Employee (EmployeeID, DepartmentID) and Department (DepartmentID, ManagerName), thereby removing the transitive dependency.

Understanding Second Normal Form (2NF)

A table in Second Normal Form (2NF) has the following characteristics:

  • It is already in First Normal Form (1NF), meaning it has no repeating groups and all attribute values are atomic.
  • All non-prime attributes are fully functionally dependent on the entire Primary Key.
  • In other words, every non-prime attribute must depend on the whole Primary Key, not just a part of it (if the Primary Key is composite).
  • This means there are no partial dependencies, where a non-prime attribute is dependent on only a portion of a composite Primary Key.