Database Design
Day 1
Data
- Raw, unorganized facts
- (Unformatted Data)
Information
- Organized data with context
- (Formatted Data)
Metadata
- DATA that describes your DATA
- The “lens” we use to understand what data means
- Describes properties of data so we can infer information
Day 2
“Old” Way Process of Filing Systems
- Data stored in files managed by the application
- EX:
- STUDENT Files <-> Student Processing Applications <-> Users
- FACULTY and STAFF Files <-> Faculty and Staff Processing Applications <-> Users
- ALUMNI Files <-> Alumni Processing Applications <-> Users
3 Limitations of File-Processing System
- Lack of Data Integrity: Values may be incorrect, inconsistent, or out of data in isolated systems
- Lack of Standards: It may be difficult to keep data in the same format. Difficult to update data when standards change
- Lack of Flexibility/Maintainability: Dependent on a programmer to modify the data structures
Day 3
2 Fundamental Problems that lead to Limitations
- Lack of Integration: Data are stored in separate, isolated files within the file system
- Lack of Program-Data Independence: The structure of the data (i.e., Metadata) is embedded in the application
Data Integration
- When we separate the DATA from the APPLICATION, great things happen:
- Many applications can use the same data
- Data can be updated independent of the application
- Data can be modified without knowledge of programming
Integrated Data Process of Filing Systems
- Database <-> Database Management System <->
(THEN EACH ARE SEPARATE BUT ALL CONNECTED DO DMS)
– STUDENT Files <-> Student Processing Applications <-> Users
– FACULTY and STAFF Files <-> Faculty and Staff Processing Applications <-> Users
– ALUMNI Files <-> Alumni Processing Applications <-> Users
The real world
- Applications used to be installed directly on your computer – now applications are often “in the cloud”
Day 4
ANSI/SPARC 3-Schema Architecture
- Contains External Schema, Conceptual Schema, Internal Schema
Schema
- A description of your metadata
- (Data that describes the data that describes your data …)
- The schema is a map that shows how things are related
- What individual pieces of data make up a larger data element
- How data is related to other data – EX: how students, courses, instructors and classrooms are related
- Mapping where data is logically stored to where it is physically stored
- Most of your time will be likely be spent thinking about conceptual and external schemas
ANSI/SPARC 3-Schema Architecture (3 Perspectives of Metadata)
- (1)Individual User Views: External Schema
- (2)Global View: Conceptual Schema
- (3)Storage View: Internal Schema
Then Last “Level” is Stored Database
VERY NICE
Day 5
Internal Schema
- The nuts and bolts of the database
- Describes the physical organization of the stored data (e.g., how the data is actually laid out on storage devices)
- Describes the mechanism used to implement access strategies (e.g. indexes, hashed addresses, etc.)
- Concerned with the efficiency of data storage and access mechanisms
- Technology Dependent
- Depends on the hardware/software you are using
- Are you using Oracle, MSSQL, MySQL, etc?
- What type of server/storage/network configuration do you have?
Conceptual Schema
- Core of the architecture – how is your data logically organized?
- Represents of the global view of the structure of the entire database for a community of users
- Describes all data items and relationships between data together with integrity constraints
- Separates data from the program (or views from the physical storage structure)
- Technology Independent
- Just describes your data – will be the same regardless of the technology you use
External Schema
- Each application (or user) has different data they care about
- Other data should not be exposed for the sale of simplicity & security
- A particular application’s “view” of the data
- The external schema creates “views” of the data so the applications can remain unaware of the underlying conceptual schema
Data Independence
- When a schema at a lower level is changed, only the mapping information (managed by DBMS) between this schema and higher-level schemas need to be changed
- The higher-level schemas themselves are unchanged
- Hence, the application programs need not be changed since they refer to external schemas
- External views are unaffected by changes to the internal structure because of the conceptual schema between the external views and the internal schema
- External views are tailored for and accessible to application programs – the conceptual schema is not directly accessible by application program(s).
Physical Data Independence
- The capacity to change the internal schema without having to change the conceptual schema
- EX: The internal schema may be changed when certain file structures are reorganized or new indexes are created to improve database performance
Logical Data Independence
- Definition: External views unaffected by design changes (growth or restructuring) in conceptual schema
- How? External views generated exclusively through logical reference to elements in the conceptual schema
- Consequence: External views unaffected by changes to other external views
Day 6
“Old” File System Process: 2-Schema Architecture
- (1)Programmers’ Views: External Schema
- (2)Storage View: Internal Schema
- Then Last “Level” is Stored Database
- OLD AND BUSTED
Database
- An integrated set of files
- We still use files – but the DBMS is a system for managing the files and data contained within
- A database is a collection of files whose records are logically related to one another. In contrast with that of a file-processing system, integration of data as needed is the responsibility of the DBMS software instead of the programmer
- A database is self-describing in that the metadata is recorded within the database (i.e.,
the schemas), not in application programs.
Database System
- A system, generally, a set of interrelated components working together for some purpose
- Database systems were created to overcome the limitations of the old “file system” way of doing things
- Includes data and metadata
- Data consists of recorded facts that have implicit meaning
- Viewed through the lens of metadata, the meaning of recorded data becomes explicit
Components of a DBMS
- A data dictionary: The metadata about your data
- One or more query languages (i.e., SQL)
- A data manipulation language (SQL, PL/SQL) for accessing the database
- A data definition
language (SQL) to define the structure of data - Tools for generating reports
- DBMS Utilities: User security, importing data, data conversion, backup/restore, performance monitoring, reorganizing/indexing data
Model
- Simplified expression of observed or unobservable reality used to perceive relationships in the outside world
- A model is an approximation & entails assumptions
- A blue print for designing databases
- All models are wrong, but some are useful
- IF a model was perfectly correct, it would be the real thing!
Data Modeling Stages
- Conceptual modeling:
– Product: Conceptual Schema - Logical model/design:
– Product: Logical schema - Physical design:
Product: Physical/Internal schema
Day 7
Entity-Relationship (ER) Model
ER modeling grammar obeys the properties of a semantic data modeling technique:
– Expressiveness, Simplicity, Minimality, Unique interpretation, Formality
Purpose:
– Communication/presentation device used by an analyst to interact with the
end-user community
– A design tool at the highest level of abstraction to convey a deeper level understanding to the database designer
An ER diagram that portrays entity types, attributes, and relationships among entity types
– Expressiveness, Simplicity, Minimality, Unique interpretation, Formality
Purpose:
– Communication/presentation device used by an analyst to interact with the
end-user community
– A design tool at the highest level of abstraction to convey a deeper level understanding to the database designer
An ER diagram that portrays entity types, attributes, and relationships among entity types
Business Rule
A statement of a specific condition or procedure relevant to the universe of interest (application domain) being modeled
Business rules may be explicitly stated, but are often implied in the requirements specification and must be inferred
– Problem: People often don’t mean exactly what they explicitly say
– Problem: When you infer things you may get them wrong
The process of developing business rules from the requirements specification is not quite scientific, but it can be systematic.
– Go through the spec step-by-step
– An iterative process
Systematic analysis will also facilitate identification of ambiguities which, when clarified by the user community, will yield additional business rules and also facilitate correction of other business rules
Business rules might be defined in the application, the database, or both!
Business rules may be explicitly stated, but are often implied in the requirements specification and must be inferred
– Problem: People often don’t mean exactly what they explicitly say
– Problem: When you infer things you may get them wrong
The process of developing business rules from the requirements specification is not quite scientific, but it can be systematic.
– Go through the spec step-by-step
– An iterative process
Systematic analysis will also facilitate identification of ambiguities which, when clarified by the user community, will yield additional business rules and also facilitate correction of other business rules
Business rules might be defined in the application, the database, or both!
Data Dictionary
Metadata describing the Data in the tables
Entity Type
– Conceptual representation of an object type
– A set of related attributes
– Has relationship(s) with other entity types
– A set of related attributes
– Has relationship(s) with other entity types
Entity Instance
An occurrence of an entity type
Entity Class
A set of entity types that have shared properties
2 Types of Entities
Strong: always exists
Weak: only exists when based on a strong entity
(Represented by a double box)
Weak: only exists when based on a strong entity
(Represented by a double box)
Attribute Type
Examples include …..
– Numeric: ONLY numbers – useful for doing mathematical operations
– Alphabetic: ONLY letters – Joe Smith
– Alphanumeric: Letters and numbers – 55 West Main Street
– Date/Time: 2:14 AM, 8/29/1997
– Numeric: ONLY numbers – useful for doing mathematical operations
– Alphabetic: ONLY letters – Joe Smith
– Alphanumeric: Letters and numbers – 55 West Main Street
– Date/Time: 2:14 AM, 8/29/1997
2 Types of Attributes Classification
Composite: Can be meaningfully divided into smaller attributes
Atomic: Just like an atom, it cannot be divided no further, an atomic attribute can be (meaningfully) divided no further
Atomic: Just like an atom, it cannot be divided no further, an atomic attribute can be (meaningfully) divided no further
2 Types of Attributes Category
Single: EX: An employee has one employee ID, one first name, one last name, etc…..
Multi-value: EX: An employee may have multiple skills; Database design, C# Programming, Java Programming, Basket weaving
Multi-value: EX: An employee may have multiple skills; Database design, C# Programming, Java Programming, Basket weaving
2 Types of Attribute Source
Stored: Can be stored; things that don’t generally change, such as name
Derived: Can be derived; things that change with time, such as years of service
– Derived by subtracting start date from today’s date
Derived: Can be derived; things that change with time, such as years of service
– Derived by subtracting start date from today’s date
2 Types of Attribute Domain
Explicit Domain Constraints:
– Sex: [M, F]
– Student_type: [Fr, So, Jr, Sr, Gr]
Implicit Domain Constraints:
– Age: [1-120]
– Salary: [17,000-3,000,000]
– Sex: [M, F]
– Student_type: [Fr, So, Jr, Sr, Gr]
Implicit Domain Constraints:
– Age: [1-120]
– Salary: [17,000-3,000,000]
3 Types of Attribute Roles
Unique Identifier: An attribute (atomic or composite) whose values are distinct for each entity instance in the entity set
– Employee ID, SSN, ISBN, UPC, etc …..
Key Attribute:
Attribute that is a constituent part of a unique identifier
A key attribute is a proper subset of a unique identifier
Non-key: Any attribute that is not a constituent part of (subset of) a unique identifier
– Name, weight, classification
– Employee ID, SSN, ISBN, UPC, etc …..
Key Attribute:
Attribute that is a constituent part of a unique identifier
A key attribute is a proper subset of a unique identifier
Non-key: Any attribute that is not a constituent part of (subset of) a unique identifier
– Name, weight, classification
Entity & Attribute Data Integrity Constraints
Data integrity constraints are rules that govern behavior of data at all times in a database
– Technical expressions of business rules
They must be preserved across all 3 tiers of data modeling – conceptual, logical, and physical
Some constraints cannot be expressed explicitly in an ERD and are therefore carried forward in textual form (i.e., semantic integrity constraints)
– Technical expressions of business rules
They must be preserved across all 3 tiers of data modeling – conceptual, logical, and physical
Some constraints cannot be expressed explicitly in an ERD and are therefore carried forward in textual form (i.e., semantic integrity constraints)
2 Types of Data Integrity Constraints pertaining to Entity Types and Attributes are …..
The domain constraint imposed on an attribute to ensure that its observed value is not outside the defined domain
The role (“unique key” or non-key) constraint that requires entities of an entity type to be uniquely identifiable.
The role (“unique key” or non-key) constraint that requires entities of an entity type to be uniquely identifiable.
Day 4
Binary Relationship
(Simplest relationship)
Degree n = 2
^(Pilot, Flight)
EX: Captain Tom flies flight DB2016
Degree n = 2
^(Pilot, Flight)
EX: Captain Tom flies flight DB2016
Ternary Relationship
Degree n = 3
^(Professor, Subject, Course)
EX: Professor Grimes teaches the subject Database in course MIS 3376
^(Professor, Subject, Course)
EX: Professor Grimes teaches the subject Database in course MIS 3376
Quaternary Relationship
Degree n = 4
^(Patient, Physician, Medication, Illness)
EX: Dr. Sal prescribes Ibuprofen to treat Jeff Cam for blood pressure
^(Patient, Physician, Medication, Illness)
EX: Dr. Sal prescribes Ibuprofen to treat Jeff Cam for blood pressure
Unary Relationships
(Also known as “recursive relationship”)
Degree n = 1
EX: A manager is just an employee that supervises other employees
In relationships with 2+ degrees, multiple entities interact BUT
With unary relationships, one entity interacts with itself
Degree n = 1
EX: A manager is just an employee that supervises other employees
In relationships with 2+ degrees, multiple entities interact BUT
With unary relationships, one entity interacts with itself
Role Names
Optional, but can occasionally help with understanding
EX: PILOT (role name: Captain_of) FLIES (role name: Commanded_by) FLIGHT
EX: PILOT (role name: Captain_of) FLIES (role name: Commanded_by) FLIGHT
Participation
Does an instance of an entity type have to be related to an instance of the other entity type in order to exist?
Vertical Line (I) means MANDATORY
Open Circle means OPTIONAL
Vertical Line (I) means MANDATORY
Open Circle means OPTIONAL
Cardinality
Maximum number of instances of an entity type that relate to a single instance of an associated entity type through a relationship
4 Types of Cardinality
1) m:n – An entity instance in Set A is associated with no more than “m” (many) entity instances in Set B. An entity instance in Set B is associated with no more than “n” (many) entity instances in set A
2) 1:n – An entity instance in Set A is associated with no more than “n” (many) entity instances in Set B. An entity instance in Set B is related to no more than 1 entity instance in Set A
3) n:1 – The reverse 1:n (these are often combined)
4) 1:1 – An entity instance in Set A is associated with no more than 1 entity instance in Set B. An entity instance in Set B is related to no more than 1 entity instance in Set A
2) 1:n – An entity instance in Set A is associated with no more than “n” (many) entity instances in Set B. An entity instance in Set B is related to no more than 1 entity instance in Set A
3) n:1 – The reverse 1:n (these are often combined)
4) 1:1 – An entity instance in Set A is associated with no more than 1 entity instance in Set B. An entity instance in Set B is related to no more than 1 entity instance in Set A
Cardinality: Parent-Child Relationships with Examples
One coach (Parent) has many players
– 1:n
– One Dana Holgorsen coaches many players
– Players have one coach
Many children have one mother (Parent)
– n:1
– Many children are raised by one Kate Gosselin
– Children have one mother
Many shoppers visit many stores
– m:n (1:n + n:1)
– One shopper visits many stores + Many shoppers visit one store
(In these types of Cardinalities BOTH SIDES CAN BE CONSIDERED PARENTS)
One department has one department chair
– 1:1
– One faulty (Dr.Johnson) is chair of one department (DISC)
– A faculty member cannot chair two departments, and a department cannot have two faculty members as chair!
(In these types of Cardinalities NEITHER SIDES CAN BE CONSIDERED PARENTS)
– 1:n
– One Dana Holgorsen coaches many players
– Players have one coach
Many children have one mother (Parent)
– n:1
– Many children are raised by one Kate Gosselin
– Children have one mother
Many shoppers visit many stores
– m:n (1:n + n:1)
– One shopper visits many stores + Many shoppers visit one store
(In these types of Cardinalities BOTH SIDES CAN BE CONSIDERED PARENTS)
One department has one department chair
– 1:1
– One faulty (Dr.Johnson) is chair of one department (DISC)
– A faculty member cannot chair two departments, and a department cannot have two faculty members as chair!
(In these types of Cardinalities NEITHER SIDES CAN BE CONSIDERED PARENTS)
Instance Diagram
Good to help with understanding, not practical on a large scale
Similar to an “object diagram” in universal modeling language (UML)
Similar to an “object diagram” in universal modeling language (UML)
Attributes in a Relationship
May not only be part of an entity, but may also be part of the relationship
Highly dependent on Business Rules
Highly dependent on Business Rules
Base/Strong vs. Weak Entity Types
Base (or strong) entity types are those where entities have independent existence
– a base entity type has a unique identifier
Weak entity types are those where entities do not have an independent existence
– Duplicate entity instances may be present
– A weak entity type does not have a unique identifier
– Must be related to a base (strong) entity to be identified – known as “identifying relationship”
– A weak entity type has a “partial key” – also known as a “discriminator”
Double Square = weak entity
Double Diamond = identifying relationship
Primary Key identified as solid underline for Base(Strong) Entity
Partial Key identified as dotted underline for Weak Entity
– a base entity type has a unique identifier
Weak entity types are those where entities do not have an independent existence
– Duplicate entity instances may be present
– A weak entity type does not have a unique identifier
– Must be related to a base (strong) entity to be identified – known as “identifying relationship”
– A weak entity type has a “partial key” – also known as a “discriminator”
Double Square = weak entity
Double Diamond = identifying relationship
Primary Key identified as solid underline for Base(Strong) Entity
Partial Key identified as dotted underline for Weak Entity
Database Design is …..
both a SCIENCE and an ART
4 Types of Deletion Constraints
Restrict Rule, Cascade Rule, Set Null Rule, Set Default Rule
Restrict Rule (R)
When an attempt is made to delete an entity instance from a parent entity in a relationship, the deletion should be disallowed if child entity instances are related to the parent entity instance. The restrict rule is imposed on the parent entity type in the relationship.
– When a deletion constraint is not specified, the restrict rule is usually implied by default
– When a deletion constraint is not specified, the restrict rule is usually implied by default
Cascade Rule (C)
When an attempt is made to delete an entity instance from a parent entity in a relationship, if all child entity instances related to this parent in this relationship should also be deleted along with the deletion of the parent entity instance, the cascade rule applies and is imposed on the child entity type in the relationship.
Set Null Rule (N)
When an attempt is made to delete an entity instance from a parent entity in a relationship, if all child entity instances related to this parent in this relationship should be retained but no longer referenced to this parent while the deletion of the parent entity is allowed, the “set null” rule applies and is imposed on the child entity type in the relationship
Set Default Rule (D)
When an attempt is made to delete a parent entity instance in a relationship, if all child entity instances related to this parent in this relationship should be retained despite the deletion of the parent entity instance by shifting the parent reference to a predefined default parent, the “set default” rule applies and is imposed on the child entity type in the relationship.
Deletion Rules (Examples)
Deletion of an instance from a child entity type in m:n or 1:n relationship requires no action
– When we remove a player (child) from the team, no impact to the coach-player relationship
Deletion of an instance from a parent entity type in a relationship requires some type of action
– When we remove the coach (parents) from the team, what is the team going to do now?
– Refuse to let him go? Stop playing? Play with no coach? Find a new coach?
– When we remove a player (child) from the team, no impact to the coach-player relationship
Deletion of an instance from a parent entity type in a relationship requires some type of action
– When we remove the coach (parents) from the team, what is the team going to do now?
– Refuse to let him go? Stop playing? Play with no coach? Find a new coach?
Day 5
LOOK OVER NOTATIONS SYMBOLS AND EXAMPLES OF HOW TO CREATE AN ER MODEL USING A STORY EXAMPLE IN PPT DAY 5
2 Ways of Identifying Entities
Synthesis approach (bottom up)
– List all discernible data elements in the narrative, treating the all as attributes
– Group them together in a way that makes sense (to you)
– Clusters of attributes are (or might be) entities
Analysis approach (top down)
– Nouns are modeled as entity types or as attributes
– Identify relationships between the entities (typically verbs)
– List all discernible data elements in the narrative, treating the all as attributes
– Group them together in a way that makes sense (to you)
– Clusters of attributes are (or might be) entities
Analysis approach (top down)
– Nouns are modeled as entity types or as attributes
– Identify relationships between the entities (typically verbs)
Day 6
COMPARE AND LOOK AT “LOOK ACROSS” AND “LOOK NEAR” METHODS IN PPT DAY 6
Reading ER Diagrams
Both look across and look near mean the same thing BUT
Use one or the other, but never both at the same time!
Use one or the other, but never both at the same time!
Decomposing
Breaking the long complex story into smaller, more manageable chunks
LOOK AT PROFESSORS EXAMPLE OF ADDING ON DELETION CONSTRAINTS TO DAY 5 IN PPT DAY 6
Semantic Integrity Constraints
(WITH EXAMPLES)
(WITH EXAMPLES)
Attribute-Level Business Rule
– Each plant has a plant number that ranges from 10-20
– Gender is either male or female
– Project locations are confined to the cities of Bellaire, Blush Ash, Mason, Stafford, & Sugarland
Entity-Level Business
– An employee cannot be his or her own supervisor
– An employee cannot be his or her own supervisor
– Every plant is managed by an employee who works in the same plant
Miscellaneous Business Rules:
– Each plant has at least 3 buildings
– Each plant must have at least 100 employees
– The salary of an employee cannot exceed the salary of the employee’s supervisor
– Each plant has a plant number that ranges from 10-20
– Gender is either male or female
– Project locations are confined to the cities of Bellaire, Blush Ash, Mason, Stafford, & Sugarland
Entity-Level Business
– An employee cannot be his or her own supervisor
– An employee cannot be his or her own supervisor
– Every plant is managed by an employee who works in the same plant
Miscellaneous Business Rules:
– Each plant has at least 3 buildings
– Each plant must have at least 100 employees
– The salary of an employee cannot exceed the salary of the employee’s supervisor
DAY 7
2 Issues that will blow your mind
Multi-value attributes don’t really exist*
– Exist in conceptual models, but most DBMS will only allow for a single value
– Can be solved by creating a 1:m or m:n relationship
M:N relationships don’t really exist*
– Remember, they are really two 1:m and m:1 relationships
– Exist in conceptual models, but most DBMS will only allow for a single value
– Can be solved by creating a 1:m or m:n relationship
M:N relationships don’t really exist*
– Remember, they are really two 1:m and m:1 relationships
LOOK AT THE MODEL FOR MULT-VALUE ATTRIBUTES, MANY TO MANY RELATIONSHIP, GERUND, STORY EXAMPLE IN PPT DAY 7
2 ways to model a many to many relationship
Employee Project:
– Many employees are
assigned to a project
– A project has many employees assigned to it
Dependent-Hobby:
– A dependent may have
many hobbies
– A hobby may have many dependents that participate
– Many employees are
assigned to a project
– A project has many employees assigned to it
Dependent-Hobby:
– A dependent may have
many hobbies
– A hobby may have many dependents that participate
Dependent-hobby
When you decompose the M:N relationship into two 1:m relationships
The Participation entity is a gerund
The Participation entity is a gerund
Gerund
Product of decomposing m:n relationships
Also called a “composite entity” or “bridge entity”
Takes the primary key from each participating entity to create a set of 1:n and m:1 relationships
DO NOT HAVE PARTIAL KEYS
(How we capture the attributes of a relationship)
Looks and acts like a weak entity BUT not the same since weak entities are a product of the business rules
Also called a “composite entity” or “bridge entity”
Takes the primary key from each participating entity to create a set of 1:n and m:1 relationships
DO NOT HAVE PARTIAL KEYS
(How we capture the attributes of a relationship)
Looks and acts like a weak entity BUT not the same since weak entities are a product of the business rules
Foreign Key Placement (Use of Example to understand)
In a 1:M Employee and Plant relationship, Foreign Key (FK) goes with the “child” side
The Foreign Key is an attribute in the child entity (employees in this case) that refers to the primary key of the entity on the other side of the relationship
In a 1:1 relationship, we place the Foreign Key based on participation
– Foreign Key goes with the entity that has total (mandatory) participation
If both sides of a 1:1 have mandatory participation
– It doesn’t matter where you place the Foreign Key (pretty rare to find)
If both sides of a 1:1 have optional participation
– Put it with the entity that will have fewer instances
– Use gerund to ensure you have no NULL values!
The Foreign Key is an attribute in the child entity (employees in this case) that refers to the primary key of the entity on the other side of the relationship
In a 1:1 relationship, we place the Foreign Key based on participation
– Foreign Key goes with the entity that has total (mandatory) participation
If both sides of a 1:1 have mandatory participation
– It doesn’t matter where you place the Foreign Key (pretty rare to find)
If both sides of a 1:1 have optional participation
– Put it with the entity that will have fewer instances
– Use gerund to ensure you have no NULL values!
Modeling Errors
Errors may be in:
Syntax: using the symbols incorrectly
– Relatively easy to spot
Semantics: not accurately reflecting the business rules
– More difficult to spot, because these are often judgement calls
Syntax: using the symbols incorrectly
– Relatively easy to spot
Semantics: not accurately reflecting the business rules
– More difficult to spot, because these are often judgement calls
DAY 8
Relational Data Model
Main Concepts:
Database = Collection of relations
Relation = Two-dimensional table
Tuple = Row of related data values in the table
Attribute = Column in the table
Domain = Set of possible atomic values of an attribute
Database = Collection of relations
Relation = Two-dimensional table
Tuple = Row of related data values in the table
Attribute = Column in the table
Domain = Set of possible atomic values of an attribute
Conceptual Modeling (Leading to the Motivation for Logical Data Modeling)
Completion of conceptual modeling phase results in a picture of data requirements at high level of abstraction
(During conceptual modeling, we are not constrained by technology limitations that will be used for implementation
– We got to use stuff like multi-value attributes and m:n relationships)
(Conceptual schema may contain constructs not directly compatible with technology intended for implementation
– Like multi-value attributes and m:n relationships!)
(Further refinement may be required to eliminate data redundancy in design
– Getting rid of multi-value attributes
– Decomposing m:n relationships into 1:m using gerunds)
(During conceptual modeling, we are not constrained by technology limitations that will be used for implementation
– We got to use stuff like multi-value attributes and m:n relationships)
(Conceptual schema may contain constructs not directly compatible with technology intended for implementation
– Like multi-value attributes and m:n relationships!)
(Further refinement may be required to eliminate data redundancy in design
– Getting rid of multi-value attributes
– Decomposing m:n relationships into 1:m using gerunds)
The Motivation for Logical Data Modeling
Transforming conceptual schema to something that is more compatible with implementation technology of choice is achieved via logical data modeling
Logical data modeling phase serves as transition from technology-independent conceptual schema to technology-dependent design that can actually be implemented
(Conceptual modeling uses some of the same concepts/terms as logical modeling, but in a slightly different (but really the same) way
– Cardinality
– Degree of relationships
– “Relations” are different than “relationships”)
Logical data modeling phase serves as transition from technology-independent conceptual schema to technology-dependent design that can actually be implemented
(Conceptual modeling uses some of the same concepts/terms as logical modeling, but in a slightly different (but really the same) way
– Cardinality
– Degree of relationships
– “Relations” are different than “relationships”)
Relation: (Two-Dimensional Table) (2 Components of a Relation)
Heading: A single tuple listing the attributes (Relation Schema)
Body: Collection of data tuples
Body: Collection of data tuples
LOOK AT THE EXAMPLES OF DEGREE OF THE RELATION: NUMBER OF ATTRIBUTES AND CARDINALITY OF THE RELATION: NUMBER OF TUPLES IN PPT DAY 8
Characteristics of a Relation
A relation is a mathematical term that resembles a two-dimensional table
Has a heading, which is a tuple of attributes, also known as the relation schema
Has a body, which is made up of many tuples of data containing the same attributes
Attributes of relation schema have unique names
Values of an attribute in a relation come from same domain
Order of arrangement of tuples does not matter
Order of attributes does not matter
Each attribute value in tuple is atomic; hence, composite and multi-valued
attributes are not allowed in a relation
Derived attributes are not captured in relation schema
All tuples in relation must be distinct (i.e., relation schema must have unique identifier)
Has a heading, which is a tuple of attributes, also known as the relation schema
Has a body, which is made up of many tuples of data containing the same attributes
Attributes of relation schema have unique names
Values of an attribute in a relation come from same domain
Order of arrangement of tuples does not matter
Order of attributes does not matter
Each attribute value in tuple is atomic; hence, composite and multi-valued
attributes are not allowed in a relation
Derived attributes are not captured in relation schema
All tuples in relation must be distinct (i.e., relation schema must have unique identifier)
Naming Conventions
Attribute names MUST be unique within a relation
Attribute names MUST be unique across the entire conceptual schema
HOWEVER, most (all?) modern DBMS specify attributes as: relation.attributename
– Since the DBMS will not let you have duplicate relation names or attribute names within a relation, this is guaranteed to be unique across the conceptual schema
Each attribute name begins with up to a three-letter prefix that represents an abbreviation of the name of relation schema to which attribute belongs
– This prefix is followed by an underscore character
– Only first letter of prefix is capitalized
Following the underscore character is suffix that corresponds to attribute name itself
– This suffix may contain only lowercase letters, a pound sign (#), and underscore characters, and corresponds to name of the attribute in conceptual data model
– Examples: Pl_name, Pl_p#, Pl_budget
Attribute names MUST be unique across the entire conceptual schema
HOWEVER, most (all?) modern DBMS specify attributes as: relation.attributename
– Since the DBMS will not let you have duplicate relation names or attribute names within a relation, this is guaranteed to be unique across the conceptual schema
Each attribute name begins with up to a three-letter prefix that represents an abbreviation of the name of relation schema to which attribute belongs
– This prefix is followed by an underscore character
– Only first letter of prefix is capitalized
Following the underscore character is suffix that corresponds to attribute name itself
– This suffix may contain only lowercase letters, a pound sign (#), and underscore characters, and corresponds to name of the attribute in conceptual data model
– Examples: Pl_name, Pl_p#, Pl_budget
Set Theory and Relation Algebra
Database theory is based on set theory
Manipulations referred to as “Relational Algebra”
Manipulations referred to as “Relational Algebra”
Set Theory Operators
EX:
Some people are from Houston
Some people attend UH
Union: People that are from Houston OR go to UH
Intersection: People that are from Houston AND go to UH
Difference:
People from Houston but do NOT go to UH
People that go to UH but NOT from Houston
Some people are from Houston
Some people attend UH
Union: People that are from Houston OR go to UH
Intersection: People that are from Houston AND go to UH
Difference:
People from Houston but do NOT go to UH
People that go to UH but NOT from Houston
Relational Algebra Primer
Relational algebra just lets us abstractly talk about relations and data (like algebra lets us abstractly talk about numbers)
An attribute is defined as an ordered set (N, D)
– N is the name of the attribute
– D is the domain of the attribute
A set of attributes can be represented as a vector (C)
– C is the set {(N1,D1), (N2,D2), (N3,D3),…(Nn,Dn)}
So…. A relation is a set (R,C) where R is the name of the relation schema and C is the list of attributes that make up R
You database is made up of multiple relations (R,C)
Each (R,C) is equivalent to…
▫ R(A1, A2, A3,…An) which is equivalent to … • R({(N1,D1), (N2,D2), (N3,D3),…(Nn,Dn)})
An attribute is defined as an ordered set (N, D)
– N is the name of the attribute
– D is the domain of the attribute
A set of attributes can be represented as a vector (C)
– C is the set {(N1,D1), (N2,D2), (N3,D3),…(Nn,Dn)}
So…. A relation is a set (R,C) where R is the name of the relation schema and C is the list of attributes that make up R
You database is made up of multiple relations (R,C)
Each (R,C) is equivalent to…
▫ R(A1, A2, A3,…An) which is equivalent to … • R({(N1,D1), (N2,D2), (N3,D3),…(Nn,Dn)})
Relational Algebra Primer
(WITH EXAMPLES)
(WITH EXAMPLES)
We can substitute symbols to represent data and do abstract calculations just like in traditional algebra
– Area of a room is L x W
– L = 10 W=20 A = L x W = ? A = 10 x 20 = 200
– If we want a room that is 4 times as large, we can nest our equations
– 4(A)4(L x W)4(10 x 20)4(200)800
For databases
– We consider a relation (R) as being made up of attributes A1, A2, A3,…An
– The relation schema is R(A1, A2, A3,…An)
– Area of a room is L x W
– L = 10 W=20 A = L x W = ? A = 10 x 20 = 200
– If we want a room that is 4 times as large, we can nest our equations
– 4(A)4(L x W)4(10 x 20)4(200)800
For databases
– We consider a relation (R) as being made up of attributes A1, A2, A3,…An
– The relation schema is R(A1, A2, A3,…An)
DAY 9
