Fundamentals of DBMS and Database Concepts
Data, Information, Knowledge
Data – a fact that can be recorded or stored, e.g., person name, age, gender, weight, etc. Information – when data is processed, organized, structured, or presented in a given context so as to make it useful, it is called information. Knowledge – it is the appropriate collection of information.
DBMS (Database Management System)
DBMS = Database + Set of programs.
- Database – a collection of logically related data; e.g., books database in a library, student database in a university.
- Management – manipulation, searching, and security of data; e.g., viewing results on the GTU website, searching exam papers on the GTU website.
- System – programs or tools used to manage databases; e.g., SQL Server Management Studio Express, Oracle.
- DBMS – software designed to define, manipulate, retrieve, and manage data in a database; e.g., Microsoft SQL Server, Oracle, MySQL, SQLite, MongoDB, etc.
Examples of DBMS
- Online telephone directory
- Other web and mobile applications
Database System Applications
- DBMS is a computerized record-keeping system.
- DBMS is required wherever data needs to be stored, for example:
- E-commerce (Flipkart, Amazon, ShopClues, eBay, etc.)
- Online television streaming (Hotstar, Netflix, etc.)
- Social media (WhatsApp, Facebook, etc.)
- Universities, colleges, and schools
- Library management systems
File Processing System (FPS) or File System
In computer science, a File Processing System (FPS) is a way of storing, retrieving, and manipulating data that is present in various files.
Advantages of File Processing System
- Cost friendly: minimal or no setup and usage fee.
- Easy to use: file systems require basic learning and understanding.
- High scalability: one can switch from smaller to larger files as needed.
Disadvantages of File Processing System
- Data redundancy and inconsistency – multiple file formats.
- Difficulty in accessing data – need to write a new program to carry out each new task.
- Data isolation – data scattered in various files and formats makes writing new application programs to retrieve appropriate data difficult.
Problems with File Systems
- Integrity problems – when new constraints are added, it is difficult to change programs to enforce them.
- Atomicity problems – failures may leave the database in an inconsistent state with partial updates carried out. Example: transfer of funds from one account to another should either complete or not happen at all.
- Concurrent access by multiple users – example: two people reading a balance (say 100) and withdrawing 50 each at the same time.
- Security problems
- Poor data control
- Limited data sharing
- Excessive programming effort
Application of DBMS
- Providing application flexibility with relational databases.
- Object-oriented applications and the need for more complex databases.
- Early DB applications.
- Extending DB capabilities for new applications.
Advantages of DBMS over File Management Systems
- Minimal data redundancy.
- Program-data independence.
- Efficient data access.
- Improved data sharing.
- Improved security.
- Economy of scale.
- Reduced program maintenance.
- Improved backup.
- Improved data quality.
Purpose of a DBMS
- Compactness – no need for excessive paperwork.
- Speed.
- Accuracy.
- Protection.
Benefits of the Database Approach
- Data can be shared.
- Redundancy can be reduced.
- Inconsistency can be avoided.
- Security can be enforced.
- Conflicting requirements can be balanced.
- Integrity can be maintained.
Advantages of a DBMS (Detailed)
- Reduce data redundancy (duplication) – avoids unnecessary duplication by storing data centrally.
- Remove data inconsistency – by eliminating redundancy, data inconsistency is reduced.
- Data isolation – users can easily retrieve data per their requirements.
- Guaranteed atomicity – transactions either execute fully or not at all.
- Implement integrity constraints – business rules can be enforced, e.g., do not allow storing an amount less than Rs. 0 in a balance.
- Sharing of data among multiple users – more than one user can access the same data simultaneously.
- Restricting unauthorized access – users can access only the data they are authorized to view.
- Providing backup and recovery services – regular auto or manual backups can be used to restore the database if it becomes corrupt.
Basic Terms of DBMS
- Data – raw, unorganized facts that need processing. Example: marks of students (Student1 = 50/100, Student2 = 25/100).
- Information – when data is processed and structured to be useful. Example: result of students (Pass or Fail) (Student1 = Pass, Student2 = Fail).
- Metadata – data about data; e.g., table name, column name, data type, authorized user and access privileges for a table.
- Data dictionary – an information repository that contains metadata.
- Data warehouse – an information repository that stores data for analysis and reporting.
- Field – a character or group of characters that have specific meaning, e.g., values of Emp_Name, Address, Mobile_No in a Faculty table.
- Record / Tuple – a collection of logically related fields, e.g., (Emp_Name, Address, Mobile_No, Subject) forms a record for the Faculty.
ANSI-SPARC Three-Level Architecture
Internal Level
- Has an internal schema which describes the physical storage structure of the database.
- The internal schema is also known as the physical schema.
- It uses the physical data model and defines how the data will be stored in storage blocks.
- The physical level describes complex low-level data structures in detail.
Conceptual Level
- The conceptual schema describes the design of a database at the conceptual (logical) level.
- The conceptual schema describes the structure of the whole database.
- It specifies what data are to be stored and what relationships exist among those data.
- Internal implementation details are hidden at this level.
- Programmers and database administrators typically work at this level.
External Level
- The external level contains several schemas called subschemas.
- An external schema is also known as a view schema.
- Each view schema describes the part of the database that a particular user group is interested in and hides the remaining database from that user group.
- The view schema describes end-user interaction with the database system.
Data Abstraction in DBMS
Database systems are made up of complex data structures. To ease user interaction, developers hide internal irrelevant details from users. This process of hiding irrelevant details from users is called data abstraction.
Types of Data Independence
- Physical Data Independence – the ability to modify the physical schema without requiring changes in the logical (conceptual) schema and application programs. Modifications at the internal level may be necessary to improve performance (e.g., changes in file structures, compression techniques, hashing algorithms, storage devices).
- Logical Data Independence – the ability to modify the conceptual schema without requiring changes in application programs. Modifications at the logical level are necessary when the logical structure of the database changes. Application programs are heavily dependent on logical structures of the data they access, so changes in logical structure often require program changes.
Types of Database Users
- Naive Users (End Users) – unsophisticated users with little or no knowledge of the DBMS; they interact with the database via application software or tools, e.g., a clerk in a bank.
- Application Programmers – programmers who write software using tools such as Java, .NET, PHP, etc.; e.g., software developers.
- Sophisticated Users – interact with the database system without using an application program; use query tools like SQL, e.g., analysts.
- Specialized Users (DBA) – write specialized database application programs and use administration tools, e.g., database administrators.
Role of the DBA (Database Administrator)
- Schema definition – DBA defines the logical schema of the database.
- Storage structure and access method definition – DBA decides how data are represented and how to access them.
- Defining security and integrity constraints – DBA determines various security and integrity constraints.
- Granting authorization for data access – DBA determines which users need access to which parts of the database.
- Liaison with users – DBA provides necessary data to users.
- Assisting application programmers – DBA assists application programmers in developing application programs.
- Monitoring performance – DBA monitors performance and makes necessary changes for optimization.
- Backup and recovery – DBA backs up the database to storage devices (e.g., DVD, CD, magnetic tape, or other storage) to protect data during failures.
