Database Storage Structures and Index Management Techniques
Database Storage Structures
There are several methods for storing user data in a database, each optimized for different performance and scalability requirements.
Regular Tables
A regular table (generally referred to as a “table”) is the most commonly used form of storing user data.
- The database administrator has very limited control over the distribution of rows in an un-clustered table.
- Rows can be stored in any order depending on the activity on the table.
Partitioned Tables
A partitioned table enables the building of scalable applications. It has the following characteristics:
- A partitioned table has one or more partitions, each storing rows partitioned using range, hash, composite, or list partitioning.
- Each partition is a segment and can be located in a different tablespace.
- Partitions are useful for large tables that can be queried or manipulated using several processes concurrently.
- Special commands are available to manage partitions within a table.
Index-Organized Tables (IOT)
An index-organized table is similar to a heap table but includes a primary key index on one or more of its columns.
- Instead of maintaining two separate storage spaces (one for the table and one for a B-tree index), an IOT maintains a single B-tree containing the primary key and other column values.
- An overflow segment may exist if the
PCTTHRESHOLDvalue is set, resulting in longer row lengths requiring the overflow area. - IOTs provide fast key-based access for queries involving exact matches and range searches.
- Storage requirements are reduced because key columns are not duplicated in the table and index.
- Non-key columns are stored in the index unless the index entry becomes very large; in that case, the DBMS server provides an
OVERFLOWclause.
Clustered Tables
A clustered table provides an optional method for storing table data.
- A cluster is made up of a table or group of tables that share the same data blocks.
- These tables are grouped because they share common columns and are often used together, optimizing retrieval performance.
Essential Database Data Types
Character Data Types
Character data can be stored as either fixed-length or variable-length strings.
- Fixed-Length: Data types such as
CHARandNCHARare stored with padded blanks. NCHARis a Globalization Support data type enabling storage of fixed-width or variable-width character sets.- The maximum size is typically 2,000 bytes per row.
- The default size is 1 character or 1 byte, depending on the character set.
Numeric Data
Numbers in a database are always stored as variable-length data and can store up to 38 significant digits. Numeric data types require:
- One byte for the exponent.
- One byte for every two significant digits in the mantissa.
- One byte for negative numbers if the number of significant digits is less than 38 bytes.
Large Object (LOB) Data Types
- CLOB: Used for large fixed-width character data.
- NCLOB: Used for large fixed-width national character set data.
TIMESTAMP Data Type
This data type stores the date and time, including fractional seconds up to 9 decimal places.
TIMESTAMP WITH TIME ZONEandTIMESTAMP WITH LOCAL TIME ZONEcan use time zones to factor in items such as Daylight Savings Time.TIMESTAMPandTIMESTAMP WITH LOCAL TIME ZONEcan be used in primary keys.TIMESTAMP WITH TIME ZONEcannot be used in primary keys.
ROWID and UROWID Data Types
The database server provides the universal ROWID (UROWID) data type.
UROWIDsupportsROWIDsof foreign tables (non-Oracle tables) and can store all kinds ofROWIDs.- A
UROWIDis required to store aROWIDfor rows stored in an Index Organized Table. - The value of the parameter
COMPATIBLEmust be set higher to useUROWID.
Advanced Row and Table Structures
Nested Tables
Nested tables provide a means of defining a table as a column within another table. They are useful for storing sets that may have a large number of records, such as the items in an order.
Extended and Restricted ROWIDs
Extended ROWID
An Extended ROWID needs 10 bytes of storage on disk and is displayed using 18 characters. It consists of the following components:
- A Data Object Number, which is assigned to each data object (table or index) when it is created, and is unique within the database.
Restricted ROWID
Earlier database versions used the restricted ROWID format. A restricted ROWID used only six bytes internally and did not contain the data object number.
Structure of a Row
- Row data is stored in database blocks as variable-length records.
- Columns for a row are generally stored in the order in which they are defined.
- Any trailing
NULLcolumns are not stored, optimizing space.
Table Management and Storage Parameters
The CREATE TABLE Command
The CREATE TABLE command is used to create relational tables or object tables.
CREATE TABLE hr.employees (...)The TABLESPACE Clause
The TABLESPACE clause specifies where the table will be created. If omitted, the database creates the object in the default tablespace of the schema owner.
Changing Storage Parameters
Storage parameters can be modified using the ALTER TABLE command:
ALTER TABLE hr.employees PCTFREE 30 PCTUSED 50 STORAGE(NEXT 500K MINEXTENTS 2 MAXEXTENTS 100);Manual Allocation of Extents
Extents may need to be allocated manually for specific control:
- To control the distribution of extents across files.
- Before loading data in bulk to avoid dynamic extension of the table.
Syntax for Manual Allocation:
ALTER TABLE [schema.]table ALLOCATE EXTENT[(SIZE integer[K|M] [DATAFILE ‘filename’])]Relocating Tables
A non-partitioned table can be moved without using the Export or Import utility. This process also allows storage parameters to be changed. This is useful for:
- Moving a table from one tablespace to another.
- Reorganizing the table to eliminate row migration.
Truncating Tables
Truncating a table deletes all rows in the table and immediately releases the used space.
Index Management and Classification
Index Classification
Indexes can be classified based on their logical design or physical implementation.
Partitioned and Non-Partitioned Indexes
Partitioned indexes are often used with partitioned tables to improve scalability and manageability. They are used for large tables to store index entries corresponding to an index in several segments.
B-Tree Index Structure
B-tree indexes (or index trees) consist of three main levels:
- Root: Contains entries that point to the next level in the index.
- Branch Blocks: Point to blocks at the next level in the index.
- Leaf Nodes: Contain the actual index entries that point to rows in the table.
The leaf blocks are doubly linked to facilitate scanning the index in both ascending and descending order of key values.
Coalescing Indexes
When index fragmentation occurs, you can rebuild or coalesce the index. Coalescing an index is a block rebuild operation that is typically done online.
ALTER INDEX hr.employees_idx COALESCE;Index Validity and Dropping Indexes
Best practices for index management include:
- Drop and re-create indexes before bulk loads.
- Drop indexes that are infrequently needed and build them when necessary.
DROP INDEX hr.departments_name_idx;Rebuilding Indexes
A new index is built using an existing index as the data source. Key benefits of rebuilding:
- Sorts are often not needed, resulting in better performance.
- The old index is deleted after the new index is built.
- The resulting index does not contain any deleted entries. Therefore, this index uses space more efficiently.
Note: During the rebuild, sufficient space is needed to accommodate both the old and the new index in their respective tablespaces.
Manual Index Space Allocation and Deallocation
The command to allocate or deallocate index space follows similar rules as table space management:
ALTER INDEX [ schema. ] index
{ ALLOCATE EXTENT ( [ SIZE integer [ K | M ] ][ DATAFILE ‘filename’ ] )
| DEALLOCATE UNUSED [KEEP integer [ K | M ] ] }Index Types
Various index types are available to optimize different query patterns:
- B-Tree Index: The standard index structure.
- Bitmap Index: Highly effective for columns with low cardinality (few distinct values).
- Domain Indexes: User-defined indexes for specialized data types.
- Function-Based Indexes: Indexes built on the result of a function or expression.
- Single-Column and Concatenated Indexes: Indexes built on one column or multiple columns, respectively.
Table and Column Maintenance Operations
Database administrators frequently perform maintenance tasks such as:
- Managing unused columns.
- Dropping a column.
- Dropping a table.
- Truncating a table (deleting all rows and releasing space).
