Database Storage Structures and Index Management Techniques

Database Storage Structures

There are several methods for storing user data in a database, each optimized for different performance and scalability requirements.

Regular Tables

A regular table (generally referred to as a “table”) is the most commonly used form of storing user data.

  • The database administrator has very limited control over the distribution of rows in an un-clustered table.
  • Rows can be stored in any order depending on the activity on the table.

Partitioned Tables

A partitioned table enables the building of scalable applications. It has the following characteristics:

  • A partitioned table has one or more partitions, each storing rows partitioned using range, hash, composite, or list partitioning.
  • Each partition is a segment and can be located in a different tablespace.
  • Partitions are useful for large tables that can be queried or manipulated using several processes concurrently.
  • Special commands are available to manage partitions within a table.

Index-Organized Tables (IOT)

An index-organized table is similar to a heap table but includes a primary key index on one or more of its columns.

  • Instead of maintaining two separate storage spaces (one for the table and one for a B-tree index), an IOT maintains a single B-tree containing the primary key and other column values.
  • An overflow segment may exist if the PCTTHRESHOLD value is set, resulting in longer row lengths requiring the overflow area.
  • IOTs provide fast key-based access for queries involving exact matches and range searches.
  • Storage requirements are reduced because key columns are not duplicated in the table and index.
  • Non-key columns are stored in the index unless the index entry becomes very large; in that case, the DBMS server provides an OVERFLOW clause.

Clustered Tables

A clustered table provides an optional method for storing table data.

  • A cluster is made up of a table or group of tables that share the same data blocks.
  • These tables are grouped because they share common columns and are often used together, optimizing retrieval performance.

Essential Database Data Types

Character Data Types

Character data can be stored as either fixed-length or variable-length strings.

  • Fixed-Length: Data types such as CHAR and NCHAR are stored with padded blanks.
  • NCHAR is a Globalization Support data type enabling storage of fixed-width or variable-width character sets.
  • The maximum size is typically 2,000 bytes per row.
  • The default size is 1 character or 1 byte, depending on the character set.

Numeric Data

Numbers in a database are always stored as variable-length data and can store up to 38 significant digits. Numeric data types require:

  • One byte for the exponent.
  • One byte for every two significant digits in the mantissa.
  • One byte for negative numbers if the number of significant digits is less than 38 bytes.

Large Object (LOB) Data Types

  • CLOB: Used for large fixed-width character data.
  • NCLOB: Used for large fixed-width national character set data.

TIMESTAMP Data Type

This data type stores the date and time, including fractional seconds up to 9 decimal places.

  • TIMESTAMP WITH TIME ZONE and TIMESTAMP WITH LOCAL TIME ZONE can use time zones to factor in items such as Daylight Savings Time.
  • TIMESTAMP and TIMESTAMP WITH LOCAL TIME ZONE can be used in primary keys.
  • TIMESTAMP WITH TIME ZONE cannot be used in primary keys.

ROWID and UROWID Data Types

The database server provides the universal ROWID (UROWID) data type.

  • UROWID supports ROWIDs of foreign tables (non-Oracle tables) and can store all kinds of ROWIDs.
  • A UROWID is required to store a ROWID for rows stored in an Index Organized Table.
  • The value of the parameter COMPATIBLE must be set higher to use UROWID.

Advanced Row and Table Structures

Nested Tables

Nested tables provide a means of defining a table as a column within another table. They are useful for storing sets that may have a large number of records, such as the items in an order.

Extended and Restricted ROWIDs

Extended ROWID

An Extended ROWID needs 10 bytes of storage on disk and is displayed using 18 characters. It consists of the following components:

  • A Data Object Number, which is assigned to each data object (table or index) when it is created, and is unique within the database.

Restricted ROWID

Earlier database versions used the restricted ROWID format. A restricted ROWID used only six bytes internally and did not contain the data object number.

Structure of a Row

  • Row data is stored in database blocks as variable-length records.
  • Columns for a row are generally stored in the order in which they are defined.
  • Any trailing NULL columns are not stored, optimizing space.

Table Management and Storage Parameters

The CREATE TABLE Command

The CREATE TABLE command is used to create relational tables or object tables.

CREATE TABLE hr.employees (...)

The TABLESPACE Clause

The TABLESPACE clause specifies where the table will be created. If omitted, the database creates the object in the default tablespace of the schema owner.

Changing Storage Parameters

Storage parameters can be modified using the ALTER TABLE command:

ALTER TABLE hr.employees PCTFREE 30 PCTUSED 50 STORAGE(NEXT 500K MINEXTENTS 2 MAXEXTENTS 100);

Manual Allocation of Extents

Extents may need to be allocated manually for specific control:

  • To control the distribution of extents across files.
  • Before loading data in bulk to avoid dynamic extension of the table.

Syntax for Manual Allocation:

ALTER TABLE [schema.]table ALLOCATE EXTENT[(SIZE integer[K|M] [DATAFILE ‘filename’])]

Relocating Tables

A non-partitioned table can be moved without using the Export or Import utility. This process also allows storage parameters to be changed. This is useful for:

  • Moving a table from one tablespace to another.
  • Reorganizing the table to eliminate row migration.

Truncating Tables

Truncating a table deletes all rows in the table and immediately releases the used space.

Index Management and Classification

Index Classification

Indexes can be classified based on their logical design or physical implementation.

Partitioned and Non-Partitioned Indexes

Partitioned indexes are often used with partitioned tables to improve scalability and manageability. They are used for large tables to store index entries corresponding to an index in several segments.

B-Tree Index Structure

B-tree indexes (or index trees) consist of three main levels:

  1. Root: Contains entries that point to the next level in the index.
  2. Branch Blocks: Point to blocks at the next level in the index.
  3. Leaf Nodes: Contain the actual index entries that point to rows in the table.

The leaf blocks are doubly linked to facilitate scanning the index in both ascending and descending order of key values.

Coalescing Indexes

When index fragmentation occurs, you can rebuild or coalesce the index. Coalescing an index is a block rebuild operation that is typically done online.

ALTER INDEX hr.employees_idx COALESCE;

Index Validity and Dropping Indexes

Best practices for index management include:

  • Drop and re-create indexes before bulk loads.
  • Drop indexes that are infrequently needed and build them when necessary.
DROP INDEX hr.departments_name_idx;

Rebuilding Indexes

A new index is built using an existing index as the data source. Key benefits of rebuilding:

  • Sorts are often not needed, resulting in better performance.
  • The old index is deleted after the new index is built.
  • The resulting index does not contain any deleted entries. Therefore, this index uses space more efficiently.

Note: During the rebuild, sufficient space is needed to accommodate both the old and the new index in their respective tablespaces.

Manual Index Space Allocation and Deallocation

The command to allocate or deallocate index space follows similar rules as table space management:

ALTER INDEX [ schema. ] index
{ ALLOCATE EXTENT ( [ SIZE integer [ K | M ] ][ DATAFILE ‘filename’ ] )
| DEALLOCATE UNUSED [KEEP integer [ K | M ] ] }

Index Types

Various index types are available to optimize different query patterns:

  • B-Tree Index: The standard index structure.
  • Bitmap Index: Highly effective for columns with low cardinality (few distinct values).
  • Domain Indexes: User-defined indexes for specialized data types.
  • Function-Based Indexes: Indexes built on the result of a function or expression.
  • Single-Column and Concatenated Indexes: Indexes built on one column or multiple columns, respectively.

Table and Column Maintenance Operations

Database administrators frequently perform maintenance tasks such as:

  • Managing unused columns.
  • Dropping a column.
  • Dropping a table.
  • Truncating a table (deleting all rows and releasing space).