Fundamentals of Channel Capacity and Data Coding

Channel Capacity Concepts

Channel capacity is the maximum rate at which information can be transmitted reliably over a communication channel, i.e., with an arbitrarily low probability of error. It is typically expressed in bits per second (bps).

There are two primary contexts for channel capacity: one for continuous-time channels and another for discrete-time channels.

Discrete Memoryless Channels

A discrete memoryless channel is defined by:

  • An input alphabet X = {x₁, x₂, …, xₙ}, which is the set of symbols that can be transmitted through the channel.
  • An output alphabet Y = {y₁, y₂, …, yₘ}, which is the set of symbols that can be received at the channel output.
  • A set of transition probabilities p(yⱼ | xᵢ), giving the probability of receiving yⱼ when xᵢ was transmitted. "Memoryless" means that each output symbol depends only on the current input symbol, not on earlier ones.

Key concepts for discrete channels include the input entropy H(X), the output entropy H(Y), and the conditional entropies H(X|Y) and H(Y|X). The capacity of a discrete memoryless channel is obtained by maximizing the mutual information I(X; Y) = H(X) − H(X|Y) over all input distributions.
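
As a concrete illustration, consider the binary symmetric channel, a DMC in which each transmitted bit is flipped with crossover probability p; maximizing the mutual information gives the well-known capacity C = 1 − H(p), where H is the binary entropy function. A minimal Python sketch (the function names are our own):

    import math

    def binary_entropy(p):
        """Binary entropy H(p) in bits, with H(0) = H(1) = 0 by convention."""
        if p in (0, 1):
            return 0.0
        return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

    def bsc_capacity(p):
        """Capacity of a binary symmetric channel with crossover probability p."""
        return 1.0 - binary_entropy(p)

    print(bsc_capacity(0.0))   # 1.0: a noiseless binary channel
    print(bsc_capacity(0.5))   # 0.0: the output is independent of the input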

Continuous Channels and Shannon’s Formula

For continuous channels, capacity is calculated differently: the sampling theorem is used to represent the band-limited waveform by discrete samples, and the noise is characterized statistically.

In 1949, Claude Shannon demonstrated that the maximum theoretical capacity of a band-limited communication channel with Additive White Gaussian Noise (AWGN) is given by the Shannon-Hartley theorem:

C = B log₂(1 + SNR)

This equation shows that channel capacity C (in bit/s) is limited by the bandwidth B (in Hz) and the signal-to-noise ratio SNR, expressed as a linear power ratio rather than in decibels.

The maximum spectral efficiency, C/B = log₂(1 + SNR) bit/s/Hz, therefore depends only on the quality of the channel (i.e., the noise level).
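
As a quick numerical check (a minimal sketch; the 3.1 kHz bandwidth and 30 dB SNR are illustrative telephone-channel figures):

    import math

    def shannon_capacity(bandwidth_hz, snr_db):
        """Shannon-Hartley capacity in bit/s; the SNR is converted from dB
        to a linear power ratio before applying C = B * log2(1 + SNR)."""
        snr_linear = 10 ** (snr_db / 10)
        return bandwidth_hz * math.log2(1 + snr_linear)

    c = shannon_capacity(3100, 30)            # 3.1 kHz channel at 30 dB SNR
    print(f"C ≈ {c:.0f} bit/s")               # ≈ 30,900 bit/s
    print(f"C/B ≈ {c / 3100:.2f} bit/s/Hz")   # spectral efficiency ≈ 9.97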

Asymptotic Equipartition Property (AEP)

The Asymptotic Equipartition Property (AEP) is a fundamental concept in information theory. It states that for a sequence x₁, x₂, x₃, …, xₙ generated by an independent and identically distributed (i.i.d.) source with entropy H, the probability of the observed sequence is, with high probability, approximately 2⁻ⁿᴴ.

While the Law of Large Numbers tells us about the average behavior of i.i.d. random variables, the AEP allows us to categorize sequences generated by a source into two types: typical and atypical.

Formally, as n tends to infinity,

−(1/n) log₂ p(x₁, x₂, …, xₙ) → H   (convergence in probability)

This limit is what separates the sequences generated by the source into the two types: typical sequences are those whose empirical average log-probability, −(1/n) log₂ p(x₁, …, xₙ), is close to its expected value H.

It follows that, with probability close to 1, a sequence x₁, x₂, …, xₙ of i.i.d. random variables is typical; that is, its probability satisfies p(x₁, x₂, …, xₙ) ≈ 2⁻ⁿᴴ:

Pr{ (x₁, x₂, …, xₙ) : p(x₁, x₂, …, xₙ) ≈ 2⁻ⁿᴴ } ≈ 1
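
A small simulation makes this concrete (a minimal sketch; the Bernoulli source with P(1) = 0.3 is an illustrative choice): for a long i.i.d. sequence, −(1/n) log₂ p(x₁, …, xₙ) concentrates around H.

    import math
    import random

    p = 0.3                                               # Bernoulli source: P(1) = 0.3
    h = -p * math.log2(p) - (1 - p) * math.log2(1 - p)    # entropy H of the source

    n = 100_000
    seq = [1 if random.random() < p else 0 for _ in range(n)]

    # empirical per-symbol log-probability, -(1/n) log2 p(x1, ..., xn)
    log_p = sum(math.log2(p) if x else math.log2(1 - p) for x in seq)
    print(f"H = {h:.4f}, -(1/n) log2 p = {-log_p / n:.4f}")   # the two agree closely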

Source Coding Principles

Defining Source Codes

Let Ax = {x₁, x₂, …, xₙ} be the set of symbols of a given source alphabet. A code is defined as a mapping from these source symbols to sequences of symbols from a D-ary code alphabet, Ad. For simplicity, we can assume Ad = {0, 1, …, D−1} (e.g., a binary alphabet if D = 2).

A block code maps each symbol from the source alphabet Ax to a fixed sequence of symbols from the code alphabet Ad. Each of these fixed sequences is called a codeword.

xᵢ ──c──► c(xᵢ)

where c represents the coding function.

The length ℓ(xᵢ) of a codeword c(xᵢ) is the number of symbols from the Ad alphabet that comprise it. The average length of a code is defined as:

L̄ = Σᵢ p(xᵢ) · ℓ(xᵢ)

where p(xᵢ) is the probability of the source symbol xᵢ.
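
For instance (a minimal sketch with an invented four-symbol source and binary codewords):

    # Hypothetical source: symbol -> (probability, binary codeword)
    code = {
        "a": (0.5,   "0"),
        "b": (0.25,  "10"),
        "c": (0.125, "110"),
        "d": (0.125, "111"),
    }

    avg_length = sum(p * len(cw) for p, cw in code.values())
    print(avg_length)   # 1.75 code symbols per source symbol

Here the average length (1.75 bits per symbol) equals the source entropy, since all the probabilities are negative powers of two.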

Types of Codes

A block code is called nonsingular if all its codewords are distinct.

xᵢ ≠ xⱼ ⟹ c(xᵢ) ≠ c(xⱼ)   for all i ≠ j

A code c is called uniquely decodable if any sequence of codewords can be unambiguously decoded back into a unique sequence of source symbols. This means that two different sequences of source symbols will always produce different sequences of codewords.

c*(x₁ x₂ … xₖ) = c(x₁) c(x₂) … c(xₖ) is nonsingular,

where c* denotes the extension of c to finite sequences of source symbols.
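
The distinction matters because a code can be nonsingular without being uniquely decodable. The brute-force search below (a minimal sketch; the code {0, 01, 10} is an invented example) looks for two different source sequences that produce the same concatenation:

    from itertools import product

    code = {"a": "0", "b": "01", "c": "10"}   # nonsingular: all codewords distinct

    def find_ambiguity(code, max_len=4):
        """Search for two different symbol sequences with the same encoding."""
        seen = {}
        for length in range(1, max_len + 1):
            for symbols in product(code, repeat=length):
                encoded = "".join(code[s] for s in symbols)
                if encoded in seen:
                    return seen[encoded], symbols, encoded
                seen[encoded] = symbols
        return None

    print(find_ambiguity(code))
    # (('a', 'c'), ('b', 'a'), '010'): the string "010" decodes two ways,
    # so this code is not uniquely decodable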

There are two main types of uniquely decodable codes:

  1. Non-prefix codes
  2. Prefix codes (also known as instantaneous codes)

An instantaneous code (or prefix code) is a uniquely decodable code where no codeword is a prefix of another codeword. This property allows for immediate decoding of a sequence without needing to look ahead at subsequent symbols.
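
For instance, a prefix code can be decoded greedily, emitting a symbol the moment the accumulated bits match a codeword. A minimal Python sketch (the four-codeword code here is an invented example):

    code = {"a": "0", "b": "10", "c": "110", "d": "111"}   # a prefix code
    decode_table = {cw: sym for sym, cw in code.items()}

    def is_prefix_free(codewords):
        """True if no codeword is a prefix of another codeword."""
        return not any(u != v and v.startswith(u)
                       for u in codewords for v in codewords)

    def decode(bits):
        """Decode instantaneously: emit a symbol as soon as a codeword completes."""
        out, buffer = [], ""
        for bit in bits:
            buffer += bit
            if buffer in decode_table:    # no lookahead is needed
                out.append(decode_table[buffer])
                buffer = ""
        if buffer:
            raise ValueError("input ends mid-codeword")
        return "".join(out)

    print(is_prefix_free(code.values()))   # True
    print(decode("0101100111"))            # "abcad"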

The hierarchy of codes, from most general to most restrictive, is:

  • Nonsingular Codes
  • Uniquely Decodable Codes
  • Instantaneous Codes

Data Coding and Codecs

This section discusses several types of coding used in practice; the devices or programs that implement them are referred to as codecs (coder-decoders).

  1. Line Codes

    Line codes are used to transmit digital data over a physical transmission line. They often incorporate clock synchronization information directly within the data stream, allowing the receiver to recover the clock signal from the incoming data. Examples include Manchester coding and NRZ (Non-Return-to-Zero) encoding; a small sketch of both follows.
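
    A minimal sketch of both (using the IEEE 802.3 Manchester convention, where a 0 is sent as high-then-low and a 1 as low-then-high; the ±1 levels are illustrative):

        def nrz(bits):
            """NRZ-L: one level per bit period; long runs give no transitions."""
            return [+1 if b else -1 for b in bits]

        def manchester(bits):
            """Manchester (IEEE 802.3): 0 -> high,low and 1 -> low,high, so
            every bit period has a mid-bit transition for clock recovery."""
            out = []
            for b in bits:
                out += [-1, +1] if b else [+1, -1]
            return out

        bits = [1, 0, 1, 1, 0]
        print(nrz(bits))         # [1, -1, 1, 1, -1]
        print(manchester(bits))  # [-1, 1, 1, -1, -1, 1, -1, 1, 1, -1]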

  2. Voice Codecs

    Voice codecs are designed to compress speech signals, significantly reducing the data rate required for transmission. This is crucial for applications like satellite and cellular telephony, where bandwidth is limited. A common uncompressed speech rate is 64 kbit/sec (using Pulse Code Modulation, PCM).

    Differential Pulse Code Modulation (DPCM) is a type of predictive encoding that can also be applied to video signals. It transmits the difference between the current sample and a predicted sample, rather than the sample itself; a minimal sketch follows.
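
    A minimal DPCM sketch (with the simplest possible predictor, the previous reconstructed sample, and a uniform quantizer whose step size is an invented parameter):

        def dpcm_encode(samples, step=4):
            """DPCM with a first-order predictor (previous reconstructed sample).
            Only the quantized prediction error is transmitted; the encoder
            tracks the decoder's reconstruction so errors do not accumulate."""
            residuals, prediction = [], 0
            for s in samples:
                q = round((s - prediction) / step)   # quantized prediction error
                residuals.append(q)
                prediction += q * step               # decoder's reconstruction
            return residuals

        def dpcm_decode(residuals, step=4):
            out, prediction = [], 0
            for q in residuals:
                prediction += q * step
                out.append(prediction)
            return out

        samples = [0, 10, 25, 30, 28, 20]    # invented example signal
        residuals = dpcm_encode(samples)
        print(residuals)                     # small integers around zero
        print(dpcm_decode(residuals))        # close to the original samples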

    Linear Predictive Coding (LPC) is another, more advanced predictive encoding method. LPC is often more effective than DPCM in terms of cost, quality, and processing delay. Both DPCM and LPC are widely used in satellite and cellular telephony systems.