Greedy Strategy and Dynamic Programming Algorithms

Module 2: Greedy Strategy

1. Huffman Coding (Data Compression)

Definition: A greedy algorithm used for lossless data compression. It assigns variable-length codes to characters based on their frequency.

Working Principle:

  1. Count the frequency of each character.
  2. Place characters in a priority queue (Min-Heap) based on frequency.
  3. Pick two nodes with the lowest frequencies and create a new internal node with the sum of their frequencies.
  4. Repeat until only one node (the root) remains.
  5. Assign ‘0’ to the left branch and ‘1’ to the right branch.

Example: For characters {A:5, B:9, C:12, D:13, E:16, F:45}, the most frequent character (F) gets the shortest code (e.g., ‘0’), which saves space.

Time Complexity: $O(n \log n)$.
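The steps above can be sketched in Python with a min-heap. This is a minimal illustration, not a production encoder; the tuple-based tree representation and tie-breaking counter are implementation choices, not part of the algorithm's definition.

```python
import heapq

def huffman_codes(freqs):
    """Build Huffman codes from a {char: frequency} map (minimal sketch)."""
    # Heap entries are (frequency, tie_breaker, node); a leaf node is a
    # character, an internal node is a (left, right) pair.
    heap = [(f, i, ch) for i, (ch, f) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)    # two lowest-frequency nodes
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, counter, (left, right)))
        counter += 1
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):          # internal node
            walk(node[0], prefix + "0")      # '0' on the left branch
            walk(node[1], prefix + "1")      # '1' on the right branch
        else:
            codes[node] = prefix or "0"      # single-character edge case
    walk(heap[0][2], "")
    return codes

codes = huffman_codes({"A": 5, "B": 9, "C": 12, "D": 13, "E": 16, "F": 45})
# The most frequent character, F, receives a one-bit code.
```

Running this on the example set yields a one-bit code for F and progressively longer codes for rarer characters.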

2. Fractional Knapsack Problem

Definition: A variant of the knapsack problem in which items may be divided into fractions. We have a knapsack with weight capacity $W$ and a set of items, each with a weight and a profit; the goal is to maximize total profit.

Working Principle (Greedy):

  1. Calculate the profit-to-weight ($P/W$) ratio for every item.
  2. Sort items in descending order of this ratio.
  3. Process items in that order, taking each full item while it fits.
  4. When the remaining capacity is less than the next item’s weight, take just the fraction that fills the knapsack, then stop.

Time Complexity: $O(n \log n)$ (due to sorting).
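A short sketch of this greedy strategy, using a common textbook instance (capacity 50 with items of profit/weight (60, 10), (100, 20), (120, 30)) as the example input:

```python
def fractional_knapsack(items, capacity):
    """items: list of (profit, weight) pairs; returns the maximum profit.
    Minimal sketch of the greedy strategy described above."""
    # Sort by profit-to-weight ratio, highest first.
    items = sorted(items, key=lambda pw: pw[0] / pw[1], reverse=True)
    total = 0.0
    for profit, weight in items:
        if capacity >= weight:                    # full item fits
            total += profit
            capacity -= weight
        else:                                     # take only a fraction
            total += profit * (capacity / weight)
            break
    return total

print(fractional_knapsack([(60, 10), (100, 20), (120, 30)], 50))  # 240.0
```

The ratios are 6, 5, and 4, so the first two items are taken whole and two-thirds of the third fills the remaining 20 units.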

3. P, NP, NP-Complete, and NP-Hard

  • P (Polynomial): These are problems solvable in $O(n^k)$ time (e.g., Binary Search).
  • NP (Non-deterministic Polynomial): These are problems where a solution can be verified in polynomial time (e.g., Sudoku).
  • NP-Hard: These are problems at least as hard as the hardest problems in NP. They do not necessarily have to be in NP.
  • NP-Complete: This is a problem that is both in NP and NP-Hard. If any NP-complete problem is solved in polynomial time, then all NP problems can be solved in polynomial time.

Module 3: Dynamic Programming (DP)

4. 0/1 Knapsack Problem (DP)

Definition: Unlike the fractional version, items cannot be broken; you either take the item (1) or leave it (0).

Working Principle:

  1. Create a 2D table $V[n+1][W+1]$ where $n$ is the number of items and $W$ is capacity.
  2. Formula: if $\text{weight}_i \leq w$, then $V[i, w] = \max(V[i-1, w], \text{profit}_i + V[i-1, w-\text{weight}_i])$; otherwise the item cannot fit, so $V[i, w] = V[i-1, w]$.
  3. The answer is $V[n][W]$, the last cell of the table.

Time Complexity: $O(n \times W)$.
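The tabulation above can be written directly in Python; the same example items as in the fractional case show the difference (no fractions allowed, so the answer drops from 240 to 220):

```python
def knapsack_01(profits, weights, W):
    """Bottom-up 0/1 knapsack; a sketch of the recurrence above."""
    n = len(profits)
    V = [[0] * (W + 1) for _ in range(n + 1)]   # V[0][*] = 0: no items
    for i in range(1, n + 1):
        for w in range(W + 1):
            V[i][w] = V[i - 1][w]               # option 0: leave item i
            if weights[i - 1] <= w:             # option 1: take item i if it fits
                V[i][w] = max(V[i][w],
                              profits[i - 1] + V[i - 1][w - weights[i - 1]])
    return V[n][W]                              # last cell of the table

print(knapsack_01([60, 100, 120], [10, 20, 30], 50))  # 220
```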

5. Matrix Chain Multiplication (MCM)

Definition: Given a sequence of matrices, find the most efficient way to multiply them by minimizing total scalar multiplications.

Working Principle: It uses Dynamic Programming to find the optimal placement of parentheses. It does not actually multiply the matrices; it calculates the cost.

Recursive Formula: $m[i, j] = \min_{i \leq k < j} \{m[i, k] + m[k+1, j] + p_{i-1} \cdot p_k \cdot p_j\}$.

Time Complexity: $O(n^3)$.
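A compact implementation of the recurrence, where the dimension array $p$ encodes matrices $A_i$ of size $p_{i-1} \times p_i$. The example input $[10, 20, 30, 40]$ (three matrices) is an assumption chosen for illustration:

```python
def matrix_chain_order(p):
    """p: dimension array; matrix i is p[i-1] x p[i].
    Returns the minimum number of scalar multiplications (sketch)."""
    n = len(p) - 1                         # number of matrices
    m = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):         # chain length, from short to long
        for i in range(1, n - length + 2):
            j = i + length - 1
            # Try every split point k, as in the recursive formula.
            m[i][j] = min(m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                          for k in range(i, j))
    return m[1][n]

print(matrix_chain_order([10, 20, 30, 40]))  # 18000
```

Here $(A_1 A_2) A_3$ costs $10{\cdot}20{\cdot}30 + 10{\cdot}30{\cdot}40 = 18000$, beating $A_1 (A_2 A_3)$ at $32000$.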

6. Longest Common Subsequence (LCS)

Definition: Finding the longest sequence of characters that appears in both strings in the same relative order. The characters need not be contiguous in either string.

Working Principle:

  1. Build a 2D table.
  2. If characters match ($X[i] == Y[j]$): $L[i][j] = 1 + L[i-1][j-1]$.
  3. If they do not match: $L[i][j] = \max(L[i-1][j], L[i][j-1])$.

Time Complexity: $O(m \times n)$.
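The two table rules translate directly into Python. The strings "ABCBDAB" and "BDCABA" are the standard textbook example (their LCS has length 4, e.g. "BCBA"):

```python
def lcs_length(X, Y):
    """Length of the longest common subsequence; sketch of the table above."""
    m, n = len(X), len(Y)
    L = [[0] * (n + 1) for _ in range(m + 1)]   # row/column 0: empty prefix
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:             # characters match
                L[i][j] = 1 + L[i - 1][j - 1]
            else:                                # take the better neighbour
                L[i][j] = max(L[i - 1][j], L[i][j - 1])
    return L[m][n]

print(lcs_length("ABCBDAB", "BDCABA"))  # 4
```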

7. Floyd-Warshall Algorithm

Definition: This algorithm finds the shortest paths between all pairs of vertices in a weighted graph. Negative edge weights are allowed, but negative-weight cycles are not.

Working Principle: It uses a distance matrix and iteratively updates it by considering each vertex $k$ as an intermediate point between vertices $i$ and $j$.

Formula: $D_{i,j}^{(k)} = \min(D_{i,j}^{(k-1)}, D_{i,k}^{(k-1)} + D_{k,j}^{(k-1)})$.

Time Complexity: $O(V^3)$.
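The triple loop below is a direct transcription of the update formula; the 4-vertex adjacency matrix is an assumed example, with `INF` marking missing edges:

```python
INF = float("inf")

def floyd_warshall(dist):
    """dist: adjacency matrix with INF for missing edges and 0 on the
    diagonal; updated in place to all-pairs shortest distances (sketch)."""
    n = len(dist)
    for k in range(n):                    # intermediate vertex
        for i in range(n):
            for j in range(n):
                # Relax the i -> j distance through k.
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

D = floyd_warshall([
    [0,   3,   INF, 7],
    [8,   0,   2,   INF],
    [5,   INF, 0,   1],
    [2,   INF, INF, 0],
])
```

For instance, the direct edge $0 \to 3$ of weight 7 is beaten by the path $0 \to 1 \to 2 \to 3$ of weight 6.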


Exam Writing Tips for Algorithm Questions

For a five-mark question, follow this structure:

  • Heading: (e.g., “0/1 Knapsack Problem”)
  • Short Definition: (2 lines)
  • Core Formula/Logic: (Use LaTeX-style notation if possible)
  • Small Example: (e.g., a $2 \times 2$ table or a 3-node graph)
  • Time Complexity: (e.g., $O(n^2)$ )