Greedy Strategy and Dynamic Programming Algorithms
Module 2: Greedy Strategy
1. Huffman Coding (Data Compression)
Definition: A greedy algorithm used for lossless data compression. It assigns variable-length codes to characters based on their frequency.
Working Principle:
- Count the frequency of each character.
- Place characters in a priority queue (Min-Heap) based on frequency.
- Pick two nodes with the lowest frequencies and create a new internal node with the sum of their frequencies.
- Repeat until only one node (the root) remains.
- Assign ‘0’ to the left branch and ‘1’ to the right branch.
Example: For characters {A:5, B:9, C:12, D:13, E:16, F:45}, the most frequent character (F) gets the shortest code (e.g., ‘0’), which saves space.
Time Complexity: $O(n \log n)$.
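The steps above can be sketched in Python using `heapq` as the min-heap; the function name and tree representation are my own illustration, and the frequencies are the example from these notes:

```python
import heapq

def huffman_codes(freqs):
    """Build a Huffman code table from a {symbol: frequency} map."""
    # Heap entries: (frequency, tie_breaker, tree); a tree is either a
    # symbol string (leaf) or a (left, right) pair (internal node).
    heap = [(f, i, sym) for i, (sym, f) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)     # two lowest frequencies
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, counter, (left, right)))
        counter += 1
    codes = {}
    def assign(tree, code):
        if isinstance(tree, tuple):           # internal node: recurse
            assign(tree[0], code + "0")       # '0' on the left branch
            assign(tree[1], code + "1")       # '1' on the right branch
        else:
            codes[tree] = code or "0"         # single-symbol edge case
    assign(heap[0][2], "")
    return codes

codes = huffman_codes({"A": 5, "B": 9, "C": 12, "D": 13, "E": 16, "F": 45})
```

For the example frequencies, F ends up with a one-bit code, as described above.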
2. Fractional Knapsack Problem
Definition: A problem where we have a knapsack with a weight capacity $W$ and a set of items, each with a weight and a profit. The goal is to maximize profit.
Working Principle (Greedy):
- Calculate the profit-to-weight ($P/W$) ratio for every item.
- Sort items in descending order of this ratio.
- Take the full item if it fits.
- If the remaining capacity is less than the next item’s weight, take a fraction of that item.
Time Complexity: $O(n \log n)$ (due to sorting).
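A minimal Python sketch of the greedy procedure above; the function name and the example items are illustrative (a textbook-style instance, not from these notes):

```python
def fractional_knapsack(items, capacity):
    """items: list of (profit, weight) pairs. Returns the maximum profit
    achievable when fractions of items are allowed (greedy by P/W ratio)."""
    items = sorted(items, key=lambda pw: pw[0] / pw[1], reverse=True)
    total = 0.0
    for profit, weight in items:
        if capacity == 0:
            break
        take = min(weight, capacity)          # whole item, or the fraction that fits
        total += profit * (take / weight)
        capacity -= take
    return total

# Example: three items (profit, weight) with capacity 50.
best = fractional_knapsack([(60, 10), (100, 20), (120, 30)], 50)
```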
3. P, NP, NP-Complete, and NP-Hard
- P (Polynomial): These are problems solvable in $O(n^k)$ time for some constant $k$ (e.g., Binary Search).
- NP (Non-deterministic Polynomial): These are problems where a solution can be verified in polynomial time (e.g., Sudoku).
- NP-Hard: These are problems at least as hard as the hardest problems in NP. They do not necessarily have to be in NP.
- NP-Complete: This is a problem that is both in NP and NP-Hard. If any NP-complete problem is solved in polynomial time, then all NP problems can be solved in polynomial time.
Module 3: Dynamic Programming (DP)
4. 0/1 Knapsack Problem (DP)
Definition: Unlike the fractional version, items cannot be broken; you either take the item (1) or leave it (0).
Working Principle:
- Create a 2D table $V[n+1][W+1]$ where $n$ is the number of items and $W$ is capacity.
- Formula: $V[i, w] = \max(V[i-1, w], \text{profit}_i + V[i-1, w-\text{weight}_i])$ if $\text{weight}_i \leq w$; otherwise $V[i, w] = V[i-1, w]$.
- The answer is $V[n][W]$, the bottom-right cell of the table.
Time Complexity: $O(n \times W)$.
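The table-filling logic can be sketched as follows (function name and the sample instance are my own illustration):

```python
def knapsack_01(profits, weights, W):
    """Bottom-up DP over a (n+1) x (W+1) table V; items indexed from 1."""
    n = len(profits)
    V = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for w in range(W + 1):
            V[i][w] = V[i - 1][w]                       # option 1: leave item i
            if weights[i - 1] <= w:                     # option 2: take item i if it fits
                V[i][w] = max(V[i][w],
                              profits[i - 1] + V[i - 1][w - weights[i - 1]])
    return V[n][W]                                      # answer in the last cell

# Example: profits, weights, capacity 8.
best = knapsack_01([1, 2, 5, 6], [2, 3, 4, 5], 8)
```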
5. Matrix Chain Multiplication (MCM)
Definition: Given a sequence of matrices, find the most efficient way to multiply them by minimizing total scalar multiplications.
Working Principle: It uses Dynamic Programming to find the optimal placement of parentheses. It does not actually multiply the matrices; it calculates the cost.
Recursive Formula: $m[i, j] = \min_{i \leq k < j} \{m[i, k] + m[k+1, j] + p_{i-1} \cdot p_k \cdot p_j\}$.
Time Complexity: $O(n^3)$.
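The recursive formula above fills a table bottom-up by chain length; a sketch (the dimension-list convention $p$, where matrix $A_i$ is $p_{i-1} \times p_i$, follows the formula's notation):

```python
def matrix_chain_order(p):
    """p: list of dimensions; matrix A_i is p[i-1] x p[i].
    Returns the minimum number of scalar multiplications."""
    n = len(p) - 1                            # number of matrices in the chain
    m = [[0] * (n + 1) for _ in range(n + 1)] # m[i][i] = 0: single matrix costs nothing
    for length in range(2, n + 1):            # solve longer chains from shorter ones
        for i in range(1, n - length + 2):
            j = i + length - 1
            m[i][j] = min(m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                          for k in range(i, j))   # try every split point k
    return m[1][n]
```

For $p = [1, 2, 3, 4]$ (a $1\times2$, $2\times3$, $3\times4$ chain), the minimum is 18 scalar multiplications, achieved by parenthesizing as $((A_1 A_2) A_3)$.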
6. Longest Common Subsequence (LCS)
Definition: Find the longest sequence of characters that appears in both strings in the same relative order (not necessarily contiguously).
Working Principle:
- Build a 2D table with base case $L[i][0] = L[0][j] = 0$.
- If characters match ($X[i] == Y[j]$): $L[i][j] = 1 + L[i-1][j-1]$.
- If they do not match: $L[i][j] = \max(L[i-1][j], L[i][j-1])$.
Time Complexity: $O(m \times n)$.
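The two table rules above translate directly into code (the function name and test strings are illustrative):

```python
def lcs_length(X, Y):
    """L[i][j] = length of the LCS of prefixes X[:i] and Y[:j]."""
    m, n = len(X), len(Y)
    L = [[0] * (n + 1) for _ in range(m + 1)]  # row 0 / column 0 stay 0 (base case)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:           # characters match
                L[i][j] = 1 + L[i - 1][j - 1]
            else:                              # take the better of the two neighbors
                L[i][j] = max(L[i - 1][j], L[i][j - 1])
    return L[m][n]
```

For "ABCBDAB" and "BDCABA" the LCS length is 4 ("BCBA" is one such subsequence).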
7. Floyd-Warshall Algorithm
Definition: This algorithm finds the shortest paths between all pairs of vertices in a weighted graph (negative edge weights are allowed, but negative cycles are not).
Working Principle: It uses a distance matrix and iteratively updates it by considering each vertex $k$ as an intermediate point between vertices $i$ and $j$.
Formula: $D_{i,j}^{(k)} = \min(D_{i,j}^{(k-1)}, D_{i,k}^{(k-1)} + D_{k,j}^{(k-1)})$.
Time Complexity: $O(V^3)$.
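The update formula maps directly onto a triple loop over $k$, $i$, $j$; a sketch using `float("inf")` for missing edges (the 3-vertex example graph is my own illustration):

```python
INF = float("inf")

def floyd_warshall(dist):
    """dist: V x V matrix of edge weights (INF where no edge, 0 on the diagonal).
    Returns a new matrix of all-pairs shortest-path distances."""
    V = len(dist)
    D = [row[:] for row in dist]              # work on a copy
    for k in range(V):                        # allow vertex k as an intermediate
        for i in range(V):
            for j in range(V):
                if D[i][k] + D[k][j] < D[i][j]:
                    D[i][j] = D[i][k] + D[k][j]
    return D

# Example: edges 0->1 (3), 1->2 (1), 2->0 (4).
D = floyd_warshall([[0, 3, INF],
                    [INF, 0, 1],
                    [4, INF, 0]])
```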
Exam Writing Tips for Algorithm Questions
For a five-mark question, follow this structure:
- Heading: (e.g., “0/1 Knapsack Problem”)
- Short Definition: (2 lines)
- Core Formula/Logic: (Use LaTeX-style notation if possible)
- Small Example: (e.g., a $2 \times 2$ table or a 3-node graph)
- Time Complexity: (e.g., $O(n^2)$ )
