Algorithms and Complexity
Posted on Jun 13, 2024 in Computers
*** LOOP INVARIANTS ***
Initialization:
1. State what the loop invariant says after the 0th iteration (i.e., before the first iteration).
2. State the loop invariant at α = 0.
3. "Before the first iteration, …"
4. Explain all the code that runs before the loop.
5. Conclude that the invariant is true after the 0th iteration (before the first).
Maintenance:
1. Assume that the loop invariant is true after the (α − 1)th iteration. We will prove that it holds after the αth iteration.
2. State the loop invariant at iteration α.
3. "On the αth iteration, …"
4. Explain all the code within the body of the loop (this may require showing the mathematical logic).
5. Conclude that the invariant holds after the αth iteration, proving maintenance.
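As a concrete illustration of the template above, here is a hypothetical example (not from the original notes): a running-sum loop whose invariant — "after iteration i, total equals the sum of the first i elements" — is checked with an assert at every step.

```python
def running_sum(arr):
    """Sums arr while checking a loop invariant at every iteration."""
    total = 0
    # Initialization: before the first iteration (the "0th"),
    # total == sum of arr[:0] == 0.
    assert total == sum(arr[:0])
    for i in range(len(arr)):
        total += arr[i]
        # Maintenance: after the (i+1)th iteration, total == sum of arr[:i+1].
        assert total == sum(arr[:i + 1])
    # Termination: the invariant at i = len(arr) gives total == sum(arr).
    return total
```

The asserts mirror steps 2 and 5 of the template: the invariant is checked before the loop (initialization) and after each pass (maintenance).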
*** BIG-THETA, BIG-OH, BIG-OMEGA ***
For the following definitions, recall that N+ is the set of all positive natural numbers, i.e.,
N+ = {1, 2, 3, . . .}.
Big-Theta
Let g(n) be a function. We write Θ(g(n)) to denote the set of all functions f(n) for which there exist positive real numbers c1, c2 and an n0 ∈ N+ such that for all natural numbers n ≥ n0,
0 ≤ c1g(n) ≤ f(n) ≤ c2g(n).
Big-Oh
Let g(n) be a function. We write O(g(n)) to denote the set of all functions f(n) for which there exists a positive real number c and a positive natural number n0 such that for all n ≥ n0,
0 ≤ f(n) ≤ cg(n).
O-notation provides an upper bound, but not a lower bound. Compare this to Θ-notation, which provides both. As a result, if f(n) = Θ(g(n)), then f(n) = O(g(n)), too.
Big-Omega
On the other hand, if we only want to provide a lower bound, we can use Ω-notation:
Let g(n) be a function. We write Ω(g(n)) to denote the set of all functions f(n) for which there exists a positive real number c and a positive natural number n0 such that for all n ≥ n0,
0 ≤ cg(n) ≤ f(n).
Example: Suppose that f1, f2, g1, g2 are asymptotically positive. Prove that if f1 = Θ(g1) and f2 = Θ(g2), then f1/f2 = Θ(g(n)), where g(n) = g1/g2.
Proof: By the definition of Θ, there exist positive real numbers A1, A2, B1, B2 and positive natural numbers n1, n2 such that A1g1 ≤ f1 ≤ B1g1 for all n ≥ n1, and A2g2 ≤ f2 ≤ B2g2 for all n ≥ n2. Hence for all n ≥ max{n1, n2},
(A1/B2)g(n) = (A1g1)/(B2g2) ≤ f1/f2 ≤ (B1g1)/(A2g2) = (B1/A2)g(n), where g(n) = g1/g2.
*** TRACING ALGORITHMS ***
Tracing algorithms (both iterative and recursive) in order to find the number of times a particular line runs, both when the size of the input is a concrete number (e.g., a list of size 5) and when the input is a variable (e.g., a list of size n). In the latter case, you’ll need to come up with a general formula that works for all n.
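One way to trace a line's run count is to increment a counter alongside it. The helper below is a hypothetical example (not from the original notes): the inner line of a triangular nested loop runs 0 + 1 + · · · + (n − 1) = n(n − 1)/2 times.

```python
def count_inner_runs(n):
    """Counts how many times the innermost line runs in a nested loop over n."""
    count = 0
    for i in range(n):
        for j in range(i):
            count += 1  # the line being traced
    return count

# Concrete input: for a "list of size 5" the inner line runs 5*4/2 = 10 times.
print(count_inner_runs(5))  # → 10
```

Checking a concrete size against the general formula n(n − 1)/2 is exactly the kind of verification the paragraph above describes.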
*** FINDING THE ORDER IN WHICH RECURSIVE CALLS ARE MADE IN A RECURSIVE ALGORITHM ***
*** DESIGNING A BRUTE FORCE SEARCH ALGORITHM ***
def brute_force_minimize(objective_function, search_space):
    min_value = float('inf')  # Python for "∞"
    min_theta = None
    for theta in search_space:
        value = objective_function(theta)
        if value < min_value:
            min_value = value
            min_theta = theta
    return min_theta
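A quick usage sketch of the function above (restated here so the snippet runs on its own; the quadratic objective is an illustrative choice, not from the original notes):

```python
def brute_force_minimize(objective_function, search_space):
    """Checks every candidate in search_space and keeps the best one seen."""
    min_value = float('inf')
    min_theta = None
    for theta in search_space:
        value = objective_function(theta)
        if value < min_value:
            min_value = value
            min_theta = theta
    return min_theta

# Minimize (theta - 3)^2 over the integers 0..9; the minimum is at theta = 3.
best = brute_force_minimize(lambda t: (t - 3) ** 2, range(10))
print(best)  # → 3
```

Note that the loop visits every element of the search space, which is why the running time grows with the size of that space.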
*** IDENTIFYING SEARCH SPACE OF B.F. PROBLEMS ***
The brute force strategy can be used to design algorithms for solving what are known as search problems. In this type of problem, the goal is to find an object which satisfies a certain criterion. The object is assumed to come from a set of possible solutions called the search space. The search space may be finite or infinite; if it is finite, the brute-force approach to solving the problem is to systematically check each possible solution until a valid solution is found.
Exponential Search Space:
For each point in the data, we must make a decision: does it belong to cluster A or cluster B? Since there are two choices for each of n points, the total number of possible partitions is 2ⁿ. Therefore, the size of the search space is 2ⁿ, where n is the number of data points. This means that the brute-force search has a time complexity of Ω(2ⁿ); that is, it takes exponential time.
Factorial Search Space:
Since it is a finite data set, there are finitely many orders (sequences) in which the people can speak. There are n choices for the first speaker, n − 1 for the second, n − 2 for the third, and so on, for a total of n(n − 1)(n − 2) · · · 3 · 2 · 1 = n! possible orderings (or permutations). A brute-force search to minimize Ldiff will have a time complexity of Ω(n!); that is, it will take factorial time.
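Both counts can be checked directly for small n by enumerating the search spaces with the standard library's itertools (a verification sketch, not part of the original notes):

```python
from itertools import product, permutations

n = 4

# Exponential: every assignment of n points to cluster 'A' or cluster 'B'.
partitions = list(product('AB', repeat=n))
print(len(partitions))  # → 16, i.e., 2**4

# Factorial: every speaking order of n people.
orders = list(permutations(range(n)))
print(len(orders))  # → 24, i.e., 4!
```

Enumerating is only feasible for tiny n, which is precisely the point: a brute-force search over these spaces becomes intractable very quickly.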
*** FINDING SIZE OF SEARCH SPACE AND DETERMINING IF FINITE OR INFINITE, IF EXPONENTIAL OR QUADRATIC ETC. ***
*** WRITING BASIC ALGORITHMS IN PSEUDOCODE ***
*** SOLVING BASIC RECURRENCES BY UNROLLING THEM ***
We can solve a recurrence by following the procedure below.
1. Unroll the recurrence until a pattern emerges. Let’s agree to call the “first step” of unrolling the point in time before we have unrolled the recurrence. Then at the first step we have the original recurrence: T(n) = T(n − 1) + c. The “second step” of unrolling is the point in time after unrolling once. At the second step we find: T(n) = T(n − 2) + 2c. At the third step (after unrolling twice), we have: T(n) = T(n − 3) + 3c. We could continue, but by now a pattern has emerged.
Placing these results into a table can make it easier to identify the pattern:
Step Formula
1 T(n − 1) + c
2 T(n − 2) + 2c
3 T(n − 3) + 3c
2. Find a general formula for T(n) in the kth step of unrolling. The table above makes it clear that on the kth step of unrolling we will obtain the formula T(n) = T(n − k) + k · c. You can check your general formula by plugging in a particular value of k and verifying that the result matches what you have in the table above. For instance, taking k = 3 yields T(n) = T(n − 3) + 3c, which is indeed what is in the table.
3. Calculate the step number in which the base case is reached. The argument to T decreases with each step until it eventually reaches zero, which is the base case of this recurrence. On what step does it reach zero, exactly? On the kth step, the argument is n − k. We therefore solve n − k = 0 for k, resulting in k = n. Hence the base case is reached on the nth step.
4. Replace k with this step number in the general formula for T(n). We found above that, on the kth step, T(n) = T(n − k) + k · c. Replacing k by n (the number of steps needed to reach the base case) we arrive at a new formula:
T(n) = T(n − n) + n · c
     = T(0) + n · c
We know from Equation 2.1 that T(0) = c0. So:
T(n) = c0 + n · c
At this point, we’ve removed all T’s from the right-hand side of the equation. We have therefore solved the recurrence: its solution is T(n) = c0 + n · c. You can verify that this formula produces the same result as using Equation 2.1 to recursively calculate T(n), for all integer values of n ≥ 0.
Therefore, the time it takes factorial to run on an input of size n is T(n) = c0 + c · n = Θ(n). That is, it takes linear time.
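The closed form can be checked against the recurrence directly for arbitrary constants; here is a small verification sketch (the constants c0 = 2 and c = 5 are arbitrary choices, not from the original notes):

```python
def T(n, c0=2, c=5):
    """Evaluates the recurrence T(0) = c0, T(n) = T(n - 1) + c directly."""
    if n == 0:
        return c0
    return T(n - 1, c0, c) + c

# The solution derived above: T(n) = c0 + n * c.
for n in range(20):
    assert T(n) == 2 + n * 5
print("closed form matches recurrence for n = 0..19")
```

Agreement for every tested n is not a proof (that would require induction), but it is a quick sanity check on the unrolling.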
Another Example:
Solving (MergeSort) Recurrence Relations
You can actually solve the recurrence relation given above. We’ll sketch how to do that here. We’ll write n instead of O(n) in the first line below because it makes the algebra much simpler.
T(n) = 2 T(n/2) + n
     = 2 [2 T(n/4) + n/2] + n
     = 4 T(n/4) + 2n
     = 4 [2 T(n/8) + n/4] + 2n
     = 8 T(n/8) + 3n
     = 16 T(n/16) + 4n
     = 2ᵏ T(n/2ᵏ) + k n    [this is the Eureka! line]
We know that T(1) = 1, and this is a way to end the derivation above. In particular, we want T(1) to appear on the right-hand side of the = sign. This means we want:
n/2ᵏ = 1, i.e., n = 2ᵏ, i.e., k = log₂ n.
Continuing with the previous derivation, since k = log₂ n we get:
T(n) = 2ᵏ T(n/2ᵏ) + k n
     = 2^(log₂ n) T(1) + (log₂ n) n
     = n + n log₂ n    [remember that T(1) = 1]
     = O(n log n)
So we’ve solved the recurrence relation and its solution is what we “knew” it would be. To make this a formal proof you would need to use induction to show that O(n log n) is the solution to the given recurrence relation, but the “plug and chug” method shown above shows how to derive the solution — the subsequent verification that this is the solution is something that can be left to a more advanced algorithms class.
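The derivation can be sanity-checked numerically: for powers of two, evaluating the recurrence directly should give exactly n + n log₂ n (a verification sketch, not a substitute for the inductive proof mentioned above):

```python
def T(n):
    """Evaluates T(1) = 1, T(n) = 2*T(n // 2) + n, for n a power of two."""
    if n == 1:
        return 1
    return 2 * T(n // 2) + n

# The derived solution: T(n) = n + n * log2(n), exact when n = 2**k.
for k in range(6):
    n = 2 ** k
    assert T(n) == n + n * k
print("solution verified for n = 1, 2, 4, ..., 32")
```

For example, T(8) = 2·T(4) + 8 = 2·12 + 8 = 32, and indeed 8 + 8·log₂ 8 = 8 + 24 = 32.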
- Identifying the recurrence that represents the run time of an algorithm (e.g., merge sort is T(n) = 2 T(n/2) + n)
Selection Sort:
This algorithm is easy to implement in code. The straightforward approach involves creating a new list for storing the sorted output: at each step, we pop the smallest element from the input list and append it to the output list. While that implementation is correct, it uses more memory than necessary; the version below instead sorts in place.
def selection_sort(arr):
    """In-place selection sort."""
    n = len(arr)
    if n <= 1:
        return
    for barrier_ix in range(n - 1):
        min_ix = find_minimum(arr, start=barrier_ix)
        arr[barrier_ix], arr[min_ix] = arr[min_ix], arr[barrier_ix]  # swap

def find_minimum(arr, start):
    """Finds index of minimum. Assumes arr is non-empty."""
    n = len(arr)
    min_value = arr[start]
    min_ix = start
    for i in range(start + 1, n):
        if arr[i] < min_value:
            min_value = arr[i]
            min_ix = i
    return min_ix
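A quick demonstration of selection sort (the two functions are restated here so the example runs on its own; the input list is an arbitrary illustration):

```python
def find_minimum(arr, start):
    """Finds index of the minimum of arr[start:]. Assumes arr is non-empty."""
    min_ix = start
    for i in range(start + 1, len(arr)):
        if arr[i] < arr[min_ix]:
            min_ix = i
    return min_ix

def selection_sort(arr):
    """In-place selection sort."""
    for barrier_ix in range(len(arr) - 1):
        min_ix = find_minimum(arr, start=barrier_ix)
        arr[barrier_ix], arr[min_ix] = arr[min_ix], arr[barrier_ix]  # swap

data = [64, 25, 12, 22, 11]
selection_sort(data)
print(data)  # → [11, 12, 22, 25, 64]
```

After each pass, everything to the left of the barrier index is in its final sorted position; this is the loop invariant one would prove for this algorithm.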
MergeSort (recurrence: T(n) = 2 T(n/2) + n):
import math

def mergesort(arr):
    """Sorts array using merge sort."""
    if len(arr) > 1:
        # split the array in half
        middle = math.floor(len(arr) / 2)
        left = arr[:middle]
        right = arr[middle:]
        # sort the left half
        mergesort(left)
        # sort the right half
        mergesort(right)
        # merge the left and right halves
        merge(left, right, arr)

def merge(left, right, arr):
    """Merges sorted arrays left and right into arr."""
    left.append(float('inf'))   # sentinels: never the smaller element
    right.append(float('inf'))
    for ix in range(len(arr)):
        if left[0] < right[0]:
            arr[ix] = left.pop(0)
        else:
            arr[ix] = right.pop(0)
The Mergesort Subroutine:
The sorting happens in merge! It is interesting to note that no actual sorting occurs in the main mergesort function; the sorting occurs in the merge subroutine. To see this, consider what happens when we execute mergesort([5, 2]). The function splits this array into left = [5] and right = [2] and makes recursive calls to mergesort([5]) and mergesort([2]). These recursive calls exit without doing anything, since their inputs are arrays of size one. Hence mergesort calls merge([5], [2], [5, 2]). The merge function is given two “stacks of cards”, each with one element. It takes the smaller of the two cards and places it into the first slot in arr, then places the other card into the second slot – this is where the elements are sorted! Before calling merge, arr is [5, 2]. After calling merge, arr is [2, 5], which is in sorted order.
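The walkthrough above can be reproduced directly (merge is restated here, with its sentinel values, so the snippet is self-contained):

```python
def merge(left, right, arr):
    """Merges sorted lists left and right into arr."""
    left = left + [float('inf')]    # sentinel: never the smaller card
    right = right + [float('inf')]
    for ix in range(len(arr)):
        if left[0] < right[0]:
            arr[ix] = left.pop(0)
        else:
            arr[ix] = right.pop(0)

arr = [5, 2]
merge([5], [2], arr)  # the call made by mergesort([5, 2])
print(arr)  # → [2, 5]
```

The sentinel ∞ values guarantee that whichever list empties first never wins a comparison, so the loop can simply run for len(arr) steps.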
*** DETERMINING THE TIME COMPLEXITY OF A B.F. ALGORITHM ***
Suppose T1(n), . . . , T6(n) are functions describing the runtime of six algorithms. Furthermore, suppose we have the following bounds on each function:
T1(n) = Θ(n²)
T2(n) = O(n³)
T3(n) = Ω(n)
T4(n) = O(n⁴) and T4(n) = Ω(n²)
T5(n) = Θ(n⁵)
T6(n) = Θ(√n)
Solutions
a) T1(n) + T2(n). Solution: O(n³) and Ω(n²).
b) T1(n) + T5(n) + T2(n). Solution: Θ(n⁵).
c) T4(n) + T5(n). Solution: Θ(n⁵).
d) T1(n) + T4(n). Solution: O(n⁴) and Ω(n²).
e) T3(n) + T6(n). Solution: Ω(n).
f) T2(n) + T3(n) + T4(n) + T5(n). Solution: Ω(n⁵).
*** USING ASSUMPTION THAT INPUT IS SORTED TO MAKE MORE EFFICIENT SOLUTIONS ***
*** UNDERSTANDING SELECTIONSORT, MERGESORT AND MERGESORT SUBROUTINE ***
*** BINARY SEARCH ***
import math

def binary_search(things, target, start, stop):
    """
    Searches things[start:stop] for the target.
    Assumes that `things` is sorted.
    """
    if start >= stop:
        return None
    middle = math.floor((start + stop) / 2)
    if things[middle] == target:
        return middle
    elif things[middle] > target:
        return binary_search(things, target, start, middle)
    else:
        return binary_search(things, target, middle + 1, stop)
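A usage sketch of binary search on a sorted list (the function is restated so the snippet runs standalone; the list of primes is an arbitrary illustration):

```python
import math

def binary_search(things, target, start, stop):
    """Searches sorted things[start:stop] for target; returns an index or None."""
    if start >= stop:
        return None
    middle = math.floor((start + stop) / 2)
    if things[middle] == target:
        return middle
    elif things[middle] > target:
        return binary_search(things, target, start, middle)
    else:
        return binary_search(things, target, middle + 1, stop)

data = [2, 3, 5, 7, 11, 13]
print(binary_search(data, 7, 0, len(data)))   # → 3
print(binary_search(data, 4, 0, len(data)))   # → None
```

Each recursive call halves the slice under consideration, which is where the Θ(log n) runtime comes from.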
*** TIME COMPLEXITY OF MERGE AND SELECTION ***
We have seen that the time complexity of merge sort is Θ(n log n). Compare this to the time complexity of selection sort, which is Θ(n²).
*** IMPORTANT RECURRENCE RELATION FORMULAS ***
Factorial: T(0) = c0; T(n) = T(n − 1) + c for n ≥ 1.
MergeSort: T(n) = 1 for n ≤ 1; T(n) = 2 T(n/2) + n for n ≥ 2.
SelectionSort: T(n) = T(n − 1) + (n − 1), which unrolls to the total
(n − 1) + (n − 2) + . . . + 3 + 2 + 1.
This is the sum of the first n − 1 natural numbers. Recall (from the first homework) that the sum of the first n natural numbers is n(n + 1)/2. Here we have the first n − 1 numbers, and so we replace n by n − 1 to obtain (n − 1)n/2 for the total number of times that the loop body in find_minimum runs over the course of selection sort. Since n(n − 1)/2 = Θ(n²), we say that the sort runs in quadratic time.
BinarySearch (worst case): T(1) = 1; T(n) = T(n/2) + 1 for n ≥ 2.
Binary search runtime: Θ(log n).
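These recurrences can be sanity-checked numerically. For instance, the worst-case binary search recurrence unrolls to 1 + log₂ n for powers of two (a verification sketch, not from the original notes):

```python
def T_binary(n):
    """Evaluates T(1) = 1, T(n) = T(n // 2) + 1, for n a power of two."""
    if n == 1:
        return 1
    return T_binary(n // 2) + 1

# Unrolling gives T(n) = 1 + log2(n), exact when n = 2**k.
for k in range(8):
    n = 2 ** k
    assert T_binary(n) == 1 + k
print("worst-case binary search makes 1 + log2(n) calls")
```

This matches the stated Θ(log n) runtime: each recursive call halves n, so only about log₂ n halvings fit before reaching the base case.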