1. Why it is much easier to do algorithm analysis than it is to do algorithm design. Your answer should include an example of a problem for which an algorithm already exists and all that is required of you is to analyze it, versus another problem for which no algorithm exists and you are required to design one.

When one does algorithm analysis, there is already an algorithm to be analyzed. What one does then is apply the rules of analysis as we studied them in class – rules for the sequence structure, rules for the selection structure, rules for the repetition/iteration structure, and recurrence rules for when the algorithm involves recursion. Granted, some algorithms have these structures nested and non-trivially composed, but there is a “specimen” to be studied and analyzed. Analyzing an existing algorithm such as INSERTION-SORT, for instance, amounts to applying these rules to its pair of nested loops. This is not the case when one has to design an algorithm. One may be fortunate enough to be assigned a problem for which an algorithm has already been designed, so that all one has to do is tailor-fit the existing algorithm. Or one may be lucky enough to be given a problem that is simply a variant of another for which an algorithm already exists, and all that is needed is an adjustment – perhaps in the interface, perhaps in how the data are input, perhaps in how the output is presented, perhaps in problem-size-related parameters. At times, though, the problem is novel and no algorithm for it yet exists. One then has to devise a solution using known algorithmic approaches where applicable: an incremental approach or a recursive approach come to mind. The task becomes extra difficult when one has to demonstrate that the algorithm is “correct,” because this is often an intractable exercise. The definition of algorithm correctness requires that it produce the expected output for all possible inputs. For truly critical problems, a proof of correctness is required (think software for medical devices or for managing subway trains). In most instances, testing is the (sole) means used to show program “correctness.” But one can only test an algorithm that has already been designed and implemented. Designing is therefore almost always much harder than analyzing an existing algorithm.
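To make those rules concrete, here is a small sketch of my own (the function names and the example are invented for illustration): the first function is analyzed structure by structure in its comments, and the second turns a MERGE-SORT-style recurrence into code.

```python
# Illustrative sketch (names and example are my own, not from the text):
# analyzing a function structure by structure, plus a recurrence made concrete.

def sum_of_zero_pairs(a):
    """Count pairs (i, j) with i < j and a[i] + a[j] == 0."""
    count = 0                       # sequence: Theta(1)
    n = len(a)                      # sequence: Theta(1)
    for i in range(n):              # repetition: n iterations
        for j in range(i + 1, n):   # nested repetition: about n^2 / 2 iterations in all
            if a[i] + a[j] == 0:    # selection: Theta(1) per test
                count += 1
    return count                    # overall: Theta(n^2)

def divide_and_conquer_cost(n):
    """Recurrence rule: T(n) = 2 T(n/2) + n, T(1) = 1, which solves to Theta(n lg n).
    Evaluating it numerically just makes the recurrence concrete."""
    if n <= 1:
        return 1
    return 2 * divide_and_conquer_cost(n // 2) + n

print(sum_of_zero_pairs([3, -3, 1, 0, -1]))   # 2 pairs: (3, -3) and (1, -1)
print(divide_and_conquer_cost(1024))          # 11264 = 1024 * (lg 1024 + 1)
```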
2. What graphs are and their importance in the design and analysis of algorithms.

Graphs are a means of representing relationships between objects in a, well, “graphical” way. They are perhaps the second simplest mathematical structure for this reason – if a set is the simplest mathematical structure, then a graph, being defined as two sets, a vertex set and an edge set, must be close behind. For simple graphs, the edge set is a subset of the Cartesian product of the vertex set with itself, so it is as if a (simple) graph were made of one-and-a-half sets! As basic structures, graphs can represent all sorts of objects and the relationships among them. Even programs can be represented with graphs – the vertices can be the statements in the program and the edges can represent the flow of logic, a representation used prominently in software engineering theory. In the design of algorithms, graph-theoretic approaches help solve shortest-path problems, maximum-flow problems, search problems, and many more. In the analysis of algorithms, graphs (in particular, trees) have been used prominently in proving, among other things, lower-bound results – the Ω(n lg n) lower bound on binary-comparison sorting comes to mind.
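To make the “two sets” definition concrete, here is a minimal sketch of my own (not from the text): it builds an adjacency list from a vertex set and an edge set and answers an unweighted shortest-path query – one of the graph problems mentioned above – with breadth-first search.

```python
from collections import deque

# Illustrative sketch: a graph as a vertex set plus an edge set, and an
# unweighted shortest-path query answered by breadth-first search (BFS).

def build_adjacency(vertices, edges):
    """Turn a vertex set and an edge set into an adjacency-list dictionary."""
    adj = {v: [] for v in vertices}
    for u, v in edges:           # undirected: record the edge in both directions
        adj[u].append(v)
        adj[v].append(u)
    return adj

def bfs_shortest_path_length(adj, source, target):
    """Return the number of edges on a shortest source-target path, or None."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            return dist[u]
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return None                  # target unreachable from source

# Example usage with a hypothetical 5-vertex graph:
V = {1, 2, 3, 4, 5}
E = {(1, 2), (2, 3), (3, 4), (4, 5), (1, 5)}
adj = build_adjacency(V, E)
print(bfs_shortest_path_length(adj, 1, 4))   # prints 2 (path 1-5-4)
```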
3. Briefly define what an NP-complete problem is and then comment on the following statement: “Perhaps the most compelling reason why theoretical computer scientists believe that P ≠ NP comes from the existence of the class of NP-complete problems.”

Formally speaking, an NP-complete problem is a problem that is in NP and to which every other problem in NP can be transformed (reduced) in polynomial time. That is, we can solve any problem in NP by transforming it into the NP-complete problem, solving the transformed version, and then deriving the solution to the original problem from that solution. There are literally thousands of NP-complete problems, and over the span of around 40 years the best algorithm designers and complexity-theory minds have not devised a polynomial-time solution to any of them. With so many “targets” for such an effort and so many minds working toward a deterministic polynomial-time solution, why has not even one of them been solved polynomially? If even just one of the thousands were, then all problems in NP could be solved in deterministic polynomial time (that is, NP collapses into P, i.e., P = NP). Perhaps the reason is that the opposite is true: P is properly contained in NP, and hence P ≠ NP.

4. Explain the differences and similarities (if any) between a heap and a binary search tree. Can a binary tree be both?

Obviously both are binary tree structures and both contain values in their nodes. The fundamental difference is the ordering condition each imposes on the children of a node. For a (max) heap, both subtrees hanging off a node contain values that are less than the value stored at that node (assuming distinct values). For a binary search tree, the left subtree contains values less than the one stored at the node, while the right subtree contains values greater than it. This “condition” holds at every level of the tree structure. So a binary tree can be both only if right subtrees are absent: for example, an empty tree, a tree with just one node, or one that consists of just a root and its left child (if one honors the additional requirement that a heap must be a complete binary tree), or a binary tree whose nodes have only left children (if that additional requirement is not honored).
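The contrast between the two ordering conditions is easy to see when both are written as checks over the same tree. Below is a minimal sketch of my own (not from the text); it assumes distinct keys and deliberately ignores the heap’s completeness/shape requirement, checking only the ordering conditions.

```python
# Illustrative sketch (my own, not from the text): checking the two ordering
# conditions on the same linked binary-tree representation, assuming distinct keys.

class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def satisfies_max_heap_order(node):
    """Every child's key is less than its parent's key (shape requirement ignored)."""
    if node is None:
        return True
    for child in (node.left, node.right):
        if child is not None and child.key >= node.key:
            return False
    return (satisfies_max_heap_order(node.left) and
            satisfies_max_heap_order(node.right))

def is_bst(node, lo=float("-inf"), hi=float("inf")):
    """Every key lies strictly between the bounds inherited from its ancestors."""
    if node is None:
        return True
    if not (lo < node.key < hi):
        return False
    return is_bst(node.left, lo, node.key) and is_bst(node.right, node.key, hi)

# A root with only a left child satisfies both conditions ...
both = Node(10, left=Node(5))
print(satisfies_max_heap_order(both), is_bst(both))          # True True
# ... but a right child forces a conflict: the heap wants it smaller, the BST larger.
conflict = Node(10, left=Node(5), right=Node(7))
print(satisfies_max_heap_order(conflict), is_bst(conflict))  # True False
```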
5. Why MERGE-SORT is not an “in-place” sorting algorithm.

An “in-place” sorting algorithm requires no more than O(1) additional memory to carry out its sorting work. This is the case with INSERTION-SORT and HEAPSORT. MERGE-SORT, however, makes use of an auxiliary array of size n (the same as the size of the input array) to receive the result of the merge step after the two recursive calls, one on each half of the input array, have returned the two halves sorted within themselves. Hence it uses Θ(n) additional space, which disqualifies it from being an “in-place” sorting algorithm. MERGE-SORT still has a role in sorting, in any case. It is useful for sorting huge collections of values because it can perform the merge step on “page-sized” portions of the results from lower-level recursive calls and store partial results of the merge in the same manner. So when the set of values to be sorted cannot fit in the computer’s RAM, MERGE-SORT can come to the rescue.

6. Differences between sorting algorithms that submit to the Ω(n lg n) lower bound and those that can achieve O(n) complexity. Supply examples as necessary.

The key phrase here is “binary comparison.” Sorting algorithms that base their sorting “logic” on binary comparisons of the items to be sorted cannot escape the Ω(n lg n) lower bound on the worst-case complexity of the algorithm. This is the case with INSERTION-SORT, BUBBLE-SORT, MERGE-SORT, and HEAPSORT: their worst-case complexity is Ω(n lg n). On the other hand, algorithms that do not base their sorting “logic” on binary comparisons can achieve the absolute lower bound on the worst-case complexity of any sorting algorithm, Ω(n) – that is, they can run in linear time (under suitable assumptions about the keys). Recall that this absolute bound is due to the fact that any sorting algorithm must examine all the values to be sorted; neglecting even one of them can mean an incorrect algorithm. Two deterministic algorithms discussed in Chapter 8 achieve this: COUNTING-SORT and RADIX-SORT. The third algorithm, BUCKET-SORT, achieves the bound in its expected complexity. Chapter 8 begins with a demonstration of the absolute lower bound on the worst-case complexity of sorting based on binary comparisons – a clever demonstration that uses decision trees. It then goes on to show that by “changing the rules” one can “beat” this lower bound. In the future, if anyone claims to have invented a sorting algorithm that works in O(n) worst-case time, then either it does not do its sorting based on binary comparisons, or his/her analysis of the algorithm is flawed. There could be a third reason for the claim – s/he is using a different model of computation in which the unit cost is not the same as the unit cost we have been using. This may be the case with a new paradigm of computation, quantum computing, but that is a topic for another course beyond CSCI 4101.
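As a concrete illustration of “changing the rules,” here is a sketch of a counting sort in Python – my own rendering, not the textbook’s pseudocode. It never compares two keys with each other; it indexes an array of counters instead, and it runs in Θ(n + k) time for n keys drawn from the range 0..k.

```python
# Sketch of a non-comparison sort (a rendering of counting sort, not the
# textbook's pseudocode). Keys are assumed to be integers in the range 0..k.

def counting_sort(a, k):
    """Return a sorted copy of a, stably, in Theta(n + k) time."""
    count = [0] * (k + 1)
    for key in a:                 # tally how many times each key occurs
        count[key] += 1
    for i in range(1, k + 1):     # prefix sums: count[i] = number of keys <= i
        count[i] += count[i - 1]
    out = [None] * len(a)
    for key in reversed(a):       # place each key at its final position (stable)
        count[key] -= 1
        out[count[key]] = key
    return out

print(counting_sort([4, 1, 3, 4, 0, 2, 1], k=4))   # [0, 1, 1, 2, 3, 4, 4]
```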
7. Justify the choice of time complexity as the primary criterion when comparing alternative algorithms to solve a problem. Why is it usually chosen ahead of space complexity for comparison purposes? What about the development time of software designers?

Time complexity is the primary criterion used when comparing alternative algorithms to solve a problem because time is irrecoverable. Space can be recycled or can be expanded by adding more memory to the computing device. Time, on the other hand, cannot be expanded per se – one can virtually do so by using a faster computer, but that enhancement is probably more costly than simply increasing available memory. Of course, this discussion presupposes that the alternative solutions are both correct; otherwise one or both would be useless, since they do not really solve the problem.

We do not discuss other criteria that may be relevant to real-world operations that develop software solutions because there is more variability in them. Take, for example, the development time spent by software designers. Some programmers are simply more efficient at producing code than others, so development time might not be an accurate gauge of which alternative solution should be adopted. On the one hand, a more efficient solution might take forever to program, negating any savings in running time. On the other hand, a less efficient solution might take little time to implement, but if the resulting program has to be run many, many times over the span of several years, then the more efficient solution might still be preferable, since the total time savings would be significant. Also, different companies have different pay scales, so where labor is cheap it may be advantageous to have the more efficient algorithm implemented even if doing so takes a long time, whereas where labor costs are steep, a quick-and-dirty solution that is not the most efficient may be preferable, especially if the resulting program is going to be run only a small number of times. Finally, there may be reasons, pertaining to how difficult it is to prove in a mathematically rigorous way that an algorithm is correct, why correctness (in the total sense) is not made the primary criterion.
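To make the development-time versus running-time trade-off concrete, here is a small back-of-the-envelope sketch (the numbers and names are invented purely for illustration): the total cost of an alternative is its one-time implementation effort plus its per-run time multiplied by the number of times it will be run.

```python
# Back-of-the-envelope sketch with invented numbers: total cost (in hours) of an
# alternative = developer time to implement it + runs over its lifetime * hours per run.

def total_hours(dev_hours, runs, hours_per_run):
    return dev_hours + runs * hours_per_run

quick_and_dirty = dict(dev_hours=40,  hours_per_run=2.0)   # fast to write, slow to run
carefully_tuned = dict(dev_hours=400, hours_per_run=0.5)   # slow to write, fast to run

for runs in (10, 100, 1000):
    q = total_hours(quick_and_dirty["dev_hours"], runs, quick_and_dirty["hours_per_run"])
    c = total_hours(carefully_tuned["dev_hours"], runs, carefully_tuned["hours_per_run"])
    print(f"{runs:5d} runs: quick-and-dirty {q:7.1f} h, tuned {c:7.1f} h")
# With few runs the quick-and-dirty version costs less overall; with many runs the tuned one does.
```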