COMP 3403 — Algorithm Analysis
Part 6 — Chapters 11 and 12

Jim Diamond
CAR 409, Jodrey School of Computer Science
Acadia University
Chapter 11
Limitations of Algorithm Power: Computational Complexity
Chapter 11 188 Review and Looking Forward
• So far, we have looked at
  – techniques for analyzing the time complexity and space complexity of a given algorithm
  – a number of algorithm design techniques
  – the analysis of most of these algorithms
• We must also consider the following question: are (any of) these algorithms the best possible?
• In other words, what is the minimum possible amount of time (or space) required to solve a given problem?
  – that is, what is the lower bound of the complexity of any algorithm which solves the given problem?
Chapter 11 189 Lower Bounds
• Lower bound: an estimate on the minimum amount of work needed to solve a given problem
• Examples:
  – the number of comparisons needed to find the largest element in a set of n numbers
  – the number of comparisons needed to sort an array of size n
  – the number of multiplications needed to multiply two n × n matrices
• A lower bound can be
  – an exact count
  – an efficiency class ( Ω() )
• A lower bound is said to be tight if there exists an algorithm with the same efficiency as the lower bound
Chapter 11 190 Lower Bound Examples

  Problem                            Lower bound    Tight?
  sorting                            Ω(n log n)     yes
  searching in a sorted array        Ω(log n)       yes
  element uniqueness                 Ω(n log n)     yes
  n-digit integer multiplication     Ω(n)           no
  multiplication of n × n matrices   Ω(n²)          no

• For example, we saw that we could multiply two n × n matrices in O(n^2.72) time (roughly), but there is no obvious reason why we need more than n² time to do this
  – goal: either find an O(n²) algorithm for matrix multiplication, or prove that ω(n²) time is required
Chapter 11 191 Techniques for Establishing Lower Bounds
• Trivial lower bounds
• Information-theoretic arguments (decision trees)
• Adversary arguments
• Problem reduction
Chapter 11 192 Trivial Lower Bounds
• Idea: just counting the amount of given input or required output may give us a useful lower bound
• Examples:
  – to multiply two n × n matrices, we must read 2·n² input numbers and write n² output numbers: thus we need at least n² time (i.e., Ω(n²) time)
  – to find the maximum number in an array of n numbers, each number must be examined, taking Ω(n) time
  – polynomial evaluation: p(x) = aₙxⁿ + aₙ₋₁xⁿ⁻¹ + ··· + a₁x + a₀; to evaluate this, we must look at every coefficient, and thus we have to do Ω(n) work (tight, because we have O(n) algorithms for this; see the sketch below)
• Counting the amount of input is not always useful; e.g.,
  – the amount of input/output may be tiny compared to the work done by any known algorithm (such as in the travelling salesman problem)
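Below is a minimal sketch (ours, not from the course notes) of Horner's rule, one of the O(n) polynomial-evaluation algorithms alluded to above; the function name and the highest-degree-first coefficient order are our own choices.

    def horner(coeffs, x):
        """Evaluate p(x) given coefficients [a_n, ..., a_1, a_0], highest degree first.

        Each coefficient is read exactly once, so the running time is Theta(n),
        matching the trivial Omega(n) lower bound for polynomial evaluation.
        """
        result = 0
        for a in coeffs:
            result = result * x + a
        return result

    # Example: p(x) = 2x^3 - 6x^2 + 2x - 1 evaluated at x = 3
    print(horner([2, -6, 2, -1], 3))   # prints 5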
Chapter 11 193 Information Theoretic Arguments
• Information theory deals with (among other things) the amount of information contained in a given collection of data
• Q: how much information is in this image? That is, if you were talking to someone on the phone and had to describe this image to them, how much would you have to say so that they could draw the same picture?
  – a 390-byte PostScript program creates this image
Chapter 11 194 Information Theoretic Arguments
• An information-theoretic approach to lower bounds uses the amount of information which any algorithm solving the problem must generate
  – e.g., consider the game of guessing a number m between 0 and n − 1: there are n possible answers, which means we need ⌈log₂ n⌉ bits to represent the answer
  – if each question we ask (e.g., "is m bigger than 31?") gains one bit of information, we must ask ⌈log₂ n⌉ questions, at a minimum, on average
  – note: if you ask a question like "is m 42?" you might be lucky and get the answer with one question, but if the guess is wrong you have gained very little information, increasing the overall number of questions you must ask
• Note that you can ask your questions in such a way that you fill in the bits of the binary representation of m (see the sketch below)
  – e.g., the first question could be "is m ≥ n/2?", which determines the high-order bit
  – Q: if the answer is no, what is the next question?
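A minimal sketch (ours) of the bit-filling strategy: each "is m ≥ t?" question halves the candidate range, so ⌈log₂ n⌉ questions always suffice. The names are our own; "is_ge" models the person answering the questions.

    def guess_number(n, is_ge):
        """Determine m in [0, n) using questions of the form "is m >= t?".

        Each answer halves the candidate range, i.e., yields one bit of
        information, so ceil(log2 n) questions always suffice.
        """
        lo, hi = 0, n                  # invariant: lo <= m < hi
        questions = 0
        while hi - lo > 1:
            mid = (lo + hi) // 2
            questions += 1
            if is_ge(mid):             # ask: "is m >= mid?"
                lo = mid
            else:
                hi = mid
        return lo, questions

    # Example: the secret is m = 42, n = 100; ceil(log2 100) = 7 questions
    m = 42
    print(guess_number(100, lambda t: m >= t))   # prints (42, 7)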
Chapter 11 195 Information Theoretic Arguments: Decision Trees
• Many algorithms for sorting and searching data involve a sequence of comparison operations applied to the input data
• We can represent these algorithms pictorially; example:
  [figure: decision tree for finding the minimum of three numbers]
• Note that in this case there are more leaves on the tree than there are possible answers (but since ⌈log₂ 3⌉ = 2, we must ask 2 questions, so the tree must have height at least 2); a comparison-based version of this tree appears in the sketch below
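The figure is not reproduced here, but such a decision tree corresponds directly to nested comparisons, as in this sketch (ours): each "if" is an internal node and each "return" is a leaf.

    def min3(a, b, c):
        """Minimum of three numbers via explicit comparisons.

        The nested if/else structure mirrors a decision tree: every path
        from the root makes exactly 2 comparisons, matching the
        ceil(log2 3) = 2 information-theoretic height bound, and there
        are 4 leaves for only 3 possible answers.
        """
        if a < b:
            return a if a < c else c   # leaves: a, c
        else:
            return b if b < c else c   # leaves: b, c

    print(min3(3, 1, 2))   # prints 1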
Chapter 11 196 Information Theoretic Arguments: Decision Trees: 2
• Here is a decision tree for selection sort:
  [figure: decision tree for selection sort]
• Note: selection sort does redundant comparisons!
Chapter 11 197 Information Theoretic Arguments: Decision Trees: 3
• A decision tree's leaves may not all be at the same depth
  – [figure: decision tree for the three-element insertion sort]
• Note that no redundant comparisons are done!
• In the worst case, ⌈log₂ 6⌉ = 3 questions are needed (the sketch below verifies this by brute force)
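A quick brute-force check (ours) of the claim: run a comparison-counting insertion sort on all 3! = 6 permutations and confirm that the worst case is exactly ⌈log₂ 6⌉ = 3 comparisons, with some inputs (the shallower leaves) needing only 2.

    from itertools import permutations

    def insertion_sort_count(a):
        """Insertion sort that also counts key comparisons."""
        a = list(a)
        comparisons = 0
        for i in range(1, len(a)):
            key, j = a[i], i - 1
            while j >= 0:
                comparisons += 1           # one comparison of key with a[j]
                if a[j] > key:
                    a[j + 1] = a[j]        # shift and keep scanning left
                    j -= 1
                else:
                    break
            a[j + 1] = key
        return a, comparisons

    counts = [insertion_sort_count(p)[1] for p in permutations((1, 2, 3))]
    print(counts, max(counts))   # prints [2, 3, 2, 3, 3, 3] 3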
Chapter 11 198 Information Theoretic Arguments: Decision Trees: 4
• Consider the problem of sorting n distinct numbers:
  – there are n! possible permutations of n numbers
  – sorting these numbers is equivalent to finding the permutation which sorts the numbers (cf. the APL sort operator!)
• The decision tree for such a sort must have at least n! leaves
  – this means the height of the tree (== the number of comparisons done from the root to the farthest leaf) must be ≥ ⌈log₂ n!⌉
  – thus the worst-case number of comparisons C_worst(n) ≥ ⌈log₂ n!⌉
• Applying Stirling's formula, we get
  ⌈log₂ n!⌉ ≈ log₂(√(2πn) (n/e)ⁿ) = n log₂ n − n log₂ e + (log₂ n)/2 + (log₂ 2π)/2 ≈ n log₂ n
  – thus about n log₂ n comparisons are required in the worst case by any comparison-based sorting algorithm (the sketch below compares the exact and approximate bounds)
• A more difficult analysis shows C_avg(n) ≥ log₂ n!  Surprising?
• Since we have O(n log₂ n) (average and worst case) sorting algorithms, these bounds are (asymptotically) tight
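A small numerical check (ours) of the bound: compute the exact ⌈log₂ n!⌉ and compare it with the leading term n log₂ n of the Stirling approximation.

    import math

    def sort_lower_bound(n):
        """Exact information-theoretic bound ceil(log2 n!) on the worst-case
        number of comparisons needed to sort n distinct keys."""
        return math.ceil(sum(math.log2(k) for k in range(2, n + 1)))

    for n in (10, 100, 1000):
        print(n, sort_lower_bound(n), round(n * math.log2(n)))
    # prints:
    # 10 22 33
    # 100 525 664
    # 1000 8530 9966

The gap between the two columns comes from the −n log₂ e term, but asymptotically n log₂ n dominates.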
Chapter 11 199 Lower Bounds by Problem Reduction
• Idea: if some problem P is at least as hard as another problem Q, then a lower bound for Q is also a lower bound for P
  – common mistake: reversing this and saying ". . . then a lower bound for P is also a lower bound for Q"
• To use the problem-reduction idea, find a problem Q with a known lower bound that can be reduced† to your problem P in question
  – the reduction must show how we can turn any instance of Q into an instance of P, so that a solution to the P instance gives us a solution to the Q instance
† the reduction procedure must follow some technical rules for validity
Chapter 11 200 Lower Bounds by Problem Reduction: Example
• Example: suppose
  – P is the problem of finding a MST for n points in the Cartesian plane
  – Q is the element uniqueness problem
• The reduction from Q to P proceeds as follows (sketched in code below):
  – take an instance of Q: { x₁, x₂, . . . , xₙ }
  – create a corresponding instance of P: { (x₁, 0), (x₂, 0), . . . , (xₙ, 0) }
  – observe: the Q instance has two equal elements ⇐⇒ P's MST has a 0-length edge (GEQ: why?)
  – thus we can conclude that P (MST) must be at least as hard as Q (uniqueness), and thus (the time complexity of) MST is in Ω(n log n)
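A sketch (ours) of this reduction; "mst" stands for any black-box geometric MST solver returning a list of edges, and "line_mst" is a toy stand-in that is only correct for collinear points.

    def has_duplicate(xs, mst):
        """Decide element uniqueness for xs via one call to an MST solver."""
        points = [(x, 0) for x in xs]      # turn the Q instance into a P instance
        edges = mst(points)                # solve the P instance
        # xs has two equal elements  <=>  the MST has a 0-length edge
        return any(p == q for p, q in edges)

    def line_mst(points):
        """Toy MST for points on the x-axis: sort and connect neighbours."""
        pts = sorted(points)
        return list(zip(pts, pts[1:]))

    print(has_duplicate([3, 1, 4, 1, 5], line_mst))   # prints True
    print(has_duplicate([3, 1, 4, 2, 5], line_mst))   # prints False

Note the direction: a fast MST algorithm would yield a fast uniqueness algorithm, so uniqueness's Ω(n log n) bound transfers to MST.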
Chapter 11 201 Adversary Arguments
• Also known as "devil's advocate" arguments
• Idea: you employ your algorithm to try to solve some problem without seeing the data ⇒ the operations are done by an "adversary", who does the calculations, announces the results of comparisons, and so on
  – the adversary must play "fairly" in the sense that there must be some set of data which would result in the stated results
• One way to think about this is an adversary which generates bad data in "real time" (see the sketch below)
• Example: in quicksort, it is necessary to pick a "good" pivot to avoid O(n²) behaviour
  – suppose the pivot is chosen as the median of three elements of the sub-array (say the first, middle and last)
  – the adversary would generate data for which the median of those three values is the second smallest of all values in the sub-array
  – repeating this, your quicksort would run in O(n²) time
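A minimal sketch (ours, simpler than the quicksort construction) of a "real-time" adversary, applied to the number-guessing game from slide 194: it never commits to a value, and on each "is m ≥ t?" question it keeps whichever side of the still-consistent candidates is larger, forcing any questioner to use ⌈log₂ n⌉ questions.

    class Adversary:
        """Answers "is m >= t?" questions without ever fixing m.

        It tracks the interval [lo, hi) of values consistent with all
        answers so far and always keeps the larger side alive.  It plays
        "fairly": at every moment, some concrete m in [lo, hi) would have
        produced exactly the answers given so far.
        """
        def __init__(self, n):
            self.lo, self.hi = 0, n
            self.questions = 0

        def is_ge(self, t):
            self.questions += 1
            if self.hi - t >= t - self.lo:     # "yes" side has at least as many candidates
                self.lo = max(self.lo, t)      # answer "yes": m >= t
                return True
            self.hi = min(self.hi, t)          # answer "no": m < t
            return False

    # Any questioning strategy fares the same; e.g., repeated halving:
    adv = Adversary(100)
    lo, hi = 0, 100
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if adv.is_ge(mid):
            lo = mid
        else:
            hi = mid
    print(lo, adv.questions)   # prints the single surviving value and 7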