A closer look at big-O notation

  1. A closer look at big-O notation. We all know that in a formula y = ax + b the values of both a (slope) and b (intercept) are important. If y is the cost of executing a program on a problem of size x, then
  • b determines the value at x = 0 – the fixed cost of execution;
  • a determines how fast the cost grows as the problem size increases.
  a is the constant of proportionality of a linear-cost program.
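  For instance, with a = 5 and b = 25 (one of the lines plotted on the next slide), a problem of size 0 costs 25 – the fixed set-up cost – and every extra unit of problem size adds 5 to the bill.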

  2. It’s obviously untrue that all linear-cost programs have the same execution time:
  [Chart: the lines 2x+180 and 5x+25 plotted for x from 0 to 100, costs up to 600.]
  The 2x cost grows more slowly, so the corresponding program is to be preferred on ‘sufficiently large’ problems. But all the problems we ever consider may be smaller than 50, so the 5x program might be better for us.
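  A minimal sketch of the comparison (Java, my code; the two cost formulas are the ones in the chart) that tabulates which program is cheaper at each size:

    // Compare the two linear cost models from the chart: 2x+180 and 5x+25.
    // Names and structure are illustrative, not from the course materials.
    public class LinearCrossover {
        static double costA(double x) { return 2 * x + 180; } // small slope, large fixed cost
        static double costB(double x) { return 5 * x + 25; }  // large slope, small fixed cost

        public static void main(String[] args) {
            for (int x = 0; x <= 100; x += 10) {
                String cheaper = costA(x) < costB(x) ? "2x+180" : "5x+25";
                System.out.printf("x=%3d: 2x+180=%5.0f, 5x+25=%5.0f -> cheaper: %s%n",
                        x, costA(x), costB(x), cheaper);
            }
            // Algebraically: 2x+180 = 5x+25 when x = 155/3, about 51.7,
            // so 5x+25 wins below about 52 and 2x+180 wins above.
        }
    }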

  3. You should already be persuaded that whatever the constants of proportionality, x^2 formulæ will overtake x formulæ at sufficiently large values of x:
  [Chart: 100x+1000000 and 0.1x^2+20 plotted for x from 0 to 10000, costs up to 12,000,000.]
  No matter what the disparity in fixed costs (20 vs 1 million), no matter what the cost of the inner loop (0.1 vs 100), the quadratic program will cost more than the linear program on sufficiently large problems.
  And the same is true for all the other powers: O(N^k) is worse than O(N^j) on sufficiently large problems whenever k > j, no matter what the fixed costs or the constants of proportionality.
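  For this concrete pair we can even locate the crossover (my arithmetic, not the slides’): solving 0.1x^2 + 20 = 100x + 1000000 gives x^2 − 1000x − 9999800 = 0, so x = (1000 + √(1000^2 + 4 × 9999800))/2 ≈ 3702. Beyond a problem size of about 3702 the quadratic program is always the dearer one.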

  4. It doesn’t matter if a quadratic program has a large linear component. Eventually it will grow just like x^2. At small scales it might look linear:
  [Chart: 0.1x^2+100x+10000 plotted for x from 0 to 10 – an almost straight line from 10000 to about 11000.]

  5. Over a larger scale it looks simply quadratic:
  [Chart: 0.1x^2+100x+10000 plotted for x from 0 to 10000, costs up to 12,000,000 – an unmistakable parabola.]
  The higher power eventually dominates the lower, no matter what the constants of proportionality.
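  A quick numeric check (a Java sketch, my code) that the lower-order terms fade away: the ratio of the full formula to its dominant term tends to 1 as x grows.

    // Ratio of 0.1x^2 + 100x + 10000 to its dominant term 0.1x^2.
    // As x grows the ratio approaches 1: the quadratic term dominates.
    public class DominantTerm {
        public static void main(String[] args) {
            for (double x : new double[]{10, 100, 1000, 10000, 100000}) {
                double full = 0.1 * x * x + 100 * x + 10000;
                double quad = 0.1 * x * x;
                System.out.printf("x=%8.0f  full/quad = %.3f%n", x, full / quad);
            }
        }
    }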

  6. So if the cost of executing a program on a problem of size N is given by a polynomial formula a_0 + a_1 N + a_2 N^2 + ... + a_k N^k where a_k ≠ 0, we say it is O(N^k), neglecting smaller powers of N (because on large problems N^k will dominate).
  And then we say that O(N^j) is to be preferred to O(N^k) whenever k > j, neglecting the constants a_0, a_1, ..., a_k (because on large problems N^k will dominate N^j).
  This notation is a convenient approximation.
  • It shouldn’t tempt us to neglect the constants of proportionality when comparing two O(N^k) algorithms.
  • We should be aware that O(N^j) may be worse than O(N^k) on small problems, even though k > j.
  • Experiment rules.
  No interesting algorithm is O(N^k) where k < 0. I hope you can justify this assertion.

  7. One last wrinkle. We sometimes write algorithms which are mixed, because of different constants of proportionality.
  For example, an algorithm which is O(N^2) for small values of N, and O(N lg N) for larger values – because the N^2 algorithm is quick and easy to set up on small problems, perhaps. Such an algorithm, in the limit, is O(N lg N).
  Hence the definitions on p121 of Weiss. Big-O notation gives upper bounds on execution costs. He also gives definitions of Ω(...) (big-Omega, a notation for lower bounds), Θ(...) (big-Theta, upper and lower bounds) and o(...) (little-o, a strict upper bound).
  In this course we are mostly concerned with worst-case calculations, and with finding an upper bound on the worst case of a program’s execution.
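  A familiar instance of such a mixed algorithm – my illustration, not one from the slides – is a merge sort that hands small subproblems to insertion sort, because the O(N^2) method is quick to set up:

    import java.util.Arrays;

    // Hybrid sort: O(N^2) insertion sort on small subproblems (cheap set-up),
    // O(N lg N) merge sort above the cutoff. In the limit it is O(N lg N).
    public class HybridSort {
        static final int CUTOFF = 16; // tuning constant, chosen by experiment

        static void sort(int[] a, int lo, int hi) { // sorts a[lo..hi)
            if (hi - lo <= CUTOFF) { insertionSort(a, lo, hi); return; }
            int mid = (lo + hi) / 2;
            sort(a, lo, mid);
            sort(a, mid, hi);
            merge(a, lo, mid, hi);
        }

        static void insertionSort(int[] a, int lo, int hi) {
            for (int i = lo + 1; i < hi; i++)
                for (int j = i; j > lo && a[j - 1] > a[j]; j--) {
                    int t = a[j]; a[j] = a[j - 1]; a[j - 1] = t;
                }
        }

        static void merge(int[] a, int lo, int mid, int hi) {
            int[] tmp = Arrays.copyOfRange(a, lo, hi);
            int i = 0, j = mid - lo, k = lo;
            while (i < mid - lo && j < hi - lo)
                a[k++] = tmp[i] <= tmp[j] ? tmp[i++] : tmp[j++];
            while (i < mid - lo) a[k++] = tmp[i++];
            while (j < hi - lo) a[k++] = tmp[j++];
        }
    }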

  8. On logarithms: log, ln and lg. In various examples, as we shall see, we prefer O(lg N) to O(N), because lg N grows more slowly than N.
  When N > 0 and b^x = N, we say that log_b N = x. Here b is the base, and x is the logarithm. log_b N is the power to which you must raise b to get N.
  A logarithm is rarely a whole number ...

  9. Fact 0. b^(log_b N) = N. That’s the definition of a logarithm!
  Fact 1. If N = J × K, then log_b N = log_b J + log_b K.
  b^(log_b J + log_b K) = b^(log_b J) × b^(log_b K) = J × K = N.
  This is why logarithms were popular in my schooldays: they convert multiplication problems into addition problems.
  Fact 2. If log_b N = x, then log_b (N^2) = 2x.
  b^(2x) = b^x × b^x = N × N = N^2.
  Fact 2a. In general, log_b (N^y) = y log_b N.
  Another reason for the popularity of logarithms: they convert exponentiation problems into multiplication problems.
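  The facts are easy to check numerically – a Java sketch (my code; Math.log is the natural logarithm, but the facts hold in any base):

    public class LogFacts {
        public static void main(String[] args) {
            double J = 8, K = 32;
            // Fact 1: log(J*K) = log J + log K
            System.out.println(Math.log(J * K) + " = " + (Math.log(J) + Math.log(K)));
            // Fact 2a: log(N^y) = y * log N
            System.out.println(Math.log(Math.pow(2, 10)) + " = " + 10 * Math.log(2));
        }
    }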

  10. Fact 3. log_b N = k log_c N, where k is a constant.
  c^(log_c N) = N, by definition.
  log_b N = log_b (c^(log_c N)), taking log_b of both sides.
  log_b (c^(log_c N)) = log_c N × log_b c, by Fact 2a.
  log_b c is a constant, because c is a constant.
  So log_b N = k log_c N, where k is a constant.
  So the base doesn’t matter in big-O calculations. Therefore O(log_b N) programs run just like O(log_c N) programs, neglecting constants of proportionality.
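  Java’s library has no lg, but Fact 3 says any base will do, up to a constant factor – a sketch (my code):

    public class BaseChange {
        // lg N computed from natural logs: lg N = ln N / ln 2 (Fact 3 with k = 1/ln 2).
        static double lg(double n) { return Math.log(n) / Math.log(2); }

        public static void main(String[] args) {
            for (double n : new double[]{2, 1024, 1e6}) {
                // The ratio lg N / ln N is the same constant for every N, as Fact 3 promises.
                System.out.printf("N=%9.0f  lg N=%8.3f  lg N / ln N = %.6f%n",
                        n, lg(n), lg(n) / Math.log(n));
            }
        }
    }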

  11. Computer scientists are especially interested in base 2. For all sorts of reasons:
  • lg N is the number of bits in the binary numeral representation of N; therefore lg N is the number of bits needed to represent all the numbers 0..N in binary numeral notation;
  • lg N is the number of times you must double (starting from 1) before you reach or exceed N;
  • lg N is the number of times you must halve (starting from N) before you reach 0.
  The last point is the crucial one in this course: we shall consider algorithms which work by repeated halving, stopping when they reach a problem of size 0 (in lg N steps) or 1 (in lg N − 1 steps).
  For these reasons we use a special notation for base-2 logarithms.
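  The halving claim is easy to verify – a Java sketch (my code) counting integer halvings:

    public class Halving {
        public static void main(String[] args) {
            for (int n : new int[]{15, 16, 1000, 1 << 20}) {
                int steps = 0;
                for (int m = n; m > 0; m /= 2) steps++; // halve until we reach 0
                // steps is floor(lg n) + 1: roughly lg n halvings to reach 1, one more to reach 0
                System.out.println("n=" + n + "  halvings to 0: " + steps);
            }
        }
    }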

  12. Calculating execution costs. Mostly addition and multiplication. All costs assessed on the kind of machine we are using as a model: sequential, no significant parallel executions.
  0. The cost of arithmetic, comparison and storage operations is constant in time and zero in space. (Some arithmetic or comparison operations might take longer than others, because of the size of the data. This does not contradict point 0.)
  T1. The execution time of (time taken to evaluate) the formula f1 op f2 is T_f1 + T_f2 + T_op, where T_op is some small constant depending on the operator op and the types of the formulæ f1 and f2. What goes for binary operators goes similarly for all the other kinds of operators – but see below for choice instructions and choice formulæ.
  T2. If the execution time of I1 is T1, and the execution time of I2 is T2, then the execution time of I1; I2 is T1 + T2.
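  A worked instance of T1 and T2 (my example): by T1 the formula a*x + b costs T_a + T_x + T_* + T_b + T_+, a sum of small constants and therefore a constant; by T2 the sequence y = a*x + b; z = y*y; costs the sum of two such constants – still a constant, i.e. O(1).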

  13. T3. The execution time of the instruction for (INIT; COND; INC) BOD is
  T_INIT + T_COND + (T_BOD(v0) + T_INC + T_COND) + ... + (T_BOD(vN) + T_INC + T_COND),
  where v0, v1, ..., vN are the successive values set up by INIT and INC to control the execution of BOD.
  It follows that if T_BOD is independent of the values v_i, and if T_COND, T_INC and T_BOD are all O(f(N)) execution time, and T_INIT is O(f(N)) or better, then the for is O(N × f(N)) execution time.
  while instructions can be treated as a special kind of for, without INIT or INC. I think I can neglect the cost of jumps.
  T4. The execution time of the instruction if (COND) THEN else ELSE is either T_COND + T_THEN (if COND is non-zero) or T_COND + T_ELSE (otherwise). The same goes for choice formulæ COND ? THEN : ELSE. Single-armed choice instructions if (COND) THEN can be treated as if (COND) THEN else {}. I neglect the cost of jumps.
  T5. The execution time of the block { decls instrs } is T_decls + T_instrs.
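  Putting T1–T3 to work on a concrete loop – a Java sketch (my code), with the costing in the comments:

    // Summing an array, costed by the rules above.
    public class LoopCost {
        static long sum(int[] a) {
            long s = 0;                           // T1: constant
            for (int i = 0; i < a.length; i++)    // T3: INIT, COND, INC all constant
                s += a[i];                        // BOD: constant by T1, independent of i
            return s;                             // constant
        }
        // By T3 with f(N) = O(1) and N = a.length iterations,
        // the loop is O(N × 1) = O(N); by T2 the whole method is O(N).
    }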
