Runtime Complexity CS 331: Data Structures and Algorithms Michael Lee <lee@iit.edu>
So far, our runtime analysis has been based on empirical data — i.e., runtimes obtained by actually running our algorithms.
This data is very sensitive to:
- platform (OS/compiler/interpreter)
- concurrent tasks
- implementation details (vs. the high-level algorithm)
Empirical data also doesn't always help us see long-term, big-picture trends.
Reframing the problem: given an algorithm that takes input of size n, find a function T(n) that describes the runtime of the algorithm.
Input size might be:
- the magnitude of the input value (e.g., for numeric input)
- the number of items in the input (e.g., as in a list)

An algorithm may also depend on more than one input.

```python
def sort(vals):       # input size = len(vals)
def factorial(n):     # input size = n
def gcd(m, n):        # input size = (m, n)
```
Fundamentally, runtime is determined by the primitive operations carried out during execution of the algorithm (in compiled code, by the interpreter, etc.).
E.g., factorial, with per-line cost and execution count:

```python
def factorial(n):
    prod = 1                    # cost c1, executed 1 time
    for k in range(2, n + 1):   # cost c2, executed n - 1 times
        prod *= k               # cost c3, executed n - 1 times
    return prod                 # cost c4, executed 1 time
```

T(n) = c1 + (n − 1)(c2 + c3) + c4

Messy! Per-instruction costs are machine specific, and obscure big-picture runtime trends.
```python
def factorial(n):
    prod = 1                    # 1 time
    for k in range(2, n + 1):   # n - 1 times
        prod *= k               # n - 1 times
    return prod                 # 1 time
```

T(n) = 2(n − 1) + 2 = 2n

Simplification #1: ignore the actual cost of each line of code. Now it is easy to see that the runtime is linear w.r.t. the input size.
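To make Simplification #1 concrete, here is a small sketch (not from the slides — the name `factorial_counted` is hypothetical) that counts line executions rather than timing anything:

```python
def factorial_counted(n):
    """Return (n!, number of line executions), ignoring per-line costs."""
    ops = 1                      # prod = 1 executes once
    prod = 1
    for k in range(2, n + 1):
        ops += 2                 # loop step + prod *= k, each runs n - 1 times
        prod *= k
    ops += 1                     # return executes once
    return prod, ops
```

For example, `factorial_counted(5)` yields (120, 10), matching T(n) = 2n.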
E.g., insertion sort:

```python
def insertion_sort(lst):
    for i in range(1, len(lst)):
        for j in range(i, 0, -1):
            if lst[j] < lst[j-1]:
                lst[j], lst[j-1] = lst[j-1], lst[j]
            else:
                break
```

init (i): [5, 2, 3, 1, 4] → after one insertion (j): [2, 3, 5, 1, 4]
```python
def insertion_sort(lst):
    for i in range(1, len(lst)):                     # n - 1 times
        for j in range(i, 0, -1):                    # ? times
            if lst[j] < lst[j-1]:                    # ? times
                lst[j], lst[j-1] = lst[j-1], lst[j]  # ? times
            else:
                break                                # ? times
```

The ?'s will vary based on the initial "sortedness" of the list ... it is useful to contemplate the worst-case scenario.
The worst case arises when the list values start out in reverse order!
```python
def insertion_sort(lst):
    for i in range(1, len(lst)):                     # n - 1 times
        for j in range(i, 0, -1):                    # 1, 2, ..., (n - 1) times
            if lst[j] < lst[j-1]:                    # 1, 2, ..., (n - 1) times
                lst[j], lst[j-1] = lst[j-1], lst[j]  # 1, 2, ..., (n - 1) times
            else:
                break                                # 0 times
```

Worst-case analysis is our default mode of analysis hereafter, unless otherwise noted.
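One way to see the 1 + 2 + ... + (n − 1) pattern empirically is to count comparisons on a reverse-sorted list; `insertion_sort_comparisons` below is a hypothetical instrumented variant of the algorithm above:

```python
def insertion_sort_comparisons(lst):
    """Sort lst in place; return the number of lst[j] < lst[j-1] comparisons."""
    comparisons = 0
    for i in range(1, len(lst)):
        for j in range(i, 0, -1):
            comparisons += 1
            if lst[j] < lst[j-1]:
                lst[j], lst[j-1] = lst[j-1], lst[j]
            else:
                break
    return comparisons
```

On a reverse-sorted list of length n, this reports exactly 1 + 2 + ... + (n − 1) comparisons.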
Recall: the arithmetic series, e.g., 1+2+3+4+5 = 15. The sum can also be found by:
- adding the first and last terms (1+5 = 6)
- dividing by two to find the average (6/2 = 3)
- multiplying by the number of values (3 ⨉ 5 = 15)
i.e., $\sum_{t=1}^{n} t = 1 + 2 + \cdots + n = \frac{n(n+1)}{2}$, and $\sum_{t=1}^{n-1} t = 1 + 2 + \cdots + (n-1) = \frac{(n-1)n}{2}$
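Both closed forms are easy to spot-check in Python (a throwaway helper, not part of the slides):

```python
def series_formulas_hold(up_to):
    """Empirically verify the arithmetic-series closed forms for 1 <= n < up_to."""
    for n in range(1, up_to):
        if sum(range(1, n + 1)) != n * (n + 1) // 2:
            return False
        if sum(range(1, n)) != (n - 1) * n // 2:
            return False
    return True
```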
```python
def insertion_sort(lst):
    for i in range(1, len(lst)):                     # n - 1 times
        for j in range(i, 0, -1):                    # Σ_{t=1}^{n-1} t times
            if lst[j] < lst[j-1]:                    # Σ_{t=1}^{n-1} t times
                lst[j], lst[j-1] = lst[j-1], lst[j]  # Σ_{t=1}^{n-1} t times
            else:
                break                                # 0 times
```
```python
def insertion_sort(lst):
    for i in range(1, len(lst)):                     # n - 1 times
        for j in range(i, 0, -1):                    # (n - 1)n/2 times
            if lst[j] < lst[j-1]:                    # (n - 1)n/2 times
                lst[j], lst[j-1] = lst[j-1], lst[j]  # (n - 1)n/2 times
            else:
                break                                # 0 times
```

$T(n) = (n-1) + 3 \cdot \frac{(n-1)n}{2} = \frac{3}{2}n^2 - \frac{1}{2}n - 1$
$T(n) = \frac{3}{2}n^2 - \frac{1}{2}n - 1$; i.e., the runtime of insertion sort is a quadratic function of its input size.
$T(n) = \frac{3}{2}n^2 - \frac{1}{2}n - 1$

Simplification #2: only consider the leading term, i.e., the term with the highest order of growth: $\frac{3}{2}n^2$.
$T(n) = \frac{3}{2}n^2 - \frac{1}{2}n - 1$

Simplification #3: ignore constant coefficients: $n^2$.
$T(n) = \frac{3}{2}n^2 - \frac{1}{2}n - 1$

We use the notation $T(n) = O(n^2)$ [read: "T(n) is big-O of n²"] to indicate that $n^2$ describes the asymptotic worst-case runtime behavior of the insertion sort algorithm, when run on input of size n.
Formally, $f(n) = O(g(n))$ means that there exist constants $c, n_0$ such that $0 \le f(n) \le c \cdot g(n)$ for all $n \ge n_0$.
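The definition can be explored numerically. Below, `bounded_above` is a hypothetical checker that gathers evidence (not a proof — it only checks a finite range) for candidate witnesses c and n₀, using the insertion sort T(n) derived above:

```python
def bounded_above(f, g, c, n0, n_max=10_000):
    """Check 0 <= f(n) <= c * g(n) for all n0 <= n <= n_max (evidence, not proof)."""
    return all(0 <= f(n) <= c * g(n) for n in range(n0, n_max + 1))

# f(n) = (3/2)n^2 - (1/2)n - 1 from insertion sort; candidate bound g(n) = n^2
f = lambda n: 1.5 * n**2 - 0.5 * n - 1
g = lambda n: n**2
```

With these definitions, c = 2 and n₀ = 1 succeed, whereas c = 1 fails (e.g., at n = 3, f(3) = 11 > 9 = g(3)).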
i.e., f ( n ) = O ( g ( n )) intuitively means that g (multiplied by a constant factor) sets an upper bound on f as n gets large — i.e., an asymptotic bound
[Figure: $c \cdot g(n)$ lies above $f(n)$ for all $n \ge n_0$, illustrating $f(n) = O(g(n))$ (from Cormen, Leiserson, Rivest, and Stein, Introduction to Algorithms)]
[Figure: $f(n) = \frac{3}{2}n^2 - \frac{1}{2}n - 1$ is bounded above by $g(n) = \frac{3}{2}n^2$ for all $n \ge x_0$]
Technically, $f = O(g)$ does not imply a tight bound. E.g., $n = O(n^2)$ is true, but there is no constant c such that $c \cdot n^2$ approximates the growth of n as n gets large. We will, however, generally try to find the tightest bounding function g.
E.g., binary search (length ⇒ N):

```python
def contains(lst, x):
    lo = 0
    hi = len(lst) - 1
    while lo <= hi:             # iterations = O(?)
        mid = (lo + hi) // 2    # constant-time work per iteration
        if x < lst[mid]:
            hi = mid - 1
        elif x > lst[mid]:
            lo = mid + 1
        else:
            return True
    else:
        return False
```
Each iteration reduces the search space by ½; the worst case arises when x < min(lst), so the loop runs until the search space is empty.
The number of iterations ≈ the number of times we can divide the length by 2 until it reaches 1.
For length = 1024:

Iteration:           0     1    2    3    4   5   6   7  8  9  10
Elements remaining:  1024  512  256  128  64  32  16  8  4  2  1
With length = N, the number of iterations x satisfies:

$N / 2^x = 1$
$2^x = N$
$\log_2 2^x = \log_2 N$
$x = \log_2 N$

so # iterations $= O(\log_2 N) = O(\log N)$  [recall: $\log_a x = \log_b x / \log_b a$, so the base only changes a constant factor]
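The halving argument can be checked by instrumenting binary search; `contains_counted` is a hypothetical variant that also returns the iteration count:

```python
import math

def contains_counted(lst, x):
    """Binary search over sorted lst, returning (found, loop iterations)."""
    lo, hi, iters = 0, len(lst) - 1, 0
    while lo <= hi:
        iters += 1
        mid = (lo + hi) // 2
        if x < lst[mid]:
            hi = mid - 1
        elif x > lst[mid]:
            lo = mid + 1
        else:
            return True, iters
    return False, iters
```

On a sorted list of 1024 elements, a worst-case miss (x < min(lst)) takes 10 = log₂ 1024 iterations.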
Each of the O(log N) iterations does constant-time work, so binary-search(N) = O(log N).
So far:
- linear search = O(n)
- insertion sort = O(n²)
- binary search = O(log n)
```python
import math

def quadratic_roots(a, b, c):
    discr = b**2 - 4*a*c
    if discr < 0:
        return None
    discr = math.sqrt(discr)
    return (-b + discr)/(2*a), (-b - discr)/(2*a)
```

= O(?)
A fixed (constant) number of lines executes, regardless of the input values, so quadratic_roots = O(1).
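A quick usage check (the function is repeated here so the snippet runs standalone; the example equation is our own):

```python
import math

def quadratic_roots(a, b, c):
    discr = b**2 - 4*a*c
    if discr < 0:
        return None              # no real roots
    discr = math.sqrt(discr)
    return (-b + discr)/(2*a), (-b - discr)/(2*a)

# x^2 - 3x + 2 = 0 factors as (x - 1)(x - 2)
print(quadratic_roots(1, -3, 2))   # → (2.0, 1.0)
```

The same few operations run whether the coefficients are tiny or astronomically large — constant time.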