61A Lecture 35



  1. 61A Lecture 35
     Monday, 28th November, 2011

     Last time: sequential data and iterators

     Sequences
     • The sequence abstraction so far: length and element selection
     • Lists and tuples store all elements up-front
       • can't deal with huge data
       • can't deal with infinite sequences

     Iterators
     • Store how to compute elements
     • Compute one element at a time
     • Delay evaluation

     Today: modularity, processing pipelines, and coroutines
     • Modularity in programs so far: helper functions, a.k.a. "subroutines"
     • Coroutines: what are they?
     • Coroutines in Python
     • Types of coroutines
     • Multitasking

     Last time: sequential data and iterators (continued)

     Streams: a unit of delayed evaluation
     • 2 elements, first and rest
     • "first" is stored
     • "compute_rest" is stored
     • "rest" is calculated on demand

     Native Python iterator interface
     • __iter__()
     • __next__()
     • for-loops rely on these methods

     Generator functions
     • Functions that use yield to output values
     • Calling one creates a generator object
     • __iter__() and __next__() are defined automatically

     Modularity so far: helper functions
     • Modularity in programming? Helper functions, a.k.a. "subroutines"
     • A subroutine is a sub-program responsible for a small piece of computation
     • A main function is responsible for calling all the subroutines
     [Diagram: a main function calling several subroutines]

     Modularity with coroutines
     • Coroutines are also sub-computations
     • The difference: no main function
     • Separate coroutines link together to form a complete pipeline
     [Diagram: coroutines linked to one another]
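As a recap of the iterator interface and generator functions listed above, here is a minimal sketch. The names LetterIterator and naturals are illustrative and do not come from the lecture; the class spells out __iter__() and __next__() by hand, while the generator function gets both automatically and can describe an infinite sequence because elements are computed one at a time.

```python
from itertools import islice


class LetterIterator:
    """Hand-written iterator over the letters from start through end."""

    def __init__(self, start='a', end='d'):
        self.current = start
        self.end = end

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        # Signal exhaustion the way for-loops expect.
        if self.current > self.end:
            raise StopIteration
        result = self.current
        self.current = chr(ord(self.current) + 1)
        return result


def naturals():
    """Generator function for the infinite sequence 0, 1, 2, ..."""
    n = 0
    while True:
        yield n      # pause here; n is preserved until the next request
        n += 1


letters = list(LetterIterator())            # list() relies on __iter__ and __next__
first_five = list(islice(naturals(), 5))    # only five elements are ever computed
```

A list could not hold all of naturals(), but taking a finite slice of the generator works because nothing past the fifth element is evaluated.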

  2. Coroutines vs. subroutines: a conceptual difference
     • Subroutines are subordinate to a main function
     • Coroutines are colleagues that cooperate
     [Diagram: a main function with its subroutines, next to a chain of coroutines]

     Coroutines in Python, or, the many faces of "yield"

     Previously: generator functions
     • Produce data with yield

         def letters_generator():
             current = 'a'
             while current <= 'd':
                 yield current
                 current = chr(ord(current)+1)

     • yield pauses execution; local variables are preserved
     • Execution resumes when .__next__ is called; the call returns the yielded value

     Now: coroutines
     • Consume data with yield

         value = (yield)

     • (yield) pauses execution; local variables are preserved
     • Execution resumes when .send(data) is called
     • (yield) returns the sent data, which is assigned to value

     Coroutines in Python

     Consuming data with yield:
     • value = (yield)
     • Execution pauses, waiting for data to be sent

     Send a coroutine data using send(...)
     Start a coroutine using __next__()
     Signal the end of a computation using close()
     • Raises a GeneratorExit exception inside the coroutine

     Example: print out strings that match a pattern

         def match(pattern):
             print('Looking for ' + pattern)    # execution starts here
             try:
                 while True:
                     s = (yield)                # stops here, waiting for data; resumes here on send
                     if pattern in s:
                         print(s)               # match found
             except GeneratorExit:              # catch the exception raised by close()
                 print("=== Done ===")

     Step 1: Initialize (creates a new object, runs nothing yet)

         >>> m = match("Jabberwock")

     Step 2: Start with __next__()

         >>> m.__next__()
         Looking for Jabberwock

     Step 3: Send data

         >>> m.send("the Jabberwock with eyes of flame")
         the Jabberwock with eyes of flame

     Step 4: Close the coroutine

         >>> m.close()
         === Done ===

     Pipelines: the power of coroutines
     • We can chain coroutines together to achieve complex behaviors
     • Create a pipeline
     • Coroutines send data to others downstream
     [Diagram: a chain of coroutines]

     A simple pipeline
     [Diagram: read words -> match words]
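The four steps above (initialize, start, send, close) can be traced in a small sketch. It assumes a variant of match, here called match_collector (a name not used in the lecture), that appends to a list instead of printing so the result can be inspected. It also shows why step 2 matters: sending a value before the coroutine has reached its first (yield) raises a TypeError.

```python
def match_collector(pattern, found):
    """Coroutine: append every sent string that contains pattern to found."""
    try:
        while True:
            s = (yield)            # pause until send(...) delivers a value
            if pattern in s:
                found.append(s)
    except GeneratorExit:          # raised at the paused (yield) by close()
        found.append('=== Done ===')


found = []
m = match_collector('Jabberwock', found)    # step 1: creates the object, runs no code

started_too_early = False
try:
    m.send('too early')                     # nothing is waiting at a (yield) yet
except TypeError:
    started_too_early = True

next(m)                                     # step 2: same effect as m.__next__()
m.send('the Jabberwock with eyes of flame')         # step 3: contains the pattern
m.send('came whiffling through the tulgey wood')    # no match, so it is ignored
m.close()                                   # step 4: triggers the except branch
```

After close(), found holds the one matching line followed by the done marker.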

  3. A simple pipeline: reading words
     [Diagram: read words -> match words]

         def read(text, next_coroutine):    # needs to know where to send()
             for word in text.split():
                 next_coroutine.send(word)
             next_coroutine.close()

     A simple pipeline: how read and match take turns
     • read's for loop: each next_coroutine.send(word) activates match at its (yield)
     • match's while loop: line = (yield) waits for the next send, then checks
       "if pattern in line" and prints the line
     • After handling a word, match returns to (yield) and read continues its loop

     A simple pipeline: trace
     • read sends the words one by one (Commending, spending, is, offending, ...)
     • Between sends, matcher is paused at line = (yield)
     • After the last word, read calls close(), which raises GeneratorExit in matcher;
       matcher is then closed

         >>> matcher = match('ending')
         >>> matcher.__next__()
         Looking for ending
         >>> text = 'Commending spending is offending to people pending lending!'
         >>> read(text, matcher)
         Commending
         spending
         offending
         pending
         lending!
         === Done ===

     Produce, Filter, Consume
     Coroutines can have different roles in a pipeline, based on how they use send() and yield:
     • The producer only sends data
     • The filter consumes with (yield) and sends results downstream
     • The consumer only consumes data
     There can be many layers of filters.
     [Diagram: producer -> filter -> ... -> filter -> consumer]

     Example: simple pipeline
     • Producer: read (read words), as defined above
     • Consumer: match (match words)

         def match(pattern):
             print('Looking for ' + pattern)
             try:
                 while True:
                     s = (yield)
                     if pattern in s:
                         print(s)
             except GeneratorExit:
                 print("=== Done ===")

     Breaking down match
     • match does two jobs: find matches (a filter) and print (a consumer)
     [Diagram: read (producer) -> find matches (filter) -> print (consumer)]
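Because every coroutine must be started with __next__() before it can receive data, a common convenience is a decorator that does the starting. The sketch below assumes a hypothetical helper named primed, which is not part of the lecture, and a match variant that stores lines in a list rather than printing them; read is the producer from the slides.

```python
import functools


def primed(func):
    """Hypothetical helper: create the coroutine and advance it to its first (yield)."""
    @functools.wraps(func)
    def start(*args, **kwargs):
        co = func(*args, **kwargs)
        next(co)                  # same as co.__next__()
        return co
    return start


@primed
def match(pattern, results):
    """Consumer: keep each received word that contains pattern."""
    try:
        while True:
            s = (yield)
            if pattern in s:
                results.append(s)
    except GeneratorExit:
        results.append('=== Done ===')


def read(text, next_coroutine):
    """Producer: send each word downstream, then signal the end."""
    for word in text.split():
        next_coroutine.send(word)
    next_coroutine.close()


results = []
text = 'Commending spending is offending to people pending lending!'
read(text, match('ending', results))    # match(...) is already waiting at (yield)
```

The trade-off is that the explicit start step disappears from the call site, which is convenient but hides one of the lifecycle steps the slides spell out.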

  4. Breaking down match
     [Diagram: find matches (filter) -> print (consumer)]

         def match_filter(pattern, next_coroutine):
             print('Looking for ' + pattern)
             try:
                 while True:
                     s = (yield)
                     if pattern in s:
                         next_coroutine.send(s)
             except GeneratorExit:
                 next_coroutine.close()

         def print_consumer():
             print('Preparing to print')
             try:
                 while True:
                     line = (yield)
                     print(line)
             except GeneratorExit:
                 print("=== Done ===")

         >>> printer = print_consumer()
         >>> printer.__next__()
         Preparing to print
         >>> matcher = match_filter('pend', printer)
         >>> matcher.__next__()
         Looking for pend
         >>> text = 'Commending spending is offending'
         >>> read(text, matcher)
         spending
         === Done ===

     Multitasking
     • We do not need to be restricted to just one next step
     [Diagram: one coroutine sending to several downstream coroutines]

     Read-to-many

         def read(text, next_coroutine):
             for word in text.split():
                 next_coroutine.send(word)
             next_coroutine.close()

         def read_to_many(text, coroutines):
             for word in text.split():
                 for coroutine in coroutines:
                     coroutine.send(word)
             for coroutine in coroutines:
                 coroutine.close()

     Matching multiple patterns
     [Diagram: read_to_many -> match 'mend' and match 'pe' -> print]

         >>> printer = print_consumer()
         >>> printer.__next__()
         Preparing to print
         >>> m = match_filter('mend', printer)
         >>> m.__next__()
         Looking for mend
         >>> p = match_filter("pe", printer)
         >>> p.__next__()
         Looking for pe
         >>> read_to_many(text, [m, p])
         Commending
         spending
         people
         pending
         === Done ===

     Any questions?

     Next time: MapReduce

     http://www.infobarrel.com/Top_10_Tips_For_Snowboard_Beginners
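The read-to-many arrangement can also be run with a consumer that records what it receives, which makes the order of events easy to check. This is a sketch under stated assumptions: collector and start are illustrative names not found in the lecture, the print calls are left out of match_filter, and text is assumed to be the longer sentence from the earlier pipeline example.

```python
def collector(results):
    """Consumer: store every received line instead of printing it."""
    try:
        while True:
            results.append((yield))
    except GeneratorExit:
        results.append('=== Done ===')


def match_filter(pattern, next_coroutine):
    """Filter: pass along only the strings that contain pattern."""
    try:
        while True:
            s = (yield)
            if pattern in s:
                next_coroutine.send(s)
    except GeneratorExit:
        next_coroutine.close()    # propagate the end of input downstream


def read_to_many(text, coroutines):
    """Producer: send every word to every coroutine, then close them all."""
    for word in text.split():
        for coroutine in coroutines:
            coroutine.send(word)
    for coroutine in coroutines:
        coroutine.close()


def start(coroutine):
    """Advance a new coroutine to its first (yield) and return it."""
    next(coroutine)
    return coroutine


results = []
sink = start(collector(results))
m = start(match_filter('mend', sink))
p = start(match_filter('pe', sink))
text = 'Commending spending is offending to people pending lending!'
read_to_many(text, [m, p])
```

Both filters close the shared consumer, but the done marker appears only once because calling close() on a coroutine that has already finished does nothing. A word matching both patterns would reach the consumer twice, once from each filter.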
