Categories
• Parallel Batch Processing
• Stream Databases / Processing Engines
• Stream Languages
• Distributed Programming
• Processing Streams/FIFOs
Relationship to Other Work
• A middle ground
  • Below full SDMS and stream languages
  • Above non-stream-oriented distributed communications
• Roughly analogous to non-streaming:
  • Distributed Data Structures (Gribble et al.)
  • BerkeleyDB (Seltzer et al.)
  • Boxwood (MacCormick et al.)
Key Differentiation
• No computational model imposed
• Lightweight; loose coupling
• Stream data substrate
One Example
Unblinking Eyes, for $20 Million, at Freedom Tower
September 24, 2008
“If you have ever wondered how security guards can possibly keep an unfailingly vigilant watch on every single one of dozens of television monitors, each depicting a different scene, the answer seems to be (as you suspected): they can’t. Instead, they can now rely on computers to constantly analyze the patterns, sizes, speeds, angles and motion picked up by the camera and determine - based on how they have been programmed - whether this constitutes a possible threat. In which case, the computer alerts the security guard whose own eyes may have been momentarily diverted. Or shut.”
The New York Times, City Room blog
Common Traits
• Continuous streams
• Heavy-weight
• Unstructured
• Feature detection/extraction
• Computationally intensive
• Distributed analysis
• Historical data
Current Approaches
• Centrally controlled & managed execution
  • Stream Data Management System (SDMS)
  • Stream Processing Engines
  • Declarative query languages
  • Continuous queries (CQ)
• Independent communicating components
  • General distributed programming
  • Loose coupling
  • Lower level / more procedural
Gap
• Integrate:
  • Higher-level time-based model
  • General distributed application approach
• Keep looser coupling
  • Independent communicating components
  • Distributed data structures
Relationship to Other Work
• A middle ground
  • Below full SDMS
  • Above non-stream-oriented distributed communications
• Roughly analogous to non-streaming:
  • Distributed Data Structures (Gribble et al.)
  • BerkeleyDB (Seltzer et al.)
  • Boxwood (MacCormick et al.)
Programming Model
• Distributed data structures – channel
• Threads communicate via channels
[Figure: Producer, Video, Feature Detector, Features, and Analysis components connected by channels]
Time Variables
Variable       Param(s)   Definition
now            -          current time
newest         ch         timestamp of newest visible item in channel ch
oldest         ch         timestamp of oldest visible item in channel ch
newest-after   ch, ts     timestamp of newest visible item in channel ch if the item’s timestamp is after ts, ∞ otherwise
next           ts         ts + ε (the smallest timestamp > ts)
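The definitions in the Time Variables table can be sketched in Java over a sorted map from timestamp to item. This is only an illustration under stated assumptions, not the system's API: the class name TimeVars, integer tick timestamps, an ε of one tick, and Long.MAX_VALUE standing in for ∞ are all hypothetical choices.

```java
import java.util.TreeMap;

public class TimeVars {
    // Assumption: timestamps are integer ticks, so epsilon is one tick.
    public static final long EPSILON = 1L;
    // Assumption: Long.MAX_VALUE stands in for the table's infinity.
    public static final long INFINITY = Long.MAX_VALUE;

    // newest(ch): timestamp of the newest visible item.
    // Throws NoSuchElementException if the channel is empty.
    public static long newest(TreeMap<Long, String> ch) {
        return ch.lastKey();
    }

    // oldest(ch): timestamp of the oldest visible item.
    public static long oldest(TreeMap<Long, String> ch) {
        return ch.firstKey();
    }

    // newest-after(ch, ts): newest timestamp if it is after ts, else infinity.
    public static long newestAfter(TreeMap<Long, String> ch, long ts) {
        if (ch.isEmpty()) {
            return INFINITY;
        }
        long n = ch.lastKey();
        return n > ts ? n : INFINITY;
    }

    // next(ts): the smallest timestamp strictly greater than ts.
    public static long next(long ts) {
        return ts + EPSILON;
    }

    // Small helper that builds a three-item channel for experiments.
    public static TreeMap<Long, String> sample() {
        TreeMap<Long, String> ch = new TreeMap<>();
        ch.put(10L, "a");
        ch.put(20L, "b");
        ch.put(30L, "c");
        return ch;
    }

    public static void main(String[] args) {
        TreeMap<Long, String> ch = sample();
        System.out.println(newest(ch) + " " + oldest(ch) + " " + next(newest(ch)));
    }
}
```

Returning ∞ from newest-after when nothing newer exists gives a caller a lower bound that no current item can satisfy, which is one plausible reason for defining it that way.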
Producer Code Example
    while (true) {
        Item item = new Item(dataBuffer);
        // Put the item into the channel with
        // the default timestamp of 'now'
        chanConn.putItem(item);
        ...
    }
Consumer Code Example
    Time32 lower = Time32.newest;
    Time32 upper = Time32.v_now;
    while (true) {
        Item item = chanConn.getItem(lower, upper);
        // Do something with the item
        ...
        lower = Time32.next(item.getTimestamp());
    }
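The consumer loop advances its lower bound with next() so that the same item is never fetched twice. A self-contained toy version is sketched below. ToyChannel is a hypothetical in-memory stand-in, and it assumes that getItem returns the oldest item whose timestamp falls in the closed interval [lower, upper]; the slides do not state which item in the interval is returned.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ToyChannel {
    // Items ordered by timestamp (integer ticks, an assumption).
    private final TreeMap<Long, String> items = new TreeMap<>();

    public void putItem(long ts, String payload) {
        items.put(ts, payload);
    }

    // Assumption: return the oldest item in [lower, upper], or null if none.
    public Map.Entry<Long, String> getItem(long lower, long upper) {
        if (lower > upper) {
            return null;
        }
        Map.Entry<Long, String> e = items.ceilingEntry(lower);
        return (e != null && e.getKey() <= upper) ? e : null;
    }

    // Consumer loop: read everything in [lower, upper] exactly once.
    public static List<String> drain(ToyChannel ch, long lower, long upper) {
        List<String> seen = new ArrayList<>();
        Map.Entry<Long, String> e;
        while ((e = ch.getItem(lower, upper)) != null) {
            seen.add(e.getValue());
            if (e.getKey() == Long.MAX_VALUE) {
                break; // no later timestamp exists; avoid overflow
            }
            // lower = next(ts): step just past the item we consumed.
            lower = e.getKey() + 1;
        }
        return seen;
    }

    public static void main(String[] args) {
        ToyChannel ch = new ToyChannel();
        ch.putItem(10, "a");
        ch.putItem(20, "b");
        ch.putItem(30, "c");
        System.out.println(drain(ch, 20, 100));
    }
}
```

Unlike the slide's loop, this toy version stops when the interval is empty instead of waiting for new items, which keeps it runnable without threads.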
Channel Groups
• Only modify the visibility of items
• One channel may belong to many groups
[Figure: Group A and Group B over a reference stream, text streams, and an audio stream]
Persistent Representation
• Application-provided pickling handler
• Arbitrary N-to-1 mapping function
• Transforms item for storage
• Vary pickling function by load
• Preserves time interval correspondence
[Figure: items before pickling and after pickling]
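An N-to-1 pickling function that preserves time interval correspondence could look roughly like the Java sketch below. Everything here is hypothetical: the class PickleSketch, the Item and Stored types, and the policy of keeping only the first payload of a batch are illustrative assumptions, not the handler interface of the actual system.

```java
import java.util.List;

public class PickleSketch {
    // A live item: one timestamp, one payload.
    public static final class Item {
        public final long ts;
        public final String payload;

        public Item(long ts, String payload) {
            this.ts = ts;
            this.payload = payload;
        }
    }

    // A stored record that covers the interval [start, end].
    public static final class Stored {
        public final long start;
        public final long end;
        public final String payload;

        public Stored(long start, long end, String payload) {
            this.start = start;
            this.end = end;
            this.payload = payload;
        }
    }

    // N-to-1 mapping: collapse a batch of N items into one stored record
    // whose interval spans the timestamps of the whole batch, so a later
    // query for any time inside the batch still maps to this record.
    public static Stored pickle(List<Item> batch) {
        if (batch.isEmpty()) {
            throw new IllegalArgumentException("empty batch");
        }
        long start = Long.MAX_VALUE;
        long end = Long.MIN_VALUE;
        for (Item it : batch) {
            start = Math.min(start, it.ts);
            end = Math.max(end, it.ts);
        }
        // Assumed policy: keep the first payload, drop the rest.
        return new Stored(start, end, batch.get(0).payload);
    }
}
```

With N = 1 this degenerates to storing every item unchanged; larger N trades stored fidelity for lower storage load.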
Local Channel Gets
[Figure: microseconds per channel get (max and average) as the number of items in the channel increases from 11 to 1,000,001]
[Figure: microseconds per channel get (max and average) as the interval size increases from 1 to 10 items]
Remote Channel Gets
[Figure: Increasing consumers – MJPEG (sum and max). Frame drops (motion JPEG, ~30 fps), sum over all clients and max of any client, for 1 to 32 simultaneous consumers]
[Figure: Increasing consumers – MJPEG (average). Average frame drops per client for 1 to 32 simultaneous consumers]
[Figure: Increasing consumers – RGB (sum and max). Frame drops (uncompressed, ~30 fps), sum over all clients and max of any client, for 1 to 12 simultaneous consumers]
[Figure: Increasing consumers – RGB (average). Average frame drops per client for 1 to 12 simultaneous consumers]
[Figure: Increasing consumers – Replicated (sum and max). Frame drops (RGB, ~30 fps, replication), sum over all clients and max of any client, for 16 to 22 simultaneous consumers]
[Figure: Increasing consumers – Replicated (average). Average frame drops per client for 16 to 22 simultaneous consumers]
Channel Group
[Figure: Experimental setup. Video and audio producer threads, video and audio channels, and video and audio consumer threads across Node 1 (producer), Node 2, and Node 3]
[Figure: Average ms of skew per frame (motion JPEG, ~30 fps) over five test runs: roughly 32.4 to 33.6 ms without groups and 8.4 to 8.7 ms with groups]
Backend Comparison
[Figure: Time spent in backend (%) for the fs1, gpfs1, mysql, and null backends]
[Figure: System-wide time in OProfile samples (thousands) for each backend, broken down into daemon, daemon fs+disk, daemon kernel, app be_lib, app, app fs+disk, and app kernel]
Callgrind (single producer)
Backend                          fs1      gpfs1    mysql    null
Total Instructions (Millions)    128      143      5,477    126
Instructions in Backend          1.99%    1.83%    97.50%   0.03%
NOTE: Instructions in Backend percentages do not include initialization instructions.
Overhead of Storage Subsystem
[Figure: microseconds versus number of items (0 to 100); overhead of about 118 µs]
Persistence Automatic Load Adjustment
[Figure: two plots of seconds versus item number]
• Per-item storage latency
• Automatically adjusting to overload
• Dynamically vary pickling handler
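Varying the pickling handler in response to overload can be pictured as a small feedback loop on per-item storage latency. The Java sketch below shows one possible policy, not the one used in the measured system: the class LoadAdjuster, the latency budget, and the doubling and halving rule are all assumptions chosen for illustration. The returned batch size plays the role of N in an N-to-1 pickling function.

```java
public class LoadAdjuster {
    private final double budgetSeconds; // target per-item storage latency
    private final int maxBatch;         // upper limit on N
    private int batch = 1;              // current N (1 = store every item)

    public LoadAdjuster(double budgetSeconds, int maxBatch) {
        this.budgetSeconds = budgetSeconds;
        this.maxBatch = maxBatch;
    }

    // Feed in the latest observed storage latency and get back the N
    // that the next pickling call should use.
    public int observe(double latencySeconds) {
        if (latencySeconds > budgetSeconds) {
            // Overloaded: store fewer records by merging more items.
            batch = Math.min(maxBatch, batch * 2);
        } else if (latencySeconds < budgetSeconds / 2 && batch > 1) {
            // Comfortable headroom: restore fidelity gradually.
            batch = batch / 2;
        }
        // Between half the budget and the budget: leave N unchanged.
        return batch;
    }
}
```

The dead band between half the budget and the full budget is there to keep N from oscillating when latency hovers near the target.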
Mixed Workload by Distribution
[Figure: Per-get time in milliseconds versus percentage of live gets (0 to 100) for uniform, zipf, and binomial historical query distributions]
Unary Feature Detector – main
    int main(int argc, char **argv) {
        rt_sys_handle_t *rtsh;
        conn_endpt_t inchan, outchan;
        cmdargs_t args;
        // Parse command-line args
        parse_args(&args);
        // Connect to front end and contact super-nodes.
        rtsh = init_rt(argv[1]);
        // Get channel descriptor.
        inchan = add_chan(rtsh, args.input_name);
        outchan = add_chan(rtsh, args.output_name);
        // Run the transform function
        transform(rtsh, &args, &inchan, &outchan);
        // ...
Unary Feature Detector – Transformation
    void transform(rt_sys_handle_t *rtsh, args_t *args,
                   cendpt_t in_ch, cendpt_t out_ch) {
        // Create local copy of in_ch to in
        // ...
        while (!done) {
            // Get one item
            item->status = conn_get_1_i(in, &low, &up, item);
            // Successful get
            if (item->status == 0 && item->buf_len >= 1) {
                // Process item, result in buf2
                rval = conn_put(out_ch, buf2, buf2_sz, &item->ts);
                // ...
Unary Feature Detector – Optical Flow Example
    decode_vid_hdr(item->buf, &height, &width, &bpp, &c);
    // make grayscale
    rgb_to_grayscale(buf3, DATA_OFFSET(item->buf), width, height, bpp);
    // halve framerate: skip odd-numbered frames
    // (parentheses make the intended precedence explicit)
    if ((i & 1) == 1) {
        goto skip;
    }
    // do OpenCV optical flow
    n = detect_optflow1(buf3, width, height, buf2);
    // ... send results in buf2 ...
Historical Query Generator
    while (true) {
        // Generate a random time interval based
        // on rp's params
        random_gen(&lower, &upper, percentage, &rp);
        // Get rp.n items
        rval = conn_get_n(chan, &lower, &upper, bufs,
                          bufsizes, rp.n, metadata);
        ...
    }
Binary Feature Detector – main
    // ...
    // Parse command-line args
    // Connect to front end and contact super-nodes.
    // Get channel descriptor.
    inchan = add_chan(rtsh, args.input_name);
    inchan2 = add_chan(rtsh, args.input_name2);
    outchan = add_chan(rtsh, args.output_name);
    // Create channel group
    int ch_nums[2] = { inchan.chan_num, inchan2.chan_num };
    make_group(rtsh, ch_nums, gr_nums, 2, 1);
    // ... Create background get thread ...
    // Run the transform function
Binary Feature Detector – Background Fetching Thread
    // ...
    while (!done) {
        // Get one item
        item->status = conn_get_1_i(in, &low, &up, item, gr_nums[1]);
        // Successful get
        if (item->status == 0 && item->buf_len >= 1) {
            low = time_add(&item->ts, &one_micro);
            ATOMIC_PUT(&globbg, item->buf);
            // ...
        }
    }
Binary Feature Detector – Foreground Separation
    decode_vid_hdr(item->buf, &height, &width, &bpp, &c);
    // grab current background
    ATOMIC_GRAB(&buf, &globbg);
    cvSetImageData(&bgImg, DATA_OFFSET(buf), width * 3);
    // do foreground separation
    res = detect_fgdata1(DATA_OFFSET(item->buf), &bgImg, &tmpIn,
                         width, height, buf2);
    // ... send results in buf2 ...
Binary Feature Detector – Camera Change Example
    decode_vid_hdr(item->buf, &height, &width, &bpp, &c);
    // histogram scene change
    res = detect_schist1(DATA_OFFSET(item->buf), &lastHist, width, height,
                         corr_hist, ncorr_hist, &curr_start, 0.5);
    ...
    if (res > threshold) {
        upper = now();
        // fetch BG images from bg chan to confirm
        lower = TIME_SUB(item->ts, TIME10MS);
        conn_get_n(bg_chan, &lower, &upper, bufs, bufsizes,
                   10, NULL, gr_nums[1]);
        // ...
Airport Surveillance
[Figure: Topology. Agent with 4 RGB video streams and 2 sensors plus producers on Node A; optical flow on Nodes B and C; face detectors on Nodes D and E; feature aggregators, historical queries, and video and sensor streams on the remaining nodes (G, H)]

Component                  Configuration                          Total   +Nodes
Agent                      1 per node, hosts 4 vid. / 2 sensor    2       2
Producers                  6 per ASAP node, one per stream        12      -
Historical Query           6 per dedicated node                   12      2
Face Detection             2 per dedicated node                   8       4
Optical Flow               2 per dedicated node                   8       4
Face Aggregator            1 per dedicated node                   1       1
Optical Flow Aggregator    1 per dedicated node                   1       1
The “+Nodes” column specifies the additional nodes required by each component.

[Figure: Component latency in ms for Face, FaceAgg, Opt, and OptAgg under the Standalone, 0RGB, 1RGB, 2RGB, 1RGB-30mins, and 1RGB-45mins configurations]
[Figure: Historical query time in ms for RGB, Compressed, and All under the 0RGB, 1RGB, 2RGB, 1RGB-30mins, and 1RGB-45mins configurations]

System Data Overheads in bytes
Data                 Video      Face      Optical Flow   Sensor
Payload              230,408    128       128            1024
Response Header      64         64        64             64
Item Metadata        20         20        20             20
Total System Data    84         84        84             84
Total                230,492    212       212            1108
Overhead %           0.036%     39.62%    39.62%         7.58%

System Data Overheads (Variable Length)
Data                   Video      Face      Optical Flow   Sensor
Payload                230,408    128       128            1024
Response Header Len    2          2         2              2
Response Header        13         13        13             13
Data Header Len        2          2         2              2
Data Header            4          3         3              3
Item Metadata Len      2          2         2              2
Item Metadata¹         15-21      14-20     14-20          14-20
Total System Data      38-42      36-40     36-40          36-40
Max Total              230,450    168       168            1064
Overhead %             0.018%     23.81%    23.81%         3.76%
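The overhead percentages in the two system data tables follow from dividing system data bytes by total bytes (payload plus system data). The short Java sketch below reproduces that arithmetic; the class and method names are illustrative only.

```java
public class OverheadCalc {
    // Overhead % = system data bytes / (payload bytes + system data bytes) * 100.
    public static double percent(long payloadBytes, long systemBytes) {
        return 100.0 * systemBytes / (payloadBytes + systemBytes);
    }

    public static void main(String[] args) {
        // Fixed-length headers: 84 bytes of system data per item.
        System.out.println(percent(128, 84));      // face / optical flow items
        System.out.println(percent(1024, 84));     // sensor items
        System.out.println(percent(230408, 84));   // video frames
        // Variable-length headers: at most 40 bytes for small items.
        System.out.println(percent(128, 40));
    }
}
```

For a 128-byte face item, 84 / 212 is about 39.62%, and with variable-length headers 40 / 168 is about 23.81%, matching the tables. The fixed cost matters only for small items; for video frames it is negligible.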
Port Asset Tracking – Drools Syntax Example
    rule "Syntax Example"
    when
        [condition 1]
        [condition 2]
        ...
    then
        [action 1]
        [action 2]
        ...
    end
Port Asset Tracking – Truck Entering Port
    rule "Truck Entering Port"
    when
        $tee : TruckEnterEvent()
    then
        log($tee)
    end
Port Asset Tracking – Truck Leaving Port
    rule "Truck Leaving Port"
    when
        $tle : TruckLeaveEvent($tsl : tmstamp, $id1 : id)
        $tee : TruckEnterEvent($tse : tmstamp, id == $id1)
    then
        diff = ($tsl.getTime() - $tse.getTime()) / 1000
        log("Truck %d leaving at %d secs", $id1, diff)
        retract($tle)
        retract($tee)
        insert(new TruckCycle($tee, $tle, diff))
    end
Port Asset Tracking – Phantom Truck Leaving Port
    rule "Phantom Truck Leaving Port"
    when
        $tle : TruckLeaveEvent($id1 : id)
        not(exists(TruckEnterEvent(id == $id1)))
    then
        log("No enter event: phantom truck %d", $id1)
        // diagnose anomaly
        diagnosePhantom($tle)
    end
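The enter, leave, and phantom rules together amount to matching each leave event against a stored enter event with the same id. The plain Java sketch below imitates that matching without a rule engine. TruckTracker and its methods are hypothetical names, and the map of open enter events stands in for the rule engine's working memory; it is not Drools API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalLong;

public class TruckTracker {
    // Open enter events, keyed by truck id (timestamps in milliseconds).
    private final Map<Integer, Long> enterMillis = new HashMap<>();

    // "Truck Entering Port": remember when the truck came in.
    public void enter(int id, long tsMillis) {
        enterMillis.put(id, tsMillis);
    }

    // "Truck Leaving Port": returns the cycle time in seconds, or an
    // empty value when no matching enter event exists (a phantom truck).
    public OptionalLong leave(int id, long tsMillis) {
        // Removing the entry mirrors retract($tee) in the rule.
        Long entered = enterMillis.remove(id);
        if (entered == null) {
            return OptionalLong.empty();
        }
        // Same arithmetic as the rule: millisecond difference / 1000.
        return OptionalLong.of((tsMillis - entered) / 1000);
    }
}
```

A second leave event for the same truck is reported as a phantom here because the first leave consumed the enter event, which matches the effect of the retract calls in the rule.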
Port Asset Tracking – Phantom Truck Actions
    // rule "Phantom Truck Leaving Port" actions
    void diagnosePhantom(TruckLeaveEvent tle) {
        Long ts = tle.getTimestamp().getTime() / 1000;
        // based on estimated truck timing, find potential
        // video of entry
        Long tentry = ts - getAvgCycleTime();
        Long tdelta = getAvgCycleTime() / 10;
        boolean done = false, found = false;
        while (!done && !found) {
            // ±10% around potential entry
            Item[] i = entryCam.getItems(tentry - tdelta, tentry + tdelta);
            // show human operator
            OpOutput oo = showOperator(i);
            // using operator feedback, continue searching
            tentry = oo.getTEntry();
            tdelta = oo.getTDelta();
            found = oo.isFound();
            done = oo.isDone();
        }
        // ...
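The initial search window in diagnosePhantom is the estimated entry time plus or minus one tenth of the average cycle time. The arithmetic is isolated in the Java sketch below; PhantomWindow is a hypothetical helper, with all times in seconds as in the slide code.

```java
public class PhantomWindow {
    // Returns {lower, upper} for the first video query: the estimated
    // entry time (leave time minus average cycle) plus or minus 10% of
    // the average cycle time.
    public static long[] window(long leaveSecs, long avgCycleSecs) {
        long tentry = leaveSecs - avgCycleSecs;
        long tdelta = avgCycleSecs / 10;
        return new long[] { tentry - tdelta, tentry + tdelta };
    }

    public static void main(String[] args) {
        long[] w = window(10000L, 3000L);
        System.out.println(w[0] + " .. " + w[1]);
    }
}
```

For a truck leaving at t = 10000 s with a 3000 s average cycle, the first query covers 6700 s to 7300 s; the operator's feedback then moves or resizes the window.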
Port Asset Tracking – Truck Leaving Port Late
    rule "Truck Leaving Port Late"
    when
        // > 55 mins is late (time is in seconds)
        $tc : TruckCycle(time > (60 * 55))
    then
        log("Truck Left Late!")
        // diagnose anomaly
        diagnoseLateDeparture($tc)
    end
Port Asset Tracking – Late Departure Actions
    // rule "Truck Leaving Port Late" actions
    void diagnoseLateDeparture(TruckCycle tc) {
        Long len = tc.getTruckLeaveEventTime() - tc.getTruckEnterEventTime();
        Long tdelta = len / 10;
        // do a binary search for the truck delay; find
        // the first point where the truck starts running late
        // start at the middle of the journey
        Long ts = tc.getTruckEnterEventTime() + (len / 2);
        while (!done) {
            // get appropriate channel for time in delivery cycle
            chan = getVideoChannelByTime(tc, ts);
            // ...
            Item[] i = chan.getItems(ts - tdelta, ts + tdelta);
            // show operator video
            OpOutput oo = showOperator(i);
            // modify search based on feedback
            // ...
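The binary search described in the comments can be sketched on its own by replacing the human operator with a predicate. In the Java example below, DelaySearch and the runningLate predicate are hypothetical. The sketch assumes the predicate is monotone over the journey (false until the delay begins, true afterwards) and true at the leave time, which real operator feedback may not guarantee.

```java
import java.util.function.LongPredicate;

public class DelaySearch {
    // Finds the first time in [enter, leave] at which the truck is
    // judged to be running late.
    // Precondition: runningLate is monotone and true at 'leave'.
    public static long firstLate(long enter, long leave, LongPredicate runningLate) {
        long lo = enter;
        long hi = leave;
        while (lo < hi) {
            // Start at the middle of the remaining interval.
            long mid = lo + (hi - lo) / 2;
            if (runningLate.test(mid)) {
                hi = mid;       // delay began at or before mid
            } else {
                lo = mid + 1;   // delay began after mid
            }
        }
        return lo;
    }

    public static void main(String[] args) {
        // Pretend the operator sees the truck fall behind at t = 1234 s.
        System.out.println(firstLate(0, 3600, t -> t >= 1234));
    }
}
```

Each probe halves the remaining interval, so a 55-minute journey at one-second resolution needs about 12 operator decisions instead of reviewing the whole recording.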
Port Asset Tracking – Truck Missed Checkpoint
    rule "Truck Missed Checkpoint"
    when
        // if we have a gap in checkpoints
        $ce1 : CheckptEvent($id1 : id, $point : point > 1)
        not(exists(CheckptEvent(id == $id1 && point == ($point - 1))))
        // collect all prior checkpoints into a list
        $prior : ArrayList() from collect(
            CheckptEvent(id == $id1, $p2 : point < $point))
        // find the most recent checkpoint before missing
        $lp : Number() from accumulate(
            CheckptEvent(id == $id1, $p2 : point < $point), max($p2))
    then
        log("Truck %d missing checkpoint # %d", $id1, $point - 1)
        log("Last checkpoint # %d: %s", $lp, $prior)
        // diagnose anomaly
        diagnoseMissedCheckpoint($ce1, $prior, $lp)
    end
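The two conditions of the missed checkpoint rule (a gap directly before the current checkpoint, and the highest earlier checkpoint that was seen) can be expressed over a sorted set of checkpoint numbers for one truck. The Java sketch below is a stand-in for the rule's logic, not Drools code; CheckpointGap and its methods are hypothetical names.

```java
import java.util.Arrays;
import java.util.TreeSet;

public class CheckpointGap {
    // True when checkpoint 'point' arrived but 'point - 1' was never seen.
    public static boolean hasGap(TreeSet<Integer> seen, int point) {
        return point > 1 && !seen.contains(point - 1);
    }

    // Highest checkpoint strictly below 'point' (the rule's max($p2)),
    // or null when the truck has no earlier checkpoint at all.
    public static Integer lastBefore(TreeSet<Integer> seen, int point) {
        return seen.lower(point);
    }

    public static void main(String[] args) {
        // Truck passed checkpoints 1 and 2, then showed up at 5.
        TreeSet<Integer> seen = new TreeSet<>(Arrays.asList(1, 2, 5));
        System.out.println(hasGap(seen, 5) + " " + lastBefore(seen, 5));
    }
}
```

The pair (last good checkpoint, current checkpoint) bounds the time interval whose video is worth fetching for diagnosis.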
Port Asset Tracking – Missed Checkpoint Actions
    // rule "Truck Missed Checkpoint" actions
    void diagnoseMissedCheckpoint(CheckptEvent ce,
            ArrayList<CheckptEvent> lst, Long lastPointId) {
        HashMap<Long, CheckptEvent> map = pointListToMap(lst);
        CheckptEvent last = map.get(lastPointId);
        // ...
        Channel c = getVideoChanByCheckpt(lastPointId + 1);
        // get data between two good checkpoints
        Item[] i = c.getItems(last.getTimestamp(), ce.getTimestamp());
        // use video for diagnosis
        // ...
    }