CMU @ TRECVID 2009: Event Detection
Ming-yu Chen & Alex Hauptmann
School of Computer Science, Carnegie Mellon University
CMU @ TRECVID 2009 Event Detection
■ CMU submitted all 10 event detection tasks
■ Part-based generic approach
  • Local features extracted from videos
    - Local features describe both appearance and motion
    - Bag-of-words features represent video content
  • Robust to action deformation, occlusion, and illumination changes
■ Sliding window detection approach
  • Extends the part-based method to detection tasks
  • False alarm reduction is a critical task
System overview
MoSIFT – feature detection
■ MoSIFT detects spatial interest points at multiple scales
  • Local maxima of Difference of Gaussian (DoG)
■ MoSIFT computes optical flow to detect moving areas
■ MoSIFT keeps video interest points that are both DoG local maxima and carry sufficient optical flow
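A minimal sketch of this detection step, assuming grayscale frames and OpenCV; the Farneback flow estimator and the `flow_thresh` parameter are illustrative stand-ins, not details of the actual MoSIFT implementation:

```python
import cv2
import numpy as np

def mosift_like_keypoints(prev_gray, curr_gray, flow_thresh=0.5):
    # SIFT keypoints: local extrema of Difference-of-Gaussian over scales
    sift = cv2.SIFT_create()
    keypoints = sift.detect(curr_gray, None)

    # Dense optical flow between consecutive frames (Farneback here is a
    # stand-in; the flow estimator MoSIFT actually uses may differ)
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    magnitude = np.linalg.norm(flow, axis=2)

    # Keep only interest points that sit on sufficiently moving regions
    moving = [kp for kp in keypoints
              if magnitude[int(kp.pt[1]), int(kp.pt[0])] > flow_thresh]
    return moving, flow
```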
MoSIFT – feature description
■ Descriptor of shape
  • Histogram of Gradient (HoG)
  • Aggregate neighboring areas as a 4x4 grid; each grid cell is described by 8 orientations
  • 4x4x8 = 128-dimensional vector describing the shape of an interest area
■ Descriptor of motion
  • Histogram of Optical Flow (HoF), in the same format as HoG
  • 128-dimensional vector describing the motion of an interest area
■ 256-dimensional vectors as feature descriptors
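A sketch of the 4x4x8 aggregation and the HoG/HoF concatenation; the exact interpolation and normalization MoSIFT applies are not given in the slides, so this illustration makes simplifying assumptions:

```python
import numpy as np

def grid_histogram(angles, weights, grid=4, n_bins=8):
    """4x4 cells x 8 orientation bins = 128 dims, SIFT-style aggregation.
    angles/weights: square patches around the interest point (radians)."""
    h, w = angles.shape
    ch, cw = h // grid, w // grid
    hist = np.zeros((grid, grid, n_bins))
    bins = ((angles % (2 * np.pi)) / (2 * np.pi) * n_bins).astype(int) % n_bins
    for i in range(grid):
        for j in range(grid):
            cell_b = bins[i*ch:(i+1)*ch, j*cw:(j+1)*cw]
            cell_w = weights[i*ch:(i+1)*ch, j*cw:(j+1)*cw]
            for b in range(n_bins):
                hist[i, j, b] = cell_w[cell_b == b].sum()
    return hist.ravel()                              # 128-dim

def mosift_descriptor(grad_angle, grad_mag, flow_angle, flow_mag):
    hog = grid_histogram(grad_angle, grad_mag)       # appearance half (HoG)
    hof = grid_histogram(flow_angle, flow_mag)       # motion half (HoF)
    return np.concatenate([hog, hof])                # 256-dim descriptor
```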
Event detection
■ The k-means clustering algorithm quantizes the feature points extracted from videos
  • k is chosen by cross-validation
■ A video codebook is built from the clustering result
  • A visual code is a category of similar video interest points
■ A bag-of-words (BoW) feature is constructed for each video sequence
  • Soft weighting is used to construct the BoW feature
■ Event models are trained with a Support Vector Machine (SVM)
  • A χ² kernel is applied
■ The sliding window approach creates video sequences in both the training and testing sets
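A rough sketch of the soft-weighted BoW construction and a χ² kernel for the SVM; the 1/2^rank down-weighting and the exponential kernel form are common choices assumed here, not confirmed details of the CMU system:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def soft_bow(descriptors, centers, n_neighbors=4):
    """Soft-weighted BoW: each descriptor votes for its n_neighbors nearest
    codewords, down-weighted by rank (one common soft-assignment scheme)."""
    hist = np.zeros(len(centers))
    for d in descriptors:
        dists = np.linalg.norm(centers - d, axis=1)
        for rank, idx in enumerate(np.argsort(dists)[:n_neighbors]):
            hist[idx] += 1.0 / (2 ** rank)
    return hist / max(hist.sum(), 1e-9)   # normalize to a distribution

def chi2_kernel(X, Y, gamma=1.0):
    """Exponential chi-square kernel over BoW histograms."""
    K = np.zeros((len(X), len(Y)))
    for i, x in enumerate(X):
        K[i] = np.exp(-gamma * 0.5 * (((x - Y) ** 2) / (x + Y + 1e-9)).sum(axis=1))
    return K

# codebook = KMeans(n_clusters=1000).fit(stacked_descriptors).cluster_centers_
# svm = SVC(kernel='precomputed').fit(chi2_kernel(X_train, X_train), y_train)
```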
Evaluation metric – DCR
■ Normalized Detection Cost Rate (NDCR) is used to evaluate performance:

  DetectionCost(S, E) = Cost_Miss · P_Miss(S, E) + Cost_FA · R_FA(S, E)

  where P_Miss(S, E) ∈ [0, 1] and R_FA(S, E) ∈ [0, ∞)
■ Strongly penalizes false alarms
  • NDCR rewards reducing false alarms far more than detecting additional positive examples
  • Reducing false alarms is therefore extremely important for improving NDCR scores
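A small helper that evaluates the normalized cost from the quantities above; the constants (Cost_Miss = 10, Cost_FA = 1, R_Target = 20 events/hour) are the commonly cited TRECVID SED settings and are an assumption here, not taken from the slides:

```python
def ndcr(n_missed, n_true, n_false_alarms, hours,
         cost_miss=10.0, cost_fa=1.0, r_target=20.0):
    """Normalized DCR; constants should be checked against the
    official TRECVID evaluation plan."""
    p_miss = n_missed / n_true                 # in [0, 1]
    r_fa = n_false_alarms / hours              # in [0, inf)
    beta = cost_fa / (cost_miss * r_target)    # normalization weight
    return p_miss + beta * r_fa
```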
False alarm reduction
■ Cascade architectures are widely used to reduce false alarms in detection tasks
■ We applied the cascade idea in the test phase to reduce false alarms
  • Two positively biased classifiers are built (due to computation; this can be extended to more layers)
  • Windows that pass both classifiers are predicted as positive

  All windows → M1 —(T)→ M2 —(T)→ Detected windows
  An (F) from either M1 or M2 → Rejected windows
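A minimal test-time cascade sketch; `stage1` and `stage2` are assumed to be the two positively biased classifiers, exposing a scikit-learn-style predict():

```python
def cascade_predict(windows, stage1, stage2):
    """A window is detected only if both classifiers accept it;
    a rejection by either one discards it early."""
    detected = []
    for feat in windows:
        if stage1.predict([feat])[0] == 1:      # first gate
            if stage2.predict([feat])[0] == 1:  # second confirmation
                detected.append(feat)
    return detected
```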
False alarm reduction (cont.)
■ Lesson from last year: the multi-scale sliding window approach produces many false alarms
■ We do not apply multi-scale windows this year
■ Instead of several short positive predictions, we aggregate consecutive positive predictions into one long positive segment
  • Reduces the number of positive predictions
■ Performance improves 80% with the cascade algorithm
■ Performance improves 40% by concatenating short predictions into long predictions
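One way to implement the aggregation, assuming each positive prediction is a (start_frame, end_frame) span:

```python
def merge_positive_windows(windows):
    """Aggregate overlapping or abutting positive sliding windows,
    sorted by start frame, into long segments."""
    segments = []
    for start, end in sorted(windows):
        if segments and start <= segments[-1][1]:
            segments[-1][1] = max(segments[-1][1], end)  # extend the run
        else:
            segments.append([start, end])                # start a new segment
    return [tuple(s) for s in segments]

# e.g. [(0, 25), (5, 30), (10, 35)] -> [(0, 35)]: one detection, not three
```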
System setup
■ MoSIFT features are extracted at 3 different scales every 5 frames
  • approximately 2160 hours for a single core to extract MoSIFT features
■ A sliding window (25 frames) slides every 5 frames
■ 1000 video codes
■ Soft-weighted BoW feature representation (4 nearest clusters)
■ One-against-all SVM model for each action and each camera view
  • 50 models are built (10 actions × 5 camera views)
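The window geometry above translates directly into code; a tiny illustration:

```python
def sliding_windows(n_frames, window=25, stride=5):
    """(start, end) frame spans: a 25-frame window advanced every
    5 frames, matching the setup above."""
    return [(s, s + window) for s in range(0, n_frames - window + 1, stride)]

# sliding_windows(60) -> [(0, 25), (5, 30), ..., (35, 60)]
```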
Performance comparison
[Figure: bar chart of DCR per action (y-axis 0–1.4), comparing CMU against the median and best submissions]
Correct detection comparison
[Figure: bar chart of the number of correct detections (CorDet) per action (y-axis 0–700), comparing CMU against the median and maximum]
Performance (2008 vs. 2009)
High level feature extraction
■ Motion-related high-level features
  • 7 motion-related concepts
  • Airplane flying, Person playing soccer, Hand, Person playing a musical instrument, Person riding a bicycle, Person eating, People dancing

  Team     MAP
  MM       0.24
  PKU      0.21
  TITG     0.20
  CMU      0.18
  FTRD     0.18
  VIREO    0.18
  Eurecom  0.18
Conclusion & future work
■ Conclusion:
  • A generic approach to detecting events
  • MoSIFT features capture both shape and motion information
  • Performs robustly across all tasks
  • False alarm reduction is critical to improving DCR
■ Future work:
  • The approach cannot localize where the action is
  • The approach can be further fused with people tracking and global features
  • The bag-of-words representation lacks spatial constraints